path: root/vp8
Age | Commit message | Author
2016-12-16 | vp8 : use threading mutex's for tsan only. | Jim Bankoski
To avoid a 2% decode performance hit when running on hyperthreaded cores, this patch uses the mutexes only when running under tsan. This is safe because 32-bit operations like read and store are atomic on all the platforms we care about. Tsan warns about race situations, but in this case, whichever order occurs (read before write or write before read), the worst case is that we go around the loop one extra time, so the ordering doesn't really matter.

That said, a few other things have been tried. For instance, webrtc/base/atomicops.h#52 uses __atomic_load_n(i, __ATOMIC_ACQUIRE); and __atomic_store_n(i, value, __ATOMIC_RELEASE); This code works on gcc and clang (replacing the protected write and read) and avoids tsan errors while incurring no performance penalty. In C11 it is replaced by plain atomic operations. However, there is no equivalent in the Visual Studio versions we support, as int32 is already atomic on all Windows platforms. To avoid tsan-like warnings on Windows we'd need to use interlocked exchange, and the end result doesn't gain us anything.

Change-Id: I2066e3c7f42641ebb23d53feb1f16f23f85bcf59
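A minimal sketch of the acquire/release pair mentioned in the message above, assuming a GCC- or Clang-compatible compiler. The helper names `protected_read` and `protected_write` are hypothetical and are not taken from the patch itself; only the two `__atomic_*` builtins come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: atomically load an int with acquire ordering.
 * __atomic_load_n is a GCC/Clang builtin, not standard C. */
static int protected_read(const int *p) {
  assert(p != NULL);
  return __atomic_load_n(p, __ATOMIC_ACQUIRE);
}

/* Hypothetical helper: atomically store an int with release ordering. */
static void protected_write(int *p, int value) {
  assert(p != NULL);
  __atomic_store_n(p, value, __ATOMIC_RELEASE);
}
```

A writer thread would call `protected_write` to publish progress and a reader would poll with `protected_read`; as the message notes, reading a stale value only costs one extra trip around the loop.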
2016-12-14 | Change order of operation to avoid ubsan warnings | Yaowu Xu
This commit changes the order of operations to avoid left shifts of negative numbers. Change-Id: I607c7eb91658c7a5ef397fc1504721d1b10e3dd6
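The log does not show which expression was reordered, so the following is only a generic illustration of the underlying issue: in C, left-shifting a negative signed value is undefined behaviour, which ubsan reports. The helper name `scale_by_pow2` is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: scale x by 2^n without ever left-shifting a
 * negative value. Only the positive constant 1 is shifted; the
 * multiplication is well defined as long as the product fits in int. */
static int scale_by_pow2(int x, int n) {
  assert(n >= 0 && n < 15);
  return x * (1 << n);
}
```

Writing `x << n` directly would be flagged for negative `x`, while the multiplication gives the intended result without relying on undefined behaviour.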
2016-12-13 | Reapply 'Amend and improve VP8 multithreading implementation' | Jim Bankoski
Reapply this patch: ff0107f Amend and improve VP8 multithreading implementation Amended the patch to add a unit test, and fix an asan error. BUG=webm:851 Change-Id: I6572c03256169c64e80248bf5a5e99f59a2fc93c
2016-11-22 | Fix mips dspr2 build warning | Kaustubh Raste
Change-Id: Ia8fb3ed124f01384e7896e309c9ff22c05b40719
2016-11-10 | Merge "*ppflags.h: remove unused *_DEBUG_* enum values" | James Zern
2016-11-08 | *ppflags.h: remove unused *_DEBUG_* enum values | James Zern
Usage of the vp8 versions was removed in 3f72509 (vp8: remove VP8_SET_DBG* control support); vp9 had the usage stripped even earlier. Change-Id: I978142eb6492552cd29c9c6feb1e89acfc5f7b84
2016-11-08 | Refine vp8_refining_search_sadx4 targeting | Johann
This uses the same sdx4df pointers as vp8_diamond_search_sadx4 and should therefore target the same optimizations. See e4ddf9db6a37eee59c079f5ae427643ae3424fcf Change-Id: Ic298e9b25c34bbe6b7a0799509355b0addb56675
2016-10-20 | vp8: Apply gf target-size boost only when refresh_golden_frame = 1. | Marco
Change only affects 1 pass cbr, error resilience off. Change-Id: I68b896b09d722995a71c44331233e97bd862bcfc
2016-10-19 | vp8: Adjust threshold to set the gf_noboost flag. | Marco
Change only affects 1 pass cbr, with error_resilient off. Change-Id: Ibf254d8772fa2a8f188c9932d37b2f42362d8003
2016-10-19 | vp8: Add control for gf boost for 1 pass cbr. | Marco
Control already exists for vp9, adding it to vp8. Usage is only when error_resilient is off. Added a datarate unittest for non-zero boost. Change-Id: I4296055ebe2f4f048e8210f344531f6486ac9e35
2016-10-14 | Drop empty frames. | Jim Bankoski
Change-Id: I2d45a6eb3aaca97eb61e8e7ef9e5114221091244
2016-10-12 | Merge "Optimize vp8 loopfilter msa functions" | Kaustubh Raste
2016-10-10 | Merge "vp8: Change default gf behavior for 1 pass cbr." | Marco Paniconi
2016-10-07 | vp8: Change default gf behavior for 1 pass cbr. | Marco
In 1 pass CBR, with error_resilience off, allow for special logic to change the default gf behaviour. In this CL, boost is turned off and the gf period is set to a multiple of the cyclic refresh period. The change only affects 1 pass CBR mode, i.e., when the flag gf_update_onepass_cbr is set. Including the previous change (3ec8e11: to allow cyclic refresh for error_resilience off), comparing metrics on the RTC set for error_resilience off vs on: avgPSNR/SSIM up by ~6%. Change-Id: Id5b3fb62a4f04de5a805bd1b418f2b349574e0bc
2016-10-07 | Optimize vp8 loopfilter msa functions | Kaustubh Raste
Updated code to process in 8bit as saturation/clipping takes care of overflow Change-Id: I35fb2c0e702fd91309cc391c5a7745a3b619a64c
2016-10-06 | [vpx highbd lpf NEON 1/6] horizontal 4 | Linfeng Zhang
BUG=webm:1300 Change-Id: Idf441806e6bf397ff5ecd8776146b3f781f50c40
2016-10-06 | Merge "Modify vp8 idct msa functions store method" | Kaustubh Raste
2016-10-05 | Revert "Revert "vp8/encoder/onyx_if.c: apply clang-format"" | Marco Paniconi
This reverts commit a7456144ce0ab98e015548dd7cda4165ad2a800c. Change-Id: I400987fb26a09e9b9ea42c91f48ea12f7bc37356
2016-10-05 | Revert "vp8/encoder/onyx_if.c: apply clang-format" | Marco Paniconi
This reverts commit 891a87dccddfbb9fd625f4b32aa17ae3501f30a6. Change-Id: I067b3b6a3cfb5bc760166999948b8087d4c5cb80
2016-10-05 | Modify vp8 idct msa functions store method | Kaustubh Raste
vp8_short_inv_walsh4x4_msa: optimized to process in short vector type.
Updated the functions below to store the exact number of bytes in the output rather than the complete vector:
  idct4x4_addblk_msa
  idct4x4_addconst_msa
  dequant_idct4x4_addblk_msa
  dequant_idct4x4_addblk_2x_msa
  dequant_idct_addconst_2x_msa
Change-Id: Ic1b3752e2421dc7d70a082dcdaab9d140d7e5d9c
2016-10-04 | vp8/encoder/onyx_if.c: apply clang-format | clang-format
after: 955b3b6 vp8: Allow for cyclic refresh even if error_resilience it off. Change-Id: Iba189b18c84be8f5140754280c6801cfc387cfcd
2016-10-04 | vp8: Allow for cyclic refresh even if error_resilience it off. | Marco
cyclic_refresh was tied to error_resilience mode. Allow it to be on for 1 pass CBR mode as well, even if error_resilience is off. Another option is to use a new control for this, but we prefer to avoid that for now. Change-Id: I3625b292ee059a890e31338b514e211bf0ab5c3e
2016-10-04 | Merge "Remove rate deviation metric from vp8" | Sarah Parker
2016-10-04 | Remove rate deviation metric from vp8 | Sarah Parker
BUG=b/31780679 Change-Id: I2b2a43b154eeacb4f51a11f6362cc535cfe318da
2016-10-01 | Merge "vp8,frame_buffers: remove unused use_frame_threads" | James Zern
2016-09-30 | *idct*_neon.c: add missing rtcd include | James Zern
+ correct declarations as necessary BUG=webm:1294 Change-Id: I719602df9a56e79188a78e7f8b31257c6d3cc11d
2016-09-29 | vp8,frame_buffers: remove unused use_frame_threads | James Zern
this was never fully implemented Change-Id: I4640cf84c40ea2cc9c6c12acf116d39df4b04578
2016-09-29 | vp8: remove mmx functions | Johann
When they have sse2 equivalents. Change-Id: I158f631a3bcecba57b36093ac10114b1904767a7
2016-09-29 | Rename _xmm functions to _sse2 | Johann
Avoid the extra level of indirection/confusion. Change-Id: I0555f639d67835df9fb7dac0c75085e9954805f1
2016-09-29 | Remove vp8_clear_system_state | Johann
Use vpx_clear_system_state instead. Change-Id: Ia3e9122f69a2c690ddd7c7bc54f92ccb9ec18b3e
2016-09-29 | vp8: clean up rtcd | Johann
Remove lines which specify the same name for a function. Change-Id: I956bd8ce2b81a2a8feab5621d28bd2499c2b4c2d
2016-09-28 | Merge "Hook up vp8_diamond_search_sad_sse3" | Johann Koenig
2016-09-27 | Hook up vp8_diamond_search_sad_sse3 | Johann
The original commit never set any 'specialize' line: 61311e61039c300ae872ccba22304e9e60dc0205. It appears the sadx4 version of the function uses sdx4df calls to speed up the search. There are no sse3 versions of the sdx4df functions, but there are sse2 and msa versions. There is a neon version of vpx_sad16x16x4d but none of the smaller versions. Perhaps if they existed this function could be expanded to use them. Change-Id: I936d7d6b1a3ff6dcd5a4d2322272708c47cdec13
2016-09-27 | mips: clean up wextra warnings | Johann
Remove unused zbin variable (warning: unused parameter ‘zbin’). Use int for loop variables to avoid unsigned conversion (warning: comparison between signed and unsigned integer expressions). Change-Id: Icea74b870c0ee68a8bf687e796a69392af25a8ad
2016-09-26 | Merge "Un-Revert "Restore vp8_sixtap_predict4x4_neon"" | Johann Koenig
2016-09-23 | Use shifted value for sinpi8sqrt2 | Johann
The value 35468 changes sign when stored in int16_t: "implicit conversion from 'int' to 'int16_t' (aka 'short') changes value from 35468 to -30068". This negation requires adding back the original value to compensate. Shifting the value keeps the value positive and saves a post-vqdmulh shift. This technique is used in webp and idct_dequant_full_2x_neon. BUG=b/28027557 Change-Id: I0c5ce09bea170fe08061856c2af6f841a557e0c3
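The arithmetic behind this message can be checked with a scalar model. The sketch below is not the NEON code from the commit: `qdmulh_s16` is a hand-written, one-lane approximation of `vqdmulh_s16`, and the other function names are hypothetical. It assumes the old path was "multiply by the wrapped constant, shift right by one, add the input back", which is how the message describes it.

```c
#include <assert.h>
#include <stdint.h>

/* Floor division by 2^n, standing in for an arithmetic right shift. */
static int64_t floor_shift(int64_t v, int n) {
  const int64_t d = (int64_t)1 << n;
  int64_t q = v / d;
  if (v % d < 0) --q;
  return q;
}

/* Scalar model of one lane of vqdmulh_s16: saturate((2 * a * b) >> 16). */
static int16_t qdmulh_s16(int16_t a, int16_t b) {
  int64_t r = floor_shift(2 * (int64_t)a * (int64_t)b, 16);
  if (r > INT16_MAX) r = INT16_MAX; /* only a == b == INT16_MIN saturates */
  return (int16_t)r;
}

/* Old style: 35468 wraps to -30068 in int16_t, so the doubled product is
 * halved again and the input is added back to undo the wrap-around. */
static int mul_sinpi8sqrt2_wrapped(int16_t a) {
  const int16_t wrapped = (int16_t)(35468 - 65536); /* -30068 */
  return (int)floor_shift(qdmulh_s16(a, wrapped), 1) + a;
}

/* New style: the constant is pre-shifted (35468 >> 1 == 17734), stays
 * positive, and the doubling multiply yields (a * 35468) >> 16 directly. */
static int mul_sinpi8sqrt2_shifted(int16_t a) {
  return qdmulh_s16(a, (int16_t)(35468 >> 1));
}
```

Under this model both functions return `floor(a * 35468 / 65536)` for every 16-bit input, which is why the shifted constant can drop the extra shift and add.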
2016-09-23 | Un-Revert "Restore vp8_sixtap_predict4x4_neon" | Johann
This restores d9dce2f48eed1368a44c368fa87a506bd89ffec5. Switched to using signed shift-and-narrow: the previous version saturated negative results to 255 instead of 0. BUG=webm:817 BUG=webm:1273 Change-Id: I571095336aa4182e3288b17924fcaaece42b0a49
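A scalar sketch of why the choice of narrowing matters. This is a model, not the NEON intrinsics used in the commit: the function names are hypothetical, and the rounding shift of 7 is an assumption based on filter taps that sum to 128.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned rounding shift-and-narrow: the 16-bit lane is treated as
 * unsigned, so a negative filter sum looks like a huge positive value
 * and saturates to 255. */
static uint8_t narrow_unsigned(uint16_t v) {
  uint32_t r = ((uint32_t)v + 64) >> 7;
  return (uint8_t)(r > 255 ? 255 : r);
}

/* Signed-to-unsigned rounding shift-and-narrow: negative results
 * saturate to 0, which is the desired pixel clamp. */
static uint8_t narrow_signed(int16_t v) {
  int32_t r = (int32_t)v + 64;
  if (r < 0) return 0;
  r >>= 7;
  return (uint8_t)(r > 255 ? 255 : r);
}
```

For non-negative sums the two agree; they differ only when the filter output dips below zero, which is where the failing predictions would come from under this model.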
2016-09-22 | Merge "vp8: remove VP8_SET_DBG* control support" | James Zern
2016-09-21 | Keep vp8 sixtap read within bounds | Johann
When filtering it needs 6 pixels: 2 prior to the source, the source, and 3 after the source. When filtering 16 wide, that means 21. To accomplish this the SSE2 reads [-2] to [5], [6] to [13], and [14] to [21], a total of 24 bytes (reading in groups of 8 is easy). The filter then shifts this last set to the top half of the register and uses 'or' to combine it with the previous set.

Valgrind detected an issue reading pixels [19], [20] and [21]: "Address 0x7f581c2 is 434 bytes inside a block of size 441 alloc'd". Note: we only need pixels [16], [17], and [18] as context for [15].

To fix this, it now reads 8 bytes starting at [11], which re-loads [11] through [13], but stops at [18] and does not over-read any values. This is shifted by 5 and 'or'd with xmm1. Although the lower bits are not cleared, they overlap directly with [11] through [13], so 'or' produces the correct results.

Change-Id: I0c89c03afa660fc9b0108ac055d7bd403e493320
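The overlapping-load idea can be modelled with plain byte copies. This sketch is not the SSE2 assembly from the commit; `load_sixtap_row16` is a hypothetical name, and `memcpy` stands in for the three 8-byte register loads.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Gather the 21 context bytes src[-2]..src[18] with three 8-byte loads,
 * never touching src[19]..src[21]. out[i] holds src[i - 2]. */
static void load_sixtap_row16(const uint8_t *src, uint8_t out[21]) {
  assert(src != NULL && out != NULL);
  memcpy(out, src - 2, 8);       /* src[-2] .. src[5]  */
  memcpy(out + 8, src + 6, 8);   /* src[6]  .. src[13] */
  memcpy(out + 13, src + 11, 8); /* src[11] .. src[18], re-loading [11]..[13] */
}
```

The third copy overwrites three bytes with identical values, mirroring the message's point that the overlapping bits combine harmlessly.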
2016-09-20 | vp8: remove VP8_SET_DBG* control support | James Zern
The --enable-postproc-visualizer configure option remains as a no-op, as do the control names and values, for compatibility. Also remove the corresponding --pp-* debug flags from vpxdec. Change-Id: I4a001cd9962b59560d7d6bda6272d4ff32b8d37c
2016-09-20 | Merge changes from topic 'Wshorten' | James Zern
* changes:
  vp8: convert some uses of unsigned long to size_t
  vp8/encoder: quiet some -Wshorten-64-to-32 warnings
2016-09-20 | Merge "Enable ssse3 bilinear tests" | Johann Koenig
2016-09-19 | vp8: convert some uses of unsigned long to size_t | James Zern
similar to changes that were done in vp9 for encoded frame size reporting. has the side-effect of quieting a -Wshorten-64-to-32 warning. Change-Id: I89f74cb617fc29334ee351dc8dfaa3b8cfd4e5af
2016-09-19 | vp8/encoder: quiet some -Wshorten-64-to-32 warnings | James Zern
this code is similar to other existing uses and/or vp9 Change-Id: I56e646931379759d9f7332ea6d746060007c75ee
2016-09-16 | Merge changes from topic 'clang-format' | James Zern
* changes:
  apply clang-format
  .clang-format: update to 3.8.1
2016-09-15 | Enable ssse3 bilinear tests | Johann
The code only has issues when xoffset == 0 and yoffset == 0 which represents a simple copy. Presumably this case does not need to be handled because the issue has existed since 2010. BUG=webm:1287 Change-Id: Ic47e2653f3b729e99b40e53d8d2d8d1501edaaa9
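To see why a zero offset amounts to a copy, here is a one-dimensional scalar sketch of a two-tap bilinear filter. The tap table and the rounding shift of 7 are assumptions (eighth-pel weights that sum to 128), and the names are hypothetical; this is not the ssse3 code under test.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed eighth-pel bilinear weights; each pair sums to 128. */
static const int kBilinearTaps[8][2] = {
  { 128, 0 }, { 112, 16 }, { 96, 32 }, { 80, 48 },
  { 64, 64 }, { 48, 80 },  { 32, 96 }, { 16, 112 }
};

/* Blend two neighbouring pixels for the given sub-pixel offset. */
static uint8_t bilinear_tap(uint8_t a, uint8_t b, int offset) {
  assert(offset >= 0 && offset < 8);
  return (uint8_t)((a * kBilinearTaps[offset][0] +
                    b * kBilinearTaps[offset][1] + 64) >> 7);
}
```

With offset 0 the weights are (128, 0), so the output equals the first pixel exactly; applying that in both directions leaves the block unchanged, which is the "simple copy" case the message refers to.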
2016-09-16 | Merge "Revert "Restore vp8_sixtap_predict4x4_neon"" | James Zern
2016-09-16 | Revert "Restore vp8_sixtap_predict4x4_neon" | Johann Koenig
This reverts commit d9dce2f48eed1368a44c368fa87a506bd89ffec5. Appears to be failing the SixtapPredict tests in some configurations and possibly test vectors as well. Change-Id: Ica6aa83ebac47d0a76e451846e7da67b1c17a7d7
2016-09-16 | Merge "Restore vp8_bilinear_predict4x4_neon" | Johann Koenig
2016-09-15 | Restore vp8_bilinear_predict4x4_neon | Johann
This function was removed when clang started introducing alignment hints which caused the 32 bit vld1_lane_u32/vst1_lane_u32 to fail: https://llvm.org/bugs/show_bug.cgi?id=24421 The load has been rendered safe with an implementation ~indiscernible performance-wise that uses _u8 and over-reads just a touch. It is still ~5x faster than C in the unaligned case and doing both filters. BUG=webm:892 BUG=webm:1273 Change-Id: Icf7167189391b46202f47233bb585c24c42bcc36
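The commit itself switches to `_u8` NEON loads; the sketch below shows a different, portable-C way of expressing the same concern, namely reading 32 bits from an address that may not be 4-byte aligned. The helper name is hypothetical, and `memcpy` is a swapped-in technique rather than what the patch does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 4 bytes from a possibly unaligned address.
 * Casting p to uint32_t* and dereferencing would let the compiler
 * assume 4-byte alignment; memcpy makes no such promise. */
static uint32_t load_u32_unaligned(const uint8_t *p) {
  uint32_t v;
  assert(p != NULL);
  memcpy(&v, p, sizeof(v));
  return v;
}
```

Compilers typically lower the fixed-size `memcpy` to a single load where the target allows it, so the safer form usually costs little.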