path: root/vp8
Age | Commit message | Author
2016-09-26 | Merge "Un-Revert "Restore vp8_sixtap_predict4x4_neon"" | Johann Koenig
2016-09-23 | Use shifted value for sinpi8sqrt2 | Johann
    The value 35468 changes sign when stored in int16_t:
    implicit conversion from 'int' to 'int16_t' (aka 'short') changes value from 35468 to -30068
    This negation requires adding back the original value to compensate. Shifting the value keeps the value positive and saves a post-vqdmulh shift. This technique is used in webp and idct_dequant_full_2x_neon.
    BUG=b/28027557
    Change-Id: I0c5ce09bea170fe08061856c2af6f841a557e0c3
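The arithmetic behind this change can be checked with a scalar model. The sketch below is not the NEON code from the commit: `qdmulh_lane` is a hypothetical one-lane stand-in for `vqdmulh_s16`, and the helper names are made up for illustration. It assumes that `>>` on a negative value is an arithmetic shift, as it is on common compilers.

```c
#include <stdint.h>

/* Scalar stand-in for one lane of NEON vqdmulh_s16: (2 * a * b) >> 16.
 * Saturation only matters when both inputs are INT16_MIN. */
static int16_t qdmulh_lane(int16_t a, int16_t b) {
  if (a == INT16_MIN && b == INT16_MIN) return INT16_MAX;
  /* Assumption: >> on a negative int32_t is an arithmetic shift. */
  return (int16_t)((2 * (int32_t)a * (int32_t)b) >> 16);
}

/* Reference: multiply by sinpi8sqrt2 (35468 in Q16) using 32-bit math. */
static int16_t mul_reference(int16_t a) {
  return (int16_t)(((int32_t)a * 35468) >> 16);
}

/* Wrapped constant: 35468 does not fit in int16_t and becomes -30068,
 * so the result is halved and the input is added back to compensate. */
static int16_t mul_wrapped(int16_t a) {
  const int16_t wrapped = (int16_t)(35468 - 65536); /* -30068 */
  return (int16_t)((qdmulh_lane(a, wrapped) >> 1) + a);
}

/* Shifted constant: 35468 >> 1 = 17734 fits in int16_t, and the doubling
 * built into vqdmulh restores the scale with no extra shift or add. */
static int16_t mul_shifted(int16_t a) {
  return qdmulh_lane(a, 35468 >> 1);
}
```

Under these assumptions both variants agree with the 32-bit reference for every 16-bit input; the shifted form simply needs fewer operations.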
2016-09-23 | Un-Revert "Restore vp8_sixtap_predict4x4_neon" | Johann
    This restores d9dce2f48eed1368a44c368fa87a506bd89ffec5.
    Switched to using signed shift-and-narrow. Instead of saturating negative results to 0, the earlier version was saturating them to 255.
    BUG=webm:817
    BUG=webm:1273
    Change-Id: I571095336aa4182e3288b17924fcaaece42b0a49
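A scalar model shows why the signedness of the narrowing step matters. This is only a sketch: the function names are hypothetical, the shift amount is left as a parameter, and the real code uses NEON intrinsics rather than these helpers.

```c
#include <stdint.h>

/* Unsigned rounding shift-and-narrow: the 16-bit lane is treated as
 * unsigned, so a negative filter sum looks like a huge positive value. */
static uint8_t narrow_as_unsigned(int16_t sum, int shift) {
  const uint32_t bits = (uint16_t)sum; /* reinterpret the lane */
  const uint32_t r = (bits + (1u << (shift - 1))) >> shift;
  return (uint8_t)(r > 255 ? 255 : r);
}

/* Signed rounding shift-and-narrow to unsigned: negatives clamp to 0. */
static uint8_t narrow_as_signed(int16_t sum, int shift) {
  const int32_t biased = (int32_t)sum + (1 << (shift - 1));
  int32_t r;
  if (biased < 0) return 0; /* a negative result saturates to 0 */
  r = biased >> shift;
  return (uint8_t)(r > 255 ? 255 : r);
}
```

For a negative sum such as -640 with a shift of 7, the unsigned path saturates to 255 while the signed path gives 0; for non-negative sums the two agree.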
2016-09-22 | Merge "vp8: remove VP8_SET_DBG* control support" | James Zern
2016-09-21 | Keep vp8 sixtap read within bounds | Johann
    When filtering it needs 6 pixels: 2 prior to the source, the source, and 3 after the source. When filtering 16 wide, that means 21.
    To accomplish this the SSE2 reads [-2] to [5], [6] to [13], and [14] to [21], a total of 24 bytes (reading in groups of 8 is easy). The filter then shifts this last set to the top half of the register and uses 'or' to combine it with the previous set.
    Valgrind detected an issue reading pixels [19], [20] and [21]:
    Address 0x7f581c2 is 434 bytes inside a block of size 441 alloc'd
    Note: we only need pixels [16], [17], and [18] as context for [15].
    To fix this, it now reads 8 bytes starting at [11], which re-loads [11] through [13], but stops at [18] and does not over-read any values. This is shifted by 5 and 'or'd with xmm1. Although the lower bits are not cleared, they overlap directly with [11] through [13], so 'or' produces the correct results.
    Change-Id: I0c89c03afa660fc9b0108ac055d7bd403e493320
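The overlapping load can be modeled byte by byte. The sketch below is a plain C stand-in for the SSE2 register operations, assuming the shift by 5 is a byte shift toward the high end of the register; the function name and the 16-byte array standing in for the register are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* src points at pixel [0]; only src[-2] .. src[18] may be read.
 * On return, reg[0..12] holds pixels [6]..[18] and reg[13..15] is zero. */
static void combine_6_to_18(const uint8_t *src, uint8_t reg[16]) {
  uint8_t tail[16] = { 0 };
  int i;
  memset(reg, 0, 16);
  memcpy(reg, src + 6, 8);       /* pixels [6]..[13] in bytes 0..7 */
  memcpy(tail + 5, src + 11, 8); /* pixels [11]..[18], shifted up 5 bytes */
  /* Bytes 5..7 hold [11]..[13] in both operands, so 'or' leaves them
   * unchanged, while bytes 8..12 pick up [14]..[18]. */
  for (i = 0; i < 16; ++i) reg[i] |= tail[i];
}
```

Because the duplicated bytes are identical in both operands, no masking is needed before the 'or', which matches the reasoning in the commit message.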
2016-09-20 | vp8: remove VP8_SET_DBG* control support | James Zern
    The --enable-postproc-visualizer configure option remains as a no-op, as do the control names and values, for compatibility.
    + remove the corresponding debug flags from vpxdec: --pp-*
    Change-Id: I4a001cd9962b59560d7d6bda6272d4ff32b8d37c
2016-09-20 | Merge changes from topic 'Wshorten' | James Zern
    * changes:
      vp8: convert some uses of unsigned long to size_t
      vp8/encoder: quiet some -Wshorten-64-to-32 warnings
2016-09-20 | Merge "Enable ssse3 bilinear tests" | Johann Koenig
2016-09-19 | vp8: convert some uses of unsigned long to size_t | James Zern
    Similar to changes that were done in vp9 for encoded frame size reporting. Has the side effect of quieting a -Wshorten-64-to-32 warning.
    Change-Id: I89f74cb617fc29334ee351dc8dfaa3b8cfd4e5af
2016-09-19 | vp8/encoder: quiet some -Wshorten-64-to-32 warnings | James Zern
    This code is similar to other existing uses and/or vp9.
    Change-Id: I56e646931379759d9f7332ea6d746060007c75ee
2016-09-16 | Merge changes from topic 'clang-format' | James Zern
    * changes:
      apply clang-format
      .clang-format: update to 3.8.1
2016-09-15 | Enable ssse3 bilinear tests | Johann
    The code only has issues when xoffset == 0 and yoffset == 0 which represents a simple copy. Presumably this case does not need to be handled because the issue has existed since 2010.
    BUG=webm:1287
    Change-Id: Ic47e2653f3b729e99b40e53d8d2d8d1501edaaa9
2016-09-16 | Merge "Revert "Restore vp8_sixtap_predict4x4_neon"" | James Zern
2016-09-16 | Revert "Restore vp8_sixtap_predict4x4_neon" | Johann Koenig
    This reverts commit d9dce2f48eed1368a44c368fa87a506bd89ffec5.
    Appears to be failing the SixtapPredict tests in some configurations and possibly test vectors as well.
    Change-Id: Ica6aa83ebac47d0a76e451846e7da67b1c17a7d7
2016-09-16 | Merge "Restore vp8_bilinear_predict4x4_neon" | Johann Koenig
2016-09-15 | Restore vp8_bilinear_predict4x4_neon | Johann
    This function was removed when clang started introducing alignment hints which caused the 32 bit vld1_lane_u32/vst1_lane_u32 to fail:
    https://llvm.org/bugs/show_bug.cgi?id=24421
    The load has been rendered safe with an implementation ~indiscernible performance-wise that uses _u8 and over-reads just a touch.
    It is still ~5x faster than C in the unaligned case and doing both filters.
    BUG=webm:892
    BUG=webm:1273
    Change-Id: Icf7167189391b46202f47233bb585c24c42bcc36
2016-09-16 | Merge "Restore vp8_sixtap_predict4x4_neon" | Johann Koenig
2016-09-16 | vp8 postproc: expand CONFIG_POSTPROC guard | Johann
    postproc.c is overloaded and used for both postproc and internal stats. If only --enable-internal-stats is specified there are issues with non-existent struct members and unused functions.
    Change-Id: I82367f1ffce659c3918c9f964dbce94a716fbb89
2016-09-15 | apply clang-format | clang-format
    Change-Id: I501597b7c1e0f0c7ae2aea3ee8073f0a641b3487
2016-09-15 | Restore vp8_sixtap_predict4x4_neon | Johann
    This function was removed when clang started introducing alignment hints which caused the 32 bit vld1_lane_u32/vst1_lane_u32 to fail:
    https://llvm.org/bugs/show_bug.cgi?id=24421
    The load has been rendered safe with an implementation ~indiscernible performance-wise that uses _u8 and over-reads just a touch.
    The store, when unaligned, has a version that is ~25% slower but safe when xoffset = 0 (second pass filter only). When the first pass filter (or both) are in play, the new version is almost identical in speed.
    Worst case performance (both filters, unaligned stores) is roughly 3-4x faster than C.
    BUG=webm:817
    BUG=webm:1273
    Change-Id: I1e490e94453e0872151fe0dafb05557463f6247d
2016-09-14 | Merge "cosmetics,vp8: join some lines, fix table format" | James Zern
2016-09-13 | vp8 decoder: cast decoding_thread_count to int | Johann
    For some reason allocated_decoding_thread_count is signed, but decoding_thread_count is not.
    Cleans -Wextra/-Wsign-compare:
    comparison between signed and unsigned integer expressions
    Change-Id: Id0ada78100acff27c1c4ed7493c563d13c55cdcd
2016-09-09 | cosmetics,vp8: join some lines, fix table format | James Zern
    Change-Id: Idcf3b68f0e59bd74c9d332bbd4a7c1484ddb691a
2016-09-09 | vp8: Set the skin model to mode 1. | Marco
    This change was reverted before due to a Hangouts encode-time regression investigation. But since then this change has been cleared of causing any noticeable regression.
    This mode reduces some false detection, and uses the same model as in vp9.
    Change-Id: I9c82a748c5f601d0aca9f61ee218abfbd58c62bd
2016-09-08 | vp8: Remove TSAN warning around end of encode. | Alexander Potapenko
    TSAN warns when run in one pass and there is a recode loop.
    Change-Id: Ice2ecb2270f09ebd49efbd49c0e4f77d32e23c0f
2016-09-01 | vp8_cx_iface: quiet -Wshorten-64-to-32 warning | James Zern
    set_reference_and_update(): use the correct type for flags, vpx_enc_frame_flags_t
    Change-Id: I257da784537ff18686f6db8665f99af6ea6a86ba
2016-09-01 | get_cpu_count: quiet -Wshorten-64-to-32 warnings | James Zern
    sysconf returns a long; cast (unsigned) dwNumberOfProcessors to int for good measure
    Change-Id: I1f181d7bd9a060c0898db41f66a5065394afdc4e
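The narrowing that triggers -Wshorten-64-to-32 can be made explicit with a range check. This is a minimal sketch rather than the libvpx function: `get_cpu_count_sketch` is a hypothetical name, and it assumes a platform that provides `_SC_NPROCESSORS_ONLN` (Linux, macOS and the BSDs do).

```c
#include <limits.h>
#include <unistd.h>

/* Returns the number of online processors as an int, at least 1. */
static int get_cpu_count_sketch(void) {
  const long n = sysconf(_SC_NPROCESSORS_ONLN); /* sysconf returns long */
  if (n < 1) return 1;             /* error (-1) or unexpected value */
  if (n > INT_MAX) return INT_MAX; /* clamp before narrowing */
  return (int)n;                   /* explicit, range-checked cast */
}
```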
2016-09-02 | Merge changes from topic 'Wundef' | Johann Koenig
    * changes:
      Enable -Wundef by default
      Define VP8_TEMPORAL_ALT_REF to !CONFIG_REALTIME_ONLY
      Remove CONFIG_DEBUG guards from assert()
      Remove unused function vpx_de_mblock
      Fix -Wundef warning for OUTPUT_FPF
      Fix -Wundef warning for __SANITIZE_ADDRESS__
2016-09-01 | Merge "Fix formatting in internal stats for vp8 and vp9" | Yaowu Xu
2016-08-31 | Define VP8_TEMPORAL_ALT_REF to !CONFIG_REALTIME_ONLY | Johann
    Previously VP8_TEMPORAL_ALT_REF was only defined for non-realtime-only builds. However, its value was checked with #if, not #ifdef.
    Fixes -Wundef warnings.
    BUG=webm:1069
    Change-Id: If78d8731298f3f0d3662ffa25f973e7adaf67152
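The pattern in this commit can be illustrated in a few lines. In this sketch the value of `CONFIG_REALTIME_ONLY` is set by hand as an assumption (a real build takes it from the generated configuration header), and `temporal_alt_ref_enabled` is a hypothetical helper.

```c
/* Hypothetical build setting, defined here only for the sketch. */
#define CONFIG_REALTIME_ONLY 0

/* Always defined, as 0 or 1, so "#if VP8_TEMPORAL_ALT_REF" never
 * evaluates an undefined macro and -Wundef stays quiet. */
#define VP8_TEMPORAL_ALT_REF !CONFIG_REALTIME_ONLY

static int temporal_alt_ref_enabled(void) {
#if VP8_TEMPORAL_ALT_REF
  return 1;
#else
  return 0;
#endif
}
```

With the macro defined only in some builds, the same `#if` would silently treat the missing name as 0, which is what -Wundef reports.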
2016-08-31 | Remove CONFIG_DEBUG guards from assert() | Johann
    When 'NDEBUG' is set, assert() generates no code.
    Change-Id: Icf61cfc1a8f6e5f0770b3626d8c73ae968df1108
2016-08-31 | Fix -Wundef warning for OUTPUT_FPF | Johann
    BUG=webm:1069
    Change-Id: I3d13d07cf0934e6e262c8033bd77d7197d03ce21
2016-08-29 | Merge "vp8: Move loopfilter synchronization to end of encode_frame call." | Marco Paniconi
2016-08-26 | Merge changes I353da4a2,I423f2153 | James Zern
    * changes:
      vp8_decoder_create_threads: check sem/pthread returns
      vp8_create_decoder_instances: add missing setjmp
2016-08-26 | Merge "Remove halfpix specialization" | Johann Koenig
2016-08-25 | Fix formatting in internal stats for vp8 and vp9 | Sarah Parker
    This corrects a formatting error introduced in:
    I1e9d548ce445d29002f0c59ebfd3957a6f15e702
    where spaces were used as delimiters instead of tabs. The corresponding fix for vp10 is in Ica3d625d6672b3c47e0e208b45eede29b9004030.
    Change-Id: Ibc4eb8fd82e6b926ba259a679dc98557cadba9b1
2016-08-25 | vp8: Move loopfilter synchronization to end of encode_frame call. | Marco
    Allow loopfilter to continue until encode_frame is completed.
    Change-Id: I7bbccc3d409e263aab6a6ff24588d8b2a964a96e
2016-08-23 | vp8_decoder_create_threads: check sem/pthread returns | James Zern
    Change-Id: I353da4a2f988ca51d48d0ca91236e8cc0bb48ff5
2016-08-23 | vp8_create_decoder_instances: add missing setjmp | James Zern
    vp8_decoder_create_threads() has allocations that expect one is set.
    Change-Id: I423f2153a2969c88d48ba45cc9ead4a01443ce65
2016-08-23 | Remove halfpix specialization | Johann
    This function only exists as a shortcut to subpixel variance with predefined offsets: xoffset = 4 for horizontal, yoffset = 4 for vertical, and both for "hv".
    Removing this allows the existing optimizations for the variance functions to be called. Instead of having only sse2 optimizations, this gives sse2, ssse3, msa and neon.
    BUG=webm:1273
    Change-Id: Ieb407b423b91b87d33c4263c6a1ad5e673b0efd6
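Why an offset of 4 makes the specialization redundant can be seen in a one-dimensional scalar model. The sketch assumes eighth-pel offsets with two-tap weights `128 - 16 * offset` and `16 * offset` and a rounding shift of 7, and it models the removed shortcut as a rounded average; both function names are hypothetical.

```c
#include <stdint.h>

/* Two-tap bilinear blend for an eighth-pel offset in [0, 7].
 * Assumed weights: 128 - 16 * offset and 16 * offset (sum 128). */
static uint8_t bilinear_blend(uint8_t a, uint8_t b, int offset) {
  const int w1 = 16 * offset;
  const int w0 = 128 - w1;
  return (uint8_t)((a * w0 + b * w1 + 64) >> 7);
}

/* The half-pixel shortcut, modeled as a rounded average of neighbors. */
static uint8_t halfpix_average(uint8_t a, uint8_t b) {
  return (uint8_t)((a + b + 1) >> 1);
}
```

At offset 4 both weights are 64, so `(64 * a + 64 * b + 64) >> 7` reduces to `(a + b + 1) >> 1` and the general path gives the same pixels as the shortcut.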
2016-08-23 | vp8: fix decoder crash with invalid leading keyframes | James Zern
    Decoding the same invalid keyframe twice would result in a crash: the second time through, the decoder would be assumed to have been initialized because there was no resolution change. In this case the resolution was itself invalid (0x6), but vp8_peek_si() was only failing in the case of 0x0.
    invalid-vp80-00-comprehensive-018.ivf.2kf_0x6.ivf tests this case by duplicating the first keyframe and additionally adds a valid one to ensure decoding can resume without error.
    BUG=b/30593765
    Change-Id: If0859035908b7870d67a7f3f646b5a080252eb6d
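The difference between the two dimension checks fits in two predicates. These are hypothetical stand-ins, not the body of vp8_peek_si(): the first rejects only 0x0, as the message describes, and the second assumes the fix rejects any frame with a zero width or height.

```c
/* Check as described before the fix: only 0x0 is rejected. */
static int dims_rejected_before(unsigned int w, unsigned int h) {
  return (w | h) == 0;
}

/* Assumed shape of the fix: any zero dimension is rejected. */
static int dims_rejected_after(unsigned int w, unsigned int h) {
  return w == 0 || h == 0;
}
```

A 0x6 keyframe passes the first check and is caught by the second, while a valid size such as 176x144 is accepted by both.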
2016-08-22 | Revert "vp8: Move loopfilter synchronization to end of encode_frame call." | Marco Paniconi
    This reverts commit c2fe9acceda922ca1d9f0d6185b340560b93597a.
    This change breaks the Linux browser test in Chromium:
    https://build.chromium.org/p/chromium.webrtc/builders/Linux%20Tester
    Change-Id: I226782fad480c17a99ec6c785ad93cf4ab88f0ae
2016-08-16 | vp8: Move loopfilter synchronization to end of encode_frame call. | Marco
    Change-Id: I5bdfea7f51df1f1fa5d9c1597e96988acce6c2f2
2016-08-10 | Align thread entry point stack | Aleksey Vasenev
    _beginthreadex does not align the stack on a 16-byte boundary as expected by gcc.
    On x86 targets, the force_align_arg_pointer attribute may be applied to individual function definitions, generating an alternate prologue and epilogue that realigns the run-time stack if necessary. This supports mixing legacy codes that run with a 4-byte aligned stack with modern codes that keep a 16-byte stack for SSE compatibility.
    https://gcc.gnu.org/onlinedocs/gcc/x86-Function-Attributes.html
    Change-Id: Ie4e4ab32948c238fa87054d5664189972ca6708e
    Signed-off-by: Aleksey Vasenev <margtu-fivt@ya.ru>
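Applying the attribute to a thread entry point might look like the sketch below. The macro `THREAD_ENTRY_ATTR` and the function `worker_entry` are hypothetical names, and the attribute is only enabled for GCC-compatible compilers on x86, where it is documented; elsewhere the macro expands to nothing.

```c
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Realign the stack in the prologue of the annotated function. */
#define THREAD_ENTRY_ATTR __attribute__((force_align_arg_pointer))
#else
#define THREAD_ENTRY_ATTR
#endif

/* Hypothetical thread entry point. A caller that only guarantees 4-byte
 * stack alignment can still run code below that expects 16 bytes. */
static THREAD_ENTRY_ATTR unsigned int worker_entry(void *arg) {
  int *value = (int *)arg;
  *value += 1; /* placeholder for the real per-thread work */
  return 0;
}
```

Only the entry point needs the attribute: once its prologue has realigned the stack, functions called from it inherit the 16-byte alignment.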
2016-08-04 | Remove armv6 target | Johann
    Change-Id: I1fa81cc9cabf362a185fc3a53f1e58de533a41e5
2016-08-03 | Pad 'Left' when building under ASan | Johann
    The neon intrinsics are not able to load just the 4 values that are used. In vpx_dsp/arm/intrapred_neon.c:dc_4x4 it loads 8 values for both the 'above' and 'left' computations, but only uses the sum of the first 4 values.
    BUG=webm:1268
    Change-Id: I937113d7e3a21e25bebde3593de0446bf6b0115a
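The over-read that ASan flags can be modeled in scalar C. This sketch is not the NEON code: `dc_sum4_wide_load` is a hypothetical function that copies 8 bytes the way an 8-lane load would and sums only the first 4, so its caller must supply a buffer of at least 8 bytes.

```c
#include <stdint.h>
#include <string.h>

/* Loads 8 bytes, as an 8-lane vector load would, but only the first
 * 4 contribute to the sum. The caller must provide 8 readable bytes. */
static unsigned int dc_sum4_wide_load(const uint8_t *p) {
  uint8_t lanes[8];
  memcpy(lanes, p, 8); /* reads 4 bytes more than are used */
  return (unsigned int)(lanes[0] + lanes[1] + lanes[2] + lanes[3]);
}
```

Padding the 4-entry 'left' array to 8 bytes keeps the wide load inside the allocation, and the padding values never affect the result.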
2016-07-28 | vp8: Switch skin model to mode 0 to save some cycle. | JackyChen
    This change will speed up the vp8 encoder by 1.5% ~ 2% on Linux. Not much speed change on Mac.
    Change-Id: Id957f19ddd89805baa2af84c5027d52d9a48553f
2016-07-23 | vp8/decodeframe: fix signed/unsigned comparison | James Zern
    Quiets a Visual Studio warning.
    Change-Id: Ic7725616bc2cb837e6f79294d4fcff36b67af834
2016-07-23 | vp8/postproc.c: disable clang-format for RGB_TO_YUV | clang-format
    Change-Id: Id2a936301ec1e3d5648b4f8adbf4e6625002589d
2016-07-23 | Merge changes I0089e884,Icb0ecb9e | James Zern
    * changes:
      vp8/postproc: fix implicit float conversion
      blockiness_test: fix implicit float conversion