path: root/vp8/common/arm/neon
Age | Commit message | Author
2019-01-07 | vp8 idct: remove return | Johann
Change-Id: Ib1648e1f6559e65ddf11cb54266c7eeff37a6ea6
2019-01-07 | vp8 idct dequant: resolve missing declarations | Johann
BUG=webm:1584 Change-Id: Iecd2a0154c523fa61349c456befdf6c37d980efc
2019-01-07 | arm neon: resolve missing declarations | Johann
BUG=webm:1584 Change-Id: I2dcf39f2327b72b58be72c27f952ea781a790dd3
2018-10-30 | clang-tidy: fix vp8/common parameters | Johann
Match function definitions to declarations BUG=webm:1444 Change-Id: Ib96d3b735eaf81cece5406c89cc5156bc2cde462
2017-05-17 | use memcpy for unaligned neon stores | Johann
Advise the compiler that the store is eventually going to a uint8_t buffer. This helps avoid getting alignment hints which would cause the memory access to fail. Originally added as a workaround for clang: https://bugs.llvm.org//show_bug.cgi?id=24421 Change-Id: Ie9854b777cfb2f4baaee66764f0e51dcb094d51e
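The idea in this commit message can be sketched in portable C. This is a minimal illustration, not the code from the commit: the helper names `store_u32_unaligned` and `load_u32_unaligned` are hypothetical, and a plain `uint32_t` stands in for the 32-bit NEON lane that the real code would store.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write a 32-bit value (four pixels) to a byte buffer
 * that carries no alignment guarantee. Casting dst to uint32_t * would let
 * the compiler assume 4-byte alignment; memcpy makes no such promise. */
static void store_u32_unaligned(uint8_t *dst, uint32_t value) {
  memcpy(dst, &value, sizeof(value));
}

/* Matching hypothetical load, shown for symmetry. */
static uint32_t load_u32_unaligned(const uint8_t *src) {
  uint32_t value;
  memcpy(&value, src, sizeof(value));
  return value;
}
```

Compilers typically lower a fixed-size `memcpy` like this to a single unaligned store, so the safer form usually costs little.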
2016-09-30 | *idct*_neon.c: add missing rtcd include | James Zern
+ correct declarations as necessary BUG=webm:1294 Change-Id: I719602df9a56e79188a78e7f8b31257c6d3cc11d
2016-09-26 | Merge "Un-Revert "Restore vp8_sixtap_predict4x4_neon"" | Johann Koenig
2016-09-23 | Use shifted value for sinpi8sqrt2 | Johann
The value 35468 changes sign when stored in int16_t:
  implicit conversion from 'int' to 'int16_t' (aka 'short') changes value from 35468 to -30068
This negation requires adding back the original value to compensate. Shifting the value keeps the value positive and saves a post-vqdmulh shift. This technique is used in webp and idct_dequant_full_2x_neon
BUG=b/28027557 Change-Id: I0c5ce09bea170fe08061856c2af6f841a557e0c3
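The arithmetic behind this change can be checked with a scalar model. The sketch below is an illustration under assumptions, not the libvpx code: `sqdmulh_s16` is a hypothetical scalar stand-in for one lane of `vqdmulh_s16`, the two `mul_sinpi8sqrt2_*` helpers are invented names, and the code assumes right shifts of negative values are arithmetic (true of the usual GCC and clang targets).

```c
#include <stdint.h>

/* Scalar model of one lane of vqdmulh_s16: saturating doubling multiply,
 * keeping the high 16 bits. (2 * a * b) >> 16 equals (a * b) >> 15. */
static int16_t sqdmulh_s16(int16_t a, int16_t b) {
  if (a == INT16_MIN && b == INT16_MIN) return INT16_MAX; /* only saturating case */
  return (int16_t)(((int32_t)a * b) >> 15);
}

/* Old approach: 35468 wraps to -30068 as int16_t (35468 - 65536), so the
 * product is shifted down once more and x is added back to compensate. */
static int16_t mul_sinpi8sqrt2_wrapped(int16_t x) {
  const int16_t wrapped = (int16_t)(35468 - 65536); /* -30068 */
  return (int16_t)((sqdmulh_s16(x, wrapped) >> 1) + x);
}

/* New approach: store 35468 >> 1 = 17734, which fits in int16_t. The
 * doubling inside vqdmulh restores the factor, so no extra shift or add. */
static int16_t mul_sinpi8sqrt2_shifted(int16_t x) {
  const int16_t shifted = 35468 >> 1; /* 17734 */
  return sqdmulh_s16(x, shifted);
}
```

Both helpers compute floor(x * 35468 / 65536), so they should agree for every 16-bit input; the shifted form simply gets there in one operation.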
2016-09-23 | Un-Revert "Restore vp8_sixtap_predict4x4_neon" | Johann
This restores d9dce2f48eed1368a44c368fa87a506bd89ffec5 Switched to using signed shift-and-narrow. Instead of saturating negative results to 0, it was saturating them to 255. BUG=webm:817 BUG=webm:1273 Change-Id: I571095336aa4182e3288b17924fcaaece42b0a49
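The saturation difference described here can be modelled in scalar C. This is a sketch under assumptions: the helper names are hypothetical, the shift of 7 assumes filter taps that sum to 128, and the intrinsics named in the comments (`vqrshrn_n_u16`, `vqrshrun_n_s16`) are a guess at the instruction families involved rather than something the log states.

```c
#include <stdint.h>

enum { kFilterShift = 7 }; /* assumption: filter taps sum to 128 */

/* Unsigned rounding shift-and-narrow (in the style of vqrshrn_n_u16): the
 * 16-bit lane is read as unsigned, so a negative sum looks like a huge
 * positive value and saturates to 255. */
static uint8_t narrow_unsigned(int16_t sum) {
  const uint16_t u = (uint16_t)sum;
  const uint32_t r = ((uint32_t)u + (1u << (kFilterShift - 1))) >> kFilterShift;
  return r > 255 ? 255 : (uint8_t)r;
}

/* Signed-input rounding shift-and-narrow to unsigned (in the style of
 * vqrshrun_n_s16): a negative sum clamps to 0, which is what a pixel
 * filter wants. */
static uint8_t narrow_signed(int16_t sum) {
  const int32_t r = ((int32_t)sum + (1 << (kFilterShift - 1))) / (1 << kFilterShift);
  if (sum < -(1 << (kFilterShift - 1))) return 0;
  return r > 255 ? 255 : (uint8_t)r;
}
```

For a filter sum of -200 the unsigned model returns 255 while the signed model returns 0; for non-negative sums the two agree.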
2016-09-16 | Merge "Revert "Restore vp8_sixtap_predict4x4_neon"" | James Zern
2016-09-16 | Revert "Restore vp8_sixtap_predict4x4_neon" | Johann Koenig
This reverts commit d9dce2f48eed1368a44c368fa87a506bd89ffec5. Appears to be failing the SixtapPredict tests in some configurations and possibly test vectors as well. Change-Id: Ica6aa83ebac47d0a76e451846e7da67b1c17a7d7
2016-09-15 | Restore vp8_bilinear_predict4x4_neon | Johann
This function was removed when clang started introducing alignment hints which caused the 32 bit vld1_lane_u32/vst1_lane_u32 to fail: https://llvm.org/bugs/show_bug.cgi?id=24421 The load has been rendered safe with an implementation ~indiscernible performance-wise that uses _u8 and over-reads just a touch. It is still ~5x faster than C in the unaligned case and doing both filters. BUG=webm:892 BUG=webm:1273 Change-Id: Icf7167189391b46202f47233bb585c24c42bcc36
2016-09-15 | Restore vp8_sixtap_predict4x4_neon | Johann
This function was removed when clang started introducing alignment hints which caused the 32 bit vld1_lane_u32/vst1_lane_u32 to fail: https://llvm.org/bugs/show_bug.cgi?id=24421 The load has been rendered safe with an implementation ~indiscernible performance-wise that uses _u8 and over-reads just a touch. The store, when unaligned, has a version that is ~25% slower but safe when xoffset = 0 (second pass filter only). When the first pass filter (or both) are in play, the new version is almost identical in speed. Worst case performance (both filters, unaligned stores) is roughly 3-4x faster than C. BUG=webm:817 BUG=webm:1273 Change-Id: I1e490e94453e0872151fe0dafb05557463f6247d
2016-07-18 | prepend ++ instead of post in for loops. | Jim Bankoski
Applied the following regex:
  search for: (for.*\(.*;.*;) ([a-zA-Z_]*)\+\+\)
  replace with: \1 ++\2)
This misses some for loops, e.g.:
  for (mb_col = 0; mb_col < oci->mb_cols; mb_col++, mi++)
Change-Id: Icf5f6fb93cced0992e0bb71d2241780f7fb1f0a8
2016-07-15 | vp8: apply clang-format | clang-format
Change-Id: I7605b6678014a5426ceb45c27b54885e0c4e06ed
2016-05-06 | Remove sixtap/bilinear 4x4 neon implementations | Johann
These implementations rely on casting the pointers to load the data. Clang implemented optimizations which automatically add alignment hints to such loads. The 4x4 filters do not guarantee the necessary alignment so the resulting assembly is broken. https://llvm.org/bugs/show_bug.cgi?id=24421 BUG=webm:817 BUG=webm:892 Change-Id: I608885299f1f86ff83653b65e0e40d0ae87fb3fe
2015-09-30 | vp8: change build_intra_predictors_mbuv_s to use vpx_dsp. | Ronald S. Bultje
Change-Id: I936c2430c3c5b1e0ab5dec0a20110525e925b5e4
2015-09-30 | vp8: change build_intra_predictors_mby_s to use vpx_dsp. | Ronald S. Bultje
Change-Id: I2000820e0c04de2c975d370a0cf7145330289bb2
2015-08-18 | Rename vp8 loopfilter[_neon.c] | Johann
Avoid conflict with vpx_dsp version Change-Id: I041b1532a9276400a5547de8dfed1de43ad4e83d
2015-07-07 | Move sub pixel variance to vpx_dsp | Johann
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
2015-06-30 | loopfiltersimpleverticaledge_neon: quiet uninit var warnings | James Zern
take 2. localize the function parameter to actually remove the warning Change-Id: I23c02061b5e21b0b75bd33c26062d1e531df7b92
2015-06-25 | loopfiltersimpleverticaledge_neon: quiet uninit var warnings | James Zern
the vector used in vld*_lane_* should be initialized before use Change-Id: Idce95354737915f6fb4e6b5e8980a050e953036d
2015-06-25 | idct_dequant_0_2x_neon: quiet uninit var warnings | James Zern
the vector used in vld*_lane_* should be initialized before use Change-Id: I6b791088479fec3bc021ca75cc2af5adcc39d954
2015-06-23 | vp8_subpixelvariance_neon: right size coeff table | James Zern
only uint8 is required; each use only loads one value as a uint8 quiets a few type conversion warnings Change-Id: I03dc0dc0eb01ac23a6e8673daa2b77c6c57bf1b0
2015-05-26 | Move variance functions to vpx_dsp | Johann
subpel functions will be moved in another patch. Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce
2015-05-07 | replace DECLARE_ALIGNED_ARRAY w/DECLARE_ALIGNED | James Zern
this macro was used inconsistently and only differs in behavior from DECLARE_ALIGNED when an alignment attribute is unavailable. this macro is used with calls to assembly, while generic c-code doesn't rely on it, so in a c-only build without an alignment attribute the code will function as expected. Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79
2015-05-06 | Move shared SAD code to vpx_dsp | Johann
Create a new component, vpx_dsp, for code that can be shared between codecs. Move the SAD code into the component. This reduces the size of vpxenc/dec by 36k on x86_64 builds. Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
2015-02-03 | Use correct buffer size in vp8 subpixel variance | Johann
In vp8_sub_pixel_variance8x8_neon the temp2 buffer is only initialized to kHeight8 * kWidth8. However, in the case that xoffset != 0 and yoffset == 0, var_filter_block2d_bil_w8 is called with output_width kHeight8PlusOne. Thanks to cmugurel for diagnosing and yulius for the patch. Change-Id: Ib71ffd96ffad963c92b8b7ca23f303942785b8e0 https://code.google.com/p/webrtc/issues/detail?id=4190
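The sizing problem described here can be sketched as follows. This is a simplified illustration, not the libvpx code: `horizontal_pass` is a hypothetical stand-in for the first filter pass (it only copies rows), `kTempSize` is an invented name, and the values 8, 8 and 9 for `kWidth8`, `kHeight8` and `kHeight8PlusOne` are assumed from the names.

```c
#include <stdint.h>
#include <string.h>

enum { kWidth8 = 8, kHeight8 = 8, kHeight8PlusOne = 9 }; /* assumed values */

/* Size the scratch buffer for the largest caller, not the common one:
 * one path asks the first pass for kHeight8PlusOne rows. */
enum { kTempSize = kHeight8PlusOne * kWidth8 };

/* Hypothetical stand-in for the first pass: writes `rows` rows of kWidth8
 * pixels into dst and returns the number of bytes written. */
static size_t horizontal_pass(const uint8_t *src, int src_stride, uint8_t *dst,
                              int rows) {
  int r;
  for (r = 0; r < rows; ++r) {
    memcpy(dst + r * kWidth8, src + r * src_stride, kWidth8);
  }
  return (size_t)rows * kWidth8;
}
```

With a buffer of only `kHeight8 * kWidth8` bytes, a call that requests `kHeight8PlusOne` rows would write one row past the end.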
2014-09-25 | Clarify GCC version check | Johann
The version check was incorrectly matching some versions of clang which reported as gcc 4.2 Change-Id: I686d3576e71883fe1463206b56ab5e2aa9bb68a8
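The log does not show the exact condition that was changed, so the following is only a sketch of the general pattern: clang also defines `__GNUC__` and `__GNUC_MINOR__` (reporting 4.2), so a check aimed at genuine GCC has to exclude it explicitly. All macro names below are hypothetical, and the 4.7 threshold is an arbitrary example.

```c
/* Clang defines __GNUC__ too, so test for it before trusting the version. */
#if defined(__GNUC__) && !defined(__clang__)
#define IS_REAL_GCC 1 /* hypothetical macro name */
#define REAL_GCC_VERSION (__GNUC__ * 100 + __GNUC_MINOR__)
#else
#define IS_REAL_GCC 0
#define REAL_GCC_VERSION 0
#endif

/* Example use: gate a workaround on genuine GCC older than 4.7. */
#if IS_REAL_GCC && REAL_GCC_VERSION < 407
#define NEEDS_OLD_GCC_WORKAROUND 1
#else
#define NEEDS_OLD_GCC_WORKAROUND 0
#endif
```

Without the `__clang__` test, a condition such as "GCC older than 4.7" would also fire for every clang build.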
2014-09-14 | vp8/vp9: neon: msvc: move the 'ifdef _MSC_VER' bit to vpx_ports/mem.h. | Jia Jia
fix compiling warning. Change-Id: If8706a9046436f704c597e4275a6810c76ba7daa
2014-09-04 | bilinearpredict_neon: fix type conversion warnings | James Zern
make bifilter4_coeff[][] uint8_t, no values exceed this range and they're loaded with vdup_n_u8(). Change-Id: I921983e9edd828d29820e40ac30a7801dbe0fb4f
2014-09-04 | Merge "arm: Fix building vp8_subpixelvariance_neon.c with MSVC" | James Zern
2014-09-04 | Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 10"" | Scott LaVarnway
This reverts commit 677fb5123e0ece1d6c30be9d0282e1e1f594a56d Compiles with 4.6. Change-Id: I7f87048911b6bc28a61741d95501fa45ee97b819
2014-09-04 | arm: Fix building vp8_subpixelvariance_neon.c with MSVC | Martin Storsjo
Use the right return values - vget_low_s64 returns int64x1_t, not a normal int64_t. Also make __builtin_prefetch a no-op on MSVC for this file. Change-Id: I4d2fce01d0ba106b98d3d53b137803119c2c2c08
2014-09-03 | Neon version of vp8_build_intra_predictors_mby_s() and vp8_build_intra_predictors_mbuv_s(). | Scott LaVarnway
This patch replaces the assembly version with an intrinsic version. On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~2.6%. Change-Id: I9ef65bad929450c0215253fdae1c16c8b4a8f26f
2014-09-03 | VP8 for ARMv8 by using NEON intrinsics 17 | Scott LaVarnway
Add vp8_subpixelvariance_neon.c
- vp8_sub_pixel_variance16x16_neon_func
- vp8_variance_halfpixvar16x16_h_neon
- vp8_variance_halfpixvar16x16_v_neon
- vp8_variance_halfpixvar16x16_hv_neon
- vp8_sub_pixel_variance8x8_neon
Change-Id: I3e5d85b2eafc26be0eef6a777789b80e4579257b Signed-off-by: James Yu <james.yu@linaro.org>
2014-09-03 | Merge "Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb."" | Johann
2014-08-29 | Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 08"" | Scott LaVarnway
This reverts commit 928ff03889dadc3f63883553b443c08e625b4885 Compiles with 4.6 now. Change-Id: Ib455da1098bb0e0623248be07579882a425fcbd1
2014-08-20 | Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb." | Johann
This reverts commit 920f803f2e2f41395311f96fec1b4a0c1b2b631a Change-Id: I410d9036214a1b18427cca70b4bc6d8239740737
2014-07-11 | vp8_bilinear_predict4x4_neon: init src vectors | James Zern
quiets uninitialized warnings on the first load. Change-Id: I58a5af337087d96b4eaea8991a0f85c4ba58aebe
2014-07-10 | vp8_sixtap_predict4x4_neon: init src vectors | James Zern
quiets uninitialized warnings on the first load. Change-Id: Ied9b03928537a9ed2cd414b9e8a0be00191b0f32
2014-05-14 | Remove intermediate step in vp8_dequantize_b | Johann
With the intrinsics it is no longer necessary to have a stub/helper function. Change-Id: I3695961c3c94f1bb750d3b7b29716e509ebba482
2014-05-13 | Revert "VP8 for ARMv8 by using NEON intrinsics 06" | Johann
This reverts commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb. This exposes a bug in gcc 4.9 regarding register allocation. Will reland when 4.9 is fixed. Change-Id: I2d8a04e4edde93719280e41550f4c0765608ec4d
2014-05-07 | Merge "Revert "VP8 for ARMv8 by using NEON intrinsics 10"" | Johann
2014-05-07 | Merge "arm: Use a correct neon vector type for 64 bit integers" | Johann
2014-05-07 | arm: Add a no-op define of __builtin_prefetch for MSVC | Martin Storsjo
Both GCC and RVCT/ARMCC support __builtin_prefetch, but MSVC doesn't. Change-Id: I44e1eecead61bc88d8fdfd3fef03d76d4f5afe08
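A no-op define of this kind might look like the sketch below. The `#if` condition is an assumption about how the compiler is detected, and `sum_row` is a hypothetical caller added only to show that the call site compiles unchanged on every compiler.

```c
#include <stdint.h>

/* MSVC has no __builtin_prefetch; define it away so shared code compiles. */
#if defined(_MSC_VER)
#define __builtin_prefetch(x)
#endif

/* Hypothetical caller: sums one row while hinting the next row into cache. */
static uint32_t sum_row(const uint8_t *row, int stride, int width) {
  uint32_t sum = 0;
  int i;
  __builtin_prefetch(row + stride); /* a hint only; safe to drop */
  for (i = 0; i < width; ++i) sum += row[i];
  return sum;
}
```

Because a prefetch is only a performance hint, dropping it changes timing but not results.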
2014-05-07 | arm: Use a correct neon vector type for 64 bit integers | Martin Storsjo
This fixes building with MSVC. Change-Id: I763ba8855c8083d82c8b477d3a297e310e93a335
2014-05-06 | Revert "VP8 for ARMv8 by using NEON intrinsics 10" | Johann
This reverts commit c500fc22c1bb2a3ae5c318bfb806f7e9bd57ce25
There is an issue with gcc 4.6 in the Android NDK:
  loopfiltersimpleverticaledge_neon.c: In function 'vp8_loop_filter_bvs_neon':
  loopfiltersimpleverticaledge_neon.c:176:1: error: insn does not satisfy its constraints:
Change-Id: I95b6509d12f075890308914cc691b813d2e5cd9f
2014-05-06 | Revert "VP8 for ARMv8 by using NEON intrinsics 08" | Johann
This reverts commit a5d79f43b963ced59b462206faf3b7857bdeff7b
There is an issue with gcc 4.6 in the Android NDK:
  loopfilter_neon.c: In function 'vp8_loop_filter_vertical_edge_y_neon':
  loopfilter_neon.c:394:1: error: insn does not satisfy its constraints:
Change-Id: I2b8c6ee3fa595c152ac3a5c08dd79bd9770c7b52
2014-05-05 | Merge "VP8 for ARMv8 by using NEON intrinsics 16" | Johann