path: root/vp8/common/arm/neon
Age | Commit message | Author
2016-09-16 | Merge "Revert "Restore vp8_sixtap_predict4x4_neon"" | James Zern
2016-09-16 | Revert "Restore vp8_sixtap_predict4x4_neon" | Johann Koenig
This reverts commit d9dce2f48eed1368a44c368fa87a506bd89ffec5. Appears to be failing the SixtapPredict tests in some configurations and possibly test vectors as well. Change-Id: Ica6aa83ebac47d0a76e451846e7da67b1c17a7d7
2016-09-15 | Restore vp8_bilinear_predict4x4_neon | Johann
This function was removed when clang started introducing alignment hints which caused the 32 bit vld1_lane_u32/vst1_lane_u32 to fail: https://llvm.org/bugs/show_bug.cgi?id=24421 The load has been rendered safe with an implementation ~indiscernible performance-wise that uses _u8 and over-reads just a touch. It is still ~5x faster than C in the unaligned case and doing both filters. BUG=webm:892 BUG=webm:1273 Change-Id: Icf7167189391b46202f47233bb585c24c42bcc36
2016-09-15 | Restore vp8_sixtap_predict4x4_neon | Johann
This function was removed when clang started introducing alignment hints which caused the 32 bit vld1_lane_u32/vst1_lane_u32 to fail: https://llvm.org/bugs/show_bug.cgi?id=24421 The load has been rendered safe with an implementation ~indiscernible performance-wise that uses _u8 and over-reads just a touch. The store, when unaligned, has a version that is ~25% slower but safe when xoffset = 0 (second pass filter only). When the first pass filter (or both) are in play, the new version is almost identical in speed. Worst case performance (both filters, unaligned stores) is roughly 3-4x faster than C. BUG=webm:817 BUG=webm:1273 Change-Id: I1e490e94453e0872151fe0dafb05557463f6247d
2016-07-18 | prepend ++ instead of post in for loops. | Jim Bankoski
Applied the following regex: search for: (for.*\(.*;.*;) ([a-zA-Z_]*)\+\+\) replace with: \1 ++\2) This misses some for loops, e.g.: for (mb_col = 0; mb_col < oci->mb_cols; mb_col++, mi++) Change-Id: Icf5f6fb93cced0992e0bb71d2241780f7fb1f0a8
2016-07-15 | vp8: apply clang-format | clang-format
Change-Id: I7605b6678014a5426ceb45c27b54885e0c4e06ed
2016-05-06 | Remove sixtap/bilinear 4x4 neon implementations | Johann
These implementations rely on casting the pointers to load the data. Clang implemented optimizations which automatically add alignment hints to such loads. The 4x4 filters do not guarantee the necessary alignment so the resulting assembly is broken. https://llvm.org/bugs/show_bug.cgi?id=24421 BUG=webm:817 BUG=webm:892 Change-Id: I608885299f1f86ff83653b65e0e40d0ae87fb3fe
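The failure mode described above (a pointer cast that lets the compiler assume alignment) can be illustrated in portable C. This is only a sketch: it uses `memcpy` rather than the NEON `_u8` loads the later "Restore" commits describe, and the helper names `load_u32_unaligned` and `copy_row4` are hypothetical, not libvpx functions.

```c
#include <stdint.h>
#include <string.h>

/* Unsafe pattern, shown only as a comment for contrast: dereferencing a
 * cast pointer lets the compiler assume the address is 4-byte aligned.
 *
 *   uint32_t v = *(const uint32_t *)src;
 */

/* Safe pattern: memcpy makes no alignment assumption, and compilers
 * typically lower it to a single unaligned load where the target allows. */
static uint32_t load_u32_unaligned(const uint8_t *p) {
  uint32_t v;
  memcpy(&v, p, sizeof(v));
  return v;
}

/* Hypothetical helper: copy one 4-pixel row of a 4x4 block from an
 * arbitrarily aligned source to an arbitrarily aligned destination. */
static void copy_row4(const uint8_t *src, uint8_t *dst) {
  const uint32_t v = load_u32_unaligned(src);
  memcpy(dst, &v, sizeof(v));
}
```

Calling `copy_row4(buf + 1, out)` on a byte buffer copies bytes 1 through 4 regardless of how `buf` happens to be aligned.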
2015-09-30 | vp8: change build_intra_predictors_mbuv_s to use vpx_dsp. | Ronald S. Bultje
Change-Id: I936c2430c3c5b1e0ab5dec0a20110525e925b5e4
2015-09-30 | vp8: change build_intra_predictors_mby_s to use vpx_dsp. | Ronald S. Bultje
Change-Id: I2000820e0c04de2c975d370a0cf7145330289bb2
2015-08-18 | Rename vp8 loopfilter[_neon.c] | Johann
Avoid conflict with vpx_dsp version Change-Id: I041b1532a9276400a5547de8dfed1de43ad4e83d
2015-07-07 | Move sub pixel variance to vpx_dsp | Johann
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
2015-06-30 | loopfiltersimpleverticaledge_neon: quiet uninit var warnings | James Zern
take 2. localize the function parameter to actually remove the warning Change-Id: I23c02061b5e21b0b75bd33c26062d1e531df7b92
2015-06-25 | loopfiltersimpleverticaledge_neon: quiet uninit var warnings | James Zern
the vector used in vld*_lane_* should be initialized before use Change-Id: Idce95354737915f6fb4e6b5e8980a050e953036d
2015-06-25 | idct_dequant_0_2x_neon: quiet uninit var warnings | James Zern
the vector used in vld*_lane_* should be initialized before use Change-Id: I6b791088479fec3bc021ca75cc2af5adcc39d954
2015-06-23 | vp8_subpixelvariance_neon: right size coeff table | James Zern
only uint8 is required; each use only loads one value as a uint8. quiets a few type conversion warnings. Change-Id: I03dc0dc0eb01ac23a6e8673daa2b77c6c57bf1b0
2015-05-26 | Move variance functions to vpx_dsp | Johann
subpel functions will be moved in another patch. Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce
2015-05-07 | replace DECLARE_ALIGNED_ARRAY w/DECLARE_ALIGNED | James Zern
this macro was used inconsistently and only differs in behavior from DECLARE_ALIGNED when an alignment attribute is unavailable. this macro is used with calls to assembly, while generic c-code doesn't rely on it, so in a c-only build without an alignment attribute the code will function as expected. Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79
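For readers unfamiliar with this kind of macro, here is a minimal sketch of how an alignment-declaring macro is commonly written. The name `SKETCH_DECLARE_ALIGNED` and the helper functions are hypothetical; the real `DECLARE_ALIGNED` definition in libvpx may differ in its compiler checks. The final `#else` branch shows the case the message describes: with no alignment attribute available, the declaration simply carries no alignment guarantee.

```c
#include <stdint.h>

/* Hypothetical stand-in for an alignment macro: declare `val` of type `typ`
 * aligned to `n` bytes when the compiler offers a way to ask for it. */
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_DECLARE_ALIGNED(n, typ, val) typ val __attribute__((aligned(n)))
#elif defined(_MSC_VER)
#define SKETCH_DECLARE_ALIGNED(n, typ, val) __declspec(align(n)) typ val
#else
/* No alignment attribute known: plain declaration, no guarantee. */
#define SKETCH_DECLARE_ALIGNED(n, typ, val) typ val
#endif

/* Returns 1 when p is a multiple of n bytes. */
static int is_aligned(const void *p, uintptr_t n) {
  return ((uintptr_t)p % n) == 0;
}

/* Declares a 16-byte aligned scratch buffer and reports its alignment. */
static int demo_buffer_is_aligned(void) {
  SKETCH_DECLARE_ALIGNED(16, uint8_t, tmp[64]);
  tmp[0] = 0;
  return is_aligned(tmp, 16);
}
```

With GCC, clang, or MSVC, `demo_buffer_is_aligned()` is expected to return 1; in the fallback branch the result depends on where the compiler happens to place the buffer.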
2015-05-06 | Move shared SAD code to vpx_dsp | Johann
Create a new component, vpx_dsp, for code that can be shared between codecs. Move the SAD code into the component. This reduces the size of vpxenc/dec by 36k on x86_64 builds. Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
2015-02-03 | Use correct buffer size in vp8 subpixel variance | Johann
In vp8_sub_pixel_variance8x8_neon the temp2 buffer is only initialized to kHeight8 * kWidth8. However, in the case that xoffset != 0 and yoffset == 0, var_filter_block2d_bil_w8 is called with output_width kHeight8PlusOne. Thanks to cmugurel for diagnosing and yulius for the patch. Change-Id: Ib71ffd96ffad963c92b8b7ca23f303942785b8e0 https://code.google.com/p/webrtc/issues/detail?id=4190
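The sizing problem above comes down to simple arithmetic, sketched here. The constant names come from the commit message, but their values (8, 8 and 9) are assumptions inferred from the names, and `elems_needed` and `buffer_fits` are hypothetical helpers, not libvpx code.

```c
#include <stddef.h>

/* Assumed values: an 8x8 block, plus one extra row for the filter pass. */
enum { kWidth8 = 8, kHeight8 = 8, kHeight8PlusOne = 9 };

/* Number of elements a filter pass writes for `rows` rows of `width`. */
static size_t elems_needed(size_t width, size_t rows) {
  return width * rows;
}

/* Returns 1 when a buffer of `capacity` elements can hold the output. */
static int buffer_fits(size_t capacity, size_t width, size_t rows) {
  return elems_needed(width, rows) <= capacity;
}
```

Under these assumptions a buffer of `kHeight8 * kWidth8` (64) elements cannot hold `kHeight8PlusOne * kWidth8` (72) elements of output, which matches the overrun the commit describes.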
2014-09-25 | Clarify GCC version check | Johann
The version check was incorrectly matching some versions of clang which reported as gcc 4.2 Change-Id: I686d3576e71883fe1463206b56ab5e2aa9bb68a8
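Clang defines `__GNUC__` and `__GNUC_MINOR__` as 4 and 2 for compatibility, so a check on those macros alone also matches clang. The sketch below shows one way to exclude it. The "4.6 or older" threshold, the macro `SKETCH_AFFECTED_GCC`, and the function `is_affected_gcc` are illustrative assumptions, not the check libvpx actually uses.

```c
/* Preprocessor form: only genuine GCC in the assumed affected range
 * matches, because clang is excluded via __clang__. */
#if defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ == 4 && __GNUC_MINOR__ <= 6)
#define SKETCH_AFFECTED_GCC 1
#else
#define SKETCH_AFFECTED_GCC 0
#endif

/* The same decision as a plain function so the logic can be exercised
 * with arbitrary version numbers. */
static int is_affected_gcc(int gnuc_major, int gnuc_minor, int is_clang) {
  if (is_clang) return 0; /* clang reporting itself as gcc 4.2 is skipped */
  return gnuc_major == 4 && gnuc_minor <= 6;
}
```

For example, `is_affected_gcc(4, 2, 1)` models clang and returns 0, while `is_affected_gcc(4, 2, 0)` models a genuine GCC 4.2 and returns 1.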
2014-09-14 | vp8/vp9: neon: msvc: move the 'ifdef _MSC_VER' bit to vpx_ports/mem.h. | Jia Jia
fix compiling warning. Change-Id: If8706a9046436f704c597e4275a6810c76ba7daa
2014-09-04 | bilinearpredict_neon: fix type conversion warnings | James Zern
make bifilter4_coeff[][] uint8_t, no values exceed this range and they're loaded with vdup_n_u8(). Change-Id: I921983e9edd828d29820e40ac30a7801dbe0fb4f
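A scalar sketch of why `uint8_t` is wide enough for this table: each row holds two taps that sum to 128, so no entry exceeds 128. The tap values below are the VP8 bilinear filter taps as I recall them and should be treated as an assumption; the names `sketch_bifilter_coeff` and `bilinear_tap` are hypothetical, and the scalar rounding stands in for the NEON arithmetic.

```c
#include <stdint.h>

/* Two taps per sub-pixel offset; every value fits in uint8_t. */
static const uint8_t sketch_bifilter_coeff[8][2] = {
  { 128, 0 }, { 112, 16 }, { 96, 32 }, { 80, 48 },
  { 64, 64 }, { 48, 80 },  { 32, 96 }, { 16, 112 }
};

/* Blend two neighbouring pixels for the given offset (0..7), rounding to
 * nearest with a 7-bit shift because the taps sum to 128. */
static uint8_t bilinear_tap(uint8_t a, uint8_t b, int offset) {
  const uint8_t *f = sketch_bifilter_coeff[offset & 7];
  return (uint8_t)((a * f[0] + b * f[1] + 64) >> 7);
}
```

Offset 0 returns the first pixel unchanged, and offset 4 weights both pixels equally.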
2014-09-04 | Merge "arm: Fix building vp8_subpixelvariance_neon.c with MSVC" | James Zern
2014-09-04 | Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 10"" | Scott LaVarnway
This reverts commit 677fb5123e0ece1d6c30be9d0282e1e1f594a56d Compiles with 4.6. Change-Id: I7f87048911b6bc28a61741d95501fa45ee97b819
2014-09-04 | arm: Fix building vp8_subpixelvariance_neon.c with MSVC | Martin Storsjo
Use the right return values - vget_low_s64 returns int64x1_t, not a normal int64_t. Also make __builtin_prefetch a no-op on MSVC for this file. Change-Id: I4d2fce01d0ba106b98d3d53b137803119c2c2c08
2014-09-03 | Neon version of vp8_build_intra_predictors_mby_s() and | Scott LaVarnway
vp8_build_intra_predictors_mbuv_s(). This patch replaces the assembly version with an intrinsic version. On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~2.6%. Change-Id: I9ef65bad929450c0215253fdae1c16c8b4a8f26f
2014-09-03 | VP8 for ARMv8 by using NEON intrinsics 17 | Scott LaVarnway
Add vp8_subpixelvariance_neon.c
- vp8_sub_pixel_variance16x16_neon_func
- vp8_variance_halfpixvar16x16_h_neon
- vp8_variance_halfpixvar16x16_v_neon
- vp8_variance_halfpixvar16x16_hv_neon
- vp8_sub_pixel_variance8x8_neon
Change-Id: I3e5d85b2eafc26be0eef6a777789b80e4579257b
Signed-off-by: James Yu <james.yu@linaro.org>
2014-09-03 | Merge "Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb."" | Johann
2014-08-29 | Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 08"" | Scott LaVarnway
This reverts commit 928ff03889dadc3f63883553b443c08e625b4885 Compiles with 4.6 now. Change-Id: Ib455da1098bb0e0623248be07579882a425fcbd1
2014-08-20 | Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb." | Johann
This reverts commit 920f803f2e2f41395311f96fec1b4a0c1b2b631a Change-Id: I410d9036214a1b18427cca70b4bc6d8239740737
2014-07-11 | vp8_bilinear_predict4x4_neon: init src vectors | James Zern
quiets uninitialized warnings on the first load. Change-Id: I58a5af337087d96b4eaea8991a0f85c4ba58aebe
2014-07-10 | vp8_sixtap_predict4x4_neon: init src vectors | James Zern
quiets uninitialized warnings on the first load. Change-Id: Ied9b03928537a9ed2cd414b9e8a0be00191b0f32
2014-05-14 | Remove intermediate step in vp8_dequantize_b | Johann
With the intrinsics it is no longer necessary to have a stub/helper function. Change-Id: I3695961c3c94f1bb750d3b7b29716e509ebba482
2014-05-13 | Revert "VP8 for ARMv8 by using NEON intrinsics 06" | Johann
This reverts commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb. This exposes a bug in gcc 4.9 regarding register allocation. Will reland when 4.9 is fixed. Change-Id: I2d8a04e4edde93719280e41550f4c0765608ec4d
2014-05-07 | Merge "Revert "VP8 for ARMv8 by using NEON intrinsics 10"" | Johann
2014-05-07 | Merge "arm: Use a correct neon vector type for 64 bit integers" | Johann
2014-05-07 | arm: Add a no-op define of __builtin_prefetch for MSVC | Martin Storsjo
Both GCC and RVCT/ARMCC support __builtin_prefetch, but MSVC doesn't. Change-Id: I44e1eecead61bc88d8fdfd3fef03d76d4f5afe08
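A no-op define of this kind could look like the sketch below. The exact guard used in libvpx is not shown in this log, so the `_MSC_VER` condition is an assumption, and `sum_rows` is a hypothetical function that only demonstrates a call site compiling the same way with or without prefetch support.

```c
#include <stdint.h>

/* On MSVC (assumed guard) the builtin does not exist, so define it away.
 * GCC-compatible compilers keep the real builtin. */
#if defined(_MSC_VER) && !defined(__clang__)
#define __builtin_prefetch(x) ((void)0)
#endif

/* Hypothetical example: sum the pixels of a block, hinting the next row
 * into cache before processing the current one. */
static uint32_t sum_rows(const uint8_t *src, int stride, int rows, int width) {
  uint32_t sum = 0;
  int r, c;
  for (r = 0; r < rows; ++r) {
    __builtin_prefetch(src + stride); /* hint only, never dereferenced */
    for (c = 0; c < width; ++c) sum += src[c];
    src += stride;
  }
  return sum;
}
```

The prefetch is purely a performance hint, so dropping it on compilers that lack it does not change the computed result.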
2014-05-07 | arm: Use a correct neon vector type for 64 bit integers | Martin Storsjo
This fixes building with MSVC. Change-Id: I763ba8855c8083d82c8b477d3a297e310e93a335
2014-05-06 | Revert "VP8 for ARMv8 by using NEON intrinsics 10" | Johann
This reverts commit c500fc22c1bb2a3ae5c318bfb806f7e9bd57ce25
There is an issue with gcc 4.6 in the Android NDK:
loopfiltersimpleverticaledge_neon.c: In function 'vp8_loop_filter_bvs_neon':
loopfiltersimpleverticaledge_neon.c:176:1: error: insn does not satisfy its constraints:
Change-Id: I95b6509d12f075890308914cc691b813d2e5cd9f
2014-05-06 | Revert "VP8 for ARMv8 by using NEON intrinsics 08" | Johann
This reverts commit a5d79f43b963ced59b462206faf3b7857bdeff7b
There is an issue with gcc 4.6 in the Android NDK:
loopfilter_neon.c: In function 'vp8_loop_filter_vertical_edge_y_neon':
loopfilter_neon.c:394:1: error: insn does not satisfy its constraints:
Change-Id: I2b8c6ee3fa595c152ac3a5c08dd79bd9770c7b52
2014-05-05 | Merge "VP8 for ARMv8 by using NEON intrinsics 16" | Johann
2014-05-05 | Merge "VP8 for ARMv8 by using NEON intrinsics 15" | Johann
2014-05-05 | Merge "VP8 for ARMv8 by using NEON intrinsics 14" | Johann
2014-05-05 | Merge changes Iaf7d6b0a,Iece0bf56 | Johann
* changes:
  Use INLINE and include vpx_config.h instead of plain 'inline'
  Use vreinterpret instead of casting neon vector types
2014-05-04 | Use INLINE and include vpx_config.h instead of plain 'inline' | Martin Storsjo
This fixes compilation with MSVC. Change-Id: Iaf7d6b0a0134968a6addf315fde6d852f298db8c
2014-05-04 | Use vreinterpret instead of casting neon vector types | Martin Storsjo
MSVC doesn't support casting neon vector types but requires using vreinterpret. Change-Id: Iece0bf5632567efd7f37f527abea38afeab4926d
2014-05-03 | VP8 for ARMv8 by using NEON intrinsics 16 | James Yu
Add variance_neon.c
- vp8_variance16x16_neon
- vp8_variance16x8_neon
- vp8_variance8x16_neon
- vp8_variance8x8_neon
Change-Id: Idfb9c96134a1c6a696a98ce68b4f7ed593a00660
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 | VP8 for ARMv8 by using NEON intrinsics 15 | James Yu
Add idct_dequant_0_2x_neon.c
- idct_dequant_0_2x_neon
Change-Id: I8e129172ef1b2517cf72ff267788921f1a792586
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 | VP8 for ARMv8 by using NEON intrinsics 14 | James Yu
Add sixtappredict_neon.c
- vp8_sixtap_predict16x16_neon
- vp8_sixtap_predict8x8_neon
- vp8_sixtap_predict8x4_neon
- vp8_sixtap_predict4x4_neon
Change-Id: I3b02fce48ae2e6c6099041ba5ddd7b090f1463b9
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 | VP8 for ARMv8 by using NEON intrinsics 13 | James Yu
Add shortidct4x4llm_neon.c
- vp8_short_idct4x4llm_neon
Change-Id: I5a734bbffca8dacf8633c2b0ff07b98aa2f438ba
Signed-off-by: James Yu <james.yu@linaro.org>