path: root/vp8/common/arm
Age  Commit message  Author
2016-09-16  Merge "Revert "Restore vp8_sixtap_predict4x4_neon""  (James Zern)
2016-09-16  Revert "Restore vp8_sixtap_predict4x4_neon"  (Johann Koenig)
This reverts commit d9dce2f48eed1368a44c368fa87a506bd89ffec5.
Appears to be failing the SixtapPredict tests in some configurations and possibly test vectors as well.
Change-Id: Ica6aa83ebac47d0a76e451846e7da67b1c17a7d7
2016-09-15  Restore vp8_bilinear_predict4x4_neon  (Johann)
This function was removed when clang started introducing alignment hints which caused the 32 bit vld1_lane_u32/vst1_lane_u32 to fail:
https://llvm.org/bugs/show_bug.cgi?id=24421
The load has been rendered safe with an implementation ~indiscernible performance-wise that uses _u8 and over-reads just a touch. It is still ~5x faster than C in the unaligned case and doing both filters.
BUG=webm:892
BUG=webm:1273
Change-Id: Icf7167189391b46202f47233bb585c24c42bcc36
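The load strategy described in this message (read with a `_u8` load and over-read slightly, instead of casting to a 32-bit pointer) might be sketched as below. This is an illustration only, not the libvpx code: `load_4_bytes` is a hypothetical name, the non-NEON branch is a portable stand-in, and the caller is assumed to guarantee that 8 bytes are readable at `src`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: returns the first 4 bytes at src as a 32-bit value
 * in memory byte order (little-endian targets assumed for the NEON path).
 * src may be unaligned, but at least 8 bytes must be readable because the
 * NEON path deliberately over-reads. */
static uint32_t load_4_bytes(const uint8_t *src) {
  assert(src != NULL);
#if defined(__ARM_NEON)
  /* vld1_u8 only requires byte alignment, so the compiler has no reason to
   * attach a 32-bit alignment hint. It reads 8 bytes; the low 4 are kept. */
  const uint8x8_t v = vld1_u8(src);
  return vget_lane_u32(vreinterpret_u32_u8(v), 0);
#else
  /* Portable stand-in: memcpy is the alignment-safe way to read 4 bytes. */
  uint32_t value;
  memcpy(&value, src, sizeof(value));
  return value;
#endif
}
```

The trade-off is the extra 4 bytes read past the data of interest, which is harmless only when the surrounding buffer is known to be large enough.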
2016-09-15  Restore vp8_sixtap_predict4x4_neon  (Johann)
This function was removed when clang started introducing alignment hints which caused the 32 bit vld1_lane_u32/vst1_lane_u32 to fail:
https://llvm.org/bugs/show_bug.cgi?id=24421
The load has been rendered safe with an implementation ~indiscernible performance-wise that uses _u8 and over-reads just a touch.
The store, when unaligned, has a version that is ~25% slower but safe when xoffset = 0 (second pass filter only). When the first pass filter (or both) are in play, the new version is almost identical in speed.
Worst case performance (both filters, unaligned stores) is roughly 3-4x faster than C.
BUG=webm:817
BUG=webm:1273
Change-Id: I1e490e94453e0872151fe0dafb05557463f6247d
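One way to picture the "slower but safe" unaligned store is a runtime alignment check that picks between a single 32-bit lane store and four byte-lane stores. The sketch below is an assumption about the general shape, not the code from this commit: `store_4_bytes` is a hypothetical name, and the non-NEON branch is a portable stand-in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes the 4 bytes of value to dst in memory byte
 * order. dst may be unaligned. */
static void store_4_bytes(uint8_t *dst, uint32_t value) {
  assert(dst != NULL);
#if defined(__ARM_NEON)
  const uint32x2_t v = vdup_n_u32(value);
  if (((uintptr_t)dst & 3) == 0) {
    /* 4-byte aligned: one 32-bit lane store is safe. */
    vst1_lane_u32((uint32_t *)dst, v, 0);
  } else {
    /* Unaligned: four byte-lane stores cost more but need no alignment. */
    const uint8x8_t b = vreinterpret_u8_u32(v);
    vst1_lane_u8(dst + 0, b, 0);
    vst1_lane_u8(dst + 1, b, 1);
    vst1_lane_u8(dst + 2, b, 2);
    vst1_lane_u8(dst + 3, b, 3);
  }
#else
  /* Portable stand-in: memcpy handles any alignment. */
  memcpy(dst, &value, sizeof(value));
#endif
}
```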
2016-08-04  Remove armv6 target  (Johann)
Change-Id: I1fa81cc9cabf362a185fc3a53f1e58de533a41e5
2016-07-18  prepend ++ instead of post in for loops.  (Jim Bankoski)
Applied the following regex:
search for: (for.*\(.*;.*;) ([a-zA-Z_]*)\+\+\)
replace with: \1 ++\2)
This misses some for loops, e.g.:
for (mb_col = 0; mb_col < oci->mb_cols; mb_col++, mi++)
Change-Id: Icf5f6fb93cced0992e0bb71d2241780f7fb1f0a8
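The search-and-replace quoted above can be reproduced with POSIX extended regular expressions. The sketch below is only a demonstration of what the pattern does and does not match; `prefer_preincrement` is a hypothetical name, and the original change was presumably made with an editor or command-line tool rather than C code.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: applies the commit's regex to one line.
 * Returns 1 if the line was rewritten, 0 if it did not match (out then
 * holds an unchanged copy), and -1 if the pattern failed to compile. */
static int prefer_preincrement(const char *line, char *out, size_t out_size) {
  static const char *pattern = "(for.*\\(.*;.*;) ([a-zA-Z_]*)\\+\\+\\)";
  regex_t re;
  regmatch_t m[3];
  int rc;

  assert(line != NULL && out != NULL && out_size > 0);
  if (regcomp(&re, pattern, REG_EXTENDED) != 0) return -1;
  rc = regexec(&re, line, 3, m, 0);
  regfree(&re);
  if (rc != 0) {
    snprintf(out, out_size, "%s", line);
    return 0;
  }
  /* text before the match + group 1 + " ++" + group 2 + ")" + the rest */
  snprintf(out, out_size, "%.*s%.*s ++%.*s)%s",
           (int)m[0].rm_so, line,
           (int)(m[1].rm_eo - m[1].rm_so), line + m[1].rm_so,
           (int)(m[2].rm_eo - m[2].rm_so), line + m[2].rm_so,
           line + m[0].rm_eo);
  return 1;
}
```

A loop whose third clause increments two variables, such as the `mb_col++, mi++` case quoted in the message, is left untouched because the pattern requires a single identifier between the last `;` and `++)`.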
2016-07-15  vp8: apply clang-format  (clang-format)
Change-Id: I7605b6678014a5426ceb45c27b54885e0c4e06ed
2016-05-06  Remove sixtap/bilinear 4x4 neon implementations  (Johann)
These implementations rely on casting the pointers to load the data. Clang implemented optimizations which automatically add alignment hints to such loads. The 4x4 filters do not guarantee the necessary alignment so the resulting assembly is broken.
https://llvm.org/bugs/show_bug.cgi?id=24421
BUG=webm:817
BUG=webm:892
Change-Id: I608885299f1f86ff83653b65e0e40d0ae87fb3fe
2015-09-30  vp8: change build_intra4x4_predictors() to use vpx_dsp.  (Ronald S. Bultje)
I've added a few new functions (d45e, d63e, he, ve) to cover the filtered h/v 4x4 predictors that are vp8-specific, the "correct" d45 with the correctly filtered bottom-right pixel (as opposed to the unfiltered version in vp9), and the "broken" d63 with weirdly filtered bottom-right pixels (which is correctly filtered in vp9).
There may be a minor performance impact on all systems because we have to do an extra copy of the Above pixel array to incorporate the topleft pixel in the same array (thus fitting the vpx_dsp API). In addition, armv6 will have a more serious performance impact b/c I removed the armv6/vp8-specific assembly. I'm not sure anyone cares...
Change-Id: I7f9e5ebee11d8e21aca2cd517a69eefc181b2e86
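The "extra copy of the Above pixel array" can be pictured as packing the top-left pixel and the above row into one contiguous buffer. The sketch below assumes the shared predictors find the top-left pixel immediately before the above row (at index -1); that layout is an assumption about the vpx_dsp API, and `pack_above_row` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: packs top_left followed by n above pixels into buf,
 * which must hold n + 1 bytes. Returns a pointer p such that p[-1] is the
 * top-left pixel and p[0..n-1] are the above pixels. */
static const uint8_t *pack_above_row(uint8_t *buf, uint8_t top_left,
                                     const uint8_t *above, size_t n) {
  assert(buf != NULL && above != NULL);
  buf[0] = top_left;
  memcpy(buf + 1, above, n); /* this is the extra copy */
  return buf + 1;
}
```

The copy is small (a handful of bytes per block), which fits the message's expectation of only a minor performance impact.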
2015-09-30  vp8: change build_intra_predictors_mbuv_s to use vpx_dsp.  (Ronald S. Bultje)
Change-Id: I936c2430c3c5b1e0ab5dec0a20110525e925b5e4
2015-09-30  vp8: change build_intra_predictors_mby_s to use vpx_dsp.  (Ronald S. Bultje)
Change-Id: I2000820e0c04de2c975d370a0cf7145330289bb2
2015-08-18  Rename vp8 loopfilter[_neon.c]  (Johann)
Avoid conflict with vpx_dsp version Change-Id: I041b1532a9276400a5547de8dfed1de43ad4e83d
2015-07-07  Move sub pixel variance to vpx_dsp  (Johann)
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
2015-06-30  loopfiltersimpleverticaledge_neon: quiet uninit var warnings  (James Zern)
take 2. localize the function parameter to actually remove the warning Change-Id: I23c02061b5e21b0b75bd33c26062d1e531df7b92
2015-06-25  loopfiltersimpleverticaledge_neon: quiet uninit var warnings  (James Zern)
the vector used in vld*_lane_* should be initialized before use Change-Id: Idce95354737915f6fb4e6b5e8980a050e953036d
2015-06-25  idct_dequant_0_2x_neon: quiet uninit var warnings  (James Zern)
the vector used in vld*_lane_* should be initialized before use Change-Id: I6b791088479fec3bc021ca75cc2af5adcc39d954
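The rule in this message (initialize a vector before passing it to a `vld*_lane_*` load) follows from the fact that lane loads take the previous vector as an input and replace only one lane. A minimal sketch, using a hypothetical `gather_column` helper and a scalar stand-in for non-NEON builds:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: gathers one byte from each of 8 rows into out[0..7]. */
static void gather_column(const uint8_t *src, int stride, uint8_t out[8]) {
  assert(src != NULL && out != NULL);
#if defined(__ARM_NEON)
  /* Start from a defined value: the first lane load reads v as an input,
   * so an uninitialized v triggers a compiler warning. */
  uint8x8_t v = vdup_n_u8(0);
  v = vld1_lane_u8(src + 0 * stride, v, 0);
  v = vld1_lane_u8(src + 1 * stride, v, 1);
  v = vld1_lane_u8(src + 2 * stride, v, 2);
  v = vld1_lane_u8(src + 3 * stride, v, 3);
  v = vld1_lane_u8(src + 4 * stride, v, 4);
  v = vld1_lane_u8(src + 5 * stride, v, 5);
  v = vld1_lane_u8(src + 6 * stride, v, 6);
  v = vld1_lane_u8(src + 7 * stride, v, 7);
  vst1_u8(out, v);
#else
  /* Scalar stand-in with the same result. */
  int i;
  for (i = 0; i < 8; ++i) out[i] = src[i * stride];
#endif
}
```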
2015-06-23  vp8_subpixelvariance_neon: right size coeff table  (James Zern)
only uint8 is required; each use only loads one value as a uint8
quiets a few type conversion warnings
Change-Id: I03dc0dc0eb01ac23a6e8673daa2b77c6c57bf1b0
2015-05-26  Move variance functions to vpx_dsp  (Johann)
subpel functions will be moved in another patch. Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce
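For readers unfamiliar with the functions being moved, block variance in this context is the sum of squared differences minus the squared sum of differences divided by the pixel count. The plain-C sketch below illustrates that formula only; `variance_wxh` is a hypothetical name and not the vpx_dsp signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: variance of (src - ref) over a w x h block.
 * Also reports the sum of squared differences through *sse. */
static unsigned int variance_wxh(const uint8_t *src, int src_stride,
                                 const uint8_t *ref, int ref_stride,
                                 int w, int h, unsigned int *sse) {
  int64_t sum = 0;
  unsigned int sq = 0;
  int r, c;

  assert(src != NULL && ref != NULL && sse != NULL && w > 0 && h > 0);
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const int diff = src[r * src_stride + c] - ref[r * ref_stride + c];
      sum += diff;
      sq += (unsigned int)(diff * diff);
    }
  }
  *sse = sq;
  /* variance = SSE - (sum of differences)^2 / number of pixels */
  return sq - (unsigned int)((sum * sum) / (w * h));
}
```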
2015-05-07  replace DECLARE_ALIGNED_ARRAY w/DECLARE_ALIGNED  (James Zern)
this macro was used inconsistently and only differs in behavior from DECLARE_ALIGNED when an alignment attribute is unavailable. this macro is used with calls to assembly, while generic c-code doesn't rely on it, so in a c-only build without an alignment attribute the code will function as expected. Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79
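An alignment macro of the kind discussed here is typically a thin wrapper over a compiler attribute, with a plain declaration as the fallback when no attribute is available. The sketch below uses a hypothetical `EXAMPLE_DECLARE_ALIGNED` name so as not to claim it matches the real definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro, modelled on the usual attribute-based approach. */
#if defined(__GNUC__)
#define EXAMPLE_DECLARE_ALIGNED(n, typ, val) typ val __attribute__((aligned(n)))
#elif defined(_MSC_VER)
#define EXAMPLE_DECLARE_ALIGNED(n, typ, val) __declspec(align(n)) typ val
#else
/* No alignment attribute available: fall back to a plain declaration. */
#define EXAMPLE_DECLARE_ALIGNED(n, typ, val) typ val
#endif

/* Returns 1 if a 64-byte scratch buffer declared with the macro starts on
 * a 16-byte boundary. */
static int scratch_is_aligned(void) {
  EXAMPLE_DECLARE_ALIGNED(16, uint8_t, scratch[64]);
  scratch[0] = 0;
  return ((uintptr_t)scratch % 16) == 0;
}
```

In the fallback branch the buffer carries no alignment guarantee, which is acceptable only for code paths that never hand it to alignment-sensitive assembly, as the message argues.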
2015-05-06  Move shared SAD code to vpx_dsp  (Johann)
Create a new component, vpx_dsp, for code that can be shared between codecs. Move the SAD code into the component. This reduces the size of vpxenc/dec by 36k on x86_64 builds. Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
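SAD (sum of absolute differences) is the block-matching cost being moved into the shared component. A plain-C sketch of the computation follows; `sad_wxh` is a hypothetical name and not the vpx_dsp function signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: sum of absolute differences over a w x h block. */
static unsigned int sad_wxh(const uint8_t *src, int src_stride,
                            const uint8_t *ref, int ref_stride,
                            int w, int h) {
  unsigned int sad = 0;
  int r, c;

  assert(src != NULL && ref != NULL);
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      sad += (unsigned int)abs(src[r * src_stride + c] -
                               ref[r * ref_stride + c]);
    }
  }
  return sad;
}
```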
2015-04-28  vpx_mem: remove vpx_memset  (James Zern)
vestigial. replace instances with memset() which they already were being defined to. Change-Id: Ie030cfaaa3e890dd92cf1a995fcb1927ba175201
2015-02-03  Use correct buffer size in vp8 subpixel variance  (Johann)
In vp8_sub_pixel_variance8x8_neon the temp2 buffer is only initialized to kHeight8 * kWidth8. However, in the case that xoffset != 0 and yoffset == 0, var_filter_block2d_bil_w8 is called with output_width kHeight8PlusOne.
Thanks to cmugurel for diagnosing and yulius for the patch.
Change-Id: Ib71ffd96ffad963c92b8b7ca23f303942785b8e0
https://code.google.com/p/webrtc/issues/detail?id=4190
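The class of bug described here is a filter pass that writes one more row than its destination buffer was sized for. The sketch below reuses the constant names from the message but is otherwise hypothetical: `horizontal_pass_w8` and its capacity check are illustrative, not the patched code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { kWidth8 = 8, kHeight8 = 8, kHeight8PlusOne = 9 };

/* Hypothetical 2-tap horizontal pass: writes rows * kWidth8 bytes to dst.
 * The taps f0 + f1 are assumed to sum to 128. */
static void horizontal_pass_w8(const uint8_t *src, int src_stride, int rows,
                               int f0, int f1, uint8_t *dst,
                               size_t dst_capacity) {
  int r, c;
  /* A caller asking for kHeight8PlusOne rows must supply a buffer of at
   * least kHeight8PlusOne * kWidth8 bytes, not kHeight8 * kWidth8. */
  assert((size_t)rows * kWidth8 <= dst_capacity);
  for (r = 0; r < rows; ++r) {
    for (c = 0; c < kWidth8; ++c) {
      const int a = src[r * src_stride + c];
      const int b = src[r * src_stride + c + 1];
      dst[r * kWidth8 + c] = (uint8_t)((a * f0 + b * f1 + 64) >> 7);
    }
  }
}
```

Sizing the temporary as `kHeight8PlusOne * kWidth8` covers both the 8-row and 9-row calls.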
2014-09-25  Clarify GCC version check  (Johann)
The version check was incorrectly matching some versions of clang which reported as gcc 4.2 Change-Id: I686d3576e71883fe1463206b56ab5e2aa9bb68a8
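Clang defines `__GNUC__` and `__GNUC_MINOR__` for compatibility and reports itself as GCC 4.2, so a check based on those macros alone also matches clang. A common way to restrict a check to real GCC is to exclude `__clang__` first. The macro names and the 4.6 threshold below are hypothetical; the message does not say which version the real check targets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macros: distinguish real GCC from compilers that only
 * advertise GCC compatibility. */
#if defined(__GNUC__) && !defined(__clang__)
#define EXAMPLE_IS_REAL_GCC 1
#else
#define EXAMPLE_IS_REAL_GCC 0
#endif

/* Hypothetical version gate (threshold chosen for illustration only). */
#if EXAMPLE_IS_REAL_GCC && (__GNUC__ == 4) && (__GNUC_MINOR__ <= 6)
#define EXAMPLE_NEEDS_OLD_GCC_WORKAROUND 1
#else
#define EXAMPLE_NEEDS_OLD_GCC_WORKAROUND 0
#endif

/* Reports which branch the preprocessor took. */
static const char *example_compiler_family(void) {
#if defined(__clang__)
  return "clang";
#elif EXAMPLE_IS_REAL_GCC
  return "gcc";
#else
  return "other";
#endif
}
```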
2014-09-14  vp8/vp9: neon: msvc: move the 'ifdef _MSC_VER' bit to vpx_ports/mem.h.  (Jia Jia)
fix compiling warning. Change-Id: If8706a9046436f704c597e4275a6810c76ba7daa
2014-09-05  vp8 common: change 'HAVE_NEON_ASM' to 'HAVE_NEON' for compiling functions of NEON intrinsics.  (Jia Jia)
Change-Id: I975e5eac16f8b623ff589f0ec072cdaff2183b04
2014-09-04  bilinearpredict_neon: fix type conversion warnings  (James Zern)
make bifilter4_coeff[][] uint8_t, no values exceed this range and they're loaded with vdup_n_u8(). Change-Id: I921983e9edd828d29820e40ac30a7801dbe0fb4f
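The point that no coefficient exceeds the `uint8_t` range can be seen from a 2-tap bilinear table whose pairs sum to 128. The table values below are assumed to match the usual VP8 bilinear taps, and `kExampleBilinearTaps` and `bilinear_2tap` are hypothetical names; the sketch is scalar C, not the NEON code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 2-tap coefficients, indexed by 1/8-pel offset. Every value fits
 * in uint8_t and each pair sums to 128. */
static const uint8_t kExampleBilinearTaps[8][2] = {
  { 128, 0 }, { 112, 16 }, { 96, 32 }, { 80, 48 },
  { 64, 64 }, { 48, 80 },  { 32, 96 }, { 16, 112 }
};

/* Hypothetical helper: blends two neighbouring pixels at the given offset,
 * rounding to nearest (add 64, shift right by 7). */
static uint8_t bilinear_2tap(uint8_t a, uint8_t b, int offset) {
  assert(offset >= 0 && offset < 8);
  {
    const int f0 = kExampleBilinearTaps[offset][0];
    const int f1 = kExampleBilinearTaps[offset][1];
    return (uint8_t)((a * f0 + b * f1 + 64) >> 7);
  }
}
```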
2014-09-04  Merge "arm: Fix building vp8_subpixelvariance_neon.c with MSVC"  (James Zern)
2014-09-04  Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 10""  (Scott LaVarnway)
This reverts commit 677fb5123e0ece1d6c30be9d0282e1e1f594a56d Compiles with 4.6. Change-Id: I7f87048911b6bc28a61741d95501fa45ee97b819
2014-09-04  arm: Fix building vp8_subpixelvariance_neon.c with MSVC  (Martin Storsjo)
Use the right return values - vget_low_s64 returns int64x1_t, not a normal int64_t. Also make __builtin_prefetch a no-op on MSVC for this file. Change-Id: I4d2fce01d0ba106b98d3d53b137803119c2c2c08
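The type distinction in this message matters because `vget_low_s64` yields a one-lane vector (`int64x1_t`), which some compilers do not convert implicitly to a scalar. Extracting the lane explicitly is portable. A sketch with a hypothetical `sum4_s32` helper and a scalar stand-in for non-NEON builds:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: sums four 32-bit values into a 64-bit result. */
static int64_t sum4_s32(const int32_t v[4]) {
  assert(v != NULL);
#if defined(__ARM_NEON)
  /* Pairwise widen-and-add: two 64-bit partial sums. */
  const int64x2_t pairs = vpaddlq_s32(vld1q_s32(v));
  /* vget_low_s64/vget_high_s64 return int64x1_t, a vector type, so the
   * result is kept as a vector and the scalar is extracted explicitly. */
  const int64x1_t total = vadd_s64(vget_low_s64(pairs), vget_high_s64(pairs));
  return vget_lane_s64(total, 0);
#else
  /* Scalar stand-in. */
  return (int64_t)v[0] + v[1] + v[2] + v[3];
#endif
}
```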
2014-09-03  Neon version of vp8_build_intra_predictors_mby_s() and vp8_build_intra_predictors_mbuv_s().  (Scott LaVarnway)
This patch replaces the assembly version with an intrinsic version. On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~2.6%.
Change-Id: I9ef65bad929450c0215253fdae1c16c8b4a8f26f
2014-09-03  VP8 for ARMv8 by using NEON intrinsics 17  (Scott LaVarnway)
Add vp8_subpixelvariance_neon.c
- vp8_sub_pixel_variance16x16_neon_func
- vp8_variance_halfpixvar16x16_h_neon
- vp8_variance_halfpixvar16x16_v_neon
- vp8_variance_halfpixvar16x16_hv_neon
- vp8_sub_pixel_variance8x8_neon
Change-Id: I3e5d85b2eafc26be0eef6a777789b80e4579257b
Signed-off-by: James Yu <james.yu@linaro.org>
2014-09-03  Merge "Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb.""  (Johann)
2014-08-29  Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 08""  (Scott LaVarnway)
This reverts commit 928ff03889dadc3f63883553b443c08e625b4885 Compiles with 4.6 now. Change-Id: Ib455da1098bb0e0623248be07579882a425fcbd1
2014-08-20  Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb."  (Johann)
This reverts commit 920f803f2e2f41395311f96fec1b4a0c1b2b631a
Change-Id: I410d9036214a1b18427cca70b4bc6d8239740737
2014-07-11  vp8_bilinear_predict4x4_neon: init src vectors  (James Zern)
quiets uninitialized warnings on the first load. Change-Id: I58a5af337087d96b4eaea8991a0f85c4ba58aebe
2014-07-10  vp8_sixtap_predict4x4_neon: init src vectors  (James Zern)
quiets uninitialized warnings on the first load. Change-Id: Ied9b03928537a9ed2cd414b9e8a0be00191b0f32
2014-05-16  Correct HAVE_NEON_ASM define  (Johann)
These optimizations are currently disabled. Change-Id: I19c58c9cb82d017638b86196641b9e001dfa798b
2014-05-14  Remove intermediate step in vp8_dequantize_b  (Johann)
With the intrinsics it is no longer necessary to have a stub/helper function. Change-Id: I3695961c3c94f1bb750d3b7b29716e509ebba482
2014-05-14  Build armv7a-only code  (Johann)
Allow disabling the more generic NEON code. Use filtered option to disable rtcd code. Change-Id: Icb4500c1a2bac16eed3c5e3ec0c35e92e6bbbb9f
2014-05-13  Revert "VP8 for ARMv8 by using NEON intrinsics 06"  (Johann)
This reverts commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9.
Revert "VP8 for ARMv8 by using NEON intrinsics 15"
This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb.
This exposes a bug in gcc 4.9 regarding register allocation. Will reland when 4.9 is fixed.
Change-Id: I2d8a04e4edde93719280e41550f4c0765608ec4d
2014-05-12  Only build neon assembly for armv7 targets  (Johann)
Allow selectively building just the intrinsics for armv8 Change-Id: I2f29b2e4508b8b8e5649c2906b3159ad1d4ec477
2014-05-07  Merge "Revert "VP8 for ARMv8 by using NEON intrinsics 10""  (Johann)
2014-05-07  Merge "arm: Use a correct neon vector type for 64 bit integers"  (Johann)
2014-05-07  arm: Add a no-op define of __builtin_prefetch for MSVC  (Martin Storsjo)
Both GCC and RVCT/ARMCC support __builtin_prefetch, but MSVC doesn't. Change-Id: I44e1eecead61bc88d8fdfd3fef03d76d4f5afe08
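A no-op define of this kind lets code call `__builtin_prefetch` unconditionally while compilers without the builtin simply drop the call. The sketch below shows the general idea with a hypothetical `sum_rows` helper; the exact form of the define in the tree may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MSVC has no __builtin_prefetch, so make it expand to nothing there. */
#if defined(_MSC_VER) && !defined(__clang__)
#define __builtin_prefetch(x) ((void)0)
#endif

/* Hypothetical helper: sums a rows x width block, hinting the next row
 * into cache while the current one is processed. */
static unsigned int sum_rows(const uint8_t *src, int stride, int rows,
                             int width) {
  unsigned int sum = 0;
  int r, c;

  assert(src != NULL);
  for (r = 0; r < rows; ++r) {
    if (r + 1 < rows) __builtin_prefetch(src + (r + 1) * stride);
    for (c = 0; c < width; ++c) sum += src[r * stride + c];
  }
  return sum;
}
```

Because a prefetch is only a hint, removing it changes performance at most, never results.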
2014-05-07  arm: Use a correct neon vector type for 64 bit integers  (Martin Storsjo)
This fixes building with MSVC. Change-Id: I763ba8855c8083d82c8b477d3a297e310e93a335
2014-05-06  Revert "VP8 for ARMv8 by using NEON intrinsics 10"  (Johann)
This reverts commit c500fc22c1bb2a3ae5c318bfb806f7e9bd57ce25
There is an issue with gcc 4.6 in the Android NDK:
loopfiltersimpleverticaledge_neon.c: In function 'vp8_loop_filter_bvs_neon':
loopfiltersimpleverticaledge_neon.c:176:1: error: insn does not satisfy its constraints:
Change-Id: I95b6509d12f075890308914cc691b813d2e5cd9f
2014-05-06  Revert "VP8 for ARMv8 by using NEON intrinsics 08"  (Johann)
This reverts commit a5d79f43b963ced59b462206faf3b7857bdeff7b
There is an issue with gcc 4.6 in the Android NDK:
loopfilter_neon.c: In function 'vp8_loop_filter_vertical_edge_y_neon':
loopfilter_neon.c:394:1: error: insn does not satisfy its constraints:
Change-Id: I2b8c6ee3fa595c152ac3a5c08dd79bd9770c7b52
2014-05-05  Merge "VP8 for ARMv8 by using NEON intrinsics 16"  (Johann)
2014-05-05  Merge "VP8 for ARMv8 by using NEON intrinsics 15"  (Johann)
2014-05-05  Merge "VP8 for ARMv8 by using NEON intrinsics 14"  (Johann)