path: root/vp8/common/arm
Age | Commit message | Author
2015-02-03 | Use correct buffer size in vp8 subpixel variance | Johann
In vp8_sub_pixel_variance8x8_neon the temp2 buffer is only initialized to kHeight8 * kWidth8. However, in the case that xoffset != 0 and yoffset == 0, var_filter_block2d_bil_w8 is called with output_width kHeight8PlusOne. Thanks to cmugurel for diagnosing and yulius for the patch. Change-Id: Ib71ffd96ffad963c92b8b7ca23f303942785b8e0 https://code.google.com/p/webrtc/issues/detail?id=4190
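The sizing problem described in this commit can be illustrated with a portable scalar sketch. This is not the libvpx NEON code: `bytes_for_rows`, `bilinear_pass_w8_sketch` and `demo_last_output` are hypothetical names, the two-tap filter with taps summing to 128 is an assumption, and only the constants `kWidth8`, `kHeight8` and `kHeight8PlusOne` are taken from the message. It shows that a pass producing `kHeight8PlusOne` rows needs a destination of `kHeight8PlusOne * kWidth8` bytes.

```c
#include <stddef.h>
#include <stdint.h>

enum { kWidth8 = 8, kHeight8 = 8, kHeight8PlusOne = 9 };

/* Bytes written by a pass that emits `rows` rows of kWidth8 pixels. */
size_t bytes_for_rows(unsigned rows) { return (size_t)rows * kWidth8; }

/* Scalar stand-in (hypothetical) for a width-8 two-tap bilinear pass.
 * It writes exactly bytes_for_rows(output_height) bytes to dst. */
void bilinear_pass_w8_sketch(const uint8_t *src, uint8_t *dst,
                             unsigned src_stride, unsigned pixel_step,
                             unsigned output_height,
                             const uint8_t filter[2]) {
  unsigned r, c;
  for (r = 0; r < output_height; ++r) {
    for (c = 0; c < (unsigned)kWidth8; ++c) {
      /* Assumed: taps sum to 128, so round with +64 and shift by 7. */
      dst[c] = (uint8_t)((src[c] * filter[0] +
                          src[c + pixel_step] * filter[1] + 64) >> 7);
    }
    src += src_stride;
    dst += kWidth8;
  }
}

/* Runs a kHeight8PlusOne-row pass into a buffer sized for it and
 * returns the last byte written. */
int demo_last_output(void) {
  enum { kStride = 16 };
  static const uint8_t half_pel[2] = { 64, 64 };
  uint8_t src[kHeight8PlusOne * kStride];
  /* A kHeight8 * kWidth8 buffer would be 8 bytes too small here. */
  uint8_t temp[kHeight8PlusOne * kWidth8];
  size_t i;
  for (i = 0; i < sizeof(src); ++i) src[i] = 100;
  bilinear_pass_w8_sketch(src, temp, kStride, 1, kHeight8PlusOne, half_pel);
  return temp[bytes_for_rows(kHeight8PlusOne) - 1];
}
```

Sizing the buffer from the largest row count any caller passes, rather than from the block height, keeps the write in bounds on every path.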
2014-09-25 | Clarify GCC version check | Johann
The version check was incorrectly matching some versions of clang which reported as gcc 4.2 Change-Id: I686d3576e71883fe1463206b56ab5e2aa9bb68a8
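A minimal sketch of how such a check can exclude clang follows. It is an assumption about the general technique, not the actual libvpx change: clang also defines `__GNUC__` and `__GNUC_MINOR__` for compatibility, so a check meant for real GCC has to test `__clang__` as well. The helper names `version_before` and `is_real_gcc_before` are hypothetical.

```c
/* Pure comparison: is major.minor older than want_major.want_minor? */
int version_before(int major, int minor, int want_major, int want_minor) {
  return major < want_major || (major == want_major && minor < want_minor);
}

/* Returns 1 only for genuine GCC older than the requested version.
 * clang defines __GNUC__ too, so it is excluded explicitly. */
int is_real_gcc_before(int want_major, int want_minor) {
#if defined(__GNUC__) && !defined(__clang__)
  return version_before(__GNUC__, __GNUC_MINOR__, want_major, want_minor);
#else
  (void)want_major;
  (void)want_minor;
  return 0;
#endif
}
```

With only the `__GNUC__` test, a clang that reports itself as GCC 4.2 would be treated as an old GCC and take the workaround path.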
2014-09-14 | vp8/vp9: neon: msvc: move the 'ifdef _MSC_VER' bit to vpx_ports/mem.h. | Jia Jia
fix compiling warning. Change-Id: If8706a9046436f704c597e4275a6810c76ba7daa
2014-09-05 | vp8 common: change 'HAVE_NEON_ASM' to 'HAVE_NEON' for compiling functions of NEON intrinsics. | Jia Jia
Change-Id: I975e5eac16f8b623ff589f0ec072cdaff2183b04
2014-09-04 | bilinearpredict_neon: fix type conversion warnings | James Zern
make bifilter4_coeff[][] uint8_t, no values exceed this range and they're loaded with vdup_n_u8(). Change-Id: I921983e9edd828d29820e40ac30a7801dbe0fb4f
2014-09-04 | Merge "arm: Fix building vp8_subpixelvariance_neon.c with MSVC" | James Zern
2014-09-04 | Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 10"" | Scott LaVarnway
This reverts commit 677fb5123e0ece1d6c30be9d0282e1e1f594a56d Compiles with 4.6. Change-Id: I7f87048911b6bc28a61741d95501fa45ee97b819
2014-09-04 | arm: Fix building vp8_subpixelvariance_neon.c with MSVC | Martin Storsjo
Use the right return values - vget_low_s64 returns int64x1_t, not a normal int64_t. Also make __builtin_prefetch a no-op on MSVC for this file. Change-Id: I4d2fce01d0ba106b98d3d53b137803119c2c2c08
2014-09-03 | Neon version of vp8_build_intra_predictors_mby_s() and | Scott LaVarnway
vp8_build_intra_predictors_mbuv_s(). This patch replaces the assembly version with an intrinsic version. On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~2.6%. Change-Id: I9ef65bad929450c0215253fdae1c16c8b4a8f26f
2014-09-03 | VP8 for ARMv8 by using NEON intrinsics 17 | Scott LaVarnway
Add vp8_subpixelvariance_neon.c - vp8_sub_pixel_variance16x16_neon_func - vp8_variance_halfpixvar16x16_h_neon - vp8_variance_halfpixvar16x16_v_neon - vp8_variance_halfpixvar16x16_hv_neon - vp8_sub_pixel_variance8x8_neon Change-Id: I3e5d85b2eafc26be0eef6a777789b80e4579257b Signed-off-by: James Yu <james.yu@linaro.org>
2014-09-03 | Merge "Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb."" | Johann
2014-08-29 | Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 08"" | Scott LaVarnway
This reverts commit 928ff03889dadc3f63883553b443c08e625b4885 Compiles with 4.6 now. Change-Id: Ib455da1098bb0e0623248be07579882a425fcbd1
2014-08-20 | Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb." | Johann
This reverts commit 920f803f2e2f41395311f96fec1b4a0c1b2b631a Change-Id: I410d9036214a1b18427cca70b4bc6d8239740737
2014-07-11 | vp8_bilinear_predict4x4_neon: init src vectors | James Zern
quiets uninitialized warnings on the first load. Change-Id: I58a5af337087d96b4eaea8991a0f85c4ba58aebe
2014-07-10 | vp8_sixtap_predict4x4_neon: init src vectors | James Zern
quiets uninitialized warnings on the first load. Change-Id: Ied9b03928537a9ed2cd414b9e8a0be00191b0f32
2014-05-16 | Correct HAVE_NEON_ASM define | Johann
These optimizations are currently disabled. Change-Id: I19c58c9cb82d017638b86196641b9e001dfa798b
2014-05-14 | Remove intermediate step in vp8_dequantize_b | Johann
With the intrinsics it is no longer necessary to have a stub/helper function. Change-Id: I3695961c3c94f1bb750d3b7b29716e509ebba482
2014-05-14 | Build armv7a-only code | Johann
Allow disabling the more generic NEON code. Use filtered option to disable rtcd code. Change-Id: Icb4500c1a2bac16eed3c5e3ec0c35e92e6bbbb9f
2014-05-13 | Revert "VP8 for ARMv8 by using NEON intrinsics 06" | Johann
This reverts commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb. This exposes a bug in gcc 4.9 regarding register allocation. Will reland when 4.9 is fixed. Change-Id: I2d8a04e4edde93719280e41550f4c0765608ec4d
2014-05-12 | Only build neon assembly for armv7 targets | Johann
Allow selectively building just the intrinsics for armv8 Change-Id: I2f29b2e4508b8b8e5649c2906b3159ad1d4ec477
2014-05-07 | Merge "Revert "VP8 for ARMv8 by using NEON intrinsics 10"" | Johann
2014-05-07 | Merge "arm: Use a correct neon vector type for 64 bit integers" | Johann
2014-05-07 | arm: Add a no-op define of __builtin_prefetch for MSVC | Martin Storsjo
Both GCC and RVCT/ARMCC support __builtin_prefetch, but MSVC doesn't. Change-Id: I44e1eecead61bc88d8fdfd3fef03d76d4f5afe08
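A no-op define of this kind might look like the sketch below. The exact guard used in libvpx is not shown in this log, so the `_MSC_VER` test (with clang-based MSVC-compatible compilers excluded, since they provide the builtin) is an assumption, and `sum_rows` is a hypothetical caller added only to show that the hint does not change results.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: only MSVC lacks the builtin, so only there is it defined
 * away. The macro expands to nothing, leaving an empty statement. */
#if defined(_MSC_VER) && !defined(__clang__)
#define __builtin_prefetch(x)
#endif

/* Hypothetical caller: sums `rows` rows of `width` bytes, hinting the
 * next row before processing the current one. */
unsigned sum_rows(const uint8_t *src, size_t stride, size_t rows,
                  size_t width) {
  unsigned total = 0;
  size_t r, c;
  for (r = 0; r < rows; ++r) {
    if (r + 1 < rows) {
      __builtin_prefetch(src + stride); /* hint only, result unchanged */
    }
    for (c = 0; c < width; ++c) total += src[c];
    src += stride;
  }
  return total;
}
```

Because a prefetch is only a performance hint, dropping it on one compiler changes timing but not output.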
2014-05-07 | arm: Use a correct neon vector type for 64 bit integers | Martin Storsjo
This fixes building with MSVC. Change-Id: I763ba8855c8083d82c8b477d3a297e310e93a335
2014-05-06 | Revert "VP8 for ARMv8 by using NEON intrinsics 10" | Johann
This reverts commit c500fc22c1bb2a3ae5c318bfb806f7e9bd57ce25 There is an issue with gcc 4.6 in the Android NDK: loopfiltersimpleverticaledge_neon.c: In function 'vp8_loop_filter_bvs_neon': loopfiltersimpleverticaledge_neon.c:176:1: error: insn does not satisfy its constraints: Change-Id: I95b6509d12f075890308914cc691b813d2e5cd9f
2014-05-06 | Revert "VP8 for ARMv8 by using NEON intrinsics 08" | Johann
This reverts commit a5d79f43b963ced59b462206faf3b7857bdeff7b There is an issue with gcc 4.6 in the Android NDK: loopfilter_neon.c: In function 'vp8_loop_filter_vertical_edge_y_neon': loopfilter_neon.c:394:1: error: insn does not satisfy its constraints: Change-Id: I2b8c6ee3fa595c152ac3a5c08dd79bd9770c7b52
2014-05-05 | Merge "VP8 for ARMv8 by using NEON intrinsics 16" | Johann
2014-05-05 | Merge "VP8 for ARMv8 by using NEON intrinsics 15" | Johann
2014-05-05 | Merge "VP8 for ARMv8 by using NEON intrinsics 14" | Johann
2014-05-05 | Merge changes Iaf7d6b0a,Iece0bf56 | Johann
* changes:
  Use INLINE and include vpx_config.h instead of plain 'inline'
  Use vreinterpret instead of casting neon vector types
2014-05-04 | Use INLINE and include vpx_config.h instead of plain 'inline' | Martin Storsjo
This fixes compilation with MSVC. Change-Id: Iaf7d6b0a0134968a6addf315fde6d852f298db8c
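The idea behind an `INLINE` macro can be sketched as follows. In libvpx the macro comes from the generated `vpx_config.h`; the fallback definition here is only a stand-in for that header, and it assumes that an MSVC C compiler wants `__inline` where other compilers accept `inline`. The functions `clamp_to_u8` and `clamp_demo` are hypothetical.

```c
/* Stand-in for the definition that vpx_config.h would provide.
 * Assumption: MSVC in C mode spells the keyword __inline. */
#ifndef INLINE
#if defined(_MSC_VER)
#define INLINE __inline
#else
#define INLINE inline
#endif
#endif

/* Helper written once against the macro instead of a bare keyword. */
static INLINE int clamp_to_u8(int v) {
  return v < 0 ? 0 : (v > 255 ? 255 : v);
}

/* Non-inline entry point so the helper can be called from elsewhere. */
int clamp_demo(int v) { return clamp_to_u8(v); }
```

Routing the keyword through one macro means a compiler difference is handled in a single place rather than in every source file.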
2014-05-04 | Use vreinterpret instead of casting neon vector types | Martin Storsjo
MSVC doesn't support casting neon vector types but requires using vreinterpret. Change-Id: Iece0bf5632567efd7f37f527abea38afeab4926d
2014-05-03 | VP8 for ARMv8 by using NEON intrinsics 16 | James Yu
Add variance_neon.c - vp8_variance16x16_neon - vp8_variance16x8_neon - vp8_variance8x16_neon - vp8_variance8x8_neon Change-Id: Idfb9c96134a1c6a696a98ce68b4f7ed593a00660 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 | VP8 for ARMv8 by using NEON intrinsics 15 | James Yu
Add idct_dequant_0_2x_neon.c - idct_dequant_0_2x_neon Change-Id: I8e129172ef1b2517cf72ff267788921f1a792586 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 | VP8 for ARMv8 by using NEON intrinsics 14 | James Yu
Add sixtappredict_neon.c - vp8_sixtap_predict16x16_neon - vp8_sixtap_predict8x8_neon - vp8_sixtap_predict8x4_neon - vp8_sixtap_predict4x4_neon Change-Id: I3b02fce48ae2e6c6099041ba5ddd7b090f1463b9 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 | VP8 for ARMv8 by using NEON intrinsics 13 | James Yu
Add shortidct4x4llm_neon.c - vp8_short_idct4x4llm_neon Change-Id: I5a734bbffca8dacf8633c2b0ff07b98aa2f438ba Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 | VP8 for ARMv8 by using NEON intrinsics 12 | James Yu
Add sad_neon.c - vp8_sad16x16_neon - vp8_sad16x8_neon - vp8_sad8x8_neon - vp8_sad8x16_neon - vp8_sad4x4_neon Change-Id: I08eaae49ec03fb91b394354660a5df0367cea311 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 | VP8 for ARMv8 by using NEON intrinsics 11 | James Yu
Add mbloopfilter_neon.c - vp8_mbloop_filter_horizontal_edge_y_neon - vp8_mbloop_filter_horizontal_edge_uv_neon - vp8_mbloop_filter_vertical_edge_y_neon - vp8_mbloop_filter_vertical_edge_uv_neon Change-Id: Ia9084e0892d4d49412d9cf2b165a0f719f2382d7 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 | VP8 for ARMv8 by using NEON intrinsics 10 | James Yu
Add loopfiltersimpleverticaledge_neon.c - vp8_loop_filter_bvs_neon - vp8_loop_filter_mbvs_neon Change-Id: I7cf0a161ad4ae37c881b94cc0122f895d3baae79 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 | VP8 for ARMv8 by using NEON intrinsics 09 | James Yu
Add loopfiltersimplehorizontaledge_neon.c - vp8_loop_filter_bhs_neon - vp8_loop_filter_mbhs_neon Change-Id: I77f9721b20585da8bf3869a3850ff0ae4b4bfeea Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 | VP8 for ARMv8 by using NEON intrinsics 08 | James Yu
Add loopfilter_neon.c - vp8_loop_filter_horizontal_edge_y_neon - vp8_loop_filter_horizontal_edge_uv_neon - vp8_loop_filter_vertical_edge_y_neon - vp8_loop_filter_vertical_edge_uv_neon Change-Id: I50b57dedabd42d2a3c183c1738cc5346f0e71ed8 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 | VP8 for ARMv8 by using NEON intrinsics 07 | James Yu
Add iwalsh_neon.c - vp8_short_inv_walsh4x4_neon Change-Id: I8beda6ce11ad8ce9e80cc0a38d40161938359162 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 | VP8 for ARMv8 by using NEON intrinsics 06 | James Yu
Add idct_dequant_full_2x_neon.c
- idct_dequant_full_2x_neon

==== Summary of apply VP8 decode patch series ====
Benchmark on Samsung Chromebook, Cortex-A15, 1.7GHz, Dual core
Toolchain: linaro-1.13.1-4.8-2014.01
Compile argument: CROSS=arm-linux-gnueabihf- ../libvpx/configure --target=armv7-linux-gcc --prefix=$HOME/out --enable-shared --cpu=cortex-a7
Test argument: vpxdec --summary --noblit ./tears_of_steel_1080p.webm

NEON assembly   46.68 (fps)
Apply patch 06  46.65, -0.03
Apply patch 07  46.86, +0.21
Apply patch 08  46.58, -0.28
Apply patch 09  46.57, -0.01
Apply patch 10  46.51, -0.06
Apply patch 11  46.13, -0.38
Apply patch 12  45.42, -0.71
Apply patch 13  46.06, +0.64
Apply patch 14  45.19, -0.87
Apply patch 15  45.93, +0.74
Apply patch 16  45.48, -0.45
Apply patch 17  45.84, +0.36
Apply patch 18  45.91, +0.07 <= With all NEON intrinsics patches
Total -0.77 fps, 1.65% performance regression

Change-Id: I77bfc9eaccfb97b8d401e949ceff8795e26ca6b7
Signed-off-by: James Yu <james.yu@linaro.org>
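The totals quoted in this benchmark summary can be rechecked with a few lines of arithmetic. The sketch below copies the per-patch deltas and the two endpoint frame rates from the message; the names `kPatchDeltas`, `sum_deltas` and `regression_percent` are hypothetical and not part of libvpx.

```c
#include <stddef.h>

/* Per-patch fps deltas for patches 06 through 18, as listed above. */
const double kPatchDeltas[13] = { -0.03, +0.21, -0.28, -0.01, -0.06,
                                  -0.38, -0.71, +0.64, -0.87, +0.74,
                                  -0.45, +0.36, +0.07 };

/* Adds up n deltas. */
double sum_deltas(const double *deltas, size_t n) {
  double total = 0.0;
  size_t i;
  for (i = 0; i < n; ++i) total += deltas[i];
  return total;
}

/* Slowdown of final_fps relative to baseline_fps, in percent. */
double regression_percent(double baseline_fps, double final_fps) {
  return (baseline_fps - final_fps) / baseline_fps * 100.0;
}
```

Summing the deltas gives about -0.77 fps, and `regression_percent(46.68, 45.91)` gives about 1.65, matching the figures in the message.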
2014-04-29 | Remove VP8 save_reg_neon function | Yunqing Wang
This patch did a cleanup following the commit "Save NEON registers in VP8 NEON functions". The pushing/popping of callee-saved NEON registers was moved into individual NEON functions. Therefore, we don't need to save those registers at the beginning of the codec. The related code was removed. Change-Id: I5648166514fc9beffb780aa138495597731f49ea
2014-04-28 | Save NEON registers in VP8 NEON functions | Yunqing Wang
The recent compiler can generate optimized code that uses NEON registers for various operations besides floating-point operations. Therefore, only saving callee-saved registers d8 - d15 at the beginning of the encoder/decoder is not enough anymore. This patch added register saving code in VP8 NEON functions that use those registers. Change-Id: Ie9e44f5188cf410990c8aaaac68faceee9dffd31
2014-02-26 | VP8 for ARMv8 by using NEON intrinsics 05 | James Yu
Add dequantizeb_neon.c - vp8_dequantize_b_loop_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.23 (fps) Change-Id: Iebe3b0c6ed2359c778b0570763c5681ae25fef0c Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-26 | VP8 for ARMv8 by using NEON intrinsics 04 | James Yu
Add dequant_idct_neon.c - vp8_dequant_idct_add_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.22 (fps) Change-Id: Id48f39e1da58dd3d8d37658e94989411997f4f7c Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-26 | VP8 for ARMv8 by using NEON intrinsics 03 | James Yu
Add dc_only_idct_add_neon.c - vp8_dc_only_idct_add_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.24 (fps) Change-Id: I5e9e277ec3a3ca67e13c8cc4c324a6fbe8a897fc Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-23 | VP8 for ARMv8 by using NEON intrinsics 02 | James Yu
Add copymem_neon.c - vp8_copy_mem16x16_neon - vp8_copy_mem8x8_neon - vp8_copy_mem8x4_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.25 (fps) Change-Id: Ib956b5a20522ff57dc8a580bf0aef7b252bddba6 Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-12 | minor spelling cleanup in comments | Andrew Russell
Change-Id: Ia91c6c406273345b08505097ffe1af3896980f06