path: root/vp9/encoder/arm
Age | Commit message | Author
2017-10-09 | Rename some inline functions in NEON scaling | Linfeng Zhang
Change-Id: I9d4c1af53d57f72fc716bacbe3b0965719c045ac
2017-10-02 | Add 4 to 3 scaling NEON optimization | Linfeng Zhang
Speed compared with the version calling vpx_scaled_2d_neon(): ~1.7x in general, ~2.8x for the BILINEAR filter. BUG=webm:1419 Change-Id: I8f0a54c2013e61ea086033010f97c19ecf47c7c6
2017-09-19 | cosmetics: NEON scaling code | Linfeng Zhang
Change-Id: Ib91054622c1f09c4ca523bc6837d7d8ab9f03618
2017-09-11 | Add 4 to 1 scaling NEON optimization | Linfeng Zhang
BUG=webm:1419 Change-Id: If82a93935d2453e61b7647aae70983db1740bec7
2017-09-07 | Add 2 to 1 scaling NEON optimization | Linfeng Zhang
BUG=webm:1419 Change-Id: I99c954ffa50a62ccff2c4ab54162916141826d9b
2017-08-23 | quantize fp: neon implementation | Johann
About 4x faster when values are below the dequant threshold and 10x faster if everything needs to be calculated. Both numbers would improve if the division for dqcoeff could be simplified. BUG=webm:1426 Change-Id: I8da67c1f3fcb4abed8751990c1afe00bc841f4b2
2017-08-21 | quantize fp: ignore skip_block in arm | Johann
Change-Id: Ie8ac00efa826eead2a227726a1add816e04ff147
2017-05-15 | move neon load/stores to a new file | Johann
Move the tran_low_t helper functions to a new file. Additional load/store functions will be added here. Change-Id: I52bf652c344c585ea2f3e1230886be93f5caefc3
2017-05-05 | vp9: Neon optimization for denoiser. Add unit tests. | Jerome Jiang
Denoiser on Neon is 5x faster than C code. BUG=webm:1420 Change-Id: I805ab64f809ff2137354116be6213e7ec29c1dcb
2017-02-16 | Drop zbin_ptr and quant_shift_ptr | Johann
vp9[_highbd]_quantize_fp[_32x32] and vp9_fdct8x8_quant do not make use of these parameters. scan is used for C code and iscan is used for SIMD implementations. Change-Id: I908a0ff7d3febac33da97e0596e040ec7bc18ca5
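The scan/iscan remark above can be illustrated with a small scalar sketch. It is not the libvpx code: the 4x4 zig-zag table, the function names, and the demo values are all assumptions made for illustration. The idea shown is that a C loop can walk coefficients in scan order and remember the last nonzero one, while code that reads coefficients in memory order (as SIMD loads tend to) can instead look up each position's rank in the inverse scan and keep the maximum.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4x4 zig-zag order: kScan4x4[k] is the raster position visited k-th. */
static const int16_t kScan4x4[16] = { 0, 1, 4,  8,  5,  2, 3,  6,
                                      9, 12, 13, 10, 7, 11, 14, 15 };

/* Build the inverse table: iscan[pos] is the rank of raster position pos. */
static void invert_scan(const int16_t *scan, int16_t *iscan, int n) {
  for (int k = 0; k < n; ++k) iscan[scan[k]] = (int16_t)k;
}

/* Scan-order walk: the end of block is one past the last nonzero rank. */
static int eob_from_scan(const int16_t *qcoeff, const int16_t *scan, int n) {
  int eob = 0;
  for (int k = 0; k < n; ++k) {
    if (qcoeff[scan[k]] != 0) eob = k + 1;
  }
  return eob;
}

/* Memory-order walk: take the maximum of iscan[pos] + 1 over nonzero positions. */
static int eob_from_iscan(const int16_t *qcoeff, const int16_t *iscan, int n) {
  int eob = 0;
  for (int pos = 0; pos < n; ++pos) {
    if (qcoeff[pos] != 0 && iscan[pos] + 1 > eob) eob = iscan[pos] + 1;
  }
  return eob;
}

/* Demo block with nonzero values at raster positions 0 and 5. */
static int demo_eob(int use_iscan) {
  int16_t qcoeff[16] = { 0 };
  int16_t iscan[16];
  qcoeff[0] = 3;
  qcoeff[5] = -1;
  invert_scan(kScan4x4, iscan, 16);
  return use_iscan ? eob_from_iscan(qcoeff, iscan, 16)
                   : eob_from_scan(qcoeff, kScan4x4, 16);
}
```

Both walks agree on the end-of-block position; the second needs no data-dependent addressing, which is presumably why an inverse table suits vectorized code.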
2017-02-14 | vp9 fdct highbd neon: connect existing highbd calls | Johann
Change-Id: Ia8f822bd6e70b3911bc433a5a750bfb6f9a3a75c
2017-02-14 | quantize_fp highbd neon: use tran_low_t for coeff | Johann
Change-Id: I90fd815f15884490ad138f35df575a00d31e8c95
2016-08-02 | vp9/encoder: apply clang-format | clang-format
Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2
2015-12-14 | move vp9_avg to vpx_dsp | James Zern
Change-Id: I7bc991abea383db1f86c1bb0f2e849837b54d90f
2015-12-08 | Add vp9_avg_4x4_neon and the unit test. | jackychen
Change-Id: I3ef9a9648841374ed3cc865a02053c14ad821a20
2015-11-24 | add vp9_satd_neon | James Zern
~60-65% faster at the function level across block sizes Change-Id: Iaf8cbe95731c43fdcbf68256e44284ba51a93893
2015-07-31 | add vp9_vector_var_neon | James Zern
~50-60% faster depending on the width Change-Id: I9d007cfa10b9aaa2169c8c009d95522df6123a92
2015-07-29 | Merge "add vp9_block_error_fp_neon" | James Zern
2015-07-28 | Replace vp9_ prefix in 2D-DCT functions with vpx_ | Jingning Han
Clean up the forward 2D-DCT function names in vpx_dsp. Change-Id: I3117978596d198b690036e7eb05fe429caf3bc25
2015-07-28 | Move DC only forward 2D-DCT functions to vpx_dsp | Jingning Han
This completes the forward transform functions layout refactoring. Change-Id: I996fb0fb795f41e2040f7b21db985774098aedbd
2015-07-27 | add vp9_block_error_fp_neon | James Zern
~60-70% faster depending on the block size Change-Id: Icdbaa9977a91a63cbcc6ead0cf19d5a2af7f27e1
2015-07-27 | Replace vp9_idct.h for precise dependency | Jingning Han
This commit replaces vp9_idct.h with txfm_common.h in many SIMD implementation files for precise file dependency. Change-Id: If73dd726bb16537e7494f28538b0a169810f9756
2015-07-22 | Factor forward 2D-DCT transforms into vpx_dsp | Jingning Han
This commit factors the 4x4, 8x8, and 16x16 2D-DCT forward transform operations into vpx_dsp folder. Change-Id: I084b117b79c0925edcbcabb93f62b9f4bf8dbe7d
2015-07-15 | Add vp9_int_pro_col_neon. | Frank Galligan
BUG=https://code.google.com/p/webm/issues/detail?id=1023 Change-Id: I212a1d67b23ce3b5ce08800de369b25b9e375e7d
2015-07-08 | Merge "Add vp9_int_pro_row_neon." | Frank Galligan
2015-07-07 | Move sub pixel variance to vpx_dsp | Johann
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
2015-07-06 | Merge "vp9_variance*.c: make static tables const" | James Zern
2015-07-06 | vp9_variance*.c: make static tables const | James Zern
Change-Id: Ia5044d13c09685c401191fe87fbf90d36203aadd
2015-07-06 | Move subtract functions from vp9 to vpx_dsp | Jingning Han
Factor out the subtraction operator as common function. Change-Id: I526e703477c6a290e0e3e3c8898f8bb1ca82779b
2015-06-23 | Add vp9_int_pro_row_neon. | Frank Galligan
BUG=https://code.google.com/p/webm/issues/detail?id=1022 Change-Id: I510c3b0a70158fa2e4da554f7c5d7558021a6ddf
2015-06-03 | Make vp9 subpixel match vp8 | Johann
The only difference between the two was that the vp9 function allowed for every step in the bilinear filter (16 steps) while vp8 only allowed for half of those. Since all the call sites in vp9 shift the input left by one (<< 1), it only ever used the same steps as vp8. This will allow moving the subpel variance to vpx_dsp with the rest of the variance functions. Change-Id: I6fa2509350a2dc610c46b3e15bde98a15a084b75
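The equivalence described above can be checked with a short sketch. The tap values here are an assumption (two-tap filters whose weights sum to 128, spaced evenly across the pixel), and `make_bilinear_taps` and `even_steps_match` are hypothetical names, not libvpx functions. Under that assumption, indexing a 16-step table with an offset shifted left by one only ever reaches the entries of the 8-step table.

```c
#include <assert.h>
#include <stdint.h>

enum { kFilterWeight = 128 }; /* assumed sum of the two taps */

/* Fill a table of two-tap bilinear filters with `steps` positions per pixel. */
static void make_bilinear_taps(int steps, uint8_t taps[][2]) {
  for (int i = 0; i < steps; ++i) {
    const int second = i * kFilterWeight / steps;
    taps[i][0] = (uint8_t)(kFilterWeight - second);
    taps[i][1] = (uint8_t)second;
  }
}

/* Returns 1 if every even entry of the 16-step table equals the
 * corresponding entry of the 8-step table, 0 otherwise. */
static int even_steps_match(void) {
  uint8_t taps16[16][2];
  uint8_t taps8[8][2];
  make_bilinear_taps(16, taps16);
  make_bilinear_taps(8, taps8);
  for (int i = 0; i < 8; ++i) {
    if (taps16[i << 1][0] != taps8[i][0] || taps16[i << 1][1] != taps8[i][1]) {
      return 0;
    }
  }
  return 1;
}
```

If callers always pass `offset << 1`, the odd entries of the finer table are dead, so dropping them changes no output.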
2015-05-26 | Move variance functions to vpx_dsp | Johann
subpel functions will be moved in another patch. Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce
2015-05-07 | replace DECLARE_ALIGNED_ARRAY w/DECLARE_ALIGNED | James Zern
this macro was used inconsistently and only differs in behavior from DECLARE_ALIGNED when an alignment attribute is unavailable. this macro is used with calls to assembly, while generic c-code doesn't rely on it, so in a c-only build without an alignment attribute the code will function as expected. Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79
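For readers unfamiliar with this kind of macro, here is a minimal sketch of how an alignment-declaring macro is commonly written and used. The definition below is an assumption in the style of a GCC/Clang build, not a copy of the libvpx header, and `is_aligned` and `demo_buffer_is_aligned` are hypothetical helpers. It also shows the fallback case the message refers to: without an alignment attribute the macro reduces to a plain declaration, which is acceptable only for code that does not depend on the alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definition: attach an alignment attribute where the compiler
 * supports one, otherwise fall back to a plain declaration. */
#if defined(__GNUC__)
#define DECLARE_ALIGNED(n, typ, val) typ val __attribute__((aligned(n)))
#else
#define DECLARE_ALIGNED(n, typ, val) typ val
#endif

/* Returns 1 if p is a multiple of n bytes. */
static int is_aligned(const void *p, uintptr_t n) {
  return ((uintptr_t)p % n) == 0;
}

/* Declares a 16-byte-aligned scratch buffer and reports its alignment. */
static int demo_buffer_is_aligned(void) {
  DECLARE_ALIGNED(16, uint8_t, buf[64]);
  buf[0] = 0;
  return is_aligned(buf, 16);
}
```

Assembly and intrinsic routines that use aligned loads and stores are the callers that actually need the guarantee.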
2015-05-06 | Move shared SAD code to vpx_dsp | Johann
Create a new component, vpx_dsp, for code that can be shared between codecs. Move the SAD code into the component. This reduces the size of vpxenc/dec by 36k on x86_64 builds. Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
2015-04-28 | vpx_mem: remove vpx_memset | James Zern
Vestigial. Replace instances with memset(), which vpx_memset was already defined to. Change-Id: Ie030cfaaa3e890dd92cf1a995fcb1927ba175201
2015-01-27 | Add vp9_sad32x32x4d_neon Neon intrinsic function. | Frank Galligan
On Nexus 7 speed -6 saw ~18% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. BUG=https://code.google.com/p/webm/issues/detail?id=908 Change-Id: I70ccdea0326750552ed946fb004507d6efe02d5c
2015-01-27 | Add vp9_sad16x16x4d_neon Neon intrinsic function. | Frank Galligan
On Nexus 7 speed -6 saw ~15% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. BUG=https://code.google.com/p/webm/issues/detail?id=908 Change-Id: I4b2006b644c488f42bf06d8a22ef0e6120a96bf9
2015-01-27 | Add vp9_sad64x64x4d_neon Neon intrinsic function. | Frank Galligan
On Nexus 7 speed -6 saw ~30% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. BUG=https://code.google.com/p/webm/issues/detail?id=908 Change-Id: Id12af7d1883243c23e6692e898aea82299633d58
2015-01-24 | Add Neon intrinsic vp9_fdct8x8_quant_neon | Frank Galligan
On Nexus 7 speed -5 got ~2%, -6 got ~15%, -7 and -8 got ~30% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: I83246d63b96674d170098a572fa4fe28a05aaf51
2015-01-20 | Merge "Add Neon intrinsics for vp9_avg_8x8_neon" | Frank Galligan
2015-01-17 | Fix variance Neon intrinsics > 32x32 | Frank Galligan
The 16 bit sum vector was overflowing. Change-Id: I0fdf38e832ee99457ec8680a92691a6175ff8c3f
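A back-of-the-envelope check shows why block sizes above 32x32 are the ones at risk. The sketch assumes, for illustration only, that pixel differences (magnitude at most 255 for 8-bit video) are accumulated across the whole block into a vector of eight signed 16-bit lanes; the lane count and the helper names are assumptions, not details taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

enum { kLanes = 8, kMaxAbsDiff = 255 }; /* assumed lane count; 8-bit pixels */

/* Worst-case magnitude a single lane must hold when the w*h differences
 * of a block are spread evenly across kLanes accumulators. */
static int32_t worst_case_lane_sum(int w, int h) {
  return (int32_t)(w * h / kLanes) * kMaxAbsDiff;
}

/* Returns 1 if v is representable as a signed 16-bit integer. */
static int fits_in_int16(int32_t v) {
  return v <= INT16_MAX && v >= INT16_MIN;
}
```

Under these assumptions a 32x32 block peaks at 128 * 255 = 32640 per lane, just under the int16 limit of 32767, while 64x32 and 64x64 blocks can reach 65280 and 130560, so their sums need a wider accumulator or periodic widening.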
2015-01-15 | Add Neon intrinsics for vp9_avg_8x8_neon | Frank Galligan
On Nexus 7 speed -5, -6, -7, and -8 saw about a 1% increase in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 1.5% increase in perf for 720p. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: Ibf17ebfd952a6aec941719bd8306df8ec4574bee
2015-01-14 | Merge "Switch remaining Neon variance functions to shifts" | Frank Galligan
2015-01-14 | Add 64x64 sub_pel_variance Neon function | Frank Galligan
On Nexus 7 speed -5, -6, -7, and -8 saw about a 15% increase in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 10% increase in perf for 720p. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: I2fa5315845e3021c9a6e2ea47e52e68b398d8334
2015-01-14 | Switch remaining Neon variance functions to shifts | Frank Galligan
Saves 5 instructions on 8x8 and 16x16 and 8 instructions on 32x32, when compiled with 4.9. Change-Id: Id3da613a36a9d27d8c5169c59ba45d247c920c6c
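The kind of change described here can be sketched as follows, assuming the usual block variance formula, variance = sse - sum^2 / (w * h). Because block dimensions are powers of two, the division can be written as a right shift by log2(w * h). The function names and signatures are hypothetical, not the libvpx ones.

```c
#include <assert.h>
#include <stdint.h>

/* Variance using an explicit division by the pixel count. */
static uint32_t variance_div(uint32_t sse, int sum, int w, int h) {
  return sse - (uint32_t)(((int64_t)sum * sum) / (w * h));
}

/* Same computation with the division replaced by a shift; log2_count is
 * log2(w * h), for example 6 for an 8x8 block. sum * sum is never negative,
 * so the shift and the division give identical results. */
static uint32_t variance_shift(uint32_t sse, int sum, int log2_count) {
  return sse - (uint32_t)(((int64_t)sum * sum) >> log2_count);
}
```

For an 8x8 block with `sse = 5000` and `sum = -120`, both forms compute 5000 - 14400 / 64. Spelling out the shift spares the compiler from having to prove the divisor is a power of two.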
2015-01-13 | Add 64x variance Neon functions | Frank Galligan
Add optimized Neon functions for vp9_variance32x64, vp9_variance64x32, and vp9_variance64x64. On Nexus 7 speed -5 and -6 saw about a 4% increase in perf. Speeds -7 and -8 saw about a 6% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: I5a81f13c9897eb927fa39662530f5524a0f768fa
2014-12-22 | Revert "Revert "Removal of legacy zbin_extra / zbin_oq_value."" | Jingning Han
This reverts commit 9946ee23e0a4c158e26a505b162a072f81b8a3be. Fix the ssse3 asm function. Change-Id: I07f77a63aa98087626e45c4e87aa5dcafc0b0b07
2014-12-19 | Revert "Removal of legacy zbin_extra / zbin_oq_value." | Paul Wilkins
This reverts commit e9b586e21bb899e247346e82bccf5afb42604910. Change-Id: I5b36e6727da6c05278d97e2c37b80c109f79bed4
2014-12-18 | Removal of legacy zbin_extra / zbin_oq_value. | Paul Wilkins
zbin extra / zbin_oq_value was widely passed around, hence removal touches a lot of code. Change-Id: Idc94359735b60c38a160e4385ae09d5ca8b6b8e5
2014-08-08 | Improved vp9_quantize_fp_neon() | Scott LaVarnway
Eliminated instructions by using better neon instructions and rearranging the loop. On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~1.0%. Change-Id: I6b1700e79318f647ea67ef25e954c308932950ec