path: root/vp9/vp9_common.mk
Age Commit message Author
2013-11-21Revert "Add 16 wide neon horz loopfilter."Frank Galligan
The change caused mismatches with some test vectors on neon. Original CL: https://gerrit.chromium.org/gerrit/#/c/67863/ Change-Id: I913891636d53783e93cb1865ca78ded1821dc4b0
2013-11-21Merge "Add 16 wide neon horz loopfilter."Frank Galligan
2013-11-21Add 16 wide neon horz loopfilter.Frank Galligan
Add support to do 16 pixel horizontal filtering in Neon. Nexus devices saw about 0.5% decode speed increase. Change-Id: I2993f6c2d49f31fa74976879eeaa289fd3f4e15d
2013-11-19Move vp9_sadmxn.h from common to encoderYaowu Xu
Change-Id: I6f6ba91b1b8b280902b171472314d665aa0baf0b
2013-11-18Merge "Move vp9_extend.{h,c} from common to encoder"Yaowu Xu
2013-11-18Move vp9_extend.{h,c} from common to encoderYaowu Xu
Since they are used in the encoder only. This commit also reorders the includes in the files that include vp9_extend.h. Change-Id: I929fc113f2135d3198cd1fc6a17434e5a2f8a459
2013-11-15Do horizontal loopfiltering in parallelYunqing Wang
This patch follows the "Rewrite filter_selectively_horiz for parallel loopfiltering" commit and adds an x86 SSE2 optimization to do 16-pixel filtering in parallel. Also, corrected the declaration of aligned arrays. For the 8-pixel-in-parallel case, improved the calculation of the masks and filters. Updated the threshold loading since the thresholds were already duplicated. Updated neon C functions to call neon loopfilters twice. Tests using the tulip clip showed a ~1.5% decoder speed gain. Change-Id: Id02638626ac27a4b0e0b09d71792a24c0499bd35
2013-11-13Merge "mips dsp-ase r2 vp9 decoder intra module optimizations (rebase)"Johann
2013-11-13mips dsp-ase r2 vp9 decoder intra module optimizations (rebase)Parag Salasakar
Change-Id: Ib27fc4f3dbe01fe8adfa04a61aaba21b3480e75c
2013-11-13mips dsp-ase r2 vp9 decoder loopfilter module optimizations (rebase)Parag Salasakar
Change-Id: Ia7f640ca395e8deaac5986f19d11ab18d85eec2d
2013-11-07Merge "Add back vp9_short_idct32x32_1_add_neon which is deleted in cleanup I63df79a13cf62aa2c9360a7a26933c100f9ebda3."hkuang
2013-11-05Add back vp9_short_idct32x32_1_add_neon which is deleted in cleanup I63df79a13cf62aa2c9360a7a26933c100f9ebda3.hkuang
Change-Id: I034848cf05031618818f7df2e7f9c35102686948
2013-10-31mb_lpf_horizontal_edge AVX2 optimizationTamar Levy
This CL contains two AVX2 optimized loop filter functions, mb_lpf_horizontal_edge_w_avx2_8 and mb_lpf_horizontal_edge_w_avx2_16. Change-Id: I604e4fe6e99752b7800c2ea98721d97f7e0b931b
2013-10-24mips dsp-ase r2 vp9 decoder idct module optimizations (rebase)Parag Salasakar
Change-Id: Iedcdb8867084f328f4fce2fadb968e0984217308
2013-10-11Merge "SSE2 8-tap sub-pixel filter optimization"Yunqing Wang
2013-10-10SSE2 8-tap sub-pixel filter optimizationYunqing Wang
To ensure fast encoding/decoding on devices without ssse3 support, SSE2 optimization of sub-pixel filters was done. Tests using a 1080p clip showed decoder speeds of ~70fps with ssse3 filters, ~60fps with sse2 filters, and ~15fps with C filters. Change-Id: Ie2088f87d83a889fba80a613e4d0e287aadd785c
2013-10-10Merge "Moving all scan/iscan code into separate vp9_scan.{h, c} files."Dmitry Kovalev
2013-10-09mips dsp-ase r2 vp9 decoder bilinear convolve optimizationsParag Salasakar
Change-Id: Ic31b4ef85e65070b4f8b9f26e068ccfaae00c4f0
2013-10-07Moving all scan/iscan code into separate vp9_scan.{h, c} files.Dmitry Kovalev
Now we have entropy code separate from scan/iscan code. The next step in future is to move iscan code from common part to the encoder. Change-Id: Id9732f7d80aec00af35c1d58d1137c4c96c91451
2013-10-02mips dsp-ase r2 vp9 decoder convolve module optimizationsParag Salasakar
Change-Id: I401536778e3c68ba2b3ae3955c689d005e1f1d59
2013-09-29Merge "Removing vp9_subpelvar.h from common."Dmitry Kovalev
2013-09-27Properly save neon registers.Christian Duvivier
Replace the current code, which corrupts the stack, with a duplicate of the vp8 code to save and restore neon registers. Change-Id: Ibb0220b9aa985d10533befa0a455ebce57a2891a
2013-09-25Fix a bunch of TODO from vp9_short_idct32x32_add_neon.Christian Duvivier
- full ASM version, no more C gateway file.
- integrate combine-add with last step of 2nd pass.
- remove a few push/pop pairs.
- some instruction reordering to hide latency.
Change-Id: Ic9d9933c908b65d1bf7ba8fd47b524cda808c9c6
2013-09-25Removing vp9_subpelvar.h from common.Dmitry Kovalev
Moving all code from that file to vp9_variance_c.c in the encoder. Change-Id: Ic803d5b4c78d5191e4d25541b3df97337878fc3e
2013-09-13Revert "Improved 8t filters"James Zern
This is incompatible with most toolchains other than gcc. Revert "Deleted #include <inttypes.h>" This reverts commit 4d018be950ef8b056a7c797a22ee58012443df26. This reverts commit d22a504d11a15dc3eab666859db0046b5a7d75c5. Change-Id: I1751dc6831f4395ee064e6748281418e967e1dcf
2013-09-12Merge "Add neon optimize iht8x8 which is 282% faster than C."hkuang
2013-09-12Add neon optimize iht8x8 which is 282% faster than C.hkuang
Change-Id: I963dd4a6e8671957403ccbb9a16ea7de703e3530
2013-09-11First draft of vp9_short_idct32x32_add_neon.Christian Duvivier
Lots of TODOs which will be taken care of in upcoming changes. As is, about 6x faster than the C version. Change-Id: Ie2557b72fd2d8edca376dbf400a4d173aa5e63e0
2013-09-11Improved 8t filtersScott LaVarnway
Reformatted version of a patch submitted by Erik/Tamar from Intel. For the test clips used, the decoder performance improved by ~2%. Change-Id: Ifbc37ac6311bca9ff1cfefe3f2e9b7f13a4a511b
2013-09-04Merge "Add neon optimize vp9_short_iht4x4_add."hkuang
2013-09-04Add neon optimize vp9_short_iht4x4_add.hkuang
Change-Id: I42c497b68ae1ee645b59c9968ad805db0a43e37e
2013-09-04make vp9 postproc a config optionJim Bankoski
Vp9 postproc is disabled for now as it has not been shown to help and may be merged with vp8. Change-Id: I25620d6cd34c6e10331b18c7b5ef7482e39c6057
2013-08-27Add neon optimize vp9_short_idct16x16_1_add.hkuang
Change-Id: Ib9354c1d975d03e8081df20d50b6a77dfe2dc7e5
2013-08-26Add neon optimize vp9_short_idct8x8_1_add.hkuang
Change-Id: I0b15d5e3b0eb97abb9ab5ec08e88b61f8723aaf4
2013-08-26Add neon optimize vp9_short_idct4x4_1_add.hkuang
Change-Id: I6ecb5c4a1a472feb8e84e9f3352b536d5e28a4a5
2013-08-15Merge "vp9: neon: add vp9_convolve_avg_neon"Johann
2013-08-15Merge "vp9: neon: add vp9_convolve_copy_neon"Johann
2013-08-14Merge "Add neon optimize vp9_short_idct16x16_add."hkuang
2013-08-14Add neon optimize vp9_short_idct16x16_add.hkuang
Change-Id: I27134b9a5cace2bdad53534562c91d829b48838d
2013-08-14vp9: neon: add vp9_convolve_avg_neonMans Rullgard
Change-Id: I33cff9ac4f2234558f6f87729f9b2e88a33fbf58
2013-08-14vp9: neon: add vp9_convolve_copy_neonMans Rullgard
Change-Id: I15adbbda15d1842e9f15f21878a5ffbb75c3c0c9
2013-08-09Moving scale_factors and related code to separate files.Dmitry Kovalev
Change-Id: I531829e5aee2a4a7a112d528ecccbddf052d0e74
2013-08-06Neon version of vp9_short_idct4x4_add.Christian Duvivier
Change-Id: Idec4cae0cb9b3a29835fd2750d354c1393d47aa4
2013-08-06sse3 intrapred x86inc protectedJim Bankoski
Change-Id: I4a3c83119cdf8a205920034c8019d855d5504605
2013-08-06intrapred x86inc guardsJim Bankoski
Change-Id: If0399d8e11f4ebe75a5c91abb8d6a52a7709065b
2013-08-05Begin to restrict x86inc.asm usageJim Bankoski
Chromium does not support 32bit builds for Mac which use x86inc.asm. Make the files which include it work if 64bit or not PIC enabled, starting with vp9_copy_sse2.asm. Consolidate these targets in vp9_rtcd_defs.sh. Change-Id: If18f0b957a611efd085a3ee7d245cf1eb91e8248
2013-08-02vp9: neon: add vp9_mb_lpf_* functionsMans Rullgard
Change-Id: I13e0880df234f15abc4cc7c57fe84488d5d46a75
2013-07-18Add neon optimize vp9_short_idct8x8_add.hkuang
Change-Id: Ic32acf3e2939c6d12d9c2bf192a5f5da59705fda
2013-07-17Merge "vp9_convolve8_neon placeholder"Johann
2013-07-17vp9_convolve8_neon placeholderJohann
Call the individually optimized horizontal and vertical functions. This implementation abuses the temp buffer. This will be replaced with a custom optimized function. Over 2x speedup. Change-Id: I5b908d2a73d264e9810d6022bbff73207a3055dd