path: root/vp9/vp9_common.mk
Age  Commit message  Author
2014-01-16Revert "Revert "Revert "SSSE3 convolution optimization"""Yunqing Wang
This reverts commit f9404f240642222775a371acde8fc0721b3812df. This patch caused some ASAN error. Change-Id: If15b7e581310e19061d111c69f2931809662ed19
2014-01-13Revert "Revert "SSSE3 convolution optimization""Yunqing Wang
This reverts commit b645257121da20b422dbbebf02aae0fc6dff95d4. Change-Id: I60d1bf57ae8e9eb6127f42f2d5a780124ac51b45
2014-01-10Revert "SSSE3 convolution optimization"Paul Wilkins
This reverts commit 511d218c60b9b6c1ab9383db746815e907af0359. In current form intrinsics break borg build. Change-Id: Ied37936af841250ecff449802e69a3d3761c91b9
2014-01-09Merge "SSSE3 convolution optimization"Yunqing Wang
2014-01-09SSSE3 convolution optimizationlevytamar82
Optimizing all SSSE3 assembly for convolution: 1. vp9_filter_block1d4_h8_sse2 2. vp9_filter_block1d8_h8_sse2 3. vp9_filter_block1d16_h8_sse2 4. vp9_filter_block1d4_v8_sse2 5. vp9_filter_block1d8_v8_sse2 6. vp9_filter_block1d16_v8_sse2 The optimizations include: - processing 2x8 elements in one 128-bit register instead of 8 elements in one 128-bit register. - removing unnecessary loads. This optimization gives a user-level gain of 2.4% for 480p input and 1.6% for 720p. This optimization was done only for 64-bit. Change-Id: Icb586dc0c938b56699864fcee6c52fd43b36b969
2014-01-08Merge "Add initial intra frame neon optimization. 1~2% gain."hkuang
2014-01-08Add initial intra frame neon optimization. 1~2% gain.hkuang
More intra optimizations will be added. Change-Id: I33ae8d93f6002bf7b64cc2669602d9e6bfa5a6e8
2013-12-19Removing vp9_findnearmv.{h, c} files.Dmitry Kovalev
Moving all code from those files to vp9_mvref_common.{h, c}. Change-Id: Ibc4afcb8cea6847166ff411130e93611ebe63b20
2013-12-16Converting vp9_treecoder.h to vp9_prob.{h, c}Dmitry Kovalev
Moving vp9_norm probability table from vp9_entropy.c to vp9_prob.c Change-Id: Ie757b73860c6f43130790c332b292e2a1a81b788
2013-12-05Moving vp9_tree_probs_from_distribution() to encoder.Dmitry Kovalev
Writing custom coeff branch count calculation (which is much clearer) in adapt_coef_probs() function. Removing vp9_treecoder.c file. Change-Id: I8880fb7a39996c8bcf6cd0acf9898a8c712ba91f
2013-12-04Removing vp9_default_coef_probs.h file.Dmitry Kovalev
Moving all probability tables from removed file to vp9_entropy.c. Change-Id: I12846f1da778c3016d96b82e53384d4634883430
2013-11-26Fix 16 wide neon horz loopfilter.Frank Galligan
Multiply by 3 was on 8bit vectors when it should have been on 16bit vectors. Change-Id: I248c1429b3134dfd171dfab0ebb109fd2437e1fc
2013-11-21Revert "Add 16 wide neon horz loopfilter."Frank Galligan
The change caused mismatches with some test vectors on neon. Original CL: https://gerrit.chromium.org/gerrit/#/c/67863/ Change-Id: I913891636d53783e93cb1865ca78ded1821dc4b0
2013-11-21Merge "Add 16 wide neon horz loopfilter."Frank Galligan
2013-11-21Add 16 wide neon horz loopfilter.Frank Galligan
Add support to do 16 pixel horizontal filtering in Neon. Nexus devices saw about 0.5% decode speed increase. Change-Id: I2993f6c2d49f31fa74976879eeaa289fd3f4e15d
2013-11-19Move vp9_sadmxn.h from common to encoderYaowu Xu
Change-Id: I6f6ba91b1b8b280902b171472314d665aa0baf0b
2013-11-18Merge "Move vp9_extend.{h,c} from common to encoder"Yaowu Xu
2013-11-18Move vp9_extend.{h,c} from common to encoderYaowu Xu
Since they are used in the encoder only. This commit also reorders the includes in the files that include vp9_extend.h Change-Id: I929fc113f2135d3198cd1fc6a17434e5a2f8a459
2013-11-15Do horizontal loopfiltering in parallelYunqing Wang
This patch followed "Rewrite filter_selectively_horiz for parallel loopfiltering" commit, and added x86 SSE2 optimization to do 16-pixel filtering in parallel. Also, corrected the declaration of aligned arrays. For 8-pixel-in-parallel case, improved the calculation of the masks and filters. Updated the threshold loading since the thresholds were already duplicated. Updated neon C functions to call neon loopfilters twice. Using tulip clip, tests showed it gave a ~1.5% decoder speed gain. Change-Id: Id02638626ac27a4b0e0b09d71792a24c0499bd35
2013-11-13Merge "mips dsp-ase r2 vp9 decoder intra module optimizations (rebase)"Johann
2013-11-13mips dsp-ase r2 vp9 decoder intra module optimizations (rebase)Parag Salasakar
Change-Id: Ib27fc4f3dbe01fe8adfa04a61aaba21b3480e75c
2013-11-13mips dsp-ase r2 vp9 decoder loopfilter module optimizations (rebase)Parag Salasakar
Change-Id: Ia7f640ca395e8deaac5986f19d11ab18d85eec2d
2013-11-07Merge "Add back vp9_short_idct32x32_1_add_neon which is deleted in cleanup I63df79a13cf62aa2c9360a7a26933c100f9ebda3."hkuang
2013-11-05Add back vp9_short_idct32x32_1_add_neon which is deleted inhkuang
cleanup I63df79a13cf62aa2c9360a7a26933c100f9ebda3. Change-Id: I034848cf05031618818f7df2e7f9c35102686948
2013-10-31mb_lpf_horizontal_edge AVX2 optimizationTamar Levy
This CL contains two AVX2 optimized loop filter functions, mb_lpf_horizontal_edge_w_avx2_8 and mb_lpf_horizontal_edge_w_avx2_16. Change-Id: I604e4fe6e99752b7800c2ea98721d97f7e0b931b
2013-10-24mips dsp-ase r2 vp9 decoder idct module optimizations (rebase)Parag Salasakar
Change-Id: Iedcdb8867084f328f4fce2fadb968e0984217308
2013-10-11Merge "SSE2 8-tap sub-pixel filter optimization"Yunqing Wang
2013-10-10SSE2 8-tap sub-pixel filter optimizationYunqing Wang
To ensure fast encoding/decoding on devices without SSSE3 support, SSE2 optimization of sub-pixel filters was done. A test using a 1080p clip showed decoder speeds of ~70fps with SSSE3 filters, ~60fps with SSE2 filters, and ~15fps with C filters. Change-Id: Ie2088f87d83a889fba80a613e4d0e287aadd785c
2013-10-10Merge "Moving all scan/iscan code into separate vp9_scan.{h, c} files."Dmitry Kovalev
2013-10-09mips dsp-ase r2 vp9 decoder bilinear convolve optimizationsParag Salasakar
Change-Id: Ic31b4ef85e65070b4f8b9f26e068ccfaae00c4f0
2013-10-07Moving all scan/iscan code into separate vp9_scan.{h, c} files.Dmitry Kovalev
Now we have entropy code separate from scan/iscan code. The next step in future is to move iscan code from common part to the encoder. Change-Id: Id9732f7d80aec00af35c1d58d1137c4c96c91451
2013-10-02mips dsp-ase r2 vp9 decoder convolve module optimizationsParag Salasakar
Change-Id: I401536778e3c68ba2b3ae3955c689d005e1f1d59
2013-09-29Merge "Removing vp9_subpelvar.h from common."Dmitry Kovalev
2013-09-27Properly save neon registers.Christian Duvivier
Replace the current code, which corrupts the stack, with a duplicate of the vp8 code to save and restore neon registers. Change-Id: Ibb0220b9aa985d10533befa0a455ebce57a2891a
2013-09-25Fix a bunch of TODO from vp9_short_idct32x32_add_neon.Christian Duvivier
- full ASM version, no more C gateway file. - integrate combine-add with last step of 2nd pass. - remove a few push/pop pairs. - some instruction reordering to hide latency. Change-Id: Ic9d9933c908b65d1bf7ba8fd47b524cda808c9c6
2013-09-25Removing vp9_subpelvar.h from common.Dmitry Kovalev
Moving all code from that file to vp9_variace_c.c in the encoder. Change-Id: Ic803d5b4c78d5191e4d25541b3df97337878fc3e
2013-09-13Revert "Improved 8t filters"James Zern
This is incompatible with most toolchains other than gcc. Revert "Deleted #include <inttypes.h>" This reverts commit 4d018be950ef8b056a7c797a22ee58012443df26. This reverts commit d22a504d11a15dc3eab666859db0046b5a7d75c5. Change-Id: I1751dc6831f4395ee064e6748281418e967e1dcf
2013-09-12Merge "Add neon optimize iht8x8 which is 282% faster than C."hkuang
2013-09-12Add neon optimize iht8x8 which is 282% faster than C.hkuang
Change-Id: I963dd4a6e8671957403ccbb9a16ea7de703e3530
2013-09-11First draft of vp9_short_idct32x32_add_neon.Christian Duvivier
Lots of TODOs which will be taken care of in upcoming changes. As is, about 6x faster than the C version. Change-Id: Ie2557b72fd2d8edca376dbf400a4d173aa5e63e0
2013-09-11Improved 8t filtersScott LaVarnway
Reformatted version of a patch submitted by Erik/Tamar from Intel. For the test clips used, the decoder performance improved by ~2%. Change-Id: Ifbc37ac6311bca9ff1cfefe3f2e9b7f13a4a511b
2013-09-04Merge "Add neon optimize vp9_short_iht4x4_add."hkuang
2013-09-04Add neon optimize vp9_short_iht4x4_add.hkuang
Change-Id: I42c497b68ae1ee645b59c9968ad805db0a43e37e
2013-09-04make vp9 postproc a config optionJim Bankoski
Vp9 postproc is disabled for now as it has not been shown to help and may be merged with vp8. Change-Id: I25620d6cd34c6e10331b18c7b5ef7482e39c6057
2013-08-27Add neon optimize vp9_short_idct16x16_1_add.hkuang
Change-Id: Ib9354c1d975d03e8081df20d50b6a77dfe2dc7e5
2013-08-26Add neon optimize vp9_short_idct8x8_1_add.hkuang
Change-Id: I0b15d5e3b0eb97abb9ab5ec08e88b61f8723aaf4
2013-08-26Add neon optimize vp9_short_idct4x4_1_add.hkuang
Change-Id: I6ecb5c4a1a472feb8e84e9f3352b536d5e28a4a5
2013-08-15Merge "vp9: neon: add vp9_convolve_avg_neon"Johann
2013-08-15Merge "vp9: neon: add vp9_convolve_copy_neon"Johann
2013-08-14Merge "Add neon optimize vp9_short_idct16x16_add."hkuang