path: root/vpx_dsp
Age | Commit message | Author
2018-05-08 | Merge "Add vpx_sum_squares_2d_i16_neon()" | Linfeng Zhang
2018-05-07 | Update vpx_sum_squares_2d_i16_sse2() | Linfeng Zhang
Change-Id: I5a2ca2ed246277cf6b1ef2ffac34ce5c40aa0158
2018-05-07 | Add vpx_sum_squares_2d_i16_neon() | Linfeng Zhang
Perf shows CPU time of this function dropped from 0.81% to 0.15%. Change-Id: I8a7649ca5c15af2fc65cfb848f5befa0cc5e64f2
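For context, the operation that the NEON and SSE2 commits above accelerate is a sum of squared 16-bit values over a square block. The following is a minimal scalar sketch of that operation, not the libvpx source: the name `sum_squares_2d_i16_ref` and the exact parameter list are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: sums v*v over a size x size block of
 * int16_t values whose rows are `stride` elements apart. */
static uint64_t sum_squares_2d_i16_ref(const int16_t *src, int stride,
                                       int size) {
  uint64_t ss = 0;
  int r, c;
  assert(src != NULL && size >= 0 && stride >= size);
  for (r = 0; r < size; ++r) {
    for (c = 0; c < size; ++c) {
      /* |v| <= 32768, so v * v fits in a 32-bit signed integer. */
      const int32_t v = src[r * stride + c];
      ss += (uint64_t)(v * v);
    }
  }
  return ss;
}
```

A SIMD version would typically process several values per instruction and widen the accumulator before it can overflow; the scalar loop is mainly useful as a correctness reference.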
2018-04-25 | vp9: [loongson] optimize vpx_convolve8 with mmi | guxiwei-hf@loongson.cn
1. vpx_convolve_avg_mmi
2. vpx_convolve8_avg_horiz_mmi
Change-Id: Ie544aac45b4b1c0a0e51b44b650189ae5e88aee1
2018-04-17 | Update variance avx2 functions | Linfeng Zhang
Old vs New
Variance 64x64 time: 1145 ms 797 ms
Variance 64x32 time: 1200 ms 831 ms
Variance 32x32 time: 1228 ms 1135 ms
Variance 32x16 time: 1374 ms 1491 ms
Variance 16x16 time: 1688 ms 1571 ms
sse2 vs avx2
Variance 32x64 time: 1645 ms 957 ms
Variance 16x32 time: 2031 ms 1243 ms
Variance 16x8 time: 3071 ms 2275 ms
Change-Id: I0202a556e4629977d647e219c2e897e1ab6accb2
2018-04-17 | Update variance sse2 functions | Linfeng Zhang
Old vs New
Variance 64x64 time: 197 ms 143 ms
Variance 64x32 time: 200 ms 146 ms
Variance 32x64 time: 203 ms 140 ms
Variance 32x32 time: 214 ms 152 ms
Variance 32x16 time: 243 ms 153 ms
Variance 16x32 time: 234 ms 197 ms
Variance 16x16 time: 205 ms 205 ms
Variance 16x8 time: 228 ms 222 ms
Variance 8x16 time: 228 ms 232 ms
Variance 8x8 time: 282 ms 240 ms
Variance 8x4 time: 506 ms 341 ms
Variance 4x8 time: 518 ms 415 ms
Variance 4x4 time: 604 ms 628 ms
Observed vp9 encoder speed up when encoding a 720p video.
Change-Id: Iebb98f3b3d8adbc11a733a529d8427ce3d2a5314
2018-04-12 | Silence warning when built with --enable-internal-stats. | Jerome Jiang
Change-Id: I3a600a9baf2b8e46c109f4ec2b5bd6bafda4bf58
2018-04-03 | rm CONVERT_TO_SHORTPTR in vpx_highbd_comp_avg_pred | Linfeng Zhang
BUG=webm:1388 Change-Id: I1d0dd9af52a1461e3e2b2d60e8c4b6b74c3b90b0
2018-04-02 | Merge changes I5704bd66,I4d548e97 | Linfeng Zhang
* changes:
  Shrink size of mode_map in struct TileDataEnc
  Update sad4d x86 functions
2018-03-28 | Update sad4d x86 functions | Linfeng Zhang
Speed change is marginal. Change-Id: I4d548e9763ce43bd546f19132202f7a8509a32bf
2018-03-28 | vp9: [loongson] optimize vpx_convolve8 with mmi. | gxw
1. vpx_convolve8_vert_mmi
2. vpx_convolve8_horiz_mmi
3. vpx_convolve8_mmi
4. vpx_convolve8_avg_mmi
5. vpx_convolve8_avg_vert_mmi
Change-Id: I41a6b3b4f327d6b67d282e0163cfa0aee8648abe
2018-03-13 | Add vp9_highbd_iht16x16_256_add_neon() | Linfeng Zhang
BUG=webm:1403 Change-Id: I2293c11666786be276909d48ee78dacb40a89e25
2018-02-27 | Add vp9_iht16x16_256_add_neon() | Linfeng Zhang
BUG=webm:1403 Change-Id: I1413cc3dfcb62143ba04fe9b0f8d8b010fdf69b6
2018-02-26 | Fix a bug in create_s16x4_neon() | Linfeng Zhang
The bug is exposed when the 2nd argument is negative: the upper 32 bits would then be all 1s. Change-Id: I189ee8cd3753fde00a34847e7a37cde2caa4ba72
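The class of bug described here can be reproduced without NEON. The sketch below packs four 16-bit lanes into one 64-bit word; it is an illustration under assumptions, not the libvpx code, and the names `pack_s16x4_buggy`, `pack_s16x4_fixed`, and `check_pack_s16x4` are hypothetical. In the buggy variant the second lane is widened as a signed value, so a negative input sign-extends into the upper 32 bits and overwrites lanes 2 and 3.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy packing (illustrative): lane 1 is widened as a signed integer, so a
 * negative c1 sets every bit above bit 15 of the shifted term. */
static uint64_t pack_s16x4_buggy(int16_t c0, int16_t c1, int16_t c2,
                                 int16_t c3) {
  return (uint64_t)(uint16_t)c0 |
         (uint64_t)((int64_t)c1 * 65536) | /* sign-extends when c1 < 0 */
         ((uint64_t)(uint16_t)c2 << 32) |
         ((uint64_t)(uint16_t)c3 << 48);
}

/* Fixed packing: every lane is truncated to 16 unsigned bits before it is
 * widened and shifted, so no lane can spill into its neighbours. */
static uint64_t pack_s16x4_fixed(int16_t c0, int16_t c1, int16_t c2,
                                 int16_t c3) {
  return (uint64_t)(uint16_t)c0 |
         ((uint64_t)(uint16_t)c1 << 16) |
         ((uint64_t)(uint16_t)c2 << 32) |
         ((uint64_t)(uint16_t)c3 << 48);
}

/* Small self-check: the two variants agree only while c1 is non-negative. */
void check_pack_s16x4(void) {
  assert(pack_s16x4_buggy(1, 5, 2, 3) == pack_s16x4_fixed(1, 5, 2, 3));
  assert(pack_s16x4_buggy(1, -1, 2, 3) != pack_s16x4_fixed(1, -1, 2, 3));
}
```

Casting through `uint16_t` before widening is one common way to avoid this; whether the actual fix used the same cast is not stated in the commit message.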
2018-02-23 | Merge "Add vp9_highbd_iht8x8_16_add_neon()" | Linfeng Zhang
2018-02-21 | Merge "Fold adds in 16->32-bit converts in SSE2/AVX2 fDCT" | Kyle Siefring
2018-02-20 | Add vp9_highbd_iht8x8_16_add_neon() | Linfeng Zhang
BUG=webm:1403 Change-Id: I11efb652f1aee371c71eee2d29e33793e4736832
2018-02-20 | remove deprecated 'register' keyword | Johann
Will be removed in C++17: http://en.cppreference.com/w/cpp/language/storage_duration Change-Id: Iadce5e2b974c707799fa939f3ff1c420fb79a871
2018-02-10 | Fold adds in 16->32-bit converts in SSE2/AVX2 fDCT | Kyle Siefring
Changes in the function size in bytes (in lieu of performance metrics)
                     Before    After   Diff
vpx_fdct32x32_avx2   29564 ->  28334   -1230
vpx_fdct32x32_sse2   38053 ->  36309   -1744
Change-Id: Ie0b3e6ed7c3f2e9ea45f9d6a1ce1e27d068cee6b
2018-02-08 | Update iadst NEON functions | Linfeng Zhang
Use scalar multiply. No impact on clang, but improves gcc compiling. BUG=webm:1403 Change-Id: I4922e7e033d9e93282c754754100850e232e1529
2018-02-05 | Add vp9_highbd_iht4x4_16_add_neon() | Linfeng Zhang
BUG=webm:1403 Change-Id: Id9833e985fb70958cf4bde38f8e6303ed83c12f9
2018-02-01 | inv_txfm_vsx.c: make code c90 compatible | James Zern
move for loop declarations to function scope Change-Id: I84d92a1a6ca6c5ac30aacb0f55d87ca3aef4c98f
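As a reminder of what this kind of change looks like: C90 does not allow a declaration inside the `for` initializer, and it requires declarations to precede statements within a block. The function below is a generic sketch with a hypothetical name (`sum_c90`); it is not taken from inv_txfm_vsx.c.

```c
#include <assert.h>

/* C99 and later accept: for (int i = 0; i < n; ++i) { ... }
 * C90 needs the loop counter declared at the top of the enclosing block. */
static int sum_c90(const int *v, int n) {
  int i;         /* declared before any statement, as C90 requires */
  int total = 0;
  assert(n >= 0);
  for (i = 0; i < n; ++i) {
    total += v[i];
  }
  return total;
}
```

With GCC or Clang, building with `-std=c90 -pedantic` is one way to have the compiler flag the C99 form.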
2018-01-29 | Update vp9_iht8x8_64_add_neon() | Linfeng Zhang
Change-Id: Ie70ed8b9273df5e1fd06bc93cb469e80630941d2
2018-01-29 | Clean dct_const_round_shift() related neon code | Linfeng Zhang
Change-Id: I8f4e0fc6ecb77b623519f2dd3cd2886f89218ddd
2018-01-29 | Merge "cosmetic: clean idct neon functions" | Linfeng Zhang
2018-01-24 | Merge "BUG FIX: sse2 subpel variance is not PIC compliant" | Scott LaVarnway
2018-01-24 | cosmetic: clean idct neon functions | Linfeng Zhang
Change-Id: I9c7c52567850aded0437b13ba1260e94441bc49d
2018-01-24 | BUG FIX: sse2 subpel variance is not PIC compliant | Scott LaVarnway
BUG=webm:1464 Change-Id: Ibc15bac54aaf509365bed5892a26a29972ad3540
2018-01-24 | Merge "vp9_quantize_fp_avx2()" | Scott LaVarnway
2018-01-23 | Add vp9_highbd_iht16x16_256_add_sse4_1() | Linfeng Zhang
BUG=webm:1413 Change-Id: I8d7eeae1bd219eb848c1a86071046a477f7a91af
2018-01-23 | Add "vpx_" prefix to 2 idct x86 functions | Linfeng Zhang
Change-Id: I4f3052d8748e16b06e9155f8daf22f867dfaa7a3
2018-01-23 | Merge "Add vp9_highbd_iht8x8_64_add_sse4_1()" | Linfeng Zhang
2018-01-18 | Add vp9_highbd_iht8x8_64_add_sse4_1() | Linfeng Zhang
BUG=webm:1413 Change-Id: Id9038226902b2d793fc6c17ac81bb104c1a18988
2018-01-18 | vp9_quantize_fp_avx2() | Scott LaVarnway
Started from vp9_quantize_fp_sse2 and tweaked to use avx2. Change-Id: Ic2da50cc9d73896c7ef2f3cd3db5b1c5d7795b8b
2018-01-18 | clang-format v5.0.0 vpx_dsp/ | Johann
Remove comments above #define statements because they get indented unnecessarily. https://bugs.llvm.org/show_bug.cgi?id=35930
Add blank lines to prevent comments from being treated as blocks.
Change-Id: I04dce21b2a10e13b8dc07411a0019c098f6dd705
2018-01-11 | adopt some clang 5.0.0 formatting | Johann
At least the changes that don't conflict with 4.0.1 Change-Id: I9b6a7c14dadc0738cd0f628a10ece90fc7ee89fd
2018-01-08 | Add vp9_highbd_iht4x4_16_add_sse4_1() | Linfeng Zhang
BUG=webm:1413 Change-Id: I14930d0af24370a44ab359de5bba5512eef4e29f
2018-01-08 | Update dct_test.cc | Linfeng Zhang
Make 8-bit functions testing available in high bitdepth. Change-Id: Ic030c75aa4c6b649c52426abb4bb2122882de0fe
2017-12-28 | Update iadst4_sse2() | Linfeng Zhang
Change-Id: I21ff81df0d6898170a3b80b3b5220f9f3ac7f4e8
2017-12-14 | add copyright to rtcd files | Johann
Allows them to pass the license check in chromium. BUG=chromium:98319 Change-Id: Iefc1706152a549d8c4ae774c917596bf1c9492d8
2017-12-04 | Merge "vpx_dsp: [loongson] optimize variance v2." | Shiyou Yin
2017-12-01 | explicitly label .text sections | Johann
nasm should infer .text but does not for windows: https://bugzilla.nasm.us/show_bug.cgi?id=3392451 Change-Id: Ib195465e5f33405f5ff61c4cf88aa2a72640cacb
2017-12-01 | vpx_dsp: [loongson] optimize variance v2. | Shiyou Yin
1. Delete unnecessary zero setting process.
2. Optimize the method of calculating SSE in vpx_varianceWxH.
Change-Id: I58890c6a2ed1543379acb48e03e620c144f6515f
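For readers unfamiliar with how SSE relates to the variance result, the sketch below shows one common scalar formulation: accumulate the sum of differences and the sum of squared differences (SSE) over a W x H block, then subtract the squared mean term. It is a hedged illustration, not the MMI code from this commit; the name `variance_ref` and its parameter list are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference for an 8-bit W x H block variance.
 * Writes the sum of squared differences to *sse and returns
 * sse - sum^2 / (w * h). */
static uint32_t variance_ref(const uint8_t *a, int a_stride,
                             const uint8_t *b, int b_stride,
                             int w, int h, uint32_t *sse) {
  int64_t sum = 0; /* signed sum of differences */
  uint32_t sq = 0; /* sum of squared differences */
  int r, c;
  assert(w > 0 && h > 0 && sse != NULL);
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const int d = a[r * a_stride + c] - b[r * b_stride + c];
      sum += d;
      sq += (uint32_t)(d * d);
    }
  }
  *sse = sq;
  return sq - (uint32_t)((sum * sum) / (w * h));
}
```

For 8-bit input and blocks up to 64x64, the squared-difference accumulator stays below 2^32, which is why a 32-bit `sq` is enough in this sketch.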
2017-12-01 | Merge "mips msa optimize vpx_scaled_2d function" | Kaustubh Raste
2017-11-30 | Merge "vpx: [loongson] fix bug in var_filter_block2d_bil_16x" | Shiyou Yin
2017-11-29 | Merge "Remove unnecessary includes of emmintrin_compat.h" | Kyle Siefring
2017-11-29 | Remove unnecessary includes of emmintrin_compat.h | Kyle Siefring
Change-Id: Ie60381a0c6ee01f828cd364a43f01517f4cb03e9
2017-11-29 | mips msa optimize vpx_scaled_2d function | Kaustubh Raste
Change-Id: I638507b360c71489ab0e87bd558d2719ad995333
2017-11-29 | vpx: [loongson] fix bug in var_filter_block2d_bil_16x | Shiyou Yin
The bug caused the following test cases to fail:
1. MMI/VpxSubpelVarianceTest.Ref/6
2. MMI/VpxSubpelVarianceTest.Ref/7
3. MMI/VpxSubpelVarianceTest.ExtremeRef/6
4. MMI/VpxSubpelVarianceTest.ExtremeRef/7
Change-Id: I122ca20089e14ac324edd61295cf8f506e06afc8
2017-11-27 | quantize x86: dedup some parts | Johann
Change-Id: I9f95f47bc7ecbb7980f21cbc3a91f699624141af