Age | Commit message (Collapse) | Author |
|
Change-Id: I3177251a5935453a23a23c39ea5f6fd41254775e
|
|
vp9_quantize_fp_sse2 was only tested in the non-hbd
configuration. This was missed when the same gap was fixed for
vpx_quantize_b_sse2.
Change-Id: Ide346e5727d74281c774f605c90d280050e0bf62
|
|
Change-Id: I7af273e979415a8b8cafb7494728d2736862f4a5
|
|
Total gain for 12-bit encoding:
* ~4.8% for best profile
* ~6.2% for rt profile
Change-Id: I61e646ab7aedf06a25db1365d6d1cf7b05101c21
|
|
Up to 2.6x faster than vp9_highbd_quantize_fp_32x32_c() for full
calculations.
Bug: b/237714063
Change-Id: Icfeff2ad4dcd57d0ceb47fe04789710807b9cbad
|
|
Up to 4.1x faster than vp9_highbd_quantize_fp_c() for full
calculations.
~1.3% overall encoder improvement for the test clip used.
Bug: b/237714063
Change-Id: I8c6466bdbcf1c398b1d8b03cab4165c1d8556b0c
|
|
~4x faster than vp9_highbd_quantize_fp_32x32_c() for full
calculations.
Bug: b/237714063
Change-Id: Iff2182b8e7b1ac79811e33080d1f6cac6679382d
|
|
Up to 5.37x faster than vp9_highbd_quantize_fp_c() for full
calculations.
~1.6% overall encoder improvement for the test clip used.
Bug: b/237714063
Change-Id: I584fd1f60a3e02f1ded092de98970725fc66c5b8
|
|
Up to 1.80x faster than vp9_quantize_fp_32x32_ssse3() for full
calculations.
Bug: b/237714063
Change-Id: Ic4ae4724fce7ac85c7a089535b16a999e02f0a10
|
|
No change in performance.
Bug: b/237714063
Change-Id: I8ea42759cc4dc57be6a29c23784997cb90ad4090
|
|
Up to 11.78x faster than vpx_quantize_b_32x32_sse2() for full
calculations.
~1.7% overall encoder improvement for the test clip used.
Bug: b/237714063
Change-Id: Ib759056db94d3487239cb2748ffef1184a89ae18
|
|
Up to 3.61x faster than vpx_highbd_quantize_b_sse2() for full
calculations.
~2.3% overall encoder improvement for the test clip used.
Bug: b/237714063
Change-Id: I23f88d2a7f96aaa4103778372f4f552207f73cee
|
|
Up to 1.36x faster than vpx_quantize_b_32x32_avx() for full
calculations. Up to 1.29x faster for VP9_HIGHBITDEPTH builds.
Bug: b/237714063
Change-Id: I97aa6a18d4dc2f3187b76800f91bbba7be447ef1
|
|
Up to 1.58x faster than vpx_quantize_b_avx() depending
on the size.
Bug: b/237714063
Change-Id: I595a6bb32ebee63f69f27b5a15322fdeae1bf70e
|
|
Bug: b/237714063
Change-Id: I4304ba8d976fed3613e28442983b04a9cfc15b79
|
|
1. vpx_quantize_b_lsx
2. vpx_quantize_b_32x32_lsx
Bug: webm:1755
Change-Id: I476c8677a2c2aed7248e088e62c3777c9bed2adb
|
|
This reverts commit 2200039d33c49a9f7a5c438656df143755b022c4.
This causes failures with VP9/EndToEndTestLarge.EndtoEndPSNRTest/*; it
seems the assembly does not match the C code.
Bug: webm:1586
Change-Id: I4c63beebf88d4c12789d681b0d38014510b147fe
|
|
This reverts commit 89cfe3835c47dabf77d38edb3af190155984fa9a.
This is a prerequisite for reverting
2200039d33c49a9f7a5c438656df143755b022c4, which causes high bitdepth test
failures.
Bug: webm:1586
Change-Id: I28f3b98f3339f3573b1492b88bf733dade133fc0
|
|
The only difference between the code is the clamp. For
8 bit it is purely an optimization. The values outside
this range will still saturate.
Change-Id: I2a770b140690d99e151b00957789bd72f7a11e13
|
|
The optimized quantize functions were already built to handle
highbd values. The only difference is the clamping. All highbd
functions expand to 32bits when running in highbd mode.
Removes vpx_highbd_quantize_32x32_sse2 as it is slower than the
C version in the worst case.
Bug: webm:1586
Change-Id: I49bf8a6a2041f78450bf43a4f655c67656b0f8d9
|
|
Whether a block is skipped is handled by mi->skip. x->skip_block
is kept exclusively to verify that the quantize functions are not
called for skip blocks.
Finishes the cleanup in 13eed991f
Bug: libvpx:1612
Change-Id: I1598c3b682d3c5e6c57a15fa4cb5df2c65b3a58a
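
The role described for x->skip_block above, a flag kept only to check that
quantizers are never called for skipped blocks, can be sketched as an
assertion at the top of the function. This is a minimal illustration under
stated assumptions: QuantizeSketch, its parameters, and the divide-by-step
logic are hypothetical and are not the libvpx quantizer. It also assumes
step is nonzero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical quantizer: divides each coefficient by `step` and returns the
// end-of-block position (index of the last nonzero output plus one).
inline int QuantizeSketch(const std::int16_t *coeff, int n, int skip_block,
                          int step, std::int16_t *qcoeff) {
  // skip_block carries no behavior here. It only checks the contract that
  // callers decide skipping elsewhere and never quantize a skipped block.
  assert(!skip_block);
  (void)skip_block;  // Avoids an unused warning when NDEBUG disables assert().

  int eob = 0;
  for (int i = 0; i < n; ++i) {
    qcoeff[i] = static_cast<std::int16_t>(coeff[i] / step);
    if (qcoeff[i] != 0) eob = i + 1;
  }
  return eob;
}
```

Called with skip_block set to 0, the function behaves like a plain quantizer;
a nonzero value trips the assertion in debug builds.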
|
|
This should clean up clang-tidy warnings.
Change-Id: Ifb5a986121b2d0bd71b9ad39a79dd46c63bdb998
|
|
This moves the framework to C++11 and changes *_TEST_CASE* to
*_TEST_SUITE*.
BUG=webm:1695
Change-Id: I07f2c20850312a9c7e381b38353d2f9f45889cb1
|
|
this prevents redefinition warnings if a toolchain sets one
BUG=b/117240165
Change-Id: Ib5d8c303cd05b4dbcc8d42c71ecfcba8f6d7b90c
|
|
implicit conversion from type 'int' of value 42126 (32-bit, signed)
to type 'tran_low_t' (aka 'short') changed the value to -23410 (16-bit, signed)
BUG=webm:1615
Change-Id: I339c640fce81e9f2dd73ef9c9bee084b6a5638dc
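
The arithmetic behind that sanitizer report can be reproduced in a few lines.
This is a minimal sketch, assuming a 16-bit tran_low_t as the message states.
TranLow16, NarrowToTranLow and FitsInTranLow are hypothetical names used for
illustration, not libvpx functions. Storing 42126 in a signed 16-bit type
keeps only the low 16 bits, so the result is 42126 - 65536 = -23410.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a 16-bit tran_low_t (low bit depth build).
using TranLow16 = std::int16_t;

// Narrowing store: keeps only the low 16 bits of the value
// (modular on two's complement targets, guaranteed since C++20).
inline TranLow16 NarrowToTranLow(int value) {
  return static_cast<TranLow16>(value);
}

// Range check a caller could use to detect the lossy case up front.
inline bool FitsInTranLow(int value) {
  return value >= std::numeric_limits<TranLow16>::min() &&
         value <= std::numeric_limits<TranLow16>::max();
}
```

The message does not say how the commit resolved the warning; the range
check is only one way such a case could be detected.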
|
|
Change-Id: I7850a5c5aea3633e50e9a2efc8116b9e16383a8f
|
|
C++11 has been required since:
77fa51003 Replace deprecated scoped_ptr with unique_ptr
so <tuple> is safe to use.
Change-Id: I873cb953104b361a8503b5839a3372ce2b99e73c
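
For context, a value-parameterized test can bundle its parameters in a
std::tuple and unpack them with std::get. The sketch below uses only the
standard library. The parameter layout (QuantizeParamSketch with a bit depth,
a block size, and a flag) and the accessor names are hypothetical and are not
the parameter type used by the actual test.

```cpp
#include <tuple>

// Hypothetical parameter bundle: bit depth, block size, 32x32 flag.
using QuantizeParamSketch = std::tuple<int, int, bool>;

// Accessors that unpack the tuple by position.
inline int BitDepthOf(const QuantizeParamSketch &p) { return std::get<0>(p); }

inline int CoeffCountOf(const QuantizeParamSketch &p) {
  const int size = std::get<1>(p);
  return size * size;  // A square block has size * size coefficients.
}

inline bool Is32x32(const QuantizeParamSketch &p) { return std::get<2>(p); }
```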
|
|
BUG=webm:1448
Change-Id: I2140fb9b6ce92716d2d9509f3031244088a62127
|
|
This slows down low bitdepth builds but is necessary to obtain correct
values.
BUG=webm:1448
Change-Id: I4ca9145f576089bb8496fcfeedeb556dc8fe6574
|
|
Calculate the high bits of dqcoeff and store them appropriately in high
bit depth builds.
Low bit depth builds still do not pass. C truncates the results after
division. X86 only supports packing with saturation at this step.
BUG=webm:1448
Change-Id: Ic80def575136c7ca37edf18d21e26925b475da98
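
The mismatch described above, where C truncates and x86 packs with
saturation, can be illustrated in scalar code. This is a sketch under
assumptions: the function names are hypothetical, and the (qcoeff * dequant)
/ 2 form is used only as an example of a dequantization whose 32-bit result
may not fit in 16 bits.

```cpp
#include <cstdint>
#include <limits>

// Illustrative dequantization computed in 32 bits (assumed form).
inline std::int32_t DequantWide(std::int16_t qcoeff, std::int16_t dequant) {
  return (static_cast<std::int32_t>(qcoeff) * dequant) / 2;
}

// Truncating store: keep the low 16 bits, as a plain C narrowing store does
// on two's complement targets.
inline std::int16_t StoreTruncated(std::int32_t v) {
  return static_cast<std::int16_t>(static_cast<std::uint16_t>(v));
}

// Saturating store: clamp to the 16-bit range, as a saturating pack does.
inline std::int16_t StoreSaturated(std::int32_t v) {
  const std::int32_t lo = std::numeric_limits<std::int16_t>::min();
  const std::int32_t hi = std::numeric_limits<std::int16_t>::max();
  if (v < lo) return static_cast<std::int16_t>(lo);
  if (v > hi) return static_cast<std::int16_t>(hi);
  return static_cast<std::int16_t>(v);
}
```

The two stores agree whenever the 32-bit value fits in 16 bits and differ
only outside that range, which is where a low bit depth comparison against
the C output would fail.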
|
|
Calculate the high bits of dqcoeff in high bit depth builds and store
them appropriately.
BUG=webm:1448
Change-Id: I61a2f8bfcf2e30765f10a94073c4d58321d2fa24
|
|
Include vpx_ports/msvc.h to avoid issues with snprintf under MSVC.
Change-Id: Ida09cff8ee3b84e09fd61de131f84b32c113fa1a
|
|
Low bit depth version only. Passes the VP9QuantizeTest test suite.
VP9QuantizeTest Speed Test (POWER8 Model 2.1)
32x32 C time = 93.1 ms (±0.4 ms), VSX time = 6.5 ms (±0.2 ms) [14.4x]
Change-Id: I7f1fd0fc987af86baf2b74147a25aee811289112
|
|
Low bit depth version only. Passes the VP9QuantizeTest test suite.
VP9QuantizeTest Speed Test (POWER8 Model 2.1)
4x4 C time = 86.3 ms (±0.7 ms), VSX time = 18.2 ms (±0.0 ms) [ 4.7x]
8x8 C time = 57.7 ms (±0.3 ms), VSX time = 7.6 ms (±0.0 ms) [ 7.6x]
16x16 C time = 50.7 ms (±0.1 ms), VSX time = 4.9 ms (±0.0 ms) [10.3x]
Change-Id: Ic09bc786c57cc89bba14624064216b52996075eb
|
|
functions: upper camelcase
members: lowercase with trailing '_'
decl order: functions (overrides marked virtual), members
after:
656e8ac61 VSX version of vpx_post_proc_down_and_across_mb_row
766d875b9 VSX version of vpx_mbpost_proc_ip
35e98a70b VSX version of vpx_mbpost_proc_down
b2898a9ad Bench Class For More Robust Speed Tests
Change-Id: Ib257bd607c5c1248d30e619ec9e8a47cc629825b
|
|
To make speed testing more robust, the AbstractBench runs the
desired code multiple times and reports the median run time with
the mean absolute deviation around the median.
To use the AbstractBench, simply add it as a parent to your test
class, and implement the run() method (with the code you want to
benchmark).
Sample output for VP9QuantizeTest
[ BENCH ] Bypass calculations 4x4 165.8 ms ( ±1.0 ms )
[ BENCH ] Full calculations 4x4 165.8 ms ( ±0.9 ms )
[ BENCH ] Bypass calculations 8x8 129.7 ms ( ±0.9 ms )
[ BENCH ] Full calculations 8x8 130.3 ms ( ±1.4 ms )
[ BENCH ] Bypass calculations 16x16 110.3 ms ( ±1.4 ms )
[ BENCH ] Full calculations 16x16 110.1 ms ( ±0.9 ms )
Change-Id: I1dd649754cb8c4c621eee2728198ea6a555f38b3
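
The statistics described above, the median run time and the mean absolute
deviation around the median, can be sketched as follows. This is not the
AbstractBench source: BenchSketch, RunNTimes, MedianMs, DeviationMs and the
two helper functions are hypothetical names, and the even-sized median (mean
of the two middle samples) is an assumption.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of the samples (mean of the two middle values for even sizes).
inline double MedianOf(std::vector<double> samples) {
  if (samples.empty()) return 0.0;
  std::sort(samples.begin(), samples.end());
  const std::size_t mid = samples.size() / 2;
  return (samples.size() % 2 == 1) ? samples[mid]
                                   : (samples[mid - 1] + samples[mid]) / 2.0;
}

// Mean absolute deviation of the samples around their median.
inline double MeanAbsDeviationAroundMedian(const std::vector<double> &samples) {
  if (samples.empty()) return 0.0;
  const double median = MedianOf(samples);
  double sum = 0.0;
  for (const double s : samples) sum += std::fabs(s - median);
  return sum / static_cast<double>(samples.size());
}

// Hypothetical base class: derive from it and implement Run().
class BenchSketch {
 public:
  virtual ~BenchSketch() = default;

  // Times Run() `runs` times and records each duration in milliseconds.
  void RunNTimes(int runs) {
    times_ms_.clear();
    for (int i = 0; i < runs; ++i) {
      const auto start = std::chrono::steady_clock::now();
      Run();
      const auto stop = std::chrono::steady_clock::now();
      times_ms_.push_back(
          std::chrono::duration<double, std::milli>(stop - start).count());
    }
  }

  double MedianMs() const { return MedianOf(times_ms_); }
  double DeviationMs() const { return MeanAbsDeviationAroundMedian(times_ms_); }

 protected:
  virtual void Run() = 0;  // The code to benchmark.

 private:
  std::vector<double> times_ms_;
};
```

The median is less sensitive than the mean to a single slow run, for example
one caused by a context switch, which is presumably why it suits speed tests.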
|
|
Low bit depth version only. Passes the VP9QuantizeTest.
VP9QuantizeTest Speed Test (POWER8 Model 2.1)
Full calculations:
C time = 1456 ms, VSX time = 80 ms (18x)
Change-Id: I1b1d6d03b1aeff63640efbdeb222cab857ddd95e
|
|
Low bit depth version only. Passes the VP9QuantizeTest.
Change-Id: I6546f872864bd404a7e353348b0554aab1de5bf0
|
|
googletest imports tuple into the testing namespace to allow for
compatibility across C++ versions where tuple may be in std::tr1 or std.
This fixes deprecation warnings under Visual Studio 2017.
Change-Id: Id78b372d5478b12d8c8f63fd3f2166fec25aa8be
|
|
Started from vp9_quantize_fp_sse2 and tweaked to use avx2.
Change-Id: Ic2da50cc9d73896c7ef2f3cd3db5b1c5d7795b8b
|
|
This C version uses the shortcuts found in the
vp9_quantize_fp_32x32_ssse3 function.
Change-Id: I2e983adb00064e070b7f2b1ac088cc58cf778137
|
|
This C version uses the shortcuts found in the x86
vp9_quantize_fp functions.
The test was updated to use the correct quant/round range.
Change-Id: Ie5871f710d9eb39047d8d9f48b907c0633e1f830
|
|
This reverts commit 86842855d30d6ca6befdcf5108003e027d90daa9.
SSSE3/VP9QuantizeTest.EOBCheck/1 fails on Mac and the build breaks under
visual studio due to a #if within another macro.
Change-Id: I475095a04aafcc714fade2b24e4df7b682be2cd1
|
|
This C version uses the shortcuts found in the x86
vp9_quantize_fp functions.
The test was updated to use the correct quant/round range.
Change-Id: I5d19f8af2fddda8e50910249eafb740acb29415b
|
|
This reverts commit 8c42237bb200253931c49e2c530838f3a877dd65.
Because ssse3 code is used for the reference, the qcoeff and dqcoeff
reference buffers must be aligned.
Original change's description:
> quantize avx: copy 32x32 implementation
>
> Ensure avx and ssse3 stay in sync by testing them against each other.
>
> Change-Id: I699f3b48785c83260825402d7826231f475f697c
Change-Id: Ieeef11b9406964194028b0d81d84bcb63296ae06
|
|
|
|
This reverts commit f60d1dcd3de46f72bafc5eeef481bd1a4e203301.
Reason for revert: Failures in AVX/VP9QuantizeTest in nightly tests.
Original change's description:
> quantize avx: copy 32x32 implementation
>
> Ensure avx and ssse3 stay in sync by testing them against each other.
>
> Change-Id: I699f3b48785c83260825402d7826231f475f697c
TBR=slavarnway@google.com,johannkoenig@google.com,builds@webmproject.org
Change-Id: Ibd38636212269328317dd0721be9d25452113d1c
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
|
|
|
|
|
|
Ensure avx and ssse3 stay in sync by testing them against each other.
Change-Id: I699f3b48785c83260825402d7826231f475f697c
|