path: root/test/vp9_quantize_test.cc
Age | Commit message | Author
2018-06-18 | include msvc.h for snprintf support in benchmarks | Luc Trudeau
Include vpx_ports/msvc.h to avoid snprintf issues with MSVC.
Change-Id: Ida09cff8ee3b84e09fd61de131f84b32c113fa1a
2018-06-11 | VSX Version of vp9_quantize_fp_32x32 | Luc Trudeau
Low bit depth version only. Passes the VP9QuantizeTest test suite.
VP9QuantizeTest Speed Test (POWER8 Model 2.1)
32x32 C time = 93.1 ms (±0.4 ms), VSX time = 6.5 ms (±0.2 ms) [14.4x]
Change-Id: I7f1fd0fc987af86baf2b74147a25aee811289112
2018-06-11 | VSX Version of vp9_quantize_fp | Luc Trudeau
Low bit depth version only. Passes the VP9QuantizeTest test suite.
VP9QuantizeTest Speed Test (POWER8 Model 2.1)
4x4 C time = 86.3 ms (±0.7 ms), VSX time = 18.2 ms (±0.0 ms) [ 4.7x]
8x8 C time = 57.7 ms (±0.3 ms), VSX time = 7.6 ms (±0.0 ms) [ 7.6x]
16x16 C time = 50.7 ms (±0.1 ms), VSX time = 4.9 ms (±0.0 ms) [10.3x]
Change-Id: Ic09bc786c57cc89bba14624064216b52996075eb
2018-06-04 | test,cosmetics: fix func/member naming, decl order | James Zern
functions: upper camelcase
members: lowercase with trailing '_'
decl order: functions (overrides marked virtual), members
after:
656e8ac61 VSX version of vpx_post_proc_down_and_across_mb_row
766d875b9 VSX version of vpx_mbpost_proc_ip
35e98a70b VSX version of vpx_mbpost_proc_down
b2898a9ad Bench Class For More Robust Speed Tests
Change-Id: Ib257bd607c5c1248d30e619ec9e8a47cc629825b
2018-05-29 | Bench Class For More Robust Speed Tests | Luc Trudeau
To make speed testing more robust, the AbstractBench runs the desired code multiple times and reports the median run time with the mean absolute deviation around the median.
To use the AbstractBench, simply add it as a parent to your test class and implement the run() method (with the code you want to benchmark).
Sample output for VP9QuantizeTest:
[ BENCH ] Bypass calculations 4x4 165.8 ms ( ±1.0 ms )
[ BENCH ] Full calculations 4x4 165.8 ms ( ±0.9 ms )
[ BENCH ] Bypass calculations 8x8 129.7 ms ( ±0.9 ms )
[ BENCH ] Full calculations 8x8 130.3 ms ( ±1.4 ms )
[ BENCH ] Bypass calculations 16x16 110.3 ms ( ±1.4 ms )
[ BENCH ] Full calculations 16x16 110.1 ms ( ±0.9 ms )
Change-Id: I1dd649754cb8c4c621eee2728198ea6a555f38b3
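The reporting scheme described in this commit (repeat the run, then report the median and the mean absolute deviation around it) can be sketched roughly as follows. This is an illustration only: `BenchSketch`, `RunNTimes`, `PrintMedian`, `Median` and `MeanAbsDeviation` are hypothetical names chosen for this sketch and are not taken from the libvpx tree.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Median of a set of samples; takes a copy because sorting reorders it.
double Median(std::vector<double> samples) {
  if (samples.empty()) return 0.0;
  std::sort(samples.begin(), samples.end());
  const std::size_t mid = samples.size() / 2;
  return (samples.size() % 2 == 1) ? samples[mid]
                                   : (samples[mid - 1] + samples[mid]) / 2.0;
}

// Mean absolute deviation of the samples around a given center (the median).
double MeanAbsDeviation(const std::vector<double> &samples, double center) {
  if (samples.empty()) return 0.0;
  double sum = 0.0;
  for (double s : samples) sum += std::fabs(s - center);
  return sum / static_cast<double>(samples.size());
}

// Hypothetical stand-in for the bench parent class (all names are assumptions).
class BenchSketch {
 public:
  virtual ~BenchSketch() {}

  // Times Run() `runs` times, recording each duration in milliseconds.
  void RunNTimes(int runs) {
    times_ms_.clear();
    for (int i = 0; i < runs; ++i) {
      const auto start = std::chrono::steady_clock::now();
      Run();
      const auto stop = std::chrono::steady_clock::now();
      times_ms_.push_back(
          std::chrono::duration<double, std::milli>(stop - start).count());
    }
  }

  // Prints the median run time and the deviation around it.
  void PrintMedian(const char *title) const {
    const double median = Median(times_ms_);
    std::printf("[    BENCH ] %s %.1f ms ( ±%.1f ms )\n", title, median,
                MeanAbsDeviation(times_ms_, median));
  }

 protected:
  virtual void Run() = 0;  // The code under measurement.

 private:
  std::vector<double> times_ms_;
};
```

A test class would derive from `BenchSketch`, override `Run()` with the code to measure, then call `RunNTimes()` followed by `PrintMedian()`. The median is less sensitive to a single slow outlier run than the mean, which is presumably why the commit reports it.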
2018-05-14 | VSX version of vpx_quantize_b_32x32_vsx | Luc Trudeau
Low bit depth version only. Passes the VP9QuantizeTest.
VP9QuantizeTest Speed Test (POWER8 Model 2.1)
Full calculations: C time = 1456 ms, VSX time = 80 ms (18x)
Change-Id: I1b1d6d03b1aeff63640efbdeb222cab857ddd95e
2018-05-09 | VSX version of vpx_quantize_b_vsx | Luc Trudeau
Low bit depth version only. Passes the VP9QuantizeTest.
Change-Id: I6546f872864bd404a7e353348b0554aab1de5bf0
2018-03-28 | test: use testing::*tuple instead of std::tr1 | James Zern
googletest imports tuple into testing to allow for compatibility across c++ versions where tuple may be in std::tr1 or std.
fixes deprecation warnings under visual studio 2017
Change-Id: Id78b372d5478b12d8c8f63fd3f2166fec25aa8be
2018-01-18 | vp9_quantize_fp_avx2() | Scott LaVarnway
Started from vp9_quantize_fp_sse2 and tweaked to use avx2.
Change-Id: Ic2da50cc9d73896c7ef2f3cd3db5b1c5d7795b8b
2017-12-26 | Add quantize_fp_32x32_nz_c() | Scott LaVarnway
This c version uses the shortcuts found in the vp9_quantize_fp_32x32_ssse3 function.
Change-Id: I2e983adb00064e070b7f2b1ac088cc58cf778137
2017-12-21 | Add vp9_quantize_fp_nz_c() -- 2 | Scott LaVarnway
This c version uses the shortcuts found in the x86 vp9_quantize_fp functions. The test was updated to use the correct quant/round range.
Change-Id: Ie5871f710d9eb39047d8d9f48b907c0633e1f830
2017-12-21 | Revert "Add vp9_quantize_fp_nz_c()" | James Zern
This reverts commit 86842855d30d6ca6befdcf5108003e027d90daa9.
SSSE3/VP9QuantizeTest.EOBCheck/1 fails on Mac and the build breaks under visual studio due to a #if within another macro.
Change-Id: I475095a04aafcc714fade2b24e4df7b682be2cd1
2017-12-19 | Add vp9_quantize_fp_nz_c() | Scott LaVarnway
This c version uses the shortcuts found in the x86 vp9_quantize_fp functions. The test was updated to use the correct quant/round range.
Change-Id: I5d19f8af2fddda8e50910249eafb740acb29415b
2017-09-12 | Revert "Revert "quantize avx: copy 32x32 implementation"" | Johann
This reverts commit 8c42237bb200253931c49e2c530838f3a877dd65.
Because ssse3 code is used for the reference, the qcoeff and dqcoeff reference buffers must be aligned.
Original change's description:
> quantize avx: copy 32x32 implementation
>
> Ensure avx and ssse3 stay in sync by testing them against each other.
>
> Change-Id: I699f3b48785c83260825402d7826231f475f697c
Change-Id: Ieeef11b9406964194028b0d81d84bcb63296ae06
2017-08-25 | Merge "Revert "quantize avx: copy 32x32 implementation"" | Marco Paniconi
2017-08-25 | Revert "quantize avx: copy 32x32 implementation" | Marco Paniconi
This reverts commit f60d1dcd3de46f72bafc5eeef481bd1a4e203301.
Reason for revert: <INSERT REASONING HERE>
Failures in AVX/VP9QuantizeTest in nightly tests.
Original change's description:
> quantize avx: copy 32x32 implementation
>
> Ensure avx and ssse3 stay in sync by testing them against each other.
>
> Change-Id: I699f3b48785c83260825402d7826231f475f697c
TBR=slavarnway@google.com,johannkoenig@google.com,builds@webmproject.org
Change-Id: Ibd38636212269328317dd0721be9d25452113d1c
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2017-08-24 | Merge "quantize avx: copy 32x32 implementation" | Johann Koenig
2017-08-24 | Merge "quantize test: skip block was removed" | Johann Koenig
2017-08-24 | quantize avx: copy 32x32 implementation | Johann
Ensure avx and ssse3 stay in sync by testing them against each other.
Change-Id: I699f3b48785c83260825402d7826231f475f697c
2017-08-24 | quantize ssse3: copy implementation to intrinsics | Johann
Still does not pass tests. Does match the previous assembly, although saving the sign before multiplying is dubious.
Change-Id: Ia163f18c755aba542d6e93f7bf7343184660df5a
2017-08-24 | quantize test: skip block was removed | Johann
Change-Id: I1d93698bc27529b0544d79dd7b9fe37afa51ef87
2017-08-23 | quantize test: set threshold for 32x32 | Johann
Change-Id: I77be617c7d7c64929dd51c6077322f4f8ad23897
2017-08-23 | Merge "quantize avx: copy implementation to intrinsics" | Johann Koenig
2017-08-23 | quantize avx: copy implementation to intrinsics | Johann
Adds an early exit based on ptest. Slightly slower than ssse3 in the full case because of the extra check, but potentially faster if lots of rows can be skipped.
Very close in speed to the assembly. Can run in 32 bit, unlike the assembly. Allows reworking the function prototype to use structs.
Change-Id: If80e2b9ba059370a4cad3c973196e82a97b4330e
2017-08-23 | quantize fp: neon implementation | Johann
About 4x faster when values are below the dequant threshold and 10x faster if everything needs to be calculated. Both numbers would improve if the division for dqcoeff could be simplified.
BUG=webm:1426
Change-Id: I8da67c1f3fcb4abed8751990c1afe00bc841f4b2
2017-08-21 | quantize test: test _fp_ version of quantize | Johann
None of the x86 optimizations pass the tests.
Change-Id: Ic67f2ba1977b657e68f2a13b0711fc5fcbafd909
2017-08-21 | Remove skip_block from quantize | Johann
This condition is handled before this code is reached. The ssse3 version of the function has always crashed when attempting to handle the skip_block condition.
Add assert() and comments regarding the usage of skip_block. Removing the parameter is a fairly involved process so leave it be for the moment.
Change-Id: Ib299f6fc6589d7ee102262cc74a7aeb60110bc5a
2017-08-15 | quantize test: quiet overflow warning | Johann
Promote the result of RandRange to signed.
Change-Id: I89313cace3bcbe9af96946bef00b6857fc48b128
2017-08-14 | Merge changes I4b4beab1,I02f74dec | Johann Koenig
* changes:
  quantize test: check skip_block
  quantize test: use negative input
2017-08-14 | disable SSSE3/VP9QuantizeTest* in hbd builds | James Zern
this test fails with the configuration similar to the assembly prior to:
d52cb5972 quantize: copy ssse3 optimizations to intrinsics
BUG=webm:1458
Change-Id: Idc5c0b84c0598259fc49609a9f0756de531d3baf
2017-08-10 | Merge "neon: vpx_quantize_b_32x32" | Johann Koenig
2017-08-08 | quantize test: check skip_block | Johann
Not all sizes were tested previously, only 4x4 and 32x32.
Change-Id: I4b4beab1b92a810a097a7306de04cc9e0e260315
2017-08-08 | quantize test: use negative input | Johann
coeff contains signed values.
Change-Id: I02f74decf30379a28122169ab3e844d0f3bd7d23
2017-08-08 | neon: vpx_quantize_b_32x32 | Johann
With skip block the neon is about twice as fast as C.
The neon has no shortcut for coeff < zbin so it always takes the same amount of time. Even if the C can take the shortcut, it is over twice as fast in neon. If it can't, that gap increases to over 10x.
BUG=webm:1426
Change-Id: I400722146c1b5a5f6289f67d85fd642463d2bfc6
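The "coeff < zbin" shortcut mentioned here can be illustrated with a deliberately simplified scalar quantizer. This is a sketch under stated assumptions, not the libvpx routine: `QuantizeSketch` and `QuantResult` are hypothetical names, a single zbin/round/step triple is used for every position, and the quantization arithmetic is a plain division rather than the multiply-and-shift used by real implementations. The sketch defines `eob` as one past the last nonzero quantized coefficient.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Outputs of the sketch: quantized and dequantized coefficients plus eob.
struct QuantResult {
  std::vector<int16_t> qcoeff;
  std::vector<int16_t> dqcoeff;
  int eob;
};

// Hypothetical, simplified scalar quantizer showing the zbin shortcut.
QuantResult QuantizeSketch(const std::vector<int16_t> &coeff, int zbin,
                           int round, int step) {
  QuantResult out;
  out.qcoeff.assign(coeff.size(), 0);
  out.dqcoeff.assign(coeff.size(), 0);
  out.eob = 0;
  for (std::size_t i = 0; i < coeff.size(); ++i) {
    const int abs_coeff = std::abs(static_cast<int>(coeff[i]));
    // Shortcut: values inside the zero bin stay zero, so skip the arithmetic.
    if (abs_coeff < zbin) continue;
    const int q = (abs_coeff + round) / step;
    const int signed_q = coeff[i] < 0 ? -q : q;
    out.qcoeff[i] = static_cast<int16_t>(signed_q);
    out.dqcoeff[i] = static_cast<int16_t>(signed_q * step);
    if (q != 0) out.eob = static_cast<int>(i) + 1;
  }
  return out;
}
```

A scalar loop like this gets cheaper as more coefficients fall inside the zero bin, whereas a vector version that processes every lane unconditionally takes constant time, which matches the timing behaviour the commit message describes.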
2017-08-08 | quantize: copy ssse3 optimizations to intrinsics | Johann
Fairly minor differences from sse2. pabsw and psignw are the big gains. Also re-uses some values in eob calculation to avoid an extra pcmp.
Fixes test failures in HBD and OS X builds. Allows using it in 32bit builds, where it is about 40% faster than sse2.
Substantially faster than the assembly for skip_block. 10-20% faster the rest of the time.
Change-Id: If783bb3567e561e47667e10133b9c84414a334e2
2017-08-04 | quantize test: consolidate sizes | Johann
Pass a max txfm size parameter and combine the base quantize test with the 32x32 test.
Change-Id: I72ddf020fe6888e864ea9f3642ee2d9a8e48a04b
2017-08-02 | quantize test: add speed comparison | Johann
Test some possible scenarios.
Change-Id: I1a612e7153b31756be66390ceea55877856d5a33
2017-07-31 | neon: vpx_quantize_b | Johann
With skip block or coeff < zbin it is about twice as fast as C. If most coeff values are > zbin it is about 10-15x as fast as C.
BUG=webm:1426
Change-Id: I5d3c007b014a372d5ef0882b39bb48983b4131c7
2017-07-20 | quantize test: promote RandRange() result to signed | Johann
Avoid unsigned overflow warning:
unsigned integer overflow: 19974 - 32703 cannot be represented in type 'unsigned int'
Change-Id: Ifebee014342e4c6f3b53306c0cad6ae0b465ac12
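The warning quoted in this commit arises when a subtraction is carried out in unsigned arithmetic. A minimal sketch of the difference, using the two values from the warning, is given here. The helper names `CenterUnsigned` and `CenterSigned` are hypothetical, and treating the random value as `uint32_t` is an assumption made for illustration; the commit message does not show the actual declaration of `RandRange()`.

```cpp
#include <cstdint>

// Subtracts in unsigned arithmetic: a result below zero wraps modulo 2^32.
// This is well defined in C++, but unsigned-overflow checkers report it.
uint32_t CenterUnsigned(uint32_t rand_value, uint32_t offset) {
  return rand_value - offset;
}

// Promotes the random value to signed first, so the difference is an
// ordinary negative int and nothing wraps.
int CenterSigned(uint32_t rand_value, int offset) {
  return static_cast<int>(rand_value) - offset;
}
```

With the values from the warning, `CenterSigned(19974u, 32703)` yields -12729, while `CenterUnsigned(19974u, 32703u)` wraps to 4294954567.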
2017-07-20 | quantize test: lowbd functions do not pass in highbd | Johann
qcoeff output looks OK but dqcoeff is no good.
BUG=webm:1448
Change-Id: I07211db8a8b74f1f45fdd059852e2de0e5ee18fd
2017-07-19 | quantize test: eob is output | Johann
eob values are generated by the function.
Change-Id: I8ce92100e83022bff99888a5a7e6ef378c49fda3
2017-07-18 | quantize test: test sse2 and avx optimizations | Johann
ssse3 does not pass either of the tests. avx 32x32 does not pass.
Change-Id: I62c2e31336fd2327327afaa0da896ad79a3def44
2017-07-18 | quantize test: extend arrays | Johann
Officially the quant structures are 8 elements, with one dc element and 7 repeated ac elements. The low bit depth optimizations take advantage of this to fill the xmm registers. The high bit depth version manually duplicates the values.
If all the optimizations were unified, the structure sizes could be greatly reduced.
Change-Id: Ibd7a0337a7832ce2a1a05ee433c310077e1059ae
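The 8-element layout described in this commit (one dc value followed by 7 copies of the ac value) can be sketched as below. `MakeQuantRow` is a hypothetical helper written for this illustration; it is not a function from the test or from libvpx. Eight 16-bit values occupy 128 bits, which is the width of one xmm register, so a row laid out this way can be loaded in a single vector load.

```cpp
#include <array>
#include <cstdint>

// Builds one quant row: element 0 holds the dc value and elements 1..7
// repeat the ac value. The name and signature are assumptions of this sketch.
std::array<int16_t, 8> MakeQuantRow(int16_t dc, int16_t ac) {
  std::array<int16_t, 8> row;
  row.fill(ac);  // Seven of these copies remain as the repeated ac entries.
  row[0] = dc;   // The first entry is the dc value.
  return row;
}
```

For example, `MakeQuantRow(12, 34)` gives a row whose first element is 12 and whose remaining seven elements are 34.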
2017-07-18 | quantize test: restrict and correct input | Johann
Use only valid values for quantize inputs. These were determined by looping over vp9_init_quantizer and looking for max and min values.
This allows extending the test to the low bit depth functions which were not designed to handle all possible inputs but only valid inputs.
Change-Id: I94e1d8863a49ac227845b65c6b50130e10e6319e
2017-07-13 | quantize test: use Buffer | Johann
Although the low bitdepth functions are identical (excepting the need for larger intermediate values) they do not pass these tests. This improves the error output to aid debugging.
Simplify buffer usage with Buffer and remove unnecessarily aligned variables. eob is a single element and never written using aligned instructions.
BUG=webm:1426
Change-Id: Ic95789a135cf1e8a3846d85270f2b818f6ec7e35
2016-07-27 | test: apply clang-format | clang-format
Change-Id: I0d9ab85855eb723f653a7bb09b3d0d31dd6cfd2f
2015-08-04 | Change vp9_quantize to vpx_quantize | Jingning Han
This commit removes all uses of the vp9_ prefix in vpx_dsp. It gets the vp9 folder ready to branch out vp10.
Change-Id: I2906eec179ee792b4af8c9b4161313653050e931
2015-07-17 | Migrate quantization functions from vp9/ to vpx_dsp/ | Yunqing Wang
The following quantization functions were moved:
vp9_quantize_b
vp9_quantize_b_32x32
vp9_highbd_quantize_b
vp9_highbd_quantize_b_32x32
vp9_quantize_dc
vp9_quantize_dc_32x32
vp9_highbd_quantize_dc
vp9_highbd_quantize_dc_32x32
The purpose of doing that was to allow these functions to be shared by multiple codecs.
Change-Id: Id8ab939f283353cdd07bd930d47db3d932a5d87f
2015-05-22 | Re-worked header files | Scott LaVarnway
Various header/test files had to be re-worked in order to build "Remove cm parameter from vp9_decode_block_tokens()". This patch reverts the "Remove cm" part and only contains the re-worked header files.
Change-Id: I520958a88d1991fee988a3c784d0eac40e117a32
2015-05-07 | replace DECLARE_ALIGNED_ARRAY w/DECLARE_ALIGNED | James Zern
this macro was used inconsistently and only differs in behavior from DECLARE_ALIGNED when an alignment attribute is unavailable. this macro is used with calls to assembly, while generic c-code doesn't rely on it, so in a c-only build without an alignment attribute the code will function as expected.
Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79
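The idea of an alignment macro that degrades to a plain declaration when no alignment attribute is available can be sketched as follows. `SKETCH_DECLARE_ALIGNED`, `IsAligned` and `AlignedBufferCheck` are hypothetical names for this illustration; the macro is modeled on the description in the commit message and is not copied from the libvpx headers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical alignment macro: uses a compiler attribute when one is known,
// and otherwise falls back to an ordinary, unaligned declaration.
#if defined(__GNUC__)
#define SKETCH_DECLARE_ALIGNED(n, typ, val) typ val __attribute__((aligned(n)))
#elif defined(_MSC_VER)
#define SKETCH_DECLARE_ALIGNED(n, typ, val) __declspec(align(n)) typ val
#else
#define SKETCH_DECLARE_ALIGNED(n, typ, val) typ val
#endif

// True when the pointer value is a multiple of `alignment`.
bool IsAligned(const void *ptr, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// Declares a 16-byte aligned coefficient buffer and checks its address.
bool AlignedBufferCheck() {
  SKETCH_DECLARE_ALIGNED(16, int16_t, coeffs[64]);
  coeffs[0] = 0;
  return IsAligned(coeffs, 16);
}
```

On compilers that take one of the first two branches, `AlignedBufferCheck()` returns true. In the fallback branch the buffer carries no alignment guarantee, which is harmless for generic C code but not for assembly or intrinsics that use aligned loads and stores.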