path: root/vp9/common
Age  Commit message  Author
2014-06-12  Merge "Fast computation path for forward transform and quantization"  Jingning Han
2014-06-12  Fast computation path for forward transform and quantization  Jingning Han
This commit enables a fast path computational flow for forward transformation. It checks the sse and variance of prediction residuals and decides if the quantized coefficients are all zero, dc only, or more. It then selects the corresponding coding path in the forward transformation and quantization stage. It is currently enabled in rtc coding mode. Will do it for rd coding mode next. In speed -6, the runtime for pedestrian_area 1080p at 1000 kbps goes down from 14234 ms to 13704 ms, i.e., about 4% speed-up. Overall coding performance for rtc set is changed by -0.18%. Change-Id: I0452da1786d59bc8bcbe0a35fdae9f623d1d44e1
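The path selection described in this message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the type `TxPath`, the functions `residual_stats` and `select_tx_path`, and both threshold parameters are hypothetical names, and the real thresholds would depend on the quantizer.

```c
#include <stdint.h>

/* Hypothetical names throughout; this is not libvpx's actual API. */
typedef enum { TX_PATH_ALL_ZERO, TX_PATH_DC_ONLY, TX_PATH_FULL } TxPath;

/* Compute the sum of squared errors (sse) and the AC energy (var, i.e. the
 * variance scaled by n) of an n-sample prediction residual. */
static void residual_stats(const int16_t *res, int n,
                           uint64_t *sse, uint64_t *var) {
  int64_t sum = 0;
  uint64_t sq = 0;
  for (int i = 0; i < n; ++i) {
    sum += res[i];
    sq += (uint64_t)((int64_t)res[i] * res[i]);
  }
  *sse = sq;
  *var = sq - (uint64_t)((sum * sum) / n); /* sum^2 / n <= sq always holds */
}

/* Pick a coding path from the residual statistics. The thresholds are
 * assumed inputs (in practice they would be derived from the quantizer). */
static TxPath select_tx_path(uint64_t sse, uint64_t var,
                             uint64_t ac_thresh, uint64_t dc_thresh) {
  if (var >= ac_thresh) return TX_PATH_FULL;          /* AC energy survives */
  if (sse - var >= dc_thresh) return TX_PATH_DC_ONLY; /* only the mean survives */
  return TX_PATH_ALL_ZERO;                            /* everything quantizes to 0 */
}
```

A flat residual (every sample equal) has zero AC energy, so it takes the DC-only path whenever its mean is large enough to survive quantization.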
2014-06-10  Merge changes I6abc0657,I8224fba2,I04f64a45,I5d49d119,I76b4d171,I88c11ac3  James Zern
* changes:
  vp9_sub_pixel_*variance*: disable avx2 variants
  vp9_sad*x4d: disable avx2 variants
  vp9_f(dct|ht): disable avx2 variants
  convolve: disable avx2 variants
  fdct8x8_test: add missing avx2 functions
  dct4x4_test: add missing avx2 functions
2014-06-10  vp9_sub_pixel_*variance*: disable avx2 variants  James Zern
tests failing under Win32/Win64
+ variance_test: add missing avx2 functions (partially disabled)
Change-Id: I6abc0657ea076379ab9ca65c12678b9ea199849d
2014-06-10  vp9_sad*x4d: disable avx2 variants  James Zern
tests failing under Win32/Win64
+ sad_test: add missing avx2 functions (disabled)
Change-Id: I8224fba2b270f6039ab1877d71e1e512f0081856
2014-06-10  Add mode info arrays and mode info index.  hkuang
In non-frame-parallel decoding, this works the same way as the current decoding scheme. Each time the decoder finishes decoding a frame, it swaps the current mode info pointer and the previous mode info pointer if the decoded frame needs to be shown. Both pointers come from the mode info arrays. In frame-parallel decoding this becomes more complicated, because the current frame's mode info pointer is shared with the next frame as its previous mode info pointer. When a decoder thread finishes decoding one frame and starts to work on the next available frame, it needs to retain the decoded frame's mode info pointers until the next frame finishes decoding. The mode info index serves this purpose: the decoder uses one buffer in the mode info arrays for the current frame and the other buffer to save the previously decoded frame's mode info. Change-Id: If11d57d8eb0ee38c8876158e5482177fcb229428
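The pointer swap for the non-frame-parallel case might look roughly like the sketch below. The struct layout and the names `ModeInfo`, `ModeInfoBuffers`, `mode_info_init`, and `mode_info_finish_frame` are assumptions made for illustration; the commit's actual fields are not shown in this log.

```c
/* Hypothetical stand-in for the per-block mode info record. */
typedef struct { int mode; } ModeInfo;

/* Two mode info buffers plus an index selecting the current one
 * (assumed layout, for illustration only). */
typedef struct {
  ModeInfo *mi_array[2]; /* the mode info arrays */
  int mi_idx;            /* which buffer holds the current frame */
  ModeInfo *mi;          /* current frame's mode info */
  ModeInfo *prev_mi;     /* previous frame's mode info */
} ModeInfoBuffers;

static void mode_info_init(ModeInfoBuffers *b, ModeInfo *buf0, ModeInfo *buf1) {
  b->mi_array[0] = buf0;
  b->mi_array[1] = buf1;
  b->mi_idx = 0;
  b->mi = buf0;
  b->prev_mi = buf1;
}

/* Called after a frame is decoded: swap current and previous mode info
 * only if the decoded frame needs to be shown. */
static void mode_info_finish_frame(ModeInfoBuffers *b, int show_frame) {
  if (!show_frame) return;
  b->mi_idx ^= 1;
  b->prev_mi = b->mi;
  b->mi = b->mi_array[b->mi_idx];
}
```

Keeping an explicit index, rather than only the two pointers, is what would let a decoder thread record which buffer it must retain until the next frame finishes decoding.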
2014-06-09  vp9_f(dct|ht): disable avx2 variants  James Zern
tests failing under Win32/Win64
+ dct16x16_test: add missing avx2 functions (partially disabled); this exercises the forward transforms only. There are no idct/iht implementations, so the C code is used.
Change-Id: I04f64a457fa0828a00f32b5c9fe4f55294f21f61
2014-06-09  convolve: disable avx2 variants  James Zern
tests failing under Win32/Win64 Change-Id: I5d49d11911bcda3a832b14efe5500d22597bedcf
2014-06-03  Merge "Enable SSSE3 inverse 2D-DCT with 10 non-zero coeffs"  Jingning Han
2014-06-03  Merge "Reusing existing vp9_get{8x8, 16x16}var() instead of new ones."  Dmitry Kovalev
2014-06-02  Remove Wextra warnings from vp9_sad.c  Deb Mukherjee
As a side-effect, the sad unit tests for VP8 and VP9 had to be separated. Fixes a bug in original patch: (https://gerrit.chromium.org/gerrit/#/c/70163/8) that was reverted due to a nightly test failure. Change-Id: Ia2a4e9e278fd3c89d6c3c82fcc6381320ec2a8a6
2014-06-01  Merge "Revert "Remove Wextra warnings from vp9_sad.c""  Frank Galligan
2014-06-01  Revert "Remove Wextra warnings from vp9_sad.c"  Frank Galligan
This reverts commit 916550428db803c54c993ff9d3c34b9b0bcebb7c Change-Id: I500822b03f09c64ff6ec5396c68edee9ca3b75cb
2014-05-30  Merge "Fix a potential overflow issue in inverse 16x16 full 2D-DCT"  Jingning Han
2014-05-29  Fix a potential overflow issue in inverse 16x16 full 2D-DCT  Jingning Han
An overflow issue could potentially happen in the second round 1-D transform of the SSSE3 full inverse 16x16 2D-DCT. This commit fixes this issue. Change-Id: Ia19e4888fda1cc929a28a5f89a5beec612d628dc
2014-05-29  Merge "Moving itxm_add pointer from MACROBLOCKD to MACROBLOCK."  Dmitry Kovalev
2014-05-29  Reusing existing vp9_get{8x8, 16x16}var() instead of new ones.  Dmitry Kovalev
Change-Id: I87b7c657d8813d7fb383ab519d150c0ffb1dd377
2014-05-28  Enable SSSE3 inverse 2D-DCT with 10 non-zero coeffs  Jingning Han
This commit enables the SSSE3 implementation of the inverse 2D-DCT for the case where only the first 10 coefficients are non-zero. Compared with the SSE2 version, the runtime drops from 745 cycles to 538 cycles, i.e., about a 27% speed-up. Change-Id: I18ba4128859b09c704a6ee361d69a86c09fe8dfe
2014-05-27  Merge "Fix compiling error in MSVS"  Jingning Han
2014-05-27  Fix compiling error in MSVS  Jingning Han
Need to include math.h before tmmintrin.h in some versions of MSVS. Change-Id: Ia6b83ae599316887ecf30c4e4b9e4355fb8a4219
2014-05-27  Revert "Making vp9_get_sse_sum_{8x8, 16x16} static."  Yunqing Wang
This reverts commit e8bbb3d9db797dab7c2f947cc43e8d0f168e4953. Change-Id: Ie368d36fd249d323d859d208609c711f04537bbc
2014-05-27  Merge "Remove Wextra warnings from vp9_sad.c"  Deb Mukherjee
2014-05-27  Merge "Fix decoder mismatch in sub-pixel AVX2 intrinsic filters"  Yunqing Wang
2014-05-23  Fix decoder mismatch in sub-pixel AVX2 intrinsic filters  levytamar82
The subpixel SSSE3 was fixed in this patch: https://gerrit.chromium.org/gerrit/#/c/70283/ So the equivalent AVX2 is fixed accordingly. Change-Id: Ieebbc1949c99d34b12b8b47692df71aca5001f3a
2014-05-23  Merge "Inverse 16x16 2D-DCT SSSE3 implementation"  Jingning Han
2014-05-23  Inverse 16x16 2D-DCT SSSE3 implementation  Jingning Han
This commit enables the SSSE3 implementation of full inverse 16x16 2D-DCT. The unit runtime goes down from 1642 cycles to 1519 cycles, about 7% speed-up. Change-Id: I14d2fdf9da1fb4ed1e5db7ce24f77a1bfc8ea90d
2014-05-23  Merge "Fix decoder mismatch in sub-pixel SSSE3 intrinsic filters"  Yunqing Wang
2014-05-23  Merge "Removing vp9_pragmas.h."  Dmitry Kovalev
2014-05-23  Fix decoder mismatch in sub-pixel SSSE3 intrinsic filters  Yunqing Wang
In 8-tap filtering, to guarantee the intermediate results fit in 16 bits, the order of accumulating the products needs to be done correctly, and the largest product should be added last. This patch fixed the problem using the method in commit "Correct ssse3 8/16-pixel wide sub-pixel filter calculation". Change-Id: I79d0ad60c057b15011ece84cda9648eee0809423
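The effect of accumulation order under 16-bit saturating adds can be reproduced with a scalar emulation. The sketch below is an illustration under stated assumptions, not the patch itself: `adds16` emulates a saturating 16-bit add, the four pair products stand in for what a multiply-add instruction would produce, and the min/max ordering is one assumed way of making sure the largest product is added last. All function names and the tap values used in the usage note are chosen for the example.

```c
#include <stdint.h>

/* Scalar emulation of a saturating signed 16-bit add. */
static int16_t adds16(int a, int b) {
  int s = a + b;
  if (s > INT16_MAX) s = INT16_MAX;
  if (s < INT16_MIN) s = INT16_MIN;
  return (int16_t)s;
}

/* Sum of two adjacent pixel*tap products, saturated to 16 bits. */
static int16_t pair_product(const uint8_t *p, const int8_t *f) {
  return adds16(p[0] * f[0], p[1] * f[1]);
}

/* Shift an already-rounded sum down by 7 bits and clip to 0..255. */
static uint8_t shift_clip(int sum) {
  if (sum < 0) return 0;
  sum >>= 7;
  return (uint8_t)(sum > 255 ? 255 : sum);
}

/* Reference: accumulate in a wide integer, no intermediate saturation. */
static uint8_t filter8_ref(const uint8_t *p, const int8_t *f) {
  int sum = 0;
  for (int i = 0; i < 8; ++i) sum += p[i] * f[i];
  return shift_clip(sum + 64);
}

/* Problematic order: add the pair products left to right. A large
 * intermediate can saturate before a negative product is added. */
static uint8_t filter8_sequential(const uint8_t *p, const int8_t *f) {
  int16_t sum = adds16(pair_product(p, f), pair_product(p + 2, f + 2));
  sum = adds16(sum, pair_product(p + 4, f + 4));
  sum = adds16(sum, pair_product(p + 6, f + 6));
  return shift_clip(adds16(sum, 64));
}

/* Assumed fix: add the outer products first, then the smaller and finally
 * the larger of the two center products, so saturation can only occur at
 * the last step, where the result is clipped to 255 anyway. */
static uint8_t filter8_ordered(const uint8_t *p, const int8_t *f) {
  int16_t x1 = pair_product(p + 2, f + 2);
  int16_t x2 = pair_product(p + 4, f + 4);
  int16_t lo = x1 < x2 ? x1 : x2;
  int16_t hi = x1 < x2 ? x2 : x1;
  int16_t sum = adds16(pair_product(p, f), pair_product(p + 6, f + 6));
  sum = adds16(sum, lo);
  sum = adds16(sum, hi);
  return shift_clip(adds16(sum, 64));
}
```

With pixels {255, 0, 0, 255, 255, 0, 0, 255} and illustrative taps {-1, 6, -19, 78, 78, -19, 6, -1}, the reference and the ordered version both give 255, while the sequential version saturates midway and gives 254, which is the kind of mismatch the message describes.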
2014-05-23  Merge "change to use assembly version of ssse3 filter code"  Yaowu Xu
2014-05-22  Remove Wextra warnings from vp9_sad.c  Deb Mukherjee
As a side-effect, the sad unit tests for VP8 and VP9 had to be separated. Change-Id: I068cc2391eed51e9b140ea6aba78338c5fec8d71
2014-05-22  change to use assembly version of ssse3 filter code  Yaowu Xu
Mismatches were found between the intrinsic version and the C-only version. This commit temporarily reverts to the matching assembly version to allow further investigation. Change-Id: I08436c47d4888b562c0eac8e8856d90a831442df
2014-05-22  Merge "Fix a decoding mismatch in sub-pixel filters"  Yunqing Wang
2014-05-22  Fix a decoding mismatch in sub-pixel filters  Yunqing Wang
This did the same correction as the one in commit "Correct ssse3 8/16-pixel wide sub-pixel filter calculation" to avoid saturation during filtering. Change-Id: Ife9aa3f62daf9114eb24fe38f7baa3c3f361b2d6
2014-05-22  Removing vp9_pragmas.h.  Dmitry Kovalev
Change-Id: I9120a87e27e73e496932d11716937e2fad246521
2014-05-21  Renames x86_64 specific asm files  Deb Mukherjee
Renames all x86_64 specific assembly files to consistently end in _x86_64.asm. This will be useful for build systems to handle these files differently. All new 64-bit specific assembly files should use the new naming convention. Change-Id: I36c89584967c82ffc4088b1b5044ac15d2bb7536
2014-05-21  Moving itxm_add pointer from MACROBLOCKD to MACROBLOCK.  Dmitry Kovalev
The final goal is eventually to get rid of both itxm_add and fwd_txm4x4. This patch does it in the decoder. Change-Id: Ibb3db57efbcbb1ac387c6742538a9fcf2c6f24a5
2014-05-20  Merge "Extends temporal filtering to work for 422 data"  Deb Mukherjee
2014-05-20  Extends temporal filtering to work for 422 data  Deb Mukherjee
This is needed for profiles 1 and 2. Change-Id: I5dd7644c2932d055ab89e050d4be7d4117cd1028
2014-05-20  Refactor decode_tiles and loopfilter code.  hkuang
The current decode_tiles decodes the frame tile by tile and then loopfilters the whole frame, or uses another worker thread to do the loopfiltering.

|------|------|------|------|
|Tile1-|Tile2-|Tile3-|Tile4-|
|------|------|------|------|

For example, if a tiled video has one row and four columns, decode_tiles will decode Tile1, then Tile2, then Tile3, then Tile4. While decoding each tile, decode_tile decodes row by row within that tile.

For frame-parallel decoding, decode_tiles will decode the video in row order across the tiles. So the order will be:
"Decode 1st row of Tile1" -> "Decode 1st row of Tile2" -> "Decode 1st row of Tile3" -> "Decode 1st row of Tile4" -> "Decode 2nd row of Tile1" -> "Decode 2nd row of Tile2" -> "Decode 2nd row of Tile3" -> "Decode 2nd row of Tile4" -> "loopfilter 1st row"

Change-Id: I2211f9adc6d142fbf411d491031203cb8a6dbf6b
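The two traversal orders in this message differ only in loop nesting, which a short sketch makes concrete. `DecodeStep`, `tile_major_order`, and `row_interleaved_order` are hypothetical names, tiles and rows are 0-indexed here, and the loopfilter scheduling is deliberately left out.

```c
/* One unit of decoding work: a row inside a tile (hypothetical type). */
typedef struct { int tile; int row; } DecodeStep;

/* Existing scheme: finish every row of a tile before moving to the next
 * tile. Writes tiles * rows steps into out and returns that count. */
static int tile_major_order(int tiles, int rows, DecodeStep *out) {
  int n = 0;
  for (int t = 0; t < tiles; ++t)
    for (int r = 0; r < rows; ++r) {
      out[n].tile = t;
      out[n].row = r;
      ++n;
    }
  return n;
}

/* Frame-parallel scheme: decode the same row of every tile before moving
 * down to the next row. */
static int row_interleaved_order(int tiles, int rows, DecodeStep *out) {
  int n = 0;
  for (int r = 0; r < rows; ++r)
    for (int t = 0; t < tiles; ++t) {
      out[n].tile = t;
      out[n].row = r;
      ++n;
    }
  return n;
}
```

With four tiles and two rows, the interleaved order yields row 0 of tiles 0 to 3 followed by row 1 of tiles 0 to 3, matching the sequence listed in the message. Completing whole frame rows early is what allows a row to be loopfiltered before the rest of the frame is decoded.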
2014-05-19  Merge "Hiding vp9_sub_pel_filters_{8, 8s, 8lp} filters in *.c file."  Dmitry Kovalev
2014-05-16  Removing MACROBLOCKD dependency from loop filter.  Dmitry Kovalev
Change-Id: I9ef40f3d95ab8f94f69e92ea25678a40956bc1ce
2014-05-16  Merge "Fix post-processor macros & remove vizualization"  Adrian Grange
2014-05-15  Merge "Removing redundant "8x8" suffix from MODE_INFO vars."  Dmitry Kovalev
2014-05-15  Merge "Revert "Remove Wextra warnings from vp9_sad.c""  Jim Bankoski
2014-05-15  Merge "AVX2 To VP9 Block Error Optimization"  Yunqing Wang
2014-05-15  Removing redundant "8x8" suffix from MODE_INFO vars.  Dmitry Kovalev
Change-Id: I7ed7fecc959c6598ff98895f1a5cf7e11ac1615f
2014-05-15  Fix post-processor macros & remove vizualization  Adrian Grange
Make all post-processor code conditionally compilable based on the CONFIG_VP9_POSTPROC macro. Also, remove the visualization code from VP9 since it is out of date and will not compile. Change-Id: I1e9e13a09ecd43e9a3f3704c175ae8cd258ababd
2014-05-15  Revert "Remove Wextra warnings from vp9_sad.c"  Jim Bankoski
This reverts commit 7ab9a9587b96db4edce6be916c1f02297a9555ff. The nightly test http://build.webmproject.org/jenkins/view/libvpx-nightly-tests/job/libvpx%20unit%20tests%20(valgrind-2)/arch=x86_64-linux-gcc,filter=-*VP8*:*Large.*/276/console failed. This patch did not address all the assembly issues: some of the VP8 assembly counts on 5 arguments being passed in to this function (one example: vp8_sad8x16_wmt). Please address this or split the change into VP9 and VP8 patches. Change-Id: I78afcc171649894f887bb8ee3c66de24aaddc7ca
2014-05-15  Merge "vp9_decodeframe.c: cleanup -wextra warnings"  Yaowu Xu