path: root/vp9
Age | Commit message | Author
2013-03-07 | Optimize add_constant_residual function | Yunqing Wang
Optimized adding constant diff to predictor, which gave about 2% decoder performance gain. Change-Id: I47db20c31428e8c4a8f16214a85cbe386a6e9303
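A minimal scalar sketch of what "adding a constant diff to the predictor" amounts to, assuming 8-bit pixels that saturate to the range 0..255. The names `clip_pixel_sketch` and `add_constant_residual_sketch` and the parameter list are illustrative assumptions, not the prototype used in the tree, and the optimized version in the commit is not reproduced here.

```c
#include <stdint.h>

/* Clamp an intermediate value to the 8-bit pixel range. */
static uint8_t clip_pixel_sketch(int v) {
  if (v < 0) return 0;
  if (v > 255) return 255;
  return (uint8_t)v;
}

/* Hypothetical helper: dest = clip(pred + diff) for every pixel of a
 * width x height block. The same constant diff is added everywhere. */
static void add_constant_residual_sketch(int diff,
                                         const uint8_t *pred, int pred_stride,
                                         uint8_t *dest, int dest_stride,
                                         int width, int height) {
  for (int r = 0; r < height; ++r) {
    for (int c = 0; c < width; ++c) {
      dest[c] = clip_pixel_sketch(pred[c] + diff);
    }
    pred += pred_stride;
    dest += dest_stride;
  }
}
```

Because the diff is the same for every pixel, a SIMD version can broadcast it once and use saturating adds, which is presumably where a speedup like the one reported would come from.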
2013-03-07 | Merge "Allocate 16-byte aligned diff buffer" into experimental | Yunqing Wang
2013-03-07 | Allocate 16-byte aligned diff buffer | Yunqing Wang
This was done based on John's suggestion. Change-Id: I62516a513c31fe3dbea0d6cd063df79d9e819ec8
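As a rough illustration of a 16-byte aligned diff buffer, the sketch below uses the C11 `alignas` specifier. The struct name, the buffer size, and the use of C11 alignment support are assumptions made for this example; the commit may well use a project-specific alignment macro instead.

```c
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical container: the diff buffer is forced onto a 16-byte
 * boundary so that aligned 128-bit SIMD loads and stores are valid. */
typedef struct {
  alignas(16) int16_t diff[16 * 16]; /* size chosen only for illustration */
} diff_buffer_sketch;

/* Returns 1 when p sits on a 16-byte boundary, 0 otherwise. */
static int is_aligned_16(const void *p) {
  return ((uintptr_t)p & 15u) == 0;
}
```

A check such as `is_aligned_16(buf.diff)` can guard a SIMD path and fall back to unaligned loads when it fails.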
2013-03-07 | Update ADST selection if tx_size < block_size. | Ronald S. Bultje
Change-Id: Ic9b336486774c95ffbb92adcb110cc0fc2a83cc5
2013-03-07 | Re-add support for ADST in superblocks. | Ronald S. Bultje
This also changes the RD search to take account of the correct block index when searching (this is required for ADST positioning to work correctly in combination with tx_select). Change-Id: Ie50d05b3a024a64ecd0b376887aa38ac5f7b6af6
2013-03-07 | Fix issue in add_residual intrinsic function | Yunqing Wang
Yaowu found that this function failed to compile with MSVC because it used _mm_storel_pi((__m64 *)(dest + 0 * stride), (__m128)p0). To be safe, changed back to an integer store instruction. Also, in some builds diff is not always 16-byte aligned, so the code was changed to handle that. Change-Id: I9995e5446af15dad18f3c5c0bad1ae68abef6c0d
2013-03-07 | Coding non-zero count rather than EOB for coeffs | Deb Mukherjee
This patch revamps the entropy coding of coefficients to first code a non-zero count (nzc) per coded block and correspondingly remove the EOB token from the token set. STATUS: Main encode/decode code achieving encode/decode sync - done. Forward and backward probability updates to the nzcs - done. RD costing updates for nzcs - done. Note: The dynamic programming approach used in trellis quantization is not exactly compatible with nzcs. A suboptimal approach has been used instead where branch costs are updated to account for changes in the nzcs. TODO: Training the default probs/counts for nzcs. Change-Id: I951bc1e22f47885077a7453a09b0493daa77883d
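To make the difference between the two signals concrete, here is a small sketch that computes both the end-of-block position and the non-zero count for coefficients already arranged in scan order. The function names are hypothetical, and the sketch says nothing about how the patch actually codes either value.

```c
#include <stdint.h>

/* End of block: index one past the last non-zero coefficient. */
static int end_of_block_sketch(const int16_t *coeffs, int n) {
  int eob = 0;
  for (int i = 0; i < n; ++i) {
    if (coeffs[i] != 0) eob = i + 1;
  }
  return eob;
}

/* Non-zero count: how many coefficients in the block are non-zero. */
static int nonzero_count_sketch(const int16_t *coeffs, int n) {
  int nzc = 0;
  for (int i = 0; i < n; ++i) {
    if (coeffs[i] != 0) ++nzc;
  }
  return nzc;
}
```

With the count known up front, a decoder can stop after it has read that many non-zero coefficients, which is why an explicit EOB token becomes unnecessary.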
2013-03-06 | Merge "Code cleanup." into experimental | Dmitry Kovalev
2013-03-06 | Merge "Added stricter Q control flag." into experimental | Paul Wilkins
2013-03-06 | Added stricter Q control flag. | Paul Wilkins
Added a variant of the one shot maxQ flag for two pass that forces a fixed Q for the normal inter frames. Disabled by default. Also small adjustment to the Bits per MB estimation. Change-Id: I87efdfb2d094fe1340ca9ddae37470d7b278c8b8
2013-03-05 | Merge "Optimize add_residual function" into experimental | Yunqing Wang
2013-03-05 | Optimize add_residual function | Yunqing Wang
Optimized adding diff to predictor, which gave 0.8% decoder performance gain. Change-Id: Ic920f0baa8cbd13a73fa77b7f9da83b58749f0f8
2013-03-05 | Code cleanup. | Dmitry Kovalev
Removing redundant 'extern' keywords, fixing formatting and #include order, code simplification. Change-Id: I0e5fdc8009010f3f885f13b5d76859b9da511758
2013-03-05 | Merge changes Ifacbf5a0,Ibad7c3dd into experimental | Ronald S. Bultje
* changes:
  vpxenc: actually report mismatch on stderr.
  Make superblocks independent of macroblock code and data.
2013-03-04 | Merge "Code cleanup and simplification of build_4x4uvmvs function." into experimental | Dmitry Kovalev
2013-03-04 | Make superblocks independent of macroblock code and data. | Ronald S. Bultje
Split macroblock and superblock tokenization and detokenization functions and coefficient-related data structs so that the bitstream layout and related code of superblock coefficients looks less like it's a hack to fit macroblocks in superblocks. In addition, unify chroma transform size selection from luma transform size (i.e. always use the same size, as long as it fits the predictor); in practice, this means 32x32 and 64x64 superblocks using the 16x16 luma transform will now use the 16x16 (instead of the 8x8) chroma transform, and 64x64 superblocks using the 32x32 luma transform will now use the 32x32 (instead of the 16x16) chroma transform. Lastly, add a trellis optimize function for 32x32 transform blocks. HD gains about 0.3%, STDHD about 0.15% and derf about 0.1%. There's a few negative points here and there that I might want to analyze a little closer. Change-Id: Ibad7c3ddfe1acfc52771dfc27c03e9783e054430
2013-03-04 | Merge "Code cleanup." into experimental | Dmitry Kovalev
2013-03-04 | Merge "Optimize vp9_short_idct4x4llm function" into experimental | Yunqing Wang
2013-03-04 | Optimize vp9_short_idct4x4llm function | Yunqing Wang
Wrote a SSE2 vp9_short_idct4x4llm to improve the decoder performance. Change-Id: I90b9d48c4bf37aaf47995bffe7e584e6d4a2c000
2013-03-04 | Support 16K sequence coding | Jingning Han
Fixed a couple of variable/function definitions, as well as header handling to support 16K sequence coding at high bit-rates. The width and height are each specified by two bytes in the header. Use an extra byte to explicitly indicate the scaling factors in both directions, each ranging from 0 to 15. Tested coding up to 16400x16400 dimension. Change-Id: Ibc2225c6036620270f2c0cf5172d1760aaec10ec
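The layout described above (two bytes each for width and height, plus one extra byte carrying two scaling factors in the range 0 to 15) could be packed as in the sketch below. The little-endian byte order, the placement of the horizontal factor in the high nibble, and all names are assumptions for illustration; the real header layout is not given in the commit message.

```c
#include <stdint.h>

/* Hypothetical in-memory form of the frame size fields. */
typedef struct {
  uint16_t width;      /* fits in two bytes, so up to 65535 */
  uint16_t height;
  uint8_t horiz_scale; /* 0..15 */
  uint8_t vert_scale;  /* 0..15 */
} frame_size_sketch;

/* Pack into 5 bytes: width (LE), height (LE), then one byte holding
 * both 4-bit scaling factors. Byte and nibble order are assumed. */
static void write_frame_size_sketch(uint8_t out[5],
                                    const frame_size_sketch *s) {
  out[0] = (uint8_t)(s->width & 0xff);
  out[1] = (uint8_t)(s->width >> 8);
  out[2] = (uint8_t)(s->height & 0xff);
  out[3] = (uint8_t)(s->height >> 8);
  out[4] = (uint8_t)(((s->horiz_scale & 0x0f) << 4) | (s->vert_scale & 0x0f));
}

/* Inverse of write_frame_size_sketch. */
static frame_size_sketch read_frame_size_sketch(const uint8_t in[5]) {
  frame_size_sketch s;
  s.width = (uint16_t)(in[0] | (in[1] << 8));
  s.height = (uint16_t)(in[2] | (in[3] << 8));
  s.horiz_scale = (uint8_t)(in[4] >> 4);
  s.vert_scale = (uint8_t)(in[4] & 0x0f);
  return s;
}
```

A 16400x16400 frame, the largest size the message reports testing, fits comfortably because 16400 is below the two-byte limit of 65535.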
2013-03-01 | Add unit test for x4 multi-SAD functions | John Koleszar
Update the function prototypes to match between VP9 and VP8. Change-Id: If58965073989e87df3b62b67a030ec6ce23ca04f
2013-03-01 | Code cleanup and simplification of build_4x4uvmvs function. | Dmitry Kovalev
Change-Id: Iab0176f058045181821ded95ff1cf423af1625f9
2013-03-01 | Code cleanup. | Dmitry Kovalev
Removing redundant 'extern' keyword, lowercase variable names. Change-Id: I608e8d8579aba8981f5fac3493f77b4481b13808
2013-03-01 | Merge master branch into experimental | John Koleszar
Picks up some build system changes, compiler warning fixes, etc. Change-Id: I2712f99e653502818a101a72696ad54018152d4e
2013-03-01 | Merge "Adjust the max_gf_interval initialization" into experimental | Yaowu Xu
2013-03-01 | Merge "Add eob<=10 case in idct32x32" into experimental | Yunqing Wang
2013-03-01 | Adjust the max_gf_interval initialization | Yaowu Xu
Adjust the max_gf_interval initialization to be a fixed value of 15. Test results: cif: .124%, .068%, .081%; std-hd: 2.809%, 3.174%, 2.705%. Change-Id: I380c8152c973506094da15eab59e3aa22b75a983
2013-02-28 | Merge "Code cleanup." into experimental | Dmitry Kovalev
2013-02-28 | Add eob<=10 case in idct32x32 | Yunqing Wang
Simplified the idct32x32 calculation when there are only 10 or fewer non-zero coefficients in a 32x32 block. This helps the decoder performance. Change-Id: If7f8893d27b64a9892b4b2621a37fdf4ac0c2a6d
2013-02-28 | Merge changes I9be9c990,Ic3b97339 into experimental | Dmitry Kovalev
* changes:
  Ignoring test video sequences in the source tree.
  Code cleanup.
2013-02-28 | firstpass.c: correct casting around gf_group_bits | James Zern
gf_group_bits is int64_t; remove casts to int. Change-Id: I3b4225905041fac9af9fdfcbcb6f1c357ea4b593
2013-02-28 | Merge "Fix use of uninitialized memory in CONFIG_ABOVESPREFMV" into experimental | John Koleszar
2013-02-28 | Merge "mv dct_sse2.c dct_sse2_intrinsics.c to avoid collision" into experimental | Jim Bankoski
2013-02-28 | Code cleanup. | Dmitry Kovalev
Lower case variable names, converting while loops to for loops. Change-Id: Ic3b973391eef7472a99d18d02fe79cfef5e04e62
2013-02-28 | Merge "Refactor vp9_dequant_idct_add function" into experimental | Yunqing Wang
2013-02-28 | Refactor vp9_dequant_idct_add function | Yunqing Wang
Provided a wrapper and removed duplicate code. Change-Id: Iaef842226ec348422e459202793b001d0983ea30
2013-02-28 | Removed vp9_dequantize_b | Scott LaVarnway
Change-Id: Ie89bd00d58e30bf4094cb748a282f1dfa81a31d8
2013-02-28 | mv dct_sse2.c dct_sse2_intrinsics.c to avoid collision | Jim Bankoski
Change-Id: Id786be31da3c91d95d2955aa569ecdc6e66650df
2013-02-28 | Fix use of uninitialized memory in CONFIG_ABOVESPREFMV | John Koleszar
The ABOVESPREFMV experiment uses four pixels to the left of the current block, which don't exist for the left-most column. Change-Id: I4cf0b42ae8f54c0b3e7b1ed8755704b74fafc39c
2013-02-28 | Merge "Dequantization code cleanup." into experimental | Dmitry Kovalev
2013-02-28 | Dequantization code cleanup. | Dmitry Kovalev
Removing redundant variables, using x *= y instead of x = x * y, moving variable declarations into inner blocks. Change-Id: I884f95c755f55d51b7c1c6585f10296919063e41
2013-02-28 | Code cleanup. | Dmitry Kovalev
Removing redundant 'extern' keyword, better formatting, code simplification. Change-Id: I132fea14f08c706ee9ea147d19464d03f833f25b
2013-02-28 | Fix incorrect comparison of frame size | John Koleszar
The width and height stored in the reference frames are padded out to a multiple of 16. The Width and Height variables in common are the displayed size, which may be smaller. The incorrect comparison was causing scaling related code to be called when it shouldn't have been. A notable case where this happens is 1080p, since 1088 != 1080. Change-Id: I55f743eeeeaefbf2e777e193bc9a77ff726e16b5
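The 1080p case can be reproduced with a few lines of arithmetic. The sketch below pads a dimension up to a multiple of 16 and then compares like with like. Both function names are hypothetical, and comparing padded against padded is only one plausible way to make the check consistent; the commit message does not say which side of the comparison was changed.

```c
/* Round v up to the next multiple of 16 (v is assumed non-negative). */
static int align_to_16_sketch(int v) {
  return (v + 15) & ~15;
}

/* Hypothetical check: the reference frame stores padded dimensions,
 * so pad the displayed size before comparing. Returns non-zero when
 * the sizes genuinely differ and scaling code should run. */
static int needs_scaling_sketch(int ref_width, int ref_height,
                                int display_width, int display_height) {
  return ref_width != align_to_16_sketch(display_width) ||
         ref_height != align_to_16_sketch(display_height);
}
```

For a 1920x1080 stream the reference frame is stored as 1920x1088, so a naive `1088 != 1080` test reports a mismatch while the padded comparison does not.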
2013-02-28 | this commit converts all sad ptrs to uint32 | Jim Bankoski
The sse4_1 code used uint16_t for returning SAD, but that won't work for 32x32 or 64x64 blocks. This commit fixes the assembly for those sizes and also re-enables sse4_1 on Linux. Change-Id: I5ce7288d581db870a148e5f7c5092826f59edd81
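The reason a 16-bit SAD result breaks for larger blocks follows from the worst case of 255 per pixel. The scalar sketch below (hypothetical names, not the assembly touched by the commit) accumulates into a `uint32_t` and exposes the worst-case bound for a given block size.

```c
#include <stdint.h>

/* Sum of absolute differences over a width x height block,
 * accumulated in 32 bits so large blocks cannot overflow. */
static uint32_t sad_sketch(const uint8_t *src, int src_stride,
                           const uint8_t *ref, int ref_stride,
                           int width, int height) {
  uint32_t sad = 0;
  for (int r = 0; r < height; ++r) {
    for (int c = 0; c < width; ++c) {
      int d = src[c] - ref[c];
      sad += (uint32_t)(d < 0 ? -d : d);
    }
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}

/* Worst-case SAD for 8-bit pixels: every pixel differs by 255. */
static uint32_t max_sad_sketch(int width, int height) {
  return 255u * (uint32_t)width * (uint32_t)height;
}
```

A 16x16 block tops out at 65280, which still fits in 16 bits, while 32x32 can reach 261120 and 64x64 can reach 1044480, both beyond `UINT16_MAX`.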
2013-02-28 | fix to parameters to match rtcd | Jim Bankoski
Change-Id: I919e2dd72292fe44f2e53ada56bd42287d50cdeb Signed-off-by: Jim Bankoski <jimbankoski@google.com>
2013-02-27 | Faster vp9_short_fdct8x8. | Christian Duvivier
Scalar path is about 1.4x faster (4% overall encoder speedup). SSE2 path is about 7x faster (13% overall encoder speedup). Change-Id: I7e85d8225a914a74c61ea370210414696560094d
2013-02-27 | Code cleanup. | Dmitry Kovalev
Fixing code style, using array lookup instead of switch statements for forward hybrid transforms (in the same way as for their inverses). Consistent usage of ROUND_POWER_OF_TWO macro in appropriate places. Change-Id: I0d3822ae11f928905fdbfbe4158f91d97c71015f
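For readers unfamiliar with the rounding idiom, the sketch below shows a round-to-nearest right shift of the kind a `ROUND_POWER_OF_TWO` macro typically wraps. The definition given here is an assumption written for this example and may differ from the one in the tree; `round_shift_sketch` is a hypothetical wrapper.

```c
/* Assumed definition: add half of 2^n, then shift right by n.
 * Valid for n >= 1 and values where the addition does not overflow. */
#define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n) - 1))) >> (n))

/* Replaces hand-written forms such as (x + 8) >> 4 with one spelling. */
static int round_shift_sketch(int value, int n) {
  return ROUND_POWER_OF_TWO(value, n);
}
```

Using one macro everywhere avoids mismatched rounding constants, for example `(x + 4) >> 4` written by mistake in place of `(x + 8) >> 4`.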
2013-02-27 | Merge "Motion vectors code cleanup." into experimental | Dmitry Kovalev
2013-02-27 | Merge "Remove unused file" into experimental | Yunqing Wang
2013-02-27 | Merge "Remove unused vp9_copy32xn" into experimental | John Koleszar