path: root/vp9
Age | Commit message | Author
2013-09-09Merge "Reduce the amount of extension in src frames"Yaowu Xu
2013-09-09Merge "Enable kf restrictions at speed 4"Paul Wilkins
2013-09-08Reduce the amount of extension in src framesYaowu Xu
The commit changes the border pixel extension from 160 pixels on each side to what is necessary for the ARNR filter or motion estimation, i.e. 16 pixels on the top and left sides. For the right and bottom sides, the extension either rounds the image size up to a multiple of 64 or is at least 16 pixels. Change-Id: Ic05e19b94368c1ab4df568723aae5734e6c3d2c5
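The extension rule described above can be sketched roughly as follows. This is only one reading of the commit message: it assumes the right/bottom extension is the larger of "padding up to the next multiple of 64" and "16 pixels". The names `MIN_BORDER`, `ALIGN_TO`, `far_side_extension` and `near_side_extension` are hypothetical and do not come from the libvpx source.

```c
#include <assert.h>

/* Hypothetical constants for illustration; not taken from the libvpx source. */
#define MIN_BORDER 16 /* minimum extension on any side, in pixels */
#define ALIGN_TO 64   /* right/bottom edges are rounded up to this multiple */

/* Extension on the right or bottom side for one image dimension:
 * the padding that rounds `size` up to a multiple of 64, but never
 * less than 16 pixels (assumed interpretation of the message). */
int far_side_extension(int size) {
  int aligned, pad;
  assert(size > 0);
  aligned = (size + ALIGN_TO - 1) & ~(ALIGN_TO - 1);
  pad = aligned - size;
  return pad > MIN_BORDER ? pad : MIN_BORDER;
}

/* Top and left sides always use the fixed minimum. */
int near_side_extension(void) { return MIN_BORDER; }
```

Under this reading, a 720-pixel dimension would be padded by 48 pixels (up to 768), while a dimension that is already a multiple of 64 would still receive the 16-pixel minimum.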
2013-09-08Merge "New mode_info_context storage"Jim Bankoski
2013-09-06Fix overflow issue in 16x16 quantization SSSE3Jingning Han
The 16x16 transform unit test suggested that the peak coefficient value can reach 32639. This could cause an overflow in the SSSE3 implementation of 16x16 block quantization. This commit fixes the issue by replacing addition with saturated addition. Change-Id: I6d5bb7c5faad4a927be53292324bd2728690717e
2013-09-06Enable kf restrictions at speed 4Paul Wilkins
Change-Id: I453409d3be3f5fe118b15affde45cb52184aef20
2013-09-06Support a constant quality mode in VP9Deb Mukherjee
Adds a new end-usage option for constant quality encoding in vpx. This first version, implemented for VP9, encodes all regular inter frames using the quality specified in the --cq-level= option, while encoding all key frames and golden/altref frames at a quality better than that. The current performance on derfraw300 is +0.910% up from bitrate control, but achieved without multiple recode loops per frame. The decision for qp for each altref/golden/key frame will be improved in subsequent patches based on better use of stats from the first pass. Further, the qp for regular inter frames may also be varied around the provided cq-level. Change-Id: I6c4a2a68563679d60e0616ebcb11698578615fb3
2013-09-06New mode_info_context storageScott LaVarnway
mode_info_context was stored as a grid of MODE_INFO structs. The grid now consists of a pointer to a MODE_INFO struct and an "in the image" flag. The MODE_INFO structs are now stored as a stream, which eliminates unnecessary copies and is a little more cache friendly. For the test clips used, the decoder performance improved by ~4.3% (1080p) and ~9.7% (720p). Patch Set 2: Re-encoded clips with latest. Now ~1.7% (1080p) and 5.9% (720p). Change-Id: I846f29e88610fce2523ca697a9a9ef2a182e9256
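The storage layout described above might look roughly like the sketch below. All type and function names (`ModeInfo`, `GridCell`, `ModeInfoStore`, `store_assign`, `store_share`) are hypothetical stand-ins for illustration; the real MODE_INFO struct and the decoder's bookkeeping are considerably larger and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in; the real MODE_INFO has many more fields. */
typedef struct { int mode; } ModeInfo;

/* One grid cell: a pointer into the stream plus an "in the image" flag. */
typedef struct {
  ModeInfo *mi; /* points into the stream, or NULL if not yet assigned */
  int in_image; /* nonzero if this cell lies inside the frame */
} GridCell;

typedef struct {
  GridCell *grid;   /* rows * cols cells */
  ModeInfo *stream; /* densely packed entries, filled in coding order */
  size_t used;      /* number of stream entries handed out so far */
  size_t capacity;  /* total stream entries available */
  int cols;         /* grid width in cells */
} ModeInfoStore;

/* Take the next stream entry and point grid cell (row, col) at it. */
ModeInfo *store_assign(ModeInfoStore *s, int row, int col) {
  GridCell *cell = &s->grid[row * s->cols + col];
  assert(s->used < s->capacity);
  cell->mi = &s->stream[s->used++];
  return cell->mi;
}

/* Let another cell refer to an existing entry: a pointer copy,
 * not a copy of the whole struct. */
void store_share(ModeInfoStore *s, int row, int col, ModeInfo *mi) {
  s->grid[row * s->cols + col].mi = mi;
}
```

In this sketch a block covering several grid cells is written once into the stream and every covered cell shares the same pointer, which is where the saved copies would come from.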
2013-09-06Merge "fix loop filter setup_mask could reach out of bounds issue"Jim Bankoski
2013-09-05Merge "Speed up idct8x8 by rearrange instructions. Speed improve from 264% ~ 270% to 280% ~ 300% base on assembly-perf."hkuang
2013-09-05fix loop filter setup_mask could reach out of bounds issueJim Bankoski
Change-Id: Ic8446c4f26b6782a6dc482c19ea73c77646df418
2013-09-05Merge "Use saturated addition in SSSE3 of 32x32 quant"Jingning Han
2013-09-05Merge "resolve clang warnings : uninitialized vars in vp9_entropy.h"Jim Bankoski
2013-09-05Use saturated addition in SSSE3 of 32x32 quantJingning Han
The 32x32 forward transform can potentially reach a peak coefficient value close to 32700, while the rounding factor can go up to 610. This could cause overflow in the SSSE3 implementation of the 32x32 quantization process. This commit resolves the issue by replacing the addition operations with saturated addition operations in 32x32 block quantization. Change-Id: Id6b98996458e16c5b6241338ca113c332bef6e70
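The difference between wrapping and saturated 16-bit addition can be illustrated with a scalar sketch. This is not the SSSE3 code from the commit; SIMD code would typically use a saturating instruction such as PADDSW (the `_mm_adds_epi16` intrinsic) instead. The function names `add_wrap_i16` and `add_sat_i16` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Wrapping 16-bit addition: the sum is reduced modulo 2^16.
 * The final conversion to int16_t is implementation-defined in C,
 * but wraps on mainstream two's-complement compilers. */
int16_t add_wrap_i16(int16_t a, int16_t b) {
  return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}

/* Saturated 16-bit addition: the sum is computed in 32 bits and
 * clamped to the int16_t range instead of wrapping. */
int16_t add_sat_i16(int16_t a, int16_t b) {
  int32_t sum = (int32_t)a + (int32_t)b;
  if (sum > INT16_MAX) return INT16_MAX;
  if (sum < INT16_MIN) return INT16_MIN;
  return (int16_t)sum;
}
```

With the figures from the message, 32700 + 610 = 33310 exceeds the 16-bit maximum of 32767, so a wrapping add would produce a negative value while the saturated add clamps to 32767.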
2013-09-05Merge "faster accounting of inc_mv"Jim Bankoski
2013-09-05Merge "make bsize requirement for SEG_LVL_SKIP explicit"Yaowu Xu
2013-09-04resolve clang warnings : uninitialized vars in vp9_entropy.hJim Bankoski
This helps clear out some of the warnings. Change-Id: Ie7ccaca8fd92542386a7f1b257398e1bdf2f55dc
2013-09-04Merge "wrap non420 loop filter code in macro"Jim Bankoski
2013-09-04Merge "Attempt to fix speed 4"Paul Wilkins
2013-09-04make bsize requirement for SEG_LVL_SKIP explicitYaowu Xu
The segment feature SEG_LVL_SKIP requires the prediction unit size to be at least BLOCK_8X8. This commit makes the requirement explicit, to prevent future encoder implementations from making wrong choices. Change-Id: I0127f0bd4c66e130b81f0cb0a8d3dbfe3b2da5c2
2013-09-04Speed up idct8x8 by rearrange instructions.hkuang
Speed improves from 264% ~ 270% to 280% ~ 300% based on assembly-perf. Change-Id: I3e2cc818ec14b432204ff43732f39b6438db685d
2013-09-04Merge "Fixing problem with invalid delta_q reading."Yaowu Xu
2013-09-04Merge "Add neon optimize vp9_short_iht4x4_add."hkuang
2013-09-04Add neon optimize vp9_short_iht4x4_add.hkuang
Change-Id: I42c497b68ae1ee645b59c9968ad805db0a43e37e
2013-09-04Fixing problem with invalid delta_q reading.Dmitry Kovalev
This is a bitstream change, but no currently produced videos should be affected. https://code.google.com/p/webm/issues/detail?id=610 Change-Id: Ic85a6477df6c201cdf7f70f6bd84607b71f4593c
2013-09-04Merge "Replacing init_dequantizer() with setup_plane_dequants()."Yaowu Xu
2013-09-04Merge "speed up inc_mv_component"Jim Bankoski
2013-09-04Merge "make vp9 postproc a config option"Jim Bankoski
2013-09-04Merge "Use correct bit cost while static-thresh is on"Yunqing Wang
2013-09-04wrap non420 loop filter code in macroJim Bankoski
Change-Id: I62bca0e7a4bffc1a78b750dbb9df9d2378e92423
2013-09-04make vp9 postproc a config optionJim Bankoski
VP9 postproc is disabled for now as it has not been shown to help and may be merged with vp8. Change-Id: I25620d6cd34c6e10331b18c7b5ef7482e39c6057
2013-09-04faster accounting of inc_mvJim Bankoski
Moves counting of mv branches to where we have a new mv, instead of after the whole frame is summed. Change-Id: I945d9f6d9199ba2443fe816c92d5849340d17bbd
2013-09-04Replacing init_dequantizer() with setup_plane_dequants().Dmitry Kovalev
Change-Id: Ib67e996b4a6dcb6f481889f5a0d84811a9e3c5d1
2013-09-04speed up inc_mv_componentJim Bankoski
Convert mv_class if statements to a lookup. Reorder to avoid ifs... Change-Id: I76966a21bf517bb1f9a7957c08c476c7bb3e9a63
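The general idea of replacing a chain of comparisons with a table lookup can be sketched as below. The class boundaries (16, 32, 64, ...), the class count, and the names `class_by_ifs`, `init_class_table` and `class_by_lookup` are assumptions chosen for illustration; they are not claimed to match the actual inc_mv_component code.

```c
#include <assert.h>

#define NUM_CLASSES 11  /* assumed number of classes, 0..10 */
#define TABLE_SIZE 1024 /* covers z >> 3 for z < 8192 */

/* Branchy version: compare against thresholds 16, 32, 64, ...
 * until z fits; the last class catches everything larger. */
int class_by_ifs(int z) {
  int c = 0;
  int limit = 16;
  while (c < NUM_CLASSES - 1 && z >= limit) {
    ++c;
    limit <<= 1;
  }
  return c;
}

static unsigned char class_table[TABLE_SIZE];

/* Precompute the class for every z >> 3. This is valid because all
 * thresholds are multiples of 8, so the low 3 bits never matter. */
void init_class_table(void) {
  int i;
  for (i = 0; i < TABLE_SIZE; ++i)
    class_table[i] = (unsigned char)class_by_ifs(i << 3);
}

/* Lookup version: one range check, then a single table read. */
int class_by_lookup(int z) {
  return z >= (TABLE_SIZE << 3) ? NUM_CLASSES - 1 : class_table[z >> 3];
}
```

`init_class_table()` has to run once before `class_by_lookup()` is used; after that the lookup returns the same class as the branchy version for any non-negative `z`.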
2013-09-03Merge "Fix intermediate height in convolve_c"James Zern
2013-09-03Attempt to fix speed 4Paul Wilkins
Speed 4 uses a fixed partition size. Use the fixed size unless it does not fit inside the image, in which case use the largest size that does. Change-Id: I250f7a80506750dd82ab355721624a1344247223
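The fallback rule described above could be sketched as follows, assuming square block sizes that halve from 64 down to a floor of 8. The function name `pick_block_size` and its parameters are hypothetical and not taken from the encoder source.

```c
#include <assert.h>

/* Return the largest square block size not exceeding `fixed_size`
 * that lies entirely inside a width x height image when its top-left
 * corner is at (x, y). Sizes are halved until the block fits;
 * 8 is used as an assumed lower bound. */
int pick_block_size(int fixed_size, int x, int y, int width, int height) {
  int size = fixed_size;
  assert(x >= 0 && y >= 0 && x < width && y < height);
  while (size > 8 && (x + size > width || y + size > height))
    size >>= 1;
  return size;
}
```

For a 1920x1080 frame with a fixed size of 64, blocks in the interior keep the fixed size, while a block starting at row 1024 would drop to 32 because 1024 + 64 extends past the bottom edge.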
2013-09-03Merge "Fix 32x32 forward transform SSE2 version"Jingning Han
2013-09-03Merge "Improved mb_lpf_horizontal_edge_w_sse2_8"Scott LaVarnway
2013-08-31Fix 32x32 forward transform SSE2 versionJingning Han
This commit fixed the potential overflow issue in the SSE2 implementation of 32x32 forward DCT. It resolved the corrupted coded frames in the border of scenes. Change-Id: If87eef2d46209269f74ef27e7295b6707fbf56f9
2013-08-30Use correct bit cost while static-thresh is onYunqing Wang
While static-thresh is on, we only need to transmit skip flag if skip = 1. The cost of skip bit is added to the total rate cost. Change-Id: I64e73e482bc297eba22907026298a15fa8cc3920
2013-08-30Merge "Added per pixel inter rd hit count stats"Paul Wilkins
2013-08-30Fix intermediate height in convolve_cTero Rintaluoma
- Intermediate height was not correct, e.g. when block size is 4 and y_step_q4 is 6. In this case intermediate height was (4*6) >> 4 = 1, but vertical interpolation needs two source pixels plus 7 extra pixels for taps. - Also, if the current output block is 16x16 and we are using 4x upscaling, we need only 12 rows after horizontal filtering instead of 16. Patch Set 2: Intermediate_height updated after CL 66723 "Fix bug in convolution functions (filter selection)" Change-Id: I5a1a1bc2ac9d5edb3a6e0818de618bf318fdd589
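The arithmetic in the first bullet can be checked with a small sketch. It assumes the first output row starts at q4 position 0, so that output row i reads from source row (i * y_step_q4) >> 4; the extra rows needed for the filter taps are left out. The function names are hypothetical, and this is not the formula used in the actual fix.

```c
#include <assert.h>

/* Estimate quoted in the message: (h * y_step_q4) >> 4. */
int rows_old_estimate(int h, int y_step_q4) {
  return (h * y_step_q4) >> 4;
}

/* Number of distinct source rows addressed by h output rows, assuming
 * the first output row sits at q4 position 0 (an assumption). The last
 * output row reads source row ((h - 1) * y_step_q4) >> 4, and rows are
 * counted from 0, hence the + 1. */
int rows_addressed(int h, int y_step_q4) {
  assert(h > 0 && y_step_q4 > 0);
  return (((h - 1) * y_step_q4) >> 4) + 1;
}
```

For h = 4 and y_step_q4 = 6, the output rows sit at q4 positions 0, 6, 12 and 18, which fall in source rows 0, 0, 0 and 1. Two source rows are therefore addressed, whereas the old estimate yields only 1.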
2013-08-29Merge "rework filter_block_plane"Jim Bankoski
2013-08-29rework filter_block_planeJim Bankoski
Change-Id: I55c3b60c4c0f4910d3dfb70e3edaae00cfa8dc4d
2013-08-29Merge "Fix overflow issue in SSSE3 32x32 quantization"Jingning Han
2013-08-30Added per pixel inter rd hit count statsPaul Wilkins
Added some code to output normalized rd hit count stats. In effect this approximates the average number of rd operations/tests per pixel for the sequence. The results are not quite accurate, and I have not bothered to account for partial SB64s at frame edges or for key frames. However, they do give some idea of the number of modes / prediction methods being tested for each pixel across the different partition sizes. This indicates how much scope there is for further gains, either by reducing the number of partitions examined or the modes per partition through heuristics. Patch 3 moved the place where the count is incremented so that partial rd tests that are aborted with an INT_MAX return are also counted. Example numbers for the first 50 frames of Akiyo: Speed 0 ~84.4 rd operations / pixel; Speed 1 ~28.8; Speed 2 ~11.9. Change-Id: Ib956e787e12f7fa8b12d3a1a2f6cda19a65a6cb8
2013-08-29Merge "Adds a speed feature for fast 1-loop forw updates"Deb Mukherjee
2013-08-29Merge changes Ib1e853f9,Ifd75c809,If3e83404James Zern
* changes: consistently name VP9_COMMON variables #3; consistently name VP9_COMMON variables #2; consistently name VP9_COMMON variables #1
2013-08-29Merge "Fixed potential overflows"Yaowu Xu
2013-08-29consistently name VP9_COMMON variables #3James Zern
stragglers Change-Id: Ib1e853f9a331b7b66639dc34d79568d84d1930f1