path: root/vp9/encoder
Age Commit message Author
2015-04-09Merge "Remove get_nonrd_var_based_fixed_partition function"Jingning Han
2015-04-09Merge "Compute prediction filter type cost only when needed"Jingning Han
2015-04-09Merge "SSSE3 assembly implementation of 8x8 Hadamard transform"Jingning Han
2015-04-09Remove get_nonrd_var_based_fixed_partition functionJingning Han
This function has been replaced by other approaches and is not in use now. Change-Id: I387f45b5607d202539e482468ccc70e6c0f9341f
2015-04-08Merge "Improve accuracy of rate control in CQ mode"Debargha Mukherjee
2015-04-07Merge "vp9_full_search_sadx[38]: align sad arrays"James Zern
2015-04-07Merge "Optimize the checking for transform skipping"Yaowu Xu
2015-04-07Merge "move ref_frame_cost computations into a function"Yaowu Xu
2015-04-07Improve accuracy of rate control in CQ modeDebargha Mukherjee
Modifies a special handling that improves rate control accuracy in the constrained quality mode, when the undershoot and overshoot limits are set tighter. Change-Id: If62103f0ef3ed1cac92807400678c93da50cf046
2015-04-07vp9_full_search_sadx[38]: align sad arraysJames Zern
the sse4 code expects 16-byte aligned arrays; vp8 already had a similar change applied: b2aa401 Align SAD output array to be 16-byte aligned Change-Id: I5e902035e5a87e23309e151113f3c0d4a8372226
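The alignment requirement above can be illustrated with a minimal sketch. It uses C11 `alignas` rather than any libvpx macro, and the names `is_aligned16`, `fill_sads`, and `best_of_eight` are hypothetical stand-ins, not functions from the commit.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p sits on a 16-byte boundary. */
static int is_aligned16(const void *p) {
  return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical stand-in for a SIMD routine that stores 8 SAD results.
 * An aligned 128-bit store would fault on a misaligned array, so the
 * precondition is checked here. */
static void fill_sads(unsigned int *sad_array, unsigned int base) {
  assert(is_aligned16(sad_array));
  for (int i = 0; i < 8; ++i) sad_array[i] = base + (unsigned int)i;
}

/* The caller declares the output array with explicit 16-byte alignment. */
static unsigned int best_of_eight(unsigned int base) {
  alignas(16) unsigned int sad_array[8];
  fill_sads(sad_array, base);
  unsigned int best = sad_array[0];
  for (int i = 1; i < 8; ++i) {
    if (sad_array[i] < best) best = sad_array[i];
  }
  return best;
}
```

Without the `alignas(16)` qualifier a stack array of `unsigned int` is only guaranteed 4-byte alignment on common ABIs, which is why the declaration, not the callee, has to change.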
2015-04-07Merge "Enable Hadamard transform based cost estimate for all block sizes"Jingning Han
2015-04-07Merge "Account for eob cost in the RTC mode decision process"Jingning Han
2015-04-07Compute prediction filter type cost only when neededJingning Han
Skip redundant prediction filter type cost in filter search loop, if the rate value will be reset in Hadamard transform based rate distortion estimate. Change-Id: Ie5221f4bc8da9461c449df367251aeeac52c6e5d
2015-04-06Optimize the checking for transform skippingYaowu Xu
If U is not skippable, then do not perform the check on V. Change-Id: Iba5e8362bd42390197f373c44388a426a4404549
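The early exit described above might look like the following sketch. The threshold comparison and the names `plane_is_skippable`, `uv_is_skippable`, and `checks_run` are assumptions for illustration; the actual skip test in the encoder is not shown in this log.

```c
#include <assert.h>

/* Counts how many per-plane checks were executed (illustration only). */
static int checks_run = 0;

/* Hypothetical per-plane test: a plane is skippable when its error
 * measure falls below the threshold. */
static int plane_is_skippable(unsigned int sse, unsigned int thr) {
  ++checks_run;
  return sse < thr;
}

/* Check U first and evaluate V only if U passed. */
static int uv_is_skippable(unsigned int sse_u, unsigned int sse_v,
                           unsigned int thr) {
  assert(thr > 0); /* a zero threshold would reject every plane */
  if (!plane_is_skippable(sse_u, thr)) return 0; /* V is never examined */
  return plane_is_skippable(sse_v, thr);
}
```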
2015-04-04SSSE3 assembly implementation of 8x8 Hadamard transformJingning Han
It uses about 10% fewer CPU cycles than the SSE2 intrinsic implementation. Change-Id: I91017c0c068679a214b98cdd4cff3a6facfb7499
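For readers unfamiliar with the transform being optimized, here is a plain scalar sketch of an 8x8 Hadamard transform built from add/subtract butterflies. It is not the SSSE3 or SSE2 code, and its coefficient ordering and scaling are not claimed to match libvpx; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* In-place 8-point Hadamard butterfly over elements `stride` apart. */
static void hadamard8_1d(int32_t *x, int stride) {
  for (int len = 1; len < 8; len <<= 1) {
    for (int i = 0; i < 8; i += len << 1) {
      for (int j = i; j < i + len; ++j) {
        const int32_t a = x[j * stride];
        const int32_t b = x[(j + len) * stride];
        x[j * stride] = a + b;
        x[(j + len) * stride] = a - b;
      }
    }
  }
}

/* 2-D transform: copy the residual block, transform rows, then columns. */
static void hadamard8x8(const int16_t *src, int src_stride, int32_t *coeff) {
  assert(src_stride >= 8);
  for (int r = 0; r < 8; ++r)
    for (int c = 0; c < 8; ++c) coeff[r * 8 + c] = src[r * src_stride + c];
  for (int r = 0; r < 8; ++r) hadamard8_1d(coeff + r * 8, 1);
  for (int c = 0; c < 8; ++c) hadamard8_1d(coeff + c, 8);
}

/* Sum of absolute transformed differences over the 64 coefficients. */
static int64_t satd64(const int32_t *coeff) {
  int64_t sum = 0;
  for (int i = 0; i < 64; ++i) sum += coeff[i] < 0 ? -coeff[i] : coeff[i];
  return sum;
}
```

Because the transform needs only additions and subtractions, it maps well onto SIMD registers, which is what makes assembly tuning of this kind worthwhile.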
2015-04-04Enable Hadamard transform based cost estimate for all block sizesJingning Han
This commit turns on the Hadamard transform based rate distortion estimate for all block sizes in RTC coding mode. It conditionally skips the rate distortion estimation if the all-zero block flag is set. No significant encoding speed change is observed. The compression performance of speed -6 is improved by 1.7% over using it only for block sizes of 32x32 and below. Change-Id: I768145e6f05c737b05b5b5f1ee674e929532cafb
2015-04-03Merge "Fix the scaling factor in UV skipping test"Yunqing Wang
2015-04-03Fix the scaling factor in UV skipping testYunqing Wang
The threshold scaling factor was calculated incorrectly using partition size "bsize". Thanks to Yaowu for pointing it out. It was fixed and no speed change was seen. Change-Id: If7a5564456f0f68d6957df3bd2d1876bbb8dfd27
2015-04-03Merge "Tune SSSE3 assembly implementation to improve quantization speed"Jingning Han
2015-04-03Account for eob cost in the RTC mode decision processJingning Han
This commit accounts for the transform block end of coefficient flag cost in the RTC mode decision process. This allows a more precise rate estimate. It also enables the model for block sizes up to 32x32. The test sequences show about 3% - 5% speed penalty for speed -6. The average compression performance improvement for speed -6 is 1.58% in PSNR. The compression gains for hard clips like jimredvga, mmmoving, and tacomascmv at low bit-rate range are 1.8%, 2.1%, and 3.2%, respectively. Change-Id: Ic2ae211888e25a93979eac56b274c6e5ebcc21fb
2015-04-02Merge "Set vbp thresholds for aq3 boosted blocks"Yunqing Wang
2015-04-02move ref_frame_cost computations into a functionYaowu Xu
Change-Id: Iebf2ad2b1db7e2874788fda8d55e67f4cb1149f1
2015-04-02Merge "Code cleanup: put (8x8/4x4)fill_variance into separate function."Marco
2015-04-02Set vbp thresholds for aq3 boosted blocksYunqing Wang
The vbp thresholds are set separately for boosted/non-boosted superblocks according to their segment_id. This way we don't have to force the boosted blocks to split to 32x32. The speed 6 RTC set borg test showed some quality gains. Overall PSNR: +0.199%; Avg PSNR: +0.245%; SSIM: +0.802%. No speed change was observed. Change-Id: I37c6643a3e2da59c4b7dc10ebe05abc8abf4026a
2015-04-02Code cleanup: put (8x8/4x4)fill_variance into separate function.Marco
Code cleanup, no change in behavior. Change-Id: I043b889f8f0b3afb49de0da00873bc3499ebda24
2015-04-02Small fix to segment check in pickmode.Marco
Change-Id: Id5fd82a504def2523292466fbaad5dade9424c72
2015-04-01Merge "Reduce required xmm number by one in block_error_fp"Jingning Han
2015-04-01Tune SSSE3 assembly implementation to improve quantization speedJingning Han
Change-Id: If0ca8b25b4800d4336e6cbc97194cd9b01c5b5a3
2015-04-01Merge "Simplify bsize calculation"Yaowu Xu
2015-04-01Merge "Optimize quantization simd implementation"Jingning Han
2015-04-01Merge "Simplify effective src_diff address computation"Jingning Han
2015-04-01Merge "Refactor block_yrd function for RTC coding mode"Jingning Han
2015-04-01Simplify bsize calculationYaowu Xu
Change-Id: Ibc514684def9914c66f04cb7931f773e2b79c168
2015-04-01Simplify effective src_diff address computationJingning Han
Remove redundant offset calculation for effective src_diff address. Change-Id: I4aab241a36abcef7fd8adf74aed5e12b8b88e0ef
2015-04-01Reduce required xmm number by one in block_error_fpJingning Han
Use 6 xmms instead of 8. Change-Id: If976ad85d09191d2fb0565399d690f2869dbbcc7
2015-04-01Refactor block_yrd function for RTC coding modeJingning Han
This commit separates Hadamard transform/quantization operations from rate and distortion computation in block_yrd. This allows one to skip SATD computation when all transform blocks are quantized to zero. It also uses a new block error function that skips repeated computation of sum of squared residuals. It reduces the CPU cycles spent on block error calculation in block_yrd by 40%. Change-Id: I726acb2454b44af1c3bd95385abecac209959b10
2015-04-01Optimize quantization simd implementationJingning Han
This commit allows the quantizer to compare the AC coefficients to the quantization step size to determine if further multiplication operations are needed. It makes the quantization process 20% faster without coding statistics change. Change-Id: I735aaf6a9c0874c82175bb565b20e131464db64a
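The idea of comparing a coefficient against the step size before multiplying can be sketched in scalar C as follows. This is a simplified model, not the SIMD code: it assumes a reciprocal `quant = 65536 / step` with a 16-bit shift, applies one step size to every coefficient rather than treating AC separately, and reports the end position in array order rather than scan order. The function name `quantize_block` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Quantizes n coefficients; returns one past the last nonzero output. */
static int quantize_block(const int16_t *coeff, int n, int step, int round,
                          int16_t *qcoeff) {
  assert(step > 0 && round >= 0);
  const int32_t quant = (1 << 16) / step; /* assumed reciprocal form */
  int eob = 0;
  for (int i = 0; i < n; ++i) {
    const int32_t x = abs(coeff[i]) + round;
    if (x < step) {
      /* x * quant < 65536 here, so the shifted product is always 0:
       * the multiplication can be skipped without changing the result. */
      qcoeff[i] = 0;
      continue;
    }
    const int32_t q = (int32_t)(((int64_t)x * quant) >> 16);
    qcoeff[i] = (int16_t)(coeff[i] < 0 ? -q : q);
    if (q != 0) eob = i + 1;
  }
  return eob;
}
```

Since the early exit only fires when the full computation would also produce zero, the output is bit-identical, which is consistent with the commit's claim of no change in coding statistics.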
2015-04-01Enhance the transform skipping decision-making in non-rd modeYunqing Wang
For large partition blocks (block_size > 32x32), the variance calculation is modified so that every 8x8 block's variance is stored during the calculation and reused in the subsequent transform skipping test. The variance of every tx block is also calculated. The skipping test checks all tx blocks in the partition and sets the skip flag only if all of them are skippable. If the skip flag of the Y plane is 1, a quick evaluation is done on the UV planes. If the current partition block is skippable in all YUV planes, the mode search checks fewer inter modes and does not check intra modes. The rtc set borg test (at speed 6) showed: Overall PSNR: -0.527%; Avg PSNR: -0.510%; SSIM: -0.573%. The average single-thread speedup on the rtc set was 3.5%. Larger speedups were seen for 720p clips: gipsrecmotion 13%, gipsrestat 12%, vidyo 5-9%, dark 15%, niklas 6%. Change-Id: I8d8ebec0cb305f1de016516400bf007c3042666e
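The decision flow in this commit message can be sketched as below: the Y skip flag is set only when every tx block passes, and the UV planes are evaluated only after that. The variance-versus-threshold comparison and all names (`all_tx_blocks_skippable`, `partition_is_skippable`, the threshold parameters) are assumptions for illustration, not the encoder's actual test.

```c
#include <assert.h>

/* Returns 1 only if every stored tx-block variance is below the threshold. */
static int all_tx_blocks_skippable(const unsigned int *tx_var, int num_tx,
                                   unsigned int thr) {
  assert(num_tx > 0);
  for (int i = 0; i < num_tx; ++i) {
    if (tx_var[i] >= thr) return 0; /* one failing block clears the flag */
  }
  return 1;
}

/* Y is tested over all tx blocks; UV gets a quick check only if Y passed. */
static int partition_is_skippable(const unsigned int *y_tx_var, int num_tx,
                                  unsigned int u_var, unsigned int v_var,
                                  unsigned int thr_y, unsigned int thr_uv) {
  if (!all_tx_blocks_skippable(y_tx_var, num_tx, thr_y)) return 0;
  return u_var < thr_uv && v_var < thr_uv;
}
```

A caller would then use the result to narrow the mode search, for example by checking fewer inter modes and no intra modes when the partition is skippable in all three planes.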
2015-03-31Merge "Rename vbp thresholds"Yunqing Wang
2015-03-31Rename vbp thresholdsYunqing Wang
Code refactoring Change-Id: I410fcce1bc6d95c62c474445f4c97ea8469f1e79
2015-03-31Merge "Tuning SATD rate calculation for speed"Jingning Han
2015-03-31Merge "Use aligned copy in 8x8 Hadamard transform SSE2"Jingning Han
2015-03-31Merge "Allow block skip coding option in RTC mode"Jingning Han
2015-03-31Merge "Fix 8x8 Hadamard SSE2 implementation"Jingning Han
2015-03-31Merge "VP9E_GET_ACTIVE_MAP API function."Alex Converse
2015-03-31Tuning SATD rate calculation for speedJingning Han
This commit allows the encoder to check the eob per transform block to decide how to compute the SATD rate cost. If the entire block is quantized to zero, there is no need to add anything; if only the DC coefficient is non-zero, add its absolute value; otherwise, sum over the block. This reduces the CPU cycles spent on vp9_satd_sse2 to one third. Change-Id: I0d56044b793b286efc0875fafc0b8bf2d2047e32
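The three cases described above translate directly into a small scalar sketch. The function name `satd_rate` and its signature are hypothetical; the real routine operates on SIMD data and is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Chooses the cheapest way to compute the SATD cost from the block's eob
 * (the position one past the last nonzero quantized coefficient). */
static int satd_rate(const int16_t *coeff, int n, int eob) {
  assert(n > 0 && eob >= 0 && eob <= n);
  if (eob == 0) return 0;              /* whole block quantized to zero */
  if (eob == 1) return abs(coeff[0]);  /* only the DC term is nonzero */
  int sum = 0;                         /* general case: sum over the block */
  for (int i = 0; i < n; ++i) sum += abs(coeff[i]);
  return sum;
}
```

The first two branches avoid touching the coefficient array beyond its first entry, which is where the reported reduction in cycles would come from when many blocks quantize to zero or to DC only.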
2015-03-31Merge "Move vp9_coef_con_tree to common/"hui su
2015-03-31Use aligned copy in 8x8 Hadamard transform SSE2Jingning Han
This reduces the 8x8 Hadamard transform cycles by 20%. Change-Id: If34c5e02f3afa42244c6efabe121f7cf5d2df41b
2015-03-31Merge "Enable 16x16 Hadamard transform in SATD based mode decision"Jingning Han
2015-03-31Merge "Use SATD based mode decision for block sizes below 16x16"Jingning Han