path: root/vp9/encoder/vp9_rdopt.c
Age | Commit message | Author
2013-08-30 | Added per pixel inter rd hit count stats | Paul Wilkins
Added some code to output normalized rd hit count stats. In effect this approximates the average number of rd operations/tests per pixel for the sequence.
The results are not quite accurate: I have not bothered to account for partial SB64s at frame edges or for key frames. However, they do give some idea of the number of modes / prediction methods being tested for each pixel across the different partition sizes. This indicates how much scope there is for further gains, either by reducing the number of partitions examined or the modes per partition through heuristics.
Patch 3 moved the place where the count is incremented so that partial rd tests that are aborted with an INT_MAX return are also counted.
Example numbers for the first 50 frames of Akiyo:
Speed 0 ~84.4 rd operations / pixel
Speed 1 ~28.8
Speed 2 ~11.9
Change-Id: Ib956e787e12f7fa8b12d3a1a2f6cda19a65a6cb8
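One way to read "rd operations per pixel" is sketched below. This is an illustration under stated assumptions, not the code from this commit: the helper name `rd_ops_per_pixel`, its parameters, and the idea of weighting each hit by the pixel area of its block are all hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from vp9_rdopt.c.
 * hits[i] is the number of rd tests run on blocks covering block_pixels[i]
 * pixels. Each test is weighted by its block area, and the sum is divided
 * by the total number of pixels coded (width * height * frames). */
double rd_ops_per_pixel(const uint64_t *hits, const int *block_pixels,
                        int num_sizes, int width, int height, int frames) {
  uint64_t pixel_tests = 0;
  const uint64_t total_pixels = (uint64_t)width * height * frames;
  int i;
  for (i = 0; i < num_sizes; ++i)
    pixel_tests += hits[i] * (uint64_t)block_pixels[i];
  /* Guard against an empty sequence. */
  return total_pixels ? (double)pixel_tests / (double)total_pixels : 0.0;
}
```

Like the stats described in the commit message, this sketch ignores partial blocks at frame edges.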
2013-08-29 | Merge "Fixed potential overflows" | Yaowu Xu
2013-08-29 | Merge "Renaming txfm_size to tx_size." | Dmitry Kovalev
2013-08-29 | Fixed potential overflows | Yaowu Xu
The two arrays are typically initialized to INT64_MAX. If they are not filled with valid values before the addition, the values can overflow and lead to wrong results. Change-Id: I515de22cf3e8f55af4b74bdb2c8eb821a02d3059
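The failure mode described above can be avoided by treating INT64_MAX as a sentinel before adding. The following is a minimal sketch; the function name `rd_add_guarded` is hypothetical, it assumes rd costs are non-negative, and it is not necessarily how the commit itself fixes the problem.

```c
#include <stdint.h>

/* Hypothetical helper: adds two rd costs where INT64_MAX means
 * "no valid value yet". Assumes costs are non-negative. */
int64_t rd_add_guarded(int64_t a, int64_t b) {
  /* An unfilled entry stays unfilled instead of wrapping around. */
  if (a == INT64_MAX || b == INT64_MAX) return INT64_MAX;
  /* Saturate if the true sum would not fit in 64 bits. */
  if (b > 0 && a > INT64_MAX - b) return INT64_MAX;
  return a + b;
}
```

Signed overflow is undefined behaviour in C, so the check has to happen before the addition rather than after it.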
2013-08-28 | General code cleanup. | Dmitry Kovalev
Switching from mi_{width, height}_log2 and b_{width, height}_log2 to num_8x8_blocks_{wide, high} and num_4x4_blocks_{wide, high}. Removing redundant code, adding const. Change-Id: Iaab2207590fd24d0b76999071778d1395dc5cd5d
2013-08-27 | Renaming txfm_size to tx_size. | Dmitry Kovalev
Change-Id: I752e374867d459960995b24d197301d65ad535e3
2013-08-27 | Merge "Fix buf alignment in sub8x8 comp inter-inter pred" | Jingning Han
2013-08-27 | Renaming BLOCK_SIZE_TYPE to BLOCK_SIZE in the encoder. | Dmitry Kovalev
Change-Id: I62bb07c377f947cb72fac68add7a6b199e42c6b9
2013-08-27 | Merge "Cleaning up model_rd_for_sb_y_tx." | Dmitry Kovalev
2013-08-27 | Merge "Renaming D27 to D207." | Dmitry Kovalev
2013-08-27 | Fix buf alignment in sub8x8 comp inter-inter pred | Jingning Han
This commit resolved a misalignment issue in compound inter-inter prediction of sub8x8 blocks. This patch follows the solution from dkovalev@. Change-Id: I3cc0cf7e55b84110e0c42ef4b2e6ca7ac3f8f932
2013-08-26 | Cleaning up model_rd_for_sb_y_tx. | Dmitry Kovalev
Removing references to plane_block_width and plane_block_height (we are going to delete the latter ones). Change-Id: I7982da4d373aebb54d2209dc8886f6192df4d287
2013-08-26 | Merge "Changes to adaptive inter rd thresholds." | Paul Wilkins
2013-08-26 | Merge "Limit Key frame Intra modes checks." | Paul Wilkins
2013-08-23 | cosmetics: strip 'VP9_' from defines in vp9 only code | James Zern
Change-Id: I481d9bb2fa3ec72b6a83d5f04d545ad8013f295c
2013-08-23 | Renaming D27 to D207. | Dmitry Kovalev
I've already renamed d27_predictor to d207_predictor but forgot about the corresponding constant. Change-Id: Id312aa80fc5b5a1ab8a709a33418a029552a6857
2013-08-23 | Cleanup in mvref_common.{h, c}. | Dmitry Kovalev
Making code more compact, adding consts, removing redundant arguments, adding do/while(0) for macros. Change-Id: Ic9ec0bc58cee0910a5450b7fb8cfbf35fa9d0d16
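The do/while(0) idiom mentioned above makes a multi-statement macro behave like a single statement. A minimal sketch follows; the macro `SWAP_INT` and the function `order_descending` are hypothetical and are not taken from mvref_common.

```c
/* Hypothetical macro: the do/while(0) wrapper turns the three statements
 * into one, so the macro is safe inside an unbraced if/else. */
#define SWAP_INT(a, b) \
  do {                 \
    int tmp_ = (a);    \
    (a) = (b);         \
    (b) = tmp_;        \
  } while (0)

/* Puts the larger value in *x. Returns 1 if a swap happened, 0 otherwise. */
int order_descending(int *x, int *y) {
  if (*x < *y)
    SWAP_INT(*x, *y); /* Without do/while(0) the else below would not compile. */
  else
    return 0;
  return 1;
}
```

A plain `{ ... }` block would leave a stray semicolon before `else`, which is the problem the wrapper avoids.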
2013-08-23 | Changes to adaptive inter rd thresholds. | Paul Wilkins
Values now carried over frame to frame. Change to algorithm for decreasing threshold after a hit and to max threshold (now based on speed). Removed some old commented out code relating to VP8 adaptive thresholds.
The impact of these changes, tested on Akiyo (50 frames) and measured in terms of unit rd hits, is as follows:
Speed 0 84.36 -> 84.67
Speed 1 29.48 -> 22.22
Speed 2 11.76 -> 8.21
Speed 3 12.32 -> 7.21
Encode speed impact is broadly in line with these.
Change-Id: I5b886efee3077a11553fa950d796fd6d00c8cb19
2013-08-23 | Limit Key frame Intra modes checks. | Paul Wilkins
Most of the focus so far has been on inter frames. At high speed settings the key frame is now taking a high % of the cycles. This patch puts in some masking to reduce the number of INTRA modes searched during key frame coding (as already happens for inter frames) at higher speed settings.
TODO: Develop this further with either adaptive rd thresholds when choosing which intra modes to consider or some other heuristic.
Impact: At high speed settings on some clips the key frame was starting to dominate. In a coding of the first 50 frames of AKIYO at speed 2, limiting the key frame intra modes to DC or TM_PRED resulted in ~30% overall speedup. For Bus the number was lower at ~4-5%.
Change-Id: I7bde68aee04995f9d9beb13a1902143112e341e2
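The masking idea can be sketched with a bit per intra mode. Everything here is an assumption made for illustration: the enum is a reduced subset, the names `intra_mode_mask` and `count_modes_searched` are hypothetical, and the speed cut-off of 2 is chosen only to mirror the experiment quoted above.

```c
/* Illustrative subset of intra modes; not the full VP9 list. */
enum { DC_PRED, V_PRED, H_PRED, D45_PRED, TM_PRED, INTRA_MODES };

#define INTRA_ALL ((1u << INTRA_MODES) - 1)
#define INTRA_DC_TM ((1u << DC_PRED) | (1u << TM_PRED))

/* Hypothetical: pick the set of intra modes to search for a key frame.
 * The cut-off at speed 2 is an assumption. */
unsigned intra_mode_mask(int speed) {
  return speed >= 2 ? INTRA_DC_TM : INTRA_ALL;
}

/* Counts how many modes the rd loop would actually test under a mask. */
int count_modes_searched(unsigned mask) {
  int mode, n = 0;
  for (mode = 0; mode < INTRA_MODES; ++mode) {
    if (!(mask & (1u << mode))) continue; /* masked out: skip the rd test */
    ++n;
  }
  return n;
}
```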
2013-08-22 | Adding vp9_is_scaled function. | Dmitry Kovalev
Change-Id: Ieb7077ca3586b9491912027eed450a4f6fd38d30
2013-08-21 | Merge "Enable zero coeff check in sub8x8 UV rd loop" | Jingning Han
2013-08-21 | Removing PLANE_TYPE argument from cost_coeffs function. | Dmitry Kovalev
We can determine plane_type from other function arguments. Change-Id: I85331877aedb357632ae916a37b5b15f22c0bb1f
2013-08-21 | Fix typos and minor stylistic cleanup | Adrian Grange
Change-Id: I32e43474e8651ef2eb181d24860a8f118cfea7bf
2013-08-20 | Merge "Passing plane_bsize to foreach_transformed_block_visitor." | Dmitry Kovalev
2013-08-20 | Enable zero coeff check in sub8x8 UV rd loop | Jingning Han
Check the minimum rate-distortion cost of regular quantization and all zero coeffs cases in the sub8x8 inter prediction rd loop for luma components. Use this as the cumulative rdcost sent to UV rd estimation. Change-Id: Ia4bc7700437d5e13d7cdad4cf9ae57ab036d3e97
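The comparison described above, regular quantization versus forcing all coefficients to zero, can be sketched as follows. The cost model (`lambda * rate + dist`) and all names are hypothetical simplifications and do not reproduce the encoder's actual cost macro.

```c
#include <stdint.h>

/* Hypothetical, simplified rd cost: rate weighted by lambda plus distortion. */
static int64_t rd_cost(int lambda, int rate, int64_t dist) {
  return (int64_t)lambda * rate + dist;
}

/* Returns the cheaper of coding the coefficients and zeroing them all.
 * When everything is zeroed, the distortion equals the prediction error
 * (sse) and only the skip signalling rate is paid. */
int64_t min_rd_coded_or_zeroed(int lambda, int rate_coded, int64_t dist_coded,
                               int rate_skip, int64_t sse) {
  const int64_t coded = rd_cost(lambda, rate_coded, dist_coded);
  const int64_t zeroed = rd_cost(lambda, rate_skip, sse);
  return coded < zeroed ? coded : zeroed;
}
```

Using the smaller value as the running total gives the later chroma stage a tighter figure to compare against the current best.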
2013-08-20 | Cleanup/enhancements of switchable filter search | Deb Mukherjee
Cleans up the switchable filter search logic. Also adds a speed feature - a variance threshold - to disable filter search if source variance is lower than this value.
Results: derfraw300
threshold = 16, psnr -0.238%, 4-5% speedup (tested on football)
threshold = 32, psnr -0.381%, 8-9% speedup (tested on football)
threshold = 64, psnr -0.611%, 12-13% speedup (tested on football)
threshold = 96, psnr -0.804%, 16-17% speedup (tested on football)
Based on these results, the threshold is chosen as 16 for speed 1, 32 for speed 2, 64 for speed 3 and 96 for speed 4.
Change-Id: Ib630d39192773b1983d3d349b97973768e170c04
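The speed-to-threshold mapping above fits naturally in a small table. In this sketch the thresholds for speeds 1 to 4 come from the commit message; the value 0 for speed 0 (search never skipped), the clamping of out-of-range speeds, and the names are assumptions.

```c
/* Thresholds for speeds 1..4 are from the commit message above.
 * Index 0 (speed 0) is an assumption: 0 means the search is never skipped. */
static const unsigned int filter_search_var_thresh[] = { 0, 16, 32, 64, 96 };

/* Hypothetical helper: returns 1 if the switchable filter search should be
 * skipped because the source variance is below the threshold for this speed. */
int skip_filter_search(int speed, unsigned int source_variance) {
  const int max_speed =
      (int)(sizeof(filter_search_var_thresh) / sizeof(filter_search_var_thresh[0])) - 1;
  if (speed < 0) speed = 0;
  if (speed > max_speed) speed = max_speed; /* clamping is an assumption */
  return source_variance < filter_search_var_thresh[speed];
}
```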
2013-08-19 | Enable early termination in uv rd loop | Jingning Han
This commit enables early termination in the rate-distortion optimization search loop for chroma components. When the cumulative rd cost is above the current best value, skip the remaining per-block transform/quantization/coeff_cost steps and continue to the next prediction mode.
For bus_cif at 2000 kbps, the average run-time goes down from
168546ms -> 164678ms (2% speed-up) at speed 0
36197ms -> 34465ms (4% speed-up) at speed 1
Change-Id: I9d3043864126e62bd0166250d66b3170d520b3c0
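The early-termination pattern described above can be sketched as a loop that stops accumulating once the running cost exceeds the best cost so far. The function name, the per-block cost array, and the `blocks_evaluated` counter are hypothetical; in the encoder the per-block cost would come from transform, quantization and coefficient costing.

```c
#include <stdint.h>

/* Hypothetical sketch: sums per-block rd costs for one prediction mode and
 * gives up as soon as the running total exceeds best_rd.
 * Returns INT64_MAX when aborted, otherwise the total cost. */
int64_t accumulate_rd_with_cutoff(const int64_t *block_rd, int num_blocks,
                                  int64_t best_rd, int *blocks_evaluated) {
  int64_t this_rd = 0;
  int i;
  *blocks_evaluated = 0;
  for (i = 0; i < num_blocks; ++i) {
    this_rd += block_rd[i]; /* stands in for the expensive per-block work */
    ++*blocks_evaluated;
    if (this_rd > best_rd) return INT64_MAX; /* cannot win: skip the rest */
  }
  return this_rd;
}
```

The saving comes from the blocks that are never evaluated, which is consistent with the modest speed-ups reported above.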
2013-08-19 | Passing plane_bsize to foreach_transformed_block_visitor. | Dmitry Kovalev
Updating all foreach_transformed_block_visitor functions to work with plane block size instead of general block. Removing a lot of duplicated code. Change-Id: I6a9069e27528c611f5a648e1da0c5a5fd17f1bb4
2013-08-19 | Merge "Fix potential use of uninitialized value" | Jingning Han
2013-08-19 | Merge "Fix the returned distortion value in rd_pick_intra" | Jingning Han
2013-08-19 | Using plane_bsize instead of bsize. | Dmitry Kovalev
This change set is intermediate. The next one will remove all repetitive plane_bsize calculations, because it will be passed as argument to foreach_transformed_block_visitor. Change-Id: Ifc12e0b330e017c6851a28746b3a5460b9bf7f0b
2013-08-19 | Fix potential use of uninitialized value | Jingning Han
Initialize the best mode and tx_size values in the rate-distortion optimization search loop. Change-Id: Ibfb5c0895691f172abcd4265c23aef4cb99fa8af
2013-08-16 | Fix the returned distortion value in rd_pick_intra | Jingning Han
Return the distortion value in vp9_rd_pick_intra_mode_sb as sum of dist_y and dist_uv. Remove the right shift operation on dist_uv, and make it consistent with that of vp9_rd_pick_inter_mode_sb. Change-Id: I9d564e242d9add38e32595d33b0e0dddb1d55e5b
2013-08-16 | Removing unused or redundant arguments from *_args structures. | Dmitry Kovalev
Redundant dst, pre[2] from build_inter_predictors_args, unused cm from encode_b_args. Change-Id: I2c476cd328c5c0cca4c78ba451ca6ba2a2c37e2d
2013-08-16 | Merge "Moving from ss_txfrm_size to tx_size." | Dmitry Kovalev
2013-08-16 | Fixed typos and formatting | Adrian Grange
Change-Id: I3814984a624bc64147c57efa74fbdda8eda47262
2013-08-15 | Moving from ss_txfrm_size to tx_size. | Dmitry Kovalev
Updating foreach_transformed_block_visitor and corresponding functions to accept tx_size instead of ss_txfrm_size. List of functions per file:
vp9_decodframe.c: decode_block, decode_block_intra
vp9_detokenize.c: decode_block
vp9_encodemb.c: optimize_block, vp9_xform_quant, vp9_encode_block_intra
vp9_rdopt.c: dist_block, rate_block, block_yrd_txfm
vp9_tokenize.c: set_entropy_context_b, tokenize_b, is_skippable
Change-Id: I351bf563eb36cf34db71c3f06b9bbc9a61b55b73
2013-08-15 | Merge "Refactor rd loop for chroma components" | Jingning Han
2013-08-15 | Merge "Converting code from using ss_txfrm_size to tx_size." | Dmitry Kovalev
2013-08-15 | Merge "Moving segmentation struct from MACROBLOCKD to VP9_COMMON." | Dmitry Kovalev
2013-08-15 | Refactor rd loop for chroma components | Jingning Han
This commit makes the rate-distortion optimization search of chroma components consistent across all block sizes. It removes redundant code. Change-Id: I7e76f54d045e8efdd41d84a164c71f55b484471b
2013-08-15 | Merge "Unify luma and chroma rd-cost estimation" | Jingning Han
2013-08-15 | Converting code from using ss_txfrm_size to tx_size. | Dmitry Kovalev
Updated function signatures: txfrm_block_to_raster_block, txfrm_block_to_raster_xy, extend_for_intra, vp9_optimize_b. Change-Id: I7213f4c4b1b9ec802f90621d5ba61d5e4dac5e0a
2013-08-15 | Using { 0 } for initialization instead of memset. | Dmitry Kovalev
Change-Id: I4fad357465022d14bfc7e13b348c6da267587314
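The equivalence behind this change can be sketched as below. The struct `rd_stats_example` and the function name are hypothetical; the sketch compares fields rather than raw bytes because `{ 0 }` is not guaranteed to clear padding.

```c
#include <string.h>

/* Hypothetical struct used only for this illustration. */
typedef struct {
  int rate;
  long long dist;
  int skip;
} rd_stats_example;

/* Returns 1 if an aggregate initialized with { 0 } has the same field
 * values as one cleared with memset. */
int zero_init_matches_memset(void) {
  rd_stats_example a = { 0 }; /* every member is zero-initialized */
  rd_stats_example b;
  memset(&b, 0, sizeof(b));   /* the older, more verbose form */
  return a.rate == b.rate && a.dist == b.dist && a.skip == b.skip;
}
```

The initializer form also allows the variable to be declared `const`, which memset does not.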
2013-08-15 | Moving segmentation struct from MACROBLOCKD to VP9_COMMON. | Dmitry Kovalev
VP9_COMMON is the right place for the segmentation struct because it holds global segmentation parameters, not something specific to macroblock processing. Change-Id: Ib9ada0c06c253996eb3b5f6cccf6a323fbbba708
2013-08-15 | Unify luma and chroma rd-cost estimation | Jingning Han
This commit unifies the rate-distortion cost calculation process of luma and chroma components. It allows early termination to be enabled later in the rd search loop of chroma components, consistent with the luma components. Change-Id: I2e52a7c6496176bf2a5e3ef338d34ceb8aad9b3d
2013-08-14 | Renaming in MB_MODE_INFO | Paul Wilkins
The macro block mode info context originally contained an entry for each 16x16 macroblock. In VP9 each entry refers to an 8x8 region, not a macro block, so the naming is misleading. This first stage clean up changes the names of 3 entries in the structure to remove the mb_ prefix. TODO: clean up the nomenclature more widely in respect of mbmi and bmi. Change-Id: Ia7305c6d0cb805dfe8cdc98dad21338f502e49c6
2013-08-13 | Use lookup table to find largest txfm size | Jingning Han
Refactor choose_largest_txfm_size_ and make it find the largest transform size via lookup table. Change-Id: I685e0396d71111b599d5367ab1b9c934bd5490c8
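A table-driven choice of the largest transform size can be sketched as follows. The enums are a reduced, illustrative subset, and the table and function names are hypothetical; this is not the table added by the commit.

```c
/* Illustrative subset of block and transform sizes. */
typedef enum {
  BLOCK_4X4, BLOCK_8X8, BLOCK_16X16, BLOCK_32X32, BLOCK_64X64, BLOCK_SIZES
} BLOCK_SIZE;
typedef enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32 } TX_SIZE;

/* Largest transform that fits in each block size (hypothetical table). */
static const TX_SIZE max_tx_size_for_block[BLOCK_SIZES] = {
  TX_4X4,   /* BLOCK_4X4 */
  TX_8X8,   /* BLOCK_8X8 */
  TX_16X16, /* BLOCK_16X16 */
  TX_32X32, /* BLOCK_32X32 */
  TX_32X32  /* BLOCK_64X64: capped at the largest transform */
};

/* One table read replaces a chain of block-size comparisons; the result is
 * then limited by the largest size the current mode allows. */
TX_SIZE largest_tx_size(BLOCK_SIZE bsize, TX_SIZE max_allowed) {
  const TX_SIZE fit = max_tx_size_for_block[bsize];
  return fit < max_allowed ? fit : max_allowed;
}
```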
2013-08-13 | Merge "Refactor model based tx search in super_block_yrd" | Jingning Han
2013-08-12 | SSE2 high precision 32x32 forward DCT | Jingning Han
Enable SSE2 implementation of high precision 32x32 forward DCT. The intermediate stacks are 32 bits wide. The run-time goes down from 32126 cycles to 13442 cycles. Change-Id: Ib5ccafe3176c65bd6f2dbdef790bd47bbc880e56