path: root/vp9/encoder/vp9_block.h
Age | Commit message | Author
2014-02-12 | Remove inactive control parameters | Jingning Han
Change-Id: Ic5692af975fe6bd2d8ec82bbae103c6f7c2fc13e
2014-01-31 | Cleanup block_rd_txfm. | Alex Converse
* Avoid unnecessary type erasure
* Prune unused/duplicate fields from struct rdcost_block_args
* Make struct rdcost_block_args a local
Change-Id: I4f1fd4837ccd028bbfe727191ee8d69f0463b7e5
2014-01-24 | Renaming INTERPOLATION_TYPE to INTERP_FILTER. | Dmitry Kovalev
Corresponding renames:
  subpel_kernel => interp_kernel
  vp9_get_filter_kernel() => vp9_get_interp_kernel()
  pred_filter_type => pred_interp_filter
  adaptive_pred_filter_type => adaptive_pred_interp_filter
  mcomp_filter_type => interp_filter
  read_interp_filter_type() => read_interp_filter()
  write_interp_filter_type() => write_interp_filter()
  fix_mcomp_filter_type() => fix_interp_filter()
Change-Id: I1fa61fa1dc81ebbf043457c3ee2d8d4515bee6d3
2014-01-23 | vp9/encoder: add extern "C" to headers | James Zern
Change-Id: I4f51ce859a97bf1b8fd2b37ac585b7c643232b69
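For readers unfamiliar with the pattern, the following is a minimal sketch of the linkage guard that a change like this typically adds to a C header so it can be included from C++ code. The guard macro and the function `example_block_count` are hypothetical names chosen for illustration; they are not taken from the libvpx sources.

```c
#include <assert.h>

/* Contents of a hypothetical header, inlined so the sketch compiles alone. */
#ifndef EXAMPLE_BLOCK_H_
#define EXAMPLE_BLOCK_H_

#ifdef __cplusplus
extern "C" {  /* give the declarations below C linkage under a C++ compiler */
#endif

/* Number of 4x4 blocks in a square region whose side is side_pixels. */
static int example_block_count(int side_pixels) {
  assert(side_pixels > 0 && side_pixels % 4 == 0);
  return (side_pixels / 4) * (side_pixels / 4);
}

#ifdef __cplusplus
}  /* extern "C" */
#endif

#endif  /* EXAMPLE_BLOCK_H_ */
```

A C compiler never defines `__cplusplus`, so the guard is invisible to C code and only takes effect when the header is pulled into a C++ translation unit.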
2014-01-16 | Inter-frame non-RD mode decision | Jingning Han
This commit sets up a test framework for real-time coding. It enables a light motion search for non-RD mode decision.
Change-Id: I8bec656331539e963c2b685a70e43e0ae32a6e9d
2014-01-06 | Remove avoid_frame_with_high_error from RD loop | Jingning Han
The feature assumed that the recursive partition size search proceeds from 4x4 up to 64x64, so that information from small blocks could be used to decide early termination in the rate-distortion optimization search of large blocks. The current codebase now searches top-down, so the function could operate on values that were not properly initialized; it is therefore removed.

Tested on pedestrian_area_1080p at 4000 kbps running under speed 2. No visible difference in runtime observed.
Change-Id: I553df415c6191413762db7ae34e8790c71d8118e
2014-01-03 | Merging best_ref_mv and second_best_ref_mv into best_ref_mv[2]. | Dmitry Kovalev
Change-Id: If04b57828847cee09a79c94e1098d1aa4990ea0d
2013-12-27 | Adaptive motion control on ref and search range | Jingning Han
This commit makes a preliminary attempt to refine the motion search control. It computes the SAD associated with the mv predictor for each reference frame and uses it to decide whether the encoder should reduce the motion search range (if the predicted mv gives a fairly small SAD) or skip the current reference frame (if another ref frame gives a much smaller SAD cost). This feature is turned on at speed 1 and above.

In speed 1, compression performance changed:
  derf -0.018%
  yt -0.043%
  hd -0.045%
  stdhd -0.281%
Speed-up:
  pedestrian_area_1080p at 4000 kbps, 100 frames: 199651ms -> 188846ms (5.5% speed-up)
  blue_sky_1080p at 6000 kbps: 443531ms -> 415239ms (6.3% speed-up)

In speed 2, compression performance changed:
  derf -0.026%
  yt -0.090%
  hd -0.055%
  stdhd -0.210%
Speed-up:
  pedestrian: 113949ms -> 108855ms (4.5% speed-up)
  blue_sky: 271057ms -> 257322ms (5% speed-up)
Change-Id: I1b74ea28278c94fea329d971d706d573983d810d
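The decision described in this message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: the names `decide_search` and `SearchDecision`, the number of reference frames, the factor of 4 standing in for "much smaller", and the caller-supplied small-SAD threshold are all hypothetical.

```c
#include <assert.h>
#include <limits.h>

#define NUM_REFS 3  /* hypothetical number of candidate reference frames */

typedef enum { SEARCH_FULL, SEARCH_REDUCED, SEARCH_SKIP } SearchDecision;

/* Decide how to search reference frame `ref`, given the SAD obtained with
 * each reference frame's predicted motion vector. Thresholds are assumed. */
static SearchDecision decide_search(const unsigned int pred_mv_sad[NUM_REFS],
                                    int ref, unsigned int small_sad_thresh) {
  unsigned int best = UINT_MAX;
  int i;
  assert(ref >= 0 && ref < NUM_REFS);

  /* Find the smallest predictor SAD over all reference frames. */
  for (i = 0; i < NUM_REFS; ++i) {
    if (pred_mv_sad[i] < best) best = pred_mv_sad[i];
  }

  /* Another reference frame predicts far better: skip this one. */
  if (pred_mv_sad[ref] / 4 > best) return SEARCH_SKIP;

  /* The predictor is already close: a narrower search range suffices. */
  if (pred_mv_sad[ref] < small_sad_thresh) return SEARCH_REDUCED;

  return SEARCH_FULL;
}
```

With predictor SADs of 100, 1000 and 300 and a small-SAD threshold of 200, this sketch narrows the search on the first frame, skips the second and runs a full search on the third.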
2013-12-19 | Store the SSE of prediction residuals | Jingning Han
Buffer the SSE of prediction residuals in the rate-distortion optimization loop of a given block. This information would be used for later encoding control. Change-Id: If4e63f3462490513c48be9407d3327c8dd438367
2013-12-12 | Enable adaptive pred filter type for sub8x8 | Jingning Han
This commit enables adaptive prediction filter type selection for sub8x8 block sizes. In speed 1, sub8x8 blocks re-use the filter type of the collocated 8x8 block if that block was tested in the rate-distortion optimization loop; otherwise the normal test over all three filter types is run. In speed 2, they re-use the 8x8 block's prediction filter type if available; otherwise the filter is forced to EIGHTTAP.

Compression and speed performance:
speed 1
  derf -0.266%
  yt -0.138%
  bus at 2000 kbps: 33766ms -> 30451ms (10% speed-up)
  football at 600 kbps: 48173ms -> 43786ms (9% speed-up)
speed 2
  derf -0.026%
  yt +0.134%
  bus at 2000 kbps: 18973ms -> 17698ms (6% speed-up)
  football at 600 kbps: 26748ms -> 25096ms (6% speed-up)
Change-Id: I77e097533b969fd3472147225fa79fc98095d342
2013-12-06 | Removing BLOCK_TYPES and adding PLANE_TYPES constant instead. | Dmitry Kovalev
Change-Id: Ic3bb862e93aedf6a489a33ea6f7e5097d96855ee
2013-12-05 | Renaming PREV_COEF_CONTEXTS to COEFF_CONTEXTS. | Dmitry Kovalev
Also adding BAND_COEFF_CONTEXTS macro to simplify for loop logic. Change-Id: I12a78a49cf1addf81e6b3fe2a3736ec2b79bd79e
2013-12-04 | Merge "Cleaning up vp9_entropy.h file." | Dmitry Kovalev
2013-12-03 | Moving eob array to the encoder. | Dmitry Kovalev
In the decoder we don't need to save eobs; we can pass eob as an argument. We therefore remove the eob arrays from VP9Decompressor and TileWorkerData, and move the eob pointer from macroblockd_plane to macroblock_plane.
Change-Id: I8eb919acc837acfb3abdd8319af63d1bbca8217a
2013-12-03 | Cleaning up vp9_entropy.h file. | Dmitry Kovalev
Renaming constants for consistency:
  DCT_VAL_CATEGORY1 => CATEGORY1_TOKEN
  DCT_VAL_CATEGORY2 => CATEGORY2_TOKEN
  DCT_VAL_CATEGORY3 => CATEGORY3_TOKEN
  DCT_VAL_CATEGORY4 => CATEGORY4_TOKEN
  DCT_VAL_CATEGORY5 => CATEGORY5_TOKEN
  DCT_VAL_CATEGORY6 => CATEGORY6_TOKEN
  DCT_EOB_TOKEN => EOB_TOKEN
  DCT_EOB_MODEL_TOKEN => EOB_MODEL_TOKEN
  MAX_ENTROPY_TOKENS => ENTROPY_TOKENS
Moving constants:
  INTER_MODE_CONTEXTS from vp9_entropy.h to vp9_blockd.h
  EOSB_TOKEN from vp9_entropy.h to vp9_tokenize.h
Change-Id: I5fcbf081318e1d365792b6d290a930c6cb0f3fc2
2013-11-26 | Removing qcoeff buffers from the decoder. | Dmitry Kovalev
We only need qcoeff buffers in the encoder. This reduces the TileWorkerData and VP9Decompressor struct sizes by 24K.
Change-Id: Id148868461f7ffa3d3dd634b371503ae9c57e207
2013-11-13 | Simplifies band-getting with a static array | Deb Mukherjee
Simplifies the code by implementing band mapping with static arrays. A lot of the code complexity introduced in a previous patch disappears. Change-Id: Ia3fac36e594fb5ad2d55ae141c58bba4c55c2d28
2013-11-13 | Dual buffer encoding for intra modes | Jingning Han
Overall change (using dual buffer scheme for superblocks of both inter and intra modes) reduces speed 2 runtime:
  bluesky_1080p at 6000kbps: 263553ms -> 257441ms
  riverbed_1080p at 8000kbps: 233230ms -> 225308ms
Change-Id: Idf8d70f768a4b0d97b2a8506372c57b7b4022119
2013-11-12 | Moving q_index from MACROBLOCKD to MACROBLOCK. | Dmitry Kovalev
Moving because q_index is used only by encoder. Change-Id: I0b96175614ed4fd3d76ee56a0ba36258e1e896f6
2013-11-12 | Merge "Enable dual buffer rd search and encoding scheme" | Jingning Han
2013-11-12 | Merge "Moving {sb, mb, b, ab}_index from MACROBLOCKD to MACROBLOCK." | Dmitry Kovalev
2013-11-12 | Removes conditional statements from band getting | Deb Mukherjee
Implements scan order to band map with arrays in both the encoder and decoder to remove conditional statements. Encoding seems to be about 1% faster at speed 0, tested on football. Decoding seems to be about 0.5-1% faster on a set of 25 videos. Change-Id: Idb233ca0b9e0efd790e30880642e8717e1c5c8dd
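The idea of replacing conditionals with an array lookup can be illustrated as below. This is a sketch only: the table values, the 4x4 size, and the function names are assumptions made for the example and are not copied from libvpx.

```c
#include <assert.h>

#define COEFS_4X4 16  /* coefficients in a 4x4 transform block */

/* Illustrative table: entry i is the band of the i-th coefficient in scan
 * order. The values are assumed for this example. */
static const unsigned char band_of_scan_pos_4x4[COEFS_4X4] = {
  0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5
};

/* Conditional version, shown for contrast: a chain of compares per call. */
static int get_band_branchy(int pos) {
  if (pos < 1) return 0;
  if (pos < 3) return 1;
  if (pos < 6) return 2;
  if (pos < 10) return 3;
  if (pos < 13) return 4;
  return 5;
}

/* Table version: a single indexed load and no conditionals. */
static int get_band_table(int pos) {
  assert(pos >= 0 && pos < COEFS_4X4);
  return band_of_scan_pos_4x4[pos];
}
```

Both functions return the same band for every position; the table version trades 16 bytes of read-only data for the removal of the branches from the inner loop.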
2013-11-11 | Enable dual buffer rd search and encoding scheme | Jingning Han
This commit enables the dual buffer rate-distortion optimization and encoding scheme. It stores the original transform coefficients, quantized levels, and reconstructed coefficients during the rate-distortion optimization search, which eliminates the need to re-run residual generation, forward transform, and quantization in the encoding stage.
Change-Id: I011bfad3a59a380a869ee552e91dae0394ec492e
2013-11-11 | Allocate dual buffer sets for encoding | Jingning Han
Allocate memory space of dual buffer sets that store the coeff, qcoeff, dqcoeff, and eobs. Connect the pointers of macroblock_plane and macroblockd_plane to the actual buffer in use accordingly. Change-Id: I2f0b5f482ca879fae39095013eaf8901db20a5a4
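A rough sketch of what two buffer sets with switchable plane pointers could look like is given below. All type and function names (`CoeffBufferSet`, `PlanePointers`, `DualBuffer`, `select_buffer`) and the buffer sizes are hypothetical; the real code splits these pointers between macroblock_plane and macroblockd_plane, which this sketch merges into one struct for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_COEFS 1024  /* hypothetical capacity: one 32x32 block */

/* One complete set of coefficient buffers. */
typedef struct {
  int16_t coeff[MAX_COEFS];    /* original transform coefficients */
  int16_t qcoeff[MAX_COEFS];   /* quantized levels */
  int16_t dqcoeff[MAX_COEFS];  /* dequantized (reconstructed) coefficients */
  uint16_t eobs[MAX_COEFS / 16];
} CoeffBufferSet;

/* Pointers the rest of the encoder reads and writes through. */
typedef struct {
  int16_t *coeff;
  int16_t *qcoeff;
  int16_t *dqcoeff;
  uint16_t *eobs;
} PlanePointers;

typedef struct {
  CoeffBufferSet sets[2];
  PlanePointers plane;
} DualBuffer;

/* Point the plane at buffer set idx (0 or 1) without copying any data. */
static void select_buffer(DualBuffer *db, int idx) {
  assert(db != 0 && (idx == 0 || idx == 1));
  db->plane.coeff = db->sets[idx].coeff;
  db->plane.qcoeff = db->sets[idx].qcoeff;
  db->plane.dqcoeff = db->sets[idx].dqcoeff;
  db->plane.eobs = db->sets[idx].eobs;
}
```

One plausible use is to let the search write each candidate into one set while the best result so far stays untouched in the other, so that the winner can be encoded later without recomputing it.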
2013-11-11 | Moving {sb, mb, b, ab}_index from MACROBLOCKD to MACROBLOCK. | Dmitry Kovalev
We use {sb, mb, b, ab}_index only inside encoder, so moving them into appropriate data structure. Change-Id: Ib5c1036716354d9d321e11a60c1634c1cb8f9716
2013-11-06 | Removes stack allocation of token_cache | Deb Mukherjee
Removes stack-allocation of token_cache in the tokenize function in the encoder. Change-Id: Ifdbab76dc2b23384da0080d2e9390533e489db8c
2013-10-30 | Replacing (SWITCHABLE_FILTERS + 1) with SWITCHABLE_FILTER_CONTEXTS. | Dmitry Kovalev
Change-Id: I9781a62bc1a4cd9176554d1271d87dbcafda9cb0
2013-10-24 | Making input pointer constant for all fdct/fht functions. | Dmitry Kovalev
Change-Id: I78f7012f967a777ddd39bae6671eb501df6bbfe8
2013-10-22 | Removing quantize_b_4x4 function pointer. | Dmitry Kovalev
The pointer was assigned only once, with vp9_regular_quantize_b_4x4, so this function is now called directly.
Also removing unused declarations:
  prototype_quantize_block
  prototype_quantize_block_pair
  prototype_quantize_mb
  vp9_regular_quantize_b_4x4_pair
  vp9_regular_quantize_b_8x8
Change-Id: I14325bc2f082336820671eafbc06126651b79f73
2013-10-22 | Merge "Removing NUM_ prefix from constant names." | Dmitry Kovalev
2013-10-22 | Merge "Using INTER_MODES constant instead of MB_MODE_COUNT - NEARESTMV." | Dmitry Kovalev
2013-10-18 | Removing NUM_ prefix from constant names. | Dmitry Kovalev
Renames for consistency with other constants:
  NUM_FRAME_TYPES -> FRAME_TYPES
  NUM_PARTITION_CONTEXTS -> PARTITION_CONTEXTS
Change-Id: I3db30acb2868eb0a424237c831087b2e264ec47f
2013-10-18 | Using INTER_MODES constant instead of MB_MODE_COUNT - NEARESTMV. | Dmitry Kovalev
Change-Id: Ie5ec392904d03fd5485474b33be8408108e9d3c9
2013-10-18 | Make memory alloc in pick_mode_context bsize aware | Jingning Han
This commit makes the buffer allocation of the zcoeff_blk array in pick_mode_context block size aware. It calculates the number of 4x4 blocks in the partition and assigns the memory space accordingly. This process (and the uninitialization) is done once for each encoding pass. It allows a smaller buffer to be copied when possible.

For football at 600kbps, the runtimes improve by about 1%:
  speed 1: 45961ms -> 45472ms
  speed 2: 23863ms -> 23598ms
Change-Id: Id2ca24906fa89f46fa5fe742ec4b8efc2a61f877
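The sizing rule described here, one entry per 4x4 block in the partition, can be sketched as follows. The function names and the choice of one byte per flag are assumptions for illustration and are not taken from the commit.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of 4x4 blocks covering a width x height pixel partition. */
static size_t num_4x4_blocks(int width, int height) {
  assert(width > 0 && height > 0);
  assert(width % 4 == 0 && height % 4 == 0);
  return (size_t)(width / 4) * (size_t)(height / 4);
}

/* Allocate one zeroed flag byte per 4x4 block. Returns NULL on failure;
 * the caller releases the buffer with free(). */
static unsigned char *alloc_zcoeff_flags(int width, int height) {
  return calloc(num_4x4_blocks(width, height), sizeof(unsigned char));
}
```

Under this rule a 64x64 partition needs 256 entries while an 8x8 partition needs only 4, which is why sizing per partition makes later copies of the buffer cheaper for small blocks.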
2013-10-16 | Inlining and removing fwd_txm16x16 and fwd_txm8x8 pointers. | Dmitry Kovalev
Change-Id: I3528ba1c3fee761918509f9d9dc2d842c69f5a44
2013-10-16 | Implement variance-based adaptive quantization | Guillaume Martres
This should be similar to what x264 does with --aq-mode 1. It works well with clips like parkjoy and touhou (http://x264.nl/developers/Dark_Shikari/LosslessTouhou.mkv). At low bitrates, the segmentation signaling overhead may negate the benefits of this feature. (PGW) Default changed to feature OFF to allow provisional merge. Change-Id: I938abf9bb487e1d4ad3b0264ea03d9826275c70b
2013-10-15 | Merge "Fix a few indent format issues in buffer defs" | Jingning Han
2013-10-15 | Merge "Re-design all-zero-coeff block index buffer use" | Jingning Han
2013-10-15 | Fix a few indent format issues in buffer defs | Jingning Han
Change-Id: Iac55891ac9e6f13718c9f822aa099b5ca491832a
2013-10-15 | Removing unused 8x4 transform from the encoder. | Dmitry Kovalev
Change-Id: Icbcf68b5b685a56f255ebc3859c9692accdadf9e
2013-10-15 | Re-design all-zero-coeff block index buffer use | Jingning Han
Use the zcoeff_blk buffer of PICK_MODE_CONTEXT to store the indexes of all-zero-coeff block of the current best mode. Remove the temporary buffer best_zcoeff_blk defined in the rate-distortion optimization loop. This improves the speed performance by about 0.5% in all speed settings. Change-Id: Ie3e15988ddfa581eafa2e19a8228d3fe4a46095c
2013-10-14 | Move token_cache from cost_coeffs to MACROBLOCK | Jingning Han
This commit moves the token_cache buffer into the macroblock struct, instead of defining it as a local variable in cost_coeffs. This avoids repeatedly re-allocating memory space in the rate-distortion optimization loop.

The runtime at speed 0 reduces:
  bus 2000kbps: 161692ms to 159951ms
  football 600kbps: 229505ms to 225821ms
Change-Id: If7da6b0b6d8c5138a16271a33c4548fba33d8840
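The general pattern, keeping a scratch buffer in a long-lived struct rather than as a local of a hot function, might look like the sketch below. `EncoderBlock`, `cost_tokens`, the buffer size, and the toy cost formula are all hypothetical and stand in for the real macroblock struct and cost_coeffs.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TOKENS 1024  /* hypothetical upper bound on tokens per block */

/* Long-lived per-block encoder state that owns the scratch buffer. */
typedef struct {
  uint8_t token_cache[MAX_TOKENS];
} EncoderBlock;

/* Toy cost function: each token costs its own value plus the value of the
 * previous token, read back from the cache. The formula is illustrative. */
static int cost_tokens(EncoderBlock *x, const uint8_t *tokens, int n) {
  int i, cost = 0;
  assert(x != 0 && tokens != 0 && n >= 0 && n <= MAX_TOKENS);
  for (i = 0; i < n; ++i) {
    x->token_cache[i] = tokens[i];
    cost += tokens[i] + (i > 0 ? x->token_cache[i - 1] : 0);
  }
  return cost;
}
```

Because the buffer belongs to the struct, every call made inside the search loop reuses the same storage instead of setting up a fresh array each time.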
2013-10-10 | Re-design rate-distortion cost tracking buffers | Jingning Han
This commit re-designs the per transformed block rate-distortion costs tracking buffers. It removes redundant buffer usage, makes the needed context memory allocation per VP9_COMP instance and reuses the same buffer sets inside the rate-distortion optimization search loop, thereby avoiding repeatedly requiring memory space. It reduces speed 0 runtime: bus at 2000 kbps from 166763ms to 158967ms, football at 600 kbps from 246614ms to 234257ms. Both about 5% speed-up. Local tests suggest about 2% to 5% speed-up for speed 1 and 2 settings. This does not change compression performance. Change-Id: I363514c5276b5cf9a38c7251088ffc6ab7f9a4c3
2013-10-09 | Deprecate the use of PARTITION_INFO from encoder | Jingning Han
Use b_mode_info to store the inter prediction mode of a sub8x8 block, replacing the use of partition_info. Remove the redundant buffer update for partition_info. For bus_cif at 2000 kbps, this seems to make speed 0 about 1% faster.
Change-Id: Id1b3be45e75a24fb4b42335ac480c23e440978f6
2013-10-01 | vp9_block.h cpplint issues resolved | Jim Bankoski
Change-Id: Icc6a76a5be77f3e19918155bab3998e0aa32ccf5
2013-09-23 | Enable per transformed block zero coeffs forcing | Jingning Han
This commit enables forcing all coefficients to zero per transformed block when the resulting rate-distortion cost is lower than that of regular coefficient quantization.

The overall performance improvement (including its parent patch on calculating rd cost per transformed block) at speed 1:
  derf: 0.298%
  yt: 0.452%
  hd: 0.741%
  stdhd: 0.006%
Change-Id: I66005fe0fd7af192c3eba32e02fd6d77952accb5
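The comparison at the heart of this change can be sketched as below. The cost form `lambda * rate + distortion` is a simplifying assumption (libvpx uses its own fixed-point formula), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Toy rate-distortion cost: rate weighted by lambda, plus distortion. */
static int64_t rd_cost(int64_t lambda, int64_t rate, int64_t distortion) {
  assert(lambda >= 0 && rate >= 0 && distortion >= 0);
  return lambda * rate + distortion;
}

/* Returns 1 when coding the block as all-zero coefficients is cheaper in
 * rate-distortion terms than coding the regularly quantized coefficients. */
static int should_force_zero(int64_t lambda,
                             int64_t rate_coded, int64_t dist_coded,
                             int64_t rate_zero, int64_t dist_zero) {
  return rd_cost(lambda, rate_zero, dist_zero) <
         rd_cost(lambda, rate_coded, dist_coded);
}
```

A large lambda makes bits expensive, so zeroing wins even at a higher distortion; a small lambda favours keeping the coefficients.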
2013-09-13 | Adaptive motion search control | Jingning Han
This commit enables an adaptive constraint on the motion search range for smaller partitions, using the motion vectors of the collocated larger partition as a candidate initial search point. At speed 0, the runtime of bus at CIF and 2000 kbps goes from 167s down to 162s (3% speed-up), with a 0.01dB performance gain. At speed 1, the runtime goes from 33687 ms to 32142 ms (4.5% speed-up), with a 0.03dB performance gain.

Compression performance gains at speed 1:
  derf 0.118%
  yt 0.237%
  hd 0.203%
  stdhd 0.438%
Change-Id: Ic8b34c67810d9504a9579bef2825d3fa54b69454
2013-08-27 | Renaming BLOCK_SIZE_TYPE to BLOCK_SIZE in the encoder. | Dmitry Kovalev
Change-Id: I62bb07c377f947cb72fac68add7a6b199e42c6b9
2013-08-23 | cosmetics: strip 'VP9_' from defines in vp9 only code | James Zern
Change-Id: I481d9bb2fa3ec72b6a83d5f04d545ad8013f295c
2013-08-07 | Use low precision 32x32fdct for encodemb in speed1 | Jingning Han
The low precision 32x32 fdct keeps all the intermediate steps within 16-bit depth, allowing a faster SSE2 implementation at the expense of larger round-trip error. It was previously used in the rate-distortion optimization search loop only. Using the low precision version in place of the high precision one affects the compression performance by about 0.7% (derf, stdhd) at speed 0. For speed 1, it lowers the derf set by only 0.017%.
Change-Id: I4e7d18fac5bea5317b91c8e7dabae143bc6b5c8b