summaryrefslogtreecommitdiff
path: root/vp9/encoder/vp9_rdopt.c
AgeCommit message (Collapse)Author
2017-09-08Fix bug in intra mode rd penalty.paulwilkins
The intra mode rd penalty was implemented as a rate penalty. Code was added to scale the penalty according to block size but this was not done correctly for the SB level or sub 8x8. The code did a weird double scaling in regard to bit depth that has been removed. Given that it is a rate penalty the bit depth should not matter. This bug fix improves average metrics on our standard test sets by about 0.1% Change-Id: I7cf81b66aad0cda389fe234f47beba01c7493b1e
2017-09-05Remove get_filter_base() and get_filter_offset() in convolveLinfeng Zhang
so that the convolve functions are independent of table alignment. Change-Id: Ieab132a30d72c6e75bbe9473544fbe2cf51541ee
2017-08-21Remove skip_block from quantizeJohann
This condition is handled before this code is reached. The ssse3 version of the function has always crashed when attempting to handle the skip_block condition. Add assert() and comments regarding the usage of skip_block. Removing the parameter is a fairly involved process so leave it be for the moment. Change-Id: Ib299f6fc6589d7ee102262cc74a7aeb60110bc5a
2017-07-06cosmetics,vp9/: normalize inv/fwd_txfm namingJames Zern
+ vpx_dsp/, test/ itxfm -> inv_txfm, ftxfm -> fwd_txfm Change-Id: I3aacdb65143576d64cfe5c9b14dd358c17c1fe7e
2017-06-29cosmetics,vp9/encoder: s/txm/txfm/James Zern
txfm is more commonly used as an abbreviation through the codebase Change-Id: I86fd90ef132468f9da270091c05daa1f5a49ece2
2017-05-03Update highbd idct functions arguments to use uint16_t dstLinfeng Zhang
BUG=webm:1388 Change-Id: I3581d80d0389b99166e70987d38aba2db6c469d5
2017-05-03Clean CONVERT_TO_BYTEPTR/SHORTPTR in idctLinfeng Zhang
BUG=webm:1388 Change-Id: Ida62c941f2b836d6c9e27b427a7d5008ab6dc112
2017-05-01Merge "Clean vp9_highbd_build_inter_predictor() and highbd_inter_predictor()"Linfeng Zhang
2017-04-26Merge "Make the row based multi-threaded encoder deterministic"Yunqing Wang
2017-04-25Clean vp9_highbd_build_inter_predictor() and highbd_inter_predictor()Linfeng Zhang
BUG=webm:1388 Change-Id: I7ee32e0c08f0fb41712a8cc640b2c5bba872421d
2017-04-25Update highbd convolve functions arguments to use uint16_t src/dstLinfeng Zhang
BUG=webm:1388 Change-Id: I6912de2639895d817ce850da8ea9f6c8fe21da42
2017-04-24Make the row based multi-threaded encoder deterministicYunqing Wang
This patch followed allow_exhaustive_searches feature modification and continued to modify the encoder to achieve the determinism in the row based multi-threaded encoding. While row-mt = 1 and using multiple threads, the adaptive feature in encoder was disabled, which gave BDRate gain(at speed 1, -0.6% ~ -0.7%; at speed 2, -0.46% ~ -0.59%), but some encoder speed losses(7% ~ 10% at speed 1 and 3% ~ 6% at speed 2). These speed losses were acceptable considering the speed gains obtained from row-mt. Change-Id: I60d87a25346ebc487a864b57d559f560b7e398bb
2017-04-19Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolveLinfeng Zhang
Replace by CAST_TO_BYTEPTR/SHORTPTR. The rule is: if a short ptr is casted to a byte ptr, any offset operation on the byte ptr must be doubled. We do this by casting to short ptr first, adding offset, then casting back to byte ptr. BUG=webm:1388 Change-Id: I9e18a73ba45ddae58fc9dae470c0ff34951fe248
2017-04-06VP9 motion vector unit testYunqing Wang
To prevent the motion vector out of range bug, added a motion vector unit test in VP9. In the 4k video encoding, always forced to use extreme motion vectors and also encouraged to use INTER modes. In the decoding, checked if the motion vector was valid, and also checked the encoder/decoder mismatch. The tests showed that this unit test could reveal the issue we saw before. Change-Id: I0a880bd847dad8a13f7fd2012faf6868b02fa3b4
2017-03-22vp9_rdopt: correct size to vpx_sum_squares_2d_i16James Zern
the current implementations expect pixel size, not the block type BUG=webm:1392 Change-Id: Ib91e9f30a1f56e13566b1fb76f089dae9bb50cdc
2017-03-20Merge "Record the sum of tx block eobs in the partition block"Yunqing Wang
2017-03-20Record the sum of tx block eobs in the partition blockYunqing Wang
The sum of tx bloxk eobs is needed in the machine learning based partition early termination. The eobs are first accumulated during tx search, and then the value associated with the best tx_size is copied to ctx for later use. After the sum of eobs are calculated correctly, re-enabled ml_partition_search_early_termination speed feature. Re-did the quality/speed test to check the impact of the fix. 1. Borg test BDRATE result: 4k set: PSNR: +0.183%; SSIM: +0.100%; hdres set: PSNR: +0.168%; SSIM: +0.256%; midres set: PSNR: +0.186%; SSIM: +0.326%; 2.Average speed gain result: 4k clips: 21%; hd clips: 26%; midres clips: 15%. The result is in line with the original result. Change-Id: I4209a95c89be03b4cbfb6a95b16885f89feddbda
2017-03-16Add a vector form of routine vp9_model_rd_from_var_lapndzGabriel Marin
Add routine vp9_model_rd_from_var_lapndz_vec and call it from model_rd_for_sb to model the rate and distortion for MAX_MB_PLANE Laplacian sources in parallel. The caller ensures that all sources have non-zero variance. Measured a 18% to 25% reduction in retired instructions, and 17% to 24% reduction in instruction execution cost with different compilers for the Laplacian modeling. No change in behavior. TEST=Verified that encoded files match bit for bit, with and without this change. BUG=b/33678225 Change-Id: I6b76947f21c659a349adb896e13e99f6e3f951e6
2017-03-03Merge "Narrow cat6_high_cost tables to uint16_t"Alex Converse
2017-03-03Narrow cat6_high_cost tables to uint16_tAlex Converse
Saves 2688 bytes of rodata. Change-Id: I46633b6e50c2845181c70fff6273a8e58fdd1e56
2017-02-27vp9: Rename new_mt to row_mtVignesh Venkatasubramanian
new_mt is a very generic name that will get obsolete soon enough. Since this is exposed as a codec control, renaming it to row_mt to signify row level paralellism. Also renaming the ETHREAD_BIT_MATCH codec control to ROW_MT_BIT_EXACT. Change-Id: Ic7872d78bb3b12fb4cf92ba028ec8e08eb3a9558
2017-02-24consolidate block_error functionsJohann
vp9_highbd_block_error_8bit_c was a very simple wrapper around vp9_block_error_c. The SSE2 implemention was practically identical to the non-HBD one. It was missing some minor improvements which only went into the original version. In quick speed tests, the AVX implementation showed minimal improvement over SSE2 when it does not detect overflow. However, when overflow is detected the function is run a second time. The OperationCheck test seems to trigger this case and reverses any speed benefits by running ~60% slower. AVX2 on the other hand is always 30-40% faster. Change-Id: I9fcb9afbcb560f234c7ae1b13ddb69eca3988ba1
2017-02-16Structured the mode ordering code to avoid redundant memcpyRanjit Kumar Tulabandu
Change-Id: I4f5d6b54018bd1928cd9e5e42619e6f55b334803
2017-02-15Row based multi-threading of encoding stageRanjit Kumar Tulabandu
(Yunqing Wang) This patch implements the row-based multi-threading within tiles in the encoding pass, and substantially speeds up the multi-threaded encoder in VP9. Speed tests at speed 1 on STDHD(using 4 tiles) set show that the average speedups of the encoding pass(second pass in the 2-pass encoding) is 7% while using 2 threads, 16% while using 4 threads, 85% while using 8 threads, and 116% while using 16 threads. Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de
2017-02-01Merge "Changes to facilitate row based multi-threading of ARNR filtering"Yunqing Wang
2017-02-01Changes to facilitate row based multi-threading of ARNR filteringRanjit Kumar Tulabandu
Change-Id: I2fd72af00afbbeb903e4fe364611abcc148f2fbb
2017-02-01vp9_rdopt: declare 'c' closer to useJohann
Clears up static clang analysis warning regarding a dead store. Only declare 'c' when it will be used. Change-Id: I1ac0fc7f94bc44da63938c63cd1efcd6b95e0eb3
2017-01-31Fix real-time compression regression in hbd modeJingning Han
This commit resolves the compression performance regression in real-time encoding setting when high bit-depth mode is enabled. The current solution temporarily disables the SIMD implementations of vpx_satd, hadamard8x8, and hadamard16x16 in high bit-depth mode. The commit makes the coding results bit-wise identical between regular coding pipeline and high bit-depth at profile 0. BUG=webm:1365 Change-Id: Icfb900821733749685370460a1a5a7e07f76f4bf
2016-08-31Refactor uv tx size with lookup arraysDebargha Mukherjee
Change-Id: Ife6a3d301c5faaba89d16d188d638631083511f7
2016-08-25Adjust coefficient optimization and tx_domain rd speed features.paulwilkins
Previously Tx domain rd was used in all cases above speed 0. Coefficient optimization was only enabled for best and speed 0. This patch selectively sets these features at other speed settings based on block complexity. For the Netflix and HD sets in particular the quality gains are large compared to the speed hit. At speed 1 the average psnr gain in the NF set is > 2.5% with one clip coming in at 18% and some points almost 30%. Average gains for the lower resolution test sets are around 1%. The gains are biggest at low Q so some further optimization may be possible. Change-Id: I340376c7b2a78e5389a34b7ebdc41072808d0576
2016-08-12Fix another motion vector out of range bugYunqing Wang
This patch fixed a motion vector out of range bug: vpxenc: ../libvpx/vp9/encoder/vp9_mcomp.c:69: mv_cost: Assertion `mv->col >= -((1 << (11 + 1 + 2)) - 1) && mv->col < ((1 << (11 + 1 + 2)) - 1)' failed. For blocks that returned without having full-pixel search, the original MV limits were not restored, which caused the failure. Moved the set MV limit function down to fix the bug. Change-Id: Id7d798fc7214e95c6e4846c588f0233fcf1a4223
2016-08-08Refactor mv limits.Alex Converse
Change-Id: Ifebdc9ef37850508eb4b8e572fd0f6026ab04987
2016-08-05Fix a motion vector out of range bugYunqing Wang
This patch fixed a motion vector(MV) out of range bug, which was caused by not restoring the original values of the MV min/max thresholds after the sub8x8 full pixel motion search. It occurred rarely and only was seen while encoding a 4k clip for 200 frames. BUG=webm:1271 Change-Id: Ibc4e0de80846f297431923cef8a0c80fe8dcc6a5
2016-08-03Fix msvc compiler warningsYaowu Xu
MSVC 2013 complained about using 32 shift where 64 bit shift should be used. Change-Id: I7a2b165d1a92d3c0a91dd4511b27aba7709b5e55
2016-08-02vp9/encoder: apply clang-formatclang-format
Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2
2016-07-27Fix 64 to 32 narrowing warning.Alex Converse
- Solves potential integer overflow on 12-bit - Fixes Visual Studio build Change-Id: I26dd660451bbab23040e4123920d59e82585795c
2016-07-25Only consider visible 4x4s in pixel domain error.Alex Converse
BDRATE change derf144: -0.327 lowres: -0.048 midres: -0.125 hdres: -0.238 Change-Id: I789aba9870b5c2952373a7dd4fc8ed45590c3c54
2016-07-21VP9: get_pred_context_switchable_interp() -- encoder sideScott LaVarnway
Change-Id: I7217c90d5cf38c51b76759a2dc4f10070f3a40ac
2016-07-11Merge "vp9_rd_pick_intra_mode_sb(): set interp_filter to"Scott LaVarnway
2016-07-09vp9_rd_pick_intra_mode_sb(): set interp_filter toScott LaVarnway
SWITCHABLE_FILTERS. This is a partial fix for the build issues with Change 357240. Change-Id: I4e507c196175bae729a4f1397878ec8776b0146c
2016-07-07Enable coeff optimization for intra modesJingning Han
This further improves the coding performance by lowres 0.3% midres 0.5% hdres 0.6% Change-Id: I6a03b6da210b9cbc261474bad4a103e0ba021c68
2016-07-07Enable uniform quantization with trellis optimization in speed 0Jingning Han
This commit allows the inter prediction residual to use uniform quantization followed by trellis coefficient optimization in speed 0. It improves the coding performance by lowres 0.79% midres 1.07% hdres 1.44% Change-Id: I46ef8cfe042a4ccc7a0055515012cd6cbf5c9619
2016-07-07Refactor coeff_cost() functionJingning Han
Move the operations that update the context buffers outside this function. The coeff_cost() takes all input as const value and returns the coefficient cost. This makes preparation for the next coefficient optimization CLs. Change-Id: I850eec6e5470b91ea84646ff26b9231b09f70a0c
2016-07-06Support measure distortion in the pixel domainJingning Han
Use pixel domain distortion metric in speed 0. This improves the compression performance by 0.3% for both low and high resolution test sets. Change-Id: I5b5b7115960de73f0b5e5d0c69db305e490e6f1d
2016-07-04Remove txfrm_block_to_raster_xy() from vp9 encoderJingning Han
The transform block row and column positions are always available outside the callees. There is no need to re-compute these values again. This approach has been used by the decoder. This commit removes txfrm_block_to_raster_xy() function. Change-Id: I5b90f91a0d8b7c35cfa7d171da9edf8202630108
2016-06-29Merge "VP9: handle_inter_mode()... Use interp_filter"Scott LaVarnway
2016-06-28VP9: handle_inter_mode()... Use interp_filterScott LaVarnway
only if above/left is inter. Change-Id: I0cc1f926425c021c84536df8271e9ee5f3f87caf
2016-06-25s/UINT32_MAX/UINT_MAX/James Zern
provides better toolchain compatibility Change-Id: I8561a6de668a68ff54fe3886a4ee6300f0ae9c04
2016-06-24Merge "cosmetics: Beautify whitespaces and line wrapping"James Zern
2016-06-24cosmetics: Beautify whitespaces and line wrappingYury Gitman
Change-Id: I9afa02cae671bd3527cf344695e53d0cc767f549