path: root/vp9/encoder/vp9_block.h
Age | Commit message | Author
2023-06-07 | Fix more typos (2/n) | Jerome Jiang
kernal -> kernel e.g -> e.g. paritioning -> partitioning partioning -> partitioning coefficents -> coefficients i.e, -> i.e., equivalend -> equivalent recive -> receive resoultions -> resolutions Bug: webm:1803 Change-Id: I1d6176202ee5daee7a64bf59114e8b304aeb4db7
2023-06-07 | Fix more typos (1/n) | Jerome Jiang
Dont -> Don't setings -> settings thresold -> thresh thresold -> threshold becasue -> because itterations -> iterations its a -> it's a an constant -> a constant Bug: webm:1803 Change-Id: I1e019393939ed25c59c898c88d4941ec360b026d
2023-03-20 | Merge "Refactor logic of skipping trellis coeff opt" into main | Yunqing Wang
2023-03-19 | Refactor logic of skipping trellis coeff opt | Deepa K G
The code to enable trellis coefficient optimization is refactored using the sf 'trellis_opt_tx_rd'. This change facilitates adaptive skipping of trellis optimization based on block properties. Change-Id: Ia1ff7cbbe5acf86414410f62655d46c099387847
2023-03-06 | reland: quantize: simplify 32x32_b args | Johann
Allocate mb_plane_ on the heap to ensure src is aligned. Now that all the implementations of the 32x32 quantize are in intrinsics we can reference struct members directly. Saves pushing them to the stack. n_coeffs is not used at all for this function. Change-Id: Ib551f7f583977602504d962b72063bc6eda9dda9
2023-03-03 | Revert "Allow macroblock_plane to have its own rounding buffer" | Johann
This reverts commit 5359ae810cdbb974060297ecf935183baf7b009b. Reason for revert: Blocks quantize cleanups Original change's description: > Allow macroblock_plane to have its own rounding buffer > > Add 8 bytes buffer to macroblock_plane to support rounding factor. > > Change-Id: I3751689e4449c0caea28d3acf6cd17d7f39508ed Change-Id: Ia2424d2114207370f0b45350313a5ff8521d25a8
2023-03-01 | Revert "quantize: simplify 32x32_b args" | James Zern
This reverts commit 848f6e733789c627b6606baf1c85e32be997e36f. This has alignment issues, causing crashes in the tests: SSSE3/VP9QuantizeTest.EOBCheck/* Change-Id: Ic12014ab0a78ed3cde02d642509061552cdc8fc9
2023-02-28 | quantize: simplify 32x32_b args | Johann
Now that all the implementations of the 32x32 quantize are in intrinsics we can reference struct members directly. Saves pushing them to the stack. n_coeffs is not used at all for this function. Change-Id: I2104fea3fa20c455087e21b347d6abd7ea1f3e1e
2023-02-22 | vp9_block.h: rename diff struct to Diff | James Zern
This matches the style guide and fixes some -Wshadow warnings related to variables with the same name. Something similar was done in libaom in: 863b04994b Fix warnings reported by -Wshadow: Part2: av1 directory Bug: webm:1793 Change-Id: I4df1bbc8d079a3174d75f0d35d54c200ffdbb677
2021-06-25 | Disallow skipping transform and quantization | Cheng Chen
The encoder has a feature to skip transform and quantization based on model rd analysis. It could happen that the model based analysis lets the encoder skip transform and quantization, while a bad prediction occurs, leading to bad reconstructed blocks, which are intrusive and apparently coding errors. We add a speed feature to guard the skipping feature. Due to the risk of bad perceptual quality, we disallow such skipping by default. On hdres test set, speed 2, the coding performance difference is 0.025%, speed difference is 1.2%, which can be considered insignificant. BUG=webm:1729 Change-Id: I48af01ae8dcc7a76c05c695f3f3e68b866c89574
2019-04-01 | Allow macroblock_plane to have its own rounding buffer | Jingning Han
Add 8 bytes buffer to macroblock_plane to support rounding factor. Change-Id: I3751689e4449c0caea28d3acf6cd17d7f39508ed
2019-03-18 | Add perceptual AQ mode control to RD search | Jingning Han
Make the rate-distortion optimization search support the perceptual quality AQ mode. Change-Id: Iee507ccfda90ac39b3623de705f187b1459e57e1
2019-02-11 | vp9: ML var partition as speed feature & cleanup. | Jerome Jiang
Remove it from runtime flag. Add new struct for rd ml partition. BUG=webm:1599 Change-Id: I883edbba83c65b7e557b8832419e212cffc85997
2018-10-08 | Set up the unit scaling factor for motion search | Yunqing Wang
Set up the unit scaling factor used during motion search. Change-Id: I6fda018d593b7ad4b7658d44c39be950a502d192
2018-09-28 | Add ml_var_partition experiment | Hui Su
Make partition decisions using machine learning models. The goal is to achieve better coding quality than the variance-based partitioning without much encoding speed loss. To enable this experiment, use --enable-ml-var-partition for config. When enabled, the variance-based partitioning is replaced by this ML based partitioning for speed 6 and above in real time mode (except low resolution or high bit-depth). Current coding gains (average PSNR) at speed 6 / speed 7 / speed 8: rtc 2.04% / 2.65% / 3.90%; ytlivehr 3.11% / 4.53% / 11.57%; hdres (rtc mode) 5.10% (single value given). Further testing and tuning is needed to see if the speed and quality tradeoff is reasonable. Change-Id: I0da5a2fbc22c3261832b32920ee36d9b19d417af
2018-09-15 | cosmetics: normalize include guards | James Zern
use the recommended format [1] of: <PROJECT>_<PATH>_<FILE>_H_ [1] https://google.github.io/styleguide/cppguide.html#The__define_Guard "All header files should have #define guards to prevent multiple inclusion. The format of the symbol name should be <PROJECT>_<PATH>_<FILE>_H_." Change-Id: I2e8ab0b32fb23c30fa43cff5fec12d043c0d2037
2018-08-14 | Make Sharpness parameter affect visual sharpness | Jim Bankoski
1: Lower rdmult used in trellis optimization 2: Shut off the end of block optimization that tries end of block at every sub position if any of the coefficients are > 1. 3: Change the rounding and zbin factor according to sharpness. 4: Disable the skip block check that calculates RD using SSE from predictor. Change-Id: I247b61a26fa22f12f8b684e7cd6d4e368de7c3e4
2018-07-02 | Merge "Exploit the spatial variance in temporal dependency model" | Jingning Han
2018-07-01 | vp9: Fix to screen content artifact for real-time. | Marco Paniconi
Reset segment to base (segment#0) on spatially flat stationary blocks (source_variance = 0). Also increase dc_skip threshold for these blocks. Reduces artifacts on flat areas in screen content mode. Change-Id: I7ee0c80d37536db7896fa74a83f75799f1dcf73d
2018-06-29 | Exploit the spatial variance in temporal dependency model | Jingning Han
Adapt the Lagrangian multiplier based on the spatial variance in the temporal dependency model. The functionality is disabled by default. To turn on, set enable_tpl_model to 1. Change-Id: I1b50606d9e2c8eb9c790c49eacc12c00d3d7c211
2017-09-28 | vp9: Modification to adapt the ARF usage for 1 pass vbr | Marco
Add stats for past ARF usage, and use it to disable ARF usage based on some conditions. Overall improvement on ytlive set, reduces the regression on the problem clips for this feature. Only affects when sf->use_altref_onepass is enabled (currently off by default). Change-Id: I66267f227ea132dc86acb730e9882f85bead2cdb
2017-07-30 | vp9: Fix denoising condition when pickmode partition is used. | Marco
When the superblock partition is based on the nonrd-pickmode, we need to avoid the denoising. Current condition was based on the speed level. This change is to make the condition at the superblock level, as the switch in partitioning may be done at sb level based on source_sad (e.g., in speed 6). Change-Id: I12ece4f60b93ed34ee65ff2d6cdce1213c36de04
2017-07-17 | vp9: Reuse motion from choose_partitioning in NEWMV search. | Marco
When int_pro_motion_estimation is done for superblock in choose_partitioning, use it to avoid the full_pixel_search for NEWMV mode, if bsize is >= 32X32. For speed > 7. Small/neutral change on RTC metrics. ~1-2% speedup on arm on high motion clip. Change-Id: I3cfe6833ff4bf75d4afa83eaf058ad45729de85b
2017-07-06 | cosmetics,vp9/: normalize inv/fwd_txfm naming | James Zern
+ vpx_dsp/, test/ itxfm -> inv_txfm, ftxfm -> fwd_txfm Change-Id: I3aacdb65143576d64cfe5c9b14dd358c17c1fe7e
2017-06-29 | cosmetics,vp9/encoder: s/txm/txfm/ | James Zern
txfm is more commonly used as an abbreviation through the codebase Change-Id: I86fd90ef132468f9da270091c05daa1f5a49ece2
2017-05-03 | Update highbd idct functions arguments to use uint16_t dst | Linfeng Zhang
BUG=webm:1388 Change-Id: I3581d80d0389b99166e70987d38aba2db6c469d5
2017-04-24 | vp9; Reduce artifact in non-rd pickmode for lighting changes. | Marco
Add a low-variance high-sumdiff to the superblock content state and use it to limit the mv and bias some decisions in non-rd pickmode. Only affects speed >= 6. Reduces artifact for lighting changes. Small/no difference in metrics on RTC set. Change-Id: Ic84b2379fe0ae3fa71ae826ee6bae3eaf551a25b
2017-04-21 | Make allow_exhaustive_searches feature no longer adaptive | Yunqing Wang
A previous patch turned on allow_exhaustive_searches feature only for FC_GRAPHICS_ANIMATION content. This patch further modified the feature by removing the exhaustive search limit, and made it no longer adaptive. As a result, the 2 counts that recorded the number of motion searches were removed, which helped achieve the determinism in the row based multi-threading encoding. Tests showed that this patch didn't make the encoder much slower. Used exhaustive_searches_thresh for this speed feature, and removed allow_exhaustive_searches. Also, refactored the speed feature code to follow the general speed feature setting style. Change-Id: Ib96b182c4c8dfff4c1ab91d2497cc42bb9e5a4aa
2017-03-27 | vp9: 1 pass: Move source sad computation into encodeframe loop. | Marco
Refactor to split the 1 pass source sad computation into scene detection (currently used for VBR and screen-content mode), and superblock based source sad computation (used in non-rd CBR mode). This allows the source sad computation for CBR mode to be multi-threaded. No change in compression. Change-Id: I112f2918613ccbd37c1771d852606d3af18c1388
2017-03-20 | Merge "Record the sum of tx block eobs in the partition block" | Yunqing Wang
2017-03-20 | vp9: Use sb content measure to bias against golden. | Marco
For each superblock, keep track of how far from current frame was the last significant content change, and use that (along with GF distance), to turn off GF search in non-rd pickmode. Only enabled for speed >= 8. avgPSNR on RTC/RTC_derf down by ~0.9/1.2. Speedup on mac: ~3-5%. Speedup on arm: 3.6% for VGA and 4.4% for HD. Change-Id: Ic3f3d6a2af650aca6ba0064d2b1db8d48c035ac7
2017-03-20 | Record the sum of tx block eobs in the partition block | Yunqing Wang
The sum of tx block eobs is needed in the machine learning based partition early termination. The eobs are first accumulated during tx search, and then the value associated with the best tx_size is copied to ctx for later use. After the sum of eobs is calculated correctly, re-enabled ml_partition_search_early_termination speed feature. Re-did the quality/speed test to check the impact of the fix. 1. Borg test BDRATE result: 4k set: PSNR: +0.183%; SSIM: +0.100%; hdres set: PSNR: +0.168%; SSIM: +0.256%; midres set: PSNR: +0.186%; SSIM: +0.326%; 2. Average speed gain result: 4k clips: 21%; hd clips: 26%; midres clips: 15%. The result is in line with the original result. Change-Id: I4209a95c89be03b4cbfb6a95b16885f89feddbda
2017-03-10 | vp9/encoder: fix segfault on win32 using vs < 2015 | James Zern
shift the bsse[] member of the macroblock struct to the front to avoid an incorrect offset (0) to the upper half of bsse[0] which leads to a negative resulting in a crash. restrict this to visual studio versions before 2015 (the bug was observed with 2013, fixed in 2015) to avoid any potential cache impact on other platforms. https://connect.microsoft.com/VisualStudio/feedback/details/2396360/bad-structure-offset-in-32-bit-code BUG=webm:1054 Change-Id: I40f68a1d421ccc503cc712192263bab4f7dde076
2017-02-15 | Row based multi-threading of encoding stage | Ranjit Kumar Tulabandu
(Yunqing Wang) This patch implements the row-based multi-threading within tiles in the encoding pass, and substantially speeds up the multi-threaded encoder in VP9. Speed tests at speed 1 on the STDHD set (using 4 tiles) show that the average speedup of the encoding pass (second pass in the 2-pass encoding) is 7% while using 2 threads, 16% while using 4 threads, 85% while using 8 threads, and 116% while using 16 threads. Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de
2017-01-24 | Multi-threading of first pass stats collection | Ranjit Kumar Tulabandu
(yunqingwang) 1. Rebased the patch. Incorporated recent first pass changes. 2. Turned on the first pass unit test. Change-Id: Ia2f7ba8152d0b6dd6bf8efb9dfaf505ba7d8edee
2017-01-20 | vp9: Add feature to use block source_sad for realtime mode. | Marco
Only for speed >= 7, and affects skipping of intra modes. Threshold is set low for now, needs to be tuned. Small/no difference in metrics on rtc clips. Change-Id: If9bdbd43f08d1f80407cdd2e9e5e96780dcd2424
2016-08-25 | Adjust coefficient optimization and tx_domain rd speed features. | paulwilkins
Previously Tx domain rd was used in all cases above speed 0. Coefficient optimization was only enabled for best and speed 0. This patch selectively sets these features at other speed settings based on block complexity. For the Netflix and HD sets in particular the quality gains are large compared to the speed hit. At speed 1 the average psnr gain in the NF set is > 2.5% with one clip coming in at 18% and some points almost 30%. Average gains for the lower resolution test sets are around 1%. The gains are biggest at low Q so some further optimization may be possible. Change-Id: I340376c7b2a78e5389a34b7ebdc41072808d0576
2016-08-08 | Refactor mv limits. | Alex Converse
Change-Id: Ifebdc9ef37850508eb4b8e572fd0f6026ab04987
2016-08-02 | vp9/encoder: apply clang-format | clang-format
Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2
2016-06-13 | vp9: Encoding cycle reduction for speed 8. | JackyChen
1. Skip golden non-zeromv and newmv-last for bsize >= 16x16 if the temporal variance obtained from choose_partitioning is very low. 2. Skip horz and vert INTRA mode for speed 8. This change works best on the clips with little noise and with some motion (e.g. gips_motion which has > 5% speed up). PSNR drop is 1.78% on rtc test set, no obvious visual quality regression found. Change-Id: Ib43b5b20e67809d03c5a6890818ddff59e1fc94a
2016-06-01 | vp9: Skip some modes when variance is low for big blocks, for 1 pass real-time. | jackychen
Skip intra-mode and some inter-modes (newmv, nearmv, nearestmv) for golden frame if the variance got from choose_partitioning is very low. Only for 1 pass real-time CBR mode and bsize >= 32x32, it has ~2.5% speed up with less than 0.1% PSNR drop for rtc test set. Don't see visual regression. Change-Id: I70efbc95a1007231ae36f02c5b2fbf6cd35077ad
2016-02-09 | Restore previous motion search bit-error scale. | Alex Converse
The bit to error transformation got doubled as a result of going from 8-bit to 9-bit costs (change d13385c). Use defines to derive the scale numbers and comment some of the fields. derf: -0.023 BDRATE; hevcmr: +0.067 BDRATE; stdhd: +0.098 BDRATE. (These are substantially smaller than the original gains from 8 to 9 bit costing.) Change-Id: I6a2b3b029b2f1415e4f90a05709b2333ec0eea9b
2016-01-27 | vp9 non-rd mode: Modification for detected skin areas. | Marco
If a superblock contains a lot of "skin" then force split of 64x64 partition, and make some adjustments in mode selection. This helps to reduce artifacts on moving face/skin areas at low bitrates. Little/no change in metrics: avgPSNR/SSIM down by ~0.12%. Small encoding time increase < 1%. Change-Id: Ic57f52148c3716f391419fab0530d916e4c1d186
2015-11-13 | Changes to exhaustive motion search. | paulwilkins
This change alters the nature and use of exhaustive motion search. Firstly, any exhaustive search is preceded by a normal step search. The exhaustive search is only carried out if the distortion resulting from the step search is above a threshold value. Secondly, the simple +/- 64 exhaustive search is replaced by a multi-stage mesh based search where each stage has a range and step/interval size. Subsequent stages use the best position from the previous stage as the center of the search but use a reduced range and interval size. For example: stage 1: range +/- 64, interval 4; stage 2: range +/- 32, interval 2; stage 3: range +/- 15, interval 1. This process, especially when it follows on from a normal step search, has shown itself to be almost as effective as a full range exhaustive search with step 1 but greatly lowers the computational complexity such that it can be used in some cases for speeds 0-2. This patch also removes a double exhaustive search for sub 8x8 blocks which also contained a bug (the two searches used different distortion metrics). For best quality in my test animation sequence this patch has almost no impact on quality but improves encode speed by more than 5X. Restricted use in good quality speeds 0-2 yields significant quality gains on the animation test of 0.2 - 0.5 dB with only a small impact on encode speed. On most clips though the quality gain and speed impact are small. Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa
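The multi-stage mesh search described in this commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the libvpx implementation: `MeshPoint`, `MeshStage`, `mesh_search`, `demo_cost`, and `kDemoStages` are hypothetical names, the cost function is a toy quadratic standing in for a real distortion metric, and the preceding step search and its distortion threshold are left out. Only the stage ranges and intervals come from the example in the message.

```c
/* Hypothetical types for illustration; not taken from libvpx. */
typedef struct { int x, y; } MeshPoint;
typedef struct { int range, interval; } MeshStage;
typedef unsigned int (*MeshCostFn)(int x, int y);

/* Stage parameters from the commit message example. */
static const MeshStage kDemoStages[3] = { { 64, 4 }, { 32, 2 }, { 15, 1 } };

/* Toy cost standing in for a distortion metric: squared distance
 * to an arbitrary target at (13, -7). */
static unsigned int demo_cost(int x, int y) {
  const int ex = x - 13;
  const int ey = y + 7;
  return (unsigned int)(ex * ex + ey * ey);
}

/* Runs each stage on a grid centered on the best position found so far.
 * Later stages use a smaller range and a finer interval. */
static MeshPoint mesh_search(MeshPoint center, const MeshStage *stages,
                             int num_stages, MeshCostFn cost) {
  MeshPoint best = center;
  unsigned int best_cost = cost(center.x, center.y);
  int s, dx, dy;
  for (s = 0; s < num_stages; ++s) {
    const MeshPoint c = best; /* fixed center for this stage */
    const int r = stages[s].range;
    const int step = stages[s].interval;
    for (dy = -r; dy <= r; dy += step) {
      for (dx = -r; dx <= r; dx += step) {
        const unsigned int v = cost(c.x + dx, c.y + dy);
        if (v < best_cost) { /* keep the earlier point on ties */
          best_cost = v;
          best.x = c.x + dx;
          best.y = c.y + dy;
        }
      }
    }
  }
  return best;
}
```

Calling `mesh_search` with a start point of (0, 0), `kDemoStages`, and `demo_cost` visits far fewer candidates than a full +/- 64 search at interval 1, which is the complexity saving the message refers to. On a cost surface with several local minima the coarse first stage can miss the global best, which may be why the message calls it "almost as effective".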
2015-07-31 | Give skip_txfm constants names. | Alex Converse
This is using a define instead of an enum to keep byte packing. Change-Id: I3abb07c8bfe377e19be4531b624af7b7b4207792
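The reasoning here, a define instead of an enum to keep byte packing, can be illustrated with a short sketch. The constant names and the `DemoBlockFlags` struct are illustrative assumptions and are not copied from vp9_block.h. The point is that the named values remain plain integer constants stored in `uint8_t` fields, whereas a field declared with an enum type would typically take `sizeof(int)` bytes.

```c
#include <stdint.h>

/* Hypothetical named constants; each value fits in one byte. */
#define DEMO_SKIP_TXFM_NONE 0
#define DEMO_SKIP_TXFM_AC_DC 1
#define DEMO_SKIP_TXFM_AC_ONLY 2

/* Hypothetical per-block flags: one byte per entry, four bytes total. */
typedef struct {
  uint8_t skip_txfm[4];
} DemoBlockFlags;

/* Stores a named constant into the byte-sized slot for one plane. */
static void demo_set_skip(DemoBlockFlags *flags, int plane, uint8_t value) {
  flags->skip_txfm[plane] = value;
}
```

With this layout the struct stays at four bytes while call sites still read as `demo_set_skip(&flags, 2, DEMO_SKIP_TXFM_AC_ONLY)` rather than using bare numbers.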
2015-07-29 | Comment zcoeff_blk. | Alex Converse
Change-Id: Iefc2eb78e71472ecf51802ec59ff32caef4bd0f4
2015-06-29 | VP9: Move ref_mvs[][] and mode_context[] from MB_MODE_INFO | Scott LaVarnway
to MB_MODE_INFO_EXT. This saves 36 bytes per 8x8 area for both the decoder and encoder. (encoder has two MODE_INFO buffers) Change-Id: If006abb2224acaf326df3c2be09e77e967662107
2015-02-04 | Account for chroma component costs in RTC mode decision | Jingning Han
This commit allows the encoder to account for additional chroma plane costs in the mode decision process, if the current block potentially contains significant color change. It improves the visual quality at very low bit-rates. The compression performance of dark720p is improved by 12.39% in speed 6. For jimred at 150 kbps, the PSNR of V component (red) increased by 0.2 dB, at the expense of about 5% increase in encoding time. Note that for sequences where the chroma components are fairly consistent, the encoding time increase is negligible. On average the rtc set compression performance is improved by 1.172% in PSNR and 1.920% in SSIM. Change-Id: Ia55b24ef23a25304f7ec9958fbf07fd6e658505c
2014-12-22 | Revert "Revert "Removal of legacy zbin_extra / zbin_oq_value."" | Jingning Han
This reverts commit 9946ee23e0a4c158e26a505b162a072f81b8a3be. Fix the ssse3 asm function. Change-Id: I07f77a63aa98087626e45c4e87aa5dcafc0b0b07
2014-12-19 | Revert "Removal of legacy zbin_extra / zbin_oq_value." | Paul Wilkins
This reverts commit e9b586e21bb899e247346e82bccf5afb42604910. Change-Id: I5b36e6727da6c05278d97e2c37b80c109f79bed4