path: root/vp9/encoder/vp9_rd.h
Age | Commit message | Author
2018-05-14 | Make a config time flag | Yaowu Xu
This commit replaces a hard-coded macro with a macro defined by a configure command. Change-Id: Ib31354d61865314ed43e2c429c72b4ef2c8fa2a7
2018-05-14 | Fixes for consistent encoding across recodes of a frame | Ranjit Kumar Tulabandu
Change-Id: I094bca857f0fc2c067a4d08d1b36370fe61c25aa
2017-11-13 | New content type to improve grain retention. | paulwilkins
For the new VP9-only content type, adjust the rate distortion and ARF filter based on the relative spatial variance of the source and reconstruction. With regard to the RD loop, the method favors modes where the reconstruction variance is similar to the source variance. However, it is currently only applied to regions where the source variance is quite low. For very low variance blocks it applies a further bias against intra coding and large prediction block sizes (the latter in particular limit the usefulness of the loop filter). The final part of this change is to lower the strength of the ARF filter for blocks where the source has very low spatial variance, to encourage some low amplitude texture or noise to pass through the filter. This change improves the retention of film grain and fine noise / texture in spatially flat regions but, as expected, causes a significant drop in PSNR on many clips, because similar but misaligned noise or texture will give a lower PSNR than a flat noise free reconstruction. However, it is worth noting that most clips show a strong gain in FAST SSIM. The features are enabled on the vpxenc command line by setting --tune-content=film. VPX_ENCODER_ABI_VERSION bumped for this change and cvbr. Change-Id: I26a4e4edfa3dc5cacead82fa701fe7a9118ccd0a
2017-09-08 | Fix bug in intra mode rd penalty. | paulwilkins
The intra mode rd penalty was implemented as a rate penalty. Code was added to scale the penalty according to block size, but this was not done correctly for the SB level or sub 8x8. The code also did a weird double scaling with regard to bit depth, which has been removed: given that it is a rate penalty, the bit depth should not matter. This bug fix improves average metrics on our standard test sets by about 0.1%. Change-Id: I7cf81b66aad0cda389fe234f47beba01c7493b1e
2017-04-24 | Make the row based multi-threaded encoder deterministic | Yunqing Wang
This patch followed the allow_exhaustive_searches feature modification and continued to modify the encoder to achieve determinism in row-based multi-threaded encoding. With row-mt = 1 and multiple threads, the adaptive feature in the encoder was disabled, which gave a BDRate gain (at speed 1, -0.6% ~ -0.7%; at speed 2, -0.46% ~ -0.59%) but some encoder speed loss (7% ~ 10% at speed 1 and 3% ~ 6% at speed 2). These speed losses were acceptable considering the speed gains obtained from row-mt. Change-Id: I60d87a25346ebc487a864b57d559f560b7e398bb
2017-03-21 | vp9: Enable adaptive_rd_threshold for row mt for realtime speed 8. | Jerome Jiang
Change it to a row-based array to avoid the slowdown caused by sync. With row-mt on, speed 8, 2 threads: ~4% speedup for VGA on ARM from adaptive_rd_threshold. Change-Id: I887e65a53af20a6c4f48d293daaee09dab3512cf
2017-03-16 | Add a vector form of routine vp9_model_rd_from_var_lapndz | Gabriel Marin
Add routine vp9_model_rd_from_var_lapndz_vec and call it from model_rd_for_sb to model the rate and distortion for MAX_MB_PLANE Laplacian sources in parallel. The caller ensures that all sources have non-zero variance. Measured an 18% to 25% reduction in retired instructions, and a 17% to 24% reduction in instruction execution cost with different compilers for the Laplacian modeling. No change in behavior. TEST=Verified that encoded files match bit for bit, with and without this change. BUG=b/33678225 Change-Id: I6b76947f21c659a349adb896e13e99f6e3f951e6
2017-02-15 | Row based multi-threading of encoding stage | Ranjit Kumar Tulabandu (Yunqing Wang)
This patch implements row-based multi-threading within tiles in the encoding pass, and substantially speeds up the multi-threaded encoder in VP9. Speed tests at speed 1 on the STDHD set (using 4 tiles) show that the average speedup of the encoding pass (the second pass in 2-pass encoding) is 7% while using 2 threads, 16% while using 4 threads, 85% while using 8 threads, and 116% while using 16 threads. Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de
2017-01-24 | Initialize errorperbit and sabperbit in ARNR filtering | Ranjit Kumar Tulabandu (Yunqing)
This patch added the missing initialization in the temporal filter. Borg test BDRate results: PSNR: -0.019% (lowres); -0.013% (hdres); SSIM: -0.001% (lowres); -0.010% (hdres). Other q values gave comparable but no better results. Change-Id: I7ad0c18b39e6f558342688e2fe1e12fdb133ce9b
2016-08-02 | vp9/encoder: apply clang-format | clang-format
Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2
2016-02-09 | Restore previous motion search bit-error scale. | Alex Converse
The bit to error transformation got doubled as a result of going from 8-bit to 9-bit costs (change d13385c). Use defines to derive the scale numbers and comment some of the fields. derf: -0.023 BDRATE; hevcmr: +0.067 BDRATE; stdhd: +0.098 BDRATE. (These are substantially smaller than the original gains from 8 to 9 bit costing.) Change-Id: I6a2b3b029b2f1415e4f90a05709b2333ec0eea9b
2016-01-22 | Short circuit flat blocks when coding screen content at realtime speed. | Alex Converse
In inter mode search, skip all modes except NEARESTMV and DC_PRED. 10% less encode latency for large frames using the chromium remoting_perftests. +0.313% BDRATE on the screencast set at speed -6. Change-Id: Ib97a39dd8bcdeab545509e0e02d78ce7033f8c63
2016-01-15 | Tie the bit cost scale to a define. | Alex Converse
This is a pure refactor in preparation for potentially raising the bit-cost resolution. Verified at good speed 0 and rt speed -6. Change-Id: I5347e6e8c28a9ad9dd0aae1d76a3d0f3c2335bb9
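As an illustration of what tying a cost scale to a define can look like, here is a minimal sketch. All names (`SKETCH_PROB_COST_SHIFT`, `sketch_bits_to_cost`, `sketch_cost_to_bits`) are hypothetical and are not taken from vp9_rd.h; the shift value of 8 is an assumption chosen only because the 2016-02-09 entry mentions a move from 8-bit to 9-bit costs.

```c
#include <assert.h>

/* Hypothetical fixed-point resolution for bit costs: one whole bit is
 * represented as (1 << SKETCH_PROB_COST_SHIFT) cost units. */
#define SKETCH_PROB_COST_SHIFT 8
#define SKETCH_ONE_BIT_COST (1 << SKETCH_PROB_COST_SHIFT)

/* Convert a whole number of bits to cost units. */
static int sketch_bits_to_cost(int bits) {
  return bits * SKETCH_ONE_BIT_COST;
}

/* Convert cost units back to bits, rounding to the nearest bit.
 * Every scale factor is derived from the define, so raising the
 * resolution means changing one line. */
static int sketch_cost_to_bits(int cost) {
  assert(cost >= 0);
  return (cost + (SKETCH_ONE_BIT_COST >> 1)) >> SKETCH_PROB_COST_SHIFT;
}
```

Because no caller hard-codes 256 or a shift of 8, changing the define to 9 would double the resolution without touching call sites, which is the kind of follow-up the commit message anticipates.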
2015-07-27 | Remove tx_select_threshes | Yunqing Wang
Removed unused tx_select_threshes and tx_select_diff. Change-Id: I5e9e7ad170056efe14b5f071e94d0c5a36e4a34c
2015-05-15 | vp9: correct some function signatures | James Zern
silences missing prototype warnings Change-Id: Idaf68d83d2cb03847f3ee002c4d00c2ac79da604
2015-03-06 | vp9_ethread: fix me consts initialization to support aq_mode=3 encoding | Yunqing Wang
When "--aq_mode=3" is turned on, the quantizers are updated by each thread. Fixed the me consts initialization function to make sure that the correct thread data are updated. Change-Id: Ied27bb7bae76fc3fa2cda4f8c35ac0b46271bef4
2015-03-04 | Make encoder buffer allocation dynamic | Adrian Grange
Frame buffers are now allocated dynamically on-demand. Entries in the reference frame map, cm->ref_frame_map, may now be set to -1 (INVALID_IDX) to indicate that there is not a valid reference buffer in that "slot". All slots in the reference frame map are now initialized to the empty state (-1) and each buffer is initialized to have a reference count of 0. Change-Id: Id1afe98de98db4ae8b2dfefed7889c3b28c68582
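The slot and reference-count convention described in this entry can be sketched roughly as follows. This is not the libvpx code: the types, sizes, and function names are hypothetical, and the `allocated` flag merely stands in for the real on-demand allocation.

```c
#include <assert.h>

#define SKETCH_INVALID_IDX (-1) /* marks an empty reference slot */
#define SKETCH_REF_SLOTS 8      /* arbitrary size for this sketch */
#define SKETCH_POOL_SIZE 4      /* arbitrary size for this sketch */

typedef struct {
  int ref_count; /* number of holders of this buffer */
  int allocated; /* stand-in for "memory has been allocated on demand" */
} sketch_buf;

typedef struct {
  int ref_frame_map[SKETCH_REF_SLOTS]; /* slot -> pool index or INVALID */
  sketch_buf pool[SKETCH_POOL_SIZE];
} sketch_ref_state;

/* All slots start empty (-1) and every buffer starts with ref_count 0. */
static void sketch_init(sketch_ref_state *s) {
  int i;
  for (i = 0; i < SKETCH_REF_SLOTS; ++i)
    s->ref_frame_map[i] = SKETCH_INVALID_IDX;
  for (i = 0; i < SKETCH_POOL_SIZE; ++i) {
    s->pool[i].ref_count = 0;
    s->pool[i].allocated = 0;
  }
}

/* Take the first unreferenced buffer, "allocating" it on first use.
 * The caller holds one reference. Returns INVALID if none is free. */
static int sketch_get_free_buffer(sketch_ref_state *s) {
  int i;
  for (i = 0; i < SKETCH_POOL_SIZE; ++i) {
    if (s->pool[i].ref_count == 0) {
      s->pool[i].allocated = 1;
      s->pool[i].ref_count = 1;
      return i;
    }
  }
  return SKETCH_INVALID_IDX;
}

/* Point a slot at a buffer, releasing whatever the slot held before. */
static void sketch_assign_slot(sketch_ref_state *s, int slot, int buf_idx) {
  int old;
  assert(slot >= 0 && slot < SKETCH_REF_SLOTS);
  assert(buf_idx >= 0 && buf_idx < SKETCH_POOL_SIZE);
  old = s->ref_frame_map[slot];
  if (old != SKETCH_INVALID_IDX) --s->pool[old].ref_count;
  s->ref_frame_map[slot] = buf_idx;
  ++s->pool[buf_idx].ref_count;
}
```

Code that reads a slot must check for the invalid index before indexing the pool, which is the main consequence of allowing empty slots.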
2014-12-24 | Enable sub8x8 inter block search for RTC coding mode | Jingning Han
This commit enables sub8x8 inter block coding for RTC mode. The use of sub8x8 blocks can be turned on by allowing the choose_partitioning function to select 4x4/4x8/8x4 block sizes. Change-Id: Ifbf1fb3888fe4c094fc85158ac3aa89867d8494a
2014-11-24 | vp9_ethread: modify VP9_COMP structure | Yunqing Wang
This patch modified struct VP9_COMP. Created a struct ThreadData to include data that needs to be copied for each thread. In the multi-thread case, one thread processes one tile. All threads share one copy of VP9_COMP (refer to VP9_COMP *cpi in the code), but each thread has its own copy of ThreadData (refer to ThreadData *td in the code). Therefore, within the scope of encode_tiles(), both cpi and td need to be passed as function parameters. In the single-thread case, the FRAME_COUNTS pointer in ThreadData points to "counts" in VP9_COMMON. Change-Id: Ib37908b2d8e2c0f4f9c18f38017df5ce60e8b13e
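The shared-versus-per-thread split described here might be sketched as below. The struct and function names are hypothetical stand-ins, not the real VP9_COMP, ThreadData, or FRAME_COUNTS definitions, and the sketch runs the "threads" sequentially to stay self-contained.

```c
#include <assert.h>

/* Stand-in for a counts structure. */
typedef struct {
  unsigned int blocks_coded;
} mt_counts;

/* Stand-in for common state, which owns the frame-level counts. */
typedef struct {
  mt_counts counts;
} mt_common;

/* Stand-in for the single encoder instance shared by all threads. */
typedef struct {
  mt_common common;
  int num_workers;
} mt_comp;

/* Stand-in for per-thread data: each thread gets its own copy. */
typedef struct {
  mt_counts *counts;    /* where this thread accumulates */
  mt_counts own_counts; /* private storage used when multi-threaded */
} mt_thread_data;

/* Single thread: point straight at the shared counts.
 * Multiple threads: point at private storage to avoid contention. */
static void mt_td_init(mt_comp *cpi, mt_thread_data *td) {
  td->own_counts.blocks_coded = 0;
  td->counts = (cpi->num_workers > 1) ? &td->own_counts : &cpi->common.counts;
}

/* Tile-level work takes both the shared cpi and the thread's td. */
static void mt_encode_tile(const mt_comp *cpi, mt_thread_data *td,
                           unsigned int blocks_in_tile) {
  (void)cpi; /* shared, read-only configuration would be consulted here */
  td->counts->blocks_coded += blocks_in_tile;
}

/* After the workers finish, fold private counts into the shared ones. */
static void mt_accumulate(mt_comp *cpi, const mt_thread_data *td) {
  if (td->counts != &cpi->common.counts)
    cpi->common.counts.blocks_coded += td->counts->blocks_coded;
}
```

Passing both pointers keeps mutable per-thread state out of the shared structure, so the shared copy can be treated as read-only while tiles are being encoded.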
2014-11-20 | vp9_ethread: move filter_cache out of RD_OPT struct | Yunqing Wang
Similar to mask_filter, the filter_cache in RD_OPT struct can be moved out, and declared as a local variable since it is only used in pick_inter_mode functions. Change-Id: I412b99cca82bade07ac912064ec03dd1de6b2c17
2014-11-20 | vp9_ethread: change mask_filter to a local variable | Yunqing Wang
The mask_filter in RD_OPT struct is used to record rd result in filter decision. It is only used in pick_inter_mode functions, and is removed from the struct and declared as a local variable. Change-Id: I3c95c8632ba7241591ce00ef2ef5677b5e297d7b
2014-11-14 | Code cleanup: remove unused members in RD_OPT | Yunqing Wang
These 2 members in RD_OPT were moved to TileDataEnc struct already, and therefore were removed here. Change-Id: I22fee3b67f96e473a58e194a7edc76dbd48bfa04
2014-11-14 | vp9_ethread: combine encoder counts in separate struct | Yunqing Wang
Several frame counters in encoder are updated at SB level. Combine those counters and put them in a separate struct, which allows us to allocate one copy for each thread. Change-Id: I00366296a13c0ada4d8fa12f5e07728388b6cab7
2014-10-30 | Refactor vp9_update_rd_thresh_fact | Jingning Han
Reduce the scope of function parameters. Change-Id: Ifef2cfb559908a97498ffdbd6ea53da1cd45a73c
2014-10-29 | Enable mode search threshold update in non-RD coding mode | Jingning Han
Adaptively adjust the mode thresholds after each mode search round to skip checking modes that are less likely to be selected. Local tests indicate a 5% - 10% speed-up at speed -5 and -6. Average coding performance loss is -1.055%.
speed -5
vidyo1 720p 1000 kbps: 16533 b/f, 40.851 dB, 12607 ms -> 16556 b/f, 40.796 dB, 11831 ms
nik 720p 1000 kbps: 33229 b/f, 39.127 dB, 11468 ms -> 33235 b/f, 39.131 dB, 10919 ms
speed -6
vidyo1 720p 1000 kbps: 16549 b/f, 40.268 dB, 10138 ms -> 16538 b/f, 40.212 dB, 8456 ms
nik 720p 1000 kbps: 33271 b/f, 38.433 dB, 7886 ms -> 33279 b/f, 38.416 dB, 7843 ms
Change-Id: I2c2963f1ce4ed9c1cf233b5b2c880b682e1c1e8b
2014-10-15 | Add init and reset functions for RD_COST struct | Jingning Han
Change-Id: I2902de7051a883fd22e27a655209233733969cfd
2014-10-13 | Refactor rate distortion cost structure | Jingning Han
This commit makes a struct that contains rate value, distortion value, and the rate-distortion cost. The goal is to provide a better interface for rate-distortion related operations. It is first used in rd_pick_partition and saves a few RDCOST calculations. Change-Id: I1a6ab7b35282d3c80195af59b6810e577544691f
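A struct of this shape could look roughly like the sketch below. The type and function names are hypothetical, and the cost formula is a simplified Lagrangian (`lambda * rate + dist`) chosen for illustration; it is an assumption and not the RDCOST macro from the codebase.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bundle of the three values the commit message lists. */
typedef struct {
  int rate;       /* estimated bits, in some cost unit */
  int64_t dist;   /* distortion, e.g. a sum of squared errors */
  int64_t rdcost; /* combined cost, cached so it is computed once */
} rd_cost_sketch;

/* Simplified Lagrangian cost; an assumption made for this sketch. */
static int64_t toy_rdcost(int lambda, int rate, int64_t dist) {
  return (int64_t)lambda * rate + dist;
}

/* Fill in all three fields together so rdcost never goes stale. */
static void rd_cost_set(rd_cost_sketch *c, int lambda, int rate,
                        int64_t dist) {
  c->rate = rate;
  c->dist = dist;
  c->rdcost = toy_rdcost(lambda, rate, dist);
}

/* Compare two candidates using the cached costs, with no recomputation. */
static const rd_cost_sketch *rd_cost_better(const rd_cost_sketch *a,
                                            const rd_cost_sketch *b) {
  return (b->rdcost < a->rdcost) ? b : a;
}
```

Caching the combined cost next to its inputs is one plausible way such a struct "saves a few RDCOST calculations": a comparison between partition candidates reads the stored value instead of recomputing it.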
2014-09-25 | Adds various high bit-depth encode functions | Deb Mukherjee
Change-Id: I6f67b171022bbc8199c6d674190b57f6bab1b62f
2014-09-22 | Adaptive mode search scheduling | Jingning Han
This commit enables an adaptive mode search order scheduling scheme in the rate-distortion optimization. It changes the compression performance by -0.433% and -0.420% for derf and stdhd respectively. It provides a speed improvement at speed 3:
bus CIF 1000 kbps: 24590 b/f, 35.513 dB, 7864 ms -> 24696 b/f, 35.491 dB, 7408 ms (6% speed-up)
stockholm 720p 1000 kbps: 8983 b/f, 35.078 dB, 65698 ms -> 8962 b/f, 35.054 dB, 60298 ms (8%)
old_town_cross 720p 1000 kbps: 11804 b/f, 35.666 dB, 62492 ms -> 11778 b/f, 35.609 dB, 56040 ms (10%)
blue_sky 1080p 1500 kbps: 57173 b/f, 36.179 dB, 77879 ms -> 57199 b/f, 36.131 dB, 69821 ms (10%)
pedestrian_area 1080p 2000 kbps: 74241 b/f, 41.105 dB, 144031 ms -> 74271 b/f, 41.091 dB, 133614 ms (8%)
Change-Id: Iaad28cbc99399030fc5f9951eb5aa7fa633f320e
2014-07-02 | Cleanup vp9_rd. | Alex Converse
Change-Id: I39a37335ba5b3a969d328afb1f425ddb2cf7ddda
2014-07-02 | Split vp9_rdopt into vp9_rdopt and vp9_rd. | Alex Converse
vp9_rdopt is for making rd optimal mode decisions. vp9_rd is for all other rd related routines. Anything used outside of making an rd optimal decision belongs in rd. Change-Id: I772a3073f7588bdf139f551fb9810b6864d8e64b