summaryrefslogtreecommitdiff
path: root/vp9/encoder/vp9_rdopt.c
AgeCommit message (Collapse)Author
2016-01-27Switch to 9-bit rate cost constants built on a 256 probability denominator.Alex Converse
-.220 BDRATE derf: https://x20web.corp.google.com/~aconverse/results/cost256_derf.html -.675 BDRATE hevcmr: https://x20web.corp.google.com/~aconverse/results/cost256_hevcmr.html Change-Id: Ifb1646d8ce65ffe0eff9953a911b1b88735b335f
2016-01-21Merge "Tie the bit cost scale to a define."Alex Converse
2016-01-19VP9: Eliminate MB_MODE_INFOScott LaVarnway
Change-Id: Ifa607dd2bb366ce09fa16dfcad3cc45a2440c185
2016-01-15Tie the bit cost scale to a define.Alex Converse
This is a pure-refactor in preparation to potentially raise the bit-cost resolution. Verified at good speed 0 and rt speed -6. Change-Id: I5347e6e8c28a9ad9dd0aae1d76a3d0f3c2335bb9
2016-01-13VP9: Remove decoder args from find_mv_refs_idx()Scott LaVarnway
The decoder does not use this function. Change-Id: Ie67f909c0f4108ef286789c70df867d4b960a780
2016-01-07Enable encoder to avoid 8x4 or 4x8 partitionsYaowu Xu
This commit enables encoder to avoid 8x4 and 4x8 partitions for scaled reference frames when libvpx is configured and built with --enable-better-hw-compatibility Change-Id: I02ad65c386f5855f4325d72570c49164ed52f413
2016-01-07Fix a typoYaowu Xu
Change-Id: I12de2dd5e5f375551804166188d76a9ad8067b41
2015-12-11Fix sub8x8 motion search on scaled reference frameJingning Han
This commit makes the sub8x8 block rate-distortion optimization scheme use precise motion compensated prediction to compute the rd cost. It fixes a potential buffer overflow issue related to sub8x8 motion search on scaled reference frame. Change-Id: I4274992ef4f54eaacfde60db045e269c13aaa2de
2015-11-19Fix unsigned overflow in rd_variance_adjustment.Alex Converse
Found with clang -fsanitize=integer Change-Id: I2538e7483cb2d5f06bceecbd3326bdd88bfecfa1
2015-11-13Changes to exhaustive motion search.paulwilkins
This change alters the nature and use of exhaustive motion search. Firstly any exhaustive search is preceded by a normal step search. The exhaustive search is only carried out if the distortion resulting from the step search is above a threshold value. Secondly the simple +/- 64 exhaustive search is replaced by a multi stage mesh based search where each stage has a range and step/interval size. Subsequent stages use the best position from the previous stage as the center of the search but use a reduced range and interval size. For example: stage 1: Range +/- 64 interval 4 stage 2: Range +/- 32 interval 2 stage 3: Range +/- 15 interval 1 This process, especially when it follows on from a normal step search, has shown itself to be almost as effective as a full range exhaustive search with step 1 but greatly lowers the computational complexity such that it can be used in some cases for speeds 0-2. This patch also removes a double exhaustive search for sub 8x8 blocks which also contained a bug (the two searches used different distortion metrics). For best quality in my test animation sequence this patch has almost no impact on quality but improves encode speed by more than 5X. Restricted use in good quality speeds 0-2 yields significant quality gains on the animation test of 0.2 - 0.5 db with only a small impact on encode speed. On most clips though the quality gain and speed impact are small. Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa
2015-11-06Use accurate bit cost for uv_mode in UV intra mode RD selectionhui su
On derflr, +0.1% for VP10; however, -0.03% on VP9. Change-Id: I09c724232ede74254043d61d3cadc506256af0af
2015-10-21Optimize vp9_highbd_block_error_8bit assembly.Geza Lore
A new version of vp9_highbd_error_8bit is now available which is optimized with AVX assembly. AVX itself does not buy us too much, but the non-destructive 3 operand format encoding of the 128bit SSEn integer instructions helps to eliminate move instructions. The Sandy Bridge micro-architecture cannot eliminate move instructions in the processor front end, so AVX will help on these machines. Further 2 optimizations are applied: 1. The common case of computing block error on 4x4 blocks is optimized as a special case. 2. All arithmetic is speculatively done on 32 bits only. At the end of the loop, the code detects if overflow might have happened and if so, the whole computation is re-executed using higher precision arithmetic. This case however is extremely rare in real use, so we can achieve a large net gain here. The optimizations rely on the fact that the coefficients are in the range [-(2^15-1), 2^15-1], and that the quantized coefficients always have the same sign as the input coefficients (in the worst case they are 0). These are the same assumptions that the old SSE2 assembly code for the non high bitdepth configuration relied on. The unit tests have been updated to take this constraint into consideration when generating test input data. Change-Id: I57d9888a74715e7145a5d9987d67891ef68f39b7
2015-10-08Optimization of 8bit block error for high bitdepthGeza Lore
If high bit depth configuration is enabled, but encoding in profile 0, the code now falls back on optimized SSE2 assembler to compute the block errors, similar to when high bit depth is not enabled. Change-Id: I471d1494e541de61a4008f852dbc0d548856484f
2015-09-30VP9: remove plane_type from macroblockd_planeScott LaVarnway
Change-Id: Ia5072a3a92212d8565f33359f6c146469bdfbbec
2015-09-23Adjust rd calculation in choose_tx_size_from_rdhui su
Coding gain: derflr 0.142% hevclr 0.153% hevcmr 0.124% Change-Id: I63b56ae3a9002c3a266e10e2964135ed43b0ba53
2015-09-09Fix ioc warnings related to sub8x8 reference frameJingning Han
Access scaled reference frame in the sub8x8 rate-distortion optimization loop only when the current test mode is an inter mode. This prevents an ioc warning triggered by sending intra_frame index to fetch scaled reference frame. Change-Id: I6177ecc946651dd86c7ce362e3f65c4074444604
2015-09-09Enable sub8x8 inter mode with scaled ref frame in RD optimizationJingning Han
This commit allows the encoder to include sub8x8 inter mode with scaled reference frame in the rate-distortion optimization scheme. Change-Id: Ibbe9678801592826ef22566566dcdeeb008350d5
2015-08-31Include vpx_dsp_common.h when using VPXMIN/MAXJohann
Change-Id: I2e387a06484a06301f3cd6600c4ba2f4335b61ee
2015-08-26vpx_dsp_common: add VPX prefix to MIN/MAXJames Zern
prevents redeclaration warnings; vp8 has its own define which will be resolved in a future commit Change-Id: Ic941fef3dd4262fcdce48b73075fe6b375f11c9c
2015-08-24Add transform size rate for intra skip mode in rdoptShunyao Li
stdhd +0.226 hevchr +0.091 hevcmr +0.052 derflr +0.033 Change-Id: I84034209c5760609a99bd6e0ce55e02534b72cac
2015-08-12Use sizeof(variable) instead of sizeof(type)hui su
Change-Id: Ia069da11eebb271063e9eb837bdb3e7175ecce13
2015-08-10Move vp9_systemdependent.h to vpx_ports bitops.h and system_state.hAlex Converse
Use system_state.h in vpx_dsp and remove unneeded includes of vp9_systemdependent.h. Change-Id: I92557ec6dd5aa790160b4f31fe7967db0d7ec3c4
2015-08-06Fixed a comment on the compound ref frames.Zoe Liu
Change-Id: I77e397ac9f594c9c4c1db442e334a6ea5f53f588
2015-08-06Cosmetic - align format in vp9Jingning Han
Change-Id: I83ed3422f1f4009675ad2f5c4b7236bc7b83b30e
2015-07-31Compute skippable inside the block_rd_txfm loop.Alex Converse
Change-Id: Iaa43aeeb7a2074495e00cdb83bb551c3f13d3ed2
2015-07-31Simplify model_rd_for_sb HBD ifdefsAlex Converse
Change-Id: Ic1ce346a053800ae3b2d77178f46e6a388357f6d
2015-07-31Simplify dist_block HBD ifdefsAlex Converse
Change-Id: Ic0b4e92cbaf813bcca8a8e9052c936c2e025e114
2015-07-31Merge "Short circuit rate_block in block_rd_txfm."Aℓex Converse
2015-07-31Give skip_txfm constants names.Alex Converse
This is using a define instead of an enum to keep byte packing. Change-Id: I3abb07c8bfe377e19be4531b624af7b7b4207792
2015-07-31Short circuit rate_block in block_rd_txfm.Alex Converse
Don't run rate_block (cost_coeffs) if distortion alone is enough to surpass best_rd. This decreases 2nd pass runtime on HD at speed 2 by about 2%. There is zero effect on output if tx_cache is removed. Change-Id: Ia3b1cc77bfbe6ee988c395fde06c0eb92940b784
2015-07-30Remove tx cache and speed up tx size selectionYunqing Wang
1. The RD scores obtained during the tx size selection were stored in the tx cache, and used to help make the tx decision for the following frames. This wasn't used anymore in VP9 encoder. Recovered the related decision making code from 1.5+ years ago, and borg tests didn't show any quality gain. This patch removed it to lower the complexity. 2. An optimization was done after the above refactoring. If the tx_mode is not TX_MODE_SELECT, we only need to test the chosen tx size instead of all posible tx sizes. This gave a 1.5% average speed gain at speed 2, and a 1% average speed gain at speed 3. Change-Id: Id8cd650e066a8cef33829d8c15388a8138adc78c
2015-07-30Merge "Convert simple_model_rd_from_var from a speed check to a speed feature."Aℓex Converse
2015-07-30Convert simple_model_rd_from_var from a speed check to a speed feature.Alex Converse
Change-Id: I8877025e172fff29bc4e270790211463b676b4d7
2015-07-30Cleanup rdcost_block_argsAlex Converse
Change-Id: I9d613cbe9e76b5dd15e935878ef9fd04521690ba
2015-07-28Replace vp9_ prefix in 2D-DCT functions with vpx_Jingning Han
Clean up the forward 2D-DCT function names in vpx_dsp. Change-Id: I3117978596d198b690036e7eb05fe429caf3bc25
2015-07-20vpx_dsp/bitreader.h: vp9_->vpx_Yaowu Xu
Replace vp9_ in names to vpx_ as they are not codec specific. Change-Id: I2e583aa63dee769353ada4b42417aa15c4074ebb
2015-07-20Refactor highbd forward transform use caseJingning Han
Separate the hybrid transform case from 2D-DCT case. This will allow us to clear up cross dependency between c and SIMD implementations later. Change-Id: Iaa499e8b096850a1c5a0c50a3b6e63e15d0184bf
2015-07-13Refactor intra block prediction functionJingning Han
This commit simplifies the intra block boundary condition logic. It removes the block index from the argument set. Change-Id: If00142512eb88992613d6609356dfd73ba390138
2015-07-08Changes to use of rectangular partitions.paulwilkins
Changes to allow more use of rectangular partitions at speeds 1 and 2 for content classed by the first pass as animation and for blocks near the active image edge. This has quite a big impact in quality for the animated test sequence but also hurts encode speed for speed 2. For other content types the impact on both speed and quality is small. Added some plumbing for detection of internal vertical image edges. Change-Id: I3fc48de2349f8cb87946caaf0b06dbb0ea261a9a
2015-07-08Change speed and rd features for formatting bars.paulwilkins
Change speed features / behavior for split mode when there is an internal active edge (e.g. formatting bars). Remove some threshold constraints in rd code near the active edge of the image. Add some plumbing for left and right active edge detection. Patch set 5. Limit rd pass through for sub 8x8 to internal active edges. This takes away any speed penalty for most clips but keeps the enhanced edge coding for the more critical case of internal image edges Change-Id: If644e4762874de4fe9cbb0a66211953fa74c13a5
2015-07-07Move sub pixel variance to vpx_dspJohann
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
2015-07-06Merge "Move subtract functions from vp9 to vpx_dsp"Jingning Han
2015-07-06remove vp9_get_interp_kernel()James Zern
expose filter_kernels[] and do the table lookup directly Change-Id: I0b10bff0327c3e01a723736141a9ffd377cd3d20
2015-07-06Move subtract functions from vp9 to vpx_dspJingning Han
Factor out the subtraction operator as common function. Change-Id: I526e703477c6a290e0e3e3c8898f8bb1ca82779b
2015-06-29VP9: Move ref_mvs[][] and mode_context[] from MB_MODE_INFOScott LaVarnway
to MB_MODE_INFO_EXT. This saves 36 bytes per 8x8 area for both the decoder and encoder. (encoder has two MODE_INFO buffers) Change-Id: If006abb2224acaf326df3c2be09e77e967662107
2015-06-22Remove tile paramScott LaVarnway
and added to MACROBLOCKD. Change-Id: I0e60aaa9f84bcc9f2376d71bd934f251baee38db
2015-06-11inline vp9_get_segdata()Scott LaVarnway
and change name. Change-Id: I706645cf9d9dc04f1b3b6ac80df80edb7f101854
2015-06-11inline vp9_segfeature_active()Scott LaVarnway
and changed name. Change-Id: Ie023ca66cc2c823032f58d4faeb53fd1863c94f3
2015-06-04Reducing size of MODE_INFO structScott LaVarnway
Reduced size from 124 bytes to 104 bytes. For decode only builds, it is reduced to 68 bytes. Change-Id: If9e6b92285459425fa086ab5a743d0a598a69de3
2015-05-22Re-worked header filesScott LaVarnway
Various header/test files had to be re-worked in order to build "Remove cm parameter from vp9_decode_block_tokens()". This patch reverts the "Remove cm" part and only contains the re-worked header files. Change-Id: I520958a88d1991fee988a3c784d0eac40e117a32