path: root/vp9/encoder/vp9_rdopt.c
Age Commit message Author
2015-11-19Fix unsigned overflow in rd_variance_adjustment.Alex Converse
Found with clang -fsanitize=integer Change-Id: I2538e7483cb2d5f06bceecbd3326bdd88bfecfa1
2015-11-13Changes to exhaustive motion search.paulwilkins
This change alters the nature and use of exhaustive motion search. Firstly, any exhaustive search is preceded by a normal step search. The exhaustive search is only carried out if the distortion resulting from the step search is above a threshold value. Secondly, the simple +/- 64 exhaustive search is replaced by a multi-stage mesh-based search where each stage has a range and step/interval size. Subsequent stages use the best position from the previous stage as the center of the search but use a reduced range and interval size. For example:
stage 1: Range +/- 64 interval 4
stage 2: Range +/- 32 interval 2
stage 3: Range +/- 15 interval 1
This process, especially when it follows on from a normal step search, has shown itself to be almost as effective as a full range exhaustive search with step 1 but greatly lowers the computational complexity such that it can be used in some cases for speeds 0-2. This patch also removes a double exhaustive search for sub 8x8 blocks which also contained a bug (the two searches used different distortion metrics). For best quality in my test animation sequence this patch has almost no impact on quality but improves encode speed by more than 5X. Restricted use in good quality speeds 0-2 yields significant quality gains on the animation test of 0.2 - 0.5 dB with only a small impact on encode speed. On most clips though the quality gain and speed impact are small. Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa
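The staged mesh search described in this message could be sketched roughly as follows. This is only an illustration under stated assumptions: MotionVec, MeshStage, CostFn, mesh_search and the toy cost dist_to_target are hypothetical names invented here, not the libvpx API, and the real encoder uses SAD/variance metrics and clamps candidates to the allowed motion vector range.

```c
#include <assert.h>
#include <stdlib.h>

/* All names below are hypothetical, chosen for this sketch only. */
typedef struct { int row, col; } MotionVec;
typedef struct { int range, interval; } MeshStage;
typedef unsigned int (*CostFn)(int row, int col, void *ctx);

/* Toy cost for demonstration: city-block distance to a target vector. */
static unsigned int dist_to_target(int row, int col, void *ctx) {
  const MotionVec *t = (const MotionVec *)ctx;
  return (unsigned int)(abs(row - t->row) + abs(col - t->col));
}

/* start/start_cost are assumed to come from a preceding step search. */
static MotionVec mesh_search(MotionVec start, unsigned int start_cost,
                             unsigned int threshold, const MeshStage *stages,
                             int num_stages, CostFn cost, void *ctx) {
  MotionVec best = start;
  unsigned int best_cost = start_cost;
  int s;
  /* Only refine when the step search result is above the threshold. */
  if (start_cost <= threshold) return best;
  for (s = 0; s < num_stages; ++s) {
    const MotionVec center = best; /* best position of the previous stage */
    const int range = stages[s].range;
    const int step = stages[s].interval;
    int r, c;
    assert(range >= 0 && step > 0);
    for (r = -range; r <= range; r += step) {
      for (c = -range; c <= range; c += step) {
        const unsigned int d = cost(center.row + r, center.col + c, ctx);
        if (d < best_cost) {
          best_cost = d;
          best.row = center.row + r;
          best.col = center.col + c;
        }
      }
    }
  }
  return best;
}
```

With the three example stages (64/4, 32/2, 15/1) this evaluates 33*33 + 33*33 + 31*31 = 3139 candidates, against 129*129 = 16641 for a full +/- 64 search at step 1.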
2015-11-06Use accurate bit cost for uv_mode in UV intra mode RD selectionhui su
On derflr, +0.1% for VP10; however, -0.03% on VP9. Change-Id: I09c724232ede74254043d61d3cadc506256af0af
2015-10-21Optimize vp9_highbd_block_error_8bit assembly.Geza Lore
A new version of vp9_highbd_block_error_8bit is now available which is optimized with AVX assembly. AVX itself does not buy us too much, but the non-destructive 3 operand format encoding of the 128bit SSEn integer instructions helps to eliminate move instructions. The Sandy Bridge micro-architecture cannot eliminate move instructions in the processor front end, so AVX will help on these machines. Two further optimizations are applied:
1. The common case of computing block error on 4x4 blocks is optimized as a special case.
2. All arithmetic is speculatively done on 32 bits only. At the end of the loop, the code detects if overflow might have happened and if so, the whole computation is re-executed using higher precision arithmetic. This case however is extremely rare in real use, so we can achieve a large net gain here.
The optimizations rely on the fact that the coefficients are in the range [-(2^15-1), 2^15-1], and that the quantized coefficients always have the same sign as the input coefficients (in the worst case they are 0). These are the same assumptions that the old SSE2 assembly code for the non high bitdepth configuration relied on. The unit tests have been updated to take this constraint into consideration when generating test input data. Change-Id: I57d9888a74715e7145a5d9987d67891ef68f39b7
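The "speculate in 32 bits, redo in higher precision if overflow was possible" idea can be illustrated in plain C. This is a sketch only: the function names are hypothetical, and the overflow test used here (a conservative bound from the largest absolute difference) is an assumption made for the example, not necessarily the check the assembly performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Precise path: 64-bit accumulation of squared differences. */
static uint64_t block_error_wide(const int16_t *coeff, const int16_t *dqcoeff,
                                 size_t n) {
  uint64_t err = 0;
  size_t i;
  for (i = 0; i < n; ++i) {
    const int64_t d = (int64_t)coeff[i] - dqcoeff[i];
    err += (uint64_t)(d * d);
  }
  return err;
}

/* Fast path: 32-bit accumulation, re-run in 64 bits only when needed. */
static uint64_t block_error_speculative(const int16_t *coeff,
                                        const int16_t *dqcoeff, size_t n) {
  uint32_t err32 = 0;
  uint32_t max_abs = 0;
  size_t i;
  assert(n <= 65536); /* keeps the 64-bit bound below from overflowing */
  for (i = 0; i < n; ++i) {
    const int32_t d = (int32_t)coeff[i] - dqcoeff[i];
    const uint32_t a = (uint32_t)(d < 0 ? -d : d);
    if (a > max_abs) max_abs = a;
    err32 += a * a; /* unsigned arithmetic: wraps, never undefined */
  }
  /* Assumed detection rule: if n * max_abs^2 fits in 32 bits, no wrap
   * could have occurred and the fast result is exact. */
  if ((uint64_t)max_abs * max_abs * n > UINT32_MAX)
    return block_error_wide(coeff, dqcoeff, n);
  return err32;
}
```

The bound is deliberately pessimistic: it may trigger the slow path when no overflow actually happened, but it never accepts a wrapped result.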
2015-10-08Optimization of 8bit block error for high bitdepthGeza Lore
If high bit depth configuration is enabled, but encoding in profile 0, the code now falls back on optimized SSE2 assembler to compute the block errors, similar to when high bit depth is not enabled. Change-Id: I471d1494e541de61a4008f852dbc0d548856484f
2015-09-30VP9: remove plane_type from macroblockd_planeScott LaVarnway
Change-Id: Ia5072a3a92212d8565f33359f6c146469bdfbbec
2015-09-23Adjust rd calculation in choose_tx_size_from_rdhui su
Coding gain: derflr 0.142% hevclr 0.153% hevcmr 0.124% Change-Id: I63b56ae3a9002c3a266e10e2964135ed43b0ba53
2015-09-09Fix ioc warnings related to sub8x8 reference frameJingning Han
Access scaled reference frame in the sub8x8 rate-distortion optimization loop only when the current test mode is an inter mode. This prevents an ioc warning triggered by sending intra_frame index to fetch scaled reference frame. Change-Id: I6177ecc946651dd86c7ce362e3f65c4074444604
2015-09-09Enable sub8x8 inter mode with scaled ref frame in RD optimizationJingning Han
This commit allows the encoder to include sub8x8 inter mode with scaled reference frame in the rate-distortion optimization scheme. Change-Id: Ibbe9678801592826ef22566566dcdeeb008350d5
2015-08-31Include vpx_dsp_common.h when using VPXMIN/MAXJohann
Change-Id: I2e387a06484a06301f3cd6600c4ba2f4335b61ee
2015-08-26vpx_dsp_common: add VPX prefix to MIN/MAXJames Zern
prevents redeclaration warnings; vp8 has its own define which will be resolved in a future commit Change-Id: Ic941fef3dd4262fcdce48b73075fe6b375f11c9c
2015-08-24Add transform size rate for intra skip mode in rdoptShunyao Li
stdhd +0.226 hevchr +0.091 hevcmr +0.052 derflr +0.033 Change-Id: I84034209c5760609a99bd6e0ce55e02534b72cac
2015-08-12Use sizeof(variable) instead of sizeof(type)hui su
Change-Id: Ia069da11eebb271063e9eb837bdb3e7175ecce13
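As a small illustration of why sizeof(variable) is preferred: the size expression follows the object's declaration, so it stays correct if the type later changes. The RdStats struct and reset_stats function are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct for illustration only. */
typedef struct {
  int64_t rd[4];
  uint8_t skip;
} RdStats;

static void reset_stats(RdStats *stats) {
  assert(stats != NULL);
  /* sizeof(*stats) tracks the pointee type automatically; writing
   * sizeof(RdStats) would silently go stale if the parameter type changed. */
  memset(stats, 0, sizeof(*stats));
}
```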
2015-08-10Move vp9_systemdependent.h to vpx_ports bitops.h and system_state.hAlex Converse
Use system_state.h in vpx_dsp and remove unneeded includes of vp9_systemdependent.h. Change-Id: I92557ec6dd5aa790160b4f31fe7967db0d7ec3c4
2015-08-06Fixed a comment on the compound ref frames.Zoe Liu
Change-Id: I77e397ac9f594c9c4c1db442e334a6ea5f53f588
2015-08-06Cosmetic - align format in vp9Jingning Han
Change-Id: I83ed3422f1f4009675ad2f5c4b7236bc7b83b30e
2015-07-31Compute skippable inside the block_rd_txfm loop.Alex Converse
Change-Id: Iaa43aeeb7a2074495e00cdb83bb551c3f13d3ed2
2015-07-31Simplify model_rd_for_sb HBD ifdefsAlex Converse
Change-Id: Ic1ce346a053800ae3b2d77178f46e6a388357f6d
2015-07-31Simplify dist_block HBD ifdefsAlex Converse
Change-Id: Ic0b4e92cbaf813bcca8a8e9052c936c2e025e114
2015-07-31Merge "Short circuit rate_block in block_rd_txfm."Alex Converse
2015-07-31Give skip_txfm constants names.Alex Converse
This is using a define instead of an enum to keep byte packing. Change-Id: I3abb07c8bfe377e19be4531b624af7b7b4207792
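A minimal sketch of the packing concern, assuming a typical ABI where an enum object has the size of int: storing the flags as uint8_t with #define constants keeps one byte per entry, whereas an enum-typed array would usually take four. The constant names, NUM_BLOCKS and both structs are illustrative, not copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Named constants as plain defines (names are illustrative). */
#define SKIP_TXFM_NONE 0
#define SKIP_TXFM_AC_DC 1
#define SKIP_TXFM_AC_ONLY 2

/* The enum alternative, shown only for the size comparison. */
typedef enum { SKIP_ENUM_NONE, SKIP_ENUM_AC_DC, SKIP_ENUM_AC_ONLY } SkipEnum;

#define NUM_BLOCKS 4 /* hypothetical array length */

typedef struct { uint8_t skip_txfm[NUM_BLOCKS]; } PackedFlags; /* 1 byte each */
typedef struct { SkipEnum skip_txfm[NUM_BLOCKS]; } EnumFlags;  /* usually 4 */

/* Counts entries that carry any skip flag. */
static int count_skipped(const PackedFlags *f) {
  int i, n = 0;
  assert(f != 0);
  for (i = 0; i < NUM_BLOCKS; ++i)
    if (f->skip_txfm[i] != SKIP_TXFM_NONE) ++n;
  return n;
}
```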
2015-07-31Short circuit rate_block in block_rd_txfm.Alex Converse
Don't run rate_block (cost_coeffs) if distortion alone is enough to surpass best_rd. This decreases 2nd pass runtime on HD at speed 2 by about 2%. There is zero effect on output if tx_cache is removed. Change-Id: Ia3b1cc77bfbe6ee988c395fde06c0eb92940b784
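The short circuit can be sketched as below. The helper names (rd_cost, block_rd, counting_rate) and the simplified cost formula are assumptions for this example, not the encoder's actual RDCOST macro or function signatures; the point is only that the rate callback is never invoked when the zero-rate cost already fails to beat best_rd.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical RD cost: lambda-weighted rate (8-bit fixed point) + distortion. */
static int64_t rd_cost(int rdmult, int rate, int64_t dist) {
  return (((int64_t)rate * rdmult + 128) >> 8) + dist;
}

typedef int (*RateFn)(void *ctx);

/* Demo rate function that counts how often it is called. */
static int counting_rate(void *ctx) {
  int *calls = (int *)ctx;
  ++*calls;
  return 512;
}

/* Returns 1 and stores the cost in *rd if the block beats best_rd, else 0. */
static int block_rd(int64_t dist, int rdmult, int64_t best_rd, RateFn rate_fn,
                    void *ctx, int64_t *rd) {
  int rate;
  assert(rdmult >= 0 && rd != 0);
  /* Rate can only add cost, so distortion alone may already rule this out. */
  if (rd_cost(rdmult, 0, dist) >= best_rd) return 0;
  rate = rate_fn(ctx); /* the expensive step, now skipped when hopeless */
  *rd = rd_cost(rdmult, rate, dist);
  return *rd < best_rd;
}
```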
2015-07-30Remove tx cache and speed up tx size selectionYunqing Wang
1. The RD scores obtained during the tx size selection were stored in the tx cache, and used to help make the tx decision for the following frames. This was no longer used in the VP9 encoder. Recovered the related decision making code from 1.5+ years ago, and borg tests didn't show any quality gain. This patch removed it to lower the complexity. 2. An optimization was done after the above refactoring. If the tx_mode is not TX_MODE_SELECT, we only need to test the chosen tx size instead of all possible tx sizes. This gave a 1.5% average speed gain at speed 2, and a 1% average speed gain at speed 3. Change-Id: Id8cd650e066a8cef33829d8c15388a8138adc78c
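The second point can be sketched as a loop whose lower bound depends on tx_mode. The enums below are redeclared locally for a self-contained example, and choose_tx_size, TxRdFn, RdTable and table_rd are hypothetical names; the mapping from mode to largest allowed size is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32, TX_SIZES } TxSize;
typedef enum {
  ONLY_4X4, ALLOW_8X8, ALLOW_16X16, ALLOW_32X32, TX_MODE_SELECT
} TxMode;

typedef int64_t (*TxRdFn)(TxSize tx, void *ctx);

/* Demo RD evaluator: table lookup that counts its invocations. */
typedef struct { int64_t cost[TX_SIZES]; int calls; } RdTable;
static int64_t table_rd(TxSize tx, void *ctx) {
  RdTable *t = (RdTable *)ctx;
  ++t->calls;
  return t->cost[tx];
}

static TxSize choose_tx_size(TxMode tx_mode, TxSize max_tx, TxRdFn rd_fn,
                             void *ctx) {
  /* Assumed mapping from tx_mode to the largest size it permits. */
  static const TxSize biggest[] = { TX_4X4, TX_8X8, TX_16X16, TX_32X32,
                                    TX_32X32 };
  const TxSize limit = max_tx < biggest[tx_mode] ? max_tx : biggest[tx_mode];
  /* Search every size only in TX_MODE_SELECT; otherwise test just one. */
  const int end = tx_mode == TX_MODE_SELECT ? (int)TX_4X4 : (int)limit;
  TxSize best = limit;
  int64_t best_rd = INT64_MAX;
  int n;
  assert(max_tx < TX_SIZES);
  for (n = (int)limit; n >= end; --n) {
    const int64_t rd = rd_fn((TxSize)n, ctx);
    if (rd < best_rd) {
      best_rd = rd;
      best = (TxSize)n;
    }
  }
  return best;
}
```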
2015-07-30Merge "Convert simple_model_rd_from_var from a speed check to a speed feature."Alex Converse
2015-07-30Convert simple_model_rd_from_var from a speed check to a speed feature.Alex Converse
Change-Id: I8877025e172fff29bc4e270790211463b676b4d7
2015-07-30Cleanup rdcost_block_argsAlex Converse
Change-Id: I9d613cbe9e76b5dd15e935878ef9fd04521690ba
2015-07-28Replace vp9_ prefix in 2D-DCT functions with vpx_Jingning Han
Clean up the forward 2D-DCT function names in vpx_dsp. Change-Id: I3117978596d198b690036e7eb05fe429caf3bc25
2015-07-20vpx_dsp/bitreader.h: vp9_->vpx_Yaowu Xu
Replace vp9_ in names to vpx_ as they are not codec specific. Change-Id: I2e583aa63dee769353ada4b42417aa15c4074ebb
2015-07-20Refactor highbd forward transform use caseJingning Han
Separate the hybrid transform case from 2D-DCT case. This will allow us to clear up cross dependency between c and SIMD implementations later. Change-Id: Iaa499e8b096850a1c5a0c50a3b6e63e15d0184bf
2015-07-13Refactor intra block prediction functionJingning Han
This commit simplifies the intra block boundary condition logic. It removes the block index from the argument set. Change-Id: If00142512eb88992613d6609356dfd73ba390138
2015-07-08Changes to use of rectangular partitions.paulwilkins
Changes to allow more use of rectangular partitions at speeds 1 and 2 for content classed by the first pass as animation and for blocks near the active image edge. This has quite a big impact in quality for the animated test sequence but also hurts encode speed for speed 2. For other content types the impact on both speed and quality is small. Added some plumbing for detection of internal vertical image edges. Change-Id: I3fc48de2349f8cb87946caaf0b06dbb0ea261a9a
2015-07-08Change speed and rd features for formatting bars.paulwilkins
Change speed features / behavior for split mode when there is an internal active edge (e.g. formatting bars). Remove some threshold constraints in rd code near the active edge of the image. Add some plumbing for left and right active edge detection. Patch set 5. Limit rd pass through for sub 8x8 to internal active edges. This takes away any speed penalty for most clips but keeps the enhanced edge coding for the more critical case of internal image edges. Change-Id: If644e4762874de4fe9cbb0a66211953fa74c13a5
2015-07-07Move sub pixel variance to vpx_dspJohann
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
2015-07-06Merge "Move subtract functions from vp9 to vpx_dsp"Jingning Han
2015-07-06remove vp9_get_interp_kernel()James Zern
expose filter_kernels[] and do the table lookup directly Change-Id: I0b10bff0327c3e01a723736141a9ffd377cd3d20
2015-07-06Move subtract functions from vp9 to vpx_dspJingning Han
Factor out the subtraction operator as common function. Change-Id: I526e703477c6a290e0e3e3c8898f8bb1ca82779b
2015-06-29VP9: Move ref_mvs[][] and mode_context[] from MB_MODE_INFOScott LaVarnway
to MB_MODE_INFO_EXT. This saves 36 bytes per 8x8 area for both the decoder and encoder. (encoder has two MODE_INFO buffers) Change-Id: If006abb2224acaf326df3c2be09e77e967662107
2015-06-22Remove tile paramScott LaVarnway
and added to MACROBLOCKD. Change-Id: I0e60aaa9f84bcc9f2376d71bd934f251baee38db
2015-06-11inline vp9_get_segdata()Scott LaVarnway
and change name. Change-Id: I706645cf9d9dc04f1b3b6ac80df80edb7f101854
2015-06-11inline vp9_segfeature_active()Scott LaVarnway
and changed name. Change-Id: Ie023ca66cc2c823032f58d4faeb53fd1863c94f3
2015-06-04Reducing size of MODE_INFO structScott LaVarnway
Reduced size from 124 bytes to 104 bytes. For decode only builds, it is reduced to 68 bytes. Change-Id: If9e6b92285459425fa086ab5a743d0a598a69de3
2015-05-22Re-worked header filesScott LaVarnway
Various header/test files had to be re-worked in order to build "Remove cm parameter from vp9_decode_block_tokens()". This patch reverts the "Remove cm" part and only contains the re-worked header files. Change-Id: I520958a88d1991fee988a3c784d0eac40e117a32
2015-05-13Relocate memory operations for common codeJohann
With the sad functions, and hopefully the variance functions soon, moving to the vpx_dsp location, place the defines used in the reference C code in a common location. Change-Id: I4c8ce7778eb38a0a3ee674d2f1c488eda01cfeca
2015-05-07replace DECLARE_ALIGNED_ARRAY w/DECLARE_ALIGNEDJames Zern
this macro was used inconsistently and only differs in behavior from DECLARE_ALIGNED when an alignment attribute is unavailable. this macro is used with calls to assembly, while generic c-code doesn't rely on it, so in a c-only build without an alignment attribute the code will function as expected. Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79
2015-04-28vpx_mem: remove vpx_memsetJames Zern
vestigial. replace instances with memset() which they already were being defined to. Change-Id: Ie030cfaaa3e890dd92cf1a995fcb1927ba175201
2015-04-28vpx_mem: remove vpx_memcpyJames Zern
vestigial. replace instances with memcpy() which they already were being defined to. Change-Id: Icfd1b0bc5d95b70efab91b9ae777ace1e81d2d7c
2015-04-28vpx_mem: remove vpx_memmoveJames Zern
vestigial. replace instances with memmove() which they already were being defined to. Change-Id: If396d3f9e3cf79c0ee5d7429615ef3d6b2a34afa
2015-04-21Revert "Remove mi_grid_* structures."Scott LaVarnway
(see I3a05cf1610679fed26e0b2eadd315a9ae91afdd6) For the test clip used, the decoder performance improved by ~2%. This is also an intermediate step towards adding back the mode_info streams. Change-Id: Idddc4a3f46e4180fbebddc156c4bbf177d5c2e0d
2015-04-01Refactor block_yrd function for RTC coding modeJingning Han
This commit separates Hadamard transform/quantization operations from rate and distortion computation in block_yrd. This allows one to skip SATD computation when all transform blocks are quantized to zero. It also uses a new block error function that skips repeated computation of sum of squared residuals. It reduces the CPU cycles spent on block error calculation in block_yrd by 40%. Change-Id: I726acb2454b44af1c3bd95385abecac209959b10
2015-03-25Remove 8-bit array in HBDAdrian Grange
Creating both 8- and 16-bit arrays and then only using one of them is wasteful. Change-Id: Ic5b397c283efaff7bcfff2d2413838ba3e065561