path: root/vp9/encoder/vp9_encodemb.c
Age | Commit message | Author
2017-07-06 | cosmetics,vp9/: normalize inv/fwd_txfm naming | James Zern
+ vpx_dsp/, test/ itxfm -> inv_txfm, ftxfm -> fwd_txfm Change-Id: I3aacdb65143576d64cfe5c9b14dd358c17c1fe7e
2017-06-29 | cosmetics,vp9/encoder: s/txm/txfm/ | James Zern
txfm is more commonly used as an abbreviation through the codebase Change-Id: I86fd90ef132468f9da270091c05daa1f5a49ece2
2017-06-23 | Enable greedy version of optimize_b() in VP9 by default. | Urvang Joshi
Improvements were already mentioned in the previous patch: https://chromium-review.googlesource.com/#/c/531675/ Change-Id: I4906ab1c61c25a815bdeb986016fad6dcb69eb71
2017-06-15 | VP9: Add greedy version of av1_optimize_b(). | Urvang Joshi
This was ported from the greedy version in AV1, written by Dake He (dkhe@google.com). See: https://aomedia.googlesource.com/aom/+/master/av1/encoder/encodemb.c#137
Greedy version is disabled by default, but can be picked by setting USE_GREEDY_OPTIMIZE_B to 1. To be enabled by default later.
This is both faster and better in terms of compression.
Compression Improvement:
  lowres: -0.119
  midres: -0.064
  hdres: -0.405
Speed Improvement (based on encode time of 3 videos of different difficulties at 3 different target bitrates):
  With --cpu-used=0: 0.38% to 5.55% faster
  With --cpu-used=1: 0.24% to 2.79% faster
  With --cpu-used=2: 0.29% to 1.46% faster
Change-Id: Ia7a23b3b244ad8eb253ac9e43cd03c5e021d2635
2017-05-03 | Update highbd idct functions arguments to use uint16_t dst | Linfeng Zhang
BUG=webm:1388 Change-Id: I3581d80d0389b99166e70987d38aba2db6c469d5
2017-05-03 | Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct | Linfeng Zhang
BUG=webm:1388 Change-Id: Ida62c941f2b836d6c9e27b427a7d5008ab6dc112
2017-04-26 | VP9: enable trellis for high bitdepth intra | Peter de Rivaz
BUG=webm:1409 Change-Id: I5236595aac1c09386c60ffe8ad621e01422ed5a7
2017-03-17 | Backport "Optimize the use case of token_cost table" to VP9 | Jingning Han
cherry picked from nextgenv2 90ea281f29df747282e56d3068a3ddbdde30cdd0 Change-Id: Ie989e60c6479ac3251cadaac9c7e795ccba52f4e
2017-03-17 | Drop vp9_get_token_extracost | Alex Converse
vp9_get_token_cost does the same thing with one fewer lookup. Change-Id: Ifc110b12403cb1a04a3f91357ab435c67b4815d6
2017-03-16 | vp9_optimize_b: Combine extrabits cost with token lookup | Alex Converse
About 0.6% fewer cycles spent in vp9_optimize_b. Change-Id: I2ae62a78374c594ed81d4e3100a5848e2f6f2c4e
2017-03-03 | Merge "Narrow cat6_high_cost tables to uint16_t" | Alex Converse
2017-03-03 | Narrow cat6_high_cost tables to uint16_t | Alex Converse
Saves 2688 bytes of rodata. Change-Id: I46633b6e50c2845181c70fff6273a8e58fdd1e56
2017-02-16 | Drop zbin_ptr and quant_shift_ptr | Johann
vp9[_highbd]_quantize_fp[_32x32] and vp9_fdct8x8_quant do not make use of these parameters. scan is used for C code and iscan is used for SIMD implementations. Change-Id: I908a0ff7d3febac33da97e0596e040ec7bc18ca5
2017-01-25 | Merge "Fix an overflow warning in optimize_b()" | Hui Su
2017-01-25 | Fix an overflow warning in optimize_b() | hui su
BUG=webm:1361 Change-Id: Ib840bf3b39f7b3c8c017d3488a83434e9a0f45f5
2017-01-24 | Multi-threading of first pass stats collection | Ranjit Kumar Tulabandu
(yunqingwang) 1. Rebased the patch. Incorporated recent first pass changes. 2. Turned on the first pass unit test. Change-Id: Ia2f7ba8152d0b6dd6bf8efb9dfaf505ba7d8edee
2016-12-20 | Remove superfluous conditional on 'shortcut' | Gabriel Marin
Remove superfluous test. Produces a small improvement in instruction scheduling. Measured a 1% to 1.5% reduction in execution time for routine vp9_optimize_b with different compilers. No change in behavior. TEST=Verified that encoded files match bit for bit, with and without this change. BUG=b/33678225 Change-Id: I2bf248d4c25fc0256147d7a8766ff9108ae9cba3
2016-12-19 | Simplify address arithmetic in vp9_optimize_b | Gabriel Marin
Simplify address arithmetic on token_costs to reduce the number of generated instructions that are used for address arithmetic inside routine vp9_optimize_b. It also helps improve instruction scheduling depending on compiler and optimization level. Measured a 9.3% reduction in retired instructions and 5.3% reduction in execution time for this routine with GCC v4.8.4 and optimization flags -O3, and a reduction of up to 11.6% in execution time with other compilers. No change in behavior. TEST=Verified that encoded files match bit for bit, with and without this change. BUG=b/33678225 Change-Id: I6098650fb5cd2aa04e014fe6e68ca20761f3a21f
2016-09-15 | apply clang-format | clang-format
Change-Id: I501597b7c1e0f0c7ae2aea3ee8073f0a641b3487
2016-08-02 | vp9/encoder: apply clang-format | clang-format
Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2
2016-07-29 | Cache optimizations in optimize_b(). | Alex Converse
Move best index into the token state. Shrink it down to one byte. This is more cache friendly (accesses are grouped together) and uses less total memory. Results in 4% fewer cycles in optimize_b(). Change-Id: I75db484fb3dc82f59928d54b659d79c80ee40452
2016-07-13 | Revert "Eliminate isolated and small tail coefficients:" | hui su
This reverts commit ff19cdafdbb5ee470e4369582b0266f4bc23287d. Change-Id: I81f68870ca27a1ff683ee22090530b6997815fb2
2016-07-07 | Enable coeff optimization for intra modes | Jingning Han
This further improves the coding performance: lowres 0.3%, midres 0.5%, hdres 0.6%. Change-Id: I6a03b6da210b9cbc261474bad4a103e0ba021c68
2016-07-07 | Use precise context to estimate coeff rate cost | Jingning Han
Use the precise context to estimate the zero token cost in trellis optimization process. This improves the speed 0 coding performance by 0.15% for lowres and 0.1% for midres. It improves the speed 1 coding performance by 0.2% for midres and hdres. Change-Id: I59c7c08702fc79dc4f8534b64ca594da909e2c91
2016-07-07 | Enable uniform quantization with trellis optimization in speed 0 | Jingning Han
This commit allows the inter prediction residual to use uniform quantization followed by trellis coefficient optimization in speed 0. It improves the coding performance: lowres 0.79%, midres 1.07%, hdres 1.44%. Change-Id: I46ef8cfe042a4ccc7a0055515012cd6cbf5c9619
2016-07-06 | Eliminate isolated and small tail coefficients: | Min Ye
Improve hdres PSNR by 0.696%
Improve midres PSNR by 0.313%
Improve lowres PSNR by 0.142%
Change-Id: Icabde78aa9689f539f6a03ec09f712c20758796c
2016-07-04 | Remove txfrm_block_to_raster_xy() from vp9 encoder | Jingning Han
The transform block row and column positions are always available outside the callees, so there is no need to recompute them. This approach is already used by the decoder. This commit removes the txfrm_block_to_raster_xy() function. Change-Id: I5b90f91a0d8b7c35cfa7d171da9edf8202630108
2016-06-20 | Repack vp9_token_state. | Alex Converse
Reduces size from 32 bytes to 24 bytes on x86_64. Change-Id: I8a22552343a1fc916117f35267fe6a295250f742
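The size reduction described above comes from reordering members so the compiler inserts less alignment padding. The following is a minimal sketch of that general technique, not the actual vp9_token_state definition: the struct names and member lists are hypothetical, and the exact sizes depend on the target ABI (on common 64-bit ABIs the first struct is 32 bytes and the second is 24).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: member order forces padding around the 64-bit field. */
typedef struct {
  int32_t rate;  /* 4 bytes, then padding where int64_t is 8-byte aligned */
  int64_t error; /* 8 bytes */
  int32_t next;  /* 4 bytes */
  int16_t token; /* 2 bytes, then 2 bytes of padding */
  int32_t qc;    /* 4 bytes, then tail padding */
} demo_state_loose;

/* Same members ordered widest first, leaving only tail padding. */
typedef struct {
  int64_t error;
  int32_t rate;
  int32_t next;
  int32_t qc;
  int16_t token;
} demo_state_packed;

size_t loose_size(void) { return sizeof(demo_state_loose); }
size_t packed_size(void) { return sizeof(demo_state_packed); }

/* Sanity check: reordering never makes this pair larger. */
void check_repack(void) { assert(packed_size() <= loose_size()); }
```

Printing loose_size() and packed_size() on the build target is a quick way to confirm whether a reorder actually pays off there.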
2016-06-17 | Refactor optimize_b for speed performance | Jingning Han
This commit refactors the trellis coefficient optimization process. It saves multiplications used to generate the final dequantized coefficients. It removes two memset operations on quantized and dequantized coefficient sets. This improves the unit speed by 10%. Change-Id: I23f47c6e14582520a7f952f03ce8f72183e7f0e6
2016-06-17 | Port optimize_b speed-up from vp10 | Jingning Han
This commit back ports the speed-up from vp10. It improves the unit speed by 15%. Change-Id: Ibe8c0e0974b03266d6abd16a41e89c3b91d8db2a
2016-06-17 | Use 64-bit integer to store distortion in optimize_b | Jingning Han
This fixes the overflow issue. Bug=webm:1241 Change-Id: Ia168b7fae1ad214a6837aaa785a08bf8506987dd
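To illustrate why a 64-bit accumulator matters for distortion, here is a small sketch of a squared-error computation that widens before multiplying. It is not the code from optimize_b; the function names and the use of int32_t coefficients are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Squared error for one coefficient. The operands are widened to 64 bits
 * before the multiply, so a large difference cannot overflow 32 bits. */
int64_t coeff_sq_error(int32_t coeff, int32_t dqcoeff) {
  const int64_t diff = (int64_t)coeff - (int64_t)dqcoeff;
  return diff * diff;
}

/* Sum of squared errors over n coefficients (hypothetical helper). */
int64_t block_sq_error(const int32_t *coeff, const int32_t *dqcoeff, int n) {
  int64_t sum = 0;
  assert(n >= 0);
  for (int i = 0; i < n; ++i) sum += coeff_sq_error(coeff[i], dqcoeff[i]);
  return sum;
}
```

A difference of 100000 squares to 10^10, which does not fit in a 32-bit int but is well within int64_t.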
2016-06-07 | Avoid a potential assertion fail in optimize_b() | hui su
The eob of a block is not properly set when skip_recode is true, thus triggering assert(eob <= default_eob) to fail. Change-Id: Ifecbe33dce2dc4903e0a80bd384dc09bf0dd8a44
2016-05-04 | Change to use proper type in vp{9,10}_token_state | Yaowu Xu
"qc" in vp{9,10}_token_state is used to save quantized coefficients, this commit changes the type from short to tran_low_t to properly reflect the value range for highbitdepth build. This fixes an out-of-range bug when optimize_b is used in highbitdepth build. Change-Id: Ibf330879e6ac6ae8f099e085caa9d3d9a889fde8
2016-04-26 | VP9: adjust trellis quant optimization RD parameters | hui su
Coding gain: lowres 0.64%, midres 0.38%, hdres 0.58%. Change-Id: I233fa2a4b24bd1e15091a5f5ef6aff661f3f50ec
2016-04-26 | VP9: enable trellis quantization optimization for intra blocks | hui su
Coding gain: lowres 0.18%, midres 0.23%, hdres 0.36%. Change-Id: I044c8afbc481fc55b23d440352941071355b0afb
2016-04-21 | vp9_encodemb.c: TODO clean up | Jim Bankoski
huisu did in nextgen branch -> please try in vp9 Change-Id: I0ff35db07ac38464e0e2858e303be686c03a5d0e
2016-01-27 | Switch to 9-bit rate cost constants built on a 256 probability denominator. | Alex Converse
-.220 BDRATE derf: https://x20web.corp.google.com/~aconverse/results/cost256_derf.html
-.675 BDRATE hevcmr: https://x20web.corp.google.com/~aconverse/results/cost256_hevcmr.html
Change-Id: Ifb1646d8ce65ffe0eff9953a911b1b88735b335f
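One reading of "9-bit rate cost constants built on a 256 probability denominator" is that the cost of a symbol with probability p/256 is -log2(p/256) expressed in units of 1/512 bit. The sketch below computes such a value under that assumption; the macro and function names are hypothetical, and the log2 is computed by repeated squaring only to keep the example free of libm.

```c
#include <assert.h>

#define DEMO_COST_SHIFT 9 /* hypothetical name: 9 fractional bits */

/* Cost, in 1/512-bit units, of a symbol with probability p/256 (p in 1..255):
 * round(-log2(p / 256) * 512). */
int demo_prob_cost(int p) {
  assert(p >= 1 && p <= 255);
  int k = 0; /* k = floor(log2(p)) */
  while ((p >> (k + 1)) != 0) ++k;
  double m = (double)p / (double)(1 << k); /* mantissa in [1, 2) */
  double frac = 0.0, weight = 0.5;
  /* Extract fractional bits of log2(m) by repeated squaring. */
  for (int i = 0; i < 20; ++i) {
    m *= m;
    if (m >= 2.0) {
      frac += weight;
      m *= 0.5;
    }
    weight *= 0.5;
  }
  const double bits = 8.0 - ((double)k + frac); /* -log2(p / 256) */
  return (int)(bits * (double)(1 << DEMO_COST_SHIFT) + 0.5);
}
```

Under this reading, a probability of 128/256 costs exactly one bit, which is 512 in these units, and 64/256 costs two bits, or 1024.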
2016-01-21 | Merge "Tie the bit cost scale to a define." | Alex Converse
2016-01-19 | VP9: Eliminate MB_MODE_INFO | Scott LaVarnway
Change-Id: Ifa607dd2bb366ce09fa16dfcad3cc45a2440c185
2016-01-15 | Tie the bit cost scale to a define. | Alex Converse
This is a pure-refactor in preparation to potentially raise the bit-cost resolution. Verified at good speed 0 and rt speed -6. Change-Id: I5347e6e8c28a9ad9dd0aae1d76a3d0f3c2335bb9
2015-09-30 | VP9: remove plane_type from macroblockd_plane | Scott LaVarnway
Change-Id: Ia5072a3a92212d8565f33359f6c146469bdfbbec
2015-08-10 | Move vp9_systemdependent.h to vpx_ports bitops.h and system_state.h | Alex Converse
Use system_state.h in vpx_dsp and remove unneeded includes of vp9_systemdependent.h. Change-Id: I92557ec6dd5aa790160b4f31fe7967db0d7ec3c4
2015-08-06 | Cosmetic - align format in vp9 | Jingning Han
Change-Id: I83ed3422f1f4009675ad2f5c4b7236bc7b83b30e
2015-08-04 | Change vp9_quantize to vpx_quantize | Jingning Han
This commit clears all the vp9_ prefix use case in vpx_dsp. It gets the vp9 folder ready to branch out vp10. Change-Id: I2906eec179ee792b4af8c9b4161313653050e931
2015-07-31 | Give skip_txfm constants names. | Alex Converse
This is using a define instead of an enum to keep byte packing. Change-Id: I3abb07c8bfe377e19be4531b624af7b7b4207792
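The trade-off mentioned above can be sketched as follows: named constants defined with #define can be stored in uint8_t fields, while a field declared with an enum type takes whatever size the compiler picks for that enum (commonly the size of int). All names here are hypothetical and are not the constants introduced by the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PLANES 3

/* Hypothetical named constants, stored as plain bytes. */
#define DEMO_SKIP_OFF 0
#define DEMO_SKIP_PARTIAL 1
#define DEMO_SKIP_FULL 2

/* Equivalent enum, for size comparison only. */
enum demo_skip { DEMO_ENUM_OFF, DEMO_ENUM_PARTIAL, DEMO_ENUM_FULL };

typedef struct { uint8_t skip_txfm[DEMO_PLANES]; } flags_as_bytes;
typedef struct { enum demo_skip skip_txfm[DEMO_PLANES]; } flags_as_enum;

size_t bytes_size(void) { return sizeof(flags_as_bytes); }
size_t enum_size(void) { return sizeof(flags_as_enum); }

/* Store one named constant for a plane; it still occupies a single byte. */
void set_flag(flags_as_bytes *f, int plane, uint8_t value) {
  assert(plane >= 0 && plane < DEMO_PLANES);
  f->skip_txfm[plane] = value;
}
```

The byte-array struct stays at three bytes, whereas the enum-array struct is usually several times larger, which is the packing the commit message wants to preserve.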
2015-07-28 | Replace vp9_ prefix in 2D-DCT functions with vpx_ | Jingning Han
Clean up the forward 2D-DCT function names in vpx_dsp. Change-Id: I3117978596d198b690036e7eb05fe429caf3bc25
2015-07-24 | Merge "Code cleanup in vp9_encode_block_intra" | Hui Su
2015-07-22 | Code cleanup in vp9_encode_block_intra | hui su
Change-Id: Ie4d958b26e586db218f8ee95d5df4bf11f2345a1
2015-07-20 | Refactor highbd forward transform use case | Jingning Han
Separate the hybrid transform case from 2D-DCT case. This will allow us to clear up cross dependency between c and SIMD implementations later. Change-Id: Iaa499e8b096850a1c5a0c50a3b6e63e15d0184bf
2015-07-17 | Migrate quantization functions from vp9/ to vpx_dsp/ | Yunqing Wang
The following quantization functions were moved:
  vp9_quantize_b
  vp9_quantize_b_32x32
  vp9_highbd_quantize_b
  vp9_highbd_quantize_b_32x32
  vp9_quantize_dc
  vp9_quantize_dc_32x32
  vp9_highbd_quantize_dc
  vp9_highbd_quantize_dc_32x32
The purpose of doing that was to allow these functions to be shared by multiple codecs.
Change-Id: Id8ab939f283353cdd07bd930d47db3d932a5d87f