Age | Commit message | Author |
|
Change-Id: I045b4cf625d428109688303ced5433d824df2790
|
|
Unify the transform and quantization process for 4x4 - 16x16
transform block sizes. This doesn't affect the encoding speed
visibly. Remove it to reduce the maintenance load.
Change-Id: Ifbf20bf8554ecf7970a6279a2b783b1c58fac6e4
|
|
BUG=webm:1606
Change-Id: I661485b860243c95b6450035dbac77b0dd4d9ff4
|
|
1: Lower rdmult used in trellis optimization
2: Shut off the end of block optimization that tries end of block
at every sub position if any of the coefficients are > 1.
3: Change the rounding and zbin factor according to sharpness.
4: Disable the skip block check that calculates RD using SSE from
predictor.
Change-Id: I247b61a26fa22f12f8b684e7cd6d4e368de7c3e4
|
|
To save a branch.
Change-Id: Ifa2be7583e95c6991784731c654bbd4cce31e993
|
|
Remove trailing commas to keep multiple elements on one line.
Add blank lines to prevent comments from being treated as blocks.
Add clang-format guards for a struct with a comment in the middle.
Change-Id: I3bcb8313ae8aaf69179249a13b4087b1272cdbc0
|
|
This condition is handled before this code is reached. The ssse3 version
of the function has always crashed when attempting to handle the
skip_block condition.
Add assert() and comments regarding the usage of skip_block.
Removing the parameter is a fairly involved process so leave it be for
the moment.
Change-Id: Ib299f6fc6589d7ee102262cc74a7aeb60110bc5a
|
|
The greedy version was already enabled by default here:
https://chromium-review.googlesource.com/c/546848/
And the speed+compression gains from greedy version were already
mentioned here:
https://chromium-review.googlesource.com/c/531675/
Change-Id: Iad9f7d03490c845ad1e230af028c9d39edddca97
|
|
Reduces memory usage, and speeds up encoding for some difficult clips.
No impact on output or metrics.
Ported from aomedia patch:
https://aomedia-review.googlesource.com/c/14501
Change-Id: I26ec69af8336f9e80da486a1cfbfc89a3596954d
|
|
+ vpx_dsp/, test/
itxfm -> inv_txfm, ftxfm -> fwd_txfm
Change-Id: I3aacdb65143576d64cfe5c9b14dd358c17c1fe7e
|
|
txfm is more commonly used as an abbreviation throughout the codebase
Change-Id: I86fd90ef132468f9da270091c05daa1f5a49ece2
|
|
Improvements were already mentioned in the previous patch:
https://chromium-review.googlesource.com/#/c/531675/
Change-Id: I4906ab1c61c25a815bdeb986016fad6dcb69eb71
|
|
This was ported from the greedy version in AV1, written by Dake He
(dkhe@google.com).
See:
https://aomedia.googlesource.com/aom/+/master/av1/encoder/encodemb.c#137
Greedy version is disabled by default, but can be picked by setting
USE_GREEDY_OPTIMIZE_B to 1.
To be enabled by default later.
This is both faster and better in terms of compression.
Compression Improvement:
------------------------
lowres: -0.119
midres: -0.064
hdres: -0.405
Speed Improvement:
------------------
(Based on encode time of 3 videos of different difficulties at
3 different target bitrates)
With --cpu-used=0: 0.38% to 5.55% faster
With --cpu-used=1: 0.24% to 2.79% faster
With --cpu-used=2: 0.29% to 1.46% faster
Change-Id: Ia7a23b3b244ad8eb253ac9e43cd03c5e021d2635
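The compile-time switch described in this message could look roughly like the sketch below. Only the macro name USE_GREEDY_OPTIMIZE_B comes from the commit message. The function names, the path identifiers, and the label "trellis" for the non-greedy path are hypothetical placeholders, not the actual libvpx code.

```c
#include <assert.h>

/* Default to the existing path unless the build overrides the macro,
 * for example with -DUSE_GREEDY_OPTIMIZE_B=1. */
#ifndef USE_GREEDY_OPTIMIZE_B
#define USE_GREEDY_OPTIMIZE_B 0
#endif

/* Hypothetical identifiers for the two optimization paths. */
enum { SK_PATH_TRELLIS = 0, SK_PATH_GREEDY = 1 };

/* Hypothetical placeholders standing in for the two implementations. */
int sk_optimize_b_trellis(void) { return SK_PATH_TRELLIS; }
int sk_optimize_b_greedy(void) { return SK_PATH_GREEDY; }

/* Dispatches to whichever variant was selected at compile time.
 * The condition is a constant, so the compiler can drop the unused call. */
int sk_optimize_b(void) {
  /* Guard against macro values other than 0 or 1. */
  assert(USE_GREEDY_OPTIMIZE_B == 0 || USE_GREEDY_OPTIMIZE_B == 1);
  return USE_GREEDY_OPTIMIZE_B ? sk_optimize_b_greedy()
                               : sk_optimize_b_trellis();
}
```

Keeping both variants behind one macro lets the two paths be compared on the same build before the default is changed.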
|
|
BUG=webm:1388
Change-Id: I3581d80d0389b99166e70987d38aba2db6c469d5
|
|
BUG=webm:1388
Change-Id: Ida62c941f2b836d6c9e27b427a7d5008ab6dc112
|
|
BUG=webm:1409
Change-Id: I5236595aac1c09386c60ffe8ad621e01422ed5a7
|
|
cherry picked from nextgenv2 90ea281f29df747282e56d3068a3ddbdde30cdd0
Change-Id: Ie989e60c6479ac3251cadaac9c7e795ccba52f4e
|
|
vp9_get_token_cost does the same thing with one fewer lookup.
Change-Id: Ifc110b12403cb1a04a3f91357ab435c67b4815d6
|
|
About 0.6% fewer cycles spent in vp9_optimize_b.
Change-Id: I2ae62a78374c594ed81d4e3100a5848e2f6f2c4e
|
|
|
|
Saves 2688 bytes of rodata.
Change-Id: I46633b6e50c2845181c70fff6273a8e58fdd1e56
|
|
vp9[_highbd]_quantize_fp[_32x32] and vp9_fdct8x8_quant do not make use
of these parameters.
scan is used for C code and iscan is used for SIMD implementations.
Change-Id: I908a0ff7d3febac33da97e0596e040ec7bc18ca5
|
|
|
|
BUG=webm:1361
Change-Id: Ib840bf3b39f7b3c8c017d3488a83434e9a0f45f5
|
|
(yunqingwang)
1. Rebased the patch. Incorporated recent first pass changes.
2. Turned on the first pass unit test.
Change-Id: Ia2f7ba8152d0b6dd6bf8efb9dfaf505ba7d8edee
|
|
Remove superfluous test. Produces a small improvement in instruction scheduling.
Measured a 1% to 1.5% reduction in execution time for routine vp9_optimize_b
with different compilers.
No change in behavior.
TEST=Verified that encoded files match bit for bit, with and without this
change.
BUG=b/33678225
Change-Id: I2bf248d4c25fc0256147d7a8766ff9108ae9cba3
|
|
Simplify address arithmetic on token_costs to reduce the number of generated
instructions that are used for address arithmetic inside routine
vp9_optimize_b. It also helps improve instruction scheduling depending on
compiler and optimization level.
Measured a 9.3% reduction in retired instructions and 5.3% reduction in
execution time for this routine with GCC v4.8.4 and optimization flags -O3,
and a reduction of up to 11.6% in execution time with other compilers.
No change in behavior.
TEST=Verified that encoded files match bit for bit, with and without this
change.
BUG=b/33678225
Change-Id: I6098650fb5cd2aa04e014fe6e68ca20761f3a21f
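As an illustration of the general idea of simplifying address arithmetic on a cost table, the sketch below computes a row pointer once instead of rebuilding the full index on every lookup. This is an assumption about the kind of change involved, not the actual patch. The table dimensions, the flat layout, and all sk_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table dimensions; not the real libvpx constants. */
enum { SK_BANDS = 6, SK_CTXS = 6, SK_TOKENS = 11 };

/* Recomputes the full (band, ctx, token) index on every lookup. */
unsigned int sk_sum_indexed(const unsigned int *costs, int band, int ctx,
                            const int *tokens, size_t n) {
  unsigned int sum = 0;
  for (size_t i = 0; i < n; ++i) {
    assert(tokens[i] >= 0 && tokens[i] < SK_TOKENS);
    sum += costs[((size_t)band * SK_CTXS + (size_t)ctx) * SK_TOKENS +
                 (size_t)tokens[i]];
  }
  return sum;
}

/* Computes the row pointer once; the loop only adds the token offset. */
unsigned int sk_sum_hoisted(const unsigned int *costs, int band, int ctx,
                            const int *tokens, size_t n) {
  const unsigned int *row =
      costs + ((size_t)band * SK_CTXS + (size_t)ctx) * SK_TOKENS;
  unsigned int sum = 0;
  for (size_t i = 0; i < n; ++i) {
    assert(tokens[i] >= 0 && tokens[i] < SK_TOKENS);
    sum += row[tokens[i]];
  }
  return sum;
}
```

In a loop this small a compiler may hoist the row computation on its own. The effect on a larger routine depends on the compiler and optimization level, as the commit message notes.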
|
|
Change-Id: I501597b7c1e0f0c7ae2aea3ee8073f0a641b3487
|
|
Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2
|
|
Move best index into the token state. Shrink it down to one byte. This
is more cache friendly (accesses are grouped together) and uses less total
memory.
Results in 4% fewer cycles in optimize_b().
Change-Id: I75db484fb3dc82f59928d54b659d79c80ee40452
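A minimal sketch of the layout change described above: the best index moves from a separate array of ints into a one-byte field of the token state. The field names, field types, and the block size are assumptions made for illustration. The real structure may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SK_MAX_COEFFS = 1024 }; /* hypothetical: one 32x32 block */

/* Before: the chosen path lives in a separate int array (not shown here),
 * so every state visit touches two unrelated memory regions. */
typedef struct {
  int64_t error;
  int rate;
  int16_t next;
  int16_t token;
  int16_t qc;
} sk_state_old;

/* After: a one-byte choice is stored next to the data it describes. */
typedef struct {
  int64_t error;
  int rate;
  int16_t next;
  int16_t token;
  int16_t qc;
  uint8_t best_index; /* 0 or 1 */
} sk_state_new;

/* Total bytes for the old layout: states plus the separate int array. */
size_t sk_bytes_old(void) {
  return (size_t)(SK_MAX_COEFFS + 1) * 2 * (sizeof(sk_state_old) + sizeof(int));
}

/* Total bytes for the new layout: states only. */
size_t sk_bytes_new(void) {
  return (size_t)(SK_MAX_COEFFS + 1) * 2 * sizeof(sk_state_new);
}

/* Records which of the two candidate paths was chosen for this state. */
void sk_set_best(sk_state_new *s, int idx) {
  assert(idx == 0 || idx == 1);
  s->best_index = (uint8_t)idx;
}
```

With this particular field order the extra byte fits into padding the structure already has on common ABIs, so the separate array disappears without the state growing.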
|
|
This reverts commit ff19cdafdbb5ee470e4369582b0266f4bc23287d.
Change-Id: I81f68870ca27a1ff683ee22090530b6997815fb2
|
|
This further improves the coding performance by
lowres 0.3%
midres 0.5%
hdres 0.6%
Change-Id: I6a03b6da210b9cbc261474bad4a103e0ba021c68
|
|
Use the precise context to estimate the zero token cost in trellis
optimization process. This improves the speed 0 coding performance
by 0.15% for lowres and 0.1% for midres. It improves the speed 1
coding performance by 0.2% for midres and hdres.
Change-Id: I59c7c08702fc79dc4f8534b64ca594da909e2c91
|
|
This commit allows the inter prediction residual to use uniform
quantization followed by trellis coefficient optimization in
speed 0. It improves the coding performance by
lowres 0.79%
midres 1.07%
hdres 1.44%
Change-Id: I46ef8cfe042a4ccc7a0055515012cd6cbf5c9619
|
|
Improve hdres PSNR by 0.696%
Improve midres PSNR by 0.313%
Improve lowres PSNR by 0.142%
Change-Id: Icabde78aa9689f539f6a03ec09f712c20758796c
|
|
The transform block row and column positions are always available
outside the callees. There is no need to re-compute these values
again. This approach has been used by the decoder. This commit
removes txfrm_block_to_raster_xy() function.
Change-Id: I5b90f91a0d8b7c35cfa7d171da9edf8202630108
|
|
Reduces size from 32 bytes to 24 bytes on x86_64.
Change-Id: I8a22552343a1fc916117f35267fe6a295250f742
|
|
This commit refactors the trellis coefficient optimization process.
It saves multiplications used to generate the final dequantized
coefficients. It removes two memset operations on quantized
and dequantized coefficient sets. This improves the unit speed
by 10%.
Change-Id: I23f47c6e14582520a7f952f03ce8f72183e7f0e6
|
|
This commit back ports the speed-up from vp10. It improves the
unit speed by 15%.
Change-Id: Ibe8c0e0974b03266d6abd16a41e89c3b91d8db2a
|
|
This fixes the overflow issue.
Bug=webm:1241
Change-Id: Ia168b7fae1ad214a6837aaa785a08bf8506987dd
|
|
The eob of a block is not properly set when skip_recode is true,
thus triggering assert(eob <= default_eob) to fail.
Change-Id: Ifecbe33dce2dc4903e0a80bd384dc09bf0dd8a44
|
|
"qc" in vp{9,10}_token_state is used to save quantized coefficients. This
commit changes its type from short to tran_low_t to properly reflect
the value range for highbitdepth builds.
This fixes an out-of-range bug when optimize_b is used in highbitdepth
build.
Change-Id: Ibf330879e6ac6ae8f099e085caa9d3d9a889fde8
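The out-of-range problem can be illustrated with a small sketch. It assumes the coefficient type is 32 bits wide in a highbitdepth build and that coefficients can exceed the 16-bit range there. The type names, the structure, and the value 40000 used in the usage note are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the wide coefficient type. */
typedef int32_t sk_coeff_t;

/* Hypothetical token state holding the quantized coefficient. */
typedef struct {
  sk_coeff_t qc;
} sk_token_state;

/* Returns 1 when `coeff` could be stored in a 16-bit short without loss. */
int sk_fits_in_short(sk_coeff_t coeff) {
  return coeff >= INT16_MIN && coeff <= INT16_MAX;
}

/* Stores the coefficient in the wide field and checks nothing was lost. */
void sk_store_qc(sk_token_state *s, sk_coeff_t coeff) {
  s->qc = coeff;
  assert(s->qc == coeff);
}
```

For example, sk_fits_in_short(40000) returns 0, so a short field could not hold that value, while sk_store_qc keeps it intact in the wider field.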
|
|
Coding gain:
lowres 0.64%
midres 0.38%
hdres 0.58%
Change-Id: I233fa2a4b24bd1e15091a5f5ef6aff661f3f50ec
|
|
Coding gain:
lowres 0.18%
midres 0.23%
hdres 0.36%
Change-Id: I044c8afbc481fc55b23d440352941071355b0afb
|
|
huisu did in nextgen branch -> please try in vp9
Change-Id: I0ff35db07ac38464e0e2858e303be686c03a5d0e
|
|
-.220 BDRATE derf: https://x20web.corp.google.com/~aconverse/results/cost256_derf.html
-.675 BDRATE hevcmr: https://x20web.corp.google.com/~aconverse/results/cost256_hevcmr.html
Change-Id: Ifb1646d8ce65ffe0eff9953a911b1b88735b335f
|
|
|
|
Change-Id: Ifa607dd2bb366ce09fa16dfcad3cc45a2440c185
|
|
This is a pure-refactor in preparation to potentially raise the bit-cost
resolution.
Verified at good speed 0 and rt speed -6.
Change-Id: I5347e6e8c28a9ad9dd0aae1d76a3d0f3c2335bb9
|
|
Change-Id: Ia5072a3a92212d8565f33359f6c146469bdfbbec
|