Age | Commit message | Author |
|
vp9_get_token_cost does the same thing with one fewer lookup.
Change-Id: Ifc110b12403cb1a04a3f91357ab435c67b4815d6
|
|
About 0.6% fewer cycles spent in vp9_optimize_b.
Change-Id: I2ae62a78374c594ed81d4e3100a5848e2f6f2c4e
|
|
Saves 2688 bytes of rodata.
Change-Id: I46633b6e50c2845181c70fff6273a8e58fdd1e56
|
|
Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2
|
|
Change-Id: Iac7bc0d0eba459a04688aae224d34ae9b59742db
|
|
About a 5% faster overall encode (perf cycles) at speed zero!
Change-Id: Iaf013ba75884415cd824e98349f654ffb1c3ef33
|
|
-.220 BDRATE derf: https://x20web.corp.google.com/~aconverse/results/cost256_derf.html
-.675 BDRATE hevcmr: https://x20web.corp.google.com/~aconverse/results/cost256_hevcmr.html
Change-Id: Ifb1646d8ce65ffe0eff9953a911b1b88735b335f
|
|
Repack TOKENEXTRA fields.
Speed impact is within the measurement margin of error.
Change-Id: I9a6d1dde1bb4a0766b02d0cb74c871ddde907cde
|
|
Change-Id: I5a2abd35cb303d8f6354b3119ab95acf90405116
|
|
Change prefix vp9_ to vpx_ for non-codec-specific functions and data
structures.
Change-Id: I97c7e6422eceea99212b93f4942bc2187763a07c
|
|
Replace vp9_ with vpx_ in names, as they are not codec specific.
Change-Id: I2e583aa63dee769353ada4b42417aa15c4074ebb
|
|
Change-Id: I1e32bf8f6872a6fb7e9cabe86483e94805e2f790
|
|
Change-Id: Iabe8a8868a747626c24bb13f1796f4c7827af367
|
|
Change-Id: I28a3d342a4a4b23e02a0f47bb8037c4403f71d61
|
|
Change-Id: Iff528c4b7528cc70320343b3a7ce07a92b024dfd
|
|
This patch modifies struct VP9_COMP and creates a struct ThreadData
to hold the data that needs to be copied for each thread. In the
multi-threaded case, one thread processes one tile. All threads
share one copy of VP9_COMP
(refer to VP9_COMP *cpi in the code),
but each thread has its own copy of ThreadData
(refer to ThreadData *td in the code).
Therefore, within the scope of encode_tiles(), both cpi and td
need to be passed as function parameters.
In the single-threaded case, the FRAME_COUNTS pointer in ThreadData
points to "counts" in VP9_COMMON.
Change-Id: Ib37908b2d8e2c0f4f9c18f38017df5ce60e8b13e
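The shared/per-thread split described above can be sketched roughly as
follows. This is only an illustration: the struct layouts are heavily
reduced, hypothetical stand-ins (Comp for VP9_COMP, Common for VP9_COMMON,
FrameCounts for FRAME_COUNTS), and encode_tile() and
encode_tiles_single_thread() are made-up names, not the actual encoder
functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the real encoder structs. */
typedef struct { int token_counts[4]; } FrameCounts;
typedef struct { FrameCounts counts; } Common;          /* ~ VP9_COMMON */
typedef struct { Common common; int num_tiles; } Comp;  /* ~ VP9_COMP   */
typedef struct { FrameCounts *counts; } ThreadData;     /* per thread   */

/* Per-tile work reads shared state through cpi and writes only via td. */
static void encode_tile(const Comp *cpi, ThreadData *td, int tile) {
  assert(cpi != NULL && td != NULL && td->counts != NULL);
  assert(tile >= 0 && tile < cpi->num_tiles);
  td->counts->token_counts[tile % 4] += 1;
}

/* Single-thread case: td.counts points at the counts in the common struct. */
static void encode_tiles_single_thread(Comp *cpi) {
  ThreadData td;
  td.counts = &cpi->common.counts;
  for (int tile = 0; tile < cpi->num_tiles; ++tile) {
    encode_tile(cpi, &td, tile);
  }
}
```

In a multi-threaded variant each worker would presumably own a separate
FrameCounts and point its ThreadData at it, so that no two threads write to
the same counters.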
|
|
This commit enables the encoder to skip the split partition search if
the bigger block size has all non-zero quantized coefficients in the
low-frequency area and the total rate cost is below a certain threshold.
It logarithmically scales the rate threshold according to the
current block size. For speed 3, the compression performance loss:
derf -0.093%
stdhd -0.066%
Local experiments show 4% - 20% encoding speed-up for speed 3.
blue_sky_1080p, 1500 kbps
51051 b/f, 35.891 dB, 67236 ms ->
50554 b/f, 35.857 dB, 59270 ms (12% speed-up)
old_town_cross_720p, 1500 kbps
14431 b/f, 36.249 dB, 57687 ms ->
14108 b/f, 36.172 dB, 46586 ms (19% speed-up)
pedestrian_area_1080p, 1500 kbps
50812 b/f, 40.124 dB, 100439 ms ->
50755 b/f, 40.118 dB, 96549 ms (4% speed-up)
mobile_calendar_720p, 1000 kbps
10352 b/f, 35.055 dB, 51837 ms ->
10172 b/f, 35.003 dB, 44076 ms (15% speed-up)
Change-Id: I412e34db49060775b3b89ba1738522317c3239c8
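A minimal sketch of the skip test described above. The exact scaling rule is
not given in this message, so the formula here (base threshold multiplied by
log2 of the pixel count) is an assumption, and all function and parameter
names are hypothetical.

```c
#include <assert.h>

/* Integer log2 of a positive power of two (e.g. 64 -> 6). */
static int log2_pow2(int n) {
  int bits = 0;
  assert(n > 0 && (n & (n - 1)) == 0);
  while (n > 1) {
    n >>= 1;
    ++bits;
  }
  return bits;
}

/* Assumed rule: the threshold grows with log2(width * height). */
static int scaled_rate_thr(int base_thr, int width, int height) {
  return base_thr * (log2_pow2(width) + log2_pow2(height));
}

/* Skip the split search only if both conditions from the message hold. */
static int can_skip_split(int all_coeffs_low_freq, int rate, int base_thr,
                          int width, int height) {
  return all_coeffs_low_freq &&
         rate < scaled_rate_thr(base_thr, width, height);
}
```

Under this assumed rule the same rate can allow skipping the split search for
a 64x64 block while still requiring it for an 8x8 block, since the larger
block gets the larger threshold.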
|
|
Miscellaneous bug-fixes for high bitdepth functionality.
With this patch, high bit-depth profiles become mostly functional,
except for an intermittent assert failure issue that is being
tracked.
Change-Id: I6a7fcbdcf1e5b09842e88535f8442d2e1230748c
|
|
Change-Id: I6f67b171022bbc8199c6d674190b57f6bab1b62f
|
|
The largest value is 13358.
Change-Id: I7a6b024a92b6250933d9ebc0cad066b966c96bd4
|
|
Change-Id: I4f51ce859a97bf1b8fd2b37ac585b7c643232b69
|
|
Change-Id: I40a070c353663e82c59e174d7c92eb84f72ed808
|
|
In the decoder we don't need to save eobs; we can pass eob as an argument.
This change therefore removes the eob arrays from VP9Decompressor and
TileWorkerData, and moves the eob pointer from macroblockd_plane to
macroblock_plane.
Change-Id: I8eb919acc837acfb3abdd8319af63d1bbca8217a
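The idea of passing eob along instead of storing it can be sketched as below.
This is an illustration only: decode_tokens(), pick_inverse_transform() and
reconstruct_block() are hypothetical names, and the three-way transform
choice is an assumption rather than the decoder's actual logic.

```c
#include <assert.h>
#include <stddef.h>

enum { SKIP_BLOCK = 0, DC_ONLY = 1, FULL_TRANSFORM = 2 };

/* Returns eob: one past the last non-zero coefficient in scan order. */
static int decode_tokens(const int *coeffs, int count) {
  int eob = 0;
  assert(coeffs != NULL && count >= 0);
  for (int i = 0; i < count; ++i) {
    if (coeffs[i] != 0) eob = i + 1;
  }
  return eob;
}

/* The consumer takes eob as an argument rather than reading a saved array. */
static int pick_inverse_transform(int eob) {
  if (eob == 0) return SKIP_BLOCK;
  return eob == 1 ? DC_ONLY : FULL_TRANSFORM;
}

/* eob lives only as a local between token decoding and reconstruction. */
static int reconstruct_block(const int *coeffs, int count) {
  const int eob = decode_tokens(coeffs, count);
  return pick_inverse_transform(eob);
}
```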
|
|
Renaming constants for consistency:
DCT_VAL_CATEGORY1 => CATEGORY1_TOKEN
DCT_VAL_CATEGORY2 => CATEGORY2_TOKEN
DCT_VAL_CATEGORY3 => CATEGORY3_TOKEN
DCT_VAL_CATEGORY4 => CATEGORY4_TOKEN
DCT_VAL_CATEGORY5 => CATEGORY5_TOKEN
DCT_VAL_CATEGORY6 => CATEGORY6_TOKEN
DCT_EOB_TOKEN => EOB_TOKEN
DCT_EOB_MODEL_TOKEN => EOB_MODEL_TOKEN
MAX_ENTROPY_TOKENS => ENTROPY_TOKENS
Moving constants:
INTER_MODE_CONTEXTS from vp9_entropy.h to vp9_blockd.h.
EOSB_TOKEN from vp9_entropy.h to vp9_tokenize.h
Change-Id: I5fcbf081318e1d365792b6d290a930c6cb0f3fc2
|
|
Change-Id: I0e59d320407b3bed0ba3622a7b29975f6fad7ebf
|
|
Change-Id: Ie05cc5e2d8ce12eacdf482a8b75e5a6ce6f59f57
|
|
Change-Id: I62bb07c377f947cb72fac68add7a6b199e42c6b9
|
|
This commit makes the rate-distortion optimization search of chroma
components consistent across all block sizes. It removes redundant
code.
Change-Id: I7e76f54d045e8efdd41d84a164c71f55b484471b
|
|
This commit unifies the rate-distortion cost calculation process of
luma and chroma components. It allows early termination to be enabled
later in the rd search loop of chroma components, consistent with the
luma components.
Change-Id: I2e52a7c6496176bf2a5e3ef338d34ceb8aad9b3d
|
|
Change-Id: I6aa4191935aa93461a07c41b59fdae1eb5f5f107
|
|
Removing redundant function arguments and curly braces.
Change-Id: I46e02561f33fe02e84a3b19756f03b9504bd6a1b
|
|
Change-Id: I2dfc569106b29fbe4da20585a0e85e5e9ea6a4db
|
|
Reverts to using 128 bit LUT for the coef models rather than 48
to ease hardware implementation.
Also incorporates some cleanups including removing various
hooks to support different lookup tables based on block_type and
ref_type.
Change-Id: I54100c120cca07a2ebd3a7776bc4630fa6a153f6
|
|
Merges the experiment.
Change-Id: I4eb19af6de6df6aa3a96a2e82f231d47ed9b3ae9
|
|
Cleans up the experiment. Actually uses reduced counts for backward
updates, and reduced number of probabilities in the context.
No change in bitstream when the experiment is on.
Between expt on and off:
derfraw300 is down only -0.062% (which is better than when expts
were run previously).
Change-Id: I55285a049a0c22810bdb42914212ab5a4f8521b5
|
|
Unify the tokenize_ function and enable configurable block size for
superblock 8x8. We are migrating the functionality of the
macroblock handles into the superblock ones, and will eventually remove
encode_mb and decode_mb. To be continued in the detokenize_ module.
Change-Id: I9f81e8c2291082535cf5e0c4b662eb24fb7c8a7f
|
|
Change-Id: I2bc8d775f8d698bf8582f4eecabc2329452e8d9b
|
|
Adds an experiment that codes an end-of-orientation symbol
for every eligible zero encountered in scan order.
This cleans out various other sub-experiments that were part
of the original patch, which will be included later if found
useful.
Results are slightly positive on all sets (0.1 - 0.2% range).
Change-Id: I57765c605fefc7fb9d1b57f1b356843602abefaf
|
|
Change-Id: I183ec5819d4d80966c92db36db75b8c3be0d381d
|
|
Use the common block walker to calculate skippability.
Change-Id: I6721e42f065df237426c91c1d871ec226ba7cdcb
|
|
Merge sb32x32 and sb64x64 functions; allow for rectangular sizes. Code
gives identical encoder results before and after. There are a few
macros for rectangular block sizes under the sbsegment experiment; this
experiment is not yet functional and should not yet be used.
Change-Id: I71f93b5d2a1596e99a6f01f29c3f0a456694d728
|
|
Pearson correlation for above or left is significantly higher than for
previous-in-scan-order (absolute values depend on position in scan, but
in general, we gain about 0.1-0.2 by using either above or left; using
both basically just makes this even better). For eob branch skipping,
we continue to use the previous token in scan order.
This helps about 0.9% on derf after re-training on a limited data set.
Full re-training and results on larger-resolution clips are pending.
Note that this commit breaks trellis, so we can probably get further
gains out of it by fixing trellis at some later point.
Change-Id: Iead68e296fc3a105cca746b5e3da9555d6010cfe
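Deriving a token context from the above and left neighbours, rather than from
the previous token in scan order, could look roughly like this. The sketch
assumes a 4x4 block stored in raster order, a hypothetical token_cache holding
a small magnitude class per already-coded position, and a rounded average as
the combining rule; none of these details are stated in this message.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_DIM 4

/* Context for position pos (raster order) from its above/left neighbours. */
static int coef_context(const unsigned char *token_cache, int pos) {
  const int row = pos / BLOCK_DIM;
  const int col = pos % BLOCK_DIM;
  int above, left;
  assert(token_cache != NULL);
  assert(pos > 0 && pos < BLOCK_DIM * BLOCK_DIM);
  if (row > 0 && col > 0) {
    above = token_cache[pos - BLOCK_DIM];
    left = token_cache[pos - 1];
  } else if (row > 0) {
    /* First column: only the above neighbour exists, so use it twice. */
    above = left = token_cache[pos - BLOCK_DIM];
  } else {
    /* First row: only the left neighbour exists, so use it twice. */
    above = left = token_cache[pos - 1];
  }
  return (1 + above + left) >> 1; /* rounded average of the two classes */
}
```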
|
|
This also changes the RD search to take account of the correct block
index when searching (this is required for ADST positioning to work
correctly in combination with tx_select).
Change-Id: Ie50d05b3a024a64ecd0b376887aa38ac5f7b6af6
|
|
Split macroblock and superblock tokenization and detokenization
functions and coefficient-related data structs so that the bitstream
layout and related code of superblock coefficients looks less like it's
a hack to fit macroblocks in superblocks.
In addition, unify chroma transform size selection from luma transform
size (i.e. always use the same size, as long as it fits the predictor);
in practice, this means 32x32 and 64x64 superblocks using the 16x16 luma
transform will now use the 16x16 (instead of the 8x8) chroma transform,
and 64x64 superblocks using the 32x32 luma transform will now use the
32x32 (instead of the 16x16) chroma transform.
Lastly, add a trellis optimize function for 32x32 transform blocks.
HD gains about 0.3%, STDHD about 0.15% and derf about 0.1%. There are
a few negative points here and there that I might want to analyze
a little more closely.
Change-Id: Ibad7c3ddfe1acfc52771dfc27c03e9783e054430
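The chroma transform size rule above (same size as luma, as long as it fits
the predictor) can be sketched as a simple minimum. The sketch assumes square
blocks and 4:2:0 subsampling, so the chroma predictor is half the luma block
size in each dimension; uv_tx_dim() is a hypothetical helper name.

```c
#include <assert.h>

/* Chroma transform dimension for a square block, assuming 4:2:0. */
static int uv_tx_dim(int luma_tx_dim, int block_dim) {
  const int chroma_dim = block_dim / 2; /* chroma predictor size */
  assert(luma_tx_dim > 0 && block_dim >= 2);
  return luma_tx_dim < chroma_dim ? luma_tx_dim : chroma_dim;
}
```

With these assumptions, uv_tx_dim(16, 32) and uv_tx_dim(16, 64) both give 16,
and uv_tx_dim(32, 64) gives 32, which matches the cases listed in the message.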
|
|
Change-Id: I5416455f8f129ca0f450d00e48358d2012605072
|
|
Removing redundant 'extern' keywords and parentheses, fixing indentation,
making variable names lower case, using short expressions x *= c
instead of x = x * c, minor code simplifications.
Change-Id: If6a25fcf306d1db26e90d27e3c24a32735c607de
|
|
Change-Id: I7a5314daca993d46b8666ba1ec2ff3766c1e5042
|
|
Change-Id: I4f46d142c2a8d1e8a880cfac63702dcbfb999b78
|