Optimized adding constant diff to predictor, which gave about a
2% decoder performance gain.
Change-Id: I47db20c31428e8c4a8f16214a85cbe386a6e9303
This was done based on John's suggestion.
Change-Id: I62516a513c31fe3dbea0d6cd063df79d9e819ec8
Change-Id: Ic9b336486774c95ffbb92adcb110cc0fc2a83cc5
This also changes the RD search to take account of the correct block
index when searching (this is required for ADST positioning to work
correctly in combination with tx_select).
Change-Id: Ie50d05b3a024a64ecd0b376887aa38ac5f7b6af6
Yaowu found that this function had a compile issue with MSVC because
of the use of _mm_storel_pi((__m64 *)(dest + 0 * stride), (__m128)p0).
To be safe, changed back to an integer store instruction.
Also, in some builds diff was not always 16-byte aligned; the code was
changed to account for that.
Change-Id: I9995e5446af15dad18f3c5c0bad1ae68abef6c0d
This patch revamps the entropy coding of coefficients to code first
a non-zero count per coded block and correspondingly remove the EOB
token from the token set.
STATUS:
Main encode/decode code achieving encode/decode sync - done.
Forward and backward probability updates to the nzcs - done.
Rd costing updates for nzcs - done.
Note: The dynamic programming approach used in trellis quantization
is not exactly compatible with nzcs. A suboptimal approach is used
instead, where branch costs are updated to account for changes in the
nzcs.
TODO:
Training the default probs/counts for nzcs
Change-Id: I951bc1e22f47885077a7453a09b0493daa77883d
Added a variant of the one-shot maxQ flag for two-pass encoding that
forces a fixed Q for normal inter frames. Disabled by default.
Also a small adjustment to the bits-per-MB estimation.
Change-Id: I87efdfb2d094fe1340ca9ddae37470d7b278c8b8
Optimized adding diff to predictor, which gave a 0.8% decoder
performance gain.
Change-Id: Ic920f0baa8cbd13a73fa77b7f9da83b58749f0f8
Removing redundant 'extern' keywords, fixing formatting and #include order,
code simplification.
Change-Id: I0e5fdc8009010f3f885f13b5d76859b9da511758
* changes:
vpxenc: actually report mismatch on stderr.
Make superblocks independent of macroblock code and data.
experimental
Split macroblock and superblock tokenization and detokenization
functions and coefficient-related data structs so that the bitstream
layout and related code of superblock coefficients looks less like it's
a hack to fit macroblocks in superblocks.
In addition, unify chroma transform size selection from luma transform
size (i.e. always use the same size, as long as it fits the predictor);
in practice, this means 32x32 and 64x64 superblocks using the 16x16 luma
transform will now use the 16x16 (instead of the 8x8) chroma transform,
and 64x64 superblocks using the 32x32 luma transform will now use the
32x32 (instead of the 16x16) chroma transform.
Lastly, add a trellis optimize function for 32x32 transform blocks.
HD gains about 0.3%, STDHD about 0.15% and derf about 0.1%. There are
a few negative points here and there that I might want to analyze a
little more closely.
Change-Id: Ibad7c3ddfe1acfc52771dfc27c03e9783e054430
Wrote an SSE2 vp9_short_idct4x4llm to improve decoder performance.
Change-Id: I90b9d48c4bf37aaf47995bffe7e584e6d4a2c000
Fixed a couple of variable/function definitions, as well as header
handling to support 16K sequence coding at high bit-rates.
The width and height are each specified by two bytes in the header.
Use an extra byte to explicitly indicate the scaling factors in
both directions, each ranging from 0 to 15.
Tested coding up to 16400x16400 dimension.
Change-Id: Ibc2225c6036620270f2c0cf5172d1760aaec10ec
Update the function prototypes to match between VP9 and VP8.
Change-Id: If58965073989e87df3b62b67a030ec6ce23ca04f
Change-Id: Iab0176f058045181821ded95ff1cf423af1625f9
Removing redundant 'extern' keyword, lowercase variable names.
Change-Id: I608e8d8579aba8981f5fac3493f77b4481b13808
Picks up some build system changes, compiler warning fixes, etc.
Change-Id: I2712f99e653502818a101a72696ad54018152d4e
to be a fixed value of 15.
Test results:
cif: .124%, .068%, .081%
std-hd: 2.809%, 3.174%, 2.705%
Change-Id: I380c8152c973506094da15eab59e3aa22b75a983
Simplified the idct32x32 calculation when there are 10 or fewer
non-zero coefficients in a 32x32 block. This helps decoder
performance.
Change-Id: If7f8893d27b64a9892b4b2621a37fdf4ac0c2a6d
* changes:
Ignoring test video sequences in the source tree.
Code cleanup.
gf_group_bits is int64_t; remove casts to int.
Change-Id: I3b4225905041fac9af9fdfcbcb6f1c357ea4b593
Lower case variable names, converting while loops to for loops.
Change-Id: Ic3b973391eef7472a99d18d02fe79cfef5e04e62
Provided a wrapper and removed duplicate code.
Change-Id: Iaef842226ec348422e459202793b001d0983ea30
Change-Id: Ie89bd00d58e30bf4094cb748a282f1dfa81a31d8
Change-Id: Id786be31da3c91d95d2955aa569ecdc6e66650df
The ABOVESPREFMV experiment uses four pixels to the left of the
current block, which don't exist for the left-most column.
Change-Id: I4cf0b42ae8f54c0b3e7b1ed8755704b74fafc39c
Removing redundant variables, using x *= y instead of x = x * y,
moving variable declarations into inner blocks.
Change-Id: I884f95c755f55d51b7c1c6585f10296919063e41
Removing redundant 'extern' keyword, better formatting, code
simplification.
Change-Id: I132fea14f08c706ee9ea147d19464d03f833f25b
The width and height stored in the reference frames are padded out to
a multiple of 16. The Width and Height variables in common are the
displayed size, which may be smaller. The incorrect comparison was
causing scaling-related code to be called when it shouldn't have
been. A notable case where this happens is 1080p, since 1088 != 1080.
Change-Id: I55f743eeeeaefbf2e777e193bc9a77ff726e16b5
sse4_1 code used uint16_t for returning the sad, but that won't work
for 32x32 or 64x64. This fixes the assembly for those sizes and also
re-enables sse4_1 on Linux.
Change-Id: I5ce7288d581db870a148e5f7c5092826f59edd81
Change-Id: I919e2dd72292fe44f2e53ada56bd42287d50cdeb
Signed-off-by: Jim Bankoski <jimbankoski@google.com>
Scalar path is about 1.4x faster (4% overall encoder speedup).
SSE2 path is about 7x faster (13% overall encoder speedup).
Change-Id: I7e85d8225a914a74c61ea370210414696560094d
Fixing code style, using array lookup instead of switch statements for
forward hybrid transforms (in the same way as for their inverses).
Consistent usage of ROUND_POWER_OF_TWO macro in appropriate places.
Change-Id: I0d3822ae11f928905fdbfbe4158f91d97c71015f