Age | Commit message (Collapse) | Author |
|
|
|
1. Search for block8x16/block16x8 uses block8x8's search results.
2. Check block4x4 only if block8x8 is chosen. (This hurts quality,
which will be improved in another check-in.)
3. In block4x4 search, the previous block's result is used as
MV predictor for next block.
This change improves performance.
Change-Id: I9dc089007ca08129fb6c11fe7692777ebb8647b0
|
|
Re-calibrated sad_per_bit16 and sad_per_bit4 tables to linearly
correlated to quantizer values, these two variables are used in
motion search for costing motion vectors. This change has an small
positive effect on compression.
Change-Id: Ic9b5ea6fb8d5078ef663ba4899db019cc51f4166
|
|
In SPLITMV, the 8x8 segment will be checked first. If the 8x8 rd
is better than the best, we check the other segments. Otherwise
bail. Adjustments to the thresh_mult were necessary to make
up for the initial quality loss.
The performance improved by 20% (average) for good quality,
speed 0 and speed 1, while the overall quality remained the same.
Change-Id: I717aef401323c8a254fba3e9777d2a316c774cc3
|
|
vp8_rd_pick_best_mbsegmentation looks at y only. The new
breakout does not include the frame cost, the prob_skip_false
cost, or the uv rate. Performance improved by a few percent
and the quality remained the same.
Change-Id: I94ff013998ac51e8ecce7130870f7b6600758e15
|
|
This fix added MV range checks for NEWMV mode as suggested by Jim.
To reduce unnecessary MV range checks, I tried Yaowu's suggestion.
Update UMV borders in NEWMV mode to also cover MV range check.
Also, in this way, every MV that is valid gets checked in diamond
search function.
Change-Id: I95a89ce0daf6f178c454448f13d4249f19b30f3a
|
|
The MV's range is 256. Since the new motion search uses a different
starting MV than the center ref MV, a MV range checking needs to
be done to avoid corruption.
Change-Id: I8ae0721d1bd203639e13891e2e54a2e87276f306
|
|
Change I3430820 performed an uninitialized read when
encode_breakout == 0, since the sum and sse wouldn't be set:
if(x->encode_breakout)
VARIANCE_INVOKE(..., get16x16var)(..., &sum, &sse);
if (cpi->active_map_enabled && x->active_ptr[0] == 0) {
...
} else if (sse < x->encode_breakout)
Change-Id: I915eb76d1227b4b6d1137a0dedf2c143860098a2
|
|
|
|
|
|
Realized no need for new assembly code sum is already
calculated.
Change-Id: Ie2d94feb4b7c1f77c5359bca29b66228e41638c9
|
|
Moved the code from the segmentation loop into a function
which is now called for each segment. This will allow us
to change the segment order checking more easily.
Change-Id: I9510d26f0acae5a73043fcca8f1984b121d3e052
|
|
|
|
Add vp8_mv_pred() to better predict starting MV for NEWMV
mode in vp8_rd_pick_inter_mode(). Set different search
ranges according to MV prediction accuracy, which improves
encoder performance without hurting the quality. Also,
as Yaowu suggested, using diamond search result as full
search starting point and therefore adjusting(reducing)
full search range helps the performance.
Change-Id: Ie4a3c8df87e697c1f4f6e2ddb693766bba1b77b6
|
|
only do the variance calculation if necessary
( eg needed for breakout test)
|
|
macro_block_yrd and vp8_rdcost_mby are not called for SPLITMV.
Change-Id: I2224d3c8725df526d48426447482768d543752f1
|
|
Small changes to the default zero bin and rounding tables.
Though the tables are currently the same for the Y1 and Y2 cases
I have left them as separate tables in case we want to tune this later.
There is now some adjustment of the zbin based on the prediction mode.
Previously this was restricted to an adjustment for gf/arf 0,0 MV.
The exact quantizer now marginal outperforms and is the default.
The overall average gain is about 0.5%
Change-Id: I5e4353f3d5326dde4e86823684b236a1e9ea7f47
|
|
Using tables for the label count and label offset.
Change-Id: Iac3d5b292c37341a881be0af282f5cac3b3e01eb
|
|
NEON has optimized 16x16 half-pixel variance functions, but they
were not part of the RTCD framework. Add these functions to RTCD,
so that other platforms can make use of this optimization in the
future and special-case ARM code can be removed.
A number of functions were taking two variance functions as
parameters. These functions were changed to take a single
parameter, a pointer to a struct containing all the variance
functions for that block size. This provides additional flexibility
for calling additional variance functions (the half-pixel special
case, for example) and by initializing the table for all block sizes,
we don't have to construct this function pointer table for each
macroblock.
Change-Id: I78289ff36b2715f9a7aa04d5f6fbe3d23acdc29c
|
|
Changes 'The VP8 project' to 'The WebM project', for consistency
with other webmproject.org repositories.
Fixes issue #97.
Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba
|
|
Moved partition_bmi and partition_count out of MB_MODE_INFO and
placed into MACROBLOCK. Also reduced the size of other members
of the MB_MODE_INFO struct. For 1080p, the memory was reduced
by 1,209,516 bytes. The decoder performance appeared to improve
by 3% for the clip used.
Note: The main goal for this change is to improve the decoder
performance. The encoder will be revisited at a later date for
further structure cleanup.
Change-Id: I4733621292ee9cc3fffa4046cb3fd4d99bd14613
|
|
The main reason for the change was to reduce cycles in the token
decoder. (~1.5% gain for 32 bit) This layout should be more
cache friendly.
As a result of this change, the encoder had to be updated.
Change-Id: Id5e804169d8889da0378b3a519ac04dabd28c837
Note: dixie uses a similar layout
|
|
These copies occurred for each macroblock in the encoder and decoder.
Thetemp MB_MODE_INFO mbmi was removed from MACROBLOCKD. As a result,
a large number compile errors had to be fixed.
Change-Id: I4cf0ffae3ce244f6db04a4c217d52dd256382cf3
|
|
The mv_ref and sub_mv_ref token encodings are indexed from NEARESTMV
and LEFT4X4, respectively, rather than being zero-based like the
other token encodings.
Change-Id: I3699c3f84111209ecfb91097c4b900773e9a3ad5
|
|
Replace the exponential search for optimal rounding during
quantization with a linear Viterbi trellis and enable it
by default when using --best.
Right now this operates on top of the output of the adaptive
zero-bin quantizer in vp8_regular_quantize_b() and gives a small
gain.
It can be tested as a replacement for that quantizer by
enabling the call to vp8_strict_quantize_b(), which uses
normal rounding and no zero bin offset.
Ultimately, the quantizer will have to become a function of lambda
in order to take advantage of activity masking, since there is
limited ability to change the quantization factor itself.
However, currently vp8_strict_quantize_b() plus the trellis
quantizer (which is lambda-dependent) loses to
vp8_regular_quantize_b() alone (which is not) on my test clip.
Patch Set 3:
Fix an issue related to the cost evaluation of successor
states when a coefficient is reduced to zero. With this
issue fixed, now the trellis search almost exactly matches
the exponential search.
Patch Set 2:
Overall, the goal of this patch set is to make "trellis"
search to produce encodings that match the exponential
search version. There are three main differences between
Patch Set 2 and 1:
a. Patch set 1 did not properly account for the scale of
2nd order error, so patch set 2 disable it all together
for 2nd blocks.
b. Patch set 1 was not consistent on when to enable the
the quantization optimization. Patch set 2 restore the
condition to be consistent.
c. Patch set 1 checks quantized level L-1, and L for any
input coefficient was quantized to L. Patch set 2 limits
the candidate coefficient to those that were rounded up
to L. It is worth noting here that a strategy to check
L and L+1 for coefficients that were truncated down to L
might work.
(a and b get trellis quant to basically match the exponential
search on all mid/low rate encodings on cif set, without
a, b, trellis quant can hurt the psnr by 0.2 to .3db at
200kbps for some cif clips)
(c gets trellis quant to match the exponential search
to match at Q0 encoding, without c, trellis quant can be
1.5 to 2db lower for encodings with fixed Q at 0 on most
derf cif clips)
Change-Id: Ib1a043b665d75fbf00cb0257b7c18e90eebab95e
|
|
At the end of the decode, frame buffers were being copied.
The frames are not updated after the copy, they are just
for reference on later frames. This change allows multiple
references to the same frame buffer instead of copying it.
Changes needed to be made to the encoder to handle this. The
encoder is still doing frame buffer copies in similar places
where pointer reference could be done.
Change-Id: I7c38be4d23979cc49b5f17241ca3a78703803e66
|
|
Following conversations with Tim T (Derf) I ran a large number of
tests comparing the existing polynomial expression with a simpler
^2 variant. Though the polynomial was sometimes a little better at
the extremes of Q it was possible to get close for most clips and
even a little better on some.
This code also changes the way the RD multiplier is calculated
when the ZBIN is extended to use a variant of the same ^2
expression.
I hope that this simpler expression will be easier to tune further
as we expand our test set and consider adjustments based on content.
Change-Id: I73b2564346e74d1332c33e2c1964ae093437456c
|
|
The new fdct lowers the round trip sum squared error for a
4x4 block ~0.12. or ~0.008/pixel. For reference, the old
matrix multiply version has average round trip error 1.46
for a 4x4 block.
Thanks to "derf" for his suggestions and references.
Change-Id: I5559d1e81d333b319404ab16b336b739f87afc79
|
|
When the license headers were updated, they accidentally contained
trailing whitespace, so unfortunately we have to touch all the files
again.
Change-Id: I236c05fade06589e417179c0444cb39b09e4200d
|
|
Change-Id: I2a97f08cc3c7808ce5be39e910cc5147ecf03a1d
|
|
low and high Q ends.
|
|
|
|
Change-Id: I944035e720ef834561a9da0d723879a4f787312c
|
|
Change-Id: Ieebea089095d9073b3a94932791099f614ce120c
|
|
|
|
|