Age | Commit message (Collapse) | Author |
|
Consistent with VP8.
Change-Id: I8c316ee49f072e15abbb033a80e9c36617891f07
|
|
This patch allows coding frames using references of different
resolution, in ZEROMV mode. For compound prediction, either
reference may be scaled.
To test, I use the resize_test and enable WRITE_RECON_BUFFER
in vp9_onyxd_if.c. It's also useful to apply this patch to
test/i420_video_source.h:
--- a/test/i420_video_source.h
+++ b/test/i420_video_source.h
@@ -93,6 +93,7 @@ class I420VideoSource : public VideoSource {
virtual void FillFrame() {
// Read a frame from input_file.
+ if (frame_ != 3)
if (fread(img_->img_data, raw_sz_, 1, input_file_) == 0) {
limit_ = frame_;
}
This forces the frame that the resolution changes on to be coded
with no motion, only scaling, and improves the quality of the
result.
Change-Id: I1ee75d19a437ff801192f767fd02a36bcbd1d496
|
|
Ensure that all inter prediction goes through a common code path
that takes scaling into account. Removes a bunch of duplicate
1st/2nd predictor code. Also introduces a 16x8 mode for 8x8
MVs, similar to the 8x4 trick we were doing before. This has an
unexpected effect with EIGHTTAP_SMOOTH, so it's disabled in that
case for now.
Change-Id: Ia053e823a8bc616a988a0af30452e1e75a739cba
|
|
The information is a duplicate of "eob" in BLOCKD.
Change-Id: Ia6416273bd004611da801e4bfa6e2d328d6f02a3
|
|
|
|
Change-Id: I7c6e3bebd94856b24dbe2aded7f9e04ef8bb8c08
|
|
Change-Id: I7b7b8d4fda3a23699e0c920d727f8c15d37d43aa
|
|
rebased.
This patch includes 16x16 butterfly inverse ADST/DCT hybrid
transform. It uses the variant ADST of kernel
sin((2k+1)*(2n+1)/4N),
which allows a butterfly implementation.
The coding gains as compared to DCT 16x16 are about 0.1% for
both derf and std-hd. It is noteworthy that for std-hd sets
many sequences gains about 0.5%, some 0.2%. There are also few
points that provides -1% to -3% performance. Hence the average
goes to about 0.1%.
Change-Id: Ie80ac84cf403390f6e5d282caa58723739e5ec17
|
|
The commit changes the coding mode to lossless whenever the lowest
quantizer is choosen.
As expected, test results showed no difference for cif and std-hd
set where Q0 is rarely used. For yt and yt-hd set, Q0 is used for
a number of clips, where this commit helped a lot in the high end.
Average over all clips in the sets:
yt: 2.391% 1.017% 1.066%
hd: 1.937% .764% .787%
Change-Id: I9fa9df8646fd70cb09ffe9e4202b86b67da16765
|
|
Change-Id: I7a5314daca993d46b8666ba1ec2ff3766c1e5042
|
|
Change-Id: I4f46d142c2a8d1e8a880cfac63702dcbfb999b78
|
|
experimental
|
|
Since addition of the larger-scale transforms (16x16, 32x32), these
don't give a benefit at macroblock-sizes anymore. At superblock-sizes,
2nd-order transform was never used over the larger transforms. Future
work should test whether there is a benefit for that use case.
Change-Id: I90cadfc42befaf201de3eb0c4f7330c56e33330a
|
|
|
|
|
|
1. Added a bit in frame header to to indicate if a frame is encoded
in lossless mode, so decoder does not make the decision based on Q0
2. Minor changes to make sure that lossy coding works same as when
the lossless experiment is not enabled.
3. Renamed function pointers for transforms to be consistent, using
prefix fwd_txm and inv_txm for forward and inverse respectively
To encode in lossless mode, using "--lossless=1 --min-q=0 --max-q=0"
with vpxenc.
Change-Id: Ifae53b26d2ffbe378d707e29d96817b8a5e6c068
|
|
Change-Id: I95acfc1417634b52d344586ab97f0abaa9a4b256
|
|
Removal of experiment to simplify code base for other
changes.
Change-Id: If0a33952504558511926ad212bc311fc2bffb19a
|
|
|
|
fixed format issues.
Implement the inverse 4x4 ADST using 9 multiplications. For this
particular dimension, the original ADST transform can be
factorized into simpler operations, hence is retained.
Change-Id: Ie5d9749942468df299ab74e90d92cd899569e960
|
|
Replace as_mv.{first, second} with a two element array, so that they
can easily be processed with an index variable.
Change-Id: I1e429155544d2a94a5b72a5b467c53d8b8728190
|
|
Pass the current mb row and column around rather than the
recon_yoffset and recon_uvoffset, since those offsets will
change from predictor to predictor, based on the reference
frame selection.
Change-Id: If3f9df059e00f5048ca729d3d083ff428e1859c1
|
|
* changes:
Restore SSSE3 subpixel filters in new convolve framework
Convert subpixel filters to use convolve framework
Add 8-tap generic convolver
|
|
Refactor the 8x8 inverse hybrid transform. It is now consistent
with the new inverse DCT. Overall performance loss (due to the
use of this variant ADST, and the rounding errors in the butterfly
implementation) for std-hd is -0.02.
Fixed BUILD warning.
Devise a variant of the original ADST, which allows butterfly
computation structure. This new transform has kernel of the
form: sin((2k+1)*(2n+1) / (4N)). One of its butterfly structures
using floating-point multiplications was reported in Z. Wang,
"Fast algorithms for the discrete W transform and for the discrete
Fourier transform", IEEE Trans. on ASSP, 1984.
This patch includes the butterfly implementation of the inverse
ADST/DCT hybrid transform of dimension 8x8.
Change-Id: I3533cb715f749343a80b9087ce34b3e776d1581d
|
|
This patch adds column-based tiling. The idea is to make each tile
independently decodable (after reading the common frame header) and
also independendly encodable (minus within-frame cost adjustments in
the RD loop) to speed-up hardware & software en/decoders if they used
multi-threading. Column-based tiling has the added advantage (over
other tiling methods) that it minimizes realtime use-case latency,
since all threads can start encoding data as soon as the first SB-row
worth of data is available to the encoder.
There is some test code that does random tile ordering in the decoder,
to confirm that each tile is indeed independently decodable from other
tiles in the same frame. At tile edges, all contexts assume default
values (i.e. 0, 0 motion vector, no coefficients, DC intra4x4 mode),
and motion vector search and ordering do not cross tiles in the same
frame.
t log
Tile independence is not maintained between frames ATM, i.e. tile 0 of
frame 1 is free to use motion vectors that point into any tile of frame
0. We support 1 (i.e. no tiling), 2 or 4 column-tiles.
The loopfilter crosses tile boundaries. I discussed this briefly with Aki
and he says that's OK. An in-loop loopfilter would need to do some sync
between tile threads, but that shouldn't be a big issue.
Resuls: with tiling disabled, we go up slightly because of improved edge
use in the intra4x4 prediction. With 2 tiles, we lose about ~1% on derf,
~0.35% on HD and ~0.55% on STD/HD. With 4 tiles, we lose another ~1.5%
on derf ~0.77% on HD and ~0.85% on STD/HD. Most of this loss is
concentrated in the low-bitrate end of clips, and most of it is because
of the loss of edges at tile boundaries and the resulting loss of intra
predictors.
TODO:
- more tiles (perhaps allow row-based tiling also, and max. 8 tiles)?
- maybe optionally (for EC purposes), motion vectors themselves
should not cross tile edges, or we should emulate such borders as
if they were off-frame, to limit error propagation to within one
tile only. This doesn't have to be the default behaviour but could
be an optional bitstream flag.
Change-Id: I5951c3a0742a767b20bc9fb5af685d9892c2c96f
|
|
Update the code to call the new convolution functions to do subpixel
prediction rather than the existing functions. Remove the old C and
assembly code, since it is unused. This causes a 50% performance
reduction on the decoder, but that will be resolved when the asm for
the new functions is available.
There is no consensus for whether 6-tap or 2-tap predictors will be
supported in the final codec, so these filters are implemented in
terms of the 8-tap code, so that quality testing of these modes
can continue. Implementing the lower complexity algorithms is a
simple exercise, should it be necessary.
This code produces slightly better results in the EIGHTTAP_SMOOTH
case, since the filter is now applied in only one direction when
the subpel motion is only in one direction. Like the previous code,
the filtering is skipped entirely on full-pel MVs. This combination
seems to give the best quality gains, but this may be indicative of a
bug in the encoder's filter selection, since the encoder could
achieve the result of skipping the filtering on full-pel by selecting
one of the other filters. This should be revisited.
Quality gains on derf positive on almost all clips. The only clip
that seemed to be hurt at all datarates was football
(-0.115% PSNR average, -0.587% min). Overall averages 0.375% PSNR,
0.347% SSIM.
Change-Id: I7d469716091b1d89b4b08adde5863999319d69ff
|
|
Change-Id: Icb6e21dc0c2d9918faa33c8bf70943660df7ad88
|
|
First step in simplifying the segment mode and
segment EOB flags into a simpler segment skip
flag that implies 0,0 mv and EOB at position 0.
Change-Id: Ib750cac31a7a02dc21082580498efd9f7d8d72a5
|
|
This experiment gives little gains and adds relatively much code
complexity (and it hinders other experiments), so let's get rid of
it.
Change-Id: Id25e79a137a1b8a01138aa27a1fa0ba4a2df274a
|
|
Fixes some scaling issues. Adds an option to only compute the
dct on the low-low subband for 32x32 and 64x64 blocks using
only a single 16x16 dct after 1 and 2 wavelet decomposition
levels respectively. Also adds an option to use a 8x8 dct
as building block.
Currenlty with the 2/6 filter and with a single 16x16 dct on
the low low band, the reuslts compared to full 32x32 dct is
as follows:
derf: -0.15%
yt: -0.29%
std-hd: -0.18%
hd: -0.6%
These are my current recommended settings, since the 2/6 filter
is very simple.
Results with 8x8 dct are about 0.3% worse.
Change-Id: I00100cdc96e32deced591985785ef0d06f325e44
|
|
Change-Id: I615651e4c7b09e576a341ad425cf80c393637833
|
|
Change-Id: If6c88752dffdb566f8d4322f135145270716fb8e
|
|
This patch removes the old pred-filter experiment and replaces it
with one that is implemented using the switchable filter framework.
If the pred-filter experiment is enabled, three interopolation
filters are tested during mode selection; the standard 8-tap
interpolation filter, a sharp 8-tap filter and a (new) 8-tap
smoothing filter.
The 6-tap filter code has been preserved for now and if the
enable-6tap experiment is enabled (in addition to the pred-filter
experiment) the original 6-tap filter replaces the new 8-tap smooth
filter in the switchable mode.
The new experiment applies the prediction filter in cases of a
fractional-pel motion vector. Future patches will apply the filter
where the mv is pel-aligned and also to intra predicted blocks.
Change-Id: I08e8cba978f2bbf3019f8413f376b8e2cd85eba4
|
|
Change-Id: I0df99742029834a85c4933652b0587cf5b6b2587
|
|
Incorportate vp9-preview changes by merging master branch into experimental.
Conflicts:
test/test.mk
vp9/common/vp9_filter.c
vp9/common/vp9_idctllm.c
vp9/common/vp9_invtrans.h
vp9/common/vp9_mbpitch.c
vp9/common/vp9_rtcd_defs.sh
vp9/common/vp9_systemdependent.h
vp9/common/vp9_type_aliases.h
vp9/common/x86/vp9_asm_stubs.c
vp9/common/x86/vp9_subpixel_mmx.asm
vp9/decoder/vp9_decodframe.c
vp9/decoder/vp9_dequantize.c
vp9/decoder/vp9_dequantize.h
vp9/decoder/vp9_onyxd_int.h
vp9/encoder/vp9_bitstream.c
vp9/encoder/vp9_encodeframe.c
vp9/encoder/vp9_rdopt.c
Change-Id: I17f51c3666d1b59cf1a699f87607cbc5d30a87c5
|
|
3.2% gains on std/hd, 1.0% gains on hd.
Change-Id: I481d5df23d8a4fc650a5bcba956554490b2bd200
|
|
Part of NEW_MVREF experiment.
Added update-able probabilities.
Change-Id: I5a4fcf4aaed1d0d1dac980f69d535639a3d59401
|
|
Various fixups to resolve issues when building vp9-preview under the more stringent
checks placed on the experimental branch.
Change-Id: I21749de83552e1e75c799003f849e6a0f1a35b07
|
|
For coefficients, use int16_t (instead of short); for pixel values in
16-bit intermediates, use uint16_t (instead of unsigned short); for all
others, use uint8_t (instead of unsigned char).
Change-Id: I3619cd9abf106c3742eccc2e2f5e89a62774f7da
|
|
Change-Id: I0c1be01aae933243311ad321b6c456adaec1a0f5
|
|
This commit changed the ENTROPY_CONTEXT conversion between MBs that
have different transform sizes.
In additioin, this commit also did a number of cleanup/bug fix:
1. removed duplicate function vp9_fix_contexts() and changed to use
vp8_reset_mb_token_contexts() for both encoder and decoder
2. fixed a bug in stuff_mb_16x16 where wrong context was used for
the UV.
3. changed reset all context to 0 if a MB is skipped to simplify the
logic.
Change-Id: I7bc57a5fb6dbf1f85eac1543daaeb3a61633275c
|
|
This adds Debargha's DCT/DWT hybrid and a regular 32x32 DCT, and adds
code all over the place to wrap that in the bitstream/encoder/decoder/RD.
Some implementation notes (these probably need careful review):
- token range is extended by 1 bit, since the value range out of this
transform is [-16384,16383].
- the coefficients coming out of the FDCT are manually scaled back by
1 bit, or else they won't fit in int16_t (they are 17 bits). Because
of this, the RD error scoring does not right-shift the MSE score by
two (unlike for 4x4/8x8/16x16).
- to compensate for this loss in precision, the quantizer is halved
also. This is currently a little hacky.
- FDCT and IDCT is double-only right now. Needs a fixed-point impl.
- There are no default probabilities for the 32x32 transform yet; I'm
simply using the 16x16 luma ones. A future commit will add newly
generated probabilities for all transforms.
- No ADST version. I don't think we'll add one for this level; if an
ADST is desired, transform-size selection can scale back to 16x16
or lower, and use an ADST at that level.
Additional notes specific to Debargha's DWT/DCT hybrid:
- coefficient scale is different for the top/left 16x16 (DCT-over-DWT)
block than for the rest (DWT pixel differences) of the block. Therefore,
RD error scoring isn't easily scalable between coefficient and pixel
domain. Thus, unfortunately, we need to compute the RD distortion in
the pixel domain until we figure out how to scale these appropriately.
Change-Id: I00386f20f35d7fabb19aba94c8162f8aee64ef2b
|
|
This patch reduces the cpu cost of the MV ref
search by only allowing insert for candidates
that would be in the current top 4.
This could alter the outcome and slightly favors
near candidates which are tested first but also
limits the worst case loop count to 4 and means in
many cases it will drop out and not happen.
Change-Id: Idd795a825f9fd681f30f4fcd550c34c38939e113
|
|
Change-Id: I2c252f3ddcc99e96c1f5d3dab8bcb25a2a3637ea
|
|
Change-Id: Ieefd76e164ca4aa87597da0412977614ddfbacb7
|
|
This patch allows use of 8x8 and 4x4 ADST correctly for Intra
16x16 modes and Intra 8x8 modes when the block size selected
is smaller than the prediction mode. Also includes some cleanups
and refactoring.
Rebase.
Change-Id: Ie3257bdf07bdb9c6e9476915e3a80183c8fa005a
|
|
Change-Id: Ia1cce221f8511561b9cbd8edb7726fbc286ff243
|
|
Support for gyp which doesn't support multiple objects in the same
static library having the same basename.
Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc
|