Age | Commit message | Author |
|
|
|
Renaming vp9_init_mode_costs() to fill_mode_costs() and moving it to
vp9_rdopt.c.
Change-Id: Ib2542d216458f6dced9f4b7ccbdd2cd98176aa5a
|
|
Change-Id: I6366e84490883b72362f762369d7e5bccb64f02f
|
|
Change-Id: I6f6ba91b1b8b280902b171472314d665aa0baf0b
|
|
Since they are used in the encoder only. This commit also reorders the
includes in the files that include vp9_extend.h
Change-Id: I929fc113f2135d3198cd1fc6a17434e5a2f8a459
|
|
There was only one function in the *.c file, so moving it to vp9_encodemb.c.
Change-Id: I728859d08b3d6c05c33c1c5b21f0ea1d0e0f83af
|
|
Adding these functions to encapsulate the tx_type check. Changing TX_TYPE to
int to match the declaration in vp9_rtcd.h.
Change-Id: I6f3a2df6e35595ca73b6aaa9e3909ee7bc3fd16f
|
|
This should be similar to what x264 does with --aq-mode 1.
It works well with clips like parkjoy and touhou
(http://x264.nl/developers/Dark_Shikari/LosslessTouhou.mkv).
At low bitrates, the segmentation signaling overhead may negate the
benefits of this feature.
(PGW) Default changed to feature OFF to allow provisional merge.
Change-Id: I938abf9bb487e1d4ad3b0264ea03d9826275c70b
|
|
VP9 postproc is disabled for now, as it has not been shown to help and
may be merged with VP8.
Change-Id: I25620d6cd34c6e10331b18c7b5ef7482e39c6057
|
|
Also fixed a bug in the SAD calculations.
Change-Id: I6571fcbe37556c16ae32be66dc0fd879852aac1d
|
|
Enable use_x86inc as a command-line option. Fix a bug with SSE2 when
x86inc is disabled. Add SAD asm protection to the x86inc protection.
Change-Id: Iee0f9dd235ea10e8ace512eb362ba9bebe8c9df6
|
|
|
|
This is in preparation for the SSE2 version of the high-precision
32x32 forward DCT which will share a lot of code with the existing
low precision version used for rate-distortion search.
Change-Id: I7084b6bdfb480b1fabb8493fb14e3f7fcc7888c0
|
|
Change-Id: Icb607745634e10b9bac5019d06661ece09fcdb40
|
|
Support enabling it or disabling it. Moved the read out to configure.sh
so that it is done once instead of in both make and config.
Change-Id: I73a9190cf31de9f03e8a577f478fa522f8c01c8b
|
|
Change-Id: Ia942e56cf322821d42ba06178672791eeee2847e
|
|
|
|
Total encoding time for first 50 frames of bus (speed 0) @ 1500kbps
goes from 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code is
x86-64 only; it needs some minor modifications to be 32-bit compatible,
because it uses 15 xmm registers, whereas 32-bit mode only has 8.
Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904
|
|
Change-Id: I83ca53bf6def871f199a382a671f26ad7cbecbca
|
|
Change vp9_block_error() to return a 64bit error variable, change all
callers to expect a 64bit return value (this will prevent overflows,
which we basically don't check for at all right now). Remove duplicate
block_error() function, which fixed that through truncation. Remove
old (incompatible) mmx/sse2 block_error SIMD versions and replace with
a new one that returns a 64bit value.
Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to
3min23, i.e. a 3% overall speedup.
Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68
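The overflow concern behind the 64-bit return value can be illustrated with a minimal scalar sketch. The name block_error_64 and its signature are hypothetical and are not taken from the libvpx source; the sketch only shows why the sum of squared coefficient differences is accumulated in a 64-bit variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the libvpx function): sum of squared differences
 * between original and dequantized coefficients. Accumulating in int64_t
 * keeps large blocks from overflowing, which a 32-bit sum could do. */
int64_t block_error_64(const int16_t *coeff, const int16_t *dqcoeff,
                       size_t count) {
  int64_t error = 0;
  size_t i;
  assert(coeff != NULL && dqcoeff != NULL);
  for (i = 0; i < count; ++i) {
    /* Widen before subtracting so the difference cannot wrap in 16 bits. */
    const int32_t diff = (int32_t)coeff[i] - (int32_t)dqcoeff[i];
    /* Widen again before squaring: diff * diff may exceed INT32_MAX. */
    error += (int64_t)diff * diff;
  }
  return error;
}
```

With 1024 coefficients at opposite extremes of the int16_t range, the result is far above what a 32-bit return value could hold, which is the case the truncating version silently got wrong.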
|
|
3% faster overall (3min35.0 to 3min28.5).
Change-Id: I5ff8a5c2c91586b6632ca5009ad1ea51ce94af5e
|
|
Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 ->
3min58). Specific changes to timings for each function compared to
original assembly-optimized versions (or just new version timings if
no previous assembly-optimized version was available):
sse2 4x4: 99 -> 82 cycles
sse2 4x8: 128 cycles
sse2 8x4: 121 cycles
sse2 8x8: 149 -> 129 cycles
sse2 8x16: 235 -> 245 cycles (?)
sse2 16x8: 269 -> 203 cycles
sse2 16x16: 441 -> 349 cycles
sse2 16x32: 641 cycles
sse2 32x16: 643 cycles
sse2 32x32: 1733 -> 1154 cycles
sse2 32x64: 2247 cycles
sse2 64x32: 2323 cycles
sse2 64x64: 6984 -> 4442 cycles
ssse3 4x4: 100 cycles (?)
ssse3 4x8: 103 cycles
ssse3 8x4: 71 cycles
ssse3 8x8: 147 cycles
ssse3 8x16: 158 cycles
ssse3 16x8: 188 -> 162 cycles
ssse3 16x16: 316 -> 273 cycles
ssse3 16x32: 535 cycles
ssse3 32x16: 564 cycles
ssse3 32x32: 973 cycles
ssse3 32x64: 1930 cycles
ssse3 64x32: 1922 cycles
ssse3 64x64: 3760 cycles
Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d
|
|
This seems to be used only in the encoder. Also remove an empty wrapper
file that contained forward declarations for this function, but didn't
define any functions itself.
Change-Id: Ifc561eef7ebe374a7d03698055e51e105f6d614b
|
|
Adding API to read/write uncompressed frame header bits (it is not final
yet). Separate functions to read/write uncompressed header. Moving
clr_type, error_resilient_mode, refresh_frame_context,
frame_parallel_decoding_mode, frame_context_idx from compressed partition
to uncompressed frame header.
Change-Id: Id3ed8a387980c652ae147549412f4ec24a0a5bd0
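As a rough illustration of what reading and writing uncompressed header bits involves, here is a minimal MSB-first bit writer and reader. All names (bit_writer, bw_write_bit, bw_write_literal, bit_reader, br_read_bit, br_read_literal) are hypothetical, and the field widths used in the usage note are assumptions; this is a sketch, not the API the commit adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types: a cursor over a caller-provided byte buffer. */
struct bit_writer { uint8_t *buf; size_t bit_offset; };
struct bit_reader { const uint8_t *buf; size_t bit_offset; };

/* Write one bit, most significant bit of each byte first. */
void bw_write_bit(struct bit_writer *w, int bit) {
  const size_t byte = w->bit_offset / 8;
  const int shift = 7 - (int)(w->bit_offset % 8);
  w->buf[byte] =
      (uint8_t)((w->buf[byte] & ~(1 << shift)) | ((bit & 1) << shift));
  ++w->bit_offset;
}

/* Write the low `bits` bits of `value`, most significant first. */
void bw_write_literal(struct bit_writer *w, unsigned value, int bits) {
  int i;
  assert(bits >= 0 && bits <= 32);
  for (i = bits - 1; i >= 0; --i) bw_write_bit(w, (int)((value >> i) & 1));
}

/* Read one bit in the same order the writer emitted it. */
int br_read_bit(struct bit_reader *r) {
  const size_t byte = r->bit_offset / 8;
  const int shift = 7 - (int)(r->bit_offset % 8);
  ++r->bit_offset;
  return (r->buf[byte] >> shift) & 1;
}

/* Read `bits` bits as an unsigned value, most significant first. */
unsigned br_read_literal(struct bit_reader *r, int bits) {
  unsigned value = 0;
  int i;
  assert(bits >= 0 && bits <= 32);
  for (i = 0; i < bits; ++i) value = (value << 1) | (unsigned)br_read_bit(r);
  return value;
}
```

A flag such as error_resilient_mode would go through bw_write_bit and a small index such as frame_context_idx through bw_write_literal (a 2-bit width is assumed here). Because the fields are plain bits rather than arithmetic-coded symbols, a decoder can read them back without initializing the bool decoder.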
|
|
bitstream mismatches.
This reverts commit df037b615fcc0196386977faae060fdfd9a887a8
Change-Id: I1a529f2590df7bc912f5035d22311268933e3dd6
|
|
The API is not final yet and can be changed. Actual layout of
uncompressed frame part will be finalized later. Right now moving
clr_type, error_resilient_mode, refresh_frame_context,
frame_parallel_decoding_mode from first compressed partition to
uncompressed frame part.
Change-Id: I3afc5d4ea92c5a114f4c3d88f96858cccc15b76e
|
|
Change-Id: Iee9894615265d42aa23c43a4183924953aedb0c6
|
|
Files were copied from vp8 and never maintained.
Change-Id: I9659a8755985da73e8c19c3c984423b6666d8871
|
|
Conflicts:
vp9/common/vp9_findnearmv.c
vp9/common/vp9_rtcd_defs.sh
vp9/decoder/vp9_decodframe.c
vp9/decoder/x86/vp9_dequantize_sse2.c
vp9/encoder/vp9_rdopt.c
vp9/vp9_common.mk
Resolve file name changes in favor of master. Resolve rdopt changes in
favor of experimental, preserving the newer experiments.
Change-Id: If51ed8f457470281c7b20a5c1a2f4ce2cf76c20f
|
|
vp9_dequantize_x86 has only sse2 functions.
vp9_dct_sse2_intrinsics has no namespace collision and can drop
_intrinsics.
vp9_idct_mmx.h is unused.
Change-Id: Ic16e31fb372a1d1e841a62ecb4189fe8f95808ec
|
|
This data can vary per-plane, but not per-block.
Change-Id: I1971b0b2c2e697d2118e38b54ef446e52f63c65a
|
|
Scalar path is about 1.3x faster (2.1% overall encoder speedup).
SSE2 path is about 5.0x faster (8.4% overall encoder speedup).
Change-Id: I360d167b5ad6f387bba00406129323e2fe6e7dda
|
|
Scalar path is about 1.3x faster (2.1% overall encoder speedup).
SSE2 path is about 5.0x faster (8.4% overall encoder speedup).
Change-Id: I360d167b5ad6f387bba00406129323e2fe6e7dda
|
|
Change-Id: Id786be31da3c91d95d2955aa569ecdc6e66650df
|
|
Scalar path is about 1.4x faster (4% overall encoder speedup).
SSE2 path is about 7x faster (13% overall encoder speedup).
Change-Id: I7e85d8225a914a74c61ea370210414696560094d
|
|
|
|
Consistent with VP8.
Change-Id: I8c316ee49f072e15abbb033a80e9c36617891f07
|
|
This function was part of an optimization used in VP8 that required
caching two macroblocks. This is unused in VP9, and might not
survive refactoring to support superblocks, so removing it for now.
Change-Id: I744e585206ccc1ef9a402665c33863fc9fb46f0d
|
|
Change-Id: I4f46d142c2a8d1e8a880cfac63702dcbfb999b78
|
|
Change-Id: Ic639f5742f7a007753d7a3fa5c66235172eb31d8
|
|
Also port the 4x4, 16x16, 8x16 and 16x8 versions to x86inc.asm; this
makes them all slightly faster, particularly on x86-64. Remove SSE3
sad16x16 version, since the SSE2 version is now faster.
About 1.5% overall encoding speedup.
Change-Id: Id4011a78cce7839f554b301d0800d5ca021af797
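For reference, the operation that the per-size SAD assembly versions speed up can be written as a plain scalar loop. The function name sad_ref and its parameter list are hypothetical; this is a sketch of the sum of absolute differences over a width x height block, not the libvpx C fallback.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference: sum of absolute differences between a
 * source block and a reference block, each with its own row stride. */
unsigned int sad_ref(const uint8_t *src, int src_stride,
                     const uint8_t *ref, int ref_stride,
                     int width, int height) {
  unsigned int sad = 0;
  int x, y;
  assert(width > 0 && height > 0);
  for (y = 0; y < height; ++y) {
    for (x = 0; x < width; ++x) {
      sad += (unsigned int)abs((int)src[x] - (int)ref[x]);
    }
    /* Advance by the stride, which may be larger than the block width. */
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}
```

A SIMD version computes the same sum for a fixed block size, handling many pixels per instruction, which is why each size (4x4, 16x16, 8x16, 16x8) gets its own entry point.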
|
|
Various fixups to resolve issues when building vp9-preview under the more stringent
checks placed on the experimental branch.
Change-Id: I21749de83552e1e75c799003f849e6a0f1a35b07
|
|
Change-Id: I9f0ae635fb9a95c4aa1529c177ccb07e2b76970b
|
|
1. Remove the dependency on the nonexistent "vp9_temporal_filter_x86.h".
2. Prefix filenames with vp9_ in obj_int_extract.bat to reflect the
change of the actual filenames.
Change-Id: Ib1b4d96ac41788f76917764a6722d8461c857302
|
|
Change-Id: Ieefd76e164ca4aa87597da0412977614ddfbacb7
|
|
Support for gyp which doesn't support multiple objects in the same
static library having the same basename.
Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc
|
|
Creates a merge between the master and experimental branches. Fixes a
number of conflicts in the build system to allow *either* VP8 or VP9
to be built. Specifically either:
$ configure --disable-vp9
$ configure --disable-vp8 --disable-unit-tests
VP9 still exports its symbols and files as VP8, so that will be
resolved in the next commit.
Unit tests are broken in VP9, but this isn't a new issue. They are
fixed upstream on origin/experimental as of this writing, but rebasing
this merge proved difficult, so that will be tackled in a second merge
commit.
Change-Id: I2b7d852c18efd58d1ebc621b8041fe0260442c21
|
|
Change-Id: Ic084c475844b24092a433ab88138cf58af3abbe4
|