Age | Commit message | Author |
|
Start grouping data per-plane, as part of refactoring to support
additional planes, and chroma planes with other-than 4:2:0
subsampling.
Change-Id: Idb76a0e23ab239180c818025bae1f36f1608bb23
|
|
Wrote SSE2 versions of vp9_short_idct8x8 and vp9_short_idct10_8x8.
Compared to the C version, the SSE2 version is 2x faster. The decoder
test didn't show a noticeable gain since the 8x8 IDCT doesn't take much
of the decoding time (less than 1% in my test).
Change-Id: I56313e18cd481700b3b52c4eda5ca204ca6365f3
|
|
Scalar path is about 1.5x faster (3.1% overall encoder speedup).
SSE2 path is about 7.2x faster (7.8% overall encoder speedup).
Change-Id: I06da5ad0cdae2488431eabf002b0d898d66d8289
|
|
Picks up some build system changes, compiler warning fixes, etc.
Change-Id: I2712f99e653502818a101a72696ad54018152d4e
|
|
|
|
Change-Id: Id786be31da3c91d95d2955aa569ecdc6e66650df
|
|
The sse4_1 code used uint16_t for returning the SAD, but that
won't work for 32x32 or 64x64. This change fixes the
assembly for those and also re-enables sse4_1 on Linux.
Change-Id: I5ce7288d581db870a148e5f7c5092826f59edd81
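The range problem described in this message can be sketched in C, assuming 8-bit samples. The helper names `sad_block` and `max_sad_8bit` are hypothetical and are not taken from the libvpx source; this is an illustration of why a 16-bit return value is too narrow, not the assembly that the commit changes.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference SAD (sum of absolute differences), not the
 * libvpx routine. The accumulator and return type are unsigned int so
 * that large blocks do not wrap. */
static unsigned int sad_block(const uint8_t *src, int src_stride,
                              const uint8_t *ref, int ref_stride,
                              int width, int height) {
  unsigned int sad = 0;
  for (int r = 0; r < height; ++r) {
    for (int c = 0; c < width; ++c)
      sad += (unsigned int)abs(src[c] - ref[c]);
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}

/* Largest possible SAD for 8-bit samples: every pixel differs by 255. */
static unsigned int max_sad_8bit(int width, int height) {
  return 255u * (unsigned int)width * (unsigned int)height;
}
```

Under these assumptions a 16x16 block tops out at 65280, which still fits in a uint16_t (maximum 65535), while 32x32 and 64x64 blocks can reach 261120 and 1044480, which do not.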
|
|
Scalar path is about 1.4x faster (4% overall encoder speedup).
SSE2 path is about 7x faster (13% overall encoder speedup).
Change-Id: I7e85d8225a914a74c61ea370210414696560094d
|
|
This function was part of an optimization used in VP8 that required
caching two macroblocks. This is unused in VP9, and might not
survive refactoring to support superblocks, so removing it for now.
Change-Id: I744e585206ccc1ef9a402665c33863fc9fb46f0d
|
|
s/movd/movq/
Change-Id: Id1a56de91551f8dc796f14f1056c565dfc1ba626
|
|
Change-Id: I4f46d142c2a8d1e8a880cfac63702dcbfb999b78
|
|
Change-Id: Ic639f5742f7a007753d7a3fa5c66235172eb31d8
|
|
Also port the 4x4, 16x16, 8x16 and 16x8 versions to x86inc.asm; this
makes them all slightly faster, particularly on x86-64. Remove SSE3
sad16x16 version, since the SSE2 version is now faster.
About 1.5% overall encoding speedup.
Change-Id: Id4011a78cce7839f554b301d0800d5ca021af797
|
|
7.5% faster overall encoding.
Change-Id: Ie9bb7f9fdf93659eda106404cb342525df1ba02f
|
|
Overall encoding about 15% faster.
Change-Id: I176a775c704317509e32eee83739721804120ff2
|
|
Some projects must define only win64 for Windows 64bit builds using
yasm.
Change-Id: I1d09590d66a7bfc8b4412e1cc8685978ac60b748
|
|
During master jenkins verification process
Change-Id: I3722b8753eaf39f99b45979ce407a8ea0bea0b89
|
|
Various fixups to resolve issues when building vp9-preview under the more stringent
checks placed on the experimental branch.
Change-Id: I21749de83552e1e75c799003f849e6a0f1a35b07
|
|
Change-Id: I6e43ca73f35401a974ed8ee27738d4318f09fd37
|
|
Change-Id: Ibabf18947f90cb4f45052763ebf44cfb8209bd8b
|
|
|
|
Change-Id: I467bf0fdf3b35326bcce58d5459e6d2dbfd6c5e5
|
|
Change-Id: I2c252f3ddcc99e96c1f5d3dab8bcb25a2a3637ea
|
|
Change-Id: I1f49d96cdb5e342041c9a72ef31df361a1b609eb
|
|
and some miscellaneous invoke leftovers
Change-Id: I63191b1bfd3bea4ce30cceaeb686ec850570fc43
|
|
Support for gyp which doesn't support multiple objects in the same
static library having the same basename.
Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc
|
|
This removes functions that are no longer needed and cleans up some warnings.
Change-Id: I292a4c3694e9c1d68ce99cea390905b198434719
|
|
Change-Id: I18ca713b02a5241bdb20dddcde0216467b55b596
|
|
Include upstream changes (variance fixes) into the merged code base.
Change-Id: I4182654c1411c1b15cd23235d3822702613abce1
|
|
Include upstream changes (unit test fixes, in particular) into the
merged code base.
Change-Id: I096f8a9d09e2532fbec0c95d7a995ab22fa54b29
|
|
Creates a merge between the master and experimental branches. Fixes a
number of conflicts in the build system to allow *either* VP8 or VP9
to be built. Specifically either:
$ configure --disable-vp9
$ configure --disable-vp8 --disable-unit-tests
VP9 still exports its symbols and files as VP8, so that will be
resolved in the next commit.
Unit tests are broken in VP9, but this isn't a new issue. They are
fixed upstream on origin/experimental as of this writing, but rebasing
this merge proved difficult, so will tackle that in a second merge
commit.
Change-Id: I2b7d852c18efd58d1ebc621b8041fe0260442c21
|
|
In the variance calculations the difference is summed and later squared.
When the sum exceeds sqrt(2^31), its square no longer fits in a signed
32-bit int, so the value is treated as negative when it is shifted,
which gives incorrect results.
To fix this we force the multiplication to be unsigned.
The alternative fix is to shift sum down by 4 before multiplying.
However that will reduce precision.
For 16x16 blocks the maximum sum is 65280 and sqrt(2^31) is 46340 (and
change).
This change is based on:
1698234 Missed some variance casts
fea3556 Fix variance overflow
Change-Id: I2c61856cca9db54b9b81de83b4505ea81a050a0f
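The arithmetic in this message can be sketched in C, assuming a 16x16 block of 8-bit samples and a 32-bit int. The names `variance16x16_from_sums` and `square_overflows_int32` are hypothetical and are not taken from the libvpx source; the cast only illustrates the kind of fix described, forcing the multiplication to be unsigned.

```c
#include <stdint.h>

/* Hypothetical helper, not a libvpx function: variance of a 16x16 block
 * computed from the sum of differences and the sum of squared
 * differences. The operands are cast before multiplying so the product
 * is computed as unsigned 32-bit. 16 * 16 = 256 pixels, hence the
 * shift by 8. */
static uint32_t variance16x16_from_sums(int sum, uint32_t sse) {
  return sse - (((uint32_t)sum * (uint32_t)sum) >> 8);
}

/* Hypothetical check: returns 1 if sum * sum would not fit in a signed
 * 32-bit int. The product is formed in 64 bits so the check itself
 * cannot overflow. */
static int square_overflows_int32(int sum) {
  int64_t wide = (int64_t)sum * (int64_t)sum;
  return wide > INT32_MAX;
}
```

With the maximum 16x16 sum of 65280, the square is 4261478400, which exceeds the signed 32-bit limit but still fits in an unsigned 32-bit value, so the unsigned multiply keeps full precision without the shift-by-4 alternative.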
|
|
Removed invoke search from encoder
Change-Id: I3d809b795abe6df0e71366edfe94026aaede14fb
|
|
Change-Id: Ic084c475844b24092a433ab88138cf58af3abbe4
|