Age | Commit message | Author |
|
The functions are no longer referenced.
Change-Id: If2705dfbc607f79ec8ec2242d5e03bec27a35aaf
|
|
This function is never referenced.
Change-Id: I1c42cd355bfa88e17d169f7335a44be682af58cc
|
|
|
|
and remove unused code.
Change-Id: If380440c4450294b5450b7a9eeb94a376846ec01
|
|
|
|
Change-Id: I7960178c95c54d5c4497e44cfc8c493566294b34
|
|
|
|
- size_t vs int.
Change-Id: Ib47ebd932a4b69db9f52a43000bb69d0a96b9134
|
|
|
|
|
|
Improves the rd modeling function and implements it using interpolation
from a table, which is a little faster. Also uses sse rather than var as
input to the modeling function: since no dc prediction is used, sse
works a little better.
derfraw300: +0.05%
Speedup: ~1%
Change-Id: I151353c6451e0e8fe3ae18ab9842f8f67e5151ff
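
A minimal sketch of table-based interpolation in general, assuming a
uniformly sampled table and linear interpolation between neighbouring
entries. The table values and the names kModelTab and interp_model are
hypothetical and are not taken from the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table: samples of some model curve at x = 0, 1, 2, 3, 4.
 * The values are illustrative only and do not come from the codec. */
static const double kModelTab[] = { 0.0, 1.0, 4.0, 9.0, 16.0 };
#define MODEL_TAB_LEN (sizeof(kModelTab) / sizeof(kModelTab[0]))

/* Linearly interpolate the table at x, clamping x to the table's range. */
static double interp_model(double x) {
  const double x_max = (double)(MODEL_TAB_LEN - 1);
  size_t i;
  double frac;

  assert(x == x); /* reject NaN, which would make the index cast undefined */
  if (x <= 0.0) return kModelTab[0];
  if (x >= x_max) return kModelTab[MODEL_TAB_LEN - 1];

  i = (size_t)x;        /* index of the sample to the left of x */
  frac = x - (double)i; /* position between sample i and sample i + 1 */
  return kModelTab[i] + frac * (kModelTab[i + 1] - kModelTab[i]);
}
```

A lookup plus one multiply-add replaces evaluating the model directly,
which is where a speedup of this kind would typically come from.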
|
|
Adds a new speed feature to force partitioning to be greater than
or less than a certain size.
Change-Id: I8c048eeeef93700ae822eccf98f8751a45b2e7d0
|
|
This feature lets you set a partitioning size to be used by the entire
frame.
Change-Id: I208a4c8c701375cbb054418266f677768b6f8f06
|
|
This uses variance to decide how to split partitions. Variance is
calculated using the nearest mv, always from the last ref frame.
Change-Id: Idd015b4a9aa3bc82591759eac239680c07496896
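
A rough sketch of the idea, assuming the variance is taken over the
difference between a source block and a motion-compensated predictor, and
that a block is split when that variance exceeds a threshold. The function
names, the threshold test and the scaling are assumptions for illustration.
The commit does not describe the actual decision rule.

```c
#include <assert.h>
#include <stdint.h>

/* Variance of (src - pred) over a w x h block, scaled by the pixel count:
 * returns sse - sum * sum / n, using integer arithmetic. */
static int64_t diff_variance(const uint8_t *src, int src_stride,
                             const uint8_t *pred, int pred_stride,
                             int w, int h) {
  int64_t sum = 0, sse = 0;
  int r, c;

  assert(w > 0 && h > 0);
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const int d = src[r * src_stride + c] - pred[r * pred_stride + c];
      sum += d;
      sse += d * d;
    }
  }
  return sse - (sum * sum) / ((int64_t)w * h);
}

/* Hypothetical decision rule: split when the residual is not flat. */
static int should_split(int64_t variance, int64_t threshold) {
  return variance > threshold;
}
```

Here pred would point at the block in the last reference frame displaced
by the nearest mv. A uniform residual gives zero variance and no split.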
|
|
Change-Id: Ie24489a4d39f3e53e816eeebf75a1c9c7d94515a
|
|
Change-Id: Ideee45cad8b38087c509cd404484728e85d0c427
|
|
This uses the speed feature functionality for code.
Change-Id: I9cd16c0c5f98520ae27ebba81aa2c178546587f8
|
|
force us to go through slow partitioning for keyframes, altref and
overlays.
Change-Id: I1a286361bf74083e71973575a7296be46eb98742
|
|
Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 ->
3min58). Specific changes to timings for each function compared to
original assembly-optimized versions (or just new version timings if
no previous assembly-optimized version was available):
sse2 4x4: 99 -> 82 cycles
sse2 4x8: 128 cycles
sse2 8x4: 121 cycles
sse2 8x8: 149 -> 129 cycles
sse2 8x16: 235 -> 245 cycles (?)
sse2 16x8: 269 -> 203 cycles
sse2 16x16: 441 -> 349 cycles
sse2 16x32: 641 cycles
sse2 32x16: 643 cycles
sse2 32x32: 1733 -> 1154 cycles
sse2 32x64: 2247 cycles
sse2 64x32: 2323 cycles
sse2 64x64: 6984 -> 4442 cycles
ssse3 4x4: 100 cycles (?)
ssse3 4x8: 103 cycles
ssse3 8x4: 71 cycles
ssse3 8x8: 147 cycles
ssse3 8x16: 158 cycles
ssse3 16x8: 188 -> 162 cycles
ssse3 16x16: 316 -> 273 cycles
ssse3 16x32: 535 cycles
ssse3 32x16: 564 cycles
ssse3 32x32: 973 cycles
ssse3 32x64: 1930 cycles
ssse3 64x32: 1922 cycles
ssse3 64x64: 3760 cycles
Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d
|
|
need to rework these
Change-Id: I17dc2c88d2faadd2f8fb117c52c25f04ea2e9856
|
|
The new printout includes skips and has prefixed sections so you can
grep to find things like transforms chosen on each frame.
Change-Id: I195043424647d9514cfc3ff6720a5b20d010fa1b
|
|
|
|
Change-Id: I26e80ede80cb4389378a95afa95d229092a9859a
|
|
Since intra block decoding is handled by decode_sb_intra() separately.
Change-Id: I42d757884714084c92fc23ec5d35d4dc946f4b15
|
|
Change-Id: Iab96e6a50aec543c63e15cd134f9d5f01ca7ceff
|
|
|
|
Optimized the quantization function by making it a two-pass
process. The first pass quickly checks the transform
coefficients against the base ZBIN and keeps only the
coefficients that are large enough to need quantization. A skipping
check is added: if all coefficients are within the base ZBIN, no
quantization is needed. The second pass is the actual quantization
pass, which only processes the coefficient subset determined
in the first pass. This reduces the computation. Furthermore, an
alternative method is used for large transform sizes, which often
have sparse nonzero quantized coefficients.
Overall, the encoder speedup is about 4%. The quantization function
itself gets 20% faster.
Change-Id: I3a9dd0da6db030260b6d9c314a9fa48ecae89f22
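
The two passes described above can be sketched as follows. This is an
illustration only: the function name, the MAX_COEFFS bound, and the plain
divide-by-step quantizer are assumptions, and the real quantizer's rounding
and the alternative large-transform path are not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_COEFFS 1024 /* hypothetical bound, e.g. a 32x32 transform */

/* Quantize n coefficients in two passes.
 * Returns how many coefficients survived the first-pass zbin check. */
static int quantize_two_pass(const int16_t *coeff, int n, int zbin, int step,
                             int16_t *qcoeff) {
  int idx[MAX_COEFFS];
  int count = 0;
  int i;

  assert(n >= 0 && n <= MAX_COEFFS && zbin >= 0 && step > 0);
  memset(qcoeff, 0, (size_t)n * sizeof(*qcoeff));

  /* Pass 1: cheap threshold test, remembering only the candidates. */
  for (i = 0; i < n; ++i) {
    if (abs(coeff[i]) >= zbin) idx[count++] = i;
  }

  /* Skip check: everything is inside the zero bin, so output stays zero. */
  if (count == 0) return 0;

  /* Pass 2: quantize only the surviving subset. */
  for (i = 0; i < count; ++i) {
    const int c = coeff[idx[i]];
    const int q = abs(c) / step; /* simplified: truncating division */
    qcoeff[idx[i]] = (int16_t)(c < 0 ? -q : q);
  }
  return count;
}
```

The saving comes from running the more expensive per-coefficient work only
on the subset found in the first pass, and from returning early when that
subset is empty.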
|
|
Change-Id: Ic924f07c6ab0c929c6cdf11880d3c625806e272c
|
|
Change-Id: I183a38997a9d01e4a1b869e92509f6915216fa09
|
|
|
|
|
|
|
|
This commit uses two fdct32x32 versions, one for the rate-distortion
optimization loop and one for the encoding process. The one for the
rd loop requires only 16 bits of precision for intermediate steps.
The original fdct32x32, which allows higher intermediate precision (18
bits), is retained for the encoding process only.
This speeds up fdct32x32 in the rd loop. No performance
loss observed.
Change-Id: I3237770e39a8f87ed17ae5513c87228533397cc3
|
|
Change-Id: I927c7223996cdeb44f46e0e6c2e2054d458c300b
|
|
This seems to be used only in the encoder. Also remove an empty wrapper
file that contained forward declarations for this function but didn't
define any functions.
Change-Id: Ifc561eef7ebe374a7d03698055e51e105f6d614b
|
|
Moving single function from vp9_invtrans.c to vp9_encodemb.c.
Change-Id: I26bf6bb90de342a3036c0dbfba78a7dd75a61fe7
|
|
2.5% faster when encoding first 50 frames of bus @ 1500kbps.
Change-Id: I5a64703996cf7fd39b07e32c72311c4b125ec6d4
|
|
Change-Id: I5d3944051d091b4bf3eb13e2a30132d34203ef74
|
|
|
|
|
|
|
|
vp9_default_inter_mode_probs was being accessed with a different type
than it was defined with. Ensure that its declaration is included
prior to its definition.
Change-Id: I2f963f513ab2f4e339f8a3c17e3d0f03749eba16
|
|
All elements of this table are equal to 252, so replace it with a
single constant VP9_COEF_UPDATE_PROB.
Change-Id: I1e2d1d284326ce6df9899a740c2fc344b3ec81c9
|
|
|
|
The encoding time for bus at CIF goes from 661s to 625s. This commit
also enables the unit tests for sad8x4/4x8 in sad_test.cc.
Change-Id: If3d10ebb56bda584bdb69bcf056599d580b12cb1
|
|
|
|
No bitstream or output change - only cosmetics.
Change-Id: Ic8c1d7ad010a87dcf27d12a38cd7dd5adba683a7
|
|
Avoid calling decode_block and inverse transform/add if the block is
a skip block, for SBs smaller than 8x8 and intra-coded SBs.
Change-Id: I1684182f4a0050c8d6bb46cba6830d9425e7127d
|
|
- size_t is 64 bits on win64; int is 32 bits.
Change-Id: I4e756427ad42c841098a01a216469f65313987e7
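
One common way to handle this width mismatch is to check the range
explicitly before narrowing. The helper below is a hypothetical sketch of
that pattern and is not the change made in this commit.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: narrow a size_t to int.
 * Returns 1 and stores the value on success, 0 if it does not fit.
 * On failure *out is left untouched. */
static int size_to_int(size_t n, int *out) {
  assert(out != NULL);
  if (n > (size_t)INT_MAX) return 0;
  *out = (int)n;
  return 1;
}
```

An explicit check makes the narrowing visible at the call site instead of
relying on an implicit conversion that compilers warn about on win64.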
|