Age | Commit message (Collapse) | Author |
|
Change-Id: Ia40229903c08f14020e90e94cfdf494aba1be827
|
|
|
|
|
|
3% faster overall (3min35.0 to 3min28.5).
Change-Id: I5ff8a5c2c91586b6632ca5009ad1ea51ce94af5e
|
|
|
|
Encoding of bus @ 1500kbps (first 50 frames) goes from 3min57 to
3min35, i.e. approximately a 10.5% speedup. Note that the SIMD versions
which use a bilinear filter (x_offset & 7 || y_offset & 7) aren't
perfectly interleaved, and can probably be improved further in the
future. I've marked this with a few TODOs/FIXMEs in the code.
Change-Id: I5c9e900c0f0d32e431a50fecae213b510b2549f9
|
|
- size_t vs int.
Change-Id: Ib47ebd932a4b69db9f52a43000bb69d0a96b9134
|
|
|
|
Overall speedup around 5% (bus @ 1500kbps first 50 frames 4min10 ->
3min58). Specific changes to timings for each function compared to
original assembly-optimized versions (or just new version timings if
no previous assembly-optimized version was available):
sse2 4x4: 99 -> 82 cycles
sse2 4x8: 128 cycles
sse2 8x4: 121 cycles
sse2 8x8: 149 -> 129 cycles
sse2 8x16: 235 -> 245 cycles (?)
sse2 16x8: 269 -> 203 cycles
sse2 16x16: 441 -> 349 cycles
sse2 16x32: 641 cycles
sse2 32x16: 643 cycles
sse2 32x32: 1733 -> 1154 cycles
sse2 32x64: 2247 cycles
sse2 64x32: 2323 cycles
sse2 64x64: 6984 -> 4442 cycles
ssse3 4x4: 100 cycles (?)
ssse3 4x8: 103 cycles
ssse3 8x4: 71 cycles
ssse3 8x8: 147 cycles
ssse3 8x16: 158 cycles
ssse3 16x8: 188 -> 162 cycles
ssse3 16x16: 316 -> 273 cycles
ssse3 16x32: 535 cycles
ssse3 32x16: 564 cycles
ssse3 32x32: 973 cycles
ssse3 32x64: 1930 cycles
ssse3 64x32: 1922 cycles
ssse3 64x64: 3760 cycles
Change-Id: I81ff6fe51daf35a40d19785167004664d7e0c59d
|
|
The new print out includes skips and has prefixed sections so you can
grep to find things like transforms chosen on each frame.
Change-Id: I195043424647d9514cfc3ff6720a5b20d010fa1b
|
|
Change-Id: Ic924f07c6ab0c929c6cdf11880d3c625806e272c
|
|
Change-Id: I183a38997a9d01e4a1b869e92509f6915216fa09
|
|
|
|
|
|
This commit makes use of dual fdct32x32 versions for rate-distortion
optimization loop and encoding process, respectively. The one for
rd loop requires only 16 bits precision for intermediate steps.
The original fdct32x32 that allows higher intermediate precision (18
bits) was retained for the encoding process only.
This allows speed-up for fdct32x32 in the rd loop. No performance
loss observed.
Change-Id: I3237770e39a8f87ed17ae5513c87228533397cc3
|
|
This seems to only be used in the encoder. Also remove an empty wrapper
file that contained forward declarations for this function, but didn't
actually define any actual functions.
Change-Id: Ifc561eef7ebe374a7d03698055e51e105f6d614b
|
|
Moving single function from vp9_invtrans.c to vp9_encodemb.c.
Change-Id: I26bf6bb90de342a3036c0dbfba78a7dd75a61fe7
|
|
|
|
|
|
|
|
vp9_default_inter_mode_probs was being accessed with a different type
than it was defined with. Ensure that its declaration is included
prior to its definition.
Change-Id: I2f963f513ab2f4e339f8a3c17e3d0f03749eba16
|
|
All elements of this table are equal to 252, so replace it with a
single constant VP9_COEF_UPDATE_PROB.
Change-Id: I1e2d1d284326ce6df9899a740c2fc344b3ec81c9
|
|
|
|
The encoding time for bus at CIF goes from 661s to 625s. This commit
also enabled unit test of sad8x4/4x8 in sad_test.cc.
Change-Id: If3d10ebb56bda584bdb69bcf056599d580b12cb1
|
|
The encoding time for bus at CIF goes from 661s to 625s. This commit
also enabled unit test of sad8x4/4x8 in sad_test.cc.
Change-Id: If3d10ebb56bda584bdb69bcf056599d580b12cb1
|
|
This flag no longer needed.
Change-Id: If13482015ddb92d225792ea5c0ee455d2285d1f6
|
|
Modified to work with 8x8 blocks of memory. Will revisit
later for further optimizations. For the HD clip used, the
decoder improved by almost 20%.
Change-Id: Iaa4785be293a32a42e8db07141bd699f504b8c67
|
|
|
|
Encoding time of crew (CIF, first 50 frames) @ 1500kbps goes from 4min56
to 4min42.
Change-Id: I92c0c8b32980d2ae7c6dafc8b883a2c7fcd14a9f
|
|
Modified to work with 8x8 blocks of memory. Will revisit
later for further optimizations. For the HD clip used, the
decoder improved my 20%.
Change-Id: Ia0057f55d66d1445882351ea6c43b595a5a980e5
|
|
* changes:
Remove unused vp9_idct_add_{y,uv}_block
Remove some unused loopfilter code
|
|
|
|
|
|
These functions are not used, and appear to have been superceded.
Change-Id: I86fe51b088264f6b1b8d4d232bba97b371b98120
|
|
Change-Id: Ic6b2881d8d495269edbc514b33376ca963798b45
|
|
This code is unreachable, and not useful for later reference.
Change-Id: I4c9a9e0fbf859c1081bbcfbcda9710afb4b4741f
|
|
Change-Id: If74bc6110016bc75ea3883ab136fbbac88f6a913
|
|
The mmx routines work as expected for the loop filter, so enable them.
Change-Id: I2bbd9b99a4445fcba17bb95002f1fb6e01fe8f85
|
|
Change-Id: I9aac140d775b7b4a8727494d15b185b75501a546
|
|
Change-Id: I86be1f7421ed49d577cacf405f6e4b0daa85cfdc
|
|
Don't do the 15 tap filter if there aren't 8 pixels below/right of the
edge.
Change-Id: I62f16437c1d9ba59b6901a5fe71ddb2f472da344
|
|
This commit fixed the allowable partition types for bottom-right
corner blocks.
When a block has over half of its pixels as valid content in both
vertical and horizontal directions, allow all the four partition
types in the bit-stream. Otherwise, apply partition type constraints.
Change-Id: I2252e2de7125a8bfb1c824bf34299a13c81102e3
|
|
|
|
* New probs for subpel filters/tx_count
* Makes a change to not reset to defaults for the tx_size
probs if an intermediate frame reverts to using a fixed tx_size.
* A few updates to the parameters for backward adaptation for mode/mv
* some cosmetic cleanups
derf300: +0.06%
Change-Id: I22994d659bc31ca7a4fc8820fde24001e64a2920
|
|
|
|
Remove the bilinear filter mode, and the no-loopfilter mode, and the
related vp9_setup_version() function.
Change-Id: I32311367812faf37863131df3af37d63d03973d7
|
|
This commit has no impact but to help us debug issues. To Use call like
this:
vp9_print_modes_and_motion_vectors(cpi->common.mi, cpi->common.mi_rows,
cpi->common.mi_cols,
cpi->common.current_video_frame,
"decode_mi.stt");
Change-Id: I89e27725dae351370eb7f311a20a145ed4f1d041
|
|
Change-Id: I524ba98841f2e1850e3276ac365c501cea31546d
|
|
|
|
|