Age | Commit message (Collapse) | Author |
|
Separate the rounding and right shift operations of forward transform
from those of inverse transform. Take out the assertion check from
inverse transforms. If the transform coefficients were constructed to
cause intermediate steps of inverse transform overflow, the codec will
just let it overflow without breaking the decoding flow.
Change-Id: I73cfc3706c4e840fc543a77cbc4cdb0b05d07730
|
|
Also renaming dest_stride to stride in some places.
Change-Id: I75f602b623a5a7071d4922b747c45fa0b7d7a940
|
|
Renames:
vp9_iht_add -> vp9_iht4x4_add
vp9_iht_add_8x8 -> vp9_iht8x8_add
vp9_iht_add_16x16 -> vp9_iht16x16_add
Change-Id: I8f1a2913e02d90d41f174f27e4ee2fad0dbd4a21
|
|
Also adding static to iadst16_1d and fadst16 functions.
Change-Id: I13c7df3b776f0f8efc6e80099bdb0a2f6d29edaf
|
|
Renames:
vp9_short_idct32x32_add -> vp9_idct32x32_1024_add
vp9_short_idct32x32_1_add -> vp9_idct32x32_1_add
vp9_idct_add_32x32 -> vp9_idct32x32_add
Change-Id: Id85306f5814bac6c47463a6b5901a93082510666
|
|
Renames:
vp9_short_idct16x16_add -> vp9_idct16x16_256_add
vp9_short_idct16x16_10_add -> vp9_idct16x16_10_add
vp9_short_idct16x16_1_add -> vp9_idct16x16_1_add
vp9_idct_add_16x16 -> vp9_idct16x16_add
Change-Id: Ief8a3904de78deab0f4ede944c4d0339c228cfc3
|
|
Renames:
vp9_short_idct8x8_add -> vp9_idct8x8_64_add
vp9_short_idct8x8_1_add -> vp9_idct8x8_1_add
vp9_short_idct8x8_10_add -> vp9_idct8x8_10_add
vp9_idct_add_8x8 -> vp9_idct8x8_add
Change-Id: Ifb8d3a45b4c0397aa805b30463f3d14581bf72c1
|
|
The idea is to have the following names for each transform size:
vp9_idct4x4_add
vp9_idct4x4_1_add
vp9_idct4x4_10_add
vp9_idct4x4_16_add
vp9_idct8x8_add
vp9_idct8x8_1_add
vp9_idct8x8_10_add
vp9_idct8x8_64_add
etc for 16x16, 32x32
The actual list of renames in this patch:
vp9_idct_add_lossless -> vp9_iwht4x4_add
vp9_short_iwalsh4x4_add -> vp9_iwht4x4_16_add
vp9_short_iwalsh4x4_1_add -> vp9_iwht4x4_1_add
vp9_idct_add -> vp9_idct4x4_add
vp9_short_idct4x4_add -> vp9_idct4x4_16_add
vp9_short_idct4x4_1_add -> vp9_idct4x4_1_add
Change-Id: I6f43f7437c68dd30cdd05d72e213765578ed30b1
|
|
Moving functions from vp9_idct_blk to vp9_idct because these functions are
used from both encoder and decoder. Removing duplicated code from
vp9_encodemb.c and reusing existing functions.
Change-Id: Ia0a6782f8c4c409efb891651b871dd4bf22d5fe8
|
|
The change is to better reflect the nature of the constants.
Change-Id: Icabac6e9bceefbdb3f03f8218f88ef75943c30fb
|
|
Change-Id: I76f440a917832c02d7a727697b225bac66b99f56
|
|
Enable SSE2 implementation of high precision 32x32 forward DCT. The
intermediate stacks are of 32-bits. The run-time goes down from
32126 cycles to 13442 cycles.
Change-Id: Ib5ccafe3176c65bd6f2dbdef790bd47bbc880e56
|
|
Removing unused and duplicated constants, moving them from *.h to *.c
if possible.
Change-Id: Ief4d6b984a3ca2e9b38504f0d855ed072cf7133f
|
|
43,000 -> 5,750 cycles, about 7.5x faster.
Change-Id: Ibfd92821b9603f4ed9c256e0ececec14fa4565d0
|
|
This commit makes use of dual fdct32x32 versions for rate-distortion
optimization loop and encoding process, respectively. The one for
rd loop requires only 16 bits precision for intermediate steps.
The original fdct32x32 that allows higher intermediate precision (18
bits) was retained for the encoding process only.
This allows speed-up for fdct32x32 in the rd loop. No performance
loss observed.
Change-Id: I3237770e39a8f87ed17ae5513c87228533397cc3
|
|
The commit changed to use a new variant of Walsh-Hadamard Transform
by Tim Terriberry. This new variant has the best compression among a
number of variants that developed by Tim.
Change-Id: Icb3a88515463cfc644b17ca046fcd139db2557e9
|
|
Wrote sse2 functions of vp9_short_idct8x8 and vp9_short_idct10_8x8.
Compared to c version, the sse2 version is 2X faster. The decoder
test didn't show noticeable gain since 8x8 idct doesn't take much
of decoding time (less than 1% in my test).
Change-Id: I56313e18cd481700b3b52c4eda5ca204ca6365f3
|
|
Change-Id: I44660975e9985310d8c654c158ee7a61291b5a08
|
|
Wrote a SSE2 vp9_short_idct4x4llm to improve the decoder
performance.
Change-Id: I90b9d48c4bf37aaf47995bffe7e584e6d4a2c000
|
|
Scalar path is about 1.4x faster (4% overall encoder speedup).
SSE2 path is about 7x faster (13% overall encoder speedup).
Change-Id: I7e85d8225a914a74c61ea370210414696560094d
|
|
Fixing code style, using array lookup instead of switch statements for
forward hybrid transforms (in the same way as for their inverses).
Consistent usage of ROUND_POWER_OF_TWO macro in appropriate places.
Change-Id: I0d3822ae11f928905fdbfbe4158f91d97c71015f
|
|
|
|
Wrote SSE2 version of vp9_dc_only_idct_add_c function. In order to
improve performance, clipped the absolute diff values to [0, 255].
This allowed us to keep the additions/subtractions in 8 bits.
Test showed an over 2% decoder performance increase.
Change-Id: Ie1a236d23d207e4ffcd1fc9f3d77462a9c7fe09d
|
|
The commit improves the 32x32 forward dct implementation:
1. change to use same constants and rounding as other forward dcts
2. select rounding to specifically minimize the roundtrip error, which
improved average 19/block to .77/block using 100000 random input.
Test showed a small but consistent gain on all test sets, about .15%
Change-Id: If0afd6a71880a522f60c1c234be0462092c2eb53
|
|
Rebased.
Remove the old matrix multiplication transform computation. The 16x16
ADST/DCT can be switched on/off and evaluated by setting ACTIVE_HT16
300/0 in vp9/common/vp9_blockd.h.
Change-Id: Icab2dbd18538987e1dc4e88c45abfc4cfc6e133f
|
|
Removing redundant 'extern' keywords and parentheses, fixing indentation,
making variable names lower case, using short expressions x *= c
instead of x = x * c, minor code simplifications.
Change-Id: If6a25fcf306d1db26e90d27e3c24a32735c607de
|
|
Change-Id: I7c6e3bebd94856b24dbe2aded7f9e04ef8bb8c08
|
|
fixed format issues.
Implement the inverse 4x4 ADST using 9 multiplications. For this
particular dimension, the original ADST transform can be
factorized into simpler operations, hence is retained.
Change-Id: Ie5d9749942468df299ab74e90d92cd899569e960
|
|
also removed some un-unsed functions.
Change-Id: Ie363bcc8d94441d054137d2ef7c4fe59f56027e5
|