Age | Commit message | Author |
|
Change-Id: Ia91c6c406273345b08505097ffe1af3896980f06
|
|
Change-Id: I5259b68dc1bcceb153e3ffe638a79a59a3019e9d
|
|
It is enough to specify (e.g.) idct16; it is obviously different from
idct16x16.
Change-Id: I6b408a37a945de3162429380b59a775b03b95db0
|
|
Separate the rounding and right shift operations of the forward transform
from those of the inverse transform. Take out the assertion check from the
inverse transforms. If the transform coefficients were constructed to
make intermediate steps of the inverse transform overflow, the codec will
simply let them overflow without breaking the decoding flow.
Change-Id: I73cfc3706c4e840fc543a77cbc4cdb0b05d07730
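The split described above can be sketched roughly as follows. This is a minimal illustration, not the committed code: the helper names, the `_SKETCH` constant and its value of 14 bits are assumptions made for the example. The inverse helper narrows to 16 bits without asserting on range, which wraps on common compilers (the narrowing is implementation-defined in C).

```c
#include <stdint.h>

/* Assumed fixed-point precision of the transform constants. */
#define DCT_CONST_BITS_SKETCH 14

/* Rounded right shift: add half of 2^n, then shift by n bits. */
#define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n) - 1))) >> (n))

/* Hypothetical forward-path helper: round, shift, keep the int range. */
static int fdct_round_shift_sketch(int64_t input) {
  return (int)ROUND_POWER_OF_TWO(input, DCT_CONST_BITS_SKETCH);
}

/* Hypothetical inverse-path helper: no range assertion. An out-of-range
 * value is simply narrowed to 16 bits and decoding continues. */
static int16_t idct_round_shift_sketch(int64_t input) {
  return (int16_t)ROUND_POWER_OF_TWO(input, DCT_CONST_BITS_SKETCH);
}
```

Keeping two helpers lets the encoder side retain stricter checks while the decoder side tolerates hostile coefficient streams.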
|
|
Adding these functions to encapsulate the tx_type check. Changing TX_TYPE to
int to match the declaration in vp9_rtcd.h.
Change-Id: I6f3a2df6e35595ca73b6aaa9e3909ee7bc3fd16f
|
|
Change-Id: I78f7012f967a777ddd39bae6671eb501df6bbfe8
|
|
For consistency with idct function names. Renames:
vp9_short_fdct4x4 -> vp9_fdct4x4
vp9_short_walsh4x4 -> vp9_fwht4x4
Change-Id: Id15497cc1270acca626447d846f0ce9199770f58
|
|
For consistency with idct function names.
Change-Id: Ie77b7178e0894c57cd5cb9243c949eb9224ece18
|
|
|
|
For consistency with idct function names.
Change-Id: I5ca355ba99fdba04f09254be95cf79808b534f71
|
|
For consistency with idct function names.
Change-Id: I7b6af2f92c66eff56f84ed29edc3a66af8dc421f
|
|
|
|
|
|
Just making fdct consistent with iht/idct/fht functions which all use
stride (# of elements) as input argument.
Change-Id: I0ba3c52513a5fdd194f1e7e2901092671398985b
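As a rough sketch of what a stride counted in elements means for a caller, the hypothetical helper below copies a 4x4 block out of a larger `int16_t` buffer. The function name and signature are assumptions for illustration only and do not appear in the codec.

```c
#include <stdint.h>

/* Hypothetical helper: gather a 4x4 block from a larger buffer.
 * `stride` is the distance between rows in int16_t elements, not bytes. */
static void load_block4x4_sketch(const int16_t *input, int stride,
                                 int16_t out[16]) {
  for (int r = 0; r < 4; ++r) {
    for (int c = 0; c < 4; ++c) {
      out[r * 4 + c] = input[r * stride + c];
    }
  }
}
```

With an 8-element-wide buffer, a caller would pass `stride = 8` regardless of the element size.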
|
|
Just making fdct consistent with iht/idct/fht functions which all use
stride (# of elements) as input argument.
Change-Id: Ibc944952a192e6c7b2b6a869ec2894c01da82ed1
|
|
Just making fdct consistent with iht/idct/fht functions which all use
stride (# of elements) as input argument.
Change-Id: I2d95fdcbba96aaa0ed24a80870cb38f53487a97d
|
|
Just making fdct consistent with iht/idct/fht functions which all use
stride (# of elements) as input argument.
Change-Id: Id623c5113262655fa50f7c9d6cec9a91fcb20bb4
|
|
Change-Id: Icbcf68b5b685a56f255ebc3859c9692accdadf9e
|
|
Also adding static to iadst16_1d and fadst16 functions.
Change-Id: I13c7df3b776f0f8efc6e80099bdb0a2f6d29edaf
|
|
Renames:
fdct4_1d -> fdct4
fadst4_1d -> fadst4
fdct8_1d -> fdct8
fadst8_1d -> fadst8
fdct16_1d -> fdct16
fadst16_1d -> fadst16
"_1d" suffix is redundant, so removing it. The same will happen with idct
in the next change sets.
Change-Id: Ibf421cd2f569146c6079269df7a31819c098265e
|
|
Change-Id: Ia21653a447040f1b472d21ebd19103b0558c4b16
|
|
The change is to better reflect the nature of the constants.
Change-Id: Icabac6e9bceefbdb3f03f8218f88ef75943c30fb
|
|
Change-Id: I76f440a917832c02d7a727697b225bac66b99f56
|
|
This commit fixed the potential overflow issue in the SSE2
implementation of 32x32 forward DCT. It resolved the corrupted
coded frames in the border of scenes.
Change-Id: If87eef2d46209269f74ef27e7295b6707fbf56f9
|
|
These serve as building blocks for SSE2 8x8 and 16x16 ADST/DCT
hybrid transform coding.
Change-Id: I4089a754c66e0c986f67d9b8ec4dfb9627ad430d
|
|
43,000 -> 5,750 cycles, about 7.5x faster.
Change-Id: Ibfd92821b9603f4ed9c256e0ececec14fa4565d0
|
|
This commit enables 8x8 DCT and hybrid transform unit tests. It
also tunes the forward hybrid transform rounding operations for
more precise round-trip performance.
Change-Id: If05c1ce59d75d641b9c6c91527d02d3a6ef498c3
|
|
This commit makes use of dual fdct32x32 versions for rate-distortion
optimization loop and encoding process, respectively. The one for
rd loop requires only 16 bits precision for intermediate steps.
The original fdct32x32 that allows higher intermediate precision (18
bits) was retained for the encoding process only.
This allows speed-up for fdct32x32 in the rd loop. No performance
loss observed.
Change-Id: I3237770e39a8f87ed17ae5513c87228533397cc3
|
|
This commit switches to a new variant of the Walsh-Hadamard Transform
by Tim Terriberry. This variant has the best compression among a
number of variants that Tim developed.
Change-Id: Icb3a88515463cfc644b17ca046fcd139db2557e9
|
|
Saves 1 add, 3 shifts (and a shift bias) per 1-D transform.
Change-Id: I1104bb1679fe342b2f9677df8a9cdc0cb9699e7d
|
|
Scalar path is about 1.3x faster (2.1% overall encoder speedup).
SSE2 path is about 5.0x faster (8.4% overall encoder speedup).
Change-Id: I360d167b5ad6f387bba00406129323e2fe6e7dda
|
|
VP9 preview bitstream 2, commit '868ecb55a1528ca3f19286e7d1551572bf89b642'
Conflicts:
vp9/vp9_common.mk
Change-Id: I3f0f6e692c987ff24f98ceafbb86cb9cf64ad8d3
|
|
|
|
Scalar path is about 1.5x faster (3.1% overall encoder speedup).
SSE2 path is about 7.2x faster (7.8% overall encoder speedup).
Change-Id: I06da5ad0cdae2488431eabf002b0d898d66d8289
|
|
The commit changed the names of files and functions to remove obsolete
references to LLM and x8.
Change-Id: I973b20fc1a55149ed68b5408b3874768e6f88516
|
|
Scalar path is about 1.4x faster (4% overall encoder speedup).
SSE2 path is about 7x faster (13% overall encoder speedup).
Change-Id: I7e85d8225a914a74c61ea370210414696560094d
|
|
Fixing code style, using array lookup instead of switch statements for
forward hybrid transforms (in the same way as for their inverses).
Consistent usage of ROUND_POWER_OF_TWO macro in appropriate places.
Change-Id: I0d3822ae11f928905fdbfbe4158f91d97c71015f
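The array-lookup approach mentioned above might look roughly like this. It is a sketch under stated assumptions: the type names, the enum values, the table name and the two stub 1-D transforms are all hypothetical placeholders (the stubs only copy or negate their input so the dispatch is visible).

```c
#include <stdint.h>

typedef void (*transform_1d_sketch)(const int16_t *in, int16_t *out);

typedef struct {
  transform_1d_sketch cols;  /* applied along columns */
  transform_1d_sketch rows;  /* applied along rows */
} transform_2d_sketch;

/* Hypothetical transform-type indices. */
enum { DCT_DCT_SK = 0, ADST_DCT_SK = 1, DCT_ADST_SK = 2, ADST_ADST_SK = 3 };

/* Placeholder for a 4-point DCT: copies the input. */
static void fdct4_stub(const int16_t *in, int16_t *out) {
  for (int i = 0; i < 4; ++i) out[i] = in[i];
}

/* Placeholder for a 4-point ADST: negates the input. */
static void fadst4_stub(const int16_t *in, int16_t *out) {
  for (int i = 0; i < 4; ++i) out[i] = (int16_t)-in[i];
}

/* One table entry per transform type replaces a switch statement. */
static const transform_2d_sketch FHT_4_SKETCH[] = {
  { fdct4_stub,  fdct4_stub  },  /* DCT_DCT_SK   */
  { fadst4_stub, fdct4_stub  },  /* ADST_DCT_SK  */
  { fdct4_stub,  fadst4_stub },  /* DCT_ADST_SK  */
  { fadst4_stub, fadst4_stub }   /* ADST_ADST_SK */
};

/* Run only the column pass for the selected transform type. */
static void fht4_columns_sketch(const int16_t *in, int16_t *out, int tx_type) {
  FHT_4_SKETCH[tx_type].cols(in, out);
}
```

A table keeps the forward and inverse paths structurally identical and makes adding a transform type a one-line change.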
|
|
|
|
The commit improves the 32x32 forward dct implementation:
1. change to use the same constants and rounding as other forward dcts
2. select rounding to specifically minimize the round-trip error, which
improved the average from 19/block to 0.77/block over 100,000 random inputs.
Tests showed a small but consistent gain on all test sets, about 0.15%.
Change-Id: If0afd6a71880a522f60c1c234be0462092c2eb53
|
|
Pitch now means the number of elements, not the number of bytes.
Change-Id: Idb9f2f012e39b09d596a3cc1802305a80b7c13af
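The difference between the two conventions can be illustrated with a pair of hypothetical row-pointer helpers. Both names are invented for this sketch, and the byte-based variant is only an assumption about how the older convention would be used.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed older convention: pitch counts bytes, so it must be converted
 * to an element count before indexing an int16_t buffer. */
static const int16_t *row_from_byte_pitch_sketch(const int16_t *base, int row,
                                                 int pitch_bytes) {
  return base + (size_t)row * ((size_t)pitch_bytes / sizeof(int16_t));
}

/* Newer convention: pitch already counts elements. */
static const int16_t *row_from_element_pitch_sketch(const int16_t *base,
                                                    int row, int pitch) {
  return base + (size_t)row * (size_t)pitch;
}
```

For a 16-element-wide `int16_t` buffer, a byte pitch of 32 and an element pitch of 16 address the same row.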
|
|
Increase the first stage dynamic range by a factor of 4, and reduce it
back with proper rounding before applying the second stage. Hence
it still fits in the given dynamic range and slightly improves
the key frame coding performance.
Change-Id: Ia4c5907446f20a95dc3de079c314b3ad1221d8aa
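Why scaling up before a stage and rounding back afterwards helps can be seen in a toy example. The stage below is not a codec butterfly; it is an assumed stand-in (a truncating fixed-point multiply by 0.75) chosen only to show the precision effect, and all names are hypothetical.

```c
/* Toy stage: fixed-point multiply by 0.75, truncating the fraction. */
static int toy_stage_sketch(int x) {
  return (x * 3) >> 2;
}

/* Direct path: the fractional part is lost inside the stage. */
static int direct_sketch(int x) {
  return toy_stage_sketch(x);
}

/* Scaled path: multiply the input by 4, run the stage at higher
 * precision, then divide by 4 with rounding to nearest. */
static int scaled_sketch(int x) {
  int wide = toy_stage_sketch(x * 4);
  return (wide + 2) >> 2;
}
```

For an input of 1 the exact result is 0.75: the direct path truncates it to 0, while the scaled path rounds it to 1.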
|
|
Rebased.
Remove the old matrix multiplication transform computation. The 16x16
ADST/DCT can be switched on/off and evaluated by setting ACTIVE_HT16
300/0 in vp9/common/vp9_blockd.h.
Change-Id: Icab2dbd18538987e1dc4e88c45abfc4cfc6e133f
|
|
This commit added pre/post scaling for the first half of fDCT16x16 to
reduce error. In a simulation of 100,000 blocks of random inputs,
the average SSE was reduced from 2.1/block to 0.0498/block.
It also enabled tests for 16x16 fDCT and iDCT.
Change-Id: Id2a95f0464c6dd4118797d456237ae90274c0f02
|
|
The commit added a final rounding choice for 8x8 forward dct to get
rid of a sign bias at the DC position and improve the accuracy in terms
of round-trip error for 8x8 fDCT/iDCT.
This commit also enabled the forward 8x8 dct test.
Change-Id: Ib67f99b0a24d513e230c7812bc04569d472fdc50
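One common way a sign bias can arise in a final scaling step is the choice between a shift and a division. The sketch below is an assumption made for illustration, not a description of what this commit changed; the function names are hypothetical, and the shift behaviour on negative values is implementation-defined in C (arithmetic on common compilers).

```c
/* Halve with an arithmetic right shift: rounds toward negative infinity
 * on common compilers, so -1 maps to -1 while +1 maps to 0. */
static int halve_shift_sketch(int x) {
  return x >> 1;
}

/* Halve with integer division: truncates toward zero, so +1 and -1
 * both map to 0 and the result is symmetric around zero. */
static int halve_div_sketch(int x) {
  return x / 2;
}
```

Averaged over many blocks, the asymmetric variant pulls small coefficients toward negative values, which shows up most clearly at the DC position.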
|
|
This patch includes 4x4, 8x8, and 16x16 forward butterfly ADST/DCT
hybrid transform. The kernel of 4x4 ADST is sin((2k+1)*(n+1)/(2N+1)).
The kernel of 8x8/16x16 ADST is of the form sin((2k+1)*(2n+1)/4N).
Change-Id: I8f1ab3843ce32eb287ab766f92e0611e1c5cb4c1
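Written out in full, the two kernels quoted above would read roughly as follows. This assumes the factor of pi that the plain-text message leaves out, takes k as the frequency index and n as the sample index (both from 0 to N-1), and omits normalization constants.

```latex
% 4x4 ADST kernel (N = 4)
\phi_k(n) = \sin\!\left(\frac{(2k+1)(n+1)\,\pi}{2N+1}\right)

% 8x8 and 16x16 ADST kernel (N = 8 or N = 16)
\phi_k(n) = \sin\!\left(\frac{(2k+1)(2n+1)\,\pi}{4N}\right)
```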
|
|
Change-Id: I7b7b8d4fda3a23699e0c920d727f8c15d37d43aa
|
|
Change-Id: I4f46d142c2a8d1e8a880cfac63702dcbfb999b78
|
|
Used same algorithm as others.
Change-Id: Ifdac560762aec9735cb4bb6f1dbf549e415c38a0
|
|
Removal of experiment to simplify code base for other
changes.
Change-Id: If0a33952504558511926ad212bc311fc2bffb19a
|
|
Use consistent algorithm.
Change-Id: Ib8484821ebc454b9d3380a3d6571798decd037f3
|