Age | Commit message (Collapse) | Author |
|
Change-Id: I20c7b42631b579fade6cf7ebf6d4c69b2fcb5e5e
|
|
This commit moves the module inverse transform functions from vp9
to vpx_dsp folder. The hybrid transform wrapper functions stay in
the vp9 folder, since it involves codec-specific data structures.
Change-Id: Ib066367c953d3d024c73ba65157bbd70a95c9ef8
|
|
Clean up the forward 2D-DCT function names in vpx_dsp.
Change-Id: I3117978596d198b690036e7eb05fe429caf3bc25
|
|
This completes the forward transform functions layout refactoring.
Change-Id: I996fb0fb795f41e2040f7b21db985774098aedbd
|
|
Move the 32x32 2D-DCT implementations from vp9/ to vpx_dsp/.
Change-Id: Id3980696f8b69906ff7a59ff9fb2b9013d60047d
|
|
Change-Id: Iba03852ce778c956200818e3473cfb2b48cf8d8e
|
|
This commit replaces vp9_idct.h with txfm_common.h in many SIMD
implementation files for precise file dependency.
Change-Id: If73dd726bb16537e7494f28538b0a169810f9756
|
|
Separate the common coefficient constant into vpx_dsp/txfm_common.h.
Move the SSE2 macro definitions to vpx_dsp/x86/txfm_common_sse2.h.
This clears the use case of vp9_idct.h in vpx_dsp folder.
Change-Id: I319735a2abf42888e5080ac14cfbcde34be7b121
|
|
Change-Id: I283d364a4e65ca9bf6ff581da1d0b498433c5402
|
|
This commit factors the 4x4, 8x8, and 16x16 2D-DCT forward
transform operations into vpx_dsp folder.
Change-Id: I084b117b79c0925edcbcabb93f62b9f4bf8dbe7d
|
|
Remove redundant file dependency.
Change-Id: I4708218157617dabe00e2e33e237be2838c16603
|
|
The SSE2 version high bit-depth forward hybrid transforms are
essentially using the C functions via cross referencing to 1-D
functions in vp9_dct.c. This commit unifies the two versions and
removes the unnecessary dependency.
Change-Id: Ib4d0702a138f8daf7d0bd97c141ee7088f293765
|
|
The following quantization functions were moved:
vp9_quantize_b
vp9_quantize_b_32x32
vp9_highbd_quantize_b
vp9_highbd_quantize_b_32x32
vp9_quantize_dc
vp9_quantize_dc_32x32
vp9_highbd_quantize_dc
vp9_highbd_quantize_dc_32x32
The purpose of doing that was to allow these functions to be shared
by multiple codecs.
Change-Id: Id8ab939f283353cdd07bd930d47db3d932a5d87f
|
|
The clamp calls with INT32_MIN and INT32_MAX have no effect at all on
int values passed in, therefore this commit removes those effectless
clamps and also adds more const intermediate results to make the code
more readable.
Change-Id: I66d8811f58bb74ec31cbec9a6c441983a662352e
|
|
Change-Id: I1bab0c104df2ec4825d050cd516e26ab635a7b3e
|
|
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
|
|
Factor out the subtraction operator as common function.
Change-Id: I526e703477c6a290e0e3e3c8898f8bb1ca82779b
|
|
This commit fixes a potential integer overflow issue in function
hadamard_16x16. It adds corresponding dynamic range comment.
Change-Id: Iec22f3be345fb920ec79178e016378e2f65b20be
|
|
The only difference between the two was that the vp9 function allowed
for every step in the bilinear filter (16 steps) while vp8 only allowed
for half of those. Since all the call sites in vp9 (<< 1) the input, it
only ever used the same steps as vp8.
This will allow moving the subpel variance to vpx_dsp with the rest of
the variance functions.
Change-Id: I6fa2509350a2dc610c46b3e15bde98a15a084b75
|
|
subpel functions will be moved in another patch.
Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce
|
|
this file shouldn't be built directly, it is included in vp9_dct_sse2.c
to create a non-high-bitdepth and a high-bitdepth version
silences missing prototype warnings for the unused FDCT* functions
Change-Id: Ide6ff8c24ab31bdb0f833260505ae33660a1ad5b
|
|
this file shouldn't be built directly, it is included in vp9_dct_sse2.c
to create a non-high-bitdepth and a high-bitdepth version
silences missing prototype warnings for the unused FDCT32x32* functions
Change-Id: I0e38f16dae5ea1728de184ee2c89287d48675c51
|
|
this file shouldn't be built directly, it is included in vp9_dct_avx2.c
to create a non-high-bitdepth and a high-bitdepth version
silences missing prototype warnings for the unused FDCT32x32* functions
Change-Id: I4c19935c0e035b393be513bde735e9a78064a494
|
|
silences a missing declaration warning
Change-Id: I59a34e1a1377cf3529b678d7ec0122bd43ab1bf1
|
|
+ include vp9_rtcd.h
silences missing prototype warnings
Change-Id: I77902f07a454029baad4fe5fe6fc37c65644e6f7
|
|
silences missing prototype warnings
Change-Id: I773b6a6b5bd7c57db18c3b17c519534f80e131de
|
|
With the sad functions, and hopefully the variance functions soon,
moving to the vpx_dsp location, place the defines used in the
reference C code in a common location.
Change-Id: I4c8ce7778eb38a0a3ee674d2f1c488eda01cfeca
|
|
this macro was used inconsistently and only differs in behavior from
DECLARE_ALIGNED when an alignment attribute is unavailable. this macro
is used with calls to assembly, while generic c-code doesn't rely on it,
so in a c-only build without an alignment attribute the code will
function as expected.
Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79
|
|
Create a new component, vpx_dsp, for code that can be shared
between codecs. Move the SAD code into the component.
This reduces the size of vpxenc/dec by 36k on x86_64 builds.
Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
|
|
vestigial. replace instances with memset() which they already were being
defined to.
Change-Id: Ie030cfaaa3e890dd92cf1a995fcb1927ba175201
|
|
vestigial. replace instances with memcpy() which they already were being
defined to.
Change-Id: Icfd1b0bc5d95b70efab91b9ae777ace1e81d2d7c
|
|
This reverts commit 004b9d83e37d355f590a6976a27b7b845d19a869
Change-Id: I2f2d0bdb9368c2c07f1d29a69cd461267a3a8743
|
|
This reverts commit eb8c667570aa83134c7db0690de9dbdde4d90291.
The patch caused mismatch while using multi-threads.
Change-Id: Icd646340af25b5d91e32f03ed3ea212e00e3e0be
|
|
Force split on 16x16 block (to 8x8) based on the minmax over the 8x8 sub-blocks.
Also increase variance threshold for 32x32, and add exit condiiton in choose_partition
(with very safe threshold) based on sad used to select reference frame.
Some visual improvement near moving boundaries.
Average gain in psnr/ssim: ~0.6%, some clips go up ~1 or 2%.
Encoding time increase (due to more 8x8 blocks) from ~1-4%, depending on clip.
Change-Id: I4759bb181251ac41517cd45e326ce2997dadb577
|
|
|
|
It uses about 10% less CPU cycles than the SSE2 intrinsic
implementation.
Change-Id: I91017c0c068679a214b98cdd4cff3a6facfb7499
|
|
|
|
|
|
Change-Id: If0ca8b25b4800d4336e6cbc97194cd9b01c5b5a3
|
|
|
|
Use 6 xmms instead of 8.
Change-Id: If976ad85d09191d2fb0565399d690f2869dbbcc7
|
|
This commit separates Hadamard transform/quantization operations
from rate and distortion computation in block_yrd. This allows one
to skip SATD computation when all transform blocks are quantized
to zero. It also uses a new block error function that skips
repeated computation of sum of squared residuals. It reduces the
CPU cycles spent on block error calculation in block_yrd by 40%.
Change-Id: I726acb2454b44af1c3bd95385abecac209959b10
|
|
This commit allows the quantizer to compare the AC coefficients to
the quantization step size to determine if further multiplication
operations are needed. It makes the quantization process 20% faster
without coding statistics change.
Change-Id: I735aaf6a9c0874c82175bb565b20e131464db64a
|
|
This reduces the 8x8 Hadamard transform cycles by 20%.
Change-Id: If34c5e02f3afa42244c6efabe121f7cf5d2df41b
|
|
This commit fixes the SSE2 version 8x8 Hadamard transform
alignment and makes it consistent with the C version.
Change-Id: I1304e5f97e0e5ef2d798fe38081609c39f5bfe74
|
|
This commit replaces the 16x16 2D-DCT transform with Hadamard
transform for RTC coding mode. It reduces the CPU cycles cost
on 16x16 transform by 5X. Overall it makes the speed -6 encoding
speed 1.5% faster without compromise on compression performance.
Change-Id: If6c993831dc4c678d841edc804ff395ed37f2a1b
|
|
This commit uses Hadamard transform based rate-distortion cost
estimate for rtc coding mode decision. It improves the compression
performance of speed -6 for many hard clips at lower bit-rates.
For example, 5.5% for jimredvga, 6.7% for mmmoving, 6.1% for
niklas720p. This will introduce extra encoding cycle costs at
this point.
Change-Id: Iaf70634fa2417a705ee29f2456175b981db3d375
|
|
add an assert to validate 'in' array size
Change-Id: Ie5a24275c066d9dd59714f6104510abbd4850dc5
|
|
add an assert to validate 'in' array size
Change-Id: Ib72946a86f34e1ce8a69954e8e3e4fe1a0f18a91
|
|
Move the scaling factor outside column projection. This avoids
repeated calculation of the same scaling factor. Profiling shows
that the percentage of vp9_int_pro_col_sse2 of overall cycles
goes from 2.29% down to 1.88%.
Change-Id: I5ac4e324ab2d7f33ba2de66dd2a12e04e04dfd66
|