Age | Commit message (Collapse) | Author |
|
|
|
|
|
|
|
Minor code cleanup.
Change-Id: I47c1f794842d4570bb39cfd23b80f54f5606bba6
|
|
|
|
|
|
|
|
When the codec in VBR (or cq) mode hits its max q limits and is
struggling to hit a target bandwidth, the bit target per frame collapses.
In the first instance normal frames cap out at the maximum allowed
Q and then the ARF and GFs do the same. This latter behavior is not
generally desirable as GFs and ARFs are only effective from a quality
and data rate perspective if they have at lease some level of -Q delta
compared to the surrounding frames.
In this patch I define a separate max Q for GFs and ARFs that is
derived from but somewhat lower than that defined for normal frames.
In effect there is a minimum Q delta that will always be available for
GFs and ARFs regardless of the target rate and MAXQ setting.
This may of course mean that the absolute lowest rate obtainable for
a given clip is somewhat higher.
Change-Id: I268868b28401900d0cd87e51e609cd3b784ab54a
|
|
For VBR coding disable the recode loop for speeds > 0.
Results pending.
Change-Id: I2cd9a87c3fcbe39c05b954798d0671a4ca62c37f
|
|
Change-Id: I5222e3db2627a3a9f7fc34f2ab4554aa5807ed51
|
|
We have two SSE2-optimized functions for idct4_1d:
vp9_idct4_1d_sse2 <-- removing this one
idct4_1d_sse2
vp9_idct4_1d_sse2 was used only by the following functions which already
have SSE2 optimized variants:
vp9_idct4x4_16_add_c -> vp9_idct4x4_16_add_see2
idct8_1d -> vp9_idct8x8_{16, 10, 1}_see2
vp9_short_iht4x4_add_c -> vp9_short_iht4x4_add_see2
Change-Id: Ib0a7f6d1373dbaf7a4a41208cd9d0671fdf15edb
|
|
byte version of ronalds d207 ssse3 optimizations
(commit: f891f84d3ba9345b0074e682f0fea09b8ddf4f1e)
Change-Id: If15f71a589ea16f78ac86a501b0c5c6231dc9af1
|
|
|
|
|
|
|
|
To ensure fast encoding/decoding on devices without ssse3 support,
SSE2 optimization of sub-pixel filters was done. Test using 1080p
clip showed the decoder speeds were ~70fps with ssse3 filters, ~60fps
with sse2 filters, and ~15fps with c filters.
Change-Id: Ie2088f87d83a889fba80a613e4d0e287aadd785c
|
|
|
|
|
|
|
|
Change-Id: Ifef756a3a91423bb9f5411f06fa092027be21ecf
|
|
Renames:
fdct4_1d -> fdct4
fadst4_1d -> fadst4
fdct8_1d -> fdct8
fadst8_1d -> fadst8
fdct16_1d -> fdct16
fadst16_1d -> fadst16
"_1d" suffix is redundant, so removing it. The same will happen with idct
in the next change sets.
Change-Id: Ibf421cd2f569146c6079269df7a31819c098265e
|
|
Renames:
vp9_short_idct32x32_add -> vp9_idct32x32_1024_add
vp9_short_idct32x32_1_add -> vp9_idct32x32_1_add
vp9_idct_add_32x32 -> vp9_idct32x32_add
Change-Id: Id85306f5814bac6c47463a6b5901a93082510666
|
|
This commit re-designs the per transformed block rate-distortion
costs tracking buffers. It removes redundant buffer usage, makes
the needed context memory allocation per VP9_COMP instance and
reuses the same buffer sets inside the rate-distortion optimization
search loop, thereby avoiding repeatedly requiring memory space.
It reduces speed 0 runtime:
bus at 2000 kbps from 166763ms to 158967ms,
football at 600 kbps from 246614ms to 234257ms.
Both about 5% speed-up. Local tests suggest about 2% to 5% speed-up
for speed 1 and 2 settings. This does not change compression
performance.
Change-Id: I363514c5276b5cf9a38c7251088ffc6ab7f9a4c3
|
|
Change-Id: Id5e31833a0ef40de9f64c2f5674af7083233bf14
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Increases these parameters.
There is a small efficiency gain.
Change-Id: Ie5f0ddb39c907d335e0dafa5eb112365a81f4542
derfraw300: +0.091%
stdhdraw250: +0.238%
|
|
Change-Id: I7231589bda71d0d23c730283febd5bb58585a0da
|
|
|
|
The intra mode distortion adjustment for skip_encode feature was
broken in the refactoring cc91851. This commit fixes it and tunes
the distortion models used therein.
Change-Id: I0d676e82f8e855536a90cf9b3e3fdefafcd886c6
|
|
|
|
snprintf is not supported by MSVC, the commit replace it with the msvc
variant _snprintf to enable build.
Change-Id: I686943a78c289bae6b486a5e75effad5f86c24de
|
|
|
|
Some minor cleanups in preparation for experimentation with
some encode parameters and thresholds
Change-Id: I449d66da97eae0a7acdf4aae374e2f9111342056
|
|
|
|
Use b_mode_info to store the inter prediction mode of sub8x8 block,
in replacement of the use of partition_info. Remove redundant buffer
update for partition_info. For bus_cif at 2000 kbps, this seem to make
speed 0 about 1% faster.
Change-Id: Id1b3be45e75a24fb4b42335ac480c23e440978f6
|
|
Change-Id: Ic31b4ef85e65070b4f8b9f26e068ccfaae00c4f0
|
|
|
|
Change-Id: Id1fde9920d60c6991a8ef6de5103ae3e578312ed
|
|
|
|
When all coefficients are zeros, skip the corresponding 1-D inverse
transform. This practice has been used in the SSE2 implementation of
inverse 32x32 DCT. This commit imports this algorithm into the C code.
Change-Id: I0f58bfcb183a569fab85d524d5d9cf8ae8653f86
|
|
We already have itxm_add member in MACROBLOCKD structure. Both
inv_txm4x4_1_add and inv_txm4x4_add are just its special cases for
different eob values. But eob logic is already implemented in
vp9_iwht4x4_add and vp9_idct4x4_add (that's why also removing
inverse_transform_b_4x4_add).
Change-Id: I80bec9b6f7d40c5e5033c613faca5c819c3e6326
|
|
|
|
|