Age | Commit message | Author |
|
Change-Id: I7bc991abea383db1f86c1bb0f2e849837b54d90f
|
|
Change-Id: I3ef9a9648841374ed3cc865a02053c14ad821a20
|
|
~60-65% faster at the function level across block sizes
Change-Id: Iaf8cbe95731c43fdcbf68256e44284ba51a93893
|
|
~50-60% faster depending on the width
Change-Id: I9d007cfa10b9aaa2169c8c009d95522df6123a92
|
|
|
|
Clean up the forward 2D-DCT function names in vpx_dsp.
Change-Id: I3117978596d198b690036e7eb05fe429caf3bc25
|
|
This completes the forward transform functions layout refactoring.
Change-Id: I996fb0fb795f41e2040f7b21db985774098aedbd
|
|
~60-70% faster depending on the block size
Change-Id: Icdbaa9977a91a63cbcc6ead0cf19d5a2af7f27e1
|
|
This commit replaces vp9_idct.h with txfm_common.h in many SIMD
implementation files for precise file dependency.
Change-Id: If73dd726bb16537e7494f28538b0a169810f9756
|
|
This commit factors the 4x4, 8x8, and 16x16 2D-DCT forward
transform operations into vpx_dsp folder.
Change-Id: I084b117b79c0925edcbcabb93f62b9f4bf8dbe7d
|
|
BUG=https://code.google.com/p/webm/issues/detail?id=1023
Change-Id: I212a1d67b23ce3b5ce08800de369b25b9e375e7d
|
|
|
|
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
|
|
|
|
Change-Id: Ia5044d13c09685c401191fe87fbf90d36203aadd
|
|
Factor out the subtraction operator as common function.
Change-Id: I526e703477c6a290e0e3e3c8898f8bb1ca82779b
|
|
BUG=https://code.google.com/p/webm/issues/detail?id=1022
Change-Id: I510c3b0a70158fa2e4da554f7c5d7558021a6ddf
|
|
The only difference between the two was that the vp9 function allowed
for every step in the bilinear filter (16 steps) while vp8 only allowed
for half of those. Since every call site in vp9 shifts the input left
by one (<< 1), it only ever used the same steps as vp8.
This will allow moving the subpel variance to vpx_dsp with the rest of
the variance functions.
Change-Id: I6fa2509350a2dc610c46b3e15bde98a15a084b75
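
The equivalence described in this message can be sketched in scalar C. This is only an illustration under stated assumptions: the helper names are hypothetical, and the taps are assumed to be 2-tap weights that sum to 128 and are spaced linearly. Nothing here is copied from the libvpx tables.

```c
#include <assert.h>

#define FILTER_WEIGHT 128 /* assumption: the two taps always sum to 128 */

/* Hypothetical helper: compute the tap pair for position `pos` of a
 * bilinear filter with `steps` positions per pixel. */
static void bilinear_taps(int pos, int steps, int taps[2]) {
  assert(steps > 0 && pos >= 0 && pos < steps);
  taps[1] = pos * FILTER_WEIGHT / steps;
  taps[0] = FILTER_WEIGHT - taps[1];
}

/* Returns 1 if indexing a 16-step filter with (offset << 1) gives the
 * same taps as indexing an 8-step filter with offset, for all offsets. */
static int doubled_index_matches_half_table(void) {
  int offset;
  for (offset = 0; offset < 8; ++offset) {
    int fine[2], coarse[2];
    bilinear_taps(offset << 1, 16, fine);
    bilinear_taps(offset, 8, coarse);
    if (fine[0] != coarse[0] || fine[1] != coarse[1]) return 0;
  }
  return 1;
}
```

Under these assumptions a caller that always doubles its offset only ever touches the even entries of the finer table, which is why a single 8-step function can serve both codecs.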
|
|
subpel functions will be moved in another patch.
Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce
|
|
This macro was used inconsistently and only differs in behavior from
DECLARE_ALIGNED when an alignment attribute is unavailable. It is used
with calls to assembly, while generic C code doesn't rely on it, so a
C-only build without an alignment attribute will function as expected.
Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79
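
For readers unfamiliar with this kind of macro, here is a minimal sketch of an alignment-declaring macro with a plain fallback. SKETCH_DECLARE_ALIGNED and the helper functions are hypothetical names invented for this illustration, and the GCC/Clang attribute syntax is an assumption about the toolchain. It is not the definition used in the tree.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
/* Assumption: the compiler supports the GNU-style aligned attribute. */
#define SKETCH_DECLARE_ALIGNED(n, typ, val) typ val __attribute__((aligned(n)))
#else
/* No alignment attribute available: fall back to a plain declaration.
 * That is sufficient for C code that never requires aligned loads. */
#define SKETCH_DECLARE_ALIGNED(n, typ, val) typ val
#endif

/* Returns 1 if pointer p is a multiple of n, where n is a power of two. */
static int is_aligned(const void *p, uintptr_t n) {
  assert(n != 0 && (n & (n - 1)) == 0);
  return ((uintptr_t)p & (n - 1)) == 0;
}

/* Declares a 16-byte-aligned local buffer and reports its alignment. */
static int aligned_buffer_demo(void) {
  SKETCH_DECLARE_ALIGNED(16, uint8_t, buf[64]);
  buf[0] = 0;
  return is_aligned(buf, 16);
}
```

Only code handed to assembly or SIMD routines with aligned loads needs the guarantee, which matches the reasoning in the message about C-only builds.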
|
|
Create a new component, vpx_dsp, for code that can be shared
between codecs. Move the SAD code into the component.
This reduces the size of vpxenc/dec by 36k on x86_64 builds.
Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
|
|
Vestigial. Replace instances with memset(), which they were already
defined to.
Change-Id: Ie030cfaaa3e890dd92cf1a995fcb1927ba175201
|
|
On Nexus 7 speed -6 saw ~18% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: I70ccdea0326750552ed946fb004507d6efe02d5c
|
|
On Nexus 7 speed -6 saw ~15% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: I4b2006b644c488f42bf06d8a22ef0e6120a96bf9
|
|
On Nexus 7 speed -6 saw ~30% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
BUG=https://code.google.com/p/webm/issues/detail?id=908
Change-Id: Id12af7d1883243c23e6692e898aea82299633d58
|
|
On Nexus 7 speed -5 got ~2%, -6 got ~15%, -7 and -8 got ~30%
increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
Change-Id: I83246d63b96674d170098a572fa4fe28a05aaf51
|
|
|
|
The 16 bit sum vector was overflowing.
Change-Id: I0fdf38e832ee99457ec8680a92691a6175ff8c3f
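
To illustrate the kind of overflow mentioned here, the following scalar sketch assumes 8-bit pixels, so each difference lies in [-255, 255]. The function names are hypothetical and the code stands in for the SIMD routine only conceptually.

```c
#include <assert.h>
#include <stdint.h>

/* Sum of pixel differences accumulated in 32 bits, which is wide enough
 * for the block sizes used in this sketch (up to 4096 pixels). */
static int32_t sum_diff_wide(const uint8_t *a, const uint8_t *b, int n) {
  int32_t sum = 0;
  int i;
  assert(n >= 0);
  for (i = 0; i < n; ++i) sum += a[i] - b[i];
  return sum;
}

/* Returns 1 if the true sum would not fit a signed 16-bit accumulator. */
static int overflows_int16(int32_t sum) {
  return sum > INT16_MAX || sum < INT16_MIN;
}

/* Worst case for n pixels: every difference is +255. */
static int32_t worst_case_sum(int n) {
  static uint8_t a[4096], b[4096];
  int i;
  assert(n >= 0 && n <= 4096);
  for (i = 0; i < n; ++i) {
    a[i] = 255;
    b[i] = 0;
  }
  return sum_diff_wide(a, b, n);
}
```

With these assumptions a single 16-bit lane can absorb at most 128 worst-case differences, so wider blocks need the partial sums widened before they accumulate further.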
|
|
On Nexus 7 speed -5, -6, -7, and -8 saw about a 1% increase
in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 1.5%
increase in perf for 720p.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
Change-Id: Ibf17ebfd952a6aec941719bd8306df8ec4574bee
|
|
|
|
On Nexus 7 speed -5, -6, -7, and -8 saw about a 15% increase
in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 10%
increase in perf for 720p.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
Change-Id: I2fa5315845e3021c9a6e2ea47e52e68b398d8334
|
|
Saves 5 instructions on 8x8 and 16x16 and 8 instructions
on 32x32, when compiled with 4.9.
Change-Id: Id3da613a36a9d27d8c5169c59ba45d247c920c6c
|
|
Add optimized Neon functions of:
vp9_variance32x64
vp9_variance64x32
vp9_variance64x64
On Nexus 7 speed -5 and -6 saw about a 4% increase in perf.
Speeds -7 and -8 saw about a 6% increase in perf.
Tested on Nexus 7, built with ndk r10d, gcc 4.9.
Change-Id: I5a81f13c9897eb927fa39662530f5524a0f768fa
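
As a point of reference for what such functions compute, here is a scalar sketch. It assumes the usual block-variance definition (sum of squared differences minus the squared sum divided by the pixel count) and 8-bit pixels. The names block_variance_c and constant_offset_has_zero_variance are hypothetical, and this is not the Neon code from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar block variance between src and ref for a w x h block.
 * Writes the sum of squared differences to *sse and returns
 * sse - sum * sum / (w * h). */
static uint32_t block_variance_c(const uint8_t *src, int src_stride,
                                 const uint8_t *ref, int ref_stride,
                                 int w, int h, uint32_t *sse) {
  int64_t sum = 0;
  uint64_t sq = 0;
  int r, c;
  assert(w > 0 && h > 0);
  for (r = 0; r < h; ++r) {
    for (c = 0; c < w; ++c) {
      const int d = src[r * src_stride + c] - ref[r * ref_stride + c];
      sum += d;
      sq += (uint64_t)(d * d);
    }
  }
  *sse = (uint32_t)sq;
  return (uint32_t)(sq - (uint64_t)((sum * sum) / (w * h)));
}

/* Demo on a 64x32 block: if src differs from ref by a constant offset,
 * the variance is 0 and the SSE is (pixel count) * offset^2. */
static int constant_offset_has_zero_variance(int offset) {
  static uint8_t src[64 * 64], ref[64 * 64];
  uint32_t sse = 0;
  uint32_t var;
  int i;
  assert(offset >= 0 && offset <= 255);
  for (i = 0; i < 64 * 64; ++i) {
    ref[i] = 0;
    src[i] = (uint8_t)offset;
  }
  var = block_variance_c(src, 64, ref, 64, 64, 32, &sse);
  return var == 0 && sse == (uint32_t)(64 * 32 * offset * offset);
}
```

A reference like this is typically what an optimized version is checked against, since both must agree bit for bit on the returned variance and SSE.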
|
|
This reverts commit 9946ee23e0a4c158e26a505b162a072f81b8a3be.
Fix the ssse3 asm function.
Change-Id: I07f77a63aa98087626e45c4e87aa5dcafc0b0b07
|
|
This reverts commit e9b586e21bb899e247346e82bccf5afb42604910.
Change-Id: I5b36e6727da6c05278d97e2c37b80c109f79bed4
|
|
zbin extra / zbin_oq_value was widely passed around,
hence removal touches a lot of code.
Change-Id: Idc94359735b60c38a160e4385ae09d5ca8b6b8e5
|
|
Eliminated instructions by using better neon instructions
and rearranging the loop.
On a Nexus 7, vpxenc (in realtime mode, speed -12)
reported a performance improvement of ~1.0%.
Change-Id: I6b1700e79318f647ea67ef25e954c308932950ec
|
|
|
|
vp9_variance8x8(), and vp9_get8x8var().
On a Nexus 7, vpxenc (in realtime mode, speed -12)
reported a performance improvement of ~1.2%.
Change-Id: I8a66ac2a0f550b407caa27816833bdc563395102
|
|
|
|
Change-Id: I3be8911121ef9a5f39f6c1a2e28f9e00972e0624
|
|
On a Nexus 7, vpxenc (in realtime mode, speed -12)
reported a performance improvement of ~3.2%.
Change-Id: I8862497264142171b7efc32df1a67714a23539f4
|
|
vp9_variance32x32(), and vp9_get32x32var().
Change-Id: I8137e2540e50984744da59ae3a41e94f8af4a548
|
|
On a Nexus 7, vpxenc (in realtime mode, speed -12)
reported a performance improvement of ~12.4%.
Change-Id: Id29d215acf58bb108489e218a259adf74b4768d7
|
|
vp9_variance16x16(), and vp9_get16x16var().
On a Nexus 7, vpxenc (in realtime mode, speed -12)
reported a performance improvement of ~16.7%.
Change-Id: Ib163aa99f56e680194aabe00dacdd7f0899a4ecb
|
|
On a Nexus 7, vpxenc (in realtime mode, speed -12)
reported a performance improvement of ~3.7%.
Change-Id: I428c72c40df82c6d537955e320a8debf99343004
|
|
and vp9_sad16x16_neon()
On a Nexus 7, vpxenc (in realtime mode, speed -6)
reported a performance improvement of ~17%.
Change-Id: I91e070cde2973451083d3f3d63b49b7886de9a85
|