Age | Commit message | Author |
|
|
|
This reverts commit d9dce2f48eed1368a44c368fa87a506bd89ffec5.
The change appears to fail the SixtapPredict tests in some configurations, and possibly some test vectors as well.
Change-Id: Ica6aa83ebac47d0a76e451846e7da67b1c17a7d7
|
|
This function was removed when clang started introducing alignment hints
which caused the 32 bit vld1_lane_u32/vst1_lane_u32 to fail:
https://llvm.org/bugs/show_bug.cgi?id=24421
The load is now safe: the replacement uses the _u8 variants and
over-reads slightly, with no discernible difference in performance.
It is still ~5x faster than C in the unaligned case with both filters
applied.
BUG=webm:892
BUG=webm:1273
Change-Id: Icf7167189391b46202f47233bb585c24c42bcc36
|
|
This function was removed when clang started introducing alignment hints
which caused the 32 bit vld1_lane_u32/vst1_lane_u32 to fail:
https://llvm.org/bugs/show_bug.cgi?id=24421
The load is now safe: the replacement uses the _u8 variants and
over-reads slightly, with no discernible difference in performance.
Unaligned stores use a safe version that is ~25% slower when
xoffset = 0 (second pass filter only). When the first pass filter
(or both) is in play, the new version is almost identical in speed.
Worst case performance (both filters, unaligned stores) is roughly 3-4x
faster than C.
BUG=webm:817
BUG=webm:1273
Change-Id: I1e490e94453e0872151fe0dafb05557463f6247d
|
|
Change-Id: I1fa81cc9cabf362a185fc3a53f1e58de533a41e5
|
|
Applied the following regex:
search for: (for.*\(.*;.*;) ([a-zA-Z_]*)\+\+\)
replace with: \1 ++\2)
This misses some for loops, e.g.:
for (mb_col = 0; mb_col < oci->mb_cols; mb_col++, mi++)
Change-Id: Icf5f6fb93cced0992e0bb71d2241780f7fb1f0a8
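The search/replace pair above can be tried out with a small sketch. The commit does not say which tool applied the regex; this version assumes POSIX `<regex.h>` with extended syntax, and `rewrite_post_increment` is a hypothetical helper name, not something from the tree.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the source tree): applies the commit's
 * search/replace pair to one line of text.
 * Returns 1 if the line was rewritten, 0 if the pattern did not match
 * (out then holds an unchanged copy), -1 if the pattern failed to compile. */
static int rewrite_post_increment(const char *line, char *out,
                                  size_t out_size) {
  /* Same pattern as in the commit message, escaped for a C string. */
  static const char kPattern[] = "(for.*\\(.*;.*;) ([a-zA-Z_]*)\\+\\+\\)";
  regex_t re;
  regmatch_t m[3];
  int matched;

  if (regcomp(&re, kPattern, REG_EXTENDED) != 0) return -1;
  matched = (regexec(&re, line, 3, m, 0) == 0);
  regfree(&re);

  if (!matched) {
    snprintf(out, out_size, "%s", line);
    return 0;
  }

  /* prefix + \1 + " ++" + \2 + ")" + rest of the line */
  snprintf(out, out_size, "%.*s%.*s ++%.*s)%s",
           (int)m[0].rm_so, line,
           (int)(m[1].rm_eo - m[1].rm_so), line + m[1].rm_so,
           (int)(m[2].rm_eo - m[2].rm_so), line + m[2].rm_so,
           line + m[0].rm_eo);
  return 1;
}
```

A loop whose last clause increments two variables has no `; name++)` tail, so the pattern leaves it untouched, which matches the miss noted in the message.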
|
|
Change-Id: I7605b6678014a5426ceb45c27b54885e0c4e06ed
|
|
These implementations rely on casting the pointers to load the data.
Clang implemented optimizations which automatically add alignment hints
to such loads. The 4x4 filters do not guarantee the necessary alignment
so the resulting assembly is broken.
https://llvm.org/bugs/show_bug.cgi?id=24421
BUG=webm:817
BUG=webm:892
Change-Id: I608885299f1f86ff83653b65e0e40d0ae87fb3fe
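For illustration only, the alignment problem described above can be avoided in portable C by copying bytes instead of casting the pointer. This is a minimal sketch, not the code that was committed; `load_u32_unaligned` and `store_u32_unaligned` are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the source tree): reads 4 bytes from an
 * address that may not be 4-byte aligned. Going through memcpy() makes no
 * alignment promise to the compiler, unlike *(const uint32_t *)src. */
static uint32_t load_u32_unaligned(const uint8_t *src) {
  uint32_t value;
  memcpy(&value, src, sizeof(value));
  return value;
}

/* Hypothetical helper: writes 4 bytes to a possibly unaligned address. */
static void store_u32_unaligned(uint8_t *dst, uint32_t value) {
  memcpy(dst, &value, sizeof(value));
}
```

Compilers typically lower a fixed-size memcpy() like this to a single load or store where the target allows it, so the cost is usually small, though that depends on the toolchain.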
|
|
I've added a few new functions (d45e, d63e, he, ve) to cover the
filtered h/v 4x4 predictors that are vp8-specific, the "correct"
d45 with the correctly filtered bottom-right pixel (as opposed to
the unfiltered version in vp9), and the "broken" d63 with weirdly
filtered bottom-right pixels (which is correctly filtered in vp9).
There may be a minor performance impact on all systems because we
have to do an extra copy of the Above pixel array to incorporate
the topleft pixel in the same array (thus fitting the vpx_dsp API).
In addition, armv6 will have a more serious performance impact because
I removed the armv6/vp8-specific assembly. I'm not sure anyone
cares...
Change-Id: I7f9e5ebee11d8e21aca2cd517a69eefc181b2e86
|
|
Change-Id: I936c2430c3c5b1e0ab5dec0a20110525e925b5e4
|
|
Change-Id: I2000820e0c04de2c975d370a0cf7145330289bb2
|
|
Avoid conflict with vpx_dsp version
Change-Id: I041b1532a9276400a5547de8dfed1de43ad4e83d
|
|
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
|
|
take 2. localize the function parameter to actually remove the warning
Change-Id: I23c02061b5e21b0b75bd33c26062d1e531df7b92
|
|
the vector used in vld*_lane_* should be initialized before use
Change-Id: Idce95354737915f6fb4e6b5e8980a050e953036d
|
|
the vector used in vld*_lane_* should be initialized before use
Change-Id: I6b791088479fec3bc021ca75cc2af5adcc39d954
|
|
only uint8 is required; each use only loads one value as a uint8.
this quiets a few type conversion warnings
Change-Id: I03dc0dc0eb01ac23a6e8673daa2b77c6c57bf1b0
|
|
subpel functions will be moved in another patch.
Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce
|
|
this macro was used inconsistently and only differs in behavior from
DECLARE_ALIGNED when an alignment attribute is unavailable. this macro
is used with calls to assembly, while generic c-code doesn't rely on it,
so in a c-only build without an alignment attribute the code will
function as expected.
Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79
|
|
Create a new component, vpx_dsp, for code that can be shared
between codecs. Move the SAD code into the component.
This reduces the size of vpxenc/dec by 36k on x86_64 builds.
Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
|
|
vestigial. replace instances with memset(), which they were already
defined to.
Change-Id: Ie030cfaaa3e890dd92cf1a995fcb1927ba175201
|
|
In vp8_sub_pixel_variance8x8_neon the temp2 buffer is only initialized
to kHeight8 * kWidth8. However, in the case that xoffset != 0 and
yoffset == 0, var_filter_block2d_bil_w8 is called with output_width
kHeight8PlusOne.
Thanks to cmugurel for diagnosing and yulius for the patch.
Change-Id: Ib71ffd96ffad963c92b8b7ca23f303942785b8e0
https://code.google.com/p/webrtc/issues/detail?id=4190
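The sizing issue above can be sketched in scalar C. This is an assumption-laden illustration, not the NEON code: the constant names come from the message, but their values (8, 8, 9), the scalar filter, and both function names are hypothetical.

```c
#include <stdint.h>

/* Constant names are taken from the commit message; the values are assumed. */
enum { kWidth8 = 8, kHeight8 = 8, kHeight8PlusOne = kHeight8 + 1 };

/* Hypothetical scalar stand-in for a first-pass bilinear filter.
 * Writes rows * cols values to dst, reading cols + 1 pixels per source row. */
static void first_pass_sketch(const uint8_t *src, int src_stride,
                              uint16_t *dst, int rows, int cols,
                              int f0, int f1) {
  int r, c;
  for (r = 0; r < rows; ++r) {
    for (c = 0; c < cols; ++c) {
      dst[r * cols + c] =
          (uint16_t)((src[c] * f0 + src[c + 1] * f1 + 64) >> 7);
    }
    src += src_stride;
  }
}

/* The intermediate buffer is sized for the largest row count it is ever
 * called with (kHeight8PlusOne), not just kHeight8. */
static uint16_t last_intermediate_value(const uint8_t *src, int src_stride) {
  uint16_t temp[kHeight8PlusOne * kWidth8];
  first_pass_sketch(src, src_stride, temp, kHeight8PlusOne, kWidth8, 64, 64);
  return temp[kHeight8PlusOne * kWidth8 - 1];
}
```

With a buffer of only kHeight8 * kWidth8 entries, the ninth row written here would land past the end of the array.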
|
|
The version check was incorrectly matching some versions of clang
which reported as gcc 4.2
Change-Id: I686d3576e71883fe1463206b56ab5e2aa9bb68a8
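Clang defines `__GNUC__` and `__GNUC_MINOR__` as 4 and 2, so a version test on those macros alone also matches clang. A minimal sketch of a stricter check follows; the macro and function names are hypothetical, and the actual condition used in the tree may differ.

```c
/* Hypothetical macro (not from the source tree): the GCC version as
 * major * 100 + minor, or 0 when the compiler is not really GCC.
 * Clang also defines __GNUC__, so it has to be excluded explicitly. */
#if defined(__GNUC__) && !defined(__clang__)
#define SKETCH_REAL_GCC_VERSION (__GNUC__ * 100 + __GNUC_MINOR__)
#else
#define SKETCH_REAL_GCC_VERSION 0
#endif

/* Returns 1 only for a genuine GCC older than major.minor. */
static int is_real_gcc_older_than(int major, int minor) {
  return SKETCH_REAL_GCC_VERSION != 0 &&
         SKETCH_REAL_GCC_VERSION < major * 100 + minor;
}
```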
|
|
fix compile warning.
Change-Id: If8706a9046436f704c597e4275a6810c76ba7daa
|
|
NEON intrinsics.
Change-Id: I975e5eac16f8b623ff589f0ec072cdaff2183b04
|
|
make bifilter4_coeff[][] uint8_t, no values exceed this range and
they're loaded with vdup_n_u8().
Change-Id: I921983e9edd828d29820e40ac30a7801dbe0fb4f
|
|
|
|
This reverts commit 677fb5123e0ece1d6c30be9d0282e1e1f594a56d
Compiles with 4.6.
Change-Id: I7f87048911b6bc28a61741d95501fa45ee97b819
|
|
Use the right return values - vget_low_s64 returns int64x1_t, not
a normal int64_t.
Also make __builtin_prefetch a no-op on MSVC for this file.
Change-Id: I4d2fce01d0ba106b98d3d53b137803119c2c2c08
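One way to make the prefetch a no-op on MSVC is a wrapper macro, sketched below. `SKETCH_PREFETCH` and `sum_rows` are hypothetical names, and the sketch assumes any non-MSVC compiler provides `__builtin_prefetch`, which the commit only states for GCC and RVCT/ARMCC.

```c
#include <stdint.h>

/* Hypothetical wrapper (not from the source tree): expands to nothing
 * useful on MSVC, and to the builtin elsewhere. */
#if defined(_MSC_VER) && !defined(__clang__)
#define SKETCH_PREFETCH(addr) ((void)(addr))
#else
#define SKETCH_PREFETCH(addr) __builtin_prefetch(addr)
#endif

/* Hypothetical example user: sums a block of pixels, hinting the next row
 * before working on the current one. */
static uint32_t sum_rows(const uint8_t *src, int stride, int rows, int cols) {
  uint32_t sum = 0;
  int r, c;
  for (r = 0; r < rows; ++r) {
    /* Only hint rows that exist, so no out-of-range pointer is formed. */
    if (r + 1 < rows) SKETCH_PREFETCH(src + stride);
    for (c = 0; c < cols; ++c) sum += src[c];
    src += stride;
  }
  return sum;
}
```

The prefetch is only a hint, so the result is the same whether or not the macro expands to anything.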
|
|
vp8_build_intra_predictors_mbuv_s().
This patch replaces the assembly version with an intrinsic
version.
On a Nexus 7, vpxenc (in realtime mode, speed -12)
reported a performance improvement of ~2.6%.
Change-Id: I9ef65bad929450c0215253fdae1c16c8b4a8f26f
|
|
Add vp8_subpixelvariance_neon.c
- vp8_sub_pixel_variance16x16_neon_func
- vp8_variance_halfpixvar16x16_h_neon
- vp8_variance_halfpixvar16x16_v_neon
- vp8_variance_halfpixvar16x16_hv_neon
- vp8_sub_pixel_variance8x8_neon
Change-Id: I3e5d85b2eafc26be0eef6a777789b80e4579257b
Signed-off-by: James Yu <james.yu@linaro.org>
|
|
reverts commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb.""
|
|
This reverts commit 928ff03889dadc3f63883553b443c08e625b4885
Compiles with 4.6 now.
Change-Id: Ib455da1098bb0e0623248be07579882a425fcbd1
|
|
commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb."
This reverts commit 920f803f2e2f41395311f96fec1b4a0c1b2b631a
Change-Id: I410d9036214a1b18427cca70b4bc6d8239740737
|
|
quiets uninitialized warnings on the first load.
Change-Id: I58a5af337087d96b4eaea8991a0f85c4ba58aebe
|
|
quiets uninitialized warnings on the first load.
Change-Id: Ied9b03928537a9ed2cd414b9e8a0be00191b0f32
|
|
These optimizations are currently disabled.
Change-Id: I19c58c9cb82d017638b86196641b9e001dfa798b
|
|
With the intrinsics it is no longer necessary to have a stub/helper
function.
Change-Id: I3695961c3c94f1bb750d3b7b29716e509ebba482
|
|
Allow disabling the more generic NEON code.
Use filtered option to disable rtcd code.
Change-Id: Icb4500c1a2bac16eed3c5e3ec0c35e92e6bbbb9f
|
|
This reverts commit 81ad047ee57ecb0e2c1ee4dcebda54a44ea54ae9.
Revert "VP8 for ARMv8 by using NEON intrinsics 15"
This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb.
This exposes a bug in gcc 4.9 regarding register allocation. Will reland
when 4.9 is fixed.
Change-Id: I2d8a04e4edde93719280e41550f4c0765608ec4d
|
|
Allow selectively building just the intrinsics for armv8
Change-Id: I2f29b2e4508b8b8e5649c2906b3159ad1d4ec477
|
|
|
|
|
|
Both GCC and RVCT/ARMCC support __builtin_prefetch, but MSVC
doesn't.
Change-Id: I44e1eecead61bc88d8fdfd3fef03d76d4f5afe08
|
|
This fixes building with MSVC.
Change-Id: I763ba8855c8083d82c8b477d3a297e310e93a335
|
|
This reverts commit c500fc22c1bb2a3ae5c318bfb806f7e9bd57ce25
There is an issue with gcc 4.6 in the Android NDK:
loopfiltersimpleverticaledge_neon.c: In function 'vp8_loop_filter_bvs_neon':
loopfiltersimpleverticaledge_neon.c:176:1: error: insn does not satisfy its constraints:
Change-Id: I95b6509d12f075890308914cc691b813d2e5cd9f
|
|
This reverts commit a5d79f43b963ced59b462206faf3b7857bdeff7b
There is an issue with gcc 4.6 in the Android NDK:
loopfilter_neon.c: In function 'vp8_loop_filter_vertical_edge_y_neon':
loopfilter_neon.c:394:1: error: insn does not satisfy its constraints:
Change-Id: I2b8c6ee3fa595c152ac3a5c08dd79bd9770c7b52
|
|