author | Han Shen <shenhan@google.com> | 2017-07-12 12:56:19 -0700 |
---|---|---|
committer | Han Shen <shenhan@google.com> | 2017-07-19 13:59:32 -0700 |
commit | b72d3e8a25720494cef1911f9b9dc3e2c9090323 (patch) | |
tree | 6e9b516612349fe39180ce77961fd22ac332aefa | |
parent | 89a116f4cbcb2340ea51c7484a7503005823fed7 (diff) | |
Earmark extra space for VSX.
The backend-specific optimization for PPC VSX reads 16 bytes, whereas ARM NEON /
SSE2 read at most 8 bytes. Although the extra bytes are never actually used,
that does not justify reading past the end of the buffer. Fixed by allocating
larger buffers when building for VSX. The over-read was reported by ASan.
Also note: PPC does have an instruction that loads 64 bits from memory. lxsdx
loads one 64-bit doubleword (whereas lxvd2x loads two 64-bit doublewords).
However, the only builtin available is "vec_vsx_ld", which maps to lxvd2x; there
is no builtin for lxsdx. The only way to use lxsdx is through inline assembly,
which does not fit well with the original paradigm.
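The sizing rule described above can be sketched in portable C. This is only an illustration, not code from the commit: LOAD_WIDTH, ABOVE_OFFSET, ABOVEB_SIZE, wide_load and load_fits are hypothetical names, memcpy stands in for the real vector load, and HAVE_VSX is defaulted to 1 here purely to model a VSX build.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: HAVE_VSX normally comes from the build configuration.
   It defaults to 1 here only so the sketch models a VSX build. */
#ifndef HAVE_VSX
#define HAVE_VSX 1
#endif

/* Bytes fetched by one load: 16 for VSX, at most 8 for NEON / SSE2,
   as stated in the commit message. */
#if HAVE_VSX
#define LOAD_WIDTH 16
#else
#define LOAD_WIDTH 8
#endif

/* "Above" points 4 bytes into "Aboveb", so the backing array needs
   4 + LOAD_WIDTH bytes for a full-width load from "Above" to stay in bounds. */
#define ABOVE_OFFSET 4
#define ABOVEB_SIZE (ABOVE_OFFSET + LOAD_WIDTH)

/* Hypothetical stand-in for a full-width vector load:
   it always copies LOAD_WIDTH bytes, whether or not they are used. */
static void wide_load(unsigned char *out, const unsigned char *src) {
  memcpy(out, src, LOAD_WIDTH);
}

/* Returns nonzero when a LOAD_WIDTH-byte read starting at "offset"
   stays inside a buffer of "buf_size" bytes. */
static int load_fits(size_t buf_size, size_t offset) {
  return offset + LOAD_WIDTH <= buf_size;
}
```

With LOAD_WIDTH at 16, the old 12-byte Aboveb fails the load_fits check at offset 4, while the 20-byte version passes. The same reasoning gives 16 bytes for the buffers that are read from offset 0.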
Refer:
vsx:
vpx_tm_predictor_4x4_vsx @ third_party/libvpx/git_root/vpx_dsp/ppc/intrapred_vsx.c
neon:
vpx_tm_predictor_4x4_neon @ third_party/libvpx/git_root/vpx_dsp/arm/intrapred_neon_asm.asm
sse2:
tm_predictor_4x4 @ third_party/libvpx/git_root/vpx_dsp/x86/intrapred_sse2.asm
BUG=b/63112600
Tested:
asan tests passed.
Change-Id: I5f74b56e35c05b67851de8b5530aece213f2ce9d
-rw-r--r-- | vp8/common/reconintra.c | 8 |
-rw-r--r-- | vp8/common/reconintra4x4.c | 12 |
2 files changed, 19 insertions, 1 deletions
diff --git a/vp8/common/reconintra.c b/vp8/common/reconintra.c
index 986074ec7..8e2094da8 100644
--- a/vp8/common/reconintra.c
+++ b/vp8/common/reconintra.c
@@ -71,8 +71,16 @@ void vp8_build_intra_predictors_mbuv_s(
     unsigned char *uleft, unsigned char *vleft, int left_stride,
     unsigned char *upred_ptr, unsigned char *vpred_ptr, int pred_stride) {
   MB_PREDICTION_MODE uvmode = x->mode_info_context->mbmi.uv_mode;
+#if HAVE_VSX
+  /* Power PC implementation uses "vec_vsx_ld" to read 16 bytes from
+     uleft_col and vleft_col. Play it safe by reserving enough stack
+     space here. */
+  unsigned char uleft_col[16];
+  unsigned char vleft_col[16];
+#else
   unsigned char uleft_col[8];
   unsigned char vleft_col[8];
+#endif
   int i;
   intra_pred_fn fn;
 
diff --git a/vp8/common/reconintra4x4.c b/vp8/common/reconintra4x4.c
index 7852cf9da..64d33a287 100644
--- a/vp8/common/reconintra4x4.c
+++ b/vp8/common/reconintra4x4.c
@@ -40,7 +40,15 @@ void vp8_intra4x4_predict(unsigned char *above, unsigned char *yleft,
                           int left_stride, B_PREDICTION_MODE b_mode,
                           unsigned char *dst, int dst_stride,
                           unsigned char top_left) {
-  unsigned char Aboveb[12], *Above = Aboveb + 4;
+/* Power PC implementation uses "vec_vsx_ld" to read 16 bytes from
+   Above (aka, Aboveb + 4). Play it safe by reserving enough stack
+   space here. Similary for "Left". */
+#if HAVE_VSX
+  unsigned char Aboveb[20];
+#else
+  unsigned char Aboveb[12];
+#endif
+  unsigned char *Above = Aboveb + 4;
 #if HAVE_NEON
   // Neon intrinsics are unable to load 32 bits, or 4 8 bit values. Instead, it
   // over reads but does not use the extra 4 values.
@@ -50,6 +58,8 @@ void vp8_intra4x4_predict(unsigned char *above, unsigned char *yleft,
   // indeed read, they are not used.
   vp8_zero_array(Left, 8);
 #endif  // VPX_WITH_ASAN
+#elif HAVE_VSX
+  unsigned char Left[16];
 #else
   unsigned char Left[4];
 #endif  // HAVE_NEON