author | Anupam Pandey <anupam.pandey@ittiam.com> | 2023-04-18 14:46:56 +0530 |
---|---|---|
committer | Anupam Pandey <anupam.pandey@ittiam.com> | 2023-05-05 15:55:16 +0530 |
commit | 255ee1888589aa15ae909b992fe123c0358b1730 (patch) | |
tree | d46b2799a29b05c325497d01d2b44b33d456ff1d /vp9/decoder/vp9_decoder.c | |
parent | 24802201acd7dfa15928bcc47c1e270e7db5afac (diff) | |
Add AVX2 intrinsic for idct16x16 and idct32x32 functions
Added AVX2 intrinsic optimizations for the following functions:
1. vpx_idct16x16_256_add
2. vpx_idct32x32_1024_add
3. vpx_idct32x32_135_add
The module level scaling w.r.t C function (timer based) for existing (SSE2) and new AVX2 intrinsics:

Function Name | SSE2 scaling | AVX2 scaling |
---|---|---|
vpx_idct32x32_1024_add | 3.62x | 7.49x |
vpx_idct32x32_135_add | 4.85x | 9.41x |
vpx_idct16x16_256_add | 4.82x | 7.70x |

This is a bit-exact change.
Change-Id: Id9dda933aa1f5093bb6b35ac3b8a41846afca9d2
Diffstat (limited to 'vp9/decoder/vp9_decoder.c')
-rw-r--r-- | vp9/decoder/vp9_decoder.c | 2 |
1 file changed, 1 insertion, 1 deletion
diff --git a/vp9/decoder/vp9_decoder.c b/vp9/decoder/vp9_decoder.c
index 7db8ed72d..92cd91f1e 100644
--- a/vp9/decoder/vp9_decoder.c
+++ b/vp9/decoder/vp9_decoder.c
@@ -87,7 +87,7 @@ void vp9_dec_alloc_row_mt_mem(RowMTWorkerData *row_mt_worker_data,
   row_mt_worker_data->num_sbs = num_sbs;
   for (plane = 0; plane < 3; ++plane) {
     CHECK_MEM_ERROR(cm, row_mt_worker_data->dqcoeff[plane],
-                    vpx_memalign(16, dqcoeff_size));
+                    vpx_memalign(32, dqcoeff_size));
     memset(row_mt_worker_data->dqcoeff[plane], 0, dqcoeff_size);
     CHECK_MEM_ERROR(cm, row_mt_worker_data->eob[plane],
                     vpx_calloc(num_sbs << EOBS_PER_SB_LOG2,
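
The commit message does not say why the `dqcoeff` alignment moves from 16 to 32 bytes. A likely reading is that 256-bit AVX2 aligned loads and stores require 32-byte-aligned addresses, whereas 128-bit SSE2 ones only need 16. The sketch below illustrates that alignment property under this assumption. It uses C11 `aligned_alloc` as a stand-in for `vpx_memalign`, and the helper names `is_aligned`, `alloc_zeroed_aligned32`, and `demo_alignment` are hypothetical, not part of libvpx.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not libvpx): returns 1 if ptr is a multiple of
 * `align` bytes, 0 otherwise. */
static int is_aligned(const void *ptr, size_t align) {
  return ((uintptr_t)ptr % align) == 0;
}

/* Hypothetical stand-in for vpx_memalign(32, size) followed by memset.
 * C11 aligned_alloc expects the size to be a multiple of the alignment,
 * so the request is rounded up to the next multiple of 32. */
static void *alloc_zeroed_aligned32(size_t size) {
  size_t rounded = (size + 31) & ~(size_t)31;
  void *p = aligned_alloc(32, rounded);
  if (p != NULL) memset(p, 0, rounded);
  return p;
}

/* Allocates a coefficient-like buffer and checks that it satisfies both
 * the 32-byte and the weaker 16-byte alignment requirement.
 * Returns 1 on success, 0 if misaligned, -1 on allocation failure. */
static int demo_alignment(void) {
  int32_t *buf = (int32_t *)alloc_zeroed_aligned32(1024 * sizeof(int32_t));
  int ok;
  if (buf == NULL) return -1;
  ok = is_aligned(buf, 32) && is_aligned(buf, 16);
  assert(buf[0] == 0); /* buffer was zero-filled */
  free(buf);
  return ok;
}
```

A 32-byte-aligned buffer is always 16-byte aligned as well, so raising the alignment should not affect code that only assumes 16-byte alignment, such as the existing SSE2 path.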