author: levytamar82 <levytamar82@gmail.com> 2013-11-21 15:49:29 -0700
committer: levytamar82 <levytamar82@gmail.com> 2014-02-14 15:08:42 -0700
commit: 3068d7d94428d32e0c33a5d3061ba8e362838a41
tree: 945a47822c6a8db9123b3db4ab6dcfc7de44a9a8
parent: bb07de7ccea40c145548e8d49752bcccdd08c248
SSSE3 convolution optimization
Optimizing all SSSE3 assembly for convolution:
1. vp9_filter_block1d4_h8_sse2
2. vp9_filter_block1d8_h8_sse2
3. vp9_filter_block1d16_h8_sse2
4. vp9_filter_block1d4_v8_sse2
5. vp9_filter_block1d8_v8_sse2
6. vp9_filter_block1d16_v8_sse2

The optimizations include:
- processing 2x8 elements in one 128-bit register instead of
  processing 8 elements in one 128-bit register.
- removing unnecessary loads.

These optimizations give a user-level gain of 2.4% for 480p input
and 1.6% for 720p input.
This optimization is done only for 64-bit builds.

Change-Id: Ic07fce2f9360329b4f2d956efda1480ae958766b
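
The "2x8 elements in one 128-bit register" idea can be illustrated with a small scalar model. This is not the assembly from this commit. It is a sketch that assumes the packing refers to PMADDUBSW-style pairing, where 16 unsigned 8-bit pixels sit in one register as 8 adjacent pairs and each pair is multiplied by two signed 8-bit taps. The names maddubs_lane, adds16 and filter_h8_paired are hypothetical, and the sketch assumes taps that fit in int8_t and sum to 128.

```c
#include <stdint.h>

/* Scalar model of one 16-bit lane of PMADDUBSW: multiply two unsigned
 * bytes by two signed bytes and add the products with signed saturation. */
static int16_t maddubs_lane(uint8_t p0, uint8_t p1, int8_t t0, int8_t t1) {
    int32_t sum = (int32_t)p0 * t0 + (int32_t)p1 * t1;
    if (sum > INT16_MAX) sum = INT16_MAX;
    if (sum < INT16_MIN) sum = INT16_MIN;
    return (int16_t)sum;
}

/* Scalar model of a saturating signed 16-bit add (PADDSW). */
static int16_t adds16(int16_t a, int16_t b) {
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > INT16_MAX) sum = INT16_MAX;
    if (sum < INT16_MIN) sum = INT16_MIN;
    return (int16_t)sum;
}

/* Hypothetical helper: 8-tap horizontal filter producing 8 output pixels.
 * src must provide 15 readable bytes (8 outputs plus 7 trailing pixels).
 * Assumption: taps fit in int8_t and sum to 128 (7-bit precision). */
static void filter_h8_paired(const uint8_t *src, const int8_t taps[8],
                             uint8_t dst[8]) {
    int16_t acc[8] = {0};

    for (int k = 0; k < 8; k += 2) {
        /* One modeled 128-bit register: 16 bytes = 8 (pixel, next pixel)
         * pairs, instead of 8 pixels widened to 16 bits each. */
        uint8_t pixels[16];
        for (int i = 0; i < 8; i++) {
            pixels[2 * i] = src[i + k];
            pixels[2 * i + 1] = src[i + k + 1];
        }
        /* Each pair meets taps k and k+1, giving 8 partial sums at once. */
        for (int i = 0; i < 8; i++) {
            int16_t part = maddubs_lane(pixels[2 * i], pixels[2 * i + 1],
                                        taps[k], taps[k + 1]);
            acc[i] = adds16(acc[i], part);
        }
    }

    /* Round, shift by 7 bits, and clamp to the 8-bit pixel range. */
    for (int i = 0; i < 8; i++) {
        int32_t v = ((int32_t)acc[i] + 64) >> 7;
        if (v < 0) v = 0;
        if (v > 255) v = 255;
        dst[i] = (uint8_t)v;
    }
}
```

In this model, four paired multiply-add steps cover all 8 taps for 8 output pixels, and each step consumes 16 source bytes held in a single modeled register. A real SSSE3 version would replace the inner loops with vector instructions and would have to order the additions carefully to limit intermediate saturation.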
Diffstat (limited to 'README')
0 files changed, 0 insertions, 0 deletions