Age | Commit message | Author |
|
Also use the _mm_broadcastsi128_si256 intrinsic for
Apple clang versions 4.[012]
https://bugzilla.mozilla.org/show_bug.cgi?id=1085607
https://code.google.com/p/webm/issues/detail?id=1082
Change-Id: I6bc821d8163387194ef663e94bfed91fa7281d88
|
|
Change-Id: I761256a8100d83abf1b937f3739580237e3fad2a
|
|
single-threaded:
swanky (silvermont): ~1% faster overall
peppy (celeron,haswell): ~1.5% faster overall
Change-Id: Ib74f014374c63c9eaf2d38191cbd8e2edcc52073
|
|
Change-Id: If3cb9345b44162e600e6c74873e0cb4c207fc7fb
|
|
When configured with high bit depth enabled, the 8bit quantize
function stopped using optimised code. This made 8bit content
decode slowly. This commit re-enables the SSSE3 optimisations.
Change-Id: I194b505dd3f4c494e5c5e53e020f5d94534b16b5
|
|
|
|
When configured with high bit depth enabled, the 8bit quantize
function stopped using optimised code. This made 8bit content
decode slowly. This commit re-enables the SSE2 optimisation
(but not the SSSE3 optimisation).
Change-Id: Id015fe3c1c44580a4bff3f4bd985170f2806a9d9
|
|
Change-Id: Ia1a2cac0e9dc05f3207b3433a6c1589fa7f2aee3
|
|
|
|
|
|
This is more a proof of concept than anything else. The problem here
isn't so much how to code it, but rather where to place the resulting
code. All intrapred DSP code lives in vpx_dsp, so do we want the vp10
specific intra pred functions to live there, or in vp10/?
See issue 1015.
Change-Id: I675f7badcc8e18fd99a9553910ecf3ddf81f0a05
|
|
I've added a few new functions (d45e, d63e, he, ve) to cover the
filtered h/v 4x4 predictors that are vp8-specific, the "correct"
d45 with the correctly filtered bottom-right pixel (as opposed to
the unfiltered version in vp9), and the "broken" d63 with weirdly
filtered bottom-right pixels (which is correctly filtered in vp9).
There may be a minor performance impact on all systems because we
have to do an extra copy of the Above pixel array to incorporate
the topleft pixel in the same array (thus fitting the vpx_dsp API).
In addition, armv6 will have a more serious performance impact b/c
I removed the armv6/vp8-specific assembly. I'm not sure anyone
cares...
Change-Id: I7f9e5ebee11d8e21aca2cd517a69eefc181b2e86
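
The extra copy described above can be pictured roughly as follows. This is only a sketch, assuming the predictor expects the top-left pixel to sit immediately before the above row in memory (that is, at `above[-1]`); the helper name `pack_above_sketch` and its parameters are hypothetical and do not come from the commit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not taken from libvpx.
 * Packs the top-left pixel and the above row into one contiguous buffer:
 *   buf[0]      = top-left pixel
 *   buf[1..n]   = the n pixels of the above row
 * Returns buf + 1, so the caller can read the top-left pixel at index -1
 * and the above row at indices 0..n-1.
 * The caller must supply a buffer of at least n + 1 bytes. */
static const uint8_t *pack_above_sketch(uint8_t *buf, uint8_t top_left,
                                        const uint8_t *above_row, size_t n) {
  buf[0] = top_left;
  memcpy(buf + 1, above_row, n); /* this is the extra copy per block */
  return buf + 1;
}
```

The cost is one small `memcpy` per predicted block, which is consistent with the "minor performance impact" the message expects.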
|
|
Change-Id: I2000820e0c04de2c975d370a0cf7145330289bb2
|
|
When configured with high bitdepth enabled, the 8bit transform
stopped using optimised code. This made 8bit content decode slowly.
Change-Id: I67d91f9b212921d5320f949fc0a0d3f32f90c0ea
|
|
This was rewritten and moved to vpx_dsp/x86/vpx_subpixel_8t_ssse3.asm
in 195883023bb39b5ee5c6811a316ab96d9225034d
Change-Id: I117ce983dae12006e302679ba7f175573dd9e874
|
|
fixes build on windows x64; previously 'heightq' i.e., the 64-bit register
was accessed when only the 32-bit value was needed. given this is from a
stack variable the upper bits were undefined.
+ bump register/xmm counts; users of SETUP_LOCAL_VARS touch xmm13 in
64-bit builds and filter_block1d16_v* uses one extra temp variable
Change-Id: I9c768c0b2047481d1d3b11c2e16b2f8de6eb0d80
|
|
For reading, this makes the operation branchless, although it still
requires two shifts. For writing, this makes the operation as fast
as writing an unsigned value, branchlessly. This is also how other
codecs typically code signed, non-arithcoded bitstream elements.
See issue 1039.
Change-Id: I6a8182cc88a16842fb431688c38f6b52d7f24ead
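
A minimal sketch of the idea, assuming the signed element is stored as a two's-complement value in `bits + 1` bits and that right-shifting a negative `int32_t` is an arithmetic shift (true on the usual compilers, but implementation-defined in C). The function names are hypothetical, and `bits` is assumed to be between 1 and 30.

```c
#include <stdint.h>

/* Hypothetical reader: `raw` holds the (bits + 1) bits taken from the
 * bitstream in its low bits. Two shifts sign-extend it without a branch. */
static int read_signed_sketch(uint32_t raw, int bits) {
  const int shift = 32 - (bits + 1);
  /* The left shift is done on an unsigned type, so it is well defined;
   * the right shift on the signed type replicates the sign bit. */
  return (int32_t)(raw << shift) >> shift;
}

/* Hypothetical writer: masking to (bits + 1) bits is all that is needed,
 * so the cost matches writing an unsigned value of the same width. */
static uint32_t write_signed_sketch(int value, int bits) {
  return (uint32_t)value & ((1u << (bits + 1)) - 1u);
}
```

For example, with `bits = 4` the value -3 is written as the 5-bit pattern 11101 and reads back as -3.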
|
|
Change-Id: Icf06d35ca347713253d1eba341a894b51efa81a9
|
|
This is based on the original patch optimized for 32bit
platforms by Tamar/Ilya and now uses the x86inc style asm.
The assembly was also modified to support 64bit platforms.
Change-Id: Ice12f249bbbc162a7427e3d23fbf0cbe4135aff2
|
|
Change-Id: I2e387a06484a06301f3cd6600c4ba2f4335b61ee
|
|
Change-Id: I5afa3c351ba7c5e7deb3889f7471619ac60af255
|
|
* changes:
Only build ssse3 filter functions on 64 bit
Clean up unused function warnings in vp8 encoder
Clean up unused function warnings in vp8 onyx_if.c
|
|
|
|
These were lost in the great sub pixel variance move of
6a82f0d7fb9ee908c389e8d55444bbaed3d54e9c
Not having these functions caused a ~10% performance regression in
some realtime vp8 encodes.
Change-Id: I50658483d9198391806b27899f2c0d309233c4b5
|
|
prevents redeclaration warnings;
vp8 has its own define which will be resolved in a future commit
Change-Id: Ic941fef3dd4262fcdce48b73075fe6b375f11c9c
|
|
Avoid an unused function warning by only building the functions when
they will be used.
Change-Id: I53b5bdc5a180c79d63b34e4c8921d679bbc54009
|
|
|
|
Change-Id: Ic81d435ea928183197040cdf64b6afd7dbaf57e4
|
|
|
|
|
|
Change-Id: I43bcc70680503e4c18d8f021097307778cf9ea70
|
|
Change-Id: I71d5994e21813554a927d35ebcc26bf7a68984fd
|
|
Add the dspr2 files to vpx_dsp.mk and enable these functions in
vpx_dsp_rtcd_defs.pl file.
Change-Id: I79feb5af24f174f4a0788dc6f3b6df7f4e1fa467
|
|
* changes:
VPX: removed filter == 128 checks from mips convolve code
VPX: removed step checks from mips convolve code
|
|
|
|
Change-Id: Ie1fe6603232adc22dbe4d51bd1008c856a6d40ca
|
|
The check is handled by the predictor table.
Change-Id: I2fe52bfbbfccb2edd13ba250986e3a4b4b589459
|
|
The check is handled by the predictor table.
Change-Id: I5e5084ebb46be8087c8c9d80b5f76e919a1cd05b
|
|
The check is handled by the predictor table.
Change-Id: I42479f843e77a2d40cdcdfc9e2e6c48a05a36561
|
|
|
|
|
|
This commit forks the VP9 and VP10 codebase and makes libvpx
support VP8, VP9, and VP10.
Change-Id: I81782e0b809acb3c9844bee8c8ec8f4d5e8fa356
|
|
many _sse2.asm have sse implementations as well
Change-Id: Idfa1f5cab593e4913aaad37f7223e8430188c44a
|
|
|
|
|
|
|
|
Use system_state.h in vpx_dsp and remove unneeded includes of
vp9_systemdependent.h.
Change-Id: I92557ec6dd5aa790160b4f31fe7967db0d7ec3c4
|
|
* changes:
Only use .text sections for aout
Use newer x86inc.asm
Use .text instead of .rodata on macho
Copy PIC handling code from x86_abi_support
Set 'private_extern' visibility for macho targets
Avoid 'amdnop' when building with nasm
Catch all elf formats
Expand PIC default to macho64 and respect CONFIG_PIC from libvpx
Use libvpx defines to set name mangling rules
Customize x86inc.asm for libvpx
|
|
from FUN_CONV_1D and FUN_CONV_2D macros. The functions
will not be called with these inputs.
Change-Id: I67ec75e4edafc0acee70190521a80ea85dfa521b
|
|
Change-Id: Id36f180032c8a92c686da6f716a7468332b23b94
|