Age | Commit message | Author |
|
Move the tran_low_t helper functions to a new file. Additional
load/store functions will be added here.
Change-Id: I52bf652c344c585ea2f3e1230886be93f5caefc3
|
|
files.
During AOSP builds with binutils-2.27, we're seeing linker error
messages of this form:
libvpx.a(subpixel_mmx.o): relocation R_386_GOTOFF against preemptible
symbol vp8_bilinear_filters_x86_8 cannot be used when making a shared
object
subpixel_mmx.o is assembled from "vp8/common/x86/subpixel_mmx.asm".
Other messages refer to symbol references from deblock_sse2.o and
subpixel_sse2.o, also assembled from asm files.
This change marks such symbols as having "protected" visibility. This
satisfies the linker as the symbols are not preemptible from outside
the shared library now, which I think is the original intent anyway.
Change-Id: I2817f7a5f43041533d65ebf41aefd63f8581a452
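The commit above applies the visibility marking in assembly sources. As a rough C-level analogue only, the sketch below shows what "protected" visibility looks like with the GCC/Clang `visibility` attribute on ELF targets. The names `SKETCH_PROTECTED`, `sketch_bilinear_filters` and `sketch_first_tap`, and the table values, are hypothetical and are not taken from libvpx.

```c
#include <assert.h>

/* Assumption: GCC or Clang targeting ELF. Elsewhere the macro expands to
 * nothing, so the sketch still compiles but no longer shows the effect. */
#if defined(__GNUC__) && defined(__ELF__)
#define SKETCH_PROTECTED __attribute__((visibility("protected")))
#else
#define SKETCH_PROTECTED
#endif

/* Hypothetical stand-in for a table such as vp8_bilinear_filters_x86_8.
 * "protected" keeps the symbol exported, but references made from inside
 * the same shared object always bind to this definition, so the symbol
 * cannot be preempted by another module at load time. */
SKETCH_PROTECTED const short sketch_bilinear_filters[2][8] = {
  { 128, 128, 128, 128, 128, 128, 128, 128 }, /* illustrative values */
  { 112, 112, 112, 112, 16, 16, 16, 16 }
};

/* An intra-library reference to the protected table. */
int sketch_first_tap(int index) {
  assert(index >= 0 && index < 2);
  return sketch_bilinear_filters[index][0];
}
```

With default visibility, a reference like the one in `sketch_first_tap()` may have to go through the GOT in a shared object, because the symbol could be overridden from outside. That is the situation the R_386_GOTOFF error message complains about.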
|
|
* changes:
ppc: Add get_mb_ss_vsx
ppc: Add get4x4sse_cs_vsx
ppc: Add comp_avg_pred_vsx
|
|
Change-Id: I1b54a7a5bb642e4b836d786ea1ae506eed025e3f
|
|
Change-Id: I3028bdadf653665d18e781d28e9625f62804b3d8
|
|
Change-Id: I59788cd98231e707239c2ad95ae54f67cfe24e10
|
|
Change-Id: I84e3705fa52f75cb91b2bab4abf5cc77585ee3e2
|
|
Change-Id: I3c4f9d595275669580413a71b3c3c810e7ddcacd
|
|
|
|
Change-Id: I60619d28fffd9809f93b1af510a50e1aa02519a9
|
|
Commit 0178d97 introduced an append situation that could be
confusing. Clean it up a little and add some comments.
Change-Id: I69ad336f805aca7ce9d45515b8cd237423fadbb2
|
|
* changes:
subpel variance neon: add mixed sizes
sub pixel variance neon: use generic variance
|
|
Change-Id: I73b8104a9e7a70ffe827c1b7ff43618f24f5d7bd
|
|
It's a bit faster to call idct4_sse2() in vpx_idct4x4_16_add_sse2().
Change-Id: I1513be7a895cd2fc190f4a8297c240b17de0f876
|
|
Read in a Q register. Works on blocks of 16 and larger.
Improvement of about 20% for 64x64. The smaller blocks are faster too, but
don't see quite the same level of improvement. 16x32 is only about 5%.
BUG=webm:1422
Change-Id: Ie11a877c7b839e66690a48117a46657b2ac82d4b
|
|
* changes:
neon variance: process two rows of 8 at a time
neon variance: add small missing sizes
|
|
* changes:
Split dsp/x86/inv_txfm_sse2.c
Update highbd idct functions arguments to use uint16_t dst
Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct
|
|
Add support for everything except block sizes of 4.
Performance is better but numbers will improve again when the variance
optimizations land.
BUG=webm:1423
Change-Id: I92eb4312b20be423fa2fe6fdb18167a604ff4d80
|
|
When a neon version is available it will be called. This allows
decoupling the variance implementations and has no real downside. For
most configurations, the call will be #define'd to the neon
implementation.
Change-Id: Ibb2afe4e156c5610e89488504d366b3e6d1ba712
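The decoupling described above can be sketched in plain C: a generic sub-pixel variance routine filters the source block and then calls whatever the variance name resolves to at build time. This is a minimal illustration under stated assumptions, not the libvpx code. Every `sketch_` name is hypothetical, the block size is fixed at 8x8, and the bilinear taps `(128 - 16 * offset, 16 * offset)` with a 7-bit rounding shift are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_W 8
#define SKETCH_H 8

/* Plain C variance: SSE minus the squared sum divided by the pixel count. */
static uint32_t sketch_variance8x8_c(const uint8_t *a, int a_stride,
                                     const uint8_t *b, int b_stride,
                                     uint32_t *sse) {
  int64_t sum = 0;
  uint32_t sq = 0;
  for (int r = 0; r < SKETCH_H; ++r) {
    for (int c = 0; c < SKETCH_W; ++c) {
      const int d = a[r * a_stride + c] - b[r * b_stride + c];
      sum += d;
      sq += (uint32_t)(d * d);
    }
  }
  *sse = sq;
  return sq - (uint32_t)((sum * sum) / (SKETCH_W * SKETCH_H));
}

/* Build-time dispatch: a build with an optimized version would map this
 * name to that version instead (hypothetically sketch_variance8x8_neon). */
#define sketch_variance8x8 sketch_variance8x8_c

/* 2-tap bilinear blend with rounding; offset is in eighths of a pixel. */
static uint8_t sketch_blend(int p0, int p1, int offset) {
  return (uint8_t)((p0 * (128 - 16 * offset) + p1 * 16 * offset + 64) >> 7);
}

/* Generic sub-pixel variance: filter, then reuse the variance call.
 * Reads a 9x9 region of src (one extra row and one extra column). */
uint32_t sketch_sub_pixel_variance8x8(const uint8_t *src, int src_stride,
                                      int xoffset, int yoffset,
                                      const uint8_t *ref, int ref_stride,
                                      uint32_t *sse) {
  uint8_t horiz[(SKETCH_H + 1) * SKETCH_W];
  uint8_t out[SKETCH_H * SKETCH_W];
  assert(xoffset >= 0 && xoffset < 8 && yoffset >= 0 && yoffset < 8);
  /* Horizontal pass over H + 1 rows so the vertical pass has a row below. */
  for (int r = 0; r < SKETCH_H + 1; ++r) {
    for (int c = 0; c < SKETCH_W; ++c) {
      const uint8_t *p = src + r * src_stride + c;
      horiz[r * SKETCH_W + c] = sketch_blend(p[0], p[1], xoffset);
    }
  }
  /* Vertical pass. */
  for (int r = 0; r < SKETCH_H; ++r) {
    for (int c = 0; c < SKETCH_W; ++c) {
      out[r * SKETCH_W + c] = sketch_blend(horiz[r * SKETCH_W + c],
                                           horiz[(r + 1) * SKETCH_W + c],
                                           yoffset);
    }
  }
  return sketch_variance8x8(out, SKETCH_W, ref, ref_stride, sse);
}
```

Because the filter step only ever calls `sketch_variance8x8`, swapping in a faster variance routine needs no change to the sub-pixel code.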
|
|
Simplify the HBD/non-HBD distinction in the test.
Document why transpose_neon.h is not used.
Change-Id: I17659414206ddbb8c2f1ef0d9f4a17f1745d5a52
|
|
When the width is equal to 8, process two rows at a time. This doubles
the speed of 8x4 and improves 8x8 by about 20%.
8x16 was using this technique already, but still improved a little bit
with the rewrite.
Also use this for vpx_get8x8var_neon.
BUG=webm:1422
Change-Id: Id602909afcec683665536d11298b7387ac0a1207
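The two-rows-at-a-time idea can be illustrated in scalar C. When the block is 8 pixels wide, two rows together supply 16 byte differences per step, which is what one 128-bit register of bytes would hold. The sketch below is not the NEON code from this commit. The function name `sketch_variance_w8` is hypothetical, and the inner loop of 16 stands in for a single vector operation.

```c
#include <assert.h>
#include <stdint.h>

/* Variance of an 8-wide block, consuming two rows per outer iteration. */
uint32_t sketch_variance_w8(const uint8_t *a, int a_stride,
                            const uint8_t *b, int b_stride,
                            int h, uint32_t *sse) {
  int64_t sum = 0;
  uint32_t sq = 0;
  assert(h > 0 && h % 2 == 0); /* rows are consumed in pairs */
  for (int r = 0; r < h; r += 2) {
    /* 16 lanes: lanes 0..7 come from row r, lanes 8..15 from row r + 1. */
    for (int lane = 0; lane < 16; ++lane) {
      const int row = r + (lane >> 3);
      const int col = lane & 7;
      const int d = a[row * a_stride + col] - b[row * b_stride + col];
      sum += d;
      sq += (uint32_t)(d * d);
    }
  }
  *sse = sq;
  /* variance = SSE - sum^2 / (width * height) */
  return sq - (uint32_t)((sum * sum) / (8 * h));
}
```

Halving the number of outer iterations is the plausible source of the speedup for 8x4 and 8x8 blocks: each step does a full register's worth of work instead of half.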
|
|
Some of the mixed sizes were missing. They can be implemented trivially
using the existing helper function.
When comparing the previous 16x8 and 8x16 implementations, the helper
function is about 10% faster than the 16x8 version. The 8x16 is very
close, but the existing version appears to be faster.
BUG=webm:1422
Change-Id: Ib0e856083c1893e1bd399373c5fbcd6271a7f004
|
|
Spin out highbd idct functions.
BUG=webm:1412
Change-Id: I0cfe4117c00039b6778c59c022eee79ad089a2af
|
|
BUG=webm:1388
Change-Id: I3581d80d0389b99166e70987d38aba2db6c469d5
|
|
BUG=webm:1388
Change-Id: Ida62c941f2b836d6c9e27b427a7d5008ab6dc112
|
|
User-level speed improvement on an i7-6700 (x86_64 Linux, cpu-used=1;
bitrates: 1080p at 8 Mbps, 4K at 16 Mbps):
- Decoder:
1080p: ~4%
4K: ~5%
- Encoder:
1080p: ~1%
4K: ~3%
Change-Id: I51b48f9c5de0d62487d5a11aa579c97bd03dd640
|
|
* changes:
Clean specializes of idct functions
Clean add_protos of highbd idct functions
Clean add_protos of idct functions
|
|
Change-Id: Ia5293d948003a7fff5a7cbad6e83d8a72717c857
|
|
Only the generic one again; speedups for 8x8 and larger blocks will
come later.
Change-Id: I90d481d3a602d1e277ead8f3934eca126b86b72d
|
|
Only the generic one again; speedups for 8x8 and larger blocks
will come later.
Change-Id: Ia509d6225984b4930ec03928c9bcbf51486da99f
|
|
The 8x8 and larger blocks cases can be sped up further.
Change-Id: I54549b03ac6c7a4e3f485738b100c3cac7ac2e15
|
|
The 8x8 and larger blocks cases can be sped up further.
Change-Id: I89b635d6b01c59f523f2d54b1284ed32916c5046
|
|
Change-Id: I8bb660de47b5f97263ec381dc428db96e9c9a4b2
|
|
Change-Id: Ica51d780b92b316ce9112740c56cdf7670816371
|
|
Change-Id: I6037525d92ec172810edab720389eb1865ed3b1a
|
|
Change-Id: Ib203c444c708f42072e38301ee3db97b5b53d014
|
|
Change-Id: Ie26d6dbe090e711d84bac01ba7da270db983f405
|
|
BUG=webm:1388
Change-Id: I6912de2639895d817ce850da8ea9f6c8fe21da42
|
|
Slightly faster with the current compiler.
Change-Id: Iae225fac08395eb430c97a2abec69c60f5cf5c47
|
|
10x faster.
Change-Id: I7cedbf4df2ce7df5b6f1108b11815d088fdb9ba8
|
|
Slightly faster.
Change-Id: I0ca43f309b3d9b50435d69bd5be64b53a99bd191
|
|
2x faster.
Change-Id: I0583dec353299c6797401b646099f18db4e0420d
|
|
Slightly faster. The other dc predictors cannot be made faster, since
the computation speedup is overwhelmed by the time spent reading
dst to write just the 8x8 part.
Change-Id: I94a0b50500adf8b7b6bb919dbf5c7adf5b9fba66
|
|
11x faster.
Change-Id: I5b8f39213ee1f5260724fc254e3fb5c462435798
|
|
About 10x faster.
Change-Id: If7d0645f75c5d7deb9751edd0bf47e2f9068e9e7
|
|
About 18x faster.
Change-Id: Id043bf76c011e03e992085bb5e20f330d3e98cd4
|
|
About 12x faster.
Change-Id: I22c150256aefb4941861ab1f6c17d554fb694bed
|
|
About 16x faster.
Change-Id: Ie5469fb32d5fd11bb6cb06318cea475d8a5b00b9
|
|
10x and 5x faster.
Change-Id: I7913c58c768334d818f541a5e219f1035791eeaf
|
|
6x faster.
Change-Id: I717995b4056e5579c68191d11b495372971fe1ae
|