Age | Commit message (Collapse) | Author |
|
Change-Id: I6cbd4de96f9dcc783cef170bfd7652f6cbee36a2
|
|
Change-Id: I6802731a4d15feef5ce62993dc505ded55c40f7e
|
|
Updates idct/dequant mmx assembly to work with vpnext instead of vp8.
Also adds x86inc.asm
Change-Id: I6e147d5e89177ae449271e97e50d082eb11b078e
|
|
Adds 6 directional intra predictiom modes for 16x16 and 8x8 blocks.
Change-Id: I25eccc0836f28d8d74922e4e9231568a648b47d1
|
|
This is the initial patch for supporting 1/8th pel
motion. Currently if we configure with enable-high-precision-mv,
all motion vectors would default to 1/8 pel. Encode and
decode syncs fine with the current code. In the next phase
the code will be refactored so that we can choose the 1/8
pel mode adaptively at a frame/segment/mb level.
Derf results:
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hpmv.html
(about 0.83% better than 8-tap interpoaltion)
Patch 3: Rebased. Also adding 1/16th pel interpolation for U and V
Patch 4: HD results.
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hd_hpmv.html
Seems impressive (unless I am doing something wrong).
Patch 5: Added mmx/sse for bilateral filtering, as well as enforced
use of c-versions of subpel filters with 8-taps and 1/16th pel;
Also redesigned the 8-tap filters to reduce the cut-off in order to
introduce a denoising effect. There is a new configure option
sixteenth-subpel-uv which will use 1/16 th pel interpolation for
uv, if the motion vectors have 1/8 pel accuracy.
With the fixes the results are promising on the derf set. The enhanced
interpolation option with 8-taps alone gives 3% improvement over thei
derf set:
http://www.corp.google.com/~debargha/vp8_results/enhinterpn.html
Results on high precision mv and on the hd set are to follow.
Patch 6: Adding a missing condition for CONFIG_SIXTEENTH_SUBPEL_UV in
vp8/common/x86/x86_systemdependent.c
Patch 7: Cleaning up various debug messages.
Patch 8: Merge conflict
Change-Id: I5b1d844457aefd7414a9e4e0e06c6ed38fd8cc04
|
|
Conflicts:
vp8/common/defaultcoefcounts.h
vp8/common/entropy.c
vp8/encoder/bitstream.c
Change-Id: Idd4990c80d5b5494ac036254694015fab449bc08
|
|
Prepend idct function names with vp8_
so that under profiling they show up
associated with libvpx.
Change-Id: I4fe357b50236cb7730a4cc00164c0a3487a1d8b4
|
|
|
|
The data that the simple horizontal loopfilter reads is aligned, treat
it accordingly.
For the vertical, we only use the bottom 4 bytes, so don't read in 16
(and incur the penalty for unaligned access).
This shows a small improvement on older processors which have a
significant penalty for unaligned reads.
postproc_mmx.c is unused
Change-Id: I87b29bbc0c3b19ee1ca1de3c4f47332a53087b3d
|
|
Prepend . to local labels in assembly code. This
allows non unique labels within a file. Also
makes profiling information more informative
by keeping the function name with the loop name.
Change-Id: I7a983cb3a5ba2413d5dafd0a37936b268fb9e37f
|
|
|
|
Change-Id: I1ed739522db7c00c189851c7095c1b64ef6412ce
|
|
|
|
This should fix binaries using PIC on x86-32. Also should
fix issue 343.
Change-Id: I591de3ad68c8a8bb16054bd8f987a75b4e2bad02
|
|
|
|
|
|
|
|
|
|
global values were being referenced, but the GOT was not being set up.
as the GOT is only required for PIC, this issue wasn't caught in the
default configuration.
Change-Id: I8006e53776139362a76f2c80cf9d0f8458602b2f
http://code.google.com/p/webm/issues/detail?id=328
|
|
Change-Id: I9467d7a50eac32d8e8f3a2f26db818e47c93c94b
|
|
|
|
Change-Id: I97124670926433bf1593c91660d8b8f8482ea9ce
|
|
Change-Id: I658a1df7d825f820573cb2d11ad402f9d2791035
|
|
|
|
removed inline from recon_wrapper_sse2.c to build
for visual stuido
Change-Id: I74a3482950448e2cdb30e9cd7087145b440d8a22
|
|
|
|
Thanks Jason for pointing that out on #vp8. ;-).
Change-Id: I5330a753e752a8704b78a409597472628e0b26a5
|
|
decoding
before
10.425
10.432
10.423
=10.426
after:
10.405
10.416
10.398
=10.406, 0.2% faster
encoding
before
14.252
14.331
14.250
14.223
14.241
14.220
14.221
=14.248
after
14.095
14.090
14.085
14.095
14.064
14.081
14.089
=14.086, 1.1% faster
Change-Id: I483d3d8f0deda8ad434cea76e16028380722aee2
|
|
Conflicts:
vp8/common/alloccommon.c
vp8/encoder/rdopt.c
Change-Id: Ic34b33577423031e277235ffa6bcaff7b252e5cb
|
|
the decision to run the regular or simple loopfilter is made outside the
function and managed with pointers
stop tracking the option in two places. use filter_type exclusively
Change-Id: I39d7b5d1352885efc632c0a94aaf56b72cc2fe15
|
|
Conflicts:
vp8/decoder/onyxd_int.h
Change-Id: Icf445b589c2bc61d93d8c977379bbd84387d0488
|
|
the win64 abi requires saving and restoring xmm6:xmm15. currently
SAVE_XMM and RESTORE XMM only allow for saving xmm6:xmm7. allow
specifying the highest register used and if the stack is unaligned.
Change-Id: Ica5699622ffe3346d3a486f48eef0206c51cf867
|
|
Went through the code and fixed it. Verified on Windows.
Where possible, remove dependencies on xmm[67]
Current code relies on pushing rbp to the stack to get 16 byte
alignment. This broke when rbp wasn't pushed
(vp8/encoder/x86/sad_sse3.asm). Work around this by using unaligned
memory accesses. Revisit this and the offsets in
vp8/encoder/x86/sad_sse3.asm in another change to SAVE_XMM.
Change-Id: I5f940994d3ebfd977c3d68446cef20fd78b07877
|
|
|
|
vp8_filter_block1d16_h4_ssse3 was never called
because UNSHADOW_ARGS moves the stack by 'mov rsp, rbp', the issue was
masked. however, if/when win64 used those registers for persistant data,
issues could/will arise.
Change-Id: I56d6effca0aeba1f86082689771cb10145d39651
|
|
|
|
Change-Id: I36ca3f2f4620358033da34daf764f0b388dacd08
|
|
Conflicts:
vp8/decoder/decodemv.c
vp8/decoder/onyxd_if.c
vp8/encoder/ratectrl.c
vp8/encoder/rdopt.c
Change-Id: Ia1c1c5e589f4200822d12378c7749ba62bd17ae2
|
|
A large number of functions were defined with external linkage, even
though they were only used from within one file. This patch changes
their linkage to static and removes the vp8_ prefix from their names,
which should make it more obvious to the reader that the function is
contained within the current translation unit. Functions that were
not referenced were removed.
These symbols were identified by:
$ nm -A libvpx.a | sort -k3 | uniq -c -f2 | grep ' [A-Z] ' \
| sort | grep '^ *1 '
Change-Id: I59609f58ab65312012c047036ae1e0634f795779
|
|
|
|
Allow compiling without adding vp8/{common,encoder,decoder} to the
include paths.
Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c
|
|
Conflicts:
vp8/encoder/encodeframe.c
vp8/encoder/ethreading.c
vp8/encoder/onyx_int.h
Change-Id: I1c562d2fe6e42c0d1d86f68c77c0e899066e02bd
|
|
Change-Id: I0d41415e3961c2c9492d342290c1999f9d02e6d8
|
|
|
|
This eliminates a large set of warnings exposed by the Mozilla build
system (Use of C++ comments in ISO C90 source, commas at the end of
enum lists, a couple incomplete initializers, and signed/unsigned
comparisons).
It also eliminates many (but not all) of the warnings expose by newer
GCC versions and _FORTIFY_SOURCE (e.g., calling fread and fwrite
without checking the return values).
There are a few spurious warnings left on my system:
../vp8/encoder/encodemb.c:274:9: warning: 'sz' may be used
uninitialized in this function
gcc seems to be unable to figure out that the value shortcut doesn't
change between the two if blocks that test it here.
../vp8/encoder/onyx_if.c:5314:5: warning: comparison of unsigned
expression >= 0 is always true
../vp8/encoder/onyx_if.c:5319:5: warning: comparison of unsigned
expression >= 0 is always true
This is true, so far as it goes, but it's comparing against an enum, and the C
standard does not mandate that enums be unsigned, so the checks can't be
removed.
Change-Id: Iaf689ae3e3d0ddc5ade00faa474debe73b8d3395
|
|
Filed for nasm as:
https://sourceforge.net/tracker/?func=detail&atid=106208&aid=3081103&group_id=6208
nasm just does not accept any size parameter for movhps:
1.asm:2: error: mismatch in operand sizes
Some parts of libvpx already use MMWORD for movhps and MMWORD is
defined-out so it is compatible both with yasm and nasm.
Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu.
Change-Id: I4008a317ca87ec07c9ada958fcdc10a0cb589bbc
|
|
nasm does not support `label wrt rip', it requires `rel label'. It is
still fully compatible with yasm.
Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.
Change-Id: I488773a4e930a56e43b0cc72d867ee5291215f50
|
|
nasm requires the instruction length (movd/movq) to match to its
parameters. I find it more clear to really use 64bit instructions when
we use 64bit registers in the assembly.
Provide nasm compatibility. No binary change by this patch with yasm on
{x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.
Change-Id: Id9b1a5cdfb1bc05697e523c317a296df43d42a91
|
|
- Scheduling for Atom processors
- Combining of macros to allow for better interleaving
- Change from multiplies to adds for main filter
- Use of movhps/movlps to fill xmm registers without
shifting and orring
Change-Id: I0b3500a5f58abf7085253ec92d64c8a96723040b
|
|
Movdqu is more expensive (throughput, uops) than movq. Minimal
impact for newer big cores, but ~2.25% gain on Atom.
Change-Id: I62c80bb1cc01d8a91c350c4c7719462809a4ef7f
|