About 20% overall encoder speedup (vs. about 30% for sse4 version).
Change-Id: Ibf608a6a1bc94b14ec47e8046d3206b275b5a8bd
|
|
Unroll horizontal pass, no more intermediate buffer, faster special transpose.
Change-Id: I05df75be4e5f01420066cdf3c61a2edf35bedb64
|
|
About 3.5x faster, 30% overall encoder speedup. Rest of optimizations
will come soon (see TODO section in filter_sse4.c).
Change-Id: If18108048bfd5345fc942e8574e4c7f58e0e86e0
|
|
Further cases of inconsistent naming convention.
Change-Id: Id3411ecec6f01a4c889268a00f0c9fd5a92ea143
|
|
Merges this experiment in to make it easier to run tests on
filter precision, vectorized implementation etc.
Also removes an experimental filter.
Change-Id: I1e8706bb6d4fc469815123939e9c6e0b5ae945cd
|
|
Merged the enhanced_interp experiment.
Found and fixed a bug in the include files framework, whereby
certain encoder files were still using the old INTERP_EXTEND
value of 3 instead of 4. The mv range thresholds in mcomp.c
need a small adjustment to prevent crashes.
The results are more or less unchanged.
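
One way to keep a constant like INTERP_EXTEND from drifting between files is
to derive it from the filter length in a single shared header. The sketch
below is only an illustration. The helper interp_extend_for_taps, the
FILTER_TAPS macro, and the assumption that an N-tap filter reads N/2 pixels
past the block edge on its longer side are not taken from this change; only
the name INTERP_EXTEND and the values 3 and 4 are.

```c
/* Hypothetical helper: number of pixels an N-tap interpolation filter
 * reads beyond the block edge on its longer side, assuming the taps
 * cover offsets -(N/2 - 1) .. +(N/2). */
static int interp_extend_for_taps(int taps) {
  return taps / 2;
}

/* Assumed single point of definition, so every file that includes the
 * shared header sees the same value. */
#define FILTER_TAPS 8
#define INTERP_EXTEND (FILTER_TAPS / 2)
```

Under that assumption a 6-tap filter gives 3 and an 8-tap filter gives 4,
which matches the old and new values mentioned above.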
Change-Id: Iac5008390f1efc97ce1102fbb5f8989c847fb579
|
|
The following five experiments are merged:
newentropy
newupdate
adaptive_entropy (also includes a couple of parameter changes in
common/entropymode.c and encoder/modecosts.c that improve results a
little and were not merged from the internal branch)
newintramodes
expanded_coef_context
Change-Id: I8a142a831786ee9dc936f22be1d42a8bced7d270
|
|
Approximate the Google style guide[1] so that there's a written
document to follow and tools to check compliance[2].
[1]: http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml
[2]: http://google-styleguide.googlecode.com/svn/trunk/cpplint/cpplint.py
Change-Id: Idf40e3d8dddcc72150f6af127b13e5dab838685f
|
|
Change-Id: I6cbd4de96f9dcc783cef170bfd7652f6cbee36a2
|
|
Change-Id: I6802731a4d15feef5ce62993dc505ded55c40f7e
|
|
Updates idct/dequant mmx assembly to work with vpnext instead of vp8.
Also adds x86inc.asm
Change-Id: I6e147d5e89177ae449271e97e50d082eb11b078e
|
|
Adds 6 directional intra prediction modes for 16x16 and 8x8 blocks.
Change-Id: I25eccc0836f28d8d74922e4e9231568a648b47d1
|
|
This is the initial patch for supporting 1/8th pel
motion. Currently if we configure with enable-high-precision-mv,
all motion vectors would default to 1/8 pel. Encode and
decode syncs fine with the current code. In the next phase
the code will be refactored so that we can choose the 1/8
pel mode adaptively at a frame/segment/mb level.
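
As a rough illustration of what 1/8 pel precision means for a single motion
vector component, the sketch below splits a value stored in 1/8-pel units
into a whole-pixel offset and a fractional phase. The type mv_parts and the
function split_mv_eighth_pel are hypothetical names, and the storage format
is an assumption; the actual code may represent vectors differently.

```c
/* Hypothetical type: whole-pixel offset plus sub-pixel phase (0..7). */
typedef struct {
  int full;
  int frac;
} mv_parts;

/* Split one motion-vector component given in 1/8-pel units (assumed
 * representation). The fraction is kept non-negative so that it can
 * index a table of 8 interpolation phases. */
static mv_parts split_mv_eighth_pel(int mv) {
  mv_parts p;
  p.frac = ((mv % 8) + 8) % 8; /* portable floor-style remainder */
  p.full = (mv - p.frac) / 8;  /* exact division, rounds toward -inf */
  return p;
}
```

For example, a component of 19 splits into 2 full pixels plus phase 3, and
-3 splits into -1 full pixel plus phase 5.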
Derf results:
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hpmv.html
(about 0.83% better than 8-tap interpolation)
Patch 3: Rebased. Also adding 1/16th pel interpolation for U and V
Patch 4: HD results.
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hd_hpmv.html
Seems impressive (unless I am doing something wrong).
Patch 5: Added mmx/sse for bilateral filtering, as well as enforced
use of C versions of subpel filters with 8-taps and 1/16th pel;
Also redesigned the 8-tap filters to reduce the cut-off in order to
introduce a denoising effect. There is a new configure option
sixteenth-subpel-uv which will use 1/16th pel interpolation for
uv, if the motion vectors have 1/8 pel accuracy.
With the fixes the results are promising on the derf set. The enhanced
interpolation option with 8-taps alone gives 3% improvement over the
derf set:
http://www.corp.google.com/~debargha/vp8_results/enhinterpn.html
Results on high precision mv and on the hd set are to follow.
Patch 6: Adding a missing condition for CONFIG_SIXTEENTH_SUBPEL_UV in
vp8/common/x86/x86_systemdependent.c
Patch 7: Cleaning up various debug messages.
Patch 8: Merge conflict
Change-Id: I5b1d844457aefd7414a9e4e0e06c6ed38fd8cc04
|
|
Conflicts:
vp8/common/defaultcoefcounts.h
vp8/common/entropy.c
vp8/encoder/bitstream.c
Change-Id: Idd4990c80d5b5494ac036254694015fab449bc08
|
|
Prepend idct function names with vp8_
so that under profiling they show up
associated with libvpx.
Change-Id: I4fe357b50236cb7730a4cc00164c0a3487a1d8b4
|
|
The data that the simple horizontal loopfilter reads is aligned, treat
it accordingly.
For the vertical, we only use the bottom 4 bytes, so don't read in 16
(and incur the penalty for unaligned access).
This shows a small improvement on older processors which have a
significant penalty for unaligned reads.
postproc_mmx.c is unused
Change-Id: I87b29bbc0c3b19ee1ca1de3c4f47332a53087b3d
|
|
Prepend . to local labels in assembly code. This
allows non-unique labels within a file. It also
makes profiling information more informative
by keeping the function name with the loop name.
Change-Id: I7a983cb3a5ba2413d5dafd0a37936b268fb9e37f
|
|
Change-Id: I1ed739522db7c00c189851c7095c1b64ef6412ce
|
|
This should fix binaries using PIC on x86-32. Also should
fix issue 343.
Change-Id: I591de3ad68c8a8bb16054bd8f987a75b4e2bad02
|
|
global values were being referenced, but the GOT was not being set up.
as the GOT is only required for PIC, this issue wasn't caught in the
default configuration.
Change-Id: I8006e53776139362a76f2c80cf9d0f8458602b2f
http://code.google.com/p/webm/issues/detail?id=328
|
|
Change-Id: I9467d7a50eac32d8e8f3a2f26db818e47c93c94b
|
|
Change-Id: I97124670926433bf1593c91660d8b8f8482ea9ce
|
|
Change-Id: I658a1df7d825f820573cb2d11ad402f9d2791035
|
|
removed inline from recon_wrapper_sse2.c to build
for Visual Studio
Change-Id: I74a3482950448e2cdb30e9cd7087145b440d8a22
|
|
Thanks Jason for pointing that out on #vp8. ;-).
Change-Id: I5330a753e752a8704b78a409597472628e0b26a5
|
|
decoding
before: 10.425 10.432 10.423 = 10.426
after:  10.405 10.416 10.398 = 10.406, 0.2% faster
encoding
before: 14.252 14.331 14.250 14.223 14.241 14.220 14.221 = 14.248
after:  14.095 14.090 14.085 14.095 14.064 14.081 14.089 = 14.086, 1.1% faster
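
The averages and percentages above can be rechecked with a few lines of C.
This is a minimal sketch: the helper names mean and percent_faster are made
up for illustration, and it assumes the figures are run times where lower is
better.

```c
#include <stddef.h>

/* Arithmetic mean of n timing samples. */
static double mean(const double *v, size_t n) {
  double sum = 0.0;
  size_t i;
  for (i = 0; i < n; ++i) sum += v[i];
  return n ? sum / (double)n : 0.0;
}

/* Relative reduction in run time, in percent of the "before" time. */
static double percent_faster(double before, double after) {
  return 100.0 * (before - after) / before;
}
```

Feeding in the seven encoding samples from each run gives means of roughly
14.248 and 14.086, and a reduction of a little over 1.1%.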
Change-Id: I483d3d8f0deda8ad434cea76e16028380722aee2
|
|
Conflicts:
vp8/common/alloccommon.c
vp8/encoder/rdopt.c
Change-Id: Ic34b33577423031e277235ffa6bcaff7b252e5cb
|
|
the decision to run the regular or simple loopfilter is made outside the
function and managed with pointers
stop tracking the option in two places. use filter_type exclusively
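
The idea of choosing the filter once, outside the filtering function, and
dispatching through a pointer can be sketched as follows. Everything here
except the name filter_type is hypothetical (the enum values, the function
names, and the placeholder bodies), so this shows the pattern and not the
actual libvpx code.

```c
/* Hypothetical enum; the single source of truth for the filter choice. */
typedef enum {
  NORMAL_LOOPFILTER = 0,
  SIMPLE_LOOPFILTER = 1
} loopfilter_type;

typedef int (*loopfilter_fn)(int level);

/* Placeholder bodies standing in for the real filters. */
static int filter_normal(int level) { return level * 2; }
static int filter_simple(int level) { return level; }

/* Decide once, based only on filter_type, and hand back a pointer.
 * Callers then invoke the pointer without re-checking any option. */
static loopfilter_fn select_loopfilter(loopfilter_type filter_type) {
  return filter_type == SIMPLE_LOOPFILTER ? filter_simple : filter_normal;
}
```

With a single selector there is no second flag that can fall out of sync
with filter_type.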
Change-Id: I39d7b5d1352885efc632c0a94aaf56b72cc2fe15
|
|
Conflicts:
vp8/decoder/onyxd_int.h
Change-Id: Icf445b589c2bc61d93d8c977379bbd84387d0488
|
|
the win64 abi requires saving and restoring xmm6:xmm15. currently
SAVE_XMM and RESTORE_XMM only allow for saving xmm6:xmm7. allow
specifying the highest register used and if the stack is unaligned.
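
To give a feel for what "specifying the highest register used" implies for
stack usage, here is a small sketch that computes the spill area for
xmm6 through a given highest register at 16 bytes each. The function
xmm_save_bytes is hypothetical and is not how the macros themselves are
written; any extra padding needed when the stack is unaligned is left out.

```c
/* Hypothetical helper: bytes needed to save xmm6..xmm<highest_reg>.
 * On win64, xmm6 through xmm15 are callee-saved, 16 bytes each. */
static int xmm_save_bytes(int highest_reg) {
  if (highest_reg < 6) return 0;           /* nothing to preserve */
  if (highest_reg > 15) highest_reg = 15;  /* xmm15 is the last one */
  return (highest_reg - 5) * 16;
}
```

Saving only xmm6:xmm7 takes 32 bytes, while the full xmm6:xmm15 range takes
160 bytes.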
Change-Id: Ica5699622ffe3346d3a486f48eef0206c51cf867
|
|
Went through the code and fixed it. Verified on Windows.
Where possible, remove dependencies on xmm[67]
Current code relies on pushing rbp to the stack to get 16 byte
alignment. This broke when rbp wasn't pushed
(vp8/encoder/x86/sad_sse3.asm). Work around this by using unaligned
memory accesses. Revisit this and the offsets in
vp8/encoder/x86/sad_sse3.asm in another change to SAVE_XMM.
Change-Id: I5f940994d3ebfd977c3d68446cef20fd78b07877
|
|
vp8_filter_block1d16_h4_ssse3 was never called.
Because UNSHADOW_ARGS moves the stack by 'mov rsp, rbp', the issue was
masked. However, if/when win64 used those registers for persistent data,
issues could/will arise.
Change-Id: I56d6effca0aeba1f86082689771cb10145d39651
|
|
Change-Id: I36ca3f2f4620358033da34daf764f0b388dacd08
|
|
Conflicts:
vp8/decoder/decodemv.c
vp8/decoder/onyxd_if.c
vp8/encoder/ratectrl.c
vp8/encoder/rdopt.c
Change-Id: Ia1c1c5e589f4200822d12378c7749ba62bd17ae2
|
|
A large number of functions were defined with external linkage, even
though they were only used from within one file. This patch changes
their linkage to static and removes the vp8_ prefix from their names,
which should make it more obvious to the reader that the function is
contained within the current translation unit. Functions that were
not referenced were removed.
These symbols were identified by:
$ nm -A libvpx.a | sort -k3 | uniq -c -f2 | grep ' [A-Z] ' \
| sort | grep '^ *1 '
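
The kind of change described above can be sketched as follows. The function
names are hypothetical and chosen only for illustration: a helper that is
used from a single file loses its vp8_ prefix and becomes static, while the
function that other files call keeps external linkage.

```c
/* Before (assumed shape): the helper had external linkage, e.g.
 *   int vp8_clamp_level(int level);
 * After: static, so it is visible only in this translation unit and no
 * longer appears as a global symbol in the nm output. */
static int clamp_level(int level) {
  if (level < 0) return 0;
  if (level > 63) return 63;
  return level;
}

/* Hypothetical entry point that is still referenced from other files
 * and therefore keeps its prefix and external linkage. */
int vp8_example_entry(int level) {
  return clamp_level(level);
}
```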
Change-Id: I59609f58ab65312012c047036ae1e0634f795779
|
|
Allow compiling without adding vp8/{common,encoder,decoder} to the
include paths.
Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c
|