Age | Commit message (Collapse) | Author |
|
Change-Id: I446b2ffcbe732ffb112dbd97a4799272d4c01a84
|
|
This reinstates reverted commit 2113a831575d81faeadd9966e256d58b6b2b1633
Change-Id: I9a9af13497d1e58d4f467e3e083fddf06b1b786c
|
|
This reverts commit 2113a831575d81faeadd9966e256d58b6b2b1633
|
|
Code clean up - removed rtcd
Change-Id: Id963ecf53c370b1d99484ef18d6befeed7e0c748
|
|
Convert copy16x16 from invoke to rtcd. The first in a long
string of converts.
Change-Id: I296b0aa32f40e9fb649f7a3cb914a4e5300cad63
|
|
About 20% overall encoder speedup (vs. about 30% for sse4 version).
Change-Id: Ibf608a6a1bc94b14ec47e8046d3206b275b5a8bd
|
|
Unroll horizontal pass, no more intermediate buffer, faster special transpose.
Change-Id: I05df75be4e5f01420066cdf3c61a2edf35bedb64
|
|
|
|
About 3.5x faster, 30% overall encoder speedup. Rest of optimizations
will come soon (see TODO section in filter_sse4.c).
Change-Id: If18108048bfd5345fc942e8574e4c7f58e0e86e0
|
|
Further cases of inconsistent naming convention.
Change-Id: Id3411ecec6f01a4c889268a00f0c9fd5a92ea143
|
|
Merges this experiment in to make it easier to run tests on
filter precision, vectorized implementation etc.
Also removes an experimental filter.
Change-Id: I1e8706bb6d4fc469815123939e9c6e0b5ae945cd
|
|
Merged the enhanced_interp experiment.
Found and fixed a bug in the include files framework, whereby
certain encoder files were still using the old INTERP_EXTEND
value of 3 instead of 4. The thresholds for mv range mcomp.c
need a small adjustment to prevent crashes.
The results are more or less unchanged.
Change-Id: Iac5008390f1efc97ce1102fbb5f8989c847fb579
|
|
The following five experiments are merged:
newentropy
newupdate
adaptive_entropy (also includes a couple of parameter changes
that improves results a little
in common/entropymode.c and encoder/modecosts.c
that were not merged from the internal branch)
newintramodes
expanded_coef_context
Change-Id: I8a142a831786ee9dc936f22be1d42a8bced7d270
|
|
Approximate the Google style guide[1] so that that there's a written
document to follow and tools to check compliance[2].
[1]: http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml
[2]: http://google-styleguide.googlecode.com/svn/trunk/cpplint/cpplint.py
Change-Id: Idf40e3d8dddcc72150f6af127b13e5dab838685f
|
|
Change-Id: I6cbd4de96f9dcc783cef170bfd7652f6cbee36a2
|
|
Change-Id: I6802731a4d15feef5ce62993dc505ded55c40f7e
|
|
Updates idct/dequant mmx assembly to work with vpnext instead of vp8.
Also adds x86inc.asm
Change-Id: I6e147d5e89177ae449271e97e50d082eb11b078e
|
|
Adds 6 directional intra predictiom modes for 16x16 and 8x8 blocks.
Change-Id: I25eccc0836f28d8d74922e4e9231568a648b47d1
|
|
This is the initial patch for supporting 1/8th pel
motion. Currently if we configure with enable-high-precision-mv,
all motion vectors would default to 1/8 pel. Encode and
decode syncs fine with the current code. In the next phase
the code will be refactored so that we can choose the 1/8
pel mode adaptively at a frame/segment/mb level.
Derf results:
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hpmv.html
(about 0.83% better than 8-tap interpoaltion)
Patch 3: Rebased. Also adding 1/16th pel interpolation for U and V
Patch 4: HD results.
http://www.corp.google.com/~debargha/vp8_results/enhinterp_hd_hpmv.html
Seems impressive (unless I am doing something wrong).
Patch 5: Added mmx/sse for bilateral filtering, as well as enforced
use of c-versions of subpel filters with 8-taps and 1/16th pel;
Also redesigned the 8-tap filters to reduce the cut-off in order to
introduce a denoising effect. There is a new configure option
sixteenth-subpel-uv which will use 1/16 th pel interpolation for
uv, if the motion vectors have 1/8 pel accuracy.
With the fixes the results are promising on the derf set. The enhanced
interpolation option with 8-taps alone gives 3% improvement over thei
derf set:
http://www.corp.google.com/~debargha/vp8_results/enhinterpn.html
Results on high precision mv and on the hd set are to follow.
Patch 6: Adding a missing condition for CONFIG_SIXTEENTH_SUBPEL_UV in
vp8/common/x86/x86_systemdependent.c
Patch 7: Cleaning up various debug messages.
Patch 8: Merge conflict
Change-Id: I5b1d844457aefd7414a9e4e0e06c6ed38fd8cc04
|
|
Conflicts:
vp8/common/defaultcoefcounts.h
vp8/common/entropy.c
vp8/encoder/bitstream.c
Change-Id: Idd4990c80d5b5494ac036254694015fab449bc08
|
|
Prepend idct function names with vp8_
so that under profiling they show up
associated with libvpx.
Change-Id: I4fe357b50236cb7730a4cc00164c0a3487a1d8b4
|
|
|
|
The data that the simple horizontal loopfilter reads is aligned, treat
it accordingly.
For the vertical, we only use the bottom 4 bytes, so don't read in 16
(and incur the penalty for unaligned access).
This shows a small improvement on older processors which have a
significant penalty for unaligned reads.
postproc_mmx.c is unused
Change-Id: I87b29bbc0c3b19ee1ca1de3c4f47332a53087b3d
|
|
Prepend . to local labels in assembly code. This
allows non unique labels within a file. Also
makes profiling information more informative
by keeping the function name with the loop name.
Change-Id: I7a983cb3a5ba2413d5dafd0a37936b268fb9e37f
|
|
|
|
Change-Id: I1ed739522db7c00c189851c7095c1b64ef6412ce
|
|
|
|
This should fix binaries using PIC on x86-32. Also should
fix issue 343.
Change-Id: I591de3ad68c8a8bb16054bd8f987a75b4e2bad02
|
|
|
|
|
|
|
|
|
|
global values were being referenced, but the GOT was not being set up.
as the GOT is only required for PIC, this issue wasn't caught in the
default configuration.
Change-Id: I8006e53776139362a76f2c80cf9d0f8458602b2f
http://code.google.com/p/webm/issues/detail?id=328
|
|
Change-Id: I9467d7a50eac32d8e8f3a2f26db818e47c93c94b
|
|
|
|
Change-Id: I97124670926433bf1593c91660d8b8f8482ea9ce
|
|
Change-Id: I658a1df7d825f820573cb2d11ad402f9d2791035
|
|
|
|
removed inline from recon_wrapper_sse2.c to build
for visual stuido
Change-Id: I74a3482950448e2cdb30e9cd7087145b440d8a22
|
|
|
|
Thanks Jason for pointing that out on #vp8. ;-).
Change-Id: I5330a753e752a8704b78a409597472628e0b26a5
|
|
decoding
before
10.425
10.432
10.423
=10.426
after:
10.405
10.416
10.398
=10.406, 0.2% faster
encoding
before
14.252
14.331
14.250
14.223
14.241
14.220
14.221
=14.248
after
14.095
14.090
14.085
14.095
14.064
14.081
14.089
=14.086, 1.1% faster
Change-Id: I483d3d8f0deda8ad434cea76e16028380722aee2
|
|
Conflicts:
vp8/common/alloccommon.c
vp8/encoder/rdopt.c
Change-Id: Ic34b33577423031e277235ffa6bcaff7b252e5cb
|
|
the decision to run the regular or simple loopfilter is made outside the
function and managed with pointers
stop tracking the option in two places. use filter_type exclusively
Change-Id: I39d7b5d1352885efc632c0a94aaf56b72cc2fe15
|
|
Conflicts:
vp8/decoder/onyxd_int.h
Change-Id: Icf445b589c2bc61d93d8c977379bbd84387d0488
|
|
the win64 abi requires saving and restoring xmm6:xmm15. currently
SAVE_XMM and RESTORE XMM only allow for saving xmm6:xmm7. allow
specifying the highest register used and if the stack is unaligned.
Change-Id: Ica5699622ffe3346d3a486f48eef0206c51cf867
|
|
Went through the code and fixed it. Verified on Windows.
Where possible, remove dependencies on xmm[67]
Current code relies on pushing rbp to the stack to get 16 byte
alignment. This broke when rbp wasn't pushed
(vp8/encoder/x86/sad_sse3.asm). Work around this by using unaligned
memory accesses. Revisit this and the offsets in
vp8/encoder/x86/sad_sse3.asm in another change to SAVE_XMM.
Change-Id: I5f940994d3ebfd977c3d68446cef20fd78b07877
|
|
|
|
vp8_filter_block1d16_h4_ssse3 was never called
because UNSHADOW_ARGS moves the stack by 'mov rsp, rbp', the issue was
masked. however, if/when win64 used those registers for persistant data,
issues could/will arise.
Change-Id: I56d6effca0aeba1f86082689771cb10145d39651
|
|
|