Age | Commit message | Author |
|
Runs about twice as fast as C
BUG=webm:1027
Change-Id: I6760d99f4e22259439ca35d746194b12a81bfa71
|
|
To make coefficient checking consistent with the VP9 spec sections
8.7.1.6 and 8.7.1.1.
Change-Id: I92e38e89a41d1e482317bb478c48ffa608d2d6ee
|
|
|
|
* changes:
vpx_dsp,add_noise: remove mmx implementation
vpx_dsp: remove mmx variance implementations
|
|
Provides more comprehensive coverage for --enable-coefficient-checking.
The intent is to make the --enable-coefficient-checking option
consistent with the VP9 spec.
Change-Id: I12d0120756d17572ca2b2d7e6a2ab9d8071d8d58
|
|
|
|
An sse2 version exists; this is a reasonable modern baseline.
Change-Id: If31d36c8412d25b53f41b4a93cf02f46802c0c33
|
|
there are sse2 equivalents for all remaining variance implementations
Change-Id: I10b947e73fc0067688181f819b59e47966bec3d2
|
|
Replaced vpx_d45_predictor_4x4_ssse3(), vpx_d45_predictor_8x8_ssse3()
and vpx_d207_predictor_4x4_ssse3() with the newly created
vpx_d45_predictor_4x4_sse2(), vpx_d45_predictor_8x8_sse2()
and vpx_d207_predictor_4x4_sse2(), respectively.
It's mostly neutral or slightly worse than ssse3 in good cases and
better than ssse3 in the bad cases (but still worse than using the mmx
regs).
Change-Id: Ib0237ceb71d2c57b8a93fd3170330cfed9d56bdd
|
|
BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1222
Change-Id: Ifb3bedf9b4e1b007b21aebaa4beb9ba50424efef
|
|
|
|
Followed the code style of other lpf functions.
These 2 functions put 2 rows of data in a single xmm register,
so they have similar but not identical filter operations,
and cannot share the same macros.
Change-Id: I3bab55a5d1a1232926ac8fd1f03251acc38302bc
|
|
|
|
Replace MMX with SSE2.
Change-Id: Id8482d2589131f9427e7f36bc64413f058caf31f
|
|
This reverts commit 2468163e0770108f5216b65445ce05a8241bca21.
It causes valgrind errors due to a buffer overread in SubpelVarianceTest.
Change-Id: I448e52c76f815ac199305b71f7d169f2bc167679
|
|
|
|
|
|
|
|
This commit clarifies the integer value ranges for variables used in
several variance functions, and changes them to use proper type
conversions that reflect those value ranges.
Change-Id: Ic3234b83a912ce1ad12d1b254f3378763e15cc5c
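As an illustration of why value ranges matter in a variance function, here is a minimal sketch. It assumes 8-bit pixels and blocks no larger than 64x64. The helper name `block_variance` and its signature are hypothetical, and this is not the code touched by the commit.

```c
#include <stdint.h>

/* Hypothetical helper, not the libvpx implementation: variance of the
 * difference between two 8-bit blocks of size w x h. */
static uint32_t block_variance(const uint8_t *a, int a_stride,
                               const uint8_t *b, int b_stride,
                               int w, int h, uint32_t *sse) {
  int sum = 0;     /* |sum| <= 255 * w * h, fits in int up to 64x64 */
  uint32_t sq = 0; /* <= 255 * 255 * w * h, fits in 32 bits up to 64x64 */
  for (int r = 0; r < h; ++r) {
    for (int c = 0; c < w; ++c) {
      const int diff = a[r * a_stride + c] - b[r * b_stride + c];
      sum += diff;
      sq += (uint32_t)(diff * diff);
    }
  }
  *sse = sq;
  /* sum * sum can exceed 32 bits for large blocks, so widen to 64 bits
   * before multiplying, then narrow the quotient (which is <= sq). */
  return sq - (uint32_t)(((int64_t)sum * sum) / (w * h));
}
```

The explicit `int64_t` cast is the point of the sketch: for a 64x64 block the sum can reach 1,044,480, and its square no longer fits in 32 bits.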
|
|
Replace MMX with SSE2.
Change-Id: Ia8fcba755952804e347d7d7736f57d1f90c988a0
|
|
Runs about 30% faster than the C version
BUG=webm:1021
Change-Id: I6809d6d84c3077ab619c53298296950e976bdaba
|
|
|
|
|
|
In the motion estimation stage for subpel motion, subpel variance is
computed using bilinear interpolation. The motion vector precision
used is at 1/8 pel and three bits are used to represent the x and y
subpel offsets. Based on this, the half pel check should be against
4, not 8.
Change-Id: I1f56fa1fa3f2f5e19a20d27983efe628557f170e
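A minimal sketch of the arithmetic behind this fix, assuming offsets are stored in 1/8-pel units in the low three bits of a motion vector component. All names here (`SUBPEL_MASK`, `subpel_offset`, `is_half_pel`, `bilinear_1_8`) are hypothetical, and the two-tap weights are a simplified stand-in for the real filter tables.

```c
#include <stdint.h>

/* Assumption: 1/8-pel precision, so three bits hold the subpel offset. */
enum { SUBPEL_BITS = 3, SUBPEL_MASK = (1 << SUBPEL_BITS) - 1, HALF_PEL = 4 };

/* Hypothetical helper: subpel offset (0..7) of a 1/8-pel MV component. */
static int subpel_offset(int mv_component) {
  return mv_component & SUBPEL_MASK;
}

/* With offsets in 0..7, the half-pel position is 4. A check against 8
 * could never match, because 8 is not representable in three bits. */
static int is_half_pel(int offset) { return offset == HALF_PEL; }

/* Simplified two-tap bilinear interpolation between neighbouring pixels
 * a and b; the weights (8 - offset) and offset sum to 8, with rounding. */
static uint8_t bilinear_1_8(uint8_t a, uint8_t b, int offset) {
  return (uint8_t)((a * (8 - offset) + b * offset + 4) >> 3);
}
```

For example, a motion vector component of 12 is one full pel plus an offset of 4, which is the half-pel position.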
|
|
there are sse2 equivalents, which are a reasonable modern baseline
Removed mmx variance functions:
vpx_get_mb_ss_mmx()
vpx_get8x8var_mmx()
vpx_get4x4var_mmx()
vpx_variance4x4_mmx()
vpx_variance8x8_mmx()
vpx_mse16x16_mmx()
vpx_variance16x16_mmx()
vpx_variance16x8_mmx()
vpx_variance8x16_mmx()
Change-Id: Iffaf85344c6676a3dd337c0645a2dd5deb2f86a1
|
|
there are sse2 equivalents, which are a reasonable modern baseline
Change-Id: Ibbe536a5ad1c2cccef6bdcc75c13b3dde35a56ba
|
|
Change-Id: I4906d1b79a2951e659995202b9fa97e2ea5cfba0
|
|
|
|
|
|
* changes:
The subfunctions are only defined for sse2
Unlike non-hbd variance, opt2 is never used
|
|
Change-Id: I431ea0d9abe764d110a1ba32a8cb15e2fdac8805
|
|
This change makes the C code match the assembly and removes the TODOs
associated with getting this to work.
Change-Id: Ie32e9ebb584a9d60399662d8bcb71b74fbd19d1e
|
|
Change-Id: Ibe0cc388226622561d2b4a00e5bdc1016a3c4a94
|
|
See highbd_subpel_variance_impl_sse2.asm
Change-Id: Id13b97f4f6d189ed71cdc6d52b3c4ea63dc1da05
|
|
Change-Id: I1d342725df332c4efc6006d9e3dcb7372c41f448
|
|
* changes:
vp9_frame_scale_ssse3.c: make 2 functions static
vp9_pickmode.c: make function static
vp9_noise_estimate.c: make function static
vp9_aq_360.c: add missing include
vp9_idct_intrin_sse2: add missing vp9_rtcd.h include
vpx_dsp/*.[hc]: add missing vpx_dsp_rtcd.h include
|
|
Change-Id: I103be7eee36492f8619144ce8325bc916d4975c7
|
|
Change-Id: I05b3028a38bbc062c388eeb95e99a3fee583ae6b
|
|
Change-Id: I1ad41c096ec86870f9aecab6fdbc3af03e972afc
|
|
|
|
In so doing this fixes a couple of bugs:
vpx_plane_add_noise.c needed to subtract a clamp instead of add.
The assembly (mmx, sse) also assumed that parameters were
contiguous in memory, which was not true.
Change-Id: I76f2c43cf54bfc838eb2edf8a443eaaa7565d7b5
|
|
|
|
Change-Id: I12218d8331c0558c0587a66321e3ca46da7e5cc7
|
|
|
|
Change-Id: I481eb271b082fa3497b0283f37d9b4d1f6de270c
|
|
bits_left is in the range [0, 64 (= BD_VALUE_SIZE)], so the narrowing
conversion should be safe.
Change-Id: I943fcd359eaad76249ee1e1fb03a2ac16945d2fd
|
|
The product always fits in uint32_t, but the operands don't.
An optimizing compiler should generate the wraparound code.
(Verified with clang).
Change-Id: I25eb64df99152992bc898b8ccbb01d55c8d16e3c
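The commit does not show the code involved, so the following is only a generic illustration of the wraparound property it appears to rely on: the low 32 bits of an unsigned product depend only on the low 32 bits of the operands. Both function names are hypothetical, and the sketch assumes a platform where `int` is 32 bits.

```c
#include <stdint.h>

/* Multiply in 64 bits, then narrow: the cast keeps the low 32 bits. */
static uint32_t mul_wide_then_narrow(uint64_t a, uint64_t b) {
  return (uint32_t)(a * b);
}

/* Narrow the operands first, then multiply with 32-bit wraparound.
 * Unsigned arithmetic is modular, so the result is the same. */
static uint32_t mul_narrow_then_wrap(uint64_t a, uint64_t b) {
  return (uint32_t)a * (uint32_t)b;
}
```

Because the two forms agree for all inputs, a compiler is free to emit a single 32-bit multiply for either one.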
|
|
These blocks will never overflow since max sum is +/-255*w*h.
Change-Id: Ia2c630339fd9cfb411b56b6040ff402095f12a2e
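The bound quoted above can be checked with a few lines of arithmetic. This sketch assumes 8-bit pixels, so each per-pixel difference lies in [-255, 255]. The helper names are hypothetical, and the commit does not say which accumulator type or block sizes are involved.

```c
#include <stdint.h>

/* Largest possible magnitude of a sum of w * h differences of 8-bit pixels. */
static int64_t max_abs_sum(int w, int h) { return (int64_t)255 * w * h; }

/* Returns 1 if that worst-case sum fits in a signed type whose maximum
 * value is type_max (for example INT16_MAX or INT32_MAX). */
static int sum_fits(int w, int h, int64_t type_max) {
  return max_abs_sum(w, h) <= type_max;
}
```

For instance, an 8x8 block has a worst-case sum of 16,320, which fits in 16 bits, while a 16x16 block reaches 65,280 and needs a wider accumulator.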
|
|
BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1156
Change-Id: Ief0ad8d6255b0ef0f233cda153799e3c72d3dbc6
|
|
The order of the output structure is not currently important.
BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1021
Change-Id: Ibc0006d569675db6c5060c4529f5d9e73f2e96a6
|