Age | Commit message | Author |
|
Conflicts:
vp9/common/vp9_findnearmv.c
vp9/common/vp9_rtcd_defs.sh
vp9/decoder/vp9_decodframe.c
vp9/decoder/x86/vp9_dequantize_sse2.c
vp9/encoder/vp9_rdopt.c
vp9/vp9_common.mk
Resolve file name changes in favor of master. Resolve rdopt changes in
favor of experimental, preserving the newer experiments.
Change-Id: If51ed8f457470281c7b20a5c1a2f4ce2cf76c20f
|
|
Remove similarly named header file. It is obsolete.
Move file to match naming style.
Adjust make file to include the file correctly and remove extra
unnecessary #if guard.
Change-Id: Ifba07ba9938a5df08a9f4eda54a3ac4d6983f7bf
|
|
First in a series of commits moving the framebuffer pointers to
per-plane data, so that they can be indexed numerically rather than
by name.
Change-Id: I6e0d60fd4d51e6375c384eb7321776564df21775
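
As an illustration of what indexing planes numerically rather than by name can look like, here is a minimal C sketch. The names `plane_buf`, `frame_planes` and `clear_planes`, and the Y/U/V ordering, are hypothetical and are not taken from the libvpx source.

```c
#include <stdint.h>
#include <string.h>

enum { MAX_PLANES = 3 }; /* assumed order: 0 = Y, 1 = U, 2 = V */

/* Hypothetical per-plane descriptor: one buffer pointer plus its geometry. */
struct plane_buf {
  uint8_t *buf;
  int stride;
  int rows;
};

/* Hypothetical frame: planes live in an array instead of named y/u/v fields. */
struct frame_planes {
  struct plane_buf plane[MAX_PLANES];
};

/* One loop handles every plane, so no per-name copy of the code is needed. */
static void clear_planes(struct frame_planes *f, uint8_t value) {
  for (int i = 0; i < MAX_PLANES; ++i) {
    struct plane_buf *p = &f->plane[i];
    memset(p->buf, value, (size_t)p->stride * (size_t)p->rows);
  }
}
```

With named fields the same operation would need three near-identical statements; the array form lets loops and helper functions take a plane index as a parameter.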
|
|
The C code was being used as a fallback for the >16 case, but only for 2D.
Change-Id: I1e2e6da9e4b28bd88bde9ba4dd32724ce466cf6f
|
|
|
|
Use in-place buffers (dst of MACROBLOCKD) for macroblock prediction.
This makes the macroblock buffer handling consistent with that of
superblocks. Remove the predictor buffer from MACROBLOCKD.
Change-Id: Id1bcd898961097b1e6230c10f0130753a59fc6df
|
|
Updates the common convolution code to support blocks larger than
16x16, and rectangular blocks. This uncovered a bug in the SSSE3
filtering routines due to the order of application of saturation.
This commit fixes that bug, adjusts the unit test to bias its
random values towards the extremes, and adds a test to ensure that
all filters conform to the expected pairwise addition structure.
Change-Id: I81f69668b1de0de5a8ed43f0643845641525c8f0
|
|
that are related to using reconstructed pixel for selecting reference
motion vectors.
Change-Id: I048dfae39ca7385e344b57d46347ecc6e753e1bb
|
|
VP9 preview bitstream 2, commit '868ecb55a1528ca3f19286e7d1551572bf89b642'
Conflicts:
vp9/vp9_common.mk
Change-Id: I3f0f6e692c987ff24f98ceafbb86cb9cf64ad8d3
|
|
Conflicts:
vp9/vp9_common.mk
Change-Id: I2cd5ab47dc31c4210cefc23a282102123d5e2221
|
|
Allow more careful targeting of compiler flags.
Change-Id: I963ab4a6479dedb165419310dfca52a58a9877b8
|
|
Rename the file and clean up includes. In the future we would like to
pattern match the files which need additional compiler flags.
Change-Id: I2c76256467f392a78dd4ccc71e6e0a580e158e56
|
|
Small modification of idct code.
Change-Id: I5c4e3223944c68e4ccf762f6cf07c990250e4290
|
|
Wrote sse2 version of vp9_short_idct_32x32 function. Compared
to c version, the sse2 version is 5X faster.
Change-Id: I071ab7378358346ab4d9c6e2980f713c3c209864
|
|
Wrote sse2 version of vp9_short_idct10_16x16 function. Compared
to c version, the sse2 version is 2.3X faster.
Change-Id: I314c4f09369648721798321eeed6f58e38857f26
|
|
|
|
Wrote sse2 version of vp9_short_idct16x16 function. Compared to c
version, the sse2 version is over 2.5X faster.
Change-Id: I38536e2b846427a2cc5c5423aaf305fd0e605d61
|
|
Renaming Width to width, Height to height and Version to version in
several structs and function signatures.
Change-Id: I084c3f7e747cb2ce3345aff27a3dff9b13a87543
|
|
Wrote sse2 functions of vp9_short_idct8x8 and vp9_short_idct10_8x8.
Compared to c version, the sse2 version is 2X faster. The decoder
test didn't show noticeable gain since 8x8 idct doesn't take much
of decoding time (less than 1% in my test).
Change-Id: I56313e18cd481700b3b52c4eda5ca204ca6365f3
|
|
This commit renames files and functions to remove obsolete
references to LLM and x8.
Change-Id: I973b20fc1a55149ed68b5408b3874768e6f88516
|
|
Added SSE2 idct4_1d which is called by vp9_short_iht4x4. Also,
modified the parameter type passed to vp9_short_iht functions to
make it work with rtcd prototype.
Change-Id: I81ba7cb4db6738f1923383b52a06deb760923ffe
|
|
|
|
Wrote a SSE2 vp9_short_idct4x4llm to improve the decoder
performance.
Change-Id: I90b9d48c4bf37aaf47995bffe7e584e6d4a2c000
|
|
Picks up some build system changes, compiler warning fixes, etc.
Change-Id: I2712f99e653502818a101a72696ad54018152d4e
|
|
|
|
Removed vp9_idctllm_mmx.asm
Change-Id: I7152756f23a5a09ed69e8fb40edb2ab3237290fe
|
|
s/movd/movq/
Change-Id: Id1a56de91551f8dc796f14f1056c565dfc1ba626
|
|
Wrote SSE2 version of vp9_dc_only_idct_add_c function. In order to
improve performance, clipped the absolute diff values to [0, 255].
This allowed us to keep the additions/subtractions in 8 bits.
Test showed an over 2% decoder performance increase.
Change-Id: Ie1a236d23d207e4ffcd1fc9f3d77462a9c7fe09d
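
The clipping idea in this message can be sketched in scalar C. This is only an illustration under stated assumptions: the helpers below emulate unsigned saturating byte add and subtract (the behaviour of the SSE2 `paddusb` and `psubusb` instructions), and the function names are hypothetical. Whether the commit uses exactly these instructions is an assumption.

```c
#include <stdint.h>

/* Scalar stand-ins for unsigned saturating 8-bit add and subtract. */
static uint8_t sat_add_u8(uint8_t a, uint8_t b) {
  unsigned s = (unsigned)a + (unsigned)b;
  return s > 255 ? 255 : (uint8_t)s;
}

static uint8_t sat_sub_u8(uint8_t a, uint8_t b) {
  return a > b ? (uint8_t)(a - b) : 0;
}

/* Hypothetical 8-bit path: clip |dc| to [0, 255], then do one saturating
 * add (dc >= 0) or one saturating subtract (dc < 0). */
static uint8_t dc_add_8bit(uint8_t pred, int dc) {
  int mag = dc >= 0 ? dc : -dc;
  uint8_t m = (uint8_t)(mag > 255 ? 255 : mag);
  return dc >= 0 ? sat_add_u8(pred, m) : sat_sub_u8(pred, m);
}

/* Reference: widen to int, add, then clamp to the pixel range. */
static uint8_t dc_add_ref(uint8_t pred, int dc) {
  int v = (int)pred + dc;
  return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}
```

Because a pixel is at most 255, any offset with magnitude above 255 saturates the result anyway, so clipping the magnitude first does not change the output. That is what allows the arithmetic to stay in 8-bit lanes.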
|
|
Add initial ssse3 convolve avg functions. This is one step closer
to using x86inc.asm. The decoder performance improved by 8% for
the test clip used. This should be revisited later to see if
averaging outside the loop is better than having many similar
filter functions.
Change-Id: Ice3fafb423b02710b0448ffca18b296bcac649e9
|
|
A 16-bit overflow condition occurs when using the EIGHTTAP_SMOOTH filters
(vp9_sub_pel_filters_8lp). Changed the order of the adds to fix this problem.
Also added ssse3 support for 4x4 subpixel filtering.
Change-Id: I475eaadae920794c2de5e01e9735c059a856518e
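
Why the order of the adds matters with 16-bit saturating arithmetic can be shown with a small scalar sketch. The helper names and the partial-product values below are made up for illustration and are not real filter data. The saturating add emulates the behaviour of a signed 16-bit saturating instruction such as `paddsw`.

```c
#include <stdint.h>

/* Scalar stand-in for a signed saturating 16-bit add. */
static int16_t sat_add_s16(int16_t a, int16_t b) {
  int32_t s = (int32_t)a + (int32_t)b;
  if (s > INT16_MAX) return INT16_MAX;
  if (s < INT16_MIN) return INT16_MIN;
  return (int16_t)s;
}

/* Accumulate terms strictly left to right, saturating after every add. */
static int16_t sum_saturating(const int16_t *terms, int n) {
  int16_t acc = 0;
  for (int i = 0; i < n; ++i) acc = sat_add_s16(acc, terms[i]);
  return acc;
}
```

With the terms `{16000, 20000, -6000, -4000}` the running sum saturates at the second add and the final value is wrong, even though the true sum (26000) fits in 16 bits. Reordering to `{16000, -6000, 20000, -4000}` keeps every intermediate value in range and yields the true sum.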
|
|
* changes:
Restore SSSE3 subpixel filters in new convolve framework
Convert subpixel filters to use convolve framework
Add 8-tap generic convolver
|
|
This commit adds the 8 tap SSSE3 subpixel filters back into the code
underneath the convolve API. The C code is still called for 4x4
blocks, as well as compound prediction modes. This restores the
encode performance to be within about 8% of the baseline.
Change-Id: Ife0d81477075ae33c05b53c65003951efdc8b09c
|
|
Change-Id: I8508f1a3d3430f998bb9295f849e88e626a52a24
|
|
Update the code to call the new convolution functions to do subpixel
prediction rather than the existing functions. Remove the old C and
assembly code, since it is unused. This causes a 50% performance
reduction on the decoder, but that will be resolved when the asm for
the new functions is available.
There is no consensus for whether 6-tap or 2-tap predictors will be
supported in the final codec, so these filters are implemented in
terms of the 8-tap code, so that quality testing of these modes
can continue. Implementing the lower complexity algorithms is a
simple exercise, should it be necessary.
This code produces slightly better results in the EIGHTTAP_SMOOTH
case, since the filter is now applied in only one direction when
the subpel motion is only in one direction. Like the previous code,
the filtering is skipped entirely on full-pel MVs. This combination
seems to give the best quality gains, but this may be indicative of a
bug in the encoder's filter selection, since the encoder could
achieve the result of skipping the filtering on full-pel by selecting
one of the other filters. This should be revisited.
Quality gains on derf are positive on almost all clips. The only clip
that seemed to be hurt at all datarates was football
(-0.115% PSNR average, -0.587% min). Overall averages 0.375% PSNR,
0.347% SSIM.
Change-Id: I7d469716091b1d89b4b08adde5863999319d69ff
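
A rough C sketch of three ideas from this message: a 2-tap filter expressed as a zero-padded 8-tap kernel, filtering skipped on full-pel positions, and a pass selected only for directions with subpel motion. All names here are hypothetical, the half-pel kernel is an assumed example, and the code is a simplified one-row illustration rather than the libvpx implementation.

```c
#include <stdint.h>
#include <string.h>

#define TAPS 8
#define FILTER_BITS 7 /* kernels are assumed to sum to 128 */

enum { PASS_NONE = 0, PASS_HORIZ = 1, PASS_VERT = 2 };

/* Filter only in the directions that have a fractional MV component. */
static int passes_needed(int frac_x, int frac_y) {
  return (frac_x ? PASS_HORIZ : 0) | (frac_y ? PASS_VERT : 0);
}

static uint8_t clip_u8(int v) {
  return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* 8-tap filter over one row. src needs 3 readable samples before index 0
 * and 4 after index w - 1. Assumes an arithmetic right shift if a kernel
 * with negative taps produces a negative sum. */
static void filter_row_8tap(const uint8_t *src, uint8_t *dst, int w,
                            const int16_t k[TAPS]) {
  for (int x = 0; x < w; ++x) {
    int sum = 0;
    for (int t = 0; t < TAPS; ++t) sum += src[x + t - 3] * k[t];
    dst[x] = clip_u8((sum + (1 << (FILTER_BITS - 1))) >> FILTER_BITS);
  }
}

/* Assumed half-pel 2-tap (bilinear) kernel, zero-padded to 8 taps. */
static const int16_t kBilinearHalf[TAPS] = {0, 0, 0, 64, 64, 0, 0, 0};

/* Full-pel positions are copied; fractional ones go through the 8-tap path. */
static void predict_row(const uint8_t *src, uint8_t *dst, int w, int frac_x,
                        const int16_t k[TAPS]) {
  if (frac_x == 0) {
    memcpy(dst, src, (size_t)w);
  } else {
    filter_row_8tap(src, dst, w, k);
  }
}
```

Padding the shorter kernel with zeros costs extra multiplies, which fits the message's point that a dedicated lower-complexity path can be written later if those modes are kept.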
|
|
Updated the intrinsic code to match Yaowu's latest loopfilter change.
(I584393906c4f5f948a581d6590959522572743bb)
The decoder performance improved by ~30% for the test clip used.
Change-Id: I026cfc75d5bcb7d8d58be6f0440ac9e126ef39d2
|
|
During master jenkins verification process
Change-Id: I3722b8753eaf39f99b45979ce407a8ea0bea0b89
|
|
Change-Id: I0c94475075e66e13cfe4c20fab7db6474441ae86
|
|
experimental
|
|
and vp9_mb_lpf_vertical_edge_w_sse2. This was quickly done so we can
run some tests over the weekend. Future commits will optimize/refactor these
functions further.
The decoder performance improved by ~17% for the clip used.
Change-Id: I612687cd5a7670ee840a0cbc3c68dc2b84d4af76
|
|
|
|
On block boundary within a MB when 8x8 block boundary only is filtered
for Y.
Change-Id: Ie1c804c877d199e78e2fecd8c2d3f1e114ce9ec1
|
|
Updated the rtcd_defs and used the sse2 uv version
of the loopfilter. The performance improved by ~8%
for the test clip used.
Change-Id: I5a0bca3b6674198d40ca4a77b8cc722ddde79c36
|
|
|
|
About 5% decoder speedup.
Change-Id: Ib6687d337af758a536a0e7e289f400990f1f9794
|
|
Incorporate vp9-preview changes by merging master branch into experimental.
Conflicts:
test/test.mk
vp9/common/vp9_filter.c
vp9/common/vp9_idctllm.c
vp9/common/vp9_invtrans.h
vp9/common/vp9_mbpitch.c
vp9/common/vp9_rtcd_defs.sh
vp9/common/vp9_systemdependent.h
vp9/common/vp9_type_aliases.h
vp9/common/x86/vp9_asm_stubs.c
vp9/common/x86/vp9_subpixel_mmx.asm
vp9/decoder/vp9_decodframe.c
vp9/decoder/vp9_dequantize.c
vp9/decoder/vp9_dequantize.h
vp9/decoder/vp9_onyxd_int.h
vp9/encoder/vp9_bitstream.c
vp9/encoder/vp9_encodeframe.c
vp9/encoder/vp9_rdopt.c
Change-Id: I17f51c3666d1b59cf1a699f87607cbc5d30a87c5
|
|
Various fixups to resolve issues when building vp9-preview under the more stringent
checks placed on the experimental branch.
Change-Id: I21749de83552e1e75c799003f849e6a0f1a35b07
|
|
These filters will not work with VP9.
Change-Id: Ic26c77961084fcea6bfa97f4cd95afdea2282e85
|
|
|
|
Change-Id: Ibc077cf1c1da0c86063f88c6d3073c6876989119
|
|
Change-Id: If7822e6fcd0d3568b934032322b19ba3e401df26
|