Age | Commit message | Author |
|
Add loopfiltersimpleverticaledge_neon.c
- vp8_loop_filter_bvs_neon
- vp8_loop_filter_mbvs_neon
Change-Id: I7cf0a161ad4ae37c881b94cc0122f895d3baae79
Signed-off-by: James Yu <james.yu@linaro.org>
|
|
Add loopfiltersimplehorizontaledge_neon.c
- vp8_loop_filter_bhs_neon
- vp8_loop_filter_mbhs_neon
Change-Id: I77f9721b20585da8bf3869a3850ff0ae4b4bfeea
Signed-off-by: James Yu <james.yu@linaro.org>
|
|
Add loopfilter_neon.c
- vp8_loop_filter_horizontal_edge_y_neon
- vp8_loop_filter_horizontal_edge_uv_neon
- vp8_loop_filter_vertical_edge_y_neon
- vp8_loop_filter_vertical_edge_uv_neon
Change-Id: I50b57dedabd42d2a3c183c1738cc5346f0e71ed8
Signed-off-by: James Yu <james.yu@linaro.org>
|
|
Add iwalsh_neon.c
- vp8_short_inv_walsh4x4_neon
Change-Id: I8beda6ce11ad8ce9e80cc0a38d40161938359162
Signed-off-by: James Yu <james.yu@linaro.org>
|
|
Add idct_dequant_full_2x_neon.c
- idct_dequant_full_2x_neon
==== Summary of applying the VP8 decode patch series ====
Benchmark on Samsung Chromebook, Cortex-A15, 1.7GHz, dual core
Toolchain: linaro-1.13.1-4.8-2014.01
Compile arguments: CROSS=arm-linux-gnueabihf- ../libvpx/configure
--target=armv7-linux-gcc --prefix=$HOME/out
--enable-shared --cpu=cortex-a7
Test arguments: vpxdec --summary --noblit ./tears_of_steel_1080p.webm
Configuration     fps     Change
NEON assembly     46.68
Apply patch 06    46.65   -0.03
Apply patch 07    46.86   +0.21
Apply patch 08    46.58   -0.28
Apply patch 09    46.57   -0.01
Apply patch 10    46.51   -0.06
Apply patch 11    46.13   -0.38
Apply patch 12    45.42   -0.71
Apply patch 13    46.06   +0.64
Apply patch 14    45.19   -0.87
Apply patch 15    45.93   +0.74
Apply patch 16    45.48   -0.45
Apply patch 17    45.84   +0.36
Apply patch 18    45.91   +0.07   <= With all NEON intrinsics patches
Total: -0.77 fps, a 1.65% performance regression
Change-Id: I77bfc9eaccfb97b8d401e949ceff8795e26ca6b7
Signed-off-by: James Yu <james.yu@linaro.org>
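The totals quoted in the summary can be re-derived from the first and last rows of the benchmark. The following is a minimal sketch in Python (the codec itself is C); the function name `regression` is hypothetical and only the two fps figures come from the commit message.

```python
# Frame rates (fps) copied from the benchmark in the commit message.
baseline_fps = 46.68  # NEON assembly
final_fps = 45.91     # after patch 18 (all NEON intrinsics patches)


def regression(baseline, final):
    """Return (fps change, regression as a percentage of the baseline)."""
    delta = final - baseline
    percent = -delta / baseline * 100.0
    return delta, percent


delta, percent = regression(baseline_fps, final_fps)
print(f"Total {delta:+.2f} fps, {percent:.2f}% performance regression")
# prints: Total -0.77 fps, 1.65% performance regression
```

The percentage is taken relative to the assembly baseline, which matches the 1.65% figure reported.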
|
|
Match x86_abi_support.asm configuration
Change-Id: Ic0d03a23961e6858cf5153389ec8afa0fae3307a
|
|
When ARNR filtering is disabled by setting
arnr_max_frames=0, mode_skip_mask was being set to
-1 for the ARF frame, resulting in no mode being
selected for the block.
The intent is to restrict the reference frame to the
previous ARF frame and the mode to one of ZEROMV,
NEARMV or NEARESTMV.
Change-Id: Ifc3920b153142cd01d422910c94d2f20ffb6f129
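To illustrate why a mask of -1 leaves no mode selectable, here is a minimal sketch in Python (the encoder itself is C). It assumes that a set bit in the mask means "skip this candidate"; the candidate table, the `NEWMV` entry, and the helper names are hypothetical and do not reflect the encoder's actual data layout.

```python
# Hypothetical candidate list: bit i of the mask refers to CANDIDATES[i].
REF_FRAMES = ["LAST", "GOLDEN", "ALTREF"]
MODES = ["NEARESTMV", "NEARMV", "ZEROMV", "NEWMV"]
CANDIDATES = [(ref, mode) for ref in REF_FRAMES for mode in MODES]

# Modes the commit message says should remain available for the ARF frame.
ALLOWED_MODES = {"ZEROMV", "NEARMV", "NEARESTMV"}


def arf_mode_skip_mask():
    """Build a mask that skips everything except ALTREF with an allowed mode."""
    mask = 0
    for i, (ref, mode) in enumerate(CANDIDATES):
        if ref != "ALTREF" or mode not in ALLOWED_MODES:
            mask |= 1 << i  # set bit: this candidate is skipped
    return mask


def selectable(mask):
    """Return the candidates whose bit is clear in the mask."""
    return [c for i, c in enumerate(CANDIDATES) if not (mask >> i) & 1]


print(selectable(-1))                    # every bit set: nothing left to select
print(selectable(arf_mode_skip_mask()))  # only the three ALTREF modes remain
```

With every bit set, the candidate list is empty, which corresponds to the "no mode being selected" symptom described above.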
|
|
On balance, Deb's modified rate control for VBR seems
to perform better, especially on some low-motion YT
clips, so I have switched this to be the default mode for
now.
Change-Id: I0713d430cad6425ac5c48fccdf332e12814ee44a
|
|
Change-Id: Ib56df7cd282dadbfd202de23f0c746a93b5ce63e
|
|
Change-Id: I0d50354111df79b74aafcd3bb7dc14df3c14733a
|
|
When used, --show-program-output shows the output from the programs run
during testing.
Change-Id: I15a47c43d1fcf0243c8df1a75d0d2a584ae1f08f
|
|
Change-Id: Idccb530c814cb8a2fb9f7d0c11eaef25044efe5e
|
|
Change-Id: Id6ab59e505be28cd4eb9f1fe114feb47debe0539
|
|
Change-Id: I7cc6f441f414ca1b4d95dad3f789fff6faf8c3c4
|
|
Change-Id: I34ebc59980cf661ed658555e245bf0a93e5c3373
|
|
Corresponding C functions were removed in
I99695564a3aa9bc8c79ac0a551d257e2ff3ad3c3
Change-Id: I50a5575065a7a9e41904eb2161afd739def927db
|
|
Add a verbose logging function instead of checking
$VPX_TEST_VERBOSE_OUTPUT in multiple places.
Change-Id: I82618809f0964f696ed17ca4d99d8d7d252232f4
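The idea of a single logging helper that replaces repeated environment checks can be sketched as follows. This is an illustration in Python, not the test scripts' own code: the name `vlog`, its parameters, and the rule that any non-empty value of `VPX_TEST_VERBOSE_OUTPUT` enables output are all assumptions.

```python
import os
import sys


def vlog(message, env=None, stream=sys.stderr):
    """Write message only when verbose output is enabled; report whether it was written.

    Assumption: any non-empty VPX_TEST_VERBOSE_OUTPUT value enables logging.
    """
    env = os.environ if env is None else env
    if env.get("VPX_TEST_VERBOSE_OUTPUT"):
        print(message, file=stream)
        return True
    return False


# Call sites no longer test the variable themselves.
vlog("running decoder test")
```

Keeping the check in one function means a change to how verbosity is switched on only has to be made in one place.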
|
|
Don't update the stats if we have a corrupted frame.
Change-Id: I65a13adc50e0389b4201d3b671f0225195dfaff4
TODO: Test case that shows this problem.
|
|
Added in preparation for modifications to support high bitdepth
operations.
Change-Id: I1ad403ea8886cb84020ff06807ae25e2e4bff608
|
|
Used horizontal add instructions instead of adding
byte lanes. The encoder performance improved by
~4% for the test clip used.
Change-Id: Iaddd10403fcffb5b3f53b1f591ab2fe0ff002c08
|
|
Change-Id: I6b520553cb5334b44356dc4651a2dbc1cb93cca5
|
|
This patch is a cleanup following the commit "Save NEON registers
in VP8 NEON functions". The pushing/popping of callee-saved NEON
registers was moved into the individual NEON functions, so we no
longer need to save those registers at the beginning of the codec.
The related code was removed.
Change-Id: I5648166514fc9beffb780aa138495597731f49ea
|
|
Change-Id: I6dc9741cdcd700f5c4a387f58da7feb58dd4bbda
|
|
Assembly implementation of the SSSE3 8x8 forward 2D-DCT. The current
version is turned on only for x86_64. The average unit runtime
goes from 157 cycles down to 136 cycles, i.e., about 12.8% faster.
This translates into about 1.5% speed-up for pedestrian_area 1080p
at speed 2.
Change-Id: I0f12435857e9425ed7ce12541344dfa16837f4f4
|