path: root/vp8/vp8_common.mk
Age Commit message (Author)
2014-05-06 Revert "VP8 for ARMv8 by using NEON intrinsics 08" (Johann)
This reverts commit a5d79f43b963ced59b462206faf3b7857bdeff7b

There is an issue with gcc 4.6 in the Android NDK:
loopfilter_neon.c: In function 'vp8_loop_filter_vertical_edge_y_neon':
loopfilter_neon.c:394:1: error: insn does not satisfy its constraints:

Change-Id: I2b8c6ee3fa595c152ac3a5c08dd79bd9770c7b52
2014-05-03 VP8 for ARMv8 by using NEON intrinsics 16 (James Yu)
Add variance_neon.c - vp8_variance16x16_neon - vp8_variance16x8_neon - vp8_variance8x16_neon - vp8_variance8x8_neon Change-Id: Idfb9c96134a1c6a696a98ce68b4f7ed593a00660 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 VP8 for ARMv8 by using NEON intrinsics 15 (James Yu)
Add idct_dequant_0_2x_neon.c - idct_dequant_0_2x_neon Change-Id: I8e129172ef1b2517cf72ff267788921f1a792586 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 VP8 for ARMv8 by using NEON intrinsics 14 (James Yu)
Add sixtappredict_neon.c - vp8_sixtap_predict16x16_neon - vp8_sixtap_predict8x8_neon - vp8_sixtap_predict8x4_neon - vp8_sixtap_predict4x4_neon Change-Id: I3b02fce48ae2e6c6099041ba5ddd7b090f1463b9 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 VP8 for ARMv8 by using NEON intrinsics 13 (James Yu)
Add shortidct4x4llm_neon.c - vp8_short_idct4x4llm_neon Change-Id: I5a734bbffca8dacf8633c2b0ff07b98aa2f438ba Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 VP8 for ARMv8 by using NEON intrinsics 12 (James Yu)
Add sad_neon.c - vp8_sad16x16_neon - vp8_sad16x8_neon - vp8_sad8x8_neon - vp8_sad8x16_neon - vp8_sad4x4_neon Change-Id: I08eaae49ec03fb91b394354660a5df0367cea311 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 VP8 for ARMv8 by using NEON intrinsics 11 (James Yu)
Add mbloopfilter_neon.c - vp8_mbloop_filter_horizontal_edge_y_neon - vp8_mbloop_filter_horizontal_edge_uv_neon - vp8_mbloop_filter_vertical_edge_y_neon - vp8_mbloop_filter_vertical_edge_uv_neon Change-Id: Ia9084e0892d4d49412d9cf2b165a0f719f2382d7 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 VP8 for ARMv8 by using NEON intrinsics 10 (James Yu)
Add loopfiltersimpleverticaledge_neon.c - vp8_loop_filter_bvs_neon - vp8_loop_filter_mbvs_neon Change-Id: I7cf0a161ad4ae37c881b94cc0122f895d3baae79 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 VP8 for ARMv8 by using NEON intrinsics 09 (James Yu)
Add loopfiltersimplehorizontaledge_neon.c - vp8_loop_filter_bhs_neon - vp8_loop_filter_mbhs_neon Change-Id: I77f9721b20585da8bf3869a3850ff0ae4b4bfeea Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 VP8 for ARMv8 by using NEON intrinsics 08 (James Yu)
Add loopfilter_neon.c - vp8_loop_filter_horizontal_edge_y_neon - vp8_loop_filter_horizontal_edge_uv_neon - vp8_loop_filter_vertical_edge_y_neon - vp8_loop_filter_vertical_edge_uv_neon Change-Id: I50b57dedabd42d2a3c183c1738cc5346f0e71ed8 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 VP8 for ARMv8 by using NEON intrinsics 07 (James Yu)
Add iwalsh_neon.c - vp8_short_inv_walsh4x4_neon Change-Id: I8beda6ce11ad8ce9e80cc0a38d40161938359162 Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 VP8 for ARMv8 by using NEON intrinsics 06 (James Yu)
Add idct_dequant_full_2x_neon.c
- idct_dequant_full_2x_neon

==== Summary of applying the VP8 decode patch series ====

Benchmark on Samsung Chromebook, Cortex-A15, 1.7GHz, Dual core
Toolchain: linaro-1.13.1-4.8-2014.01
Compile argument: CROSS=arm-linux-gnueabihf- ../libvpx/configure --target=armv7-linux-gcc --prefix=$HOME/out --enable-shared --cpu=cortex-a7
Test argument: vpxdec --summary --noblit ./tears_of_steel_1080p.webm

Build             fps     Change
NEON assembly     46.68
Apply patch 06    46.65   -0.03
Apply patch 07    46.86   +0.21
Apply patch 08    46.58   -0.28
Apply patch 09    46.57   -0.01
Apply patch 10    46.51   -0.06
Apply patch 11    46.13   -0.38
Apply patch 12    45.42   -0.71
Apply patch 13    46.06   +0.64
Apply patch 14    45.19   -0.87
Apply patch 15    45.93   +0.74
Apply patch 16    45.48   -0.45
Apply patch 17    45.84   +0.36
Apply patch 18    45.91   +0.07   <= With all NEON intrinsics patches

Total -0.77 fps, 1.65% performance regression

Change-Id: I77bfc9eaccfb97b8d401e949ceff8795e26ca6b7
Signed-off-by: James Yu <james.yu@linaro.org>
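The per-patch changes and the closing total can be re-derived from the fps column alone. The following is a minimal sketch that uses only the figures quoted in the commit message; the variable and function names are illustrative and do not come from libvpx.

```python
# Decoder throughput (fps) quoted in the commit message: the NEON assembly
# baseline, then the measurement after each intrinsics patch 06..18.
baseline_fps = 46.68
patch_fps = {
    "06": 46.65, "07": 46.86, "08": 46.58, "09": 46.57, "10": 46.51,
    "11": 46.13, "12": 45.42, "13": 46.06, "14": 45.19, "15": 45.93,
    "16": 45.48, "17": 45.84, "18": 45.91,
}


def step_deltas(baseline, results):
    """Return the fps change of each row relative to the row before it."""
    deltas = {}
    previous = baseline
    for patch, fps in results.items():
        deltas[patch] = round(fps - previous, 2)
        previous = fps
    return deltas


deltas = step_deltas(baseline_fps, patch_fps)
final_fps = patch_fps["18"]

# Overall change and regression relative to the assembly baseline.
total_change = round(final_fps - baseline_fps, 2)
regression_pct = round((baseline_fps - final_fps) / baseline_fps * 100, 2)

print(deltas["12"], total_change, regression_pct)  # prints: -0.71 -0.77 1.65
```

Each change is measured against the previous row, not against the baseline, which is why the individual values sum to the quoted total of -0.77 fps.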
2014-04-29 Remove VP8 save_reg_neon function (Yunqing Wang)
This patch did a cleanup following the commit "Save NEON registers in VP8 NEON functions". The pushing/popping of callee-saved NEON registers was moved into individual NEON functions. Therefore, we don't need to save those registers at the beginning of the codec. The related code was removed. Change-Id: I5648166514fc9beffb780aa138495597731f49ea
2014-03-03 build: convert rtcd.sh to perl (James Zern)
significantly speeds up file generation. the goal of this change is to convert rtcd.sh to perl as directly as possible to allow for simple comparison. future changes can make it more perl-like.
---
Linux
[CREATE] vpx_scale_rtcd.h   real 0m0.485s -> 0m0.022s
[CREATE] vp8_rtcd.h         real 0m4.619s -> 0m0.060s
[CREATE] vp9_rtcd.h         real 0m10.102s -> 0m0.087s

Windows
[CREATE] vpx_scale_rtcd.h   real 0m8.360s -> 0m0.080s
[CREATE] vp8_rtcd.h         real 1m8.083s -> 0m0.160s
[CREATE] vp9_rtcd.h         real 2m6.489s -> 0m0.233s

Change-Id: Idfb71188206c91237d6a3c3a81dfe00d103f11ee
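To put the before/after timings on a common scale, the `real` values can be converted to seconds and divided. This is a small sketch that assumes the `XmY.YYYs` format shown in the commit message; the helper names `to_seconds` and `speedup` are made up for illustration.

```python
import re

# Before/after wall-clock times copied from the commit message.
timings = {
    ("Linux", "vpx_scale_rtcd.h"): ("0m0.485s", "0m0.022s"),
    ("Linux", "vp8_rtcd.h"): ("0m4.619s", "0m0.060s"),
    ("Linux", "vp9_rtcd.h"): ("0m10.102s", "0m0.087s"),
    ("Windows", "vpx_scale_rtcd.h"): ("0m8.360s", "0m0.080s"),
    ("Windows", "vp8_rtcd.h"): ("1m8.083s", "0m0.160s"),
    ("Windows", "vp9_rtcd.h"): ("2m6.489s", "0m0.233s"),
}


def to_seconds(stamp):
    """Convert a `time`-style 'XmY.YYYs' string to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def speedup(before, after):
    """Return how many times faster the 'after' run is than the 'before' run."""
    return to_seconds(before) / to_seconds(after)


for (platform, header), (before, after) in timings.items():
    print(f"{platform:8s} {header:18s} {speedup(before, after):7.1f}x")
```

The ratios are rough because each figure is a single measurement, but they are enough to show that the gain is much larger on Windows than on Linux.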
2014-02-26 VP8 for ARMv8 by using NEON intrinsics 05 (James Yu)
Add dequantizeb_neon.c - vp8_dequantize_b_loop_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.23 (fps) Change-Id: Iebe3b0c6ed2359c778b0570763c5681ae25fef0c Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-26 VP8 for ARMv8 by using NEON intrinsics 04 (James Yu)
Add dequant_idct_neon.c - vp8_dequant_idct_add_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.22 (fps) Change-Id: Id48f39e1da58dd3d8d37658e94989411997f4f7c Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-26 VP8 for ARMv8 by using NEON intrinsics 03 (James Yu)
Add dc_only_idct_add_neon.c - vp8_dc_only_idct_add_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.24 (fps) Change-Id: I5e9e277ec3a3ca67e13c8cc4c324a6fbe8a897fc Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-23 VP8 for ARMv8 by using NEON intrinsics 02 (James Yu)
Add copymem_neon.c - vp8_copy_mem16x16_neon - vp8_copy_mem8x8_neon - vp8_copy_mem8x4_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.25 (fps) Change-Id: Ib956b5a20522ff57dc8a580bf0aef7b252bddba6 Signed-off-by: James Yu <james.yu@linaro.org>
2014-01-10 Apply neon flags to intrinsic files (Johann)
Filter out files ending in _neon.c and append .neon so the Android build system knows to apply -mfpu=neon Change-Id: Ib67277e5920bfcaeda7c4aa16cd1001b11d59305
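The renaming rule described here is simple enough to model outside the build system. The sketch below is written in Python purely for illustration (the actual change lives in the makefiles), and the example paths are hypothetical rather than taken from vp8_common.mk.

```python
def tag_neon_sources(sources):
    """Append '.neon' to every entry ending in '_neon.c'; leave the rest as-is."""
    return [src + ".neon" if src.endswith("_neon.c") else src for src in sources]


# Hypothetical source list; only the '_neon.c' suffix matters for the rule.
example_sources = [
    "common/arm/neon/variance_neon.c",
    "common/arm/neon/sad_neon.c",
    "common/idctllm.c",
]

for name in tag_neon_sources(example_sources):
    print(name)
```

Matching on the full `_neon.c` suffix, rather than on `neon` anywhere in the name, keeps assembly files and unrelated sources untouched.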
2014-01-09 VP8 for ARMv8 by using NEON intrinsics 01 (James Yu)
Add bilinearpredict_neon_intrinsics.c - vp8_bilinear_predict4x4_neon - vp8_bilinear_predict8x4_neon - vp8_bilinear_predict8x8_neon - vp8_bilinear_predict16x16_neon Change-Id: I33dfa502881219841b442dda32b73220e51b716b Signed-off-by: James Yu <james.yu@linaro.org>
2013-07-09 remove unused VP8 com/dec asm offsets (James Zern)
Change-Id: Ib3b26ee27f04b2dcbbd32b3127afb45e9f50cfcf
2013-03-02 prefix vp8 asm_{com,dec,enc}_offsets files (James Zern)
make them symmetrical with the generated output and their vp9 counterparts Change-Id: I72cc97c4d33d713dff620a6d7cc25955266216fc
2012-11-15 support building vp8 and vp9 into a single lib (John Koleszar)
Change-Id: Ib8f8a66c9fd31e508cdc9caa662192f38433aa3d
2012-11-07 Rough merge of master into experimental (John Koleszar)
Creates a merge between the master and experimental branches. Fixes a number of conflicts in the build system to allow *either* VP8 or VP9 to be built. Specifically either:

  $ configure --disable-vp9
  $ configure --disable-vp8 --disable-unit-tests

VP9 still exports its symbols and files as VP8, so that will be resolved in the next commit.

Unit tests are broken in VP9, but this isn't a new issue. They are fixed upstream on origin/experimental as of this writing, but rebasing this merge proved difficult, so will tackle that in a second merge commit.

Change-Id: I2b7d852c18efd58d1ebc621b8041fe0260442c21
2012-11-01 Rename vp8/ codec directory to vp9/. (Ronald S. Bultje)
Change-Id: Ic084c475844b24092a433ab88138cf58af3abbe4
2012-11-01 Adjust style to match Google Coding Style a little more closely. (Ronald S. Bultje)
Most of these were picked up by jenkins in the commit that changed the vp8 namespace to vp9 in common/. Change-Id: I5cbd56ffc753b92ef805133cda6acc1713a13878
2012-10-29 Make implicit_segmentation-related code an experiment. (Ronald S. Bultje)
This way, the code is not compiled in by default, thus decreasing overall binary size. Change-Id: I85cac8f5a22a51a7d99c820ef6d6ed179d4106a0
2012-10-25 Faster 8t filtering (Scott LaVarnway)
Quickly modified the ssse3 sixtap filters to support eight taps. For the test clip used, a 23+% boost in decoder performance was seen. We can revisit later and improve further. Change-Id: I5f59860459e80d6fa23e6cc0fd91296a969f5240
2012-10-25 Added sse2 intrinsic version of vp8_sad16x3 (Scott LaVarnway)
3.7% boost in decoder performance for the clip used. Change-Id: I74f28486a9352b472b36e21b5eaf30eff35e9199
2012-10-22 Added rtcd support vp8_sad16x3 and vp8_sad3x16 (Scott LaVarnway)
Change-Id: I5bca7b7a4b230082d36ac6fb84db84137ad177d7
2012-10-19 sse2 intrinsic version of vp8_mbloop_filter_vertical_edge() (Scott LaVarnway)
First sse2 version of vp8_mbloop_filter_vertical_edge(). For now, intrinsics are being used until the bitstream is finalized. This function will be revisited later for further performance improvements. For the test clip used, a 34+% decoder performance improvement was seen. This will vary depending on material. Change-Id: I455b438bc8d8af76cf7533ac42eda5f689b21f7c
2012-10-18 sse2 intrinsic version of vp8_mbloop_filter_horizontal_edge() (Scott LaVarnway)
First sse2 version of vp8_mbloop_filter_horizontal_edge(). For now, intrinsics are being used until the bitstream is finalized. This function will be revisited later for further performance improvements. For the test clip used, a 31+% decoder performance improvement was seen. This will vary depending on material. Change-Id: I03ed3a7182478bdd1f094644ff3e0442625600e7
2012-10-17 removed obsolete build dependency (Jim Bankoski)
this commit fixes the build on windows with visual studio 2008. Change-Id: I0baa4044e9e54237da29f2e17332ea6f766dbbec
2012-08-24 New Motion Reference Search (Paul Wilkins)
Alternative strategy for finding a list of candidate motion vectors to use as reference values in mv coding and as nearest and near. Sort by SAD in vp8_find_best_ref_mvs() rather than just pick the best. Allow 0,0 as a best ref option but not a nearest or near unless there are no alternatives. Encode/Decode verified on at least some clips. Some commented out experimental and stats code still in place. Gain over existing code averages about 1% on derf (all metrics) with improvement on all clips. Other test results pending. The entropy coding of the mode (nearest/near etc) still depends upon and requires the old "findnear" code so this needs looking at and may provide room for further gains. Change-Id: I871d7cba1d1c379c4bad9bcccce1fb19c46b8247
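One possible reading of the selection rule in this message is sketched below. It is not the libvpx implementation of vp8_find_best_ref_mvs(): the function name `rank_reference_mvs`, the `(row, col)` tuple representation, and the precomputed SAD scores are all assumptions made for illustration.

```python
ZERO_MV = (0, 0)


def rank_reference_mvs(candidates):
    """Pick (best, nearest, near) from scored candidates.

    candidates: list of ((row, col), sad) pairs, lower SAD is better.
    Assumed rule: 'best' is the lowest-SAD candidate and may be (0, 0);
    'nearest' and 'near' skip (0, 0) unless no other candidate remains.
    """
    # Order candidates by SAD; sorted() is stable, so ties keep input order.
    ordered = [mv for mv, _ in sorted(candidates, key=lambda item: item[1])]

    best = ordered[0] if ordered else ZERO_MV

    # Nearest and near are drawn from the non-zero candidates only,
    # falling back to (0, 0) when there are not enough alternatives.
    non_zero = [mv for mv in ordered if mv != ZERO_MV]
    nearest = non_zero[0] if len(non_zero) > 0 else ZERO_MV
    near = non_zero[1] if len(non_zero) > 1 else ZERO_MV
    return best, nearest, near


print(rank_reference_mvs([((0, 0), 10), ((1, 2), 30), ((3, 4), 20)]))
# prints: ((0, 0), (3, 4), (1, 2))
```

In the example the zero vector has the lowest SAD, so it is chosen as best, while nearest and near come from the remaining candidates in SAD order.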
2012-08-22 Merge "remove rotation experiment" into experimental (John Koleszar)
2012-08-21 SSE2 version of vectorized 8-tap filtering. (Christian Duvivier)
About 20% overall encoder speedup (vs. about 30% for sse4 version). Change-Id: Ibf608a6a1bc94b14ec47e8046d3206b275b5a8bd
2012-08-21 remove rotation experiment (John Koleszar)
This is being reimplemented more generically in terms of affine transforms. Change-Id: I9300bfde5f8b93c708c64f59427087720f8ed782
2012-08-15 First partial snapshot of vectorized 8-tap filtering. (Christian Duvivier)
About 3.5x faster, 30% overall encoder speedup. Rest of optimizations will come soon (see TODO section in filter_sse4.c). Change-Id: If18108048bfd5345fc942e8574e4c7f58e0e86e0
2012-08-08 Partial import of "New RTCD implementation" from master branch. (Christian Duvivier)
Latest version of all scripts/makefile but rtcd_defs.sh is empty, all existing functions are still selected using the old/current way. Change-Id: Ib92946a48a31d6c8d1d7359eca524bc1d3e66174
2012-08-08 Update armv6 vp8_intra4x4_predict (Johann)
Change-Id: I52a3b0a4a42e5af91b987e19523df07c8f467847
2012-08-01 Rename vp8_intra4x4_predict_d (Johann)
predict_d has become canonical. Remove previous helper function. Disable ARM assembly pending update. Change-Id: Idd84ac8a28f9b0221ea97904a77de1e705d06a7d
2012-07-10 VP8 optimizations for MIPS dspr2 (Dragan Mrdjan)
Signed-off-by: Raghu Gandham <raghu@mips.com> Change-Id: I3a8bca425cd3dab746a6328c8fc8843c8e87aea6
2012-05-23 changed the way that default probs for 8x8 is set. (Yaowu Xu)
The commit changed how baseline 8x8 coefficient probabilities are initialized, to be consistent with the initialization of baseline 4x4 coefficient probabilities. The commit does not have any effect on compression. Change-Id: Ifb3902b5dc0b0c2e6dc3aa5d4a6589d528e58355
2012-05-22 Move all tests to test/ directory (John Koleszar)
Consolidate the unit tests under vp8/ to the test/ directory Change-Id: I6d6a0fb60f5e3874a4d6710e9e121dd3e81a93db
2012-05-22 Build unit tests monolithically (John Koleszar)
Rework unit tests to have a single executable rather than many, which should avoid pollution of the visual studio project namespace, improve build times, and make it easier to use the gtest test sharding system when we get these going on the continuous build cluster. Change-Id: If4c3e5d4b3515522869de6c89455c2a64697cca6
2012-04-19 Makes all mode token tables const part 2 (Scott LaVarnway)
(see Change I9b2ccc88: Makes all mode token tables const) Further remove runtime table initialization and use precalculated const data. Data footprint reduced by 4112 bytes. Change-Id: Ia3ae9fc19f77316b045cabff01f6e5f0876a86ab
2012-03-12 fixed .mk files to reflect add/remove of a header file (Yaowu Xu)
In a previous commit, the duplicate of header file defaultcoefcounts.h was identified. This commit updates the .mk file to ensure configure and make work properly for all platforms. Change-Id: I31a39c809a734ba438ee53db700f252e9a03eddd
2012-03-06 RFC: Reorganize MFQE loops (Johann)
Break MFQE code into its own file. It is currently only valid for 16x16 and 8x8 Y blocks. It also filters 4x4 U/V blocks. Refactor filtering and add associated assembly. Limited test cases show --mfqe introduces a penalty of ~20% with HD content. The assembly reduces the penalty to ~15%. Change-Id: I4b8de6b5cdff5413037de5b6c42f437033ee55bf
2012-03-05 Move SAD and variance functions to common (Johann)
The MFQE function of the postprocessor depends on these Change-Id: I256a37c6de079fe92ce744b1f11e16526d06b50a
2012-02-21 Add unit tests for idctllm_test and idctllm_mmx (James Berry)
add unit tests for vp8_short_idct4x4llm_c Change-Id: I472b7c0baa365ba25dc99a3f6efccc816d27c941