path: root/vp8/common/arm
Age | Commit message | Author
2014-02-26 | VP8 for ARMv8 by using NEON intrinsics 05 | James Yu
Add dequantizeb_neon.c
- vp8_dequantize_b_loop_neon
vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm
Before => After, 13.25 => 13.23 (fps)
Change-Id: Iebe3b0c6ed2359c778b0570763c5681ae25fef0c
Signed-off-by: James Yu <james.yu@linaro.org>
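The commit above only names the routine, so the following is a minimal sketch of the kind of loop it refers to, assuming the usual element-wise dequantization of one 4x4 block (16 coefficients, each multiplied by its dequantization factor). The function name `dequantize_b_loop_sketch` and its parameter names are hypothetical; the NEON path is guarded so the same file also builds on targets without NEON.

```c
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical name; a sketch, not the libvpx function itself.
 * Multiplies 16 quantized coefficients by their dequantization factors. */
void dequantize_b_loop_sketch(const int16_t *q, const int16_t *dqc,
                              int16_t *dq) {
#if defined(__ARM_NEON)
  /* Two 8-lane vectors cover one 4x4 block of 16-bit coefficients. */
  vst1q_s16(dq, vmulq_s16(vld1q_s16(q), vld1q_s16(dqc)));
  vst1q_s16(dq + 8, vmulq_s16(vld1q_s16(q + 8), vld1q_s16(dqc + 8)));
#else
  /* Portable fallback with the same result for in-range products. */
  int i;
  for (i = 0; i < 16; ++i) dq[i] = (int16_t)(q[i] * dqc[i]);
#endif
}
```

Writing the loop with intrinsics rather than assembly lets one source file serve both 32-bit and 64-bit ARM compilers, which appears to be the motivation of this commit series.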
2014-02-26 | VP8 for ARMv8 by using NEON intrinsics 04 | James Yu
Add dequant_idct_neon.c
- vp8_dequant_idct_add_neon
vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm
Before => After, 13.25 => 13.22 (fps)
Change-Id: Id48f39e1da58dd3d8d37658e94989411997f4f7c
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-26 | VP8 for ARMv8 by using NEON intrinsics 03 | James Yu
Add dc_only_idct_add_neon.c
- vp8_dc_only_idct_add_neon
vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm
Before => After, 13.25 => 13.24 (fps)
Change-Id: I5e9e277ec3a3ca67e13c8cc4c324a6fbe8a897fc
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-23 | VP8 for ARMv8 by using NEON intrinsics 02 | James Yu
Add copymem_neon.c
- vp8_copy_mem16x16_neon
- vp8_copy_mem8x8_neon
- vp8_copy_mem8x4_neon
vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm
Before => After, 13.25 => 13.25 (fps)
Change-Id: Ib956b5a20522ff57dc8a580bf0aef7b252bddba6
Signed-off-by: James Yu <james.yu@linaro.org>
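As an illustration of what a block-copy routine like the 8x8 one listed above might look like with intrinsics, here is a hedged sketch. The name `copy_mem8x8_sketch` and the exact signature are assumptions made for this example; the NEON branch uses one 64-bit load and store per row, and a `memcpy` fallback keeps the code buildable elsewhere.

```c
#include <string.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical name; a sketch of an 8x8 strided block copy. */
void copy_mem8x8_sketch(const unsigned char *src, int src_stride,
                        unsigned char *dst, int dst_stride) {
  int r;
  for (r = 0; r < 8; ++r) {
#if defined(__ARM_NEON)
    /* One 64-bit load and one 64-bit store move a full 8-pixel row. */
    vst1_u8(dst, vld1_u8(src));
#else
    memcpy(dst, src, 8);
#endif
    src += src_stride;
    dst += dst_stride;
  }
}
```

The strides are in bytes and may differ between source and destination, so only the first 8 bytes of each destination row are written.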
2014-02-12 | minor spelling cleanup in comments | Andrew Russell
Change-Id: Ia91c6c406273345b08505097ffe1af3896980f06
2014-01-23 | vp8/common: add extern "C" to headers | James Zern
Change-Id: I13b434b1e6621e31962b08831c3587c039368c83
2014-01-10 | Apply neon flags to intrinsic files | Johann
Filter out files ending in _neon.c and append .neon so the Android build system knows to apply -mfpu=neon Change-Id: Ib67277e5920bfcaeda7c4aa16cd1001b11d59305
2014-01-09 | VP8 for ARMv8 by using NEON intrinsics 01 | James Yu
Add bilinearpredict_neon_intrinsics.c
- vp8_bilinear_predict4x4_neon
- vp8_bilinear_predict8x4_neon
- vp8_bilinear_predict8x8_neon
- vp8_bilinear_predict16x16_neon
Change-Id: I33dfa502881219841b442dda32b73220e51b716b
Signed-off-by: James Yu <james.yu@linaro.org>
2013-12-16 | vp8/common: normalize include guards | James Zern
Change-Id: Ia8789a8f864e0edc0bf94f00f6430846f86911c3
2013-05-22 | arm: Move the definition of bilinear_taps_coeff to within the section | Martin Storsjo
Previously, the microsoft arm assembler errored out, saying that bilinear_taps_coeff was an undefined symbol. Change-Id: Ib938f0b454c41ccbd801e70a7c9acc0fa04e3c55
2013-05-22 | arm: Explicitly write both target registers for ldrd | Martin Storsjo
The microsoft assembler can't handle the second register being implicit. Change-Id: Ia831953a78a25fd6b2082474f05fdb78d96cdf78
2012-11-15 | support building vp8 and vp9 into a single lib | John Koleszar
Change-Id: Ib8f8a66c9fd31e508cdc9caa662192f38433aa3d
2012-08-08 | Update armv6 vp8_intra4x4_predict | Johann
Change-Id: I52a3b0a4a42e5af91b987e19523df07c8f467847
2012-08-01 | Change vp8_intra4x4_predict call sites | Johann
Use the _d variant from the decoder. It moves the pointer calculations to the caller. Change-Id: Iae2a793433ef082980a3ffa0a1cabf0264a6a24d
2012-06-11 | Fix pedantic compiler warnings | John Koleszar
Allows building the library with the gcc -pedantic option, for improved portability. In particular, this commit removes usage of C99/C++ style single-line comments and dynamic struct initializers. This is a continuation of the work done in commit 97b766a46, which removed most of these warnings for decode only builds. Change-Id: Id453d9c1d9f44cc0381b10c3869fabb0184d5966
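To make the two constructs named in the commit above concrete, here is a small sketch of the C89-friendly form. The type `frame_size` and the function `make_frame_size` are hypothetical and exist only for this illustration; the assumption is that the pedantic diagnostics in question are the ones gcc emits in C89/C90 mode for `//` comments and for aggregate initializers containing run-time values.

```c
/* Hypothetical type, used only for this illustration. */
typedef struct {
  int width;
  int height;
} frame_size;

/* Block comments only: `//` comments are not part of C89. */
frame_size make_frame_size(int w, int h) {
  frame_size fs;
  /* A "dynamic" initializer such as
   *     frame_size fs = { w, h };
   * uses values that are not constant expressions, which strict C89
   * does not allow for aggregates. Plain assignment avoids the issue. */
  fs.width = w;
  fs.height = h;
  return fs;
}
```

Both forms produce the same object; the assignment form simply stays within what older or stricter compilers accept.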
2012-05-02 | Merge "Fix compiler warnings" into eider | John Koleszar
2012-05-02 | Fix TEXTRELs in the ARM asm. | Timothy B. Terriberry
Besides imposing a performance penalty at startup in most configurations, these relocations break the dynamic linker for native Fennec, since it does not support them at all. Change-Id: Id5dc768609354ebb4379966eb61a7313e6fd18de
2012-05-02 | Fix compiler warnings | Attila Nagy
Fix code for following warnings:
-Wimplicit-function-declaration
-Wuninitialized
-Wunused-but-set-variable
-Wunused-variable
Change-Id: I2be434f22fdecb903198e8b0711255b4c1a2947a
2012-03-05 | Move SAD and variance functions to common | Johann
The MFQE function of the postprocessor depends on these Change-Id: I256a37c6de079fe92ce744b1f11e16526d06b50a
2012-01-30 | RTCD: add subpixel functions | John Koleszar
This commit continues the process of converting to the new RTCD system. Change-Id: I6c519ab61e4f4e0ebcc796f2df061f945c48cefe
2012-01-30 | RTCD: add recon functions | John Koleszar
This commit continues the process of converting to the new RTCD system. Change-Id: I9bfcf9bef65c3d4ba0fb9a3e1532bad1463a10d6
2012-01-30 | RTCD: add remaining IDCT functions | John Koleszar
This commit continues the process of converting to the new RTCD system. Change-Id: I03c4dbf30dfd3558b0e256ff9d3ff4c012aadc80
2012-01-30 | RTCD: add loopfilter functions | John Koleszar
This commit continues the process of converting to the new RTCD system. Change-Id: Ic8a4047d72ff3a54ec98977dd90e70c13213db71
2012-01-30 | New RTCD implementation | John Koleszar
This is a proof of concept RTCD implementation to replace the current system of nested includes, prototypes, INVOKE macros, etc. Currently only the decoder specific functions are implemented in the new system. Additional functions will be added in subsequent commits.
Overview:
RTCD "functions" are implemented as either a global function pointer or a macro (when only one eligible specialization is available). Functions which have RTCD specializations are listed using a simple DSL identifying the function's base name, its prototype, and the architecture extensions that specializations are available for.
Advantages over the old system:
- No INVOKE macros. A call to an RTCD function looks like an ordinary function call.
- No need to pass vtables around.
- If there is only one eligible function to call, the function is called directly, rather than indirecting through a function pointer.
- Supports the notion of "required" extensions, so in combination with the above, on x86_64 if the best function available is sse2 or lower it will be called directly, since all x86_64 platforms implement sse2.
- Elides all references to functions which will never be called, which could reduce binary size. For example if sse2 is required and there are both mmx and sse2 implementations of a certain function, the code will have no link time references to the mmx code.
- Significantly easier to add a new function, just one file to edit.
Disadvantages:
- Requires global writable data (though this is not a new requirement)
- 1 new generated source file.
Change-Id: Iae6edab65315f79c168485c96872641c5aa09d55
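The overview in the commit above (a global function pointer when several specializations are eligible, a macro when only one is) can be sketched as follows. Every identifier here (`sum16`, `sum16_c`, `sum16_fast`, `SKETCH_HAS_NEON`, `SKETCH_NEON_REQUIRED`, `sketch_rtcd_init`) is hypothetical and chosen for illustration; this is not the generated header the commit describes, only a hand-written approximation of the idea.

```c
/* Hypothetical capability flag, for illustration only. */
#define SKETCH_HAS_NEON 0x1

/* Baseline C version. */
static int sum16_c(const unsigned char *p) {
  int i, s = 0;
  for (i = 0; i < 16; ++i) s += p[i];
  return s;
}

/* Stand-in for an optimized specialization; it returns the same result. */
static int sum16_fast(const unsigned char *p) {
  int i, s = 0;
  for (i = 0; i < 16; i += 4) s += p[i] + p[i + 1] + p[i + 2] + p[i + 3];
  return s;
}

#if defined(SKETCH_NEON_REQUIRED)
/* Only one eligible specialization: a macro binds the name, so callers
 * make a direct call and the baseline is never referenced. */
#define sum16 sum16_fast
#else
/* Several candidates: a global function pointer, chosen once at startup. */
int (*sum16)(const unsigned char *p) = sum16_c;
#endif

/* Selects the best available specialization from run-time CPU flags. */
void sketch_rtcd_init(int cpu_flags) {
#if defined(SKETCH_NEON_REQUIRED)
  (void)cpu_flags; /* nothing to select */
#else
  sum16 = sum16_c;
  if (cpu_flags & SKETCH_HAS_NEON) sum16 = sum16_fast;
#endif
}
```

Either way, a call site reads `sum16(buf)` like an ordinary function call, which is the first advantage the commit lists. The pointer variant is also where the "global writable data" disadvantage comes from.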
2012-01-26 | Rename save_neon_reg.asm as save_reg_neon.asm | Attila Nagy
Easier to filter out all NEON asm. Change-Id: I0022dae8321a9608e864b09d4181414c5fff4610
2012-01-20 | Disconnect ARM tgt_isa from dsp extensions | Fritz Koenig
A processor with ARMv7 instructions does not necessarily have NEON dsp extensions. This CL has the added side effect of allowing the ability to enable/disable the dsp extensions cleanly. Change-Id: Ie1e879b8fe131885bc3d4138a0acc9ffe73a36df
2012-01-06 | Reduced the size of Y1Dequant and friends to [128][2] | Scott LaVarnway
This patch removes the local copies of the dequantize constants and implements John's idea as described in "Make a local copy of the dequantized data" commit. Change-Id: Ic6b7d681f00bf63263f71ff1e39ab2f80729e8b2
2011-12-21 | Remove useless g_common.h | John Koleszar
This file declared a bunch of nonexistent, unreferenced global function pointers. Change-Id: Ic26bb8c7712deba754c49fc01f383b53afc9e728
2011-12-15 | Moved dequant idct into common | Scott LaVarnway
These functions are now used by the encoder. This is WIP with the goal of creating a common idct/add for the encoder and decoder. A boost of 1.8% was seen for the HD rt test clip used. [Tero] Added needed changes to ARM side. Change-Id: Ibbb8000be09034203d7adffc457d3c3f8b06a5bf
2011-11-25 | Modified the inverse walsh to output directly | Scott LaVarnway
to the dqcoeff or qcoeff buffer. The encoder would populate the dc coeffs of the y blocks as a separate stage (recon_dcblock) and the decoder would use a special version of the idct. This change eliminates the extra copy and reduces the code footprint. [Tero] Added needed changes to armv6 and NEON assembly. Change-Id: I83202ffdbaf83f6e5dd69f4ba2519fcf0b13b3ba
2011-11-09 | ARMv6 optimized Intra4x4 prediction | Tero Rintaluoma
Added ARM optimized intra 4x4 prediction
- 2x faster on Profiler compared to C-code compiled with -O3
- Function interface changed a little to improve BLOCKD structure access
Change-Id: I9bc2b723155943fe0cf03dd9ca5f1760f7a81f54
2011-10-18 | Remove usage of predict buffer for decode | Scott LaVarnway
Instead of using the predict buffer, the decoder now writes the predictor into the recon buffer. For blocks with eob=0, unnecessary idcts can be eliminated. This gave a performance boost of ~1.8% for the HD clips used. Tero: Added needed changes to ARM side and scheduled some assembly code to prevent interlocks. Patch Set 6: Merged (I1bcdca7a95aacc3a181b9faa6b10e3a71ee24df3) into this commit because of similarities in the idct functions. Patch Set 7: EC bug fix. Change-Id: Ie31d90b5d3522e1108163f2ac491e455e3f955e6
2011-09-22 | Replace vpx_ports/config.h with vpx_config.h | Attila Nagy
Just a clean-up. Change-Id: Iea5b6dc925dcfa7db548bc1ab1a13d26ed5a2c9a
2011-07-13 | Merge "Update armv6 loopfilter to new interface" | Johann
2011-07-13 | Merge "Update armv7 loopfilter to new interface" | Johann
2011-07-12 | Update armv6 loopfilter to new interface | Attila Nagy
Change-Id: I5fe581d797571a7a9432fbd17fc557591d0c1afa
2011-07-12 | Update armv7 loopfilter to new interface | Attila Nagy
Change-Id: I65105a9c63832669237e6a6a7fcb4ea3ea683346
2011-06-29 | clean up warnings when building arm with rtcd | Johann
Change-Id: I3683cb87e9cb7c36fc22c1d70f0799c7c46a21df
2011-06-28 | Merge "Avoid text relocations in ARM vp8 decoder" | Johann
2011-06-28 | Avoid text relocations in ARM vp8 decoder | Mike Hommey
The current code stores pointers to coefficient tables and loads them to access the tables' contents. As these pointers are stored in the code sections, it means we end up with text relocations. eu-findtextrel will thus complain about code not compiled with -fpic/-fPIC. Since the pointers are stored in the code sections, we can actually cheat and let the assembler generate relative addressing when accessing the coefficient tables, and just load their location with adr. Change-Id: Ib74ae2d3f2bab80b29991355f2dbe6955f38f6ae
2011-06-17 | utilize preload in ARMv6 MC/LPF/Copy routines | Taekhyun Kim
About 9~10% decoding perf improvement on non-Neon ARM cpus Change-Id: I7dc2a026764e84e9c2faf282b4ae113090326837
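The commit above concerns hand-written ARMv6 assembly, but the preload idea can be approximated in C. The sketch below assumes a GCC- or Clang-compatible compiler, where `__builtin_prefetch` is available; the macro `SKETCH_PREFETCH` and the function name are hypothetical, and no claim is made that this reproduces the measured gain.

```c
#include <string.h>

#if defined(__GNUC__)
#define SKETCH_PREFETCH(p) __builtin_prefetch((p))
#else
#define SKETCH_PREFETCH(p) ((void)0) /* no-op on other compilers */
#endif

/* Hypothetical name; copies a 16x16 block and hints the next source row
 * into cache while the current row is being copied. */
void copy_mem16x16_prefetch_sketch(const unsigned char *src, int src_stride,
                                   unsigned char *dst, int dst_stride) {
  int r;
  for (r = 0; r < 16; ++r) {
    /* Skip the hint on the last row so no address past the block is formed. */
    if (r + 1 < 16) SKETCH_PREFETCH(src + src_stride);
    memcpy(dst, src, 16);
    src += src_stride;
    dst += dst_stride;
  }
}
```

A prefetch is only a hint: it changes timing, never results, so the copied data is identical with or without it.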
2011-05-19 | Fixed iwalsh_neon build problems with RVDS4.1 | Attila Nagy
rvct 4.1 was complaining about vstmia.16: store multiple expects a 64-bit data type. optimized the implementation. Change-Id: I0701052cabd685c375637bbc3796ff6d88f5972c
2011-05-04 | Loopfilter NEON: Use VMOV for constant vectors instead of VLD. | Attila Nagy
Change-Id: I562b6e01c32bb51d00f3b95faf757fc7dc29a3a3
2011-04-25 | remove simpler_lpf | Johann
the decision to run the regular or simple loopfilter is made outside the function and managed with pointers. stop tracking the option in two places. use filter_type exclusively Change-Id: I39d7b5d1352885efc632c0a94aaf56b72cc2fe15
2011-03-11 | Move build_intra_predictors_mby to RTCD framework | John Koleszar
The vp8_build_intra_predictors_mby and vp8_build_intra_predictors_mby_s functions had global function pointers rather than using the RTCD framework. This can show up as a potential data race with tools such as helgrind. See https://bugzilla.mozilla.org/show_bug.cgi?id=640935 for an example. Change-Id: I29c407f828ac2bddfc039f852f138de5de888534
2011-02-10 | Fix relative include paths | John Koleszar
Allow compiling without adding vp8/{common,encoder,decoder} to the include paths. Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c
2011-02-09 | Adds armv6 optimized variance calculation | Tero Rintaluoma
Adds vp8_sub_pixel_variance16x16_armv6 function to encoder. Integrates ARMv6 optimized bilinear interpolations from vp8/common/arm/armv6 and adds new assembly file for variance16x16 calculation.
- vp8_filter_block2d_bil_first_pass_armv6 (integrated)
- vp8_filter_block2d_bil_second_pass_armv6 (integrated)
- vp8_variance16x16_armv6 (new)
- bilinearfilter_arm.h (new)
Change-Id: I18a8331ce7d031ceedd6cd415ecacb0c8f3392db
2011-02-08 | clean up bilinear filter | Johann
make reference version of bilinear_filters short.
use reference versions of bilinear_filters and sub_pel_filters when possible.
recognize that Width was being passed into filter_block2d_bil_first_pass multiple times. ARM version had already fixed this. propagate to C.
change references to src_pixels_per_line to src_pitch and standardize on src/dst (instead of input/output).
recognize that first_pass is only run in the vertical and second_pass only horizontal. ARM version had already fixed this. propagate to C
Change-Id: I292d376d239a9a7ca37ec2bf03cc0720606983e2
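For readers unfamiliar with the two bilinear passes mentioned in the commit above, here is a rough sketch of a single pass. It assumes VP8-style 2-tap filters whose taps sum to 128, applied with a rounding constant of 64 and a 7-bit shift; the table and function names are hypothetical, and the `pixel_step` parameter (1 for a horizontal neighbour, the pitch for a vertical one) is an illustrative way to share one routine between directions, not necessarily how the code in this commit is organized.

```c
/* Assumed 2-tap filter table: each pair sums to 128 (hypothetical name). */
static const short bilinear_filters_sketch[8][2] = {
  { 128, 0 }, { 112, 16 }, { 96, 32 }, { 80, 48 },
  { 64, 64 }, { 48, 80 },  { 32, 96 }, { 16, 112 }
};

/* Filters `count` pixels. Each output blends src[i] with the neighbour
 * src[i + pixel_step], so src must stay readable at that offset. */
void bilinear_pass_sketch(const unsigned char *src, int pixel_step,
                          int count, int offset, unsigned char *out) {
  const short *f = bilinear_filters_sketch[offset & 7];
  int i;
  for (i = 0; i < count; ++i) {
    /* Weighted sum, rounded to nearest, then scaled back by 128. */
    out[i] = (unsigned char)((src[i] * f[0] + src[i + pixel_step] * f[1] +
                              64) >> 7);
  }
}
```

Because the taps are non-negative and sum to 128, the result always stays within the 8-bit pixel range and needs no clamping.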
2011-02-07 | move one of the offset files | Johann
common/arm/vpx_asm_offsets moves up a level. prepare for muxing with encoder/arm/vpx_vp8_enc_asm_offsets Change-Id: I89a04a5235447e66571995c9d9b4b6edcb038e24
2011-02-04 | remove unused dboolhuff code | Johann
we were holding on to this "just in case." purge it instead Change-Id: I77a367b36d0821d731019f2566ecfffdae1d4b8a