path: root/vp8/common/x86
Age | Commit message | Author
2014-06-12 | Use lrand48 on Android | Johann
When building x86 assembly use lrand48 instead of the undocumented inlined _rand function. Android now supports rand() https://android-review.googlesource.com/97731 but only for new versions. Original workaround: https://gerrit.chromium.org/gerrit/15744 Change-Id: I130566837d5bfc9e54187ebe9807350d1a7dab2a
2014-05-23 | Removing vp8/common/pragmas.h. | Dmitry Kovalev
Change-Id: I80630a7350e884ebc4fef73fb5b52ec25f908523
2014-05-21 | Renames x86_64 specific asm files | Deb Mukherjee
Renames all x86_64 specific assembly files to consistently end in _x86_64.asm. This will be useful for build systems to handle these files differently. All new 64-bit specific assembly files should use the new naming convention. Change-Id: I36c89584967c82ffc4088b1b5044ac15d2bb7536
2014-05-19 | Fix valgrind read out of bounds error. | Jim Bankoski
MMX variance code in vp8 was reading out of bounds. TODO(JBB): The best fix would involve removing duplicate library functions between vp8 and vp9... Change-Id: I5722853a6a58d3b55257ff385fa54c773bf98ded
2014-04-21 | Fix dr memory VP8 encode/decode errors | Yunqing Wang
This patch fixed errors reported in Issue 746: "dr memory VP8 encode errors" and Issue 745: "dr memory VP8 decode errors". The "UNINITIALIZED READ" errors were fixed in x86 assembly code. The list of files fixed is:
- vp8_intra_pred_uv_tm_sse2
- vp8_intra_pred_uv_tm_ssse3
- vp8_intra_pred_uv_ho_mmx2
- vp8_intra_pred_uv_ho_ssse3
- vp8_intra_pred_y_tm_sse2
- vp8_intra_pred_y_tm_ssse3
- vp8_intra_pred_y_ho_sse2
Change-Id: Ib6df7bf1d442077fe534edfd90e50ad16fadacdd
2014-03-24 | Fix uninitialized read in postprocessing | Yunqing Wang
This patch fixed WebRTC Issue 3020: "Uninit error at vp8_mbpost_proc_down_xmm". The first 8 values in d were not initialized but were accessed. This patch fixed the C code as well as the MMX and SSE2 code. Change-Id: Iaa5b41a4ed3bea971b15fb826ce34b7ab4e36fb1
2014-02-12 | minor spelling cleanup in comments | Andrew Russell
Change-Id: Ia91c6c406273345b08505097ffe1af3896980f06
2014-01-23 | vp8/common: add extern "C" to headers | James Zern
Change-Id: I13b434b1e6621e31962b08831c3587c039368c83
2013-12-16 | vp8/common: normalize include guards | James Zern
Change-Id: Ia8789a8f864e0edc0bf94f00f6430846f86911c3
2013-10-29 | idct_blk_mmx.c: use vpx_memset instead of cast | Johann
Fix warning with -Wstrict-aliasing=1 Change-Id: Ic37013e6477cf213925830d0bd8e6f17364ff7cc
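The kind of change described here can be sketched as follows. This is a minimal illustration, not the code from idct_blk_mmx.c: the helper name `clear_first_two_coeffs` is hypothetical, and the standard `memset` stands in for `vpx_memset`. It assumes the original cleared two adjacent 16-bit coefficients by writing through an `int *` cast, which is the pattern strict-aliasing warnings typically flag.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper for illustration; not a libvpx function.
 * Clears the first two 16-bit coefficients of a block. */
static void clear_first_two_coeffs(short *q) {
  assert(q != NULL);
  /* Aliasing-unsafe form that a compiler may warn about:
   *   ((int *)q)[0] = 0;
   * memset writes the same bytes without reinterpreting the pointer type. */
  memset(q, 0, 2 * sizeof(q[0]));
}
```

Compilers usually lower a small fixed-size memset like this to a single store, so the safer form should not cost anything at run time.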
2013-10-01 | Merge "Fix linker warnings for bilinear filters" | Matthew Heaney
2013-10-01 | Fix linker warnings for bilinear filters | Matthew Heaney
The declaration of the bilinear filters specified an alignment clause in the implementation file but not in the header. This turned out to be harmless, but it did cause linker warnings to be emitted when building on Windows. The (extern) declaration in the header was changed, to match the declaration in the implementation. Change-Id: I44be89b1572fe9a50fa47a42e4db9128c4897b04
2013-09-26 | fixed integer overflow warnings | Yaowu Xu
Jenkins warns on left shift of negative numbers and non-aligned read of int. This commit fixed the two issues. Change-Id: I389a7fb6a572c643902e40a4c10fefef94500d2c
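Both warning classes have well-known portable fixes in C. The sketch below shows one common approach to each; the helper names are hypothetical and the commit may have fixed the affected code differently.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; not libvpx functions. */

/* Scale a possibly negative value by 2^bits. Left-shifting a negative
 * signed value is undefined behavior in C, so multiply instead. */
static int scale_pow2(int value, int bits) {
  assert(bits >= 0 && bits < 31);
  return value * (1 << bits);
}

/* Read a 32-bit value from an address that may not be 4-byte aligned.
 * Dereferencing a cast pointer (*(const uint32_t *)p) is what triggers
 * the non-aligned read warning; memcpy has no alignment requirement. */
static uint32_t read_u32_unaligned(const unsigned char *p) {
  uint32_t v;
  memcpy(&v, p, sizeof(v));
  return v;
}
```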
2013-03-15 | Bug fix: Issue 531: MMX code tries to read from SSE2 register | Scott LaVarnway
Reported by Krzysztof Kaspruk. https://code.google.com/p/webm/issues/detail?id=531 Change-Id: Ib5d5878ad07707bd42c2ca833eb021004f537012
2013-02-27 | Fix --as=nasm compatibility for new asm code. | Jan Kratochvil
s/movd/movq/ Change-Id: Id1a56de91551f8dc796f14f1056c565dfc1ba626
2013-02-22 | Fix variance (signed integer) overflow | James Zern
based on change made in experimental: 9847344 Fix variance (signed integer) overflow Change-Id: I36f4ba5700f6f4615057daf7e70868f68a86669f
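To see why a signed overflow can occur, consider a 16x16 block: the sum of pixel differences can reach 255 * 256 = 65280 in magnitude, and its square (4261478400) does not fit in a signed 32-bit int. The sketch below is an illustration under that assumption, using a hypothetical helper name and a 64-bit intermediate; the actual commit may use a different cast.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not a libvpx function.
 * Computes variance = sse - sum^2 / 256 for a 16x16 block, given the
 * sum of differences and the sum of squared differences. */
static unsigned int variance16x16_from_sums(int sum, unsigned int sse) {
  assert(sum >= -65280 && sum <= 65280);
  /* Widen before multiplying: sum * sum in plain int can overflow. */
  const int64_t sum_sq = (int64_t)sum * sum;
  return sse - (unsigned int)(sum_sq >> 8);
}
```

The extreme case is a block where every difference is 255: the sum is 65280, the sum of squares is 16646400, and the variance is exactly 0.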
2013-02-17 | Use dq instead of ddq with NASM | KO Myung-Hun
Change-Id: Iffb7cd44b449dc10fa5c24405be909d051b7abb5
2013-01-31 | Add support for x64 and win64 yasm flags. | Frank Galligan
Some projects must define only win64 for Windows 64bit builds using yasm. Change-Id: I1d09590d66a7bfc8b4412e1cc8685978ac60b748
2012-12-27 | Merge branch 'vp9-preview' of review:webm/libvpx | John Koleszar
Merge the vp9-preview branch into master. Change-Id: If700b9054676f24bed9deb59050af546c1ca5296
2012-11-26 | Merge "vp8_intra_pred_y_tm_sse2: save/restore xmm registers" | James Zern
2012-11-20 | Merge "vp8_loop_filter_bh_y_sse2: save/restore xmm registers" | John Koleszar
2012-11-20 | vp8_filter_block1d4_h6_ssse3: add missing xmm restore | James Zern
Change-Id: Ia8f6b6c2a9ed60bee7949dd06fcc18b392e91d76
2012-11-20 | vp8_loop_filter_bh_y_sse2: save/restore xmm registers | James Zern
xmm[6-11] should be saved and restored for Windows x64; prevents an encoder mismatch and some datarate issues. Change-Id: I03c38eb18ec20c6c441cae19416393058baad1ee
2012-11-19 | vp8_intra_pred_y_tm_sse2: save/restore xmm registers | James Zern
xmm6/xmm7 should be saved and restored for Windows x64; prevents an encoder mismatch and some datarate issues. Change-Id: Ifa1a82ab25fbdc5112d66f5332e14b16e69ac164
2012-11-15 | support building vp8 and vp9 into a single lib | John Koleszar
Change-Id: Ib8f8a66c9fd31e508cdc9caa662192f38433aa3d
2012-10-23 | postproc_sse2: avoid reading off the end of the limits array | John Koleszar
Rather than unconditionally reading in the next MB's limits, test the loop exit condition first. Change-Id: I105d1e92698fb5561aa87160816787604aed03a2
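The loop shape described here, written in C rather than SSE2 assembly, might look like the sketch below. The function name and the "filtering" stand-in are hypothetical; the point is only the ordering of the exit test and the load of the next macroblock's limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical row loop for illustration; not the libvpx routine.
 * limits holds exactly mb_cols entries, one per macroblock in the row. */
static int process_row_limits(const unsigned char *limits, int mb_cols) {
  int total = 0;
  int mb = 0;
  unsigned char limit;
  assert(limits != NULL && mb_cols > 0);
  limit = limits[0];
  for (;;) {
    total += limit;              /* stand-in for filtering one macroblock */
    if (++mb >= mb_cols) break;  /* test the exit condition first... */
    limit = limits[mb];          /* ...then load the next MB's limit */
  }
  return total;
}
```

Loading `limits[mb]` before the exit test would read one element past the array on the final iteration, which is the out-of-bounds access the commit avoids.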
2012-10-08 | post-proc: deblock filter optimization | Yunqing Wang
1. Algorithm modification: Instead of having the same filter threshold for a whole frame, we now allow the thresholds to be adjusted for each macroblock. In the current implementation, to avoid excessive blur on the background as reported in issue 480 (http://code.google.com/p/webm/issues/detail?id=480), we reduce the thresholds for skipped macroblocks.
2. SSE2 optimization: As started in issue 479 (http://code.google.com/p/webm/issues/detail?id=479), the filter calculation was adjusted for better performance. The C code was also modified accordingly. This made the deblock filter 2x faster, and the decoder was 1.2x faster overall.
Next, the demacroblock filter will be modified similarly. Change-Id: I05e54c3f580ccd427487d085096b3174f2ab7e86
2012-07-27 | Be consistent with SAD values | Johann
SAD returns unsigned values. Make all the declarations the same. Remove bestsad initialization and check. It is always set to the result of a SAD call so it will never remain UINT_MAX. Use ja instead of jg so the comparison is unsigned rather than signed. Update test. Change-Id: I46336ab45f4e60fc37caf20bd36bc5782079c7a5
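A C-level view of the same idea: keep SAD values `unsigned int` end to end, and seed the running best with a real SAD result rather than a UINT_MAX sentinel. The function names below are hypothetical and this is only a sketch, not the libvpx search code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helpers for illustration; not libvpx functions. */

/* Sum of absolute differences over len bytes; never negative, so the
 * result type is unsigned. */
static unsigned int sad_c(const unsigned char *a, const unsigned char *b,
                          int len) {
  unsigned int sad = 0;
  for (int i = 0; i < len; ++i) sad += (unsigned int)abs(a[i] - b[i]);
  return sad;
}

/* Return the index of the candidate with the lowest SAD. best_sad is
 * seeded from the first candidate, so no UINT_MAX check is needed. */
static int best_candidate(const unsigned char *src,
                          const unsigned char *cands, int n_cands, int len) {
  assert(n_cands > 0 && len > 0);
  unsigned int best_sad = sad_c(src, cands, len);
  int best = 0;
  for (int i = 1; i < n_cands; ++i) {
    const unsigned int s = sad_c(src, cands + i * len, len);
    if (s < best_sad) { /* unsigned compare, the C analogue of ja/jb */
      best_sad = s;
      best = i;
    }
  }
  return best;
}
```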
2012-07-02 | Add 0 offsets handling in SSSE3 sixtap_predict functions | Yunqing Wang
This patch fixed issue 458 by calling the copy function when both offsets are 0, which guarantees the SSSE3 functions output the same result as the C function for all possible offsets. Change-Id: I209aec7a4c6b3362db2646a8887c1038493b6496
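The dispatch described above can be sketched in C as follows. The names `predict4x4` and `subpel_filter_fn` are hypothetical, and the filtered path is passed in as a callback so the sketch stays self-contained; the real change lives in the SSSE3 sixtap_predict functions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical types and names for illustration; not the libvpx API. */
typedef void (*subpel_filter_fn)(const unsigned char *src, int src_stride,
                                 int xoffset, int yoffset,
                                 unsigned char *dst, int dst_stride);

/* Predict a 4x4 block. When both sub-pixel offsets are 0 the prediction
 * is a plain copy, so skip the filter and copy the rows directly. */
static void predict4x4(const unsigned char *src, int src_stride,
                       int xoffset, int yoffset,
                       unsigned char *dst, int dst_stride,
                       subpel_filter_fn filter) {
  if (xoffset == 0 && yoffset == 0) {
    for (int r = 0; r < 4; ++r)
      memcpy(dst + r * dst_stride, src + r * src_stride, 4);
    return;
  }
  assert(filter != NULL);
  filter(src, src_stride, xoffset, yoffset, dst, dst_stride);
}
```

Routing the zero-offset case to a copy keeps the optimized path bit-exact with the reference for that input, regardless of how the filter itself handles a zero offset.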
2012-06-11 | Fix pedantic compiler warnings | John Koleszar
Allows building the library with the gcc -pedantic option, for improved portability. In particular, this commit removes usage of C99/C++ style single-line comments and dynamic struct initializers. This is a continuation of the work done in commit 97b766a46, which removed most of these warnings for decode only builds. Change-Id: Id453d9c1d9f44cc0381b10c3869fabb0184d5966
2012-05-23 | Merge "Make libvpx Chromium build friendly" | John Koleszar
2012-05-23 | Make libvpx Chromium build friendly | Alpha Lam
Add PRIVATE macro for adding private_extern directive for yasm to hide global symbols. This is only enabled if -DCHROMIUM is used with YASM. Also fixed a small problem with rtcd_defs.sh to guard TEMPORAL_DENOISING. Change-Id: I9027fce3ebddcf20078293e4b86b396f21da7857
2012-05-22 | Build unit tests monolithically | John Koleszar
Rework unit tests to have a single executable rather than many, which should avoid pollution of the visual studio project namespace, improve build times, and make it easier to use the gtest test sharding system when we get these going on the continuous build cluster. Change-Id: If4c3e5d4b3515522869de6c89455c2a64697cca6
2012-04-12 | loopfilter improvements | Scott LaVarnway
Local variable offsets are now consistent for the functions, removed unused parameters, reworked the assembly to eliminate stalls/instructions. Change-Id: Iaa37668f8a9bb8754df435f6a51c3a08d547f879
2012-03-29 | Updated vp8_build_intra_predictors_mby_s(sse2/ssse3) | Scott LaVarnway
to work with the latest code. Patch Set 2: aligned the above_row buffers to fix crash Change-Id: I7a6992a20ed079ccd302f8c26215cf3057f8b70c
2012-03-26 | Updated vp8_build_intra_predictors_mbuv_s(sse2/ssse3) | Scott LaVarnway
to work with the latest code. Change-Id: Ie382bb55d00ea5929bdadba859eea15f696d4cd9
2012-03-06 | RFC: Reorganize MFQE loops | Johann
Break MFQE code into its own file. It is currently only valid for 16x16 and 8x8 Y blocks. It also filters 4x4 U/V blocks. Refactor filtering and add associated assembly. Limited test cases show --mfqe introduces a penalty of ~20% with HD content. The assembly reduces the penalty to ~15%. Change-Id: I4b8de6b5cdff5413037de5b6c42f437033ee55bf
2012-03-05 | Move SAD and variance functions to common | Johann
The MFQE function of the postprocessor depends on these Change-Id: I256a37c6de079fe92ce744b1f11e16526d06b50a
2012-02-21 | Add unit tests for idctllm_test and idctllm_mmx | James Berry
add unit tests for vp8_short_idct4x4llm_c Change-Id: I472b7c0baa365ba25dc99a3f6efccc816d27c941
2012-02-16 | Support Android x86 NDK build | Makoto Kato
On the Android NDK, rand() is an inlined function, but our SSE optimizations need a symbol for rand(). Change-Id: I42ab00e3255208ba95d7f9b9a8a3605ff58da8e1
2012-01-30 | RTCD: add subpixel functions | John Koleszar
This commit continues the process of converting to the new RTCD system. Change-Id: I6c519ab61e4f4e0ebcc796f2df061f945c48cefe
2012-01-30 | RTCD: add postproc functions | John Koleszar
This commit continues the process of converting to the new RTCD system. Change-Id: If54eb5cb5d1b0cac6c4c0633a9e99c93ca860ba2
2012-01-30 | RTCD: add recon functions | John Koleszar
This commit continues the process of converting to the new RTCD system. Change-Id: I9bfcf9bef65c3d4ba0fb9a3e1532bad1463a10d6
2012-01-30 | RTCD: add remaining IDCT functions | John Koleszar
This commit continues the process of converting to the new RTCD system. Change-Id: I03c4dbf30dfd3558b0e256ff9d3ff4c012aadc80
2012-01-30 | RTCD: add loopfilter functions | John Koleszar
This commit continues the process of converting to the new RTCD system. Change-Id: Ic8a4047d72ff3a54ec98977dd90e70c13213db71
2012-01-30 | New RTCD implementation | John Koleszar
This is a proof of concept RTCD implementation to replace the current system of nested includes, prototypes, INVOKE macros, etc. Currently only the decoder specific functions are implemented in the new system. Additional functions will be added in subsequent commits.
Overview: RTCD "functions" are implemented as either a global function pointer or a macro (when only one eligible specialization is available). Functions which have RTCD specializations are listed using a simple DSL identifying the function's base name, its prototype, and the architecture extensions that specializations are available for.
Advantages over the old system:
- No INVOKE macros. A call to an RTCD function looks like an ordinary function call.
- No need to pass vtables around.
- If there is only one eligible function to call, the function is called directly, rather than indirecting through a function pointer.
- Supports the notion of "required" extensions, so in combination with the above, on x86_64 if the best function available is sse2 or lower it will be called directly, since all x86_64 platforms implement sse2.
- Elides all references to functions which will never be called, which could reduce binary size. For example, if sse2 is required and there are both mmx and sse2 implementations of a certain function, the code will have no link time references to the mmx code.
- Significantly easier to add a new function, just one file to edit.
Disadvantages:
- Requires global writable data (though this is not a new requirement)
- 1 new generated source file.
Change-Id: Iae6edab65315f79c168485c96872641c5aa09d55
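The two forms described in the overview, a global function pointer chosen at startup and a macro when only one specialization is eligible, can be sketched in C as below. All names here (`HAS_SSE2`, `block_sum`, `setup_rtcd`, and so on) are hypothetical and chosen for illustration; they are not the identifiers generated by the real RTCD scripts, and the "sse2" variant is a plain C stand-in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU capability flags. */
#define HAS_MMX  0x01
#define HAS_SSE2 0x02

/* Two specializations of the same hypothetical function. */
static int block_sum_c(const unsigned char *p, int n) {
  int s = 0;
  for (int i = 0; i < n; ++i) s += p[i];
  return s;
}
static int block_sum_sse2(const unsigned char *p, int n) {
  return block_sum_c(p, n); /* stand-in for an optimized version */
}

/* Case 1: several eligible specializations. Callers invoke block_sum(...)
 * like an ordinary function; it is a global function pointer. */
static int (*block_sum)(const unsigned char *p, int n);

/* Case 2: only one eligible specialization. A macro makes the call
 * direct, with no pointer indirection. */
static int block_max_c(const unsigned char *p, int n) {
  int m = 0;
  for (int i = 0; i < n; ++i) if (p[i] > m) m = p[i];
  return m;
}
#define block_max block_max_c

/* Run once at startup: pick the best specialization the CPU supports. */
static void setup_rtcd(int cpu_flags) {
  block_sum = block_sum_c;
  if (cpu_flags & HAS_SSE2) block_sum = block_sum_sse2;
  assert(block_sum != NULL);
}
```

After `setup_rtcd(flags)` runs, call sites simply write `block_sum(buf, n)` or `block_max(buf, n)`, with no INVOKE macro and no vtable argument.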
2012-01-17 | vp8d - valgrind warnings in mb post processor | Jim Bankoski
Solved by extending the border in the postproc buffer as necessary Change-Id: Ic3f61397fe5bc8e4db6fc78050b0b160bd0aee86
2012-01-06 | Merge "Reduced the size of Y1Dequant and friends to [128][2]" | John Koleszar
2012-01-06 | Reduced the size of Y1Dequant and friends to [128][2] | Scott LaVarnway
This patch removes the local copies of the dequantize constants and implements John's idea as described in "Make a local copy of the dequantized data" commit. Change-Id: Ic6b7d681f00bf63263f71ff1e39ab2f80729e8b2
2012-01-05 | Merge "SSE2 optimizations for vp8_build_intra_predictors_mby{,_s}()" | Scott LaVarnway