path: root/vp8
Age Commit message Author
2011-05-06neon fast quantizer updatedTero Rintaluoma
vp8_fast_quantize_b_neon function updated and further optimized.
- match current C implementation of fast quantizer
- updated to use asm_enc_offsets for structure members
- updated ads2gas scripts to handle alignment issues
Change-Id: I5cbad9c460ad8ddb35d2970a8684cc620711c56d
2011-04-08Fix input MV for full searchYunqing Wang
Input MV needs to be modified to full-pixel precision. Change-Id: Ic5d78e41bf27077e325024332b9fe89f76c44f0c
2011-04-08Merge "use asm_offsets with vp8_fast_quantize_b_sse3"Johann Koenig
2011-04-08Merge "Error accumulator stats bug."John Koleszar
2011-04-08Error accumulator stats bug.Paul Wilkins
The error accumulator stats values cpi->prediction_error and cpi->intra_error were being populated with rd values not distortion values. These are only "currently" used in a limited way for RT compress key frame detection. Change-Id: I2702ba1cab6e49ab8dc096ba75b6b34ab3573021
2011-04-07use asm_offsets with vp8_fast_quantize_b_sse3Johann Koenig
on the same order as the sse2 fast quantize change: ~2%, except on 32bit, where there is only a slight improvement. Change-Id: Iff80e5f1ce7e646eebfdc8871405458ff911986b
2011-04-07Use correct 32 bit comparisons for SAD breakout.James Berry
Rax updated to eax to avoid uninitialized memory usage. Change-Id: Iedb953f104329ede2a786fc648a47f1be2f3798a
2011-04-06Merge "use asm_offsets with vp8_fast_quantize_b_sse2"Johann
2011-04-06Merge "Minor modification"Yunqing Wang
2011-04-06Minor modificationYunqing Wang
A small change. Change-Id: I2e7726e58370a95d0319361f4f6ad231138d1328
2011-04-04use asm_offsets with vp8_fast_quantize_b_sse2Johann
on the same order as the regular quantize change: ~2% Change-Id: I5c9eec18e89ae7345dd96945cb740e6f349cee86
2011-04-04Fixed unused variable warnings for firstpass.cScott LaVarnway
Change-Id: I8378a9a541ade2f098359a7b20fa08e6c1596d80
2011-04-04Merge "Slightly simplify vp8_decode_mb_tokens."John Koleszar
2011-04-04Merge "tweak vp8_regular_quantize_b_sse2"Johann
2011-04-04Slightly simplify vp8_decode_mb_tokens.Gaute Strokkenes
Change-Id: I0058ba7dcfc50a3374b712197639ac337f8726be
2011-04-04Merge "Use full-pixel MV in mvsadcost calculation"Yunqing Wang
2011-04-01Use full-pixel MV in mvsadcost calculationYunqing Wang
MV sad cost error is only used in full-pixel motion search, which only needs full-pixel resolution instead of quarter-pixel resolution. This change reduced the mvsadcost table size and removed unnecessary parameter passing, since this table is constant once it is generated. Change-Id: I9f931e55f6abc3c99011321f1dfb2f3562e6f6b0
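The table-size saving described above can be illustrated with a little arithmetic. This is only a sketch: the function names, the search range of 255 full pixels, and the use of 2 fractional bits for quarter-pixel units are assumptions made for illustration and are not taken from the encoder source.

```c
#include <assert.h>
#include <stddef.h>

/* Entries needed for a symmetric cost table covering
 * [-max_full_pel, +max_full_pel] when every full pixel is subdivided
 * into (1 << frac_bits) steps. Hypothetical helper, not a libvpx API. */
static size_t cost_table_entries(unsigned max_full_pel, unsigned frac_bits) {
  assert(frac_bits < 8);
  return 2 * ((size_t)max_full_pel << frac_bits) + 1;
}

/* Index of a signed full-pixel component in a table centred on zero. */
static size_t cost_table_index(int full_pel, unsigned max_full_pel) {
  assert(full_pel >= -(int)max_full_pel && full_pel <= (int)max_full_pel);
  return (size_t)(full_pel + (int)max_full_pel);
}
```

Under these assumptions, cost_table_entries(255, 2) gives the quarter-pixel table size and cost_table_entries(255, 0) the full-pixel one, so the full-pixel table is roughly a quarter of the size.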
2011-04-01tweak vp8_regular_quantize_b_sse2Johann
rather than look up rc in the zig zag table, embed it in the macro. this also allows us to shuffle some values in the macro and keep *d in rsi. gains of about the same order as the obj_int_extract implementation: ~2% Change-Id: Ib7252dd10eee66e0af8b0e567426122781dc053d
2011-04-01Merge "Wrapper function removed from vp8_subtract_b_neon function call"Johann
2011-04-01Wrapper function removed from vp8_subtract_b_neon function callTero Rintaluoma
Address calculations moved from encodemb_arm.c file to neon optimized assembly function to save cycles in function calls.
- vp8_subtract_b_neon_func replaced with vp8_subtract_b_neon that contains all needed address calculations
- unnecessary file encodemb_arm.c removed
- consistent with ARMv6 optimized version
Change-Id: I6cbc1a2670b56c2077f59995fcf8f70786b4990b
2011-03-31Merge "ARMv6 optimized subtract functions"Johann
2011-03-30Fix: lpf semaphore was signaled in single threaded runAttila Nagy
After picking the filter level, post the loopfilter semaphore only when multiple threads are in use. Change-Id: If7bfb64601d906adef703f454dafc25e978b93c6
2011-03-29Merge "Half pixel variance further optimized for ARMv6"Johann
2011-03-29Merge "Fix a crash while enabling shared (--enable-shared)"Yunqing Wang
2011-03-29Fix a crash while enabling shared (--enable-shared)Yunqing Wang
Fixed a bug in SSSE3 sub-pixel filter functions. Change-Id: I2e2126652970eb78307ffcefcace1efd5966fb0a
2011-03-29use GLOBAL correctly on 32bit shared librariesJohann
http://code.google.com/p/webm/issues/detail?id=309 Change-Id: I6fce9e2f74bc09a9f258df7f91ab599812324e8c
2011-03-29ARMv6 optimized subtract functionsTero Rintaluoma
Adds the following ARMv6 optimized functions to the encoder:
- vp8_subtract_b_armv6
- vp8_subtract_mby_armv6
- vp8_subtract_mbuv_armv6
Gives 1-5% speed-up depending on input sequence and encoding parameters. Functions have one stall cycle inside the loop body on Cortex pipeline. Change-Id: I19cca5408b9861b96f378e818eefeb3855238639
2011-03-28add asm_enc_offsets.c for all targetsJohann
now that we need asm_enc_offsets.c for x86 and arm and it is harmless to build it for other targets, add it unconditionally Change-Id: I320c5220afd94fee2b98bda9ff4e5e34c67062f3
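For readers unfamiliar with the idea of exporting structure offsets to assembly, the following is a minimal sketch of the general technique using offsetof. The structure, its members, and the constant names are hypothetical stand-ins, and the sketch does not show how the real build extracts the values from asm_enc_offsets.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical block structure; the real encoder structs differ. */
typedef struct {
  short *coeff;
  short *quant_fast;
  short *round;
  int eob;
} block_t;

/* Member offsets as named integer constants, so that assembly can
 * address members by name instead of hard-coded numbers. */
enum {
  BLOCK_COEFF = offsetof(block_t, coeff),
  BLOCK_QUANT_FAST = offsetof(block_t, quant_fast),
  BLOCK_ROUND = offsetof(block_t, round),
  BLOCK_EOB = offsetof(block_t, eob)
};

/* Read an int member the way assembly would: base address + offset. */
static int load_int_at(const void *base, size_t offset) {
  int value;
  assert(base != NULL);
  memcpy(&value, (const char *)base + offset, sizeof value);
  return value;
}
```

Because the constants are derived from the struct definition, reordering or adding members changes the offsets automatically at the next build, which is the property hand-written offsets in assembly lack.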
2011-03-28Half pixel variance further optimized for ARMv6Tero Rintaluoma
Half pixel interpolations optimized in variance calculations. Separate function calls to vp8_filter_block2d_bil_x_pass_armv6 are avoided. On average, performance improvement is 6-7% for VGA@30fps sequences. Change-Id: Idb5f118a9d51548e824719d2cfe5be0fa6996628
2011-03-24Merge "use asm_offsets with vp8_regular_quantize_b_sse2"Johann
2011-03-24use asm_offsets with vp8_regular_quantize_b_sse2Johann
remove helper function and avoid shadowing all the arguments to the stack on 64bit systems
when running with --good --cpu-used=0:
~2% on linux x86 and x86_64
~2% on win32 x86 msys and visual studio
more on darwin10 x86_64
significantly more on x86_64-win64-vs9
Change-Id: Ib7be12edf511fbf2922f191afd5b33b19a0c4ae6
2011-03-23Merge "ARMv6 optimized fdct4x4"Johann
2011-03-21Merge "Fix multithreaded encoding for 1 MB wide frame"Yunqing Wang
2011-03-21Remove unused vp8_get4x4sse_cs_mmx declarationJohn Koleszar
This declaration did not match the prototype_sad() prototype, but was unused in this translation unit, so it is removed instead. Fixes issue 290. Change-Id: I168854f88a85f73ca9aaf61d1e5dc0f43fc3fdb3
2011-03-21Merge "Increase static linkage, remove unused functions"John Koleszar
2011-03-21ARMv6 optimized fdct4x4Tero Rintaluoma
Optimized fdct4x4 (8x4) for ARMv6 instruction set.
- No interlocks in Cortex-A8 pipeline
- One interlock cycle in ARM11 pipeline
- About 2.16 times faster than current C-code compiled with -O3
Change-Id: I60484ecd144365da45bb68a960d30196b59952b8
2011-03-18Fix multithreaded encoding for 1 MB wide frameAttila Nagy
Thread synchronization was not correct when the frame width was 1 MB (one macroblock). The number of allocated encoding threads is limited by the sync_range. There is no point having more because each thread lags sync_range MBs behind the thread processing the row above. http://code.google.com/p/webm/issues/detail?id=302 Change-Id: Icaf67a883beecc5ebf2f11e9be47b6997fdf6f26
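The reasoning about the thread limit can be sketched with a simplified model: if each row thread starts sync_range macroblocks behind the one above it, then by the time the first thread finishes a row only about mb_cols / sync_range (rounded up) threads can have started. The function below is an illustration of that model only; its name and the exact rounding are assumptions, not the formula used in the encoder.

```c
#include <assert.h>

/* Simplified model: cap the requested thread count at the number of
 * row threads that can be active at once when each one lags
 * sync_range macroblocks behind the row above. Hypothetical helper. */
static int useful_encoder_threads(int requested, int mb_cols, int sync_range) {
  int max_useful;
  assert(requested >= 1 && mb_cols >= 1 && sync_range >= 1);
  max_useful = (mb_cols + sync_range - 1) / sync_range; /* ceiling division */
  return requested < max_useful ? requested : max_useful;
}
```

In this model a frame that is one macroblock wide never benefits from more than one thread, whatever sync_range is, which matches the edge case the commit addresses.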
2011-03-17Increase static linkage, remove unused functionsJohn Koleszar
A large number of functions were defined with external linkage, even though they were only used from within one file. This patch changes their linkage to static and removes the vp8_ prefix from their names, which should make it more obvious to the reader that the function is contained within the current translation unit. Functions that were not referenced were removed. These symbols were identified by:
  $ nm -A libvpx.a | sort -k3 | uniq -c -f2 | grep ' [A-Z] ' \
    | sort | grep '^ *1 '
Change-Id: I59609f58ab65312012c047036ae1e0634f795779
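The naming and linkage convention described in this commit can be shown with a small sketch. Both function names below are hypothetical and exist only to illustrate the pattern: a helper used in a single file becomes static and drops the vp8_ prefix, while a function called from other files keeps external linkage and the prefix.

```c
#include <assert.h>

/* Used only in this translation unit: internal linkage, no vp8_ prefix. */
static int clamp(int value, int lo, int hi) {
  assert(lo <= hi);
  if (value < lo) return lo;
  if (value > hi) return hi;
  return value;
}

/* Called from other files: external linkage, keeps the vp8_ prefix.
 * Hypothetical function, not part of libvpx. */
int vp8_example_clamp_q_index(int q) {
  return clamp(q, 0, 127);
}
```

With this layout, a symbol listing of the object file shows only the prefixed function as a global symbol; the static helper no longer appears as one.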
2011-03-17Set bounds from the array when iterating mmaps.Ralph Giles
The mmap allocation code in vp8_dx_iface.c was inconsistent. The static array vp8_mem_req_segs defines two descriptors, but only the first is real. The second is a sentinel and isn't actually allocated, so vpx_codec_alg_priv is declared with mmaps[NELEMENTS(vp8_mem_req_segs)-1]. Some functions use this reduced upper bound when iterating through the mmap array, but these two functions did not. Instead, this commit calls NELEMENTS(...->mmaps) to directly query the bounds of the dereferenced array. This fixes an array-bounds warning from gcc 4.6 on vp8_xma_set_mmap. Change-Id: I918e2721b401d134c1a9764c978912bdb3188be1
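The bounds pattern described above can be sketched as follows. The types, the descriptor array, and the helper function are simplified, hypothetical stand-ins for the real ones in vp8_dx_iface.c; only the idea of taking the loop bound from the array being iterated is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array (not a pointer). */
#define NELEMENTS(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical stand-ins for the descriptor and mmap types. */
typedef struct { size_t sz; } mem_req_t;
typedef struct { size_t sz; void *base; } mmap_t;

/* Two descriptors: one real segment and a zero-sized sentinel. */
static const mem_req_t mem_req_segs[] = { { 4096 }, { 0 } };

typedef struct {
  /* The sentinel is never allocated, so the array is one shorter. */
  mmap_t mmaps[NELEMENTS(mem_req_segs) - 1];
} alg_priv_t;

/* Count mapped segments, taking the bound from the array itself
 * rather than from the longer descriptor table. */
static size_t count_mapped(const alg_priv_t *priv) {
  size_t i, n = 0;
  assert(priv != NULL);
  for (i = 0; i < NELEMENTS(priv->mmaps); i++) {
    if (priv->mmaps[i].base != NULL) n++;
  }
  return n;
}
```

Iterating up to NELEMENTS(mem_req_segs) instead would read one element past the end of mmaps, which is the kind of access an array-bounds warning reports.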
2011-03-17Remove commented-out VP6 code from vp8_finalize_mmapsRalph Giles
Change-Id: I48642c380353043bed96026f56de5908fcee270a
2011-03-17Merge "Fix "used uninitialized" warning in vp8_pack_bitstream()"John Koleszar
2011-03-16apple: include proper mach primitivesJohn Koleszar
Fixes implicit declaration warning for 'mach_task_self'. This change is an update to Change I9991dedd1ccfddc092eca86705ecbc3b764b799d, which fixed this issue for the decoder but not the encoder. Change-Id: I9df033e81f9520c4f975b7a7cf6c643d12e87c96
2011-03-15Add vp8_variance8x8_armv6 and vp8_sub_pixel_variance8x8_armv6 functionsAttila Nagy
Change-Id: I08edaffc62514907fa5e90e1689269e467c857f5
2011-03-14Merge "Fix an unused variable warning."John Koleszar
2011-03-14Merge "Add vp8_mse16x16_armv6 function"Johann
2011-03-14Add vp8_mse16x16_armv6 functionAttila Nagy
Change-Id: I77e9f2f521a71089228f96e2db72524189364ffb
2011-03-11Merge "Move build_intra_predictors_mby to RTCD framework"Johann
2011-03-11Move build_intra_predictors_mby to RTCD frameworkJohn Koleszar
The vp8_build_intra_predictors_mby and vp8_build_intra_predictors_mby_s functions had global function pointers rather than using the RTCD framework. This can show up as a potential data race with tools such as helgrind. See https://bugzilla.mozilla.org/show_bug.cgi?id=640935 for an example. Change-Id: I29c407f828ac2bddfc039f852f138de5de888534
2011-03-11Merge "ARMv6 optimized quantization"Johann
2011-03-11Merge "Only enable ssim_opt.asm on X86_64"John Koleszar