path: root/vp8/encoder
Age | Commit message | Author
2011-03-21 | ARMv6 optimized fdct4x4 | Tero Rintaluoma
Optimized fdct4x4 (8x4) for ARMv6 instruction set.
- No interlocks in Cortex-A8 pipeline
- One interlock cycle in ARM11 pipeline
- About 2.16 times faster than current C-code compiled with -O3
Change-Id: I60484ecd144365da45bb68a960d30196b59952b8
2011-03-17 | Merge "Fix "used uninitialized" warning in vp8_pack_bitstream()" | John Koleszar
2011-03-15 | Add vp8_variance8x8_armv6 and vp8_sub_pixel_variance8x8_armv6 functions | Attila Nagy
Change-Id: I08edaffc62514907fa5e90e1689269e467c857f5
2011-03-14 | Merge "Add vp8_mse16x16_armv6 function" | Johann
2011-03-14 | Add vp8_mse16x16_armv6 function | Attila Nagy
Change-Id: I77e9f2f521a71089228f96e2db72524189364ffb
2011-03-11 | Merge "Move build_intra_predictors_mby to RTCD framework" | Johann
2011-03-11 | Move build_intra_predictors_mby to RTCD framework | John Koleszar
The vp8_build_intra_predictors_mby and vp8_build_intra_predictors_mby_s functions had global function pointers rather than using the RTCD framework. This can show up as a potential data race with tools such as helgrind. See https://bugzilla.mozilla.org/show_bug.cgi?id=640935 for an example. Change-Id: I29c407f828ac2bddfc039f852f138de5de888534
2011-03-11 | Merge "ARMv6 optimized quantization" | Johann
2011-03-11 | Clean up of vp8_init_config() | Paul Wilkins
Clean up vp8_init_config() a bit and remove null pointer case, as this code can't be called any more and is not an adequate trap anyway, as a null pointer would cause exceptions before hitting the test. Change-Id: I937c00167cc039b3aa3f645f29c319d58ae8d3ee
2011-03-11 | Merge "1 Pass CQ and VBR bug fixes" | John Koleszar
2011-03-11 | 1 Pass CQ and VBR bug fixes | Paul Wilkins
Issue 291 highlighted the fact that CQ mode was not working as expected in 1 pass mode. This commit fixes that specific problem, but in so doing I also uncovered an overflow issue in the VBR code for 1 pass and some data values not being correctly initialized. For some clips (particularly short clips), the resulting improvement is dramatic.
Change-Id: Ieefd6c6e4776eb8f1b0550dbfdfb72f86b33c960
2011-03-11 | Merge "Fix incorrect macroblock counts in twopass rate control" | John Koleszar
2011-03-11 | Merge "Align SAD output array to be 16-byte aligned" | Yunqing Wang
2011-03-11 | Merge "vp8cx - psnr converted to call assemblerized sse" | John Koleszar
2011-03-11 | Merge "vp8cx- alternate ssim function with optimizations" | John Koleszar
2011-03-11 | vp8cx - psnr converted to call assemblerized sse | Jim Bankoski
Change-Id: Ie388d4618c44b131f96b9fe526618b457f020dfa
2011-03-11 | vp8cx- alternate ssim function with optimizations | Jim Bankoski
Change-Id: I91921b0a90dbaddc7010380b038955be347964b3
2011-03-11 | Align SAD output array to be 16-byte aligned | Yunqing Wang
Use aligned store. Change-Id: Icab4c0c53da811d0c52bb7e8134927f249ba2499
2011-03-11 | Merge "Encoder loopfilter running in its own thread" | Yunqing Wang
2011-03-11 | Fix "used uninitialized" warning in vp8_pack_bitstream() | Attila Nagy
Change-Id: Iadcbdba717439f47a2c24e65fd69a3a1464174b5
2011-03-11 | Encoder loopfilter running in its own thread | Attila Nagy
In multithreaded mode the loopfilter is running in its own thread (filter level calculation and frame filtering). Filtering is mostly done in parallel with the bitstream packing. Before starting the packing the loopfilter level has to be calculated. Also any needed reference frame copying is done in the filter thread.
Currently the encoder will create n+1 threads, where n > 1 is the number of threads specified by application and 1 is the extra filter thread. With n = 1 the encoder runs in single thread mode. There will never be more than n threads running concurrently.
Change-Id: I4fb29b559a40275d6d3babb8727245c40fba931b
2011-03-11 | ARMv6 optimized quantization | Tero Rintaluoma
Adds new ARMv6 optimized function vp8_fast_quantize_b_armv6 to the encoder. Change-Id: I40277ec8f82e8a6cbc453cf295a0cc9b2504b21e
2011-03-10 | Added missing format specifier in print statement | Adrian Grange
Printout of firstpass stats for frame had one fewer format specifiers than arguments. Change-Id: I5a42c85aa79c471e1a70afd75e24a91546b7a1cd
2011-03-10 | Removed firstpass motion map | Adrian Grange
The firstpass motion map consists of an 8-bit flag for each MB indicating how strongly the firstpass code believes it should be filtered during the second pass ARNR filtering. For long or large format material the motion map can become extremely large and hamper the operation of the encoding process.
This change removes the motion map altogether, leaving the second pass to rely on the magnitude of the motion compensated error to determine the filter weight to use for the MB during ARNR filtering.
Tests on the derf set indicate that the effect of this change is neutral, with some small wins and losses. The motion map has therefore been removed based on a cost/benefit evaluation.
Change-Id: I53e07d236f5ce09a6f0c54e7c4ffbb490fb870f6
2011-03-10 | Fix incorrect macroblock counts in twopass rate control | James Berry
The previous calculation of macroblock count (w*h)/256 is not correct when the width/height are not multiples of 16. Use the precalculated macroblock count from cpi->common instead. This manifested itself as a divide by zero when the number of pixels was less than 256.
num_mbs updated in estimate_max_q, estimate_q, estimate_kf_group_q, and estimate_cq.
Change-Id: I92ff98587864c801b1ee5485cfead964673a9973
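The difference between the two counts can be sketched in a few lines of C. This is only an illustration: the function names are hypothetical, and the per-dimension rounding is an assumption about how a precalculated macroblock count would be derived, not a copy of the encoder's code.

```c
#include <assert.h>

/* Count derived from the pixel area. Integer division truncates, so
 * any frame with fewer than 256 pixels yields 0, which is what makes
 * a later division by the count fail. */
int mb_count_from_area(int width, int height)
{
    return (width * height) / 256;
}

/* Count derived per dimension (assumed approach): each dimension is
 * rounded up to a whole number of 16-pixel macroblocks first. */
int mb_count_rounded(int width, int height)
{
    assert(width > 0 && height > 0);
    int mb_cols = (width + 15) >> 4;
    int mb_rows = (height + 15) >> 4;
    return mb_cols * mb_rows;
}
```

For a 176x144 frame both functions agree on 99 macroblocks. For an 8x8 frame the area-based count is 0 while the rounded count is 1, and for 100x100 the two give 39 and 49 respectively.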
2011-03-09 | Add vp8_sub_pixel_variance16x8_ssse3 function | Yunqing Wang
Added SSSE3 function Change-Id: I8c304c92458618d93fda3a2f62bd09ccb63e75ad
2011-03-09 | Remove unused functions | Yunqing Wang
Removed some unused functions Change-Id: Ifdfc27453e53cfc75997b38492901d193a16b245
2011-03-09 | Merge "Improve SSE2 half-pixel filter functions" | Yunqing Wang
2011-03-09 | Merge "Configuration updates: Making a clear distinction between Init and Change" | John Koleszar
2011-03-08 | Improve SSE2 half-pixel filter functions | Yunqing Wang
Rewrote these functions to process 16 pixels at once instead of 8.
Change-Id: Ic67e80124467a446a3df4cfecfb76a4248602adb
2011-03-08 | Merge "Add zero offset checking in SSE2 sub-pixel filter function" | Yunqing Wang
2011-03-08 | Add zero offset checking in SSE2 sub-pixel filter function | Yunqing Wang
Skip filter at zero offset. Change-Id: I95fc7e211869bc0ab5bcfb7ab2e3259d1c0ccf38
2011-03-08 | Merge "Write SSSE3 sub-pixel filter function" | Yunqing Wang
2011-03-08 | Write SSSE3 sub-pixel filter function | Yunqing Wang
1. Process 16 pixels at one time instead of 8.
2. Add check for both xoffset = 0 and yoffset = 0, which happens during motion search.
This change gave the encoder a 1% to 3% performance gain.
Change-Id: Idaa39506b48f4f8b2fbbeb45aae8226fa32afb3e
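The zero-offset check can be illustrated with a scalar C sketch. This is not the SSSE3 code from the commit: the function name, the two-pass layout and the 16x16 size limit are assumptions made for the example, and the two-tap weights (128 - 16*offset, 16*offset) are those of a plain eighth-pel bilinear filter.

```c
#include <assert.h>
#include <string.h>

/* Two-tap bilinear weights for an eighth-pel offset in [0, 7]. */
static int tap0(int offset) { return 128 - 16 * offset; }
static int tap1(int offset) { return 16 * offset; }

/* Hypothetical scalar bilinear prediction of a w x h block.
 * src needs one extra column when xoffset != 0 and one extra row
 * when yoffset != 0. */
void bilinear_predict(const unsigned char *src, int src_stride,
                      int xoffset, int yoffset,
                      unsigned char *dst, int dst_stride, int w, int h)
{
    unsigned char tmp[17 * 16];
    int r, c, rows;

    assert(w > 0 && w <= 16 && h > 0 && h <= 16);

    /* Shortcut: with both offsets zero the filter is an identity,
     * so a plain copy is enough. */
    if (xoffset == 0 && yoffset == 0) {
        for (r = 0; r < h; r++)
            memcpy(dst + r * dst_stride, src + r * src_stride, (size_t)w);
        return;
    }

    /* Horizontal pass; skipped per pixel when xoffset is zero. */
    rows = h + (yoffset != 0);
    for (r = 0; r < rows; r++)
        for (c = 0; c < w; c++) {
            const unsigned char *p = src + r * src_stride + c;
            tmp[r * 16 + c] = xoffset
                ? (unsigned char)((p[0] * tap0(xoffset) +
                                   p[1] * tap1(xoffset) + 64) >> 7)
                : p[0];
        }

    /* Vertical pass; skipped per pixel when yoffset is zero. */
    for (r = 0; r < h; r++)
        for (c = 0; c < w; c++) {
            const unsigned char *t = tmp + r * 16 + c;
            dst[r * dst_stride + c] = yoffset
                ? (unsigned char)((t[0] * tap0(yoffset) +
                                   t[16] * tap1(yoffset) + 64) >> 7)
                : t[0];
        }
}
```

Calling it with both offsets at zero simply copies the block, which is the case the commit message says occurs during motion search.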
2011-03-08 | Fix a multi-line format-string warning. | Ralph Giles
GCC 4.5 and 4.6 both issue a warning about the multi-line format string introduced in bc9c30a0, which also changed the whitespace in the associated stt file by line-wrapping the long format string. Instead, use multiple string constants, which the compiler will concatenate. This maintains the original formatting, but remains legible within the standard line length. Change-Id: I27c9f92d46be82d408105a3a4091f145f677e00e
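The technique described above, splitting a long format string into adjacent string constants, looks roughly like this in C. The function name and the three-column layout are hypothetical and are not taken from the stats-writing code.

```c
#include <stdio.h>

/* Adjacent string literals are concatenated by the compiler into a
 * single format string. The source stays within the line length and
 * the output remains one unbroken line per record. */
int format_row(char *buf, size_t len, double a, double b, double c)
{
    return snprintf(buf, len,
                    "%12.2f "
                    "%12.2f "
                    "%12.2f\n",
                    a, b, c);
}
```

Because the pieces are joined at compile time, the output contains only the single newline at the end of the last piece, unlike a string literal that is physically wrapped across source lines.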
2011-03-08 | Corrected minor typos. | Paul Wilkins
Change-Id: Icc9f12bd1e1bdaf51256dc8a90d08aa9be89ef34
2011-03-08 | Merge changes I00c3e823,If8bca004 | Paul Wilkins
* changes:
  Improved key frame detection.
  Improved KF insertion after fades to still.
2011-03-07 | correct zbin boost for splitmv mode | John Koleszar
Disable zbin boost in SPLITMV mode as intended. Was incorrectly looking at vp8_ref_frame_order instead of vp8_mode_order when comparing against SPLITMV. This condition should have always been false, as SPLITMV is not in the range of valid reference frames. Change-Id: I0408cc7595eff68f00efef6d008e79f5b60d14bf
2011-03-07 | Improved key frame detection. | Paul Wilkins
In some cases where clips have been encoded with borders (e.g. wide-screen content with a border at the top and bottom, or slide shows containing portrait format photographs with a border on the left and right), key frames were not being correctly detected.
The new code looks to measure cases where a portion of the image can be coded equally easily using intra or inter modes and where the resulting error score is also very low. These "neutral" areas are then discounted in the key frame detection code.
Change-Id: I00c3e8230772b8213cdc08020e1990cf83b780d8
2011-03-07 | Improved KF insertion after fades to still. | Paul Wilkins
This code extends what was previously done for GFs, to pick cases where insertion of a key frame after a fade (or other transition or complex motion) followed by a still section, will be beneficial and will reduce the number of forced key frames. Change-Id: If8bca00457f0d5f83dc3318a587f61c17d90f135
2011-03-04 | Merge "Fixing divide by zero" | John Koleszar
2011-03-03 | Configuration updates: Making a clear distinction between Init and Change | Mikhal Shemer
Change-Id: I7b2fb326e1aabc08b032177a7b914a5b8bb7376f
2011-03-03 | Fixing divide by zero | Mikhal Shemer
Change-Id: I9d8a98a2f7ed1e3116d0bae35164618c41998bac
2011-03-02 | Fix drastic undershoot in long form content | John Koleszar
When the modified_error_left accumulator exceeds INT_MAX, an incorrect cast to int resulted in a negative value, causing the rate control to allocate no bits to that keyframe group, leading to severe undershoot and subsequent poor quality.
This error was exposed by the recent change to the rolling target and actual spend accumulators in commit 305be4e4 which fixed them to actually calculate the average value rather than be re-initialized on every frame to the average per-frame bitrate. When this bug was triggered, the target bitrate could be 0, so the rolling target becomes small, which causes the undershoot. The code prior to 305be4e4 did not exhibit this behavior because the rolling target was always set to a reasonable value and was independent of the actual target bitrate. With this patch, the actual target bitrate is calculated correctly, and the rate control tracks as expected.
This cast was likely added to silence a compiler warning on a comparison between a double (modified_error_left) and an int (0). Instead, this patch removes the cast and changes the comparison to be against 0.0, which should prevent the warning from recurring.
This fixes issue #289. Special thanks to gnafu for his efforts in reporting and debugging this fix.
Change-Id: Ie5cc1a7b516c578a76c3a50c892a6f04a11621fe
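A minimal C sketch of the comparison described in this commit follows. The helper names are hypothetical; only modified_error_left comes from the commit message. Note that converting a double outside the range of int to int is undefined behavior in C, so the sketch detects the out-of-range case instead of performing the cast.

```c
#include <limits.h>

/* Returns nonzero when v cannot be represented as an int, which is
 * the situation in which an (int) cast gives an unusable result. */
int exceeds_int_range(double v)
{
    return v > (double)INT_MAX || v < (double)INT_MIN;
}

/* Compares in the double domain, so a very large accumulator still
 * tests as positive. Using the literal 0.0 keeps both operands the
 * same type. */
int has_error_budget(double modified_error_left)
{
    return modified_error_left > 0.0;
}
```

With an accumulator of 3.0e9 on a platform with 32-bit int, exceeds_int_range reports that the cast would be unsafe, while has_error_budget still correctly reports a positive budget.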
2011-03-02 | Merge "ARMv6 optimized half pixel variance calculations" | Johann
2011-02-28 | Merge "Add prefetch before variance calculation" | Yunqing Wang
2011-02-28 | Merge "Avoid double copying of key frames into alt and golden buffer" | Scott LaVarnway
2011-02-28 | Add prefetch before variance calculation | Yunqing Wang
This improved encoding performance by 0.5% (good, speed 1) to 1.5% (good, speed 5). Change-Id: I843d72a0d68a90b5f694adf770943e4a4618f50e
2011-02-25 | Merge "Remove a second check for invalid ptr in vp8_get_compressed_data" | Johann
2011-02-25 | Merge "Remove temporal alt ref from realtime only build" | Johann