summaryrefslogtreecommitdiff
path: root/vp8/encoder/encodeintra.c
AgeCommit message (Collapse)Author
2011-12-21Remove useless g_common.hJohn Koleszar
This file declared a bunch of nonexistent, unreferenced global function pointers. Change-Id: Ic26bb8c7712deba754c49fc01f383b53afc9e728
2011-12-15Moved dequant idct into commonScott LaVarnway
These functions are now used by the encoder. This is WIP with the goal of creating a common idct/add for the encoder and decoder. A boost of 1.8% was seen for the HD rt test clip used. [Tero] Added needed changes to ARM side. Change-Id: Ibbb8000be09034203d7adffc457d3c3f8b06a5bf
2011-11-15Added predictor stride argument(s) to subtract functionsScott LaVarnway
Patch set 2: 64 bit build fix Patch set 3: 64 bit crash fix [Tero] Patch set 4: Updated ARMv6 and NEON assembly. Added also minor NEON optimizations to subtract functions. Patch set 5: x86 stride bug fix Change-Id: I1fcca93e90c89b89ddc204e1c18f208682675c15
2011-11-09Merge "Relocated idct/add calls for encoder"Scott LaVarnway
2011-11-09Relocated idct/add calls for encoderScott LaVarnway
Call the idct/add after the tokenize. This is WIP with the goal of creating a common idct/add for the encoder and decoder. This move is necessary because the decoder's version of the idct clobbers qcoeff, which is used by the tokenize. Change-Id: I6b08d8e8397cd873647fa4fb9469884e3c876756
2011-11-09ARMv6 optimized Intra4x4 predictionTero Rintaluoma
Added ARM optimized intra 4x4 prediction - 2x faster on Profiler compared to C-code compiled with -O3 - Function interface changed a little to improve BLOCKD structure access Change-Id: I9bc2b723155943fe0cf03dd9ca5f1760f7a81f54
2011-10-28added code to clear 2nd order block when appropriateYaowu Xu
It is discovered that in rare situations the 2nd order block may produce a few small magnitude coefficients that has no effect on reconstruction. The situations are a combination of low quantizer values (high quality) and low energy in residual signals (content dependent). This commit added code to detect such cases and reset the 2nd order block to all 0. Patch 1 to 4 used code to do all-zero-check on idct result buffer, and tests on derf set showed a consistent gain of .12%-.14% on all metrics.But due to a recent change Ie31d90b, the idct result buffer is not longer populated. So patch 5&6 use an alternative method to detect the situations. Tests on derf set now shows a consistent quality gain of .16%-.20%. As suggested by Jim, Patch 7&8 removed the condition of all first order block not having any coefficient, instead we reset 2nd order coefficients to all 0 if sum of absolute value of the coefficients is small. So it does slightly more than just detecting the oddity as discussed above, but tests on derf set now show a consistent gain of .20%-.23% on all metrics. It is worth noting here that this change does not have any effect on mid/high quantizer range, it only affects the quantizer value 18 or blow. Within this range, the change helps compression by up to 2.5% on clips in the derf set. Change-Id: I718e19cf59a4fc2462cb7070832759beb9f7e7dd
2011-10-18Remove usage of predict buffer for decodeScott LaVarnway
Instead of using the predict buffer, the decoder now writes the predictor into the recon buffer. For blocks with eob=0, unnecessary idcts can be eliminated. This gave a performance boost of ~1.8% for the HD clips used. Tero: Added needed changes to ARM side and scheduled some assembly code to prevent interlocks. Patch Set 6: Merged (I1bcdca7a95aacc3a181b9faa6b10e3a71ee24df3) into this commit because of similarities in the idct functions. Patch Set 7: EC bug fix. Change-Id: Ie31d90b5d3522e1108163f2ac491e455e3f955e6
2011-09-22Replace vpx_ports/config.h with vpx_config.hAttila Nagy
Just a clean-up. Change-Id: Iea5b6dc925dcfa7db548bc1ab1a13d26ed5a2c9a
2011-06-23Copy macroblock data to a buffer before encoding itYunqing Wang
I got this idea from Pascal (Thanks). Before encoding a macroblock, copy it to a 16x16 buffer, and then read source data from there instead. This will help keep the source data in cache, and help with the performance. Change-Id: Id05f4cb601299150511d59dcba0ae62c49b5b757
2011-06-14Merge "Populate bmi for B_PRED only"Scott LaVarnway
2011-06-14Fix RT only buildTero Rintaluoma
Moved encode_intra function from firstpass.c to encodeintra.c to prevent linking problem in real-time only build. Also changed name of the function to vp8_encode_intra because it is not a static. Change-Id: Ibf3c6c1de3152567347e5fbef47d1d39564620a5
2011-06-13Populate bmi for B_PRED onlyScott LaVarnway
Small decode performance gain (~1%) on keyframes. No noticeable gains on encode. Also changed pick_intra4x4mby_modes() to read the above and left block modes for keyframes only. Change-Id: I1f4885252f5b3e9caf04d4e01e643960f910aba5
2011-06-02Removed B_MODE_INFOScott LaVarnway
Declared the bmi in BLOCKD as a union instead of B_MODE_INFO. Then removed B_MODE_INFO completely. Change-Id: Ieb7469899e265892c66f7aeac87b7f2bf38e7a67
2011-05-24MODE_INFO size reductionScott LaVarnway
Declared the bmi in MODE_INFO as a union instead of B_MODE_INFO. This reduced the memory footprint by 518,400 bytes for 1080 resolutions. The decoder performance improved by ~4% for the clip used and the encoder showed very small improvements. (0.5%) This reduction was first mentioned to me by John K. and in a later discussion by Yaowu. This is WIP. Change-Id: I8e175fdbc46d28c35277302a04bee4540efc8d29
2011-05-19revise two function definitions with less parametersYaowu Xu
Change-Id: Ia96e5bf915e4d3c0ac9c1795114bd9e5dd07327a
2011-05-19disable trellis optimization for first passYaowu Xu
also remove 2 #defines and 1 function declaration that are not in use. Change-Id: I8f743d0e3dd9ebf1de24a8b0c30ff09f29b00c53
2011-05-10remove a variable no longer in useYaowu Xu
The variable is introduced in commit 2e53e9e53 to make more use of trellis quantization, but this is no longer necessary after RDMULT was made adaptive in a number of later commits. Change-Id: I7420522ec7723f38cf77033466c25afb405d52ae
2011-04-27SSE2/SSSE3 optimizations for build_predictors_mbuv{,_s}().Ronald S. Bultje
decoding before 10.425 10.432 10.423 =10.426 after: 10.405 10.416 10.398 =10.406, 0.2% faster encoding before 14.252 14.331 14.250 14.223 14.241 14.220 14.221 =14.248 after 14.095 14.090 14.085 14.095 14.064 14.081 14.089 =14.086, 1.1% faster Change-Id: I483d3d8f0deda8ad434cea76e16028380722aee2
2011-04-11Set cpu_used range to [-16, 16] in real-time modeYunqing Wang
Remove encoding speed limitation in real-time mode. Change-Id: Ib5e35d8bb522b2a25f3e4ad5cfe2788ebebb3617
2011-03-17Increase static linkage, remove unused functionsJohn Koleszar
A large number of functions were defined with external linkage, even though they were only used from within one file. This patch changes their linkage to static and removes the vp8_ prefix from their names, which should make it more obvious to the reader that the function is contained within the current translation unit. Functions that were not referenced were removed. These symbols were identified by: $ nm -A libvpx.a | sort -k3 | uniq -c -f2 | grep ' [A-Z] ' \ | sort | grep '^ *1 ' Change-Id: I59609f58ab65312012c047036ae1e0634f795779
2011-03-11Move build_intra_predictors_mby to RTCD frameworkJohn Koleszar
The vp8_build_intra_predictors_mby and vp8_build_intra_predictors_mby_s functions had global function pointers rather than using the RTCD framework. This can show up as a potential data race with tools such as helgrind. See https://bugzilla.mozilla.org/show_bug.cgi?id=640935 for an example. Change-Id: I29c407f828ac2bddfc039f852f138de5de888534
2011-02-17Merge "Fix relative include paths"John Koleszar
2011-02-14Improved vp8_rd_pick_intra_mbuv_modeScott LaVarnway
Eliminated unnecessary calculations. Very small change to performance. Change-Id: Ib7213d43c64e36955177c4d47950ff472266f822
2011-02-14Improved rd_pick_intra4x4blockScott LaVarnway
Eliminated unnecessary calculations. Improved performance by 10% on keyframes and 1.6% overall for the test clip used. Change-Id: I87671b26af5e2cc439e81d0fee3b15c7cd2a3309
2011-02-10Fix relative include pathsJohn Koleszar
Allow compiling without adding vp8/{common,encoder,decoder} to the include paths. Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c
2011-01-26Rationalize vp8_rd_pick_intra16x16mby_mode()Paul Wilkins
Use the function macro_block_yrd() to calculate error and distortion in keeping with what is done for inter frames. The old code was using a variance metric for once case and an SSE function for measuring distortion in the other case. The function vp8_encode_intra16x16mbyrd() is no longer used. Change-Id: Ic228cb00a78ff637f4365b43f58fbe5a9273d36f
2010-12-03Merge 'Add simple version of activity masking.'John Koleszar
Merge commit 'refs/changes/79/779/2' of https://review.webmproject.org/p/libvpx Conflicts: vp8/encoder/encodeintra.c vp8/encoder/encodemb.c Change-Id: Id607063fabe92d99eeb3c380e8ca670b01bfb3ef
2010-10-26make vp8_recon16x16mb{,y} RTCD functionsJohn Koleszar
ARM NEON has a platform specific version of vp8_recon16x16mb, though it's just a stub to extract the various parameters from the MACROBLOCKD struct and pass them to vp8_recon16x16mb_neon(). Using that function's prototype directly will be a better long term solution, but it's quite an invasive change. Change-Id: I04273149e2ade34749e2d09e7edb0c396e1dd620
2010-10-15change to make use of more trellis quantizationYaowu Xu
when a subsequent frame is encoded as an alt reference frame, it is unlikely that any mb in current frame will be used as reference for future frames, so we can enable quantization optimization even when the RD constant is slightly rate-biased. The change has an overall benefit between 0.1% to 0.2% bit savings on the test sets based on vpxssim scores. Change-Id: I9aa7bc5cd573ea84e3ee655d2834c18c4460ceea
2010-10-12Centralize mb skip state calculationJohn Koleszar
This patch moves the scattered updates to the mb skip state (mode_info_context->mbmi.mb_skip_coeff) to vp8_tokenize_mb. Recent changes to the quantizer exposed a bug where if a macroblock could be coded as a skip but isn't, the encoder would run the loopfilter but the decoder wouldn't, causing a reference buffer mismatch. The loopfilter is controlled by a flag called dc_diff. The decoder looks at the number of decoded coefficients when setting this flag. The encoder sets this flag based on the skip state, since any skippable macroblock should be transmitted as a skip. The coefficient optimization pass (vp8_optimize_b()) could change the coefficients such that a block that was not a skip becomes one. The encoder was not updating the skip state in this situation for intra coded blocks. The underlying issue predates it, but this bug was recently triggered by enabling trellis quantization on the Y2 block in commit dcd29e3, and by changing the quantizer range control in commit 305be4e. Change-Id: I5cce5da0dbc2d22f7d79ee48149f01e868a64802
2010-10-12Add simple version of activity masking.Timothy B. Terriberry
This uses MB variance to change the RDO weight for mode decision and quantization. Activity is normalized against the average for the frame, which is currently tracked using feed-forward statistics. This could also be used to adjust the quantizer for the entire frame, but that requires more extensive rate control changes. This does not yet attempt to adapt the quantizer within the frame, but the signaling cost means that will likely only be useful at very high rates. Change-Id: I26cd7c755cac3ff33cfe0688b1da50b2b87b9c93
2010-09-09Use WebM in copyright notice for consistencyJohn Koleszar
Changes 'The VP8 project' to 'The WebM project', for consistency with other webmproject.org repositories. Fixes issue #97. Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba
2010-08-12Removed unnecessary MB_MODE_INFO copiesScott LaVarnway
These copies occurred for each macroblock in the encoder and decoder. Thetemp MB_MODE_INFO mbmi was removed from MACROBLOCKD. As a result, a large number compile errors had to be fixed. Change-Id: I4cf0ffae3ce244f6db04a4c217d52dd256382cf3
2010-08-10Removed duplicate functionsYaowu Xu
Change-Id: Ie587972ccefd3c762b8cdf8ef39345cd22924b9b
2010-08-10Add trellis quantization.Timothy B. Terriberry
Replace the exponential search for optimal rounding during quantization with a linear Viterbi trellis and enable it by default when using --best. Right now this operates on top of the output of the adaptive zero-bin quantizer in vp8_regular_quantize_b() and gives a small gain. It can be tested as a replacement for that quantizer by enabling the call to vp8_strict_quantize_b(), which uses normal rounding and no zero bin offset. Ultimately, the quantizer will have to become a function of lambda in order to take advantage of activity masking, since there is limited ability to change the quantization factor itself. However, currently vp8_strict_quantize_b() plus the trellis quantizer (which is lambda-dependent) loses to vp8_regular_quantize_b() alone (which is not) on my test clip. Patch Set 3: Fix an issue related to the cost evaluation of successor states when a coefficient is reduced to zero. With this issue fixed, now the trellis search almost exactly matches the exponential search. Patch Set 2: Overall, the goal of this patch set is to make "trellis" search to produce encodings that match the exponential search version. There are three main differences between Patch Set 2 and 1: a. Patch set 1 did not properly account for the scale of 2nd order error, so patch set 2 disable it all together for 2nd blocks. b. Patch set 1 was not consistent on when to enable the the quantization optimization. Patch set 2 restore the condition to be consistent. c. Patch set 1 checks quantized level L-1, and L for any input coefficient was quantized to L. Patch set 2 limits the candidate coefficient to those that were rounded up to L. It is worth noting here that a strategy to check L and L+1 for coefficients that were truncated down to L might work. (a and b get trellis quant to basically match the exponential search on all mid/low rate encodings on cif set, without a, b, trellis quant can hurt the psnr by 0.2 to .3db at 200kbps for some cif clips) (c gets trellis quant to match the exponential search to match at Q0 encoding, without c, trellis quant can be 1.5 to 2db lower for encodings with fixed Q at 0 on most derf cif clips) Change-Id: Ib1a043b665d75fbf00cb0257b7c18e90eebab95e
2010-06-24Redo the forward 4x4 dctYaowu Xu
The new fdct lowers the round trip sum squared error for a 4x4 block ~0.12. or ~0.008/pixel. For reference, the old matrix multiply version has average round trip error 1.46 for a 4x4 block. Thanks to "derf" for his suggestions and references. Change-Id: I5559d1e81d333b319404ab16b336b739f87afc79
2010-06-18cosmetics: trim trailing whitespaceJohn Koleszar
When the license headers were updated, they accidentally contained trailing whitespace, so unfortunately we have to touch all the files again. Change-Id: I236c05fade06589e417179c0444cb39b09e4200d
2010-06-07Remove duplicate and unused functionsYaowu Xu
Change-Id: I944035e720ef834561a9da0d723879a4f787312c
2010-06-04LICENSE: update with latest textJohn Koleszar
Change-Id: Ieebea089095d9073b3a94932791099f614ce120c
2010-05-18Initial WebM releaseJohn Koleszar