Age | Commit message | Author
2017-10-20 | vpx: [x86] vpx_hadamard_16x16_avx2() improvements | Scott LaVarnway
~10% performance gain. Fixed the cosmetics noted in the previous commit. Change-Id: Iddf475f34d0d0a3e356b2143682aeabac459ed13
2017-10-19 | Merge "vpx: [x86] add vpx_hadamard_16x16_avx2()" | Scott LaVarnway
2017-10-19 | Merge "Corpus VBR tweak for undershoot." | Paul Wilkins
2017-10-19 | Merge "Increase precision of some debug stats output for corpus VBR." | Paul Wilkins
2017-10-19 | Merge "Prevent double application of min rate in two pass." | Paul Wilkins
2017-10-18 | vpx: [x86] add vpx_hadamard_16x16_avx2() | Scott LaVarnway
This version is ~1.91x faster than the sse2 version. When highbitdepth is enabled, it is ~1.74x. Change-Id: I2b0e92ede9f55c6259ca07bf1f8c8a5d0d0955bd
2017-10-18 | Merge "Add datarate test for vp8 ROI." | Jerome Jiang
2017-10-18 | Add datarate test for vp8 ROI. | Jerome Jiang
BUG=webm:1470 Change-Id: Icbc848837e64eacc49491dcc26b4c5802af2ee13
2017-10-18 | Merge "vp8: Enable use of ROI map." | Jerome Jiang
2017-10-18 | Merge "Refactor x86/vpx_subpixel_8t_intrin_avx2.c" | Kyle Siefring
2017-10-18 | Merge "vp8: [loongson] optimize idct with mmi" | Shiyou Yin
2017-10-17 | vp8: Enable use of ROI map. | Jerome Jiang
Disable cyclic refresh if ROI is used and add flag to properly handle the static_thresh deltas. Remove the ROI test for cyclic refresh (it's allowed but disabled if ROI is used). Add an example in vpx_temporal_svc_encoder.c. Turned off by default. BUG=webm:1470 Change-Id: Ief9ba1d7f967bc00511b412b491c3f70943bfbda
2017-10-17 | Merge changes I17fff122,Ic149e3cb | Linfeng Zhang
* changes:
  Add 4 to 3 scaling SSSE3 optimization
  Test extreme inputs in frame scale functions
2017-10-17 | Merge "Generalize CheckScalingFiltering in ConvolveTest" | Linfeng Zhang
2017-10-17 | Refactor x86/vpx_subpixel_8t_intrin_avx2.c | Kyle Siefring
Change-Id: I6539111dfb35a43028e9755785b2e9ea31854305
2017-10-17 | vp8: [loongson] optimize idct with mmi | Shiyou Yin
1. vp8_dequant_idct_add_y_block_mmi
2. vp8_dequant_idct_add_uv_block_mmi
Change-Id: I9987147be2685ac79d4b045d1d56f6709ee1223c
2017-10-16 | Add 4 to 3 scaling SSSE3 optimization | Linfeng Zhang
Note that this change triggers a different C version on SSSE3 and generates different scaled output. It is 2x as fast as the version that calls vpx_scaled_2d_ssse3(). Change-Id: I17fff122cd0a5ac8aa451d84daa606582da8e194
2017-10-13 | Adjust threshold in gf_boost for 1 pass vbr | Marco
Small increase to sad_thresh1, which avoids some false detection of possible scene changes within lag. Small improvement in a few clips on ytlive, otherwise neutral change. Change-Id: Ia79b53bb657bbce65a7aac7d20666b6373d5af8b
2017-10-13 | Merge "Further Corpus VBR change." | Paul Wilkins
2017-10-13 | Merge "Corpus Wide VBR test implementation." | Paul Wilkins
2017-10-13 | Corpus VBR tweak for undershoot. | paulwilkins
In cases of strong undershoot adjust Q range down faster. Change-Id: I84982beceb3c9b6dc50e52e4a6e891c7dd395d03
2017-10-13 | Merge "vp8: [loongson] optimize dct with mmi" | Shiyou Yin
2017-10-12 | Merge "Adjust to scene detection for 1 pass vbr." | Marco Paniconi
2017-10-12 | Adjust to scene detection for 1 pass vbr. | Marco
Expose the threshold for setting key frame on cut, and increase it for speed 5. Also small adjustment to min_thresh. No change in overall metrics or fps. Small quality improvement and lower encode time on scene cuts. Change-Id: I36e06ff3b26b6c29aede39c23fce454525fc9026
2017-10-12 | Merge "vp9: use nonrd pick_intra for small blocks on keyframes." | Jerome Jiang
2017-10-12 | Merge changes I38783d97,If5160c0c | Kyle Siefring
* changes:
  Extend 16 wide AVX2 convolve8 code to support averaging.
  Add AVX2 version of vpx_convolve8_avg.
2017-10-12 | Increase precision of some debug stats output for corpus VBR. | paulwilkins
Change-Id: I75841797cc0c215781b5b36e3a3e9f4b0e35ba63
2017-10-11 | vp9: use nonrd pick_intra for small blocks on keyframes. | Jerome Jiang
Keyframe encoding is more than 2x faster. Disabled on Speed 8. Change-Id: I2157318b6ac8253fa5398322c72d98cd7fa9b2b6
2017-10-12 | vp8: [loongson] optimize dct with mmi | Shiyou Yin
1. vp8_short_fdct4x4_mmi
2. vp8_short_fdct8x4_mmi
3. vp8_short_walsh4x4_mmi
Change-Id: I89a7df25cfd09fae309fac257ad8b6a3dc1c8acb
2017-10-12 | Merge "vp8: [loongson] optimize quantize with mmi" | Shiyou Yin
2017-10-11 | Adjust threshold in datarate tests for 1 pass VBR | Marco
Small increase in threshold for the 1 pass VBR datarate tests. Needed due to commit: <017257a Adjustment to scene detection and key frame> Change-Id: I28b3bd7db2192a8cc2bccc3cb0e3b8dbb910ca16
2017-10-11 | Test extreme inputs in frame scale functions | Linfeng Zhang
Change-Id: Ic149e3cb59be2ee0f98a3fcfd83226ad5ea30c99
2017-10-11 | Prevent double application of min rate in two pass. | paulwilkins
The initial allocation of bits in the two pass code to each frame should be within the min max limits on the command line. However, when forming an ARF group the cost of the ARF is shared by frames in that group such that the residual bits for a frame could drop below the min value.
This change prevents the minimum being re-applied after the cost of the ARF has been deducted as this may otherwise cause low rate sections to overshoot their target.
Test runs comparing to a baseline run with min and max section pct 0-2000% vs one closer to the YT use case (50-150%) suggest that this fix not only results in better rate control but also gives a better rd outcome.
For example the HD set vs 0-2000% baseline (opsnr, ssim).
Old code (50-150): +0.751, +1.099
New code (50-150): +0.241, -0.009
Change-Id: I715da7b130bf53ba8aa609532aa9e18b84f5e2ef
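The effect described in this commit message can be illustrated with a small sketch. It is not the libvpx two pass code: the function names, the integer bit counts, and the even split of the ARF cost across frames are all assumptions made for illustration.

```python
# Hypothetical illustration of the commit message above; none of these
# names come from libvpx.

def frame_bits_after_arf(target_bits, arf_share, min_bits, max_bits, reapply_min):
    """Return the bits left for one frame after its share of the ARF cost."""
    # The initial allocation is clamped to the command-line min/max limits.
    bits = max(min_bits, min(max_bits, target_bits))
    # The frame then gives up its share of the ARF cost, which can take it
    # below the minimum.
    bits = max(0, bits - arf_share)
    # Old behaviour (assumed): clamp to the minimum a second time.
    if reapply_min:
        bits = max(bits, min_bits)
    return bits


def group_total(frame_targets, arf_cost, min_bits, max_bits, reapply_min):
    """Total bits spent on an ARF group: the ARF itself plus its frames."""
    # Assumption: the ARF cost is shared evenly between the frames.
    share = arf_cost // len(frame_targets)
    frames = [
        frame_bits_after_arf(t, share, min_bits, max_bits, reapply_min)
        for t in frame_targets
    ]
    return arf_cost + sum(frames)
```

With four frames of 1000 bits each, a minimum of 900 and an ARF cost of 1200, applying the minimum once keeps the group at its 4000-bit budget, while re-applying it after the deduction pushes the group to 4800 bits, which is the kind of overshoot the commit describes.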
2017-10-11 | vp8: [loongson] optimize quantize with mmi | Shiyou Yin
1. vp8_fast_quantize_b_mmi
2. vp8_regular_quantize_b_mmi
Change-Id: Ic6e21593075f92c1004acd67184602d2aa5d5646
2017-10-10 | Add 4 to 1 scaling x86 optimization | Linfeng Zhang
Change-Id: I51c190f0a88685867df36912522e67bdae58a673
2017-10-10 | Merge "Fix alignment in vpx_image without external allocation." | Jerome Jiang
2017-10-10 | Fix alignment in vpx_image without external allocation. | Jerome Jiang
This restores behaviors prior to <40c8fde Fix image width alignment. Enable ImageSizeSetting test.>. BUG=b/64710201 Change-Id: I559557afe80d5ff5ea6ac24021561715068e7786
2017-10-10 | Generalize CheckScalingFiltering in ConvolveTest | Linfeng Zhang
Let it test extreme inputs and all filter types. In the future ConvolveTest should test regular 8-bit functions in high bitdepth mode. Change-Id: I1042564d1d390589ca203070fe332c6da3315d75
2017-10-10 | Adjustment to scene detection and key frame. | Marco
For 1 pass vbr: use higher threshold on avg_sad and force key frame under scene cut detection if above the threshold. Allow it for speed >= 6 for now, since it does not use the full nonrd_pickmode partition (as in speed 5). Improves quality somewhat on scene cut frames. Neutral on overall metrics and fps for speed 6 on ytlive set. Change-Id: I12626f7627419ca14f9d0d249df86c7104438162
2017-10-10 | Merge changes I9d4c1af5,I882da3a0 | Linfeng Zhang
* changes:
  Rename some inline functions in NEON scaling
  Generalize 2:1 vp9_scale_and_extend_frame_ssse3()
2017-10-10 | Further Corpus VBR change. | paulwilkins
Change to the bit allocation within a GF/ARF group. Normal VBR and CQ mode allocate bits to a GF/ARF group based on the mean complexity score of the frames in that group but then share bits evenly between the "normal" frames in that group regardless of the individual frame complexity scores (with the exception of the middle and last frames). This patch alters the behavior for the experimental "Corpus VBR" mode such that the allocation is always based on the individual complexity scores. Change-Id: I5045a143eadeb452302886cc5ccffd0906b75708
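The two allocation policies in this commit message can be sketched as follows. This is a simplified illustration, not the libvpx implementation: the function name and the `corpus_vbr` flag are hypothetical, and the special handling of the middle and last frames is left out.

```python
# Hypothetical sketch of the two policies; names are illustrative only.

def allocate_group_bits(group_bits, scores, corpus_vbr):
    """Split a GF/ARF group's bit budget between its "normal" frames."""
    n = len(scores)
    total = sum(scores)
    if not corpus_vbr or total == 0:
        # Normal VBR / CQ: share evenly, ignoring per-frame complexity.
        return [group_bits / n] * n
    # Corpus VBR: share in proportion to each frame's complexity score.
    return [group_bits * s / total for s in scores]
```

For a 1000-bit group with scores 1, 1 and 2, the even split gives each frame about 333 bits, while the complexity-based split gives 250, 250 and 500.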
2017-10-10 | Corpus Wide VBR test implementation. | paulwilkins
This patch makes further changes to support an experimental corpus wide VBR mode that uses a corpus complexity number as the midpoint of the distribution used to allocate bits within a clip, rather than some average error score derived from the clip itself. At the moment the midpoint number is hard wired for testing and the mode is enabled or disabled through a #ifdef. Ultimately this would need to be controlled by command line parameters. Change-Id: I9383b76ac9fc646eb35a5d2c5b7d8bc645bfa873
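A rough sketch of the midpoint idea, under loud assumptions: the commit message does not give the formula, so the power-law weighting, the `bias` parameter and both function names below are hypothetical. Only the choice between a clip-derived midpoint and a fixed corpus number comes from the text.

```python
# Hypothetical sketch; the weighting formula is an assumption, not libvpx's.

def distribution_midpoint(frame_errors, corpus_complexity=None):
    """Pick the midpoint used when weighting frames within a clip."""
    if corpus_complexity is not None:
        # Corpus VBR: a fixed, corpus-wide complexity number.
        return float(corpus_complexity)
    # Otherwise: an average error score derived from the clip itself.
    return sum(frame_errors) / len(frame_errors)


def frame_weights(frame_errors, corpus_complexity=None, bias=0.5):
    """Weight each frame by its error relative to the chosen midpoint."""
    mid = distribution_midpoint(frame_errors, corpus_complexity)
    return [(e / mid) ** bias for e in frame_errors]
```

Under this sketch, a clip measured against its own average always has weights centered around 1, whereas an easy clip measured against a larger corpus midpoint has every weight below 1.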
2017-10-09 | Extend 16 wide AVX2 convolve8 code to support averaging. | Kyle Siefring
Also adds vpx_convolve8_avg_horiz_avx2. Change-Id: I38783d972ac26bec77610e9e15a0a058ed498cbf
2017-10-09 | Rename some inline functions in NEON scaling | Linfeng Zhang
Change-Id: I9d4c1af53d57f72fc716bacbe3b0965719c045ac
2017-10-09 | Merge "Update vp9_scale_and_extend_frame_ssse3()" | Linfeng Zhang
2017-10-07 | Add AVX2 version of vpx_convolve8_avg. | Kyle Siefring
vpx_convolve8_avg works by first running a normal horizontal filter, then a vertical filter that performs the averaging at the end. The added vpx_convolve8_avg_avx2 calls pre-existing AVX2 code for the horizontal step. vpx_convolve8_avg_vert_avx2 is also added, but it only uses ssse3 code. Change-Id: If5160c0c8e778e10de61ee9bf42ee4be5975c983
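The two-step structure described above can be sketched in scalar form. This is an illustration only, not the AVX2 or C code from libvpx: the function names are made up, the 7-bit filter precision and 8-bit pixel range are assumptions, and averaging against the existing destination block is assumed to be what "averages" refers to.

```python
# Hypothetical scalar sketch; not the libvpx implementation.

FILTER_BITS = 7  # assumption: the 8 taps sum to 1 << 7 = 128


def clip_pixel(value):
    """Clamp to the 8-bit pixel range."""
    return max(0, min(255, value))


def filter_1d(samples, taps):
    """Apply the taps at every position where the full window fits."""
    n = len(taps)
    rounding = 1 << (FILTER_BITS - 1)
    return [
        clip_pixel(
            (sum(s * t for s, t in zip(samples[i:i + n], taps)) + rounding)
            >> FILTER_BITS
        )
        for i in range(len(samples) - n + 1)
    ]


def convolve8_avg(src, dst, taps_x, taps_y):
    """src has 7 more rows and columns than dst; returns the new dst block."""
    # Step 1: normal horizontal filter on every source row.
    rows = [filter_1d(row, taps_x) for row in src]
    # Step 2: vertical filter, run down each column of the intermediate.
    cols = [filter_1d(list(col), taps_y) for col in zip(*rows)]
    filtered = [list(row) for row in zip(*cols)]
    # Final step: rounded average with what is already in dst.
    return [
        [(d + f + 1) >> 1 for d, f in zip(dst_row, f_row)]
        for dst_row, f_row in zip(dst, filtered)
    ]
```

With identity taps (a single tap of 128 at index 3) the filtered block is just the source shifted by three rows and columns, so the output reduces to the rounded average of that source region and dst.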
2017-10-07 | Merge "ppc: Add vpx_idct32x32_1024_add_vsx" | James Zern
2017-10-06 | Merge "Revert "Speed >=5 real-time: add TM intra mode for high_source_sad."" | Marco Paniconi
2017-10-06 | Revert "Speed >=5 real-time: add TM intra mode for high_source_sad." | Marco Paniconi
This reverts commit 9311ef18b4b4eff0da3adf9d702a34f489a270ff.
Reason for revert: Notice small regression in some clips. Will revisit in another change.
Original change's description:
> Speed >=5 real-time: add TM intra mode for high_source_sad.
>
> Small/neutral change in metrics or speed for ytlive.
> Some improvement in quality on frames with big content change.
>
> Change-Id: Ib3b0703a5f28ea6710e90324436e27598ab7384d
TBR=marpan@google.com,builds@webmproject.org,jianj@google.com
Change-Id: I9d8ec5195bb05ddf329d325699355185affb9b13
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2017-10-06 | Adjust threshold in scene detection | Marco
For 1 pass vbr: increase min_thresh slightly, and also add condition on golden/arf update for using full nonrd_pick_partition. Reduces possible false detection for scene cut detection. Neutral/small change in metrics or speed for speed 5. Change-Id: I388f4d9a56e3cc763e0148338c1bc0381e58ad76