Age Commit message Author
2017-03-08Add vpx_highbd_idct32x32_135_add_c()Linfeng Zhang
When eob is less than or equal to 135 for high-bitdepth 32x32 idct, call this function. BUG=webm:1301 Change-Id: I8a5864f5c076e449c984e602946547a7b09c9fe6
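The eob-based dispatch described above can be sketched as follows. This is a minimal illustration, not the library's code: the enum and `pick_idct32_variant()` are hypothetical names, the 34 and 1024 tiers are inferred from the function names that appear elsewhere in this log, and the DC-only case for `eob == 1` is an assumption.

```c
#include <assert.h>

/* Hypothetical labels for the 32x32 inverse-transform variants. */
typedef enum {
  IDCT32_DC_ONLY, /* assumed special case: only the DC coefficient is set */
  IDCT32_34,
  IDCT32_135,
  IDCT32_1024
} Idct32Variant;

/* Pick the cheapest variant that still covers every nonzero coefficient.
 * eob (end of block) is the position just past the last nonzero
 * coefficient in scan order, so a smaller eob allows a cheaper transform. */
static Idct32Variant pick_idct32_variant(int eob) {
  assert(eob >= 1 && eob <= 1024);
  if (eob == 1) return IDCT32_DC_ONLY;
  if (eob <= 34) return IDCT32_34;
  if (eob <= 135) return IDCT32_135; /* the threshold named in this commit */
  return IDCT32_1024;
}
```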
2017-03-08cosmetics,dsp/arm/: vpx_idct32x32_{34,135}_add_neon()Linfeng Zhang
No speed changes and disassembly is almost identical. Change-Id: Id07996237d2607ca6004da5906b7d288b8307e1f
2017-03-08cosmetics,dsp/arm/: rename a variableLinfeng Zhang
Rename cospi_6_26_14_18N to cospi_6_26N_14_18N for consistency. Change-Id: I00498b43bb612b368219a489b3adaa41729bf31a
2017-03-07Merge "tiny_ssim.c : adds y4m support to tiny_ssim."James Bankoski
2017-03-07tiny_ssim.c : adds y4m support to tiny_ssim.Jim Bankoski
Change-Id: I7a13b7e3a1e11ddbe4be3009edf03528e1bc7647
2017-03-04Merge "vp8_create_decoder_instances: correct pbi[] memset"James Zern
2017-03-03Merge "Narrow cat6_high_cost tables to uint16_t"Alex Converse
2017-03-03vp8_create_decoder_instances: correct pbi[] memsetJames Zern
clear the entire array on error. the size used previously was equal to the number of elements. BUG=webm:1364 Change-Id: I2f2e16ed6e867f41d4774a5a8ac9cedaee11ce46
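The class of bug fixed here can be illustrated with a short sketch. The type `Decoder`, the constant `MAX_INSTANCES`, and both helper functions are hypothetical stand-ins, not the actual vp8 code; the point is only that `memset()` takes a byte count, not an element count.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_INSTANCES 8 /* hypothetical array length */

typedef struct Decoder Decoder; /* opaque stand-in for the decoder type */

/* Buggy: the size argument is the element count, so only the first
 * MAX_INSTANCES bytes of the pointer array are cleared. */
static void clear_slots_buggy(Decoder **slots) {
  assert(slots != NULL);
  memset(slots, 0, MAX_INSTANCES);
}

/* Fixed: scale the element count by the element size so the whole
 * array is cleared. */
static void clear_slots_fixed(Decoder **slots) {
  assert(slots != NULL);
  memset(slots, 0, MAX_INSTANCES * sizeof(slots[0]));
}
```

With the buggy version, the trailing slots keep stale pointers, which an error path could later free or dereference.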
2017-03-03Narrow cat6_high_cost tables to uint16_tAlex Converse
Saves 2688 bytes of rodata. Change-Id: I46633b6e50c2845181c70fff6273a8e58fdd1e56
2017-03-03Merge "vp9,realtime: Enable row multithreading for non-rd"Vignesh Venkatasubramanian
2017-03-02Merge "vp9: Speed 8: reduce the adaptive_rd_thresh level."Marco Paniconi
2017-03-02vp9: Speed 8: reduce the adaptive_rd_thresh level.Marco
Reduce the level from 4 to 2. This gives ~1-2% quality gain on RTC set, with a small decrease in speed (~1-2% on Mac). Change-Id: I7d959731badcee3d45b2f4a08efe378765016a13
2017-03-02vp9,realtime: Enable row multithreading for non-rdVignesh Venkatasubramanian
Enable row level multithreading for realtime encodes where non-rd path is used (speed >= 5). Change-Id: I5439cb49a02171166d8e1de06c7d5e6f8e819a41
2017-03-01Improve idct32x32_34_add SSSE3 intrinsics performanceYi Luo
- Split the transform into first half and second half.
- Reschedule the instructions to avoid stack spillover.
- Function level speed improves ~16%.
Change-Id: I166889840d23aa8a273eca00f6fbdae8b4566f35
2017-03-01Merge "VPX_CODEC_CAP_HIGHBITDEPTH for decoder interface"Chrome Cunningham
2017-02-28VPX_CODEC_CAP_HIGHBITDEPTH for decoder interfaceChris Cunningham
Moves the def from vpx_encoder.h -> vpx_codec.h. The defined value is changed as part of this move. Adds the value to decoder capabilities when CONFIG_VP9_HIGHBITDEPTH. Change-Id: I7d61fc821cda29f1e32bb9b2b9ffd3d83966e419
2017-02-28Revert "Fix for max qindex calculation of a gf interval"James Zern
This reverts commit d3db846cc50b1b0a9f6efcbe2b36c9c1943bc528. This change causes a large drop in PSNR (4-5 dB) on low framerate difficult content (tested at 360/480p). BUG=b/35804225 Change-Id: I8e90012d3b9c8a0cddb062ba93b01b36c0e0c0a0
2017-02-28vp9_ethread_test,cosmetics: s/new-mt/row-mt/James Zern
Change-Id: I8c145337adf49d30b88a17ff31501b8751ed1fa0
2017-02-28stress.sh: add vp9_stress_test_row_mtJames Zern
vp9_stress_test now forces --row-mt=0 to cover both versions Change-Id: I8d134879435bf1d8e76ab3fd89e698efba0e86b2
2017-02-28stress.sh: parameterize thread countJames Zern
Change-Id: Iae45266cea86585f0935af4012335198cf93719f
2017-02-28stress.sh: add one pass encodesJames Zern
Change-Id: I38e6c988f17c56fbfacd95378b27ef8d77c75f90
2017-02-28Add a comment in encoder thread testYunqing Wang
Added a comment. Change-Id: I82f71c72598ad6f1eaa0b57b0b8ec56ab9658e81
2017-02-28Set row_mt to 0 by defaultYunqing Wang
Set row_mt to 0 for now. Change-Id: I922536a6d71a765e435daeaf4d932ef14363d19a
2017-02-27vp9: Fix an issue with setting variance thresholds.Marco
From commit: https://chromium-review.googlesource.com/c/441393/ On non-segment the set_vbp_thresholds() should be called again to adjust thresholds based on content_state of superblock. This was the intended behavior from 441393. Small change in RTC metrics and speed. Change-Id: I45e5fbdc4af74db76b3cb4f13074fcae0eb2219e
2017-02-27vp9_ethread_test: Rename new_mt to row_mtVignesh Venkatasubramanian
Rename leftover occurrences of new_mt. Change-Id: Ib884e84c801fcd366ca4b57ec912ac5972023375
2017-02-27vp9: Rename new_mt to row_mtVignesh Venkatasubramanian
new_mt is a very generic name that will become obsolete soon enough. Since this is exposed as a codec control, rename it to row_mt to signify row level parallelism. Also rename the ETHREAD_BIT_MATCH codec control to ROW_MT_BIT_EXACT. Change-Id: Ic7872d78bb3b12fb4cf92ba028ec8e08eb3a9558
2017-02-24Remove an old leftover commentYunqing Wang
Removed an old comment that wasn't true anymore. Change-Id: I286ad8d7cb2843070a55e45a599d26bc226d6bd7
2017-02-24get_prob(): rationalize int typesJames Zern
promote the unsigned int calculation to uint64_t rather than int64_t for type consistency Change-Id: Ic34dee1dc707d9faf6a3ae250bfe39b60bef3438
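A minimal sketch of the widening this commit describes, assuming a probability scaled to the range 1..255. The function name `scaled_prob()` and the explicit clamping are illustrative assumptions, not the library's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Scale num/den to an 8-bit probability clamped to [1, 255]. */
static uint8_t scaled_prob(unsigned int num, unsigned int den) {
  assert(den != 0);
  /* Widen before multiplying: num * 256 can overflow 32 bits. Both
   * operands are unsigned, so uint64_t keeps the whole expression in one
   * signedness instead of mixing in int64_t. The (den >> 1) term rounds
   * to nearest. */
  const uint64_t p = ((uint64_t)num * 256 + (den >> 1)) / den;
  if (p < 1) return 1;
  if (p > 255) return 255;
  return (uint8_t)p;
}
```

With large counts such as `num = 0x80000000`, a 32-bit multiply would wrap to zero, while the widened form yields the expected midpoint.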
2017-02-24Merge "Improve VP9 encoder threading test for better coverage"Yunqing Wang
2017-02-24Improve VP9 encoder threading test for better coverageYunqing Wang
Re-organized the encoder threading tests and grouped tests into 4 parts. Added PSNR checking test to make sure the PSNR variation is within a small range. BUG=webm:1376 Change-Id: I09edb990236a87a4d2b2b0e1ceaf6c6435a35eff
2017-02-24Merge "Make vp9_scale_and_extend_frame_ssse3 work for hbd when bitdepth = 8."Jerome Jiang
2017-02-24consolidate block_error functionsJohann
vp9_highbd_block_error_8bit_c was a very simple wrapper around vp9_block_error_c. The SSE2 implementation was practically identical to the non-HBD one. It was missing some minor improvements which only went into the original version. In quick speed tests, the AVX implementation showed minimal improvement over SSE2 when it does not detect overflow. However, when overflow is detected the function is run a second time. The OperationCheck test seems to trigger this case and reverses any speed benefits by running ~60% slower. AVX2 on the other hand is always 30-40% faster. Change-Id: I9fcb9afbcb560f234c7ae1b13ddb69eca3988ba1
2017-02-24Merge "block error sse2: use tran_low_t"Johann Koenig
2017-02-23Make vp9_scale_and_extend_frame_ssse3 work for hbd when bitdepth = 8.Jerome Jiang
Only works for bitdepth = 8 when compiled with high bitdepth flag. 4x speed ups for handling 1:2 down/upsampling. Validated manually for: 1) Dynamic resize for a single layer encoding 2) SVC encoding with 3 spatial layers Results are bitexact with the patch and the speed gain (~4x) in the scaling was verified. BUG=webm:1371 Change-Id: I1bdb5f4d4bd0df67763fc271b6aa355e60f34712
2017-02-24block error sse2: use tran_low_tJohann
Change-Id: Ib04990e4a7bda9fbf501f294da2057a2b2595deb
2017-02-23Merge "vp8_fdct4x4 test: fix segfault again"Johann Koenig
2017-02-23Merge "vp9: 1pass CBR: modify condition for reducing loop filter."Marco Paniconi
2017-02-22Merge "vp9: Non-rd pickmode: use simple block_yrd under some conditions."Jerome Jiang
2017-02-22vp9: 1pass CBR: modify condition for reducing loop filter.Marco
The reduction showed improvement on RTC when aq-mode=3 is on. Add that (cyclic refresh enabled) to the condition. Only affects 1 pass CBR. Change-Id: I5d0843002d8e31d7c165098a62e7a71146b08664
2017-02-22vp9: Non-rd pickmode: use simple block_yrd under some conditions.Marco
For speed 8 only. 3% speed up for QVGA and 6.3% for VGA on Nexus 6. ~3% avgPSNR decrease on rtc_derf and 2.9% on rtc. Disabled for now. Change-Id: I70133f1f6c804d663d594df437bfe7fdb0030d6a
2017-02-22Merge "vp9: aq-mode=3: On key frame reset cr->reduce_refresh to 0."Marco Paniconi
2017-02-22vp9: aq-mode=3: On key frame reset cr->reduce_refresh to 0.Marco
This prevents a possible reduction of cyclic refresh after a key frame. Change-Id: Idd4e49b69cd95476e7eccfa31b2bd8669569e9e8
2017-02-22vp8_fdct4x4 test: fix segfault againJohann
The output needs to be aligned. Input is read with 'movq', not 'movdqa', so it is not expected to be aligned. Change-Id: Ibd48a84c1785917a6a97c3689a05322abba486b4
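The alignment requirement can be sketched as below, assuming 16-byte alignment for the output and C11 `alignas`. The struct `FdctBuffers` and the helper functions are hypothetical and are not taken from the test; `write_output()` only stands in for a routine that stores its result with aligned instructions.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

enum { BLOCK_COEFFS = 16 }; /* a 4x4 block */

/* Returns nonzero when p sits on a 16-byte boundary. */
static int is_aligned_16(const void *p) { return ((uintptr_t)p & 15u) == 0; }

/* Hypothetical test buffers: only the output carries an alignment
 * requirement; the input may live at any int16_t-aligned address. */
typedef struct {
  int16_t input[BLOCK_COEFFS];
  alignas(16) int16_t output[BLOCK_COEFFS];
} FdctBuffers;

/* Stand-in for a routine that stores its result with aligned writes:
 * it checks the precondition, then copies the block. */
static void write_output(int16_t *output, const int16_t *input) {
  assert(is_aligned_16(output));
  memcpy(output, input, BLOCK_COEFFS * sizeof(*output));
}
```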
2017-02-22vp9: Only compute y_sad for golden in variance partition for speed < 8.Jerome Jiang
Only affects speed 8. No obvious quality regression. Systematic speed ups by ~1% on Nexus 6. Change-Id: Ia904ca28ea041c3281c532911ec38fb7d7f46a17
2017-02-22Merge "Refactored the row based multi-threading code"Yunqing Wang
2017-02-22Merge "Fix segmentation fault caused by denoiser working with spatial SVC."Jerome Jiang
2017-02-21vp9: Incorporate source sum_diff into non-rd partition thresholds.Marco
Increase the variance partition thresholds for superblocks that have low sum-diff (from source analysis prior to encoding frame). Use it for now only for speed >= 7 or for denoising on. Small change on metrics for rtc set: less than ~0.1 avgPSNR decrease on RTC set, for both speed 7 and 8. Change-Id: I38325046ebd5f371f51d6e91233d68ff73561af1
2017-02-21Following SSSE3 intrinsics functions also work for HBDYi Luo
- vpx_idct8x8_12_add_ssse3
  vpx_idct8x8_64_add_ssse3
  vpx_idct32x32_34_add_ssse3
  vpx_idct32x32_135_add_ssse3
  vpx_idct32x32_1024_add_ssse3
- turn on unit tests.
Change-Id: I788b2b3b2074a6f3ab6a0e6f469c1327a123eff7
2017-02-21Merge "Drop zbin_ptr and quant_shift_ptr"Johann Koenig
2017-02-21Fix segmentation fault caused by denoiser working with spatial SVC.Jerome Jiang
Re-enable the affected test. BUG=webm:1374 Change-Id: I98cd49403927123546d1d0056660b98c9cb8babb