Age  Commit message  Author
2018-10-30Merge "Properly space qp in q mode for multi-layer ARF"Jingning Han
2018-10-29Properly space qp in q mode for multi-layer ARFJingning Han
Space the quantization parameter distribution according to the layer depth for multi-layer ARF coding structure. This allows lower layers to have relatively smaller quantization parameters than higher layers. It improves the compression performance in constant q mode for multi-layer ARF system:

         avg PSNR   overall PSNR   SSIM
lowres   -0.33%     -0.31%         -1.44%
midres   -0.29%     -0.38%         -1.14%
hdres    -0.27%     -0.49%         -1.02%

Change-Id: I9cfe2f27e6c0029c30614970a46de3045840264e
2018-10-29Merge "vp8 bilinear: ensure non-16x16 arrays are aligned"Johann Koenig
2018-10-29vp8 bilinear: ensure non-16x16 arrays are alignedJohann Koenig
The 16x16 array was changed to be aligned. The 8xN and 4x4 functions use aligned loads/stores on their internal arrays as well. BUG=webm:1570 Change-Id: I9cfe53d7c8ed76e8854c2688eb9a509b876471d8
2018-10-29Merge "vp8 bilinear: ensure temp array is aligned"Johann Koenig
2018-10-29Merge "Enable 10 bit tpl support"Sai Deng
2018-10-29vp8 bilinear: ensure temp array is alignedJohann
Loads and stores to this array require 16 byte alignment. BUG=webm:1570 Change-Id: I82c7d21c9539a108930fd030d79caaa0bcd1eeb3
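The alignment requirement described above can be illustrated with a minimal C11 sketch. It is not the libvpx code: the helper names `is_aligned_16` and `fill_aligned_temp`, the buffer size, and the use of `alignas` from `<stdalign.h>` are all assumptions made for illustration. The sketch only shows how a stack array can be forced onto a 16-byte boundary and how that property can be checked before aligned loads and stores touch it.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 when ptr sits on a 16-byte boundary,
 * which is what aligned 128-bit SIMD loads and stores expect. */
static int is_aligned_16(const void *ptr) {
  return ((uintptr_t)ptr & 15) == 0;
}

/* Hypothetical stand-in for a filter's temporary array. alignas(16)
 * requests the 16-byte alignment; without it the compiler is free to
 * place a byte array on any boundary. */
static int fill_aligned_temp(uint8_t value) {
  alignas(16) uint8_t temp[16 * 17];
  assert(is_aligned_16(temp)); /* precondition for aligned access */
  memset(temp, value, sizeof(temp));
  return temp[0] + temp[sizeof(temp) - 1];
}
```

An unaligned array might still work by accident when the compiler happens to place it on a suitable boundary, which is why this class of bug tends to appear only with certain compilers or optimization levels.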
2018-10-29Merge "remove "register" keyword"Johann Koenig
2018-10-27Merge "Remove unused macros from vp9_firstpass.c"Jingning Han
2018-10-26Enable 10 bit tpl supportsdeng
           lowres_bd10   midres_bd10
avg_psnr   -0.897        -1.261
ovr_psnr   -0.975        -1.349

Change-Id: Id54f2c419f4edaa91e89ffea52b4038b1d94e563
2018-10-26remove "register" keywordJohann
This has been deprecated for a long time. C++17 is trying to recover the name. Change-Id: Iade6bebce03a50b76061695f9e634a107cd989cd
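As a small illustration of this kind of cleanup, the sketch below shows a loop whose local variable could once have carried the `register` storage class. The function `sum_bytes` is hypothetical and is not taken from the libvpx sources. Dropping the keyword leaves the behavior unchanged, since modern compilers perform register allocation on their own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example function, not from libvpx. */
static int sum_bytes(const unsigned char *src, size_t n) {
  assert(src != NULL || n == 0);
  /* Older code might have declared this as: register int sum = 0;
   * The plain declaration below compiles to equivalent code. */
  int sum = 0;
  for (size_t i = 0; i < n; ++i) {
    sum += src[i];
  }
  return sum;
}
```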
2018-10-26Merge "Add Memory to Enable Row Decode"Harish Mahendrakar
2018-10-26Remove unused macros from vp9_firstpass.cJingning Han
Change-Id: If5267a8c71113b171b7bddda5b49f0326c4266b8
2018-10-25vp8 bilinear: rewrite 4x4Johann
~20% faster than the MMX version. Removes the last usage of vp8_bilinear_filters_x86_[48]. Change-Id: Iee976fab9655d0020440f26c4403ce50103af913
2018-10-25Merge "vp8 bilinear: rewrite 16x16"Johann Koenig
2018-10-25Merge "Add AVX2 support for 4-tap interpolation filter."Chi Yo Tsai
2018-10-25vp8 bilinear: rewrite 16x16Johann
Marginally faster. Most importantly it drops a dependency on an external symbol (vp8_bilinear_filters_x86_8). Change-Id: Iff022e718720f1f0eeced6201a1ad69a9c9c4f45
2018-10-25Merge "vp8 bilinear: rewrite in intrinsics"Johann Koenig
2018-10-25Add Memory to Enable Row DecodeRitu Baldwa
Row-based multi-threading needs extra memory to store the parsed coefficients, partitions and eob. This commit adds memory for the same. Change-Id: I13fa4a6ada2ec3048bc973e465055b832429388f
2018-10-25Merge "Enable tpl model to support multi-layer ARF"Jingning Han
2018-10-25Merge "Reset frame update flags after qp estimate in tpl"Jingning Han
2018-10-25Merge "Bypass processing on use existing frame"Jingning Han
2018-10-25Merge "Fix frame offset computation for GOP extension"Jingning Han
2018-10-25Merge "Refactor gop_length use case in tpl model"Jingning Han
2018-10-24vp8 bilinear: rewrite in intrinsicsJohann
8x8 is 15% faster than the assembly. 8x4 is 200% faster than MMX. Remove MMX version. Change-Id: I55642ebd276db265911f2c79616177a3a9a7e04f
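For readers unfamiliar with what the intrinsics compute, a scalar sketch of one horizontal bilinear pass follows. It is an illustration and not the rewritten code: the names `kBilinearTaps` and `bilinear_row` are hypothetical, and the sketch assumes the usual 2-tap table for eight 1/8-pel offsets, where each pair of weights sums to 128 and the result is rounded with a 7-bit shift.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 2-tap weights for the eight 1/8-pel offsets; each pair sums to 128. */
static const int16_t kBilinearTaps[8][2] = {
  { 128, 0 }, { 112, 16 }, { 96, 32 }, { 80, 48 },
  { 64, 64 }, { 48, 80 },  { 32, 96 }, { 16, 112 }
};

/* Hypothetical scalar helper: horizontal bilinear pass over one row.
 * Reads width + 1 source pixels and writes width output pixels. */
static void bilinear_row(const uint8_t *src, uint8_t *dst, int width,
                         int xoffset) {
  assert(xoffset >= 0 && xoffset < 8);
  const int16_t *taps = kBilinearTaps[xoffset];
  for (int x = 0; x < width; ++x) {
    const int sum = src[x] * taps[0] + src[x + 1] * taps[1];
    dst[x] = (uint8_t)((sum + 64) >> 7); /* add half, then divide by 128 */
  }
}
```

A SIMD version processes many of these multiply-add-round steps per instruction, which is where block sizes such as 8x8 and 8x4 gain their speed.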
2018-10-24Merge "Clean up vpx_dsp/x86/convolve_sse2.h"Chi Yo Tsai
2018-10-23Enable tpl model to support multi-layer ARFJingning Han
Enable temporal dependency model for the base layer ARF. It improves the multi-layer ARF compression performance (results are tested in speed 0 vbr mode):

         avg PSNR   overall PSNR   SSIM
lowres   -0.40%     -0.46%         -0.32%
midres   -0.59%     -0.68%         -0.45%
720p     -0.55%     -0.59%         -1.07%

Change-Id: I7790b89ccfb6e61f9b7965f34d348c7440220dd0
2018-10-24Add AVX2 support for 4-tap interpolation filter.chiyotsai
Performance:

      | 4X4 | 8X8 |16X16|64X64|
 2 DIM|1.491|1.902|1.772|1.479|
  HORZ|1.145|1.521|1.757|1.497|
  VERT|1.176|1.614|1.707|1.467|

Each number in the chart above is 8-tap function time / 4-tap function time. The framerate tested on jets.y4m for 100 frames on speed 1 increased from 3.72 fps to 3.91 fps (about 5% increase).

Change-Id: Ic0ad275cf32fafeefd0a89811badd8adff2134a0
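The two figures quoted above, the per-block time ratio and the framerate gain, follow from simple arithmetic. The sketch below spells it out; the function names `speedup_ratio` and `percent_increase` are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Ratio as defined for the chart: 8-tap time divided by 4-tap time.
 * A value above 1.0 means the 4-tap function is the faster one. */
static double speedup_ratio(double time_8tap, double time_4tap) {
  assert(time_4tap > 0.0);
  return time_8tap / time_4tap;
}

/* Relative change from before to after, expressed in percent. */
static double percent_increase(double before, double after) {
  assert(before > 0.0);
  return (after - before) / before * 100.0;
}
```

With the framerates from the commit message, `percent_increase(3.72, 3.91)` evaluates to roughly 5.1, which matches the stated "about 5% increase".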
2018-10-23Clean up vpx_dsp/x86/convolve_sse2.hchiyotsai
Removes unnecessary includes and rewords some functions/comments. Change-Id: Ied557d7faa9d845d38255e6e3e0e3fe1395276e1
2018-10-23Reset frame update flags after qp estimate in tplJingning Han
After the frame quantizer estimate runs in the tpl model, reset the actual value assigned to the current coding frame. This avoids certain frame update flags being overwritten by a different frame type's update. Change-Id: Idde2ba1108f1f68747b14149b211f882965c99f0
2018-10-23Merge "Use 8-tap interp filter in temporal filtering"Yunqing Wang
2018-10-23Use 8-tap interp filter in temporal filteringYunqing Wang
Used the 8-tap interp filter in temporal filtering to achieve more accurate motion search results. Using 8-tap sharp gave slightly better results than using 8-tap regular. Speed 0 borg test showed that:

          avg_psnr:   ovr_psnr:   ssim:
hdres:    -0.160      -0.157      -0.173
midres:   -0.083      -0.061      -0.183
lowres:   -0.077      -0.099      -0.204

Speed tests showed no noticeable encoder time changes.

Change-Id: I97dc3c4864b5a5675a6c1e3952799b81eedd7d93
2018-10-23Merge "Remove empty else branch in mode_estimation"Jingning Han
2018-10-23Bypass processing on use existing frameJingning Han
The use of show existing frame requires no further operation on that coding frame. Bypass the corresponding process. Change-Id: Ia092027a8a543be0ca54c00b4d51e453039712b8
2018-10-23Fix frame offset computation for GOP extensionJingning Han
Properly compute the extended GOP frames' buffer offsets. Change-Id: I9aed14f4b8d623f1832e782828dce07aa546507d
2018-10-23Refactor gop_length use case in tpl modelJingning Han
Make it support both single- and multi-layer ARF GOP structures. Change-Id: I760a95804d1b583b057120f6d6be65195a0e6c19
2018-10-23Remove empty else branch in mode_estimationJingning Han
Change-Id: Iefa184aae80b920b054e3e922a77244c2b0d4b61
2018-10-23Merge "Use the proper gfu_boost factor to compute rd_mult"Jingning Han
2018-10-22Use the proper gfu_boost factor to compute rd_multJingning Han
Update the Lagrangian multiplier according to the gfu_boost factor assigned per frame. It improves the multi-layer ARF compression performance (results below shown for speed 0):

         avg PSNR   overall PSNR   SSIM
lowres   -0.08%      0.02%         -0.28%
midres   -0.08%      0.03%         -0.22%
hdres    -0.19%     -0.10%         -0.39%
nflx2k   -0.29%     -0.18%         -0.85%

Change-Id: Ifeb4b14918f880ba011ea41c1454ab00504f8855
2018-10-19Merge "ML_VAR_PARTITION: enable at speed 5"Hui Su
2018-10-18ML_VAR_PARTITION: enable at speed 5Hui Su
When the ML_VAR_PARTITION experiment is turned on, replace REFERENCE_PARTITION with ML_BASED_PARTITION at speed 5.

Coding gains (avg_psnr) compared to baseline:
ytlivehr  1.63%
ytlivelr  0.07%

Tested encoding speed with several clips from ytlivehr and ytlivelr on linux desktop (rt, vbr, 4 threads). Encoder speed is on average faster than baseline:
360p:   14% faster
720p:   7% faster
1080p:  1.5% faster

Change-Id: I39b00078176ff516f7306818f33ba2b1ea53dfa1
2018-10-18Changes 4-tap SSSE3 filter to 8-tap AVX2 filter.chiyotsai
AVX2's 8-tap filter is slightly faster than the 4-tap SSSE3 filter. Change-Id: I5fc37c431670780108706b206b32c791828555c9
2018-10-18Merge "Add SSSE3 support for 4-tap interpolation filter"Chi Yo Tsai
2018-10-18Merge "Enable rect partition search for HBD at speed 1"Hui Su
2018-10-18Add SSSE3 support for 4-tap interpolation filterchiyotsai
Performance:

      | 4X4 | 8X8 |16X16|64X64|
 2 DIM|1.526|1.827|1.844|1.906|
  HORZ|1.336|1.795|1.886|1.654|
  VERT|1.443|1.539|2.139|2.190|

The ratio is SSSE3 8-tap time / SSSE3 4-tap time.

Change-Id: I01ed2ab494428256e918875774a459afecc5ec6a
2018-10-18Merge "Replace MAX_LAG_BUFFERS with MAX_ARF_GOP_SIZE for gop size"Jingning Han
2018-10-17Merge "Optimize vp9_highbd_temporal_filter_apply_c"Yunqing Wang
2018-10-17Replace MAX_LAG_BUFFERS with MAX_ARF_GOP_SIZE for gop sizeJingning Han
MAX_ARF_GOP_SIZE accurately reflects the maximum number of frames operated on per group of pictures. Use that to replace MAX_LAG_BUFFERS in such use cases. Change-Id: Id26f9b1b2b0c38f255dee19795356c387d06d033
2018-10-17Merge changes I6d5c77af,I6bf504b4,Ie5dc5ea7,Ie6024b1a,If45fba8a, ...Angie Chiang
* changes:
  Add do_motion_search
  Preserve code of doing mv search in raster order
  Variant implementation of changing mv search order
  Add feature_score_loc_sort
  Init mv_[dist/cost]_sum in init_tpl_stats
  Change mv search order according to feature_score
2018-10-17Add do_motion_searchAngie Chiang
This will make the code cleaner. Change-Id: I6d5c77af7261c39656b35ec40ac1451bbdbfb7a7