path: root/vp9/encoder/vp9_mcomp.c
Age, Commit message, Author
2018-08-02Refactor vp9_full_pixel_search()Hui Su
Code cleanup; add some comments. Also remove a redundant call to vp9_get_mvpred_var() at the end when method is MESH. Change-Id: I4b58e7e1c42161642708f8b0342ab3c0ce39ed7d
2018-08-01Use mesh full pixel motion search to build the source ARFJingning Han
Append mesh search to the diamond shape search to refine the full pixel motion estimation for source ARF generation. It improves the average compression performance.

Speed 0
        avg PSNR  overall PSNR  SSIM
mid     -0.18%    -0.18%        -0.22%
hd      -0.25%    -0.23%        -0.36%
nflx2k  -0.22%    -0.23%        -0.37%

Speed 1
        avg PSNR  overall PSNR  SSIM
mid     -0.10%    -0.08%        -0.11%
hd      -0.25%    -0.27%        -0.38%
nflx2k  -0.20%    -0.20%        -0.34%

The additional encoding time is close to the sample noise range. For bus_cif at 1000 kbps, the speed 0 encoding time goes from 83.0 s -> 83.6 s.

Change-Id: I48647f50ec3e8f7ae4550a4bde831f569f46ecf3
2018-05-01Clean switch cases in vp9 encoderLinfeng Zhang
To save a branch. Change-Id: Ifa2be7583e95c6991784731c654bbd4cce31e993
2018-04-26Respect MV limit in vp9_int_pro_motion_estimation()Hui Su
Change-Id: I08cb072a32e06c6452eca068b2f7ef7287f221e6
2018-04-03rm CONVERT_TO_SHORTPTR in vpx_highbd_comp_avg_predLinfeng Zhang
BUG=webm:1388 Change-Id: I1d0dd9af52a1461e3e2b2d60e8c4b6b74c3b90b0
2018-01-18clang-format v5.0.0 vp9/Johann
Remove trailing commas to keep multiple elements on one line. Add blank lines to prevent comments from being treated as blocks. clang-format guards for struct with a comment in the middle. Change-Id: I3bcb8313ae8aaf69179249a13b4087b1272cdbc0
2017-07-10remove vp9_full_sad_searchJohann
This code is unused in vp9. Only vp8 still contains references to vpx_sad_NxMx[3|8] and only for sizes 16x16, 16x8, 8x16, 8x8 and 4x4. Remove the remaining sizes and all the highbitdepth versions. BUG=webm:1425 Change-Id: If6a253977c8e0c04599e25cbeb45f71a94f563e8
2017-06-05vp9_mcomp,get_cost_surf_min: quiet conversion warningJames Zern
visual studio will warn if a 32-bit shift is implicitly converted to 64. in this case integer storage is enough for the result. since: f3a9ae5ba Fix ubsan failure in vp9_mcomp.c. Change-Id: I7e0e199ef8d3c64e07b780c8905da8c53c1d09fc
2017-06-06Merge "Fix valgrind failure on uninitialized variables."Jerome Jiang
2017-06-05Fix valgrind failure on uninitialized variables.Jerome Jiang
BUG=webm:1440 Change-Id: I7074e42bdfa8dd25f11bbb3f2ab1b41d6f4c12e4
2017-06-02Fix ubsan failure in vp9_mcomp.c.Jerome Jiang
Change-Id: Iff1dea1fe9d4ea1d3fc95ea736ddf12f30e6f48d
2017-04-21Make allow_exhaustive_searches feature no longer adaptiveYunqing Wang
A previous patch turned on the allow_exhaustive_searches feature only for FC_GRAPHICS_ANIMATION content. This patch further modified the feature by removing the exhaustive search limit, and made it no longer adaptive. As a result, the 2 counts that recorded the number of motion searches were removed, which helped achieve determinism in row based multi-threaded encoding. Tests showed that this patch did not make the encoder much slower. Used exhaustive_searches_thresh for this speed feature, and removed allow_exhaustive_searches. Also, refactored the speed feature code to follow the general speed feature setting style. Change-Id: Ib96b182c4c8dfff4c1ab91d2497cc42bb9e5a4aa
2017-04-10Fix an integer overflow in vp9_mcomp.cYunqing Wang
The MV unit test revealed an integer overflow issue in vp9_mcomp.c. The overflow occurred when the MV was very large. In mv_err_cost(), when mv->row = 8184, mv->col = 8184 and ref_mv is 0, mv_cost = 34363 and error_per_bit = 132412, so their product no longer fits in a 32-bit signed integer. BUG=webm:1406 Change-Id: I35f8299f22f9bee39cd9153d7b00d0993838845e
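The arithmetic behind this overflow can be illustrated with a minimal sketch. It is not the code from the patch: the function name `scaled_mv_cost` and the shift amount `SKETCH_COST_SHIFT` are assumptions made for illustration only. The point it shows is that 34363 * 132412 = 4550073556, which exceeds the 32-bit signed maximum of 2147483647, so the product has to be widened to 64 bits before it is rounded and shifted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shift amount, chosen only for this sketch. The real encoder
 * derives its shift from several of its own constants. */
#define SKETCH_COST_SHIFT 14

/* Scale a motion vector bit cost by error_per_bit with rounding.
 * The cast widens the multiplication to 64 bits, so large inputs such as
 * mv_cost = 34363 and error_per_bit = 132412 do not overflow. */
static int64_t scaled_mv_cost(int mv_cost, int error_per_bit) {
  assert(mv_cost >= 0 && error_per_bit >= 0);
  const int64_t product = (int64_t)mv_cost * error_per_bit;
  const int64_t rounding = (int64_t)1 << (SKETCH_COST_SHIFT - 1);
  return (product + rounding) >> SKETCH_COST_SHIFT;
}
```

Calling `scaled_mv_cost(34363, 132412)` stays well defined, whereas multiplying the two values as plain `int` would be signed overflow, which is undefined behavior in C.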
2017-04-06VP9 motion vector unit testYunqing Wang
To prevent the motion vector out of range bug, added a motion vector unit test in VP9. In the 4k video encoding, always forced to use extreme motion vectors and also encouraged to use INTER modes. In the decoding, checked if the motion vector was valid, and also checked the encoder/decoder mismatch. The tests showed that this unit test could reveal the issue we saw before. Change-Id: I0a880bd847dad8a13f7fd2012faf6868b02fa3b4
2017-04-03Fix for out of range motion vector bug in sub-pel motion estimationRanjit Kumar Tulabandu
BUG=webm:1397

(yunqingwang) To verify that this patch wouldn't cause much performance change, the Borg tests were run. Here was the result:

         avg_psnr  overall_psnr  ssim
hdres:   -0.002    0.006         0.013
midres:  0         0             0
lowres:  0         0             0

Change-Id: Iae395ae7b741e0513cf5bab9dcace110b792a67d
2017-02-15Merge "Row based multi-threading of encoding stage"Yunqing Wang
2017-02-15Row based multi-threading of encoding stageRanjit Kumar Tulabandu
(Yunqing Wang) This patch implements the row-based multi-threading within tiles in the encoding pass, and substantially speeds up the multi-threaded encoder in VP9. Speed tests at speed 1 on the STDHD set (using 4 tiles) show that the average speedup of the encoding pass (the second pass in 2-pass encoding) is 7% while using 2 threads, 16% while using 4 threads, 85% while using 8 threads, and 116% while using 16 threads. Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de
2017-02-14apply clang-formatclang-format
Change-Id: I75e4a9e0b37bd4586f26c8d6c1fa27f3f6ff1bce
2017-02-07Row based multi-threading of ARNR filtering stageRanjit Kumar Tulabandu
Change-Id: Ic238d32c7e10b730342224ab56712a89a6026a8f
2017-02-01Merge "Changes to facilitate row based multi-threading of ARNR filtering"Yunqing Wang
2017-02-01Changes to facilitate row based multi-threading of ARNR filteringRanjit Kumar Tulabandu
Change-Id: I2fd72af00afbbeb903e4fe364611abcc148f2fbb
2017-01-31Fix real-time compression regression in hbd modeJingning Han
This commit resolves the compression performance regression in real-time encoding setting when high bit-depth mode is enabled. The current solution temporarily disables the SIMD implementations of vpx_satd, hadamard8x8, and hadamard16x16 in high bit-depth mode. The commit makes the coding results bit-wise identical between regular coding pipeline and high bit-depth at profile 0. BUG=webm:1365 Change-Id: Icfb900821733749685370460a1a5a7e07f76f4bf
2017-01-23Remove macro MVC in mcomp.cYunqing Wang
Removed MVC so that mv_err_cost() is always called while calculating the mv cost. Change-Id: I28123e05fbfc2352128e266c985d2ab093940071
2017-01-03Merge "Fix for out of range motion vector bug in joint motion search"Yunqing Wang
2017-01-03Fix for out of range motion vector bug in joint motion searchRanjit Kumar Tulabandu
Clamped the initial mv in vp9_refining_search_8p_c. BUG=webm:1354 Change-Id: I47d302b350937e3e6e52e95c983b5fb0b4c64fba
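Clamping a starting motion vector to the allowed search window can be sketched as follows. This is an illustration under assumptions, not the code from the patch: the types `SketchMv` and `SketchMvLimits` and the function `clamp_start_mv` are hypothetical names introduced here.

```c
#include <assert.h>

/* Hypothetical types for this sketch only. */
typedef struct { int row, col; } SketchMv;
typedef struct { int row_min, row_max, col_min, col_max; } SketchMvLimits;

/* Restrict v to the closed interval [lo, hi]. */
static int clamp_int(int v, int lo, int hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}

/* Return a copy of mv with each component forced inside the limits, so a
 * refining search never starts from an out-of-range position. */
static SketchMv clamp_start_mv(SketchMv mv, const SketchMvLimits *lim) {
  assert(lim->row_min <= lim->row_max && lim->col_min <= lim->col_max);
  SketchMv out;
  out.row = clamp_int(mv.row, lim->row_min, lim->row_max);
  out.col = clamp_int(mv.col, lim->col_min, lim->col_max);
  return out;
}
```

A vector that is already inside the limits is returned unchanged, so the clamp is safe to apply unconditionally before the search loop.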
2016-12-27Make sub-pixel mv search's return value consistent with the return typeYunqing Wang
For out-of-range cases, returned UINT_MAX instead of INT_MAX in the sub-pixel mv search to be consistent with the "uint32_t" return type. Change-Id: I8e206d771228c13d89bafbbe9f14722c8ecc6a7a
2016-09-15apply clang-formatclang-format
Change-Id: I501597b7c1e0f0c7ae2aea3ee8073f0a641b3487
2016-08-08Refactor mv limits.Alex Converse
Change-Id: Ifebdc9ef37850508eb4b8e572fd0f6026ab04987
2016-08-02vp9/encoder: apply clang-formatclang-format
Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2
2016-06-25s/UINT32_MAX/UINT_MAX/James Zern
provides better toolchain compatibility Change-Id: I8561a6de668a68ff54fe3886a4ee6300f0ae9c04
2016-06-24Rationalize type to avoid integer out of rangeYaowu Xu
BUG=webm:1250 Change-Id: Id5bb2762ca1bf996ba4f9a60eec977a7994c1d94
2016-06-21Fix ubsan warnings: vp9/encoder/vp9_mcomp.cYaowu Xu
This commit fixes a number of ubsan warnings in HBD build. BUG=webm:1219 Change-Id: I05f0fd0ef50e93db4ba34205005c54af1ed32acc
2016-05-10mcomp: Remove an obsolete undef.Alex Converse
The macro was removed in 6724676. Change-Id: I412c24aac49bd1ff60a331a30933e0d8ae3f2dd5
2016-05-10mcomp: Remove an obsolete comment.Alex Converse
This was copied over from VP8. VP9 doesn't seem to do this buffer copy. Change-Id: I28a8bbf0503a7f99b2cb60620ab3674adde863bb
2016-03-15Use whole pixel only at speed 8 screen content.Alex Converse
+5.857% BD-RATE on SCREEN_CONTENT

Leaving this off for non-screen content because:
+25.300% on TWITCH120
+37.833% BD-RATE on RTC

Change-Id: Ie0a312182d6cc859fb04298e4cd81d02b39e23fe
2016-02-09Restore previous motion search bit-error scale.Alex Converse
The bit to error transformation got doubled as a result of going from 8-bit to 9-bit costs (change d13385c). Use defines to derive the scale numbers and comment some of the fields.

derf: -0.023 BDRATE
hevcmr: +0.067 BDRATE
stdhd: +0.098 BDRATE

(These are substantially smaller than the original gains from 8 to 9 bit costing.)

Change-Id: I6a2b3b029b2f1415e4f90a05709b2333ec0eea9b
2016-02-02Fix some integer overflow errorshui su
Change-Id: I7e44bd952f28ce9925e8bdf6ee8ca2bb13de1b49
2016-02-01Fix a signed overflow in vp9 motion cost.Alex Converse
Change-Id: I5975e3aede62202d8ee6ced33889350c0a56554a
2016-01-19VP9: Eliminate MB_MODE_INFOScott LaVarnway
Change-Id: Ifa607dd2bb366ce09fa16dfcad3cc45a2440c185
2016-01-13VP9: inline vp9_use_mv_hp()Scott LaVarnway
Change-Id: Ib275bfc4c29c572d6c70e5ec6dbfc241590d3e3e
2015-12-14move vp9_avg to vpx_dspJames Zern
Change-Id: I7bc991abea383db1f86c1bb0f2e849837b54d90f
2015-11-13Changes to exhaustive motion search.paulwilkins
This change alters the nature and use of exhaustive motion search.

Firstly, any exhaustive search is preceded by a normal step search. The exhaustive search is only carried out if the distortion resulting from the step search is above a threshold value.

Secondly, the simple +/- 64 exhaustive search is replaced by a multi stage mesh based search where each stage has a range and step/interval size. Subsequent stages use the best position from the previous stage as the center of the search but use a reduced range and interval size.

For example:
stage 1: Range +/- 64 interval 4
stage 2: Range +/- 32 interval 2
stage 3: Range +/- 15 interval 1

This process, especially when it follows on from a normal step search, has shown itself to be almost as effective as a full range exhaustive search with step 1 but greatly lowers the computational complexity such that it can be used in some cases for speeds 0-2.

This patch also removes a double exhaustive search for sub 8x8 blocks which also contained a bug (the two searches used different distortion metrics).

For best quality in my test animation sequence this patch has almost no impact on quality but improves encode speed by more than 5X. Restricted use in good quality speeds 0-2 yields significant quality gains on the animation test of 0.2 - 0.5 dB with only a small impact on encode speed. On most clips though the quality gain and speed impact are small.

Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa
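The multi stage mesh idea described in this commit message can be sketched in a few lines of C. This is a simplified illustration, not the libvpx implementation: the types `SketchMv` and `SketchMeshStage`, the callback type `SketchCostFn`, and the toy cost function `sketch_distance_cost` are all assumptions introduced here, and the sketch omits the preceding step search, the distortion threshold, and motion vector limits.

```c
#include <assert.h>

/* Hypothetical types for this sketch only. */
typedef struct { int row, col; } SketchMv;
typedef struct { int range, interval; } SketchMeshStage;
typedef unsigned int (*SketchCostFn)(SketchMv mv, void *ctx);

/* Toy cost: squared distance to a target vector passed through ctx.
 * A real encoder would measure block distortion plus a rate term. */
static unsigned int sketch_distance_cost(SketchMv mv, void *ctx) {
  const SketchMv *target = (const SketchMv *)ctx;
  const int dr = mv.row - target->row;
  const int dc = mv.col - target->col;
  return (unsigned int)(dr * dr + dc * dc);
}

/* Run each stage on a grid of +/- range with the given interval, centered
 * on the best position found so far. */
static SketchMv mesh_search(SketchMv start, const SketchMeshStage *stages,
                            int num_stages, SketchCostFn cost, void *ctx) {
  SketchMv best = start;
  unsigned int best_cost = cost(start, ctx);
  for (int s = 0; s < num_stages; ++s) {
    assert(stages[s].range >= 0 && stages[s].interval > 0);
    const SketchMv center = best; /* fixed for the whole stage */
    for (int dr = -stages[s].range; dr <= stages[s].range;
         dr += stages[s].interval) {
      for (int dc = -stages[s].range; dc <= stages[s].range;
           dc += stages[s].interval) {
        const SketchMv cand = { center.row + dr, center.col + dc };
        const unsigned int c = cost(cand, ctx);
        if (c < best_cost) {
          best_cost = c;
          best = cand;
        }
      }
    }
  }
  return best;
}
```

With the three example stages (64/4, 32/2, 15/1), this sketch evaluates 33 * 33 + 33 * 33 + 31 * 31 = 3139 candidates, compared with 129 * 129 = 16641 for a full +/- 64 search at step 1, which is where the reduction in complexity comes from.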
2015-11-11Add AVX vectorized vp9_diamond_search_sadGeza Lore
This function now has an AVX intrinsics version which is about 80% faster compared to the C implementation. This provides a 2-4% total speed-up for encode, depending on encoding parameters.

The function utilizes 3 properties of the cost function lookup table, constructed in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.

For the joint cost:
- mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]

For the component costs:
- For all i: mvsadcost[0][i] == mvsadcost[1][i] (equal per component cost)
- For all i: mvsadcost[0][i] == mvsadcost[0][-i] (Cost function is even)

These must hold, otherwise the AVX version of the function cannot be used.

Change-Id: I6c2791d43022822a9e6ab43cd124a773946d0bdc
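The three table properties listed in this commit message could be verified with a small checker such as the one below. This is a sketch under assumptions and is not part of libvpx: the function name `sad_cost_tables_are_symmetric` and its parameters are hypothetical, and the component tables are assumed to be passed as pointers to their center entry so that negative indices are valid.

```c
#include <assert.h>
#include <stdbool.h>

/* Check the three properties on caller-supplied tables.
 * comp_cost0 and comp_cost1 must each point at the center element of an
 * array holding 2 * max_abs + 1 entries, so indices -max_abs..max_abs
 * are in bounds. */
static bool sad_cost_tables_are_symmetric(const int joint_cost[4],
                                          const int *comp_cost0,
                                          const int *comp_cost1,
                                          int max_abs) {
  assert(max_abs >= 0);
  /* Joint cost: entries 1, 2 and 3 must all be equal. */
  if (joint_cost[1] != joint_cost[2] || joint_cost[2] != joint_cost[3]) {
    return false;
  }
  for (int i = 0; i <= max_abs; ++i) {
    /* Equal per component cost, checked on both sides of zero. */
    if (comp_cost0[i] != comp_cost1[i]) return false;
    if (comp_cost0[-i] != comp_cost1[-i]) return false;
    /* The cost function is even. */
    if (comp_cost0[i] != comp_cost0[-i]) return false;
  }
  return true;
}
```

A check like this would fit naturally in a debug assertion or a unit test that runs after the tables are built, since a vectorized search relying on these properties would otherwise silently diverge from the C version.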
2015-11-06Revert "Add AVX vectorized vp9_diamond_search_sad"James Zern
This reverts commit f1342a7b070ef61b9fbdf03e899ac2107cfcb6bd.

This breaks 32-bit builds:
runtime error: load of misaligned address 0xf72fdd48 for type 'const __m128i' (vector of 2 'long long' values), which requires 16 byte alignment
+ _mm_set1_epi64x is incompatible with some versions of visual studio

Change-Id: I6f6fc3c11403344cef78d1c432cdc9147e5c1673
2015-11-05Add AVX vectorized vp9_diamond_search_sadGeza Lore
This function now has an AVX intrinsics version which is about 80% faster compared to the C implementation. This provides a 2-4% total speed-up for encode, depending on encoding parameters.

The function utilizes 3 properties of the cost function lookup table, constructed in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.

For the joint cost:
- mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]

For the component costs:
- For all i: mvsadcost[0][i] == mvsadcost[1][i] (equal per component cost)
- For all i: mvsadcost[0][i] == mvsadcost[0][-i] (Cost function is even)

These must hold, otherwise the AVX version of the function cannot be used.

Change-Id: I184055b864c5a2dc37b2d8c5c9012eb801e9daf6
2015-10-28Convert motion search config from AoS to SoAGeza Lore
This is a prerequisite for vectorizing vp9_diamond_search_sad_c. Change-Id: I49cd9148782410ca8b16e8a468ca9e7c6d088410
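The difference between the two layouts can be sketched as follows. The struct and function names here (`SketchSite`, `SketchConfigAoS`, `SketchConfigSoA`, `aos_to_soa`) and the fixed capacity are hypothetical and chosen for illustration; they are not the definitions used in libvpx. In the array-of-structures form each search site keeps its vector and buffer offset together, while the structure-of-arrays form stores all vectors contiguously and all offsets contiguously, which is the shape that vector loads generally prefer.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_SITES 8 /* hypothetical capacity for this sketch */

typedef struct { int row, col; } SketchMv;

/* Array of structures: one record per search site. */
typedef struct { SketchMv mv; ptrdiff_t offset; } SketchSite;
typedef struct {
  SketchSite sites[SKETCH_MAX_SITES];
  int count;
} SketchConfigAoS;

/* Structure of arrays: each field stored in its own contiguous array. */
typedef struct {
  SketchMv mv[SKETCH_MAX_SITES];
  ptrdiff_t offset[SKETCH_MAX_SITES];
  int count;
} SketchConfigSoA;

/* Copy an AoS configuration into the SoA layout, field by field. */
static void aos_to_soa(const SketchConfigAoS *in, SketchConfigSoA *out) {
  assert(in->count >= 0 && in->count <= SKETCH_MAX_SITES);
  out->count = in->count;
  for (int i = 0; i < in->count; ++i) {
    out->mv[i] = in->sites[i].mv;
    out->offset[i] = in->sites[i].offset;
  }
}
```

With the SoA layout, several consecutive offsets can be read with a single wide load, whereas the AoS layout would interleave them with vector components.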
2015-08-31Include vpx_dsp_common.h when using VPXMIN/MAXJohann
Change-Id: I2e387a06484a06301f3cd6600c4ba2f4335b61ee
2015-08-28vp9_mcomp: make search functions privateJames Zern
vp9_full_pixel_search() can be used as a replacement as it dispatches to all search methods Change-Id: I57fcb79c1362b569dc95237bdcc8390f54efd440
2015-08-26vpx_dsp_common: add VPX prefix to MIN/MAXJames Zern
prevents redeclaration warnings; vp8 has its own define which will be resolved in a future commit Change-Id: Ic941fef3dd4262fcdce48b73075fe6b375f11c9c
2015-08-07Merge "Improve the second-level sub-pixel motion search"Yunqing Wang