Age | Commit message | Author |
|
Code cleanup; add some comments.
Also remove a redundant call to vp9_get_mvpred_var() at the end when
method is MESH.
Change-Id: I4b58e7e1c42161642708f8b0342ab3c0ce39ed7d
|
|
Append mesh search to the diamond shape search to refine
the full pixel motion estimation for source ARF generation.
It improves the average compression performance.
Speed 0
         avg PSNR   overall PSNR   SSIM
mid      -0.18%     -0.18%         -0.22%
hd       -0.25%     -0.23%         -0.36%
nflx2k   -0.22%     -0.23%         -0.37%
Speed 1
         avg PSNR   overall PSNR   SSIM
mid      -0.10%     -0.08%         -0.11%
hd       -0.25%     -0.27%         -0.38%
nflx2k   -0.20%     -0.20%         -0.34%
The additional encoding time is close to the sample noise
range. For bus_cif at 1000 kbps, the speed 0 encoding time
goes from 83.0 s -> 83.6 s.
Change-Id: I48647f50ec3e8f7ae4550a4bde831f569f46ecf3
|
|
To save a branch.
Change-Id: Ifa2be7583e95c6991784731c654bbd4cce31e993
|
|
Change-Id: I08cb072a32e06c6452eca068b2f7ef7287f221e6
|
|
BUG=webm:1388
Change-Id: I1d0dd9af52a1461e3e2b2d60e8c4b6b74c3b90b0
|
|
Remove trailing commas to keep multiple elements on one line.
Add blank lines to prevent comments from being treated as blocks.
clang-format guards for struct with a comment in the middle.
Change-Id: I3bcb8313ae8aaf69179249a13b4087b1272cdbc0
|
|
This code is unused in vp9. Only vp8 still contains references to
vpx_sad_NxMx[3|8] and only for sizes 16x16, 16x8, 8x16, 8x8 and 4x4.
Remove the remaining sizes and all the highbitdepth versions.
BUG=webm:1425
Change-Id: If6a253977c8e0c04599e25cbeb45f71a94f563e8
|
|
Visual Studio warns if a 32-bit shift is implicitly converted to 64 bits.
In this case integer storage is enough for the result.
Since:
f3a9ae5ba Fix ubsan failure in vp9_mcomp.c.
Change-Id: I7e0e199ef8d3c64e07b780c8905da8c53c1d09fc
|
|
|
|
BUG=webm:1440
Change-Id: I7074e42bdfa8dd25f11bbb3f2ab1b41d6f4c12e4
|
|
Change-Id: Iff1dea1fe9d4ea1d3fc95ea736ddf12f30e6f48d
|
|
A previous patch turned on the allow_exhaustive_searches feature only for
FC_GRAPHICS_ANIMATION content. This patch further modified the feature
by removing the exhaustive search limit, and made it no longer adaptive.
As a result, the 2 counts that recorded the number of motion searches
were removed, which helped achieve determinism in row-based
multi-threaded encoding. Tests showed that this patch didn't make
the encoder much slower.
Used exhaustive_searches_thresh for this speed feature, and removed
allow_exhaustive_searches. Also, refactored the speed feature code
to follow the general speed feature setting style.
Change-Id: Ib96b182c4c8dfff4c1ab91d2497cc42bb9e5a4aa
|
|
The MV unit test revealed an integer overflow issue in vp9_mcomp.c.
The overflow occurred when the MV was very large. In mv_err_cost(), when
mv->row = 8184, mv->col = 8184 and ref_mv is 0, mv_cost = 34363
and error_per_bit = 132412, causing the overflow.
BUG=webm:1406
Change-Id: I35f8299f22f9bee39cd9153d7b00d0993838845e
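The numbers quoted above can be checked with a small sketch. It assumes the overflow comes from multiplying mv_cost by error_per_bit in 32-bit arithmetic, and it widens the product to 64 bits before scaling. The helper names and the shift amount are hypothetical, chosen only for illustration; they are not taken from vp9_mcomp.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical scaling shift, for illustration only. */
#define SKETCH_SHIFT 14

/* Round-to-nearest right shift on a 64-bit value. */
static int64_t sketch_round_shift64(int64_t value, int n) {
  assert(n > 0 && n < 63);
  return (value + ((int64_t)1 << (n - 1))) >> n;
}

/* Widen to 64 bits before multiplying so the product cannot overflow
 * a 32-bit int, then scale back down. */
static int sketch_scaled_cost(int mv_cost, int error_per_bit) {
  const int64_t product = (int64_t)mv_cost * error_per_bit;
  return (int)sketch_round_shift64(product, SKETCH_SHIFT);
}
```

With mv_cost = 34363 and error_per_bit = 132412 the 64-bit product is 4550073556, which is larger than a 32-bit INT_MAX (2147483647), so the same multiplication done in plain int would overflow.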
|
|
To guard against the motion vector out of range bug, added a motion vector
unit test in VP9. When encoding 4k video, the test always forces extreme
motion vectors and encourages INTER modes. When decoding, it checks that
each motion vector is valid and checks for encoder/decoder mismatch.
The tests showed that this unit test could reveal the issue we saw before.
Change-Id: I0a880bd847dad8a13f7fd2012faf6868b02fa3b4
|
|
BUG=webm:1397
(yunqingwang)
To verify that this patch wouldn't cause much performance change,
the Borg tests were run. Here are the results:
          avg_psnr   overall_psnr   ssim
hdres:    -0.002     0.006          0.013
midres:   0          0              0
lowres:   0          0              0
Change-Id: Iae395ae7b741e0513cf5bab9dcace110b792a67d
|
|
|
|
(Yunqing Wang)
This patch implements the row-based multi-threading within tiles in
the encoding pass, and substantially speeds up the multi-threaded
encoder in VP9.
Speed tests at speed 1 on the STDHD set (using 4 tiles) show that the
average speedup of the encoding pass (the second pass in 2-pass
encoding) is 7% with 2 threads, 16% with 4 threads,
85% with 8 threads, and 116% with 16 threads.
Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de
|
|
Change-Id: I75e4a9e0b37bd4586f26c8d6c1fa27f3f6ff1bce
|
|
Change-Id: Ic238d32c7e10b730342224ab56712a89a6026a8f
|
|
|
|
Change-Id: I2fd72af00afbbeb903e4fe364611abcc148f2fbb
|
|
This commit resolves the compression performance regression in
the real-time encoding setting when high bit-depth mode is enabled.
The current solution temporarily disables the SIMD implementations
of vpx_satd, hadamard8x8, and hadamard16x16 in high bit-depth mode.
The commit makes the coding results bit-wise identical between
the regular coding pipeline and the high bit-depth pipeline at profile 0.
BUG=webm:1365
Change-Id: Icfb900821733749685370460a1a5a7e07f76f4bf
|
|
Removed MVC so that mv_err_cost() is always called while calculating
the mv cost.
Change-Id: I28123e05fbfc2352128e266c985d2ab093940071
|
|
|
|
Clamped the initial mv in vp9_refining_search_8p_c.
BUG=webm:1354
Change-Id: I47d302b350937e3e6e52e95c983b5fb0b4c64fba
|
|
For out-of-range cases, returned UINT_MAX instead of INT_MAX in the
sub-pixel mv search to be consistent with the "uint32_t" return type.
Change-Id: I8e206d771228c13d89bafbbe9f14722c8ecc6a7a
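A minimal sketch of the sentinel convention described above, assuming a 32-bit unsigned int. The struct, function name and parameters are hypothetical and only illustrate why UINT_MAX suits a uint32_t return type better than INT_MAX: it is the largest value the type can hold, so an out-of-range candidate can never compare as better than a real result.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical motion vector limits, for illustration only. */
typedef struct {
  int row_min, row_max, col_min, col_max;
} SketchMvLimits;

/* Returns the measured error for an in-range vector, or UINT_MAX as the
 * "worst possible" sentinel when the vector is out of range. */
static uint32_t sketch_subpel_error(int row, int col,
                                    const SketchMvLimits *lim,
                                    uint32_t measured_error) {
  assert(lim != 0);
  if (row < lim->row_min || row > lim->row_max ||
      col < lim->col_min || col > lim->col_max) {
    return UINT_MAX;
  }
  return measured_error;
}
```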
|
|
Change-Id: I501597b7c1e0f0c7ae2aea3ee8073f0a641b3487
|
|
Change-Id: Ifebdc9ef37850508eb4b8e572fd0f6026ab04987
|
|
Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2
|
|
provides better toolchain compatibility
Change-Id: I8561a6de668a68ff54fe3886a4ee6300f0ae9c04
|
|
BUG=webm:1250
Change-Id: Id5bb2762ca1bf996ba4f9a60eec977a7994c1d94
|
|
This commit fixes a number of ubsan warnings in HBD build.
BUG=webm:1219
Change-Id: I05f0fd0ef50e93db4ba34205005c54af1ed32acc
|
|
The macro was removed in 6724676.
Change-Id: I412c24aac49bd1ff60a331a30933e0d8ae3f2dd5
|
|
This was copied over from VP8. VP9 doesn't seem to do this buffer copy.
Change-Id: I28a8bbf0503a7f99b2cb60620ab3674adde863bb
|
|
+5.857% BD-RATE on SCREEN_CONTENT
Leaving this off for non-screen content because:
+25.300% on TWITCH120
+37.833% BD-RATE on RTC
Change-Id: Ie0a312182d6cc859fb04298e4cd81d02b39e23fe
|
|
The bit to error transformation got doubled as a result of going from
8-bit to 9-bit costs (change d13385c).
Use defines to derive the scale numbers and comment some of the fields.
derf: -0.023 BDRATE
hevcmr: +0.067 BDRATE
stdhd: +0.098 BDRATE
(These are substantially smaller than the original gains from 8 to
9 bit costing.)
Change-Id: I6a2b3b029b2f1415e4f90a05709b2333ec0eea9b
|
|
Change-Id: I7e44bd952f28ce9925e8bdf6ee8ca2bb13de1b49
|
|
Change-Id: I5975e3aede62202d8ee6ced33889350c0a56554a
|
|
Change-Id: Ifa607dd2bb366ce09fa16dfcad3cc45a2440c185
|
|
Change-Id: Ib275bfc4c29c572d6c70e5ec6dbfc241590d3e3e
|
|
Change-Id: I7bc991abea383db1f86c1bb0f2e849837b54d90f
|
|
This change alters the nature and use of exhaustive motion search.
Firstly, any exhaustive search is preceded by a normal step search.
The exhaustive search is only carried out if the distortion resulting
from the step search is above a threshold value.
Secondly, the simple +/- 64 exhaustive search is replaced by a
multi-stage mesh-based search where each stage has a range
and step/interval size. Subsequent stages use the best position from
the previous stage as the center of the search but use a reduced range
and interval size.
For example:
stage 1: Range +/- 64 interval 4
stage 2: Range +/- 32 interval 2
stage 3: Range +/- 15 interval 1
This process, especially when it follows on from a normal step
search, has shown itself to be almost as effective as a full range
exhaustive search with step 1 but greatly lowers the computational
complexity such that it can be used in some cases for speeds 0-2.
This patch also removes a double exhaustive search for sub 8x8 blocks
which also contained a bug (the two searches used different distortion
metrics).
For best quality in my test animation sequence this patch has almost
no impact on quality but improves encode speed by more than 5X.
Restricted use in good quality speeds 0-2 yields significant quality gains
on the animation test of 0.2 - 0.5 dB with only a small impact on encode
speed. On most clips though the quality gain and speed impact are small.
Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa
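The multi-stage mesh idea described above can be sketched as follows. This is a simplified illustration, not the encoder's implementation: the types, the callback signature and the toy cost function are hypothetical, and the preceding step search and the distortion threshold are left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types, for illustration only. */
typedef struct { int row, col; } SketchMv;
typedef struct { int range, interval; } SketchMeshStage;
typedef unsigned int (*SketchCostFn)(SketchMv mv, const void *ctx);

/* Each stage scans a square grid of +/- range around the current best
 * position, sampling every `interval` positions. The next stage is
 * centered on the best position found so far. */
static SketchMv sketch_mesh_search(SketchMv start,
                                   const SketchMeshStage *stages,
                                   int num_stages, SketchCostFn cost,
                                   const void *ctx) {
  SketchMv best = start;
  unsigned int best_cost;
  int s, r, c;
  assert(stages != 0 && num_stages > 0 && cost != 0);
  best_cost = cost(best, ctx);
  for (s = 0; s < num_stages; ++s) {
    const SketchMv center = best;
    const int range = stages[s].range;
    const int step = stages[s].interval;
    assert(range >= 0 && step > 0);
    for (r = -range; r <= range; r += step) {
      for (c = -range; c <= range; c += step) {
        const SketchMv cand = { center.row + r, center.col + c };
        const unsigned int this_cost = cost(cand, ctx);
        if (this_cost < best_cost) {
          best_cost = this_cost;
          best = cand;
        }
      }
    }
  }
  return best;
}

/* Toy cost: city-block distance to a target vector passed via ctx. */
static unsigned int sketch_toy_cost(SketchMv mv, const void *ctx) {
  const SketchMv *target = (const SketchMv *)ctx;
  return (unsigned int)(abs(mv.row - target->row) +
                        abs(mv.col - target->col));
}
```

With the three example stages from the message (64/4, 32/2, 15/1), the sketch evaluates 33x33 + 33x33 + 31x31 = 3139 positions, compared with 129x129 = 16641 for a step-1 scan of the full +/- 64 window.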
|
|
This function now has an AVX intrinsics version which is about 80%
faster compared to the C implementation. This provides a 2-4% total
speed-up for encode, depending on encoding parameters. The function
utilizes 3 properties of the cost function lookup table, constructed
in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
For the joint cost:
- mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
For the component costs:
- For all i: mvsadcost[0][i] == mvsadcost[1][i]
(equal per component cost)
- For all i: mvsadcost[0][i] == mvsadcost[0][-i]
(Cost function is even)
These must hold, otherwise the AVX version of the function cannot be used.
Change-Id: I6c2791d43022822a9e6ab43cd124a773946d0bdc
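The three table properties listed above could be verified with a check along these lines. It is a sketch under assumptions: the function name and the max_mv parameter are hypothetical, and the component tables are assumed to be pointers into the middle of their arrays so that negative indices are valid.

```c
#include <assert.h>

/* Returns 1 if the cost tables satisfy the three properties, else 0.
 * comp0 and comp1 must each point at the center of an array with at
 * least max_mv valid entries on either side. */
static int sketch_sad_tables_ok(const int joint[4], const int *comp0,
                                const int *comp1, int max_mv) {
  int i;
  assert(joint != 0 && comp0 != 0 && comp1 != 0 && max_mv >= 0);
  /* Joint cost: entries 1, 2 and 3 must all be equal. */
  if (joint[1] != joint[2] || joint[2] != joint[3]) return 0;
  for (i = -max_mv; i <= max_mv; ++i) {
    /* Equal per-component cost. */
    if (comp0[i] != comp1[i]) return 0;
    /* Cost function is even. */
    if (comp0[i] != comp0[-i]) return 0;
  }
  return 1;
}
```

A caller could run such a check once after the tables are built and fall back to the C path if it fails.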
|
|
This reverts commit f1342a7b070ef61b9fbdf03e899ac2107cfcb6bd.
This breaks 32-bit builds:
runtime error: load of misaligned address 0xf72fdd48 for type 'const
__m128i' (vector of 2 'long long' values), which requires 16 byte
alignment
+ _mm_set1_epi64x is incompatible with some versions of Visual Studio
Change-Id: I6f6fc3c11403344cef78d1c432cdc9147e5c1673
|
|
This function now has an AVX intrinsics version which is about 80%
faster compared to the C implementation. This provides a 2-4% total
speed-up for encode, depending on encoding parameters. The function
utilizes 3 properties of the cost function lookup table, constructed
in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
For the joint cost:
- mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
For the component costs:
- For all i: mvsadcost[0][i] == mvsadcost[1][i]
(equal per component cost)
- For all i: mvsadcost[0][i] == mvsadcost[0][-i]
(Cost function is even)
These must hold, otherwise the AVX version of the function cannot be used.
Change-Id: I184055b864c5a2dc37b2d8c5c9012eb801e9daf6
|
|
This is a prerequisite for vectorizing vp9_diamond_search_sad_c.
Change-Id: I49cd9148782410ca8b16e8a468ca9e7c6d088410
|
|
Change-Id: I2e387a06484a06301f3cd6600c4ba2f4335b61ee
|
|
vp9_full_pixel_search() can be used as a replacement as it dispatches to
all search methods
Change-Id: I57fcb79c1362b569dc95237bdcc8390f54efd440
|
|
prevents redeclaration warnings;
vp8 has its own define which will be resolved in a future commit
Change-Id: Ic941fef3dd4262fcdce48b73075fe6b375f11c9c
|
|
|