Age | Commit message | Author |
|
Change-Id: Ifebdc9ef37850508eb4b8e572fd0f6026ab04987
|
|
Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2
|
|
This commit fixes several UBSan warnings in the high-bitdepth (HBD) build.
BUG=webm:1219
Change-Id: I05f0fd0ef50e93db4ba34205005c54af1ed32acc
|
|
+5.857% BD-RATE on SCREEN_CONTENT
Leaving this off for non-screen content because:
+25.300% BD-RATE on TWITCH120
+37.833% BD-RATE on RTC
Change-Id: Ie0a312182d6cc859fb04298e4cd81d02b39e23fe
|
|
This function now has an AVX intrinsics version which is about 80%
faster compared to the C implementation. This provides a 2-4% total
speed-up for encode, depending on encoding parameters. The function
utilizes 3 properties of the cost function lookup table, constructed
in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
For the joint cost:
- mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
For the component costs:
- For all i: mvsadcost[0][i] == mvsadcost[1][i]
(equal per component cost)
- For all i: mvsadcost[0][i] == mvsadcost[0][-i]
(Cost function is even)
These must hold, otherwise the AVX version of the function cannot be used.
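
The three properties above can be checked with a small helper before
choosing the AVX path. This is a minimal sketch, not code from the commit:
the function name, the `max_mv` bound and the convention that each
component table pointer points at its centre entry (so that index -i is
valid) are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of libvpx. Returns true when the joint
 * and component SAD cost tables satisfy the three listed properties.
 * Assumption: comp_cost[k] points at the centre of table k, so indices
 * in [-max_mv, max_mv] are valid. */
static bool sad_cost_tables_allow_simd(const int joint_cost[4],
                                       const int *const comp_cost[2],
                                       int max_mv) {
  /* Joint cost: entries 1, 2 and 3 must be equal. */
  if (joint_cost[1] != joint_cost[2] || joint_cost[2] != joint_cost[3])
    return false;

  for (int i = -max_mv; i <= max_mv; ++i) {
    /* Equal per-component cost. */
    if (comp_cost[0][i] != comp_cost[1][i]) return false;
    /* Cost function is even. */
    if (comp_cost[0][i] != comp_cost[0][-i]) return false;
  }
  return true;
}
```

If the check fails, a caller would fall back to the C implementation.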
Change-Id: I6c2791d43022822a9e6ab43cd124a773946d0bdc
|
|
This reverts commit f1342a7b070ef61b9fbdf03e899ac2107cfcb6bd.
This breaks 32-bit builds:
runtime error: load of misaligned address 0xf72fdd48 for type 'const
__m128i' (vector of 2 'long long' values), which requires 16 byte
alignment
Also, _mm_set1_epi64x is incompatible with some versions of Visual Studio.
Change-Id: I6f6fc3c11403344cef78d1c432cdc9147e5c1673
|
|
This function now has an AVX intrinsics version which is about 80%
faster compared to the C implementation. This provides a 2-4% total
speed-up for encode, depending on encoding parameters. The function
utilizes 3 properties of the cost function lookup table, constructed
in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
For the joint cost:
- mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
For the component costs:
- For all i: mvsadcost[0][i] == mvsadcost[1][i]
(equal per component cost)
- For all i: mvsadcost[0][i] == mvsadcost[0][-i]
(Cost function is even)
These must hold, otherwise the AVX version of the function cannot be used.
Change-Id: I184055b864c5a2dc37b2d8c5c9012eb801e9daf6
|
|
This is a prerequisite for vectorizing vp9_diamond_search_sad_c.
Change-Id: I49cd9148782410ca8b16e8a468ca9e7c6d088410
|
|
vp9_full_pixel_search() can be used as a replacement, as it dispatches to
all search methods.
Change-Id: I57fcb79c1362b569dc95237bdcc8390f54efd440
|
|
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
|
|
This commit fixes the integral projection motion search crash when
frame resize is used. It fixes issue 994.
Change-Id: Ieeb52619121d7444f7d6b3d0cf09415f990d1506
|
|
Make it a general purpose fast motion estimation function, to be
used in the mode search process.
Change-Id: Ib354cb0e664dc61c30c0b2314297835ee75b157a
|
|
The function pointer in compressor instance does not change, so this
commit changes to call the function directly.
Change-Id: I9c9c460e3475711c384b74c9842f0b4f3d037cc5
|
|
- Some fixes to surface fit.
- Returns variance function as cost rather than sad in the
pattern search and diamond search functions. Only
vp9_pattern_search_sad function used in bigdia search
uses sad as integer 1-away costs.
- Deploys SUBPEL_TREE_PRUNED_MORE for speed 4+.
Results:
derf [Speed 3]: About +0.036% in coding efficiency without any
discernible speed loss.
derf [Speed 4]: About 2-3% faster at -0.199% loss in coding efficiency.
derf [Speed 5]: About 3-4% faster at -0.149% loss in coding efficiency.
Change-Id: I8462f94f6adb46966ca964f2bd0400977357fd63
|
|
One is a more aggressive version of the pruned subpel tree
search where only a single halfpel candidate is searched.
The search candidate is based on a surface fit result.
The other is a method to obtain the subpel position at one
shot based on the same surface fit.
The methods have not been deployed in any speed setting yet.
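
To illustrate the one-shot idea, here is a minimal sketch of a separable
parabola fit along one axis. It is an assumption about what a surface fit
could look like, not the fit used in the commit; the function name and the
clamp to half a pixel are hypothetical.

```c
/* Hypothetical sketch, not the libvpx surface fit. Fits a parabola
 * through the costs at offsets -1, 0 and +1 along one axis and returns
 * the position of its minimum in full-pixel units. */
static double parabola_min_offset(int cost_minus, int cost_center,
                                  int cost_plus) {
  /* Twice the second-order coefficient of the fitted parabola. */
  const int curvature = cost_minus + cost_plus - 2 * cost_center;
  double offset;

  /* Flat or concave fit: no interior minimum, stay at the centre. */
  if (curvature <= 0) return 0.0;

  /* Vertex of the parabola: (c(-1) - c(+1)) / (2 * curvature). */
  offset = (double)(cost_minus - cost_plus) / (2.0 * curvature);

  /* Assumed clamp: keep the estimate within half a pixel of the centre. */
  if (offset > 0.5) offset = 0.5;
  if (offset < -0.5) offset = -0.5;
  return offset;
}
```

Applying the same fit to the row and column neighbours independently gives
a two-dimensional estimate.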
Change-Id: I34fef3f2e34f11396c9d1ba97f4be8c4ffca62d3
|
|
Adds code to return an integer cost list for NSTEP search. Then
uses it for pruned subpel search in speed 3.
derf: -0.06%
Speed on mobcal 720p increases from 10.28 fps to 10.65 fps.
[Subject to further testing].
Change-Id: Ib591382d25b2c11bcaba9d3a27a93a9d1ab27a96
|
|
Updates the vp9_pattern_search function to return integer one-away
neighbors' sad values, for subsequent use in speeding up the
sub-pel search. Also, removes code for the do_refine option
which is not being used currently.
Updates the integer and subpel functions to pass in a 5-element
sad list for output or input.
A new pruned sub-pel search algorithm is implemented that uses
the sad returned from the integer pel search. But it is not
deployed yet.
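
A minimal sketch of how a 5-element sad list could prune the sub-pel
candidates. The list layout (centre, left, top, right, bottom), the type
and the function name are assumptions for illustration, not the layout or
code used in the commit.

```c
/* Hypothetical types and layout, not taken from libvpx.
 * Assumed order: [0] centre, [1] left, [2] top, [3] right, [4] bottom. */
typedef struct {
  int row; /* -1 = search upwards, +1 = search downwards */
  int col; /* -1 = search leftwards, +1 = search rightwards */
} SubpelDirection;

/* Picks the quadrant for the sub-pel search from the integer-pel
 * neighbour costs, so that candidates on the costlier side are skipped. */
static SubpelDirection pruned_subpel_direction(const int sad_list[5]) {
  SubpelDirection dir;
  dir.col = (sad_list[1] < sad_list[3]) ? -1 : 1;
  dir.row = (sad_list[2] < sad_list[4]) ? -1 : 1;
  return dir;
}
```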
Change-Id: Ifa9f5ad024b5b660570366d2bd900343e1891520
|
|
Change-Id: I3d9130e726a1299fd258f6dfe93315e2d12f76da
|
|
Deleted vp9_find_best_sub_pixel_comp_tree(), and combined it in
vp9_find_best_sub_pixel_tree().
Change-Id: Ifb25763c8b19822df5537cc1daa76ce88dc3b056
|
|
Change-Id: I12389f801ebd3bd2ae3bf31e125433bfb429ee65
|
|
Remove two unused parameters in the function
vp9_refining_search_8p_c().
Change-Id: Ic192734586291cf5400926eeb8e720e69d40835c
|
|
Change-Id: I961d50d6fafdd37ef7f23f0a871d28e28d2084ca
|
|
Change-Id: I2ad333553e673dbabcdc0f0366aea311e90849bf
|
|
Change-Id: Id81a76d18be6b2de69f81bb563d74c3bb356d434
|
|
Change-Id: I0df8c2a6d9863f92ee406010f2daeb5e40627649
|
|
|
|
Adds a fast diamond search which is about 5% faster than FAST_HEX
with only a 0.1% drop in psnr when turned on for both speeds 5 and 7.
This search is turned on for speed 7.
Change-Id: I497630aa88a5148926086bb3038e7975e5f4eb98
|
|
Change-Id: Ic535f0a1c2501c1af143237af3b2b51b4b4980f4
|
|
The core motion estimation functions now all consistently return SAD.
The only exception is vp9_full_pixel_diamond(), however the core diamond
and refining search routines called from vp9_full_pixel_diamond() also
return SAD. If variance of pred error + mv cost is desired it must be
calculated explicitly outside these functions. For very fast encoding,
hopefully this will eliminate some redundant computations.
Also suggests reimplementing FAST_HEX with the vp9_pattern_search
framework. It is not exactly the same as the existing FAST_HEX, but
performance is slightly better and speed is very similar. Enables
removing a lot of duplicate code.
Change-Id: I152736393438c25bdf7e96b37cbb8ce330f4f94a
|
|
|
|
In good quality mode motion search, the best matches are normally
found after searching in a large area. In real time mode, to make
encoding fast, a center-biased fast HEX search is used, which
converges quickly most of the time. A 4-point diamond search is
also carried out as the following refining search, which gives more
precise results, and maintains good motion search quality.
At speed 5, the borg test on rtc set showed an overall PSNR loss of
0.936%. The encoding speed gain is 4% - 5%.
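
The 4-point diamond refinement step can be sketched as below. This is a
simplified illustration under stated assumptions: the callback type, the
function names and the iteration cap are hypothetical, and the demo cost
stands in for a real SAD plus motion vector cost.

```c
#include <stddef.h>

/* Hypothetical cost callback: returns the matching cost at (row, col). */
typedef int (*cost_fn)(int row, int col, void *ctx);

/* Repeatedly moves to the cheapest of the four 1-away neighbours until
 * no neighbour improves on the current position or max_iters is hit.
 * Updates *row and *col in place and returns the best cost found. */
static int refine_4pt_diamond(int *row, int *col, cost_fn cost, void *ctx,
                              int max_iters) {
  static const int offsets[4][2] = { { -1, 0 }, { 0, -1 }, { 0, 1 }, { 1, 0 } };
  int best = cost(*row, *col, ctx);

  for (int iter = 0; iter < max_iters; ++iter) {
    int best_site = -1;
    for (int i = 0; i < 4; ++i) {
      const int c = cost(*row + offsets[i][0], *col + offsets[i][1], ctx);
      if (c < best) {
        best = c;
        best_site = i;
      }
    }
    if (best_site < 0) break; /* current position is a local minimum */
    *row += offsets[best_site][0];
    *col += offsets[best_site][1];
  }
  return best;
}

/* Demo cost with a single minimum at row 3, col -2. */
static int demo_cost(int row, int col, void *ctx) {
  (void)ctx;
  return (row - 3) * (row - 3) + (col + 2) * (col + 2);
}
```

In the scheme described above, the starting point would be the result of
the fast HEX search rather than the origin.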
Change-Id: I42cd68bb56a09ca1b86293c99d5f7312225ca7ae
|
|
Passing block MV pointer instead of block index into
vp9_full_search_sad{, x3, x8} functions.
Change-Id: Ica07356633471c2c8f81b583a7aeba85a436bafb
|
|
Change-Id: If33a5a12c4025d9b5ec863dfccea7ee70f800665
|
|
Change-Id: Ie79114bba4f0cea55d9f701e20d2be2017630f3b
|
|
Change-Id: Ib71d9ed3f98e9468ad951bdc24c9ab565216eb38
|
|
|
|
Change-Id: I74cf028e8c732cd0dbc070326152d3085b824a80
|
|
Change-Id: I4f51ce859a97bf1b8fd2b37ac585b7c643232b69
|
|
Change-Id: I660b53da8ebf3049832ce8a10721051c4e0ebb00
|
|
Change-Id: Icf3b3dd96d7e133a4ad7260cd95288f6217998a6
|
|
Change-Id: Ifd432fa3741ba47102d298e0b348eb00f5a9ce53
|
|
The original iterative search was replaced by the subpel_tree search
and is no longer used.
Change-Id: I998b38e1cb0ee359a08b2410d0766dbf183ab071
|
|
Change-Id: I068345f722a7116e3119927295ad23a28d3066a0
|
|
Change-Id: I8b81a3e4b4fa530a654c28d9c136afa0c1d379fd
|
|
This function sets the motion search range limit. Rename it to be
more informative.
Change-Id: I2e8e01073dcb99c9bea9c9acd0a61d672d615444
|
|
Explicitly constrain the motion search range (in full-pixel units) to
[-1023, +1023]. This is intended to control the effective motion search
range for 4K sequences.
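
A minimal sketch of such a clamp. The struct, the constant name and the
function names are hypothetical; only the [-1023, +1023] bound comes from
the message above.

```c
/* Hypothetical names, not taken from libvpx. */
#define SKETCH_MAX_FULL_PEL 1023

typedef struct {
  int col_min, col_max;
  int row_min, row_max;
} SearchLimits;

static int clamp_int(int value, int low, int high) {
  return value < low ? low : (value > high ? high : value);
}

/* Restricts every limit to [-1023, +1023] full pixels. */
static void clamp_search_limits(SearchLimits *limits) {
  limits->col_min =
      clamp_int(limits->col_min, -SKETCH_MAX_FULL_PEL, SKETCH_MAX_FULL_PEL);
  limits->col_max =
      clamp_int(limits->col_max, -SKETCH_MAX_FULL_PEL, SKETCH_MAX_FULL_PEL);
  limits->row_min =
      clamp_int(limits->row_min, -SKETCH_MAX_FULL_PEL, SKETCH_MAX_FULL_PEL);
  limits->row_max =
      clamp_int(limits->row_max, -SKETCH_MAX_FULL_PEL, SKETCH_MAX_FULL_PEL);
}
```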
Change-Id: I645539c70885eec0f155781f439d97d333336e88
|
|
Making this change in order to move allow_high_precision_mv field
from MACROBLOCKD structure to VP9_COMMON (because it is a frame level
flag).
Change-Id: I1d006ba36d938e0caf4d40fa051e2e38df9c1108
|
|
|
|
|
|
Change-Id: I9795d0937bc07793c13d067281995e0750f694d9
|