- Some fixes to the surface fit.
- Returns the variance, rather than SAD, as the cost in the
pattern search and diamond search functions. Only the
vp9_pattern_search_sad function used in the bigdia search
uses SAD for the integer 1-away costs.
- Deploys SUBPEL_TREE_PRUNED_MORE for speed 4+.
Results:
derf [Speed 3]: About +0.036% in coding efficiency without any
discernible speed loss.
derf [Speed 4]: About 2-3% faster at -0.199% loss in coding efficiency.
derf [Speed 5]: About 3-4% faster at -0.149% loss in coding efficiency.
Change-Id: I8462f94f6adb46966ca964f2bd0400977357fd63
|
|
One is a more aggressive version of the pruned subpel tree
search where only a single halfpel candidate is searched.
The search candidate is based on a surface fit result.
The other is a method to obtain the subpel position at one
shot based on the same surface fit.
The methods have not been deployed in any speed setting yet.
Change-Id: I34fef3f2e34f11396c9d1ba97f4be8c4ffca62d3
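The one-shot idea can be illustrated with a separable parabola fit through the center cost and its four integer neighbors. This is only a sketch of the general technique, not the code from this change: the names SubpelOffset, parabola_min and surface_fit_offset are hypothetical, and the cost ordering (center, left, below, right, above) is an assumption.

```c
#include <assert.h>

/* Hypothetical result type: offsets in units of 1/(1 << bits) pel. */
typedef struct {
  int row;
  int col;
} SubpelOffset;

/* Minimum of the parabola through (-1, a), (0, c), (+1, b), scaled to
 * 1/(1 << bits) pel. Returns 0 when the three costs are not strictly
 * convex, in which case the fit gives no usable minimum. */
static int parabola_min(int a, int c, int b, int bits) {
  const int denom = (a - c) + (b - c);
  assert(bits >= 1);
  if (denom <= 0) return 0;
  /* x_min = (a - b) / (2 * (a + b - 2c)), then scaled by (1 << bits). */
  return (a - b) * (1 << (bits - 1)) / denom;
}

/* Assumed layout: cost[0] = center, cost[1] = left, cost[2] = below,
 * cost[3] = right, cost[4] = above. */
static SubpelOffset surface_fit_offset(const int cost[5], int bits) {
  SubpelOffset off;
  off.col = parabola_min(cost[1], cost[0], cost[3], bits);
  off.row = parabola_min(cost[4], cost[0], cost[2], bits);
  return off;
}
```

With bits = 3 the result is in 1/8 pel units. A positive column offset points toward the right neighbor, a positive row offset toward the neighbor below.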
|
|
Adds code to return an integer cost list for NSTEP search. Then
uses it for pruned subpel search in speed 3.
derf: -0.06%
Speed on mobcal 720p increases from 10.28 fps to 10.65 fps.
[Subject to further testing].
Change-Id: Ib591382d25b2c11bcaba9d3a27a93a9d1ab27a96
|
|
Updates the vp9_pattern_search function to return integer one-away
neighbors' sad values, for subsequent use in speeding up the
sub-pel search. Also, removes code for the do_refine option
which is not being used currently.
Updates the integer and subpel functions to pass in a 5-element
sad list for output or input.
A new pruned sub-pel search algorithm is implemented that uses
the sad returned from the integer pel search. But it is not
deployed yet.
Change-Id: Ifa9f5ad024b5b660570366d2bd900343e1891520
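A 5-element SAD list of this kind could be filled as in the sketch below. The helper names block_sad and fill_sad_list are hypothetical, and the ordering (center, left, below, right, above) is an assumption rather than something stated in the commit. The caller is assumed to guarantee that the reference buffer has at least one pixel of margin around the center block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences between a w x h source block and a
 * reference block of the same size. */
static unsigned block_sad(const uint8_t *src, int src_stride,
                          const uint8_t *ref, int ref_stride, int w, int h) {
  unsigned sad = 0;
  assert(w > 0 && h > 0);
  for (int r = 0; r < h; ++r) {
    for (int c = 0; c < w; ++c) {
      sad += (unsigned)abs(src[r * src_stride + c] - ref[r * ref_stride + c]);
    }
  }
  return sad;
}

/* Fills sad_list with the SAD at the best integer position and at its
 * four one-away neighbors. Assumed order: center, left, below, right,
 * above. */
static void fill_sad_list(const uint8_t *src, int src_stride,
                          const uint8_t *ref_center, int ref_stride,
                          int w, int h, unsigned sad_list[5]) {
  static const int kRowOff[5] = { 0, 0, 1, 0, -1 };
  static const int kColOff[5] = { 0, -1, 0, 1, 0 };
  for (int i = 0; i < 5; ++i) {
    const uint8_t *ref = ref_center + kRowOff[i] * ref_stride + kColOff[i];
    sad_list[i] = block_sad(src, src_stride, ref, ref_stride, w, h);
  }
}
```

A sub-pel search can then inspect the list to decide which half-pel candidates are worth evaluating instead of testing all of them.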
|
|
Change-Id: I3d9130e726a1299fd258f6dfe93315e2d12f76da
|
|
Deleted vp9_find_best_sub_pixel_comp_tree(), and combined it in
vp9_find_best_sub_pixel_tree().
Change-Id: Ifb25763c8b19822df5537cc1daa76ce88dc3b056
|
|
Change-Id: I12389f801ebd3bd2ae3bf31e125433bfb429ee65
|
|
Remove two unused parameters in the function
vp9_refining_search_8p_c().
Change-Id: Ic192734586291cf5400926eeb8e720e69d40835c
|
|
Change-Id: I961d50d6fafdd37ef7f23f0a871d28e28d2084ca
|
|
Change-Id: I2ad333553e673dbabcdc0f0366aea311e90849bf
|
|
Change-Id: Id81a76d18be6b2de69f81bb563d74c3bb356d434
|
|
Change-Id: I0df8c2a6d9863f92ee406010f2daeb5e40627649
|
|
|
|
Adds a fast diamond search which is about 5% faster than FAST_HEX
with only a 0.1% drop in psnr when turned on for both speeds 5 and 7.
This search is turned on for speed 7.
Change-Id: I497630aa88a5148926086bb3038e7975e5f4eb98
|
|
Change-Id: Ic535f0a1c2501c1af143237af3b2b51b4b4980f4
|
|
The core motion estimation functions now all consistently return SAD.
The only exception is vp9_full_pixel_diamond(), however the core diamond
and refining search routines called from vp9_full_pixel_diamond() also
return SAD. If variance of pred error + mv cost is desired it must be
calculated explicitly outside these functions. For very fast encoding,
hopefully this will eliminate some redundant computations.
Also suggests reimplementing FAST_HEX with the vp9_pattern_search
framework. It is not exactly the same as the existing FAST_HEX, but
performance is slightly better and speed is very similar. Enables
removing a lot of duplicate code.
Change-Id: I152736393438c25bdf7e96b37cbb8ce330f4f94a
|
|
|
|
In good quality mode motion search, the best matches are normally
found after searching in a large area. In real time mode, to make
encoding fast, a center-biased fast HEX search is used, which
converges quickly most of the time. A 4-point diamond search is
then carried out as a refining search, which gives more
precise results and maintains good motion search quality.
At speed 5, the borg test on rtc set showed an overall PSNR loss of
0.936%. The encoding speed gain is 4% - 5%.
Change-Id: I42cd68bb56a09ca1b86293c99d5f7312225ca7ae
|
|
Passing block MV pointer instead of block index into
vp9_full_search_sad{, x3, x8} functions.
Change-Id: Ica07356633471c2c8f81b583a7aeba85a436bafb
|
|
Change-Id: If33a5a12c4025d9b5ec863dfccea7ee70f800665
|
|
Change-Id: Ie79114bba4f0cea55d9f701e20d2be2017630f3b
|
|
Change-Id: Ib71d9ed3f98e9468ad951bdc24c9ab565216eb38
|
|
|
|
Change-Id: I74cf028e8c732cd0dbc070326152d3085b824a80
|
|
Change-Id: I4f51ce859a97bf1b8fd2b37ac585b7c643232b69
|
|
Change-Id: I660b53da8ebf3049832ce8a10721051c4e0ebb00
|
|
Change-Id: Icf3b3dd96d7e133a4ad7260cd95288f6217998a6
|
|
Change-Id: Ifd432fa3741ba47102d298e0b348eb00f5a9ce53
|
|
The original iterative search was replaced by the subpel_tree search
and is no longer used.
Change-Id: I998b38e1cb0ee359a08b2410d0766dbf183ab071
|
|
Change-Id: I068345f722a7116e3119927295ad23a28d3066a0
|
|
Change-Id: I8b81a3e4b4fa530a654c28d9c136afa0c1d379fd
|
|
This function sets the motion search range limit. Rename it to be
more informative.
Change-Id: I2e8e01073dcb99c9bea9c9acd0a61d672d615444
|
|
Explicitly constrain the upper limit of motion search range (in the
unit of full pixel) to be [-1023, +1023]. It is intended to control
the effective motion search range for 4K sequences.
Change-Id: I645539c70885eec0f155781f439d97d333336e88
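A minimal sketch of such a constraint, assuming the search limits are kept as four full-pixel bounds. The MvLimits type and the function names are hypothetical; only the 1023 bound comes from the message above.

```c
#include <assert.h>

/* Bound from the commit message, in full pixels. */
enum { kMaxFullPel = 1023 };

/* Hypothetical container for the full-pixel search window. */
typedef struct {
  int col_min;
  int col_max;
  int row_min;
  int row_max;
} MvLimits;

static int clamp_int(int v, int lo, int hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}

/* Restricts every bound of the search window to [-1023, +1023]. */
static void clamp_mv_limits(MvLimits *lim) {
  assert(lim);
  lim->col_min = clamp_int(lim->col_min, -kMaxFullPel, kMaxFullPel);
  lim->col_max = clamp_int(lim->col_max, -kMaxFullPel, kMaxFullPel);
  lim->row_min = clamp_int(lim->row_min, -kMaxFullPel, kMaxFullPel);
  lim->row_max = clamp_int(lim->row_max, -kMaxFullPel, kMaxFullPel);
}
```

For a 4K frame a window derived from the frame dimensions alone could span several thousand pixels. After the clamp it never exceeds 1023 full pixels in any direction.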
|
|
Making this change in order to move allow_high_precision_mv field
from MACROBLOCKD structure to VP9_COMMON (because it is a frame level
flag).
Change-Id: I1d006ba36d938e0caf4d40fa051e2e38df9c1108
|
|
|
|
|
|
Change-Id: I9795d0937bc07793c13d067281995e0750f694d9
|
|
Both first pass and mbgraph search use block size 16x16 for motion
estimation. This commit puts a limit on the motion vector range. The
effective range allows the entire 16x16 block, together with the input
required for subpel interpolation, to lie completely outside the image
border, but no further away from it.
Change-Id: Id70a5ed08be49e70959f064859d72adc7d775d08
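The horizontal part of such a limit could be computed as sketched below. The MvRange type, the function name and the interp_extend parameter (the number of extra pixels the interpolation filter reads on each side) are assumptions made for illustration. The vertical limit would follow the same pattern with rows.

```c
#include <assert.h>

/* Hypothetical full-pixel range for one MV component. */
typedef struct {
  int min;
  int max;
} MvRange;

/* Column MV range for the 16x16 block at column index mb_col in a frame
 * that is mb_cols blocks wide. The block may move until it and its
 * interpolation margin are just outside the frame, but no further. */
static MvRange first_pass_col_range(int mb_col, int mb_cols,
                                    int interp_extend) {
  const int border = 16 + interp_extend;
  MvRange r;
  assert(mb_col >= 0 && mb_col < mb_cols);
  assert(interp_extend >= 0);
  r.min = -(mb_col * 16 + border);
  r.max = (mb_cols - 1 - mb_col) * 16 + border;
  return r;
}
```

For the leftmost block of a frame that is 10 blocks wide, with an assumed margin of 4 pixels, this gives the range [-20, 164].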
|
|
Updating fractional_mv_step_fp and fractional_mv_step_comp_fp function
types.
Change-Id: I601c4378bc39ac3ffd4e295d9cbd8e1f74829d46
|
|
Converting vp9_mv_bit_cost, mv_err_cost, and mvsad_err_cost
functions for now.
Change-Id: I60e3cc20daef773c2adf9a18e30bc85b1c2eb211
|
|
Change-Id: I3c45916a9059f11b41e9d798e34ffee052969a44
|
|
Adds a new subpel motion estimation function that uses a 2-level
tree-structured search to eliminate redundant computations.
It searches fewer points than the iterative search (which can search
the same point multiple times) but has roughly the same quality.
This is made the default setting at speeds 0 and 1, while at
speed 2 and above only a 1-level search is used.
Also includes various cleanups for consistency and redundancy removal.
Results:
derf: +0.012% psnr
stdhd: +0.09% psnr
Speedup of about 2-3%
Change-Id: Iedde4866f5475586dea0f0ba4cb7428fba24eee9
|
|
Removes some unused code and speed features, and organizes the
interfaces for fractional mv step functions for use in new speed
features to come.
In the process a new speed feature - number of iterations per
step during the subpel search - is exposed.
No change when this parameter is set as the original value of 3.
Results:
subpel_iters_per_step = 3: baseline
subpel_iters_per_step = 2: psnr -0.067%, 1% speedup
subpel_iters_per_step = 1: psnr -0.331%, 3-4% speedup
Change-Id: I2eba8a21f6461be8caf56af04a5337257a5693a8
|
|
Adds a few pattern searches to achieve various tradeoffs
between motion estimation complexity and performance.
The search framework is unified across these searches so that a
common pattern search function is used for all. In addition, it will
be easier to experiment with various patterns or combinations
thereof at different scales in the future.
The new pattern search is multi-scale and is capable of using
different patterns at different scales.
The new hex search uses 8 points at the smallest scale
and 6 points at other scales.
Two other pattern searches - big-diamond and square are
also added. Big diamond uses 4 points at the smallest scale and
8 points in diamond shape at the larger scales.
Square is very similar conceptually to the default n-step search
but is somewhat faster since it keeps only one survivor across
all scales.
Psnr/speed-up results on derf300:
hex: -1.6% psnr, 6-8% speedup
big-diamond: -0.96% psnr, 4-5% speedup
square: -0.93% psnr, 4-5% speedup
Change-Id: I02a7ef5193f762601e0994e2c99399a3535a43d2
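The single-survivor, multi-scale idea can be sketched as follows. For brevity this example uses the same 4-point diamond at every scale, whereas the commit describes different patterns at different scales. The Pos and CostFn types, pattern_search and the bowl_cost test function are all hypothetical and are not the libvpx interface.

```c
#include <assert.h>

typedef struct {
  int row;
  int col;
} Pos;

/* Hypothetical cost callback, e.g. SAD plus MV cost at position p. */
typedef unsigned (*CostFn)(Pos p, const void *ctx);

/* Coarse-to-fine search that keeps only one survivor. At each scale the
 * pattern is re-centered on the best point until no candidate improves,
 * then the step is halved. */
static Pos pattern_search(Pos start, int first_scale, CostFn cost,
                          const void *ctx) {
  static const Pos kDiamond[4] = { { -1, 0 }, { 0, -1 }, { 0, 1 }, { 1, 0 } };
  Pos best = start;
  unsigned best_cost = cost(best, ctx);
  assert(first_scale >= 0 && first_scale < 16);
  for (int s = first_scale; s >= 0; --s) {
    const int step = 1 << s;
    int improved;
    do {
      const Pos center = best;
      improved = 0;
      for (int i = 0; i < 4; ++i) {
        Pos cand;
        unsigned c;
        cand.row = center.row + kDiamond[i].row * step;
        cand.col = center.col + kDiamond[i].col * step;
        c = cost(cand, ctx);
        if (c < best_cost) {
          best_cost = c;
          best = cand;
          improved = 1;
        }
      }
    } while (improved);
  }
  return best;
}

/* Example cost: squared distance to a target position passed via ctx. */
static unsigned bowl_cost(Pos p, const void *ctx) {
  const Pos *target = (const Pos *)ctx;
  const int dr = p.row - target->row;
  const int dc = p.col - target->col;
  return (unsigned)(dr * dr + dc * dc);
}
```

A real encoder would also clamp each candidate to the allowed MV range before evaluating it. That check is omitted here.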
|
|
Using different variable names "allow_hp" and "use_hp" instead of "usehp".
Change-Id: I0cd5996ddeb46bd754473b680a993c0aaf8eb879
|
|
Renamed cpi->sf.first_step to cpi->sf.reduce_first_step_size
and changed its meaning such that it is a delta applied to
reduce the default first step size (>> x) in the motion search
rather than an absolute value.
The default first step size is already changed according to the image
dimensions (smaller for smaller images). cpi->sf.reduce_first_step_size
now applies a further correction from the default.
Change-Id: Ia94e08bc24c67b604831f980909af7e982fcd16d
|
|
Merge this experiment so that it is under a speed feature
flag not a configuration flag.
Change-Id: I536f7f125a4ff5149bb3a64f791e835c324535fd
|
|
This patch creates a new inter mode context that avoids
a dependence on the reconstructed motion vectors from
neighboring blocks. This was a change requested by
a hardware vendor to improve decode performance.
As part of this change I have also made some modifications
to stats output code (under a flag) to allow accumulation of
inter mode context flags over multiple clips.
Some further changes will be required to accommodate the
deprecation of the split mv mode over the next few days.
Performance as stands is around -0.25% on derf and
std-hd but up on the YT and YT-HD sets. With further tuning
or some adjustment to the context criteria it should be
possible to make this change broadly neutral.
Change-Id: Ia15cb4470969b9e87332a59c546ae0bd40676f6c
|
|
In the current code, motion vectors obtained in single prediction mode
are used directly in compound prediction mode. These motion vectors may
not give accurate prediction since they are searched independently. In
this patch, we took Pascal's suggestion and did a joint motion search in
compound prediction mode to find better motion vectors in this situation.
Test results:
Overall PSNR: 0.570%(derf), 0.918%(stdhd);
SSIM: 0.572%(derf), 1.009%(stdhd);
The encoder is a little slower. This can be improved since some C
code is used in the motion search.
Change-Id: Ib30c9240f6c56c9b070867b4ca89412a76d9f3c6
|
|
All members can be referenced from their per-plane counterparts. This
also removes assumptions about 24 blocks per macroblock.
Change-Id: I7ff2fa72d22c29163eb558981c8193765a8113d9
|