path: root/vp9/encoder/vp9_mcomp.h
Age | Commit message | Author
2015-11-11 | Add AVX vectorized vp9_diamond_search_sad | Geza Lore
This function now has an AVX intrinsics version which is about 80% faster compared to the C implementation. This provides a 2-4% total speed-up for encode, depending on encoding parameters. The function utilizes 3 properties of the cost function lookup table, constructed in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
For the joint cost:
- mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
For the component costs:
- For all i: mvsadcost[0][i] == mvsadcost[1][i] (equal per component cost)
- For all i: mvsadcost[0][i] == mvsadcost[0][-i] (cost function is even)
These must hold, otherwise the AVX version of the function cannot be used.
Change-Id: I6c2791d43022822a9e6ab43cd124a773946d0bdc
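The three table properties listed in this commit message can be sketched as a runtime check. This is only an illustration, not code from the commit: the function name, the parameter layout, and the `max_mv` bound are assumptions, and the component tables are assumed to be passed as pointers to their centre element so that negative indices are valid.

```c
#include <stdbool.h>

/* Hypothetical helper (not part of libvpx): returns true when the SAD cost
 * tables satisfy the three properties the commit message requires.
 * joint  : the 4 joint costs.
 * comp0/1: pointers to index 0 of each component cost table; entries
 *          -max_mv..max_mv must be readable through them. */
static bool sad_cost_tables_allow_vectorized_search(const int joint[4],
                                                    const int *comp0,
                                                    const int *comp1,
                                                    int max_mv) {
  /* Property 1: all non-zero joint classes share one cost. */
  if (joint[1] != joint[2] || joint[2] != joint[3]) return false;

  for (int i = -max_mv; i <= max_mv; ++i) {
    /* Property 2: row and column components cost the same. */
    if (comp0[i] != comp1[i]) return false;
    /* Property 3: the cost function is even. */
    if (comp0[i] != comp0[-i]) return false;
  }
  return true;
}
```

A caller could run such a check once after building the tables and fall back to the C implementation when it fails.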
2015-11-06 | Revert "Add AVX vectorized vp9_diamond_search_sad" | James Zern
This reverts commit f1342a7b070ef61b9fbdf03e899ac2107cfcb6bd.
This breaks 32-bit builds:
runtime error: load of misaligned address 0xf72fdd48 for type 'const __m128i' (vector of 2 'long long' values), which requires 16 byte alignment
+ _mm_set1_epi64x is incompatible with some versions of Visual Studio
Change-Id: I6f6fc3c11403344cef78d1c432cdc9147e5c1673
2015-11-05 | Add AVX vectorized vp9_diamond_search_sad | Geza Lore
This function now has an AVX intrinsics version which is about 80% faster compared to the C implementation. This provides a 2-4% total speed-up for encode, depending on encoding parameters. The function utilizes 3 properties of the cost function lookup table, constructed in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
For the joint cost:
- mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
For the component costs:
- For all i: mvsadcost[0][i] == mvsadcost[1][i] (equal per component cost)
- For all i: mvsadcost[0][i] == mvsadcost[0][-i] (cost function is even)
These must hold, otherwise the AVX version of the function cannot be used.
Change-Id: I184055b864c5a2dc37b2d8c5c9012eb801e9daf6
2015-10-28 | Convert motion search config from AoS to SoA | Geza Lore
This is a prerequisite for vectorizing vp9_diamond_search_sad_c. Change-Id: I49cd9148782410ca8b16e8a468ca9e7c6d088410
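As a rough illustration of what an array-of-structures to structure-of-arrays conversion looks like, and why it helps vectorization, here is a minimal sketch. All type names, field names, and the `MAX_SITES` capacity are hypothetical and are not taken from the libvpx sources.

```c
#include <stddef.h>

#define MAX_SITES 8 /* hypothetical capacity, for illustration only */

typedef struct { int row, col; } mv_t; /* hypothetical stand-in for MV */

/* AoS: each search site keeps its motion vector and buffer offset together. */
typedef struct { mv_t mv; int offset; } site_aos;

/* SoA: one contiguous array per field, so a vector load can fetch the
 * same field for several consecutive sites at once. */
typedef struct {
  mv_t mvs[MAX_SITES];
  int offsets[MAX_SITES];
  size_t count;
} sites_soa;

/* Copies up to MAX_SITES entries from the AoS layout into the SoA layout. */
static void sites_aos_to_soa(const site_aos *in, size_t n, sites_soa *out) {
  if (n > MAX_SITES) n = MAX_SITES;
  for (size_t i = 0; i < n; ++i) {
    out->mvs[i] = in[i].mv;
    out->offsets[i] = in[i].offset;
  }
  out->count = n;
}
```

With the SoA layout, the offsets of neighbouring candidates sit next to each other in memory, which is the access pattern SIMD loads expect.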
2015-08-28 | vp9_mcomp: make search functions private | James Zern
vp9_full_pixel_search() can be used as a replacement as it dispatches to all search methods Change-Id: I57fcb79c1362b569dc95237bdcc8390f54efd440
2015-07-07 | Move sub pixel variance to vpx_dsp | Johann
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
2015-05-22 | Fix integral projection motion search for frame resize | Jingning Han
This commit fixes the integral projection motion search crash when frame resize is used. It fixes issue 994. Change-Id: Ieeb52619121d7444f7d6b3d0cf09415f990d1506
2015-03-04 | Move integral projection motion search to vp9_mcomp.c | Jingning Han
Make it a general purpose fast motion estimation function, to be used in the mode search process. Change-Id: Ib354cb0e664dc61c30c0b2314297835ee75b157a
2014-11-17 | change to call vp9_refining_search_sad() directly | Yaowu Xu
The function pointer in compressor instance does not change, so this commit changes to call the function directly. Change-Id: I9c9c460e3475711c384b74c9842f0b4f3d037cc5
2014-10-08 | Subpel search cleanups and enhancements | Deb Mukherjee
- Some fixes to surface fit.
- Returns variance function as cost rather than sad in the pattern search and diamond search functions. Only vp9_pattern_search_sad function used in bigdia search uses sad as integer 1-away costs.
- Deploys SUBPEL_TREE_PRUNED_MORE for speed 4+.
Results:
derf [Speed 3]: About +0.036% in coding efficiency without any discernible speed loss.
derf [Speed 4]: About 2-3% faster at -0.199% loss in coding efficiency.
derf [Speed 5]: About 3-4% faster at -0.149% loss in coding efficiency.
Change-Id: I8462f94f6adb46966ca964f2bd0400977357fd63
2014-09-29 | Adds two new subpel search methods | Deb Mukherjee
One is a more aggressive version of the pruned subpel tree search where only a single halfpel candidate is searched. The search candidate is based on a surface fit result. The other is a method to obtain the subpel position at one shot based on the same surface fit. The methods have not been deployed in any speed setting yet. Change-Id: I34fef3f2e34f11396c9d1ba97f4be8c4ffca62d3
2014-09-23 | Pruned subpel search for speed 3. | Deb Mukherjee
Adds code to return an integer cost list for NSTEP search. Then uses it for pruned subpel search in speed 3.
derf: -0.06%
Speed on mobcal 720p increases from 10.28 fps to 10.65 fps. [Subject to further testing].
Change-Id: Ib591382d25b2c11bcaba9d3a27a93a9d1ab27a96
2014-08-28 | Updates vp9_pattern search to return integer sads | Deb Mukherjee
Updates the vp9_pattern_search function to return integer one-away neighbors' sad values, for subsequent use in speeding up the sub-pel search. Also, removes code for the do_refine option which is not being used currently. Updates the integer and subpel functions to pass in a 5-element sad list for output or input. A new pruned sub-pel search algorithm is implemented that uses the sad returned from the integer pel search. But it is not deployed yet. Change-Id: Ifa9f5ad024b5b660570366d2bd900343e1891520
2014-07-11 | Remove an unused parameter in vp9_init_search_range() | Yaowu Xu
Change-Id: I3d9130e726a1299fd258f6dfe93315e2d12f76da
2014-07-09 | Remove repetitive code in mcomp.c | Yunqing Wang
Deleted vp9_find_best_sub_pixel_comp_tree(), and combined it in vp9_find_best_sub_pixel_tree(). Change-Id: Ifb25763c8b19822df5537cc1daa76ce88dc3b056
2014-06-12 | Moving full_pixel_search() to vp9_mcomp.c. | Dmitry Kovalev
Change-Id: I12389f801ebd3bd2ae3bf31e125433bfb429ee65
2014-05-14 | Silence unused parameter warnings. | Paul Wilkins
Remove two unused parameters in the function vp9_refining_search_8p_c(). Change-Id: Ic192734586291cf5400926eeb8e720e69d40835c
2014-05-01 | Using SPEED_FEATURES instead of VP9_COMP in vp9_init_search_range(). | Dmitry Kovalev
Change-Id: I961d50d6fafdd37ef7f23f0a871d28e28d2084ca
2014-04-29 | Adding search_site_config struct. | Dmitry Kovalev
Change-Id: I2ad333553e673dbabcdc0f0366aea311e90849bf
2014-04-11 | Removing unused cost arguments from mcomp functions. | Dmitry Kovalev
Change-Id: Id81a76d18be6b2de69f81bb563d74c3bb356d434
2014-03-26 | Cleaning up vp9_get_mvpred_{av_,}var() functions. | Dmitry Kovalev
Change-Id: I0df8c2a6d9863f92ee406010f2daeb5e40627649
2014-03-10 | Merge "Support for a fast diamond search" | Deb Mukherjee
2014-03-07 | Support for a fast diamond search | Deb Mukherjee
Adds a fast diamond search which is about 5% faster than FAST_HEX with only a 0.1% drop in psnr when turned on for both speeds 5 and 7. This search is turned on for speed 7. Change-Id: I497630aa88a5148926086bb3038e7975e5f4eb98
2014-03-06 | Cleaning up vp9_get_mvpred_var(). | Dmitry Kovalev
Change-Id: Ic535f0a1c2501c1af143237af3b2b51b4b4980f4
2014-03-03 | Refactoring motion search libs | Deb Mukherjee
The core motion estimation functions all return sad now consistently. The only exception is vp9_full_pixel_diamond(), however the core diamond and refining search routines called from vp9_full_pixel_diamond() also return SAD. If variance of pred error + mv cost is desired it must be calculated explicitly outside these functions. For very fast encoding, hopefully this will eliminate some redundant computations. Also suggests reimplementing FAST_HEX with the vp9_pattern_search framework. It is not exactly the same as the existing FAST_HEX, but performance is slightly better and speed is very similar. Enables removing a lot of duplicate code. Change-Id: I152736393438c25bdf7e96b37cbb8ce330f4f94a
2014-02-25 | Merge "Changing vp9_full_search_sad{, x3, x8} signatures." | Dmitry Kovalev
2014-02-18 | Use fast HEX search in real time mode | Yunqing Wang
In good quality mode motion search, the best matches are normally found after searching in a large area. In real time mode, to make encoding fast, a center-biased fast HEX search is used, which converges quickly most of the time. A 4-point diamond search is also carried out as the following refining search, which gives more precise results, and maintains good motion search quality. At speed 5, the borg test on rtc set showed an overall PSNR loss of 0.936%. The encoding speed gain is 4% - 5%. Change-Id: I42cd68bb56a09ca1b86293c99d5f7312225ca7ae
2014-02-17 | Changing vp9_full_search_sad{, x3, x8} signatures. | Dmitry Kovalev
Passing block MV pointer instead of block index into vp9_full_search_sad{, x3, x8} functions. Change-Id: Ica07356633471c2c8f81b583a7aeba85a436bafb
2014-02-13 | Using MV instead of int_mv inside vp9_full_pixel_diamond(). | Dmitry Kovalev
Change-Id: If33a5a12c4025d9b5ec863dfccea7ee70f800665
2014-02-12 | Adding consts to mv search function arguments. | Dmitry Kovalev
Change-Id: Ie79114bba4f0cea55d9f701e20d2be2017630f3b
2014-01-31 | Cleaning up vp9_mcomp.{c, h}. | Dmitry Kovalev
Change-Id: Ib71d9ed3f98e9468ad951bdc24c9ab565216eb38
2014-01-31 | Merge "Cleaning up motion compensation code." | Dmitry Kovalev
2014-01-23 | Cleaning up motion compensation code. | Dmitry Kovalev
Change-Id: I74cf028e8c732cd0dbc070326152d3085b824a80
2014-01-23 | vp9/encoder: add extern "C" to headers | James Zern
Change-Id: I4f51ce859a97bf1b8fd2b37ac585b7c643232b69
2014-01-17 | Cleaning up vp9_refining_search_sad() function. | Dmitry Kovalev
Change-Id: I660b53da8ebf3049832ce8a10721051c4e0ebb00
2014-01-16 | Cleaning up vp9_refining_search_8p_c() function. | Dmitry Kovalev
Change-Id: Icf3b3dd96d7e133a4ad7260cd95288f6217998a6
2014-01-03 | Replacing int_mv with MV. | Dmitry Kovalev
Change-Id: Ifd432fa3741ba47102d298e0b348eb00f5a9ce53
2013-12-19 | Remove an unused sub-pixel search | Yunqing Wang
The original iterative search was replaced by subpel_tree search and is no longer used. Change-Id: I998b38e1cb0ee359a08b2410d0766dbf183ab071
2013-12-13 | Using MV struct instead of int_mv union in encoder (2). | Dmitry Kovalev
Change-Id: I068345f722a7116e3119927295ad23a28d3066a0
2013-12-13 | Using MV struct instead of int_mv union in encoder. | Dmitry Kovalev
Change-Id: I8b81a3e4b4fa530a654c28d9c136afa0c1d379fd
2013-12-11 | Rename clamp_mv_min_max to set_mv_search_range | Jingning Han
This function sets the motion search range limit. Rename it to be more informative. Change-Id: I2e8e01073dcb99c9bea9c9acd0a61d672d615444
2013-11-18 | Constrain encoder motion search range | Jingning Han
Explicitly constrain the motion search range (in units of full pixels) to [-1023, +1023]. It is intended to control the effective motion search range for 4K sequences. Change-Id: I645539c70885eec0f155781f439d97d333336e88
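The range constraint described here amounts to clamping each search bound to [-1023, +1023] full pixels. The sketch below shows that idea only; the struct, its field names, and the function names are hypothetical and do not reproduce the actual libvpx code.

```c
/* Hypothetical search-range container; the field names are assumptions. */
typedef struct { int row_min, row_max, col_min, col_max; } search_range;

/* Limit stated in the commit message, in full-pixel units. */
#define MAX_FULL_PEL 1023

/* Clamps v to the closed interval [lo, hi]. */
static int clamp_int(int v, int lo, int hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}

/* Clamps all four bounds of the search window to [-1023, +1023]. */
static void constrain_search_range(search_range *r) {
  r->row_min = clamp_int(r->row_min, -MAX_FULL_PEL, MAX_FULL_PEL);
  r->row_max = clamp_int(r->row_max, -MAX_FULL_PEL, MAX_FULL_PEL);
  r->col_min = clamp_int(r->col_min, -MAX_FULL_PEL, MAX_FULL_PEL);
  r->col_max = clamp_int(r->col_max, -MAX_FULL_PEL, MAX_FULL_PEL);
}
```

Bounds already inside the limit pass through unchanged, so the clamp only affects very large frames such as the 4K sequences the message mentions.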
2013-10-17 | Adding allow_hp as an argument to mv search functions. | Dmitry Kovalev
Making this change in order to move allow_high_precision_mv field from MACROBLOCKD structure to VP9_COMMON (because it is a frame level flag). Change-Id: I1d006ba36d938e0caf4d40fa051e2e38df9c1108
2013-09-29 | Merge "Moving from int_mv* to MV* (3)." | Dmitry Kovalev
2013-09-25 | Merge "Limit mv search range for first pass and mbgraph" | Yaowu Xu
2013-09-25 | Moving from int_mv* to MV* (3). | Dmitry Kovalev
Change-Id: I9795d0937bc07793c13d067281995e0750f694d9
2013-09-24 | Limit mv search range for first pass and mbgraph | Yaowu Xu
Both first pass and mbgraph search use block size 16x16 for motion estimation. This commit puts a limit on the motion vector range. The effective range allows the entire 16x16 block, with the required subpel interpolation input, to be completely outside the image border, but not any further away from the image border. Change-Id: Id70a5ed08be49e70959f064859d72adc7d775d08
2013-09-24 | Moving from int_mv* to MV* (2). | Dmitry Kovalev
Updating fractional_mv_step_fp and fractional_mv_step_comp_fp function types. Change-Id: I601c4378bc39ac3ffd4e295d9cbd8e1f74829d46
2013-09-20 | Moving from int_mv to MV. | Dmitry Kovalev
Converting vp9_mv_bit_cost, mv_err_cost, and mvsad_err_cost functions for now. Change-Id: I60e3cc20daef773c2adf9a18e30bc85b1c2eb211
2013-08-12 | Using MV* instead of int_mv* as argument of vp9_clamp_mv_min_max. | Dmitry Kovalev
Change-Id: I3c45916a9059f11b41e9d798e34ffee052969a44