summaryrefslogtreecommitdiff
path: root/vp9/common/vp9_idct.h
AgeCommit message (Collapse)Author
2018-09-15cosmetics: normalize include guardsJames Zern
use the recommended format [1] of: <PROJECT>_<PATH>_<FILE>_H_ [1] https://google.github.io/styleguide/cppguide.html#The__define_Guard "All header files should have #define guards to prevent multiple inclusion. The format of the symbol name should be <PROJECT>_<PATH>_<FILE>_H_." Change-Id: I2e8ab0b32fb23c30fa43cff5fec12d043c0d2037
2017-05-03Update highbd idct functions arguments to use uint16_t dstLinfeng Zhang
BUG=webm:1388 Change-Id: I3581d80d0389b99166e70987d38aba2db6c469d5
2016-08-02vp9/common: apply clang-formatclang-format
Change-Id: Ie0f150fdcfcbf7c4db52d3a08bc8238ed1c72e3b
2015-08-04Replace vp9_ prefix with vpx_ prefix in vpx_dsp function namesJingning Han
This commit clears the function naming convention in vpx_dsp. It replaces vp9_ prefix of global functions with vpx_ prefix. It also removes the vp9_ prefix from static functions. Change-Id: I6394359a63b71a51dda01342eec6a3cc08dfeedf
2015-07-31Factor inverse transform functions into vpx_dspJingning Han
This commit moves the module inverse transform functions from vp9 to vpx_dsp folder. The hybrid transform wrapper functions stay in the vp9 folder, since it involves codec-specific data structures. Change-Id: Ib066367c953d3d024c73ba65157bbd70a95c9ef8
2015-07-31Code refactor on InterpKernelZoe Liu
It in essence refactors the code for both the interpolation filtering and the convolution. This change includes the moving of all the files as well as the changing of the code from vp9_ prefix to vpx_ prefix accordingly, for underneath architectures: (1) x86; (2) arm/neon; and (3) mips/msa. The work on mips/drsp2 will be done in a separate change list. Change-Id: Ic3ce7fb7f81210db7628b373c73553db68793c46
2015-07-26Refactor vp9_idct.h fileJingning Han
Separate the common coefficient constant into vpx_dsp/txfm_common.h. Move the SSE2 macro definitions to vpx_dsp/x86/txfm_common_sse2.h. This clears the use case of vp9_idct.h in vpx_dsp folder. Change-Id: I319735a2abf42888e5080ac14cfbcde34be7b121
2015-07-08Clean out more MSVC warningsYaowu Xu
Change-Id: I1bab0c104df2ec4825d050cd516e26ab635a7b3e
2015-05-13Relocate memory operations for common codeJohann
With the sad functions, and hopefully the variance functions soon, moving to the vpx_dsp location, place the defines used in the reference C code in a common location. Change-Id: I4c8ce7778eb38a0a3ee674d2f1c488eda01cfeca
2015-01-06Enable coefficient range checking for 10-/12-bitDeb Mukherjee
Also fixes a broken build with --enable-coefficient-range-checking configuration option. Change-Id: Icc536f53088e8cec59dfb8f635668555fdb9125e
2014-11-24Refactored idct routines and headersPeter de Rivaz
This change is made in preparation for a subsequent patch which adds acceleration for the highbitdepth transform functions. The highbitdepth transform functions attempt to use 16/32bit sse instructions where possible, but fallback to using the C implementations if potential overflow is detected. For this reason the dct routines are made global so they can be called from the acceleration functions in the subsequent patch. Change-Id: Ia921f191bf6936ccba4f13e8461624b120c1f665 (cherry picked from commit 454342d4e77dbb67f4a3c10f97a57a6fcb46d9a0)
2014-11-05Fix visual studio 2013 compiler warningsYaowu Xu
For configured with --enable-vp9-highbitdepth Change-Id: I2b181519d7192f8d7a241ad5760c3578255f24e6
2014-10-09Rename highbitdepth functions to use highbd prefixDeb Mukherjee
Uses highbd_ prefix convention consistently. Change-Id: I58f7f799a7ff8e32701bcd71c955bcf1cdd4581e
2014-10-06Add range check in inverse ADST 16x16Jingning Han
Bit-stream clarification related to Issue 868. Change-Id: I92a7bc5b7782c9ea5c3f6cceec761742183c9514
2014-10-01Remove redundant header file from vp9_idct.hJingning Han
Change-Id: Id92544762e7b96d3c729dfc8e04ecff91cbcc7f9
2014-09-30Moves transform type defines to vp9_commonDeb Mukherjee
Moves transform type defines to vp9_common.h from vp9_idct.h so that they can be included in vp9_rtcd_defs.pl safely. Change-Id: Id5106227bee5934f7ce8b06f2eb9fa8a9a2e0ddb
2014-09-30Revert "Fix compiling error in vp9_idct.h"James Zern
This reverts commit eafc8c9c40d712aabe234bed5269a02c62fa0bfc. tran_low_t/tran_high_t don't belong in a public header, they're private. Similarly the public headers shouldn't rely on config defines, vpx_config.h isn't installed. Change-Id: I194ec273598da418df8dd727b6c0e78a556740ad
2014-09-30Fix compiling error in vp9_idct.hJingning Han
This commit fixes a compiling error in vp9_idct.h, where the codec checks that the intermediate steps of transformation fit within 16-bit length. The issue was due to broken file dependency. Change-Id: Ib22bba13a1e6df28489cb23d6774c561969f1fdc
2014-09-11Adds high bitdepth transform functions and testsDeb Mukherjee
Adds various high bitdepth transform functions and tests. Much of the changes are related to using typedefs tran_low_t and tran_high_t for the final transform cofficients and intermediate stages of the transform computation respectively rather than fixed types int16_t/int. When vp9_highbitdepth configure flag is off, these map tp int16_t/int32_t, but when the flag is on, they map to int32_t/int64_t to make space for needed extra precision. Change-Id: I3c56de79e15b904d6f655b62ffae170729befdd8
2014-08-06configure: add --enable-coefficient-range-checkingYaowu Xu
This commit adds a configure time option used to enable strict error checking in decoder to make sure intermediate stage cofficients of inverse transforms are within valid range of signed 16 bit integer. For valid VP9 input streams, intermediate stage coefficients should always stay within the range of a signed 16 bit integer. Coefficients can go out of this range for invalid/corrupt VP9 streams. However, strictly checking this range for every intermediate coefficient can be a burden for decoder, therefore such validation is only enabled with configure option --enable-coefficient-range-checking. Change-Id: I47d47c8c4e48a922c3d223ca59064f51b3f0f5ed
2014-05-28Enable SSSE3 inverse 2D-DCT with 10 non-zero coeffsJingning Han
This commit enables SSSE3 implementation of the inverse 2D-DCT with only first 10 coefficients non-zero. It reduces the runtime of SSE2 version from 745 cycles to 538 cycles, i.e., 27% speed-up. Change-Id: I18ba4128859b09c704a6ee361d69a86c09fe8dfe
2014-05-01Moving pair_set_epi32 macro into vp9_dct32x32_sse2.c.Dmitry Kovalev
Change-Id: I642a7d343677bf934e9a54cf4ad78e908620e39a
2014-01-23vp9/common: add extern "C" to headersJames Zern
Change-Id: Ic334da9aee968e33762c2b25d9fbad24c844b411
2013-11-15Take out assertion from inverse transformsJingning Han
Separate the rounding and right shift operations of forward transform from those of inverse transform. Take out the assertion check from inverse transforms. If the transform coefficients were constructed to cause intermediate steps of inverse transform overflow, the codec will just let it overflow without breaking the decoding flow. Change-Id: I73cfc3706c4e840fc543a77cbc4cdb0b05d07730
2013-10-11Making input pointer of any inverse transform constant.Dmitry Kovalev
Also renaming dest_stride to stride in some places. Change-Id: I75f602b623a5a7071d4922b747c45fa0b7d7a940
2013-10-11Consistent names for inverse hybrid transforms (2 of 2).Dmitry Kovalev
Renames: vp9_iht_add -> vp9_iht4x4_add vp9_iht_add_8x8 -> vp9_iht8x8_add vp9_iht_add_16x16 -> vp9_iht16x16_add Change-Id: I8f1a2913e02d90d41f174f27e4ee2fad0dbd4a21
2013-10-11Adding const to the input argument of all 1D transforms.Dmitry Kovalev
Also adding static to iadst16_1d and fadst16 functions. Change-Id: I13c7df3b776f0f8efc6e80099bdb0a2f6d29edaf
2013-10-10Giving consistent names to IDCT 32x32 functions.Dmitry Kovalev
Renames: vp9_short_idct32x32_add -> vp9_idct32x32_1024_add vp9_short_idct32x32_1_add -> vp9_idct32x32_1_add vp9_idct_add_32x32 -> vp9_idct32x32_add Change-Id: Id85306f5814bac6c47463a6b5901a93082510666
2013-10-07Giving consistent names to IDCT 16x16 functions.Dmitry Kovalev
Renames: vp9_short_idct16x16_add -> vp9_idct16x16_256_add vp9_short_idct16x16_10_add -> vp9_idct16x16_10_add vp9_short_idct16x16_1_add -> vp9_idct16x16_1_add vp9_idct_add_16x16 -> vp9_idct16x16_add Change-Id: Ief8a3904de78deab0f4ede944c4d0339c228cfc3
2013-10-06Giving consistent names to IDCT 8x8 functions.Dmitry Kovalev
Renames: vp9_short_idct8x8_add -> vp9_idct8x8_64_add vp9_short_idct8x8_1_add -> vp9_idct8x8_1_add vp9_short_idct8x8_10_add -> vp9_idct8x8_10_add vp9_idct_add_8x8 -> vp9_idct8x8_add Change-Id: Ifb8d3a45b4c0397aa805b30463f3d14581bf72c1
2013-10-04Giving consistent names to IDCT/IWHT functions.Dmitry Kovalev
The idea is to have the following names for each transform size: vp9_idct4x4_add vp9_idct4x4_1_add vp9_idct4x4_10_add vp9_idct4x4_16_add vp9_idct8x8_add vp9_idct8x8_1_add vp9_idct8x8_10_add vp9_idct8x8_64_add etc for 16x16, 32x32 The actual list of renames in this patch: vp9_idct_add_lossless -> vp9_iwht4x4_add vp9_short_iwalsh4x4_add -> vp9_iwht4x4_16_add vp9_short_iwalsh4x4_1_add -> vp9_iwht4x4_1_add vp9_idct_add -> vp9_idct4x4_add vp9_short_idct4x4_add -> vp9_idct4x4_16_add vp9_short_idct4x4_1_add -> vp9_idct4x4_1_add Change-Id: I6f43f7437c68dd30cdd05d72e213765578ed30b1
2013-10-02Moving all idct/iht functions in one place.Dmitry Kovalev
Moving functions from vp9_idct_blk to vp9_idct because these functions are used from both encoder and decoder. Removing duplicated code from vp9_encodemb.c and reusing existing functions. Change-Id: Ia0a6782f8c4c409efb891651b871dd4bf22d5fe8
2013-09-24Rename defined constantsYaowu Xu
The change is to better reflect the nature of the constants. Change-Id: Icabac6e9bceefbdb3f03f8218f88ef75943c30fb
2013-09-19fix integer overflow errorsYaowu Xu
Change-Id: I76f440a917832c02d7a727697b225bac66b99f56
2013-08-12SSE2 high precision 32x32 forward DCTJingning Han
Enable SSE2 implementation of high precision 32x32 forward DCT. The intermediate stacks are of 32-bits. The run-time goes down from 32126 cycles to 13442 cycles. Change-Id: Ib5ccafe3176c65bd6f2dbdef790bd47bbc880e56
2013-07-15Removing and moving around constant definitions.Dmitry Kovalev
Removing unused and duplicated constants, moving them from *.h to *.c if possible. Change-Id: Ief4d6b984a3ca2e9b38504f0d855ed072cf7133f
2013-06-29SSE2 version of vp9_short_fdct32x32_rd.Christian Duvivier
43,000 -> 5,750 cycles, about 7.5x faster. Change-Id: Ibfd92821b9603f4ed9c256e0ececec14fa4565d0
2013-06-18Make fdct32 computation flow within 16bit rangeJingning Han
This commit makes use of dual fdct32x32 versions for rate-distortion optimization loop and encoding process, respectively. The one for rd loop requires only 16 bits precision for intermediate steps. The original fdct32x32 that allows higher intermediate precision (18 bits) was retained for the encoding process only. This allows speed-up for fdct32x32 in the rd loop. No performance loss observed. Change-Id: I3237770e39a8f87ed17ae5513c87228533397cc3
2013-05-30Changed to use a new variant of WHTYaowu Xu
The commit changed to use a new variant of Walsh-Hadamard Transform by Tim Terriberry. This new variant has the best compression among a number of variants that developed by Tim. Change-Id: Icb3a88515463cfc644b17ca046fcd139db2557e9
2013-03-18Optimize 8x8 idct functionYunqing Wang
Wrote sse2 functions of vp9_short_idct8x8 and vp9_short_idct10_8x8. Compared to c version, the sse2 version is 2X faster. The decoder test didn't show noticeable gain since 8x8 idct doesn't take much of decoding time (less than 1% in my test). Change-Id: I56313e18cd481700b3b52c4eda5ca204ca6365f3
2013-03-07Consistent usage of ROUND_POWER_OF_TWO macro.Dmitry Kovalev
Change-Id: I44660975e9985310d8c654c158ee7a61291b5a08
2013-03-04Optimize vp9_short_idct4x4llm functionYunqing Wang
Wrote a SSE2 vp9_short_idct4x4llm to improve the decoder performance. Change-Id: I90b9d48c4bf37aaf47995bffe7e584e6d4a2c000
2013-02-27Faster vp9_short_fdct8x8.Christian Duvivier
Scalar path is about 1.4x faster (4% overall encoder speedup). SSE2 path is about 7x faster (13% overall encoder speedup). Change-Id: I7e85d8225a914a74c61ea370210414696560094d
2013-02-27Code cleanup.Dmitry Kovalev
Fixing code style, using array lookup instead of switch statements for forward hybrid transforms (in the same way as for their inverses). Consistent usage of ROUND_POWER_OF_TWO macro in appropriate places. Change-Id: I0d3822ae11f928905fdbfbe4158f91d97c71015f
2013-02-27Merge "Optimize vp9_dc_only_idct_add_c function" into experimentalYunqing Wang
2013-02-26Optimize vp9_dc_only_idct_add_c functionYunqing Wang
Wrote SSE2 version of vp9_dc_only_idct_add_c function. In order to improve performance, clipped the absolute diff values to [0, 255]. This allowed us to keep the additions/subtractions in 8 bits. Test showed an over 2% decoder performance increase. Change-Id: Ie1a236d23d207e4ffcd1fc9f3d77462a9c7fe09d
2013-02-26Improve 32x32 forward dctYaowu Xu
The commit improves the 32x32 forward dct implementation: 1. change to use same constants and rounding as other forward dcts 2. select rounding to specifically minimize the roundtrip error, which improved average 19/block to .77/block using 100000 random input. Test showed a small but consistent gain on all test sets, about .15% Change-Id: If0afd6a71880a522f60c1c234be0462092c2eb53
2013-02-25clean up forward and inverse hybrid transformJingning Han
Rebased. Remove the old matrix multiplication transform computation. The 16x16 ADST/DCT can be switched on/off and evaluated by setting ACTIVE_HT16 300/0 in vp9/common/vp9_blockd.h. Change-Id: Icab2dbd18538987e1dc4e88c45abfc4cfc6e133f
2013-02-22Code cleanup.Dmitry Kovalev
Removing redundant 'extern' keywords and parentheses, fixing indentation, making variable names lower case, using short expressions x *= c instead of x = x * c, minor code simplifications. Change-Id: If6a25fcf306d1db26e90d27e3c24a32735c607de
2013-02-20Code cleanup.Dmitry Kovalev
Change-Id: I7c6e3bebd94856b24dbe2aded7f9e04ef8bb8c08