summaryrefslogtreecommitdiff
path: root/vp9/common/vp9_idct.c
AgeCommit message (Collapse)Author
2015-08-04Replace vp9_ prefix with vpx_ prefix in vpx_dsp function namesJingning Han
This commit clears the function naming convention in vpx_dsp. It replaces vp9_ prefix of global functions with vpx_ prefix. It also removes the vp9_ prefix from static functions. Change-Id: I6394359a63b71a51dda01342eec6a3cc08dfeedf
2015-07-31Factor inverse transform functions into vpx_dspJingning Han
This commit moves the module inverse transform functions from vp9 to vpx_dsp folder. The hybrid transform wrapper functions stay in the vp9 folder, since it involves codec-specific data structures. Change-Id: Ib066367c953d3d024c73ba65157bbd70a95c9ef8
2015-05-13Relocate memory operations for common codeJohann
With the sad functions, and hopefully the variance functions soon, moving to the vpx_dsp location, place the defines used in the reference C code in a common location. Change-Id: I4c8ce7778eb38a0a3ee674d2f1c488eda01cfeca
2015-04-28vpx_mem: remove vpx_memsetJames Zern
vestigial. replace instances with memset() which they already were being defined to. Change-Id: Ie030cfaaa3e890dd92cf1a995fcb1927ba175201
2015-01-06Enable coefficient range checking for 10-/12-bitDeb Mukherjee
Also fixes a broken build with --enable-coefficient-range-checking configuration option. Change-Id: Icc536f53088e8cec59dfb8f635668555fdb9125e
2014-11-24Refactored idct routines and headersPeter de Rivaz
This change is made in preparation for a subsequent patch which adds acceleration for the highbitdepth transform functions. The highbitdepth transform functions attempt to use 16/32bit sse instructions where possible, but fallback to using the C implementations if potential overflow is detected. For this reason the dct routines are made global so they can be called from the acceleration functions in the subsequent patch. Change-Id: Ia921f191bf6936ccba4f13e8461624b120c1f665 (cherry picked from commit 454342d4e77dbb67f4a3c10f97a57a6fcb46d9a0)
2014-11-07Iadst transforms to use internal low precisionDeb Mukherjee
Change-Id: I266777d40c300bc53b45b205144520b85b0d6e58 (cherry picked from commit a1b726117f5470f227bc90cd030b7d25045dc510)
2014-11-05Fix visual studio 2013 compiler warningsYaowu Xu
For configured with --enable-vp9-highbitdepth Change-Id: I2b181519d7192f8d7a241ad5760c3578255f24e6
2014-10-09Rename highbitdepth functions to use highbd prefixDeb Mukherjee
Uses highbd_ prefix convention consistently. Change-Id: I58f7f799a7ff8e32701bcd71c955bcf1cdd4581e
2014-10-06Add range check in inverse ADST 16x16Jingning Han
Bit-stream clarification related to Issue 868. Change-Id: I92a7bc5b7782c9ea5c3f6cceec761742183c9514
2014-10-04Some data type changes in vp9_idct.cDeb Mukherjee
Resolves a visual studio warning, and includes some cleanups. Change-Id: I6a7576ef323c475b7d1c659800cd82c6cb1fd18d
2014-10-03Incorporate WRAPLOW macro into non-highbitdepth txDeb Mukherjee
Incorporates the WRAPLOW macro into the non-highbitdepth transforms to aid hardware verification between a software C model and an intended hardware implementation though the use of the configure options: --enable-experimental --enable-emulate-hardware. Note that to avoid further discrepancies between the sse/sse2 implementations of the transforms and the C implementation, when the emulate hardware option is invoked, we also disable sse/sse2/etc. Also incudes some minor cleanups/renaming etc. Change-Id: Ib864d8493313927d429cce402982f1c8e45b3287
2014-09-30Remove redundant header file declarationJingning Han
Some header file in vp9_idct.c has been included in vp9_idct.h. This commit removes these redundant declarations. Change-Id: I0238c27e4efff5c981eb437022c6bc6970c4e445
2014-09-11Adds high bitdepth transform functions and testsDeb Mukherjee
Adds various high bitdepth transform functions and tests. Much of the changes are related to using typedefs tran_low_t and tran_high_t for the final transform cofficients and intermediate stages of the transform computation respectively rather than fixed types int16_t/int. When vp9_highbitdepth configure flag is off, these map tp int16_t/int32_t, but when the flag is on, they map to int32_t/int64_t to make space for needed extra precision. Change-Id: I3c56de79e15b904d6f655b62ffae170729befdd8
2014-05-08Change eob threshold for partial inverse 8x8 2D-DCT to 12Jingning Han
The scanning order has the first 12 coefficients of the 8x8 2D-DCT sitting in the top left 4x4 block. Hence the partial inverse 8x8 2D-DCT allows to handle cases with eob below 12. The overall runtime of the inverse 8x8 2D-DCT unit is reduced from 166 cycles (using SSE2) to 150 cycles (using SSSE3). Change-Id: I4514f9748042809ac84df4c14382c00f313f1cd2
2014-01-27Removing _1d suffix from transform names.Dmitry Kovalev
It is enough to specify (e.g.) idct16, it is obviously different from idct16x16. Change-Id: I6b408a37a945de3162429380b59a775b03b95db0
2013-11-20Remove unnecessary eob checking.hkuang
Change-Id: Ia568f70bddc1a2b62141a0197459119ca74c22b5
2013-11-14Fix coding format in vp9_idctJingning Han
Change-Id: If97ae16a4478717933345b6b9d5bc1b417b8dd84
2013-10-24Add 32x32 idct function for eob<=34 caseYunqing Wang
When only upper-left 8x8 area has non-zero dct coefficients, we could skip 1D IDCT for 9th to 32th rows to save operations. This function is called when eob <= 34. Change-Id: I9684b75947bdde346cfe3720f08a953aa7a13fb5
2013-10-11Making input pointer of any inverse transform constant.Dmitry Kovalev
Also renaming dest_stride to stride in some places. Change-Id: I75f602b623a5a7071d4922b747c45fa0b7d7a940
2013-10-11Consistent names for inverse hybrid transforms (2 of 2).Dmitry Kovalev
Renames: vp9_iht_add -> vp9_iht4x4_add vp9_iht_add_8x8 -> vp9_iht8x8_add vp9_iht_add_16x16 -> vp9_iht16x16_add Change-Id: I8f1a2913e02d90d41f174f27e4ee2fad0dbd4a21
2013-10-11Consistent names for inverse hybrid transforms (1 of 2).Dmitry Kovalev
Renames: vp9_short_iht4x4_add -> vp9_iht4x4_16_add vp9_short_iht8x8_add -> vp9_iht8x8_64_add vp9_short_iht16x16_add_c -> vp9_iht16x16_256_add Change-Id: Ibca7a188fd062b196787ac5efc1ea545e7f166c0
2013-10-11Adding const to the input argument of all 1D transforms.Dmitry Kovalev
Also adding static to iadst16_1d and fadst16 functions. Change-Id: I13c7df3b776f0f8efc6e80099bdb0a2f6d29edaf
2013-10-10Removing vp9_idct4_1d_sse2 function.Dmitry Kovalev
We have two SSE2-optimized functions for idct4_1d: vp9_idct4_1d_sse2 <-- removing this one idct4_1d_sse2 vp9_idct4_1d_sse2 was used only by the following functions which already have SSE2 optimized variants: vp9_idct4x4_16_add_c -> vp9_idct4x4_16_add_see2 idct8_1d -> vp9_idct8x8_{16, 10, 1}_see2 vp9_short_iht4x4_add_c -> vp9_short_iht4x4_add_see2 Change-Id: Ib0a7f6d1373dbaf7a4a41208cd9d0671fdf15edb
2013-10-10Giving consistent names to IDCT 32x32 functions.Dmitry Kovalev
Renames: vp9_short_idct32x32_add -> vp9_idct32x32_1024_add vp9_short_idct32x32_1_add -> vp9_idct32x32_1_add vp9_idct_add_32x32 -> vp9_idct32x32_add Change-Id: Id85306f5814bac6c47463a6b5901a93082510666
2013-10-10Merge "Giving consistent names to IDCT 16x16 functions."Dmitry Kovalev
2013-10-08All zero coeff skip in IDCT 32x32Jingning Han
When all coefficients are zeros, skip the corresponding 1-D inverse transform. This practice has been used in the SSE2 implementation of inverse 32x32 DCT. This commit imports this algorithm into the C code. Change-Id: I0f58bfcb183a569fab85d524d5d9cf8ae8653f86
2013-10-07Giving consistent names to IDCT 16x16 functions.Dmitry Kovalev
Renames: vp9_short_idct16x16_add -> vp9_idct16x16_256_add vp9_short_idct16x16_10_add -> vp9_idct16x16_10_add vp9_short_idct16x16_1_add -> vp9_idct16x16_1_add vp9_idct_add_16x16 -> vp9_idct16x16_add Change-Id: Ief8a3904de78deab0f4ede944c4d0339c228cfc3
2013-10-06Giving consistent names to IDCT 8x8 functions.Dmitry Kovalev
Renames: vp9_short_idct8x8_add -> vp9_idct8x8_64_add vp9_short_idct8x8_1_add -> vp9_idct8x8_1_add vp9_short_idct8x8_10_add -> vp9_idct8x8_10_add vp9_idct_add_8x8 -> vp9_idct8x8_add Change-Id: Ifb8d3a45b4c0397aa805b30463f3d14581bf72c1
2013-10-04Giving consistent names to IDCT/IWHT functions.Dmitry Kovalev
The idea is to have the following names for each transform size: vp9_idct4x4_add vp9_idct4x4_1_add vp9_idct4x4_10_add vp9_idct4x4_16_add vp9_idct8x8_add vp9_idct8x8_1_add vp9_idct8x8_10_add vp9_idct8x8_64_add etc for 16x16, 32x32 The actual list of renames in this patch: vp9_idct_add_lossless -> vp9_iwht4x4_add vp9_short_iwalsh4x4_add -> vp9_iwht4x4_16_add vp9_short_iwalsh4x4_1_add -> vp9_iwht4x4_1_add vp9_idct_add -> vp9_idct4x4_add vp9_short_idct4x4_add -> vp9_idct4x4_16_add vp9_short_idct4x4_1_add -> vp9_idct4x4_1_add Change-Id: I6f43f7437c68dd30cdd05d72e213765578ed30b1
2013-10-02Moving all idct/iht functions in one place.Dmitry Kovalev
Moving functions from vp9_idct_blk to vp9_idct because these functions are used from both encoder and decoder. Removing duplicated code from vp9_encodemb.c and reusing existing functions. Change-Id: Ia0a6782f8c4c409efb891651b871dd4bf22d5fe8
2013-09-30Removing vp9_add_constant_residual_{8x8, 16x16, 32x32} functions.Dmitry Kovalev
We don't need these functions anymore. The only one which was actually used is vp9_add_constant_residual_32x32. Addition of vp9_short_idct32x32_1_add eliminates this single usage. SSE2 optimized version of vp9_short_idct32x32_1_add will be added in the next patch set, right now it is only C implementation. Now we have all idct functions implemented in a consistent manner. Change-Id: I63df79a13cf62aa2c9360a7a26933c100f9ebda3
2013-09-27Renaming vp9_short_idct10_8x8_add to vp9_short_idct8x8_10_add.Dmitry Kovalev
Making name consistent with vp9_short_idct8x8 and vp9_short_idct8x8_1. Change-Id: I99e0be040ec893f9571dcf090e18f98dc58339f5
2013-09-26Renaming vp9_short_idct10_16x16 to vp9_short_idct16x16_10.Dmitry Kovalev
Making function name consistent with vp9_short_idct16x16 and vp9_short_idct16x16_1. Change-Id: I70e54be9e6b9a1dddab0de470686591e96d05517
2013-09-24Rename defined constantsYaowu Xu
The change is to better reflect the nature of the constants. Change-Id: Icabac6e9bceefbdb3f03f8218f88ef75943c30fb
2013-08-01Remove unused vp9_short_idct10_32x32_addJingning Han
The inverse 32x32 transform detects all zero entries and skips the computations accordingly per 8 rows in the first 1-D operation. The function vp9_short_idct10_32x32_add performs differently and is not used anywhere, hence removed. Change-Id: Ic4fad422debbde7b6b6ffed47c69fbd4268a906c
2013-07-2916x16 inverse 2D-DCT with DC onlyJingning Han
This commit provides special handle on 16x16 inverse 2D-DCT, where only DC coefficient is quantized to be non-zero value. Change-Id: I7bf71be7fa13384fab453dc8742b5b50e77a277c
2013-07-26Special handle on DC only inverse 8x8 2D-DCTJingning Han
This commit enables a special handle for the 8x8 inverse 2D-DCT, where only DC coefficient is quantized to be non-zero. For bus_cif at 2000 kbps, it provides about 1% speed-up at speed 0. Change-Id: I2523222359eec26b144cf8fd4c63a4ad63b1b011
2013-07-24Merge vp9_dc_only_idct_add and vp9_short_idct4x4_1Jingning Han
They share the same functionality, so merging together. Change-Id: I98a0386fcee052cb854f9ff90c283c1b844bcb79
2013-07-17Remove unnecessary buffer copy in idct4x4.hkuang
Change-Id: I386066b9bcfb4bffb582e6827af36ca0181f6a83
2013-07-16SSE2 16x16 inverse ADST/DCT hybrid transformJingning Han
This commit enables SSE2 implementation of 16x16 inverse ADST/DCT hybrid transform. The runtime goes from 5742 cycles -> 1821 cycles. This provides about 1% encoding speed-up at speed 0. Change-Id: I1678d0988bf30b9efd524877705bbb3645edb17b
2013-07-12Using vp9_copy and vp9_zero instead of custom code.Dmitry Kovalev
Change-Id: Id9b6ceeddca3f9b34bfada5c499b1e7a2f42c30b
2013-05-30Changed to use a new variant of WHTYaowu Xu
The commit changed to use a new variant of Walsh-Hadamard Transform by Tim Terriberry. This new variant has the best compression among a number of variants that developed by Tim. Change-Id: Icb3a88515463cfc644b17ca046fcd139db2557e9
2013-05-27Reduce WHT complexity.Timothy B. Terriberry
Saves 1 add, 3 shifts (and a shift bias) per 1-D transform. Change-Id: I1104bb1679fe342b2f9677df8a9cdc0cb9699e7d
2013-05-21Removed unused idct functionsScott LaVarnway
No longer used. Change-Id: Id28c9247cebba183c6fa786dff96824ae100132c
2013-05-20WIP: 4x4 idct/recon mergeScott LaVarnway
This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: I296604bf73579c45105de0dd1adbcc91bcc53c22
2013-05-16WIP: 8x8 idct/recon mergeScott LaVarnway
This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: Iacfd57324fbe2b7beca5d7f3dcae25c976e67f45
2013-05-15WIP: 16x16 idct/recon mergeScott LaVarnway
This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: Iea7976b22b1927d24b8004d2a3fddae7ecca3ba1
2013-05-14WIP: 32x32 idct/recon mergeScott LaVarnway
This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: I4ea09df0e162591e420d869b7431c2e7f89a8c1a
2013-03-13removed reference to "LLM" and "x8"Yaowu Xu
The commit changed the name of files and function to remove obselete reference to LLM and x8. Change-Id: I973b20fc1a55149ed68b5408b3874768e6f88516