Age | Commit message | Author |
|
|
|
Change-Id: Ic64b6928af7ae8ecc987f845b0bf0faecdacb072
|
|
A new version of vp9_highbd_error_8bit is now available, optimized with
AVX assembly. AVX itself does not buy us much, but the non-destructive
3-operand encoding of the 128-bit SSEn integer instructions helps to
eliminate move instructions. The Sandy Bridge micro-architecture cannot
eliminate move instructions in the processor front end, so AVX will help
on these machines.
Two further optimizations are applied:
1. The common case of computing block error on 4x4 blocks is optimized
as a special case.
2. All arithmetic is speculatively done in 32 bits only. At the end of
the loop, the code detects whether overflow might have happened and, if
so, re-executes the whole computation using higher precision arithmetic.
This case, however, is extremely rare in real use, so we achieve a large
net gain here.
The optimizations rely on the fact that the coefficients are in the
range [-(2^15-1), 2^15-1], and that the quantized coefficients always
have the same sign as the input coefficients (in the worst case they are
0). These are the same assumptions that the old SSE2 assembly code for
the non high bitdepth configuration relied on. The unit tests have been
updated to take this constraint into consideration when generating test
input data.
Change-Id: I57d9888a74715e7145a5d9987d67891ef68f39b7
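The speculative 32-bit scheme described in this message can be sketched in portable C++. This is an illustration only, not the AVX assembly from the commit: the function names are hypothetical, and the overflow test shown here (OR-ing every partial sum and checking bit 31 after the loop) is one possible way to detect that overflow might have happened. It is sound only under the stated input constraints, which keep every squared term below 2^30.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; this is a sketch, not the libvpx code.
// Assumed preconditions (taken from the commit message):
//   * every coeff[i] lies in [-(2^15 - 1), 2^15 - 1]
//   * dqcoeff[i] is 0 or has the same sign as coeff[i]
// Under these, each squared term is below 2^30.

// Slow path: full 64-bit arithmetic, always exact.
static int64_t block_error_64(const int16_t* coeff, const int16_t* dqcoeff,
                              size_t n, int64_t* ssz) {
  int64_t err = 0, sq = 0;
  for (size_t i = 0; i < n; ++i) {
    const int64_t c = coeff[i];
    const int64_t d = c - dqcoeff[i];
    err += d * d;
    sq += c * c;
  }
  *ssz = sq;
  return err;
}

// Fast path: accumulate in 32 bits, then check whether overflow was possible.
int64_t block_error_speculative(const int16_t* coeff, const int16_t* dqcoeff,
                                size_t n, int64_t* ssz) {
  uint32_t err = 0, sq = 0, seen = 0;
  for (size_t i = 0; i < n; ++i) {
    const int32_t c = coeff[i];
    const uint32_t uc = static_cast<uint32_t>(c);
    const uint32_t ud = static_cast<uint32_t>(c - dqcoeff[i]);
    err += ud * ud;  // unsigned arithmetic wraps instead of invoking UB
    sq += uc * uc;
    seen |= err | sq;  // remember every bit any partial sum ever had set
  }
  // A partial sum must pass through [2^31, 2^32) before it can wrap, because
  // each term is below 2^30. If bit 31 was never set, no wrap occurred.
  if (seen & 0x80000000u) {
    return block_error_64(coeff, dqcoeff, n, ssz);
  }
  *ssz = static_cast<int64_t>(sq);
  return static_cast<int64_t>(err);
}
```

The check is conservative: it may send a block to the slow path even when the 32-bit sum did not actually wrap, which costs time but never correctness.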
|
|
to make meaning of color_range obvious.
Change-Id: I303582e448b82b3203b497e27b22601cc718dfff
|
|
single-threaded:
swanky (silvermont): ~1% faster overall
peppy (celeron,haswell): ~1.5% faster overall
Change-Id: Ib74f014374c63c9eaf2d38191cbd8e2edcc52073
|
|
Change-Id: Iccb4cdc23c1845cf9cb7d69101c9f4f43675d368
|
|
If the high bit depth configuration is enabled but encoding is in
profile 0, the code now falls back to the optimized SSE2 assembly to
compute the block errors, as it does when high bit depth is not enabled.
Change-Id: I471d1494e541de61a4008f852dbc0d548856484f
|
|
* changes:
vp9/tile_worker_hook: add multiple tile decoding
invalid_file_test: loosen error check w/tile-threading
|
|
some mingw32 configs define this. force this to be on to ensure the
build succeeds
Change-Id: I2cc490782b6a0736aa617e6a1457fc2bc984adbb
|
|
The serial decode check is too strict for tile-threaded decoding as
there is no guarantee on the decode order nor which specific error
will take precedence. Currently a tile-level error is not forwarded so
the frame will simply be marked corrupt.
Change-Id: I51cf1e39e44bedeac93746154b36a4ccb2f059b1
|
|
|
|
|
|
|
|
Change-Id: I936c2430c3c5b1e0ab5dec0a20110525e925b5e4
|
|
Change-Id: I2000820e0c04de2c975d370a0cf7145330289bb2
|
|
* changes:
vp9_thread_test: clarify test case names
vp9_thread_test: add non-frame-parallel files
|
|
|
|
define NOMINMAX to allow the std:: versions to be used; min/max will be
defined transitively via windows.h otherwise
Change-Id: I692b03fa3e70b7a53962d3fd209498f70f712fed
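A minimal sketch of the pattern this message describes. The helper name clamp_to_range is hypothetical, and the _WIN32 guard is only there so the snippet also builds on other platforms. The point is that NOMINMAX has to be defined before windows.h is pulled in, directly or transitively, so that the min/max macros do not shadow std::min and std::max.

```cpp
// Define NOMINMAX before any include that may reach <windows.h>.
#if defined(_WIN32)
#ifndef NOMINMAX
#define NOMINMAX
#endif
#include <windows.h>
#endif

#include <algorithm>

// Hypothetical helper: with NOMINMAX in effect, these calls resolve to the
// std:: function templates rather than to the windows.h macros.
int clamp_to_range(int value, int low, int high) {
  return std::max(low, std::min(value, high));
}
```

Defining the macro on the compiler command line instead of in the source would have the same effect and avoids depending on include order.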
|
|
Change-Id: Iad73b490b171cdda5c368ada69fb8eab2a86c156
|
|
|
|
|
|
|
|
In the decoder, map this to the output variable vpx_image_t.r_w/h.
This is intended as an improved version of VP9D_GET_DISPLAY_SIZE,
which doesn't work with parallel frame decoding. In the encoder,
map this to a codec control func (VP9E_SET_RENDER_SIZE) that takes
a w/h pair argument in an int[2] (identical to VP9D_GET_DISPLAY_SIZE).
Also add render_size to the encoder_param_get_to_decoder unit test.
See issue 1030.
Change-Id: I12124c13602d832bf4c44090db08c1009c94c7e8
|
|
comment out fdct32
remove fdct32 test
Change-Id: I31c47fb435377465cd3265e39621ca50d3aae656
|
|
|
|
Fails with Icac63051bf37c7355e661837b57c257d58c764fc reverted.
Change-Id: I460d7a5a74faa4daace25f911f8dc5f68e16c951
|
|
|
|
rename Decode[2-4] to something more precise
Change-Id: I68c4f189796eb11ac1a5b7b682f24efb71708187
|
|
these have been supported in tile-threaded decoding since:
b3b7645 vp9_dthread: remove frame_parallel_decoding_mode requirement
Change-Id: Ia5a752db9be937153cf4830d9258752136356d1b
|
|
This reverts commit 8903b9fa8345726efbe9b92a759c98cc21c4c14b.
there is no reason for these to be global
Change-Id: I66a31c06f8426aeca348ef12d9b9ab59d6d5e55d
|
|
|
|
remove static from fdct4/8/16/32 in vp10/encoder/dct.c
add prefix vp10_ to fdct4/8/16/32
add vp10/encoder/dct.h
Change-Id: I644827a191c1a7761850ec0b1da705638b618c66
|
|
Reallocation of the mi buffer fails if the size changes on the first
frame and the config changes in subsequent frames. Add a resolution
check condition to avoid the assertion failure.
BUG=1074
Change-Id: Ie26ed816a57fa871ba27a72db9805baaaeaba9f3
|
|
the range check in dct.c (abs(input[i]) < (1 << bit)) will fail in many
cases. this was broken at the time this check was added
BUG=1076
Change-Id: I3df8c7a555e95567d73ac16acda997096ab8d6e2
|
|
the range check in dct.c (abs(input[i]) < (1 << bit)) will fail in the
25-29 range. this was broken at the time this check was added
Change-Id: I8ca9607f6cbdc8be7f47696ffeabbab3ac5727e2
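For reference, the check quoted in these messages, abs(input[i]) < (1 << bit), can be sketched as a standalone predicate. The name in_range and its signature are assumptions made for illustration; the actual check in dct.c is not reproduced in this log.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: returns true when every value satisfies
// abs(input[i]) < (1 << bit). Assumes 0 <= bit < 63. The comparison is done
// in 64 bits so INT32_MIN and large shifts are handled without overflow.
bool in_range(const int32_t* input, size_t size, int bit) {
  const long long limit = 1LL << bit;
  for (size_t i = 0; i < size; ++i) {
    if (std::llabs(static_cast<long long>(input[i])) >= limit) {
      return false;
    }
  }
  return true;
}
```

Because the bound is strict, a value of exactly 1 << bit already fails, which is easy to trip when the bit budget for a transform stage is chosen too tightly.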
|
|
|
|
|
|
In the decoder, export (eventually) into the vpx_image_t.range field.
In the encoder, use oxcf->color_range to set it (same way as for
color_space).
See issue 1059.
Change-Id: Ieabbb2a785fa58cc4044bd54eee66f328f3906ce
|
|
Verify the dynamic resizer behavior for real time, 1 pass CBR mode.
Start at a low target bitrate, raise the bitrate in the middle of the
clip, and verify that scaling up does occur after the bitrate change.
Change-Id: I7ad8c9a4c8288387d897dd6bdda592f142d8870c
|
|
* changes:
vp9_encoder_parms_get_to_decoder: cosmetics
vp9...parms_get_to_decoder: remove unneeded func
vp9...parms_get_to_decoder: fix EXPECT param order
vp9_encoder_parms_get_to_decoder: delete dead code
fix BitstreamParms test
vp9_encoder_parms_get_to_decoder: remove vp10
yuvconfig2image(): add explicit cast to avoid conv warning
vp9/10 decoder_init: add missing alloc cast
vp9/10: set color_space on preview frame
vp10: add extern "C" to headers
vp9: add extern "C" to headers
|
|
Verify the dynamic resizer behavior for real time, 1 pass CBR mode.
Run at low bitrate, with resize_allowed = 1, and verify that we get
one resize down event.
Change-Id: Ic347be60972fa87f7d68310da2a055679788929d
|
|
Unify the style of fdct4() fdct8() fdct16()
Add fdct32()
Add range_check() at each stage
Add unit test at ../../test/vp10_dct_test.cc
Change-Id: I13f76d9046c3ea473c82024b09a5bc8662e2c28e
|
|
|
|
1) copy the following files from vpx_dsp/ to vp10/common/
vp10_inv_txfm.c
vp10_inv_txfm.h
vp10_inv_txfm_sse2.c
vp10_inv_txfm_sse2.h
2) change the function prefix "vpx_" to "vp10_" in the above files
3) add unit test at vp10_inv_txfm_test.cc
Change-Id: I206f10f60c8b27d872c84b7482c3bb1d1cb4b913
|
|
fix indent, */& association, join a few lines
Change-Id: Idaca24b87b574788f9508168082d0ade3d4e9ecc
|
|
removes a redundant cast in the process
Change-Id: Ie3727a0938c0093f70f25a875c2c58671938d45c
|
|
(expected, actual)
Change-Id: I449e7b6c51aa85cdde008d2fad5a9629970222a9
|
|
the only input is y4m, there's no need to test for yuv.
Change-Id: Ie5b55ea4af44ad79a55304ef5636a8ad7ed30bb8
|
|
avoid duplicating internal structures and include vp9_dx_iface.c
directly. these had fallen out of sync after the frame-parallel branch
merge.
Change-Id: I604cfbffa95abe2a1c8e906a696f32436b1422ed
|
|
this file needs to be reworked to remove the duplication of codec
internals + allow for divergence of vp9 and vp10
Change-Id: I6266b94ccfbc24dae30148f134804b52aa411b88
|