Age | Commit message (Collapse) | Author |
|
This commit allows the encoder to account for additional chroma
plane costs in the mode decision process, if the current block
potentially contains significant color change. It improves the
visual quality at very low bit-rates.
The compression performance of dark720p is improved by 12.39% at
speed 6. For jimred at 150 kbps, the PSNR of the V component (red)
increases by 0.2 dB, at the expense of about a 5% increase in
encoding time. Note that for sequences where the chroma components
are fairly consistent, the encoding time increase is negligible.
On average the rtc set compression performance is improved by
1.172% in PSNR and 1.920% in SSIM.
Change-Id: Ia55b24ef23a25304f7ec9958fbf07fd6e658505c
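A rough sketch of the idea described in this message, not the encoder's actual code: the chroma distortion enters the mode cost only when the block has been flagged as color sensitive. The names `ModeCost`, `mode_cost`, and `color_sensitive` are hypothetical and chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical per-mode cost terms, split by plane. */
typedef struct {
  int64_t y_dist;  /* luma distortion */
  int64_t uv_dist; /* combined chroma (U + V) distortion */
  int64_t rate;    /* rate term, already scaled to distortion units */
} ModeCost;

/* Returns the cost used to compare candidate modes. The chroma term is
 * added only for blocks flagged as containing significant color change,
 * so blocks with consistent chroma pay no extra evaluation cost. */
static int64_t mode_cost(const ModeCost *c, int color_sensitive) {
  int64_t cost = c->rate + c->y_dist;
  if (color_sensitive) cost += c->uv_dist;
  return cost;
}
```

With this shape, a mode that looks cheap on luma alone can lose the comparison once its chroma error is counted, which is the effect the message attributes to the change.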
|
|
In frame parallel decode, the libvpx decoder decodes several frames in
parallel across all CPUs. Unless it is flushed, it only returns a frame
when all the CPUs are busy. When it is flushed, it returns all the frames
held in the decoder. Compared with the current serial decode mode, in
which the libvpx decoder is idle between decode calls, the frame parallel
decoder stays busy between decode calls.
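The output behavior described above can be illustrated with a toy model. This is a minimal sketch that does not use the libvpx API: `ToyDecoder`, `toy_decode`, and `toy_flush` are hypothetical names, and the model assumes that a frame is returned exactly when a new frame is submitted while every worker is already occupied.

```c
#include <assert.h>

#define MAX_WORKERS 8

/* Toy model: each in-flight frame occupies one worker, oldest first. */
typedef struct {
  int frames[MAX_WORKERS]; /* ring buffer of in-flight frame ids */
  int head;                /* index of the oldest in-flight frame */
  int count;               /* number of busy workers */
  int workers;             /* total number of workers */
} ToyDecoder;

static void toy_init(ToyDecoder *d, int workers) {
  assert(workers > 0 && workers <= MAX_WORKERS);
  d->head = 0;
  d->count = 0;
  d->workers = workers;
}

/* Submit one frame. Returns the id of the frame that is output, or -1
 * when a worker was still free and nothing is returned yet. */
static int toy_decode(ToyDecoder *d, int frame_id) {
  int out = -1;
  if (d->count == d->workers) { /* all workers busy: retire the oldest */
    out = d->frames[d->head];
    d->head = (d->head + 1) % d->workers;
    d->count--;
  }
  d->frames[(d->head + d->count) % d->workers] = frame_id;
  d->count++;
  return out;
}

/* Flush: drain every pending frame in decode order into out[].
 * Returns the number of frames written. */
static int toy_flush(ToyDecoder *d, int *out) {
  int n = 0;
  while (d->count > 0) {
    out[n++] = d->frames[d->head];
    d->head = (d->head + 1) % d->workers;
    d->count--;
  }
  return n;
}
```

With two workers, the first two calls to `toy_decode` return nothing, the third returns the first frame, and a flush then yields the remaining two frames in order.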
Current frame parallel decode will only speed up the decoding for frame
parallel encoded videos. For non frame parallel encoded videos, frame
parallel decode is slower than serial decode due to lack of loopfilter
worker thread.
There are still some known issues that need to be addressed. For example,
decoding frame parallel videos with segmentation enabled is sometimes
incorrect.
* frame-parallel:
Add error handling for frame parallel decode and unit test for that.
Fix a bug in frame parallel decode and add a unit test for that.
Add two test vectors to test frame parallel decode.
Add key frame seeking to webmdec and webm_video_source.
Implement frame parallel decode for VP9.
Increase the thread test range to cover 5, 6, 7, 8 threads.
Fix a bug in adding frame parallel unit test.
Add VP9 frame-parallel unit test.
Manually pick "Make the api behavior conform to api spec." from master branch.
Move vp9_dec_build_inter_predictors_* to decoder folder.
Add segmentation map array for current and last frame segmentation.
Include the right header for VP9 worker thread.
Move vp9_thread.* to common.
ctrl_get_reference does not need user_priv.
Seperate the frame buffers from VP9 encoder/decoder structure.
Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""
Conflicts:
test/codec_factory.h
test/decode_test_driver.cc
test/decode_test_driver.h
test/invalid_file_test.cc
test/test-data.sha1
test/test.mk
test/test_vectors.cc
vp8/vp8_dx_iface.c
vp9/common/vp9_alloccommon.c
vp9/common/vp9_entropymode.c
vp9/common/vp9_loopfilter_thread.c
vp9/common/vp9_loopfilter_thread.h
vp9/common/vp9_mvref_common.c
vp9/common/vp9_onyxc_int.h
vp9/common/vp9_reconinter.c
vp9/decoder/vp9_decodeframe.c
vp9/decoder/vp9_decodeframe.h
vp9/decoder/vp9_decodemv.c
vp9/decoder/vp9_decoder.c
vp9/decoder/vp9_decoder.h
vp9/encoder/vp9_encoder.c
vp9/encoder/vp9_pickmode.c
vp9/encoder/vp9_rdopt.c
vp9/vp9_cx_iface.c
vp9/vp9_dx_iface.c
This reverts commit a18da9760a74d9ce6fb9f875706dc639c95402f5.
Change-Id: I361442ffec1586d036ea2e0ee97ce4f077585f02
|
|
master branch."
This reverts commit bde04ce5039cbcf86c8b34bdb4127e18d7e1d0c7
Change-Id: I053dae04c761b04a36dc239558503905a14d2470
|
|
In frame parallel decode, the libvpx decoder decodes several frames in
parallel across all CPUs. Unless it is flushed, it only returns a frame
when all the CPUs are busy. When it is flushed, it returns all the frames
held in the decoder. Compared with the current serial decode mode, in
which the libvpx decoder is idle between decode calls, the frame parallel
decoder stays busy between decode calls. VP9 frame parallel decode is >30%
faster than serial decode with tile parallel threading, which will make it
easier for devices to play 1080P VP9 videos.
* frame-parallel:
Add error handling for frame parallel decode and unit test for that.
Fix a bug in frame parallel decode and add a unit test for that.
Add two test vectors to test frame parallel decode.
Add key frame seeking to webmdec and webm_video_source.
Implement frame parallel decode for VP9.
Increase the thread test range to cover 5, 6, 7, 8 threads.
Fix a bug in adding frame parallel unit test.
Add VP9 frame-parallel unit test.
Manually pick "Make the api behavior conform to api spec." from master branch.
Move vp9_dec_build_inter_predictors_* to decoder folder.
Add segmentation map array for current and last frame segmentation.
Include the right header for VP9 worker thread.
Move vp9_thread.* to common.
ctrl_get_reference does not need user_priv.
Seperate the frame buffers from VP9 encoder/decoder structure.
Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""
Conflicts:
test/codec_factory.h
test/decode_test_driver.cc
test/decode_test_driver.h
test/invalid_file_test.cc
test/test-data.sha1
test/test.mk
test/test_vectors.cc
vp8/vp8_dx_iface.c
vp9/common/vp9_alloccommon.c
vp9/common/vp9_entropymode.c
vp9/common/vp9_loopfilter_thread.c
vp9/common/vp9_loopfilter_thread.h
vp9/common/vp9_mvref_common.c
vp9/common/vp9_onyxc_int.h
vp9/common/vp9_reconinter.c
vp9/decoder/vp9_decodeframe.c
vp9/decoder/vp9_decodeframe.h
vp9/decoder/vp9_decodemv.c
vp9/decoder/vp9_decoder.c
vp9/decoder/vp9_decoder.h
vp9/encoder/vp9_encoder.c
vp9/encoder/vp9_pickmode.c
vp9/encoder/vp9_rdopt.c
vp9/vp9_cx_iface.c
vp9/vp9_dx_iface.c
Change-Id: Ib92eb35851c172d0624970e312ed515054e5ca64
|
|
Change-Id: I90ad08823e1d038384536fa9f458caadc2c87f38
|
|
Uses highbd_ prefix convention consistently.
Change-Id: I58f7f799a7ff8e32701bcd71c955bcf1cdd4581e
|
|
Also includes yv12 config changes.
Change-Id: Iacf40d8bf486815b54c32a127ce3cd4516b7e44f
|
|
Use a local variable to hold the result of vp9_is_scaled.
Change-Id: I5e203909805923e20eefef596bc84424da47dbe2
|
|
The first comment is obsolete given the way is now normative in the VP9
bitstream. The second comment line was too long.
Change-Id: I6546585babf60d466485ddcf2daa6d2fa79e999a
|
|
As reported in issue #850, the condition for border extension was not
complete. This commit adds the case where scaling is enabled.
This fixes issue #850.
Change-Id: I67768b23f0dcc4ac9a9aa0a0825b0fe8cb85a72e
|
|
|
|
mi_grid_* are arrays of pointers to pointers. They store pointers to the
MIs in cm->mi, but they are unnecessary and complicated. The original goal
was to remove the MODE_INFO_t copy, but the same goal can be achieved with
an extra MODE_INFO_t pointer inside MODE_INFO_t.
This commit removes the mi_grid_* structures entirely. There are still
many dummy MODE_INFO_t entries inside cm->mi, which waste memory. The next
commit will allocate MODE_INFO_t on demand to save that memory.
Change-Id: I3a05cf1610679fed26e0b2eadd315a9ae91afdd6
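The pointer-inside-the-struct idea can be sketched as follows. This is an illustration under assumptions, not the libvpx definition: the struct `ModeInfo`, its `src_mi` field, and the helpers `set_block` and `mode_at` are hypothetical. Every grid position covered by a block points at the block's top-left entry, so no separate grid of pointers is needed.

```c
#include <assert.h>

/* Hypothetical mode info entry with a pointer to the entry that
 * actually holds the data for the block covering this position. */
typedef struct ModeInfo {
  int mode;                /* valid only in the block's top-left entry */
  struct ModeInfo *src_mi; /* entry holding this position's data */
} ModeInfo;

/* Mark a bh x bw block (in grid units) at (row, col): only the top-left
 * entry stores the mode, every covered entry points back to it. */
static void set_block(ModeInfo *mi, int stride, int row, int col,
                      int bh, int bw, int mode) {
  ModeInfo *top_left = &mi[row * stride + col];
  assert(bh > 0 && bw > 0 && col + bw <= stride);
  top_left->mode = mode;
  for (int r = 0; r < bh; ++r)
    for (int c = 0; c < bw; ++c)
      mi[(row + r) * stride + col + c].src_mi = top_left;
}

/* Look up the mode at any position through the embedded pointer. */
static int mode_at(const ModeInfo *mi, int stride, int row, int col) {
  return mi[row * stride + col].src_mi->mode;
}
```

In this sketch the entries that are not a block's top-left corner carry no data of their own, which corresponds to the dummy entries the message says still waste memory.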
|
|
Change-Id: Ie51c352a6b250547207cbc1ebba833a01ed053e3
|
|
The issue was discovered on bitstream with 2x vertical downscale. For
zero MVs, y_pad is set to 1 only when vertical convolution is
required. The original code assumes that for y_step_q4 == 32 we don't
perform vertical convolution. But vp9_setup_scale_factors_for_frame()
sets the convolve functions so that when x_step and y_step are both not
equal to 16, convolution is performed in both directions. In addition,
convolve() unconditionally subtracts one stride from the source pointer
when calling convolve_horiz(). This leads to invalid memory access.
Change-Id: I882dfa6081a58e172b5ffa55842bfcd6727f10bf
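The flaw can be illustrated with a small predicate. This is a sketch under assumptions: the message does not show the original condition, so `needs_y_pad_old` is only one plausible form of a check that treats a step of 32 as needing no vertical filtering, and both function names and macros are hypothetical. Steps are in 1/16-pel units, so 16 means unscaled.

```c
#include <assert.h>

#define UNSCALED_STEP_Q4 16 /* a step of 16 means no scaling */
#define SUBPEL_MASK_Q4 15   /* fractional bits of a q4 value */

/* One possible form of the flawed check (an assumption): a step of 32
 * has no fractional part, so it looks as if no vertical filtering
 * will take place. */
static int needs_y_pad_old(int subpel_y, int y_step_q4) {
  assert(y_step_q4 > 0);
  return subpel_y != 0 || (y_step_q4 & SUBPEL_MASK_Q4) != 0;
}

/* Corrected idea: any step other than 16 selects a scaled convolve
 * that filters vertically and reads rows outside the block, so
 * padding is required even for a zero motion vector. */
static int needs_y_pad_fixed(int subpel_y, int y_step_q4) {
  assert(y_step_q4 > 0);
  return subpel_y != 0 || y_step_q4 != UNSCALED_STEP_Q4;
}
```

For a zero motion vector with a 2x vertical downscale (`y_step_q4 == 32`), the first predicate reports that no padding is needed while the second reports that it is.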
|
|
Change-Id: Ibe9fa28440cc79ba9f3504d78c7dca7bb01a23e1
|
|
Change-Id: I6ad6fd75dc3c9e6218d88148cf49e205398e2af5
|
|
Change-Id: Ifc741da9da6f61c8d3c1f675ec6b8a96570f877d
|
|
This breaks the profile 1 bitstream.
Don't force non420 uv transform size to 1/4 y size. In the 4:2:0 case the
chroma corresponding to a luma block is 1/4 its size. In the 4:4:4 case
chroma and luma planes are the same size. Disallowing larger transforms
can result in a loss of compression efficiency and is inconsistent.
For sub-8x8 blocks only average corresponding motion vectors.
4:2:0 and profile 0 behavior remains unchanged.
Change-Id: I560ae07183012c6734dd1860ea54ed6f62f3cae8
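The size relationship between luma and chroma blocks can be made concrete with a short sketch. The helper `chroma_block_dim` and the `BlockDim` type are hypothetical; `ss_x` and `ss_y` stand for the chroma subsampling shifts, assumed to be 1 and 1 for 4:2:0 and 0 and 0 for 4:4:4.

```c
#include <assert.h>

typedef struct {
  int width;
  int height;
} BlockDim;

/* Chroma block dimensions for a luma block of luma_w x luma_h.
 * ss_x / ss_y are subsampling shifts: (1, 1) for 4:2:0, (0, 0) for 4:4:4. */
static BlockDim chroma_block_dim(int luma_w, int luma_h, int ss_x, int ss_y) {
  BlockDim d;
  assert(ss_x >= 0 && ss_x <= 1 && ss_y >= 0 && ss_y <= 1);
  d.width = luma_w >> ss_x;
  d.height = luma_h >> ss_y;
  return d;
}
```

A 32x32 luma block maps to a 16x16 chroma block in 4:2:0 (one quarter of the area) and to a 32x32 chroma block in 4:4:4, so forcing the 4:4:4 chroma transform down to a quarter of the luma size would rule out the larger transforms.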
|
|
Change-Id: I9ef40f3d95ab8f94f69e92ea25678a40956bc1ce
|
|
Now interp_kernel is obtained when it is really required (based on
mbmi->interp_filter value).
Change-Id: I4c7a93c179d1045eba16e7526c293d02c9b8b47e
|
|
Renames:
mi_8x8 -> mi
mode_info_stride -> mi_stride
Change-Id: I66f3e5fd1e7b7f46f108af5bb711c5fd9493c1be
|
|
Make the calculation compatible with Google's HW implementation.
Change-Id: I22e179888cdb0419e230351c0a47661b37051fef
|
|
Fixes issue #731
Change-Id: Id313e84b8fb4ff20f6a4e1ed11cb601927888318
|
|
This reverts commit b0fec6ab4a61ded1ab2ade188987631f53c4e9c1.
Change-Id: I9acd8ee0423f22d92138f11579611ff959331013
|
|
This reverts commit 9650b9d72aa236e76c54b4f0acebd6bf1d6bbe48.
Change-Id: I841c4a4734170fda63469e32adc10703aa4bf0fa
|
|
Change-Id: I916944950deb22f4c2301d83a803b732bf3ecd77
|
|
Two parameters were not in use; this commit removes them.
Change-Id: Ia03a73b9a2521400bed539df45574e34214ed93a
|
|
Change-Id: I568861ba1d43620865ad9a98a97eef37a51fd856
|
|
|
|
is no longer needed.
Change-Id: I40c37ef18c67ab27fc336694dfca3c43a87c47ca
|
|
Change-Id: Iae787d491f7cfe24855ef8f2d04e2c6c19350378
|
|
-> InterpKernel
avoids conflicts in variable names, fixing the build with various
toolchains.
broken since:
8691565 Removing subpix_fn_table struct.
Change-Id: Ib5f6fdbcb494a97b62c75b99d4d826ff25d4c981
|
|
We don't use different filter kernels for x and y, it is always one kernel
for both directions.
Change-Id: Iefcbb02ec74bf46ea20d9dca672a3efd5d631517
|
|
reference buffer is out of border.
Change-Id: Ic7ad136e54a4d68abe0fd4345146a86b0ba824e1
|
|
Adding RefBuffer to simplify reference buffer management. The struct has a
pointer to image data and scale factors relative to the current frame.
Change-Id: If38eb1491ff687cc11428aee339f3e052e2c5d9e
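A struct of the kind described here might look like the sketch below. It is illustrative only: the type names, field names, and the 14-bit fixed-point scale representation are assumptions, not the definitions introduced by this commit.

```c
#include <assert.h>
#include <stdint.h>

#define SCALE_SHIFT 14 /* fixed-point precision, chosen for illustration */

/* Hypothetical image descriptor. */
typedef struct {
  uint8_t *data;
  int width, height, stride;
} FrameImage;

/* Hypothetical scale factors of a reference relative to the current frame. */
typedef struct {
  int x_scale_fp; /* (ref width / current width) in fixed point */
  int y_scale_fp; /* (ref height / current height) in fixed point */
} ScaleFactors;

/* A reference buffer bundles the image pointer with its scale factors. */
typedef struct {
  FrameImage *buf;
  ScaleFactors sf;
} RefBufferSketch;

/* Bind an image as a reference for a frame of size cur_w x cur_h. */
static void ref_buffer_setup(RefBufferSketch *ref, FrameImage *img,
                             int cur_w, int cur_h) {
  assert(cur_w > 0 && cur_h > 0);
  ref->buf = img;
  ref->sf.x_scale_fp = (img->width << SCALE_SHIFT) / cur_w;
  ref->sf.y_scale_fp = (img->height << SCALE_SHIFT) / cur_h;
}
```

Keeping the scale factors next to the image pointer means code that handles a reference frame receives both in one argument instead of looking them up separately.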
|
|
Moving back to scale_factors struct. We no longer need x_offset_q4 and
y_offset_q4 because both values are calculated locally inside the
vp9_scale_mv function.
Change-Id: I78a2122ba253c428a14558bda0e78ece738d2b5b
|
|
Before mv scaling, x_offset_q4/y_offset_q4 must be calculated by calling
set_scaled_offsets(). Now the offset configuration cannot be missed
because it happens just before scale_mv().
Change-Id: I7dd1a85b85811a6cc67c46c9b01e6ccbbb06ce3a
|
|
|
|
VP9 decoder can now use frame buffers passed in by the application.
Change-Id: I599527ec85c577f3f5552831d79a693884fafb73
|
|
|
|
Temporarily change memcpy to memmove.
Change-Id: I700a197bc1ce496be1ddad7118429c5da465b0ca
|
|
Change-Id: Ic429b2f16462e926f30efb3af4da3080026359d8
|
|
the border now. Next commit will totally remove the border.
Change-Id: Ic1e1ca9cc34f81c688715b3948689b47df63a151
|
|
NUM_YV12_BUFFERS => FRAME_BUFFERS
ALLOWED_REFS_PER_FRAME => REFS_PER_FRAME
NUM_REF_FRAMES_LOG2 => REF_FRAMES_LOG2
NUM_REF_FRAMES => REF_FRAMES
NUM_FRAME_CONTEXTS_LOG2 => FRAME_CONTEXTS_LOG2
NUM_FRAME_CONTEXTS => FRAME_CONTEXTS
Change-Id: I4e1ada08f25d8fa30fdf03aebe1b1c9df0f87e63
|
|
Using get_plane_block_size() instead of manipulation with subsampling
values, calculating all required values only once without redundant calls
to b_width_log2().
Change-Id: I00303f2a0926f9c4cb17f34591adda60615f8919
|
|
The decoder will construct the inter predictor using lazy border
extension, while the encoder, which runs motion search multiple times in
the rate-distortion optimization loop for each block, does border
extension at frame level. This commit separates the inter predictors for
the encoder and the decoder.
Change-Id: Ieca2fecba3a7201a6d64ef9f219e5d91e50559c3
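The difference between the two approaches can be illustrated with a sketch of a lazy fetch. This is not the decoder's implementation: `fetch_block_lazy` and `clamp_int` are hypothetical helpers that replicate edge pixels only for the block being predicted, instead of extending the borders of the whole reference frame in advance.

```c
#include <stdint.h>

static int clamp_int(int v, int lo, int hi) {
  return v < lo ? lo : (v > hi ? hi : v);
}

/* Copy a bw x bh block whose top-left corner is (x0, y0) from a w x h
 * reference frame into dst (row stride bw). Coordinates outside the
 * frame are clamped to the nearest edge pixel, which has the same
 * effect as reading from an extended border without building one. */
static void fetch_block_lazy(const uint8_t *ref, int w, int h, int stride,
                             int x0, int y0, int bw, int bh, uint8_t *dst) {
  for (int r = 0; r < bh; ++r) {
    const int y = clamp_int(y0 + r, 0, h - 1);
    for (int c = 0; c < bw; ++c) {
      const int x = clamp_int(x0 + c, 0, w - 1);
      dst[r * bw + c] = ref[y * stride + x];
    }
  }
}
```

An encoder that reads the same reference many times during motion search amortizes a one-time frame-level extension, whereas a decoder reads each prediction block once, so extending on demand avoids work for borders that are never referenced.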
|
|
|
|
This commit takes out vp9_extend_frame_borders from
vp9_setup_scale_factors.
The refactoring prepares for the use of lazy border extension in the
decoder, which makes it necessary to handle border extension separately
in the encoder and the decoder. The use of vp9_extend_frame_borders will
be removed once lazy border extension is ready.
Change-Id: Ia3baba3d179d5f11eee1634f19b3b319d2a59186
|
|
Change-Id: I29c0dfcf41a1253d5e2a0d2ff740c0c38ebaa5a2
|
|
As it is used in encoder only.
Change-Id: I5f2a8abbe72bb18cbf6ce36a3dc7e132aeae8ec2
|