summaryrefslogtreecommitdiff
path: root/vp9/decoder/vp9_dthread.h
AgeCommit message (Collapse)Author
2016-08-03vp9/decoder,vp9/*.[hc]: apply clang-formatclang-format
Change-Id: Ic38ea06c7b2fb3e8e94a4c0910e82672a1acaea7
2015-09-09vp9: add extern "C" to headersJames Zern
Change-Id: I1b6927ad820f99340985b094d415aaab14defaf4
2015-07-02Rename vpx_thread to vpx_utilJingning Han
Change the dir name to include more util tools. Change-Id: Id5b16062803ce5eed872fe2edb36d7e56b32eed8
2015-07-02Use vpx prefix for codec independent threading functionsJingning Han
Replace vp9_ prefix with vpx_ for common multi-threading functions. Change-Id: I941a5ead9bfe8213fdad345511d2061b07797b55
2015-07-01Move multi-threading module functions into vpx_thread folderJingning Han
This commit moves the primitive multi-threading files from vp9 folder to vpx_thread, which will be accessible by all vpx codec. Change-Id: Ib51e66e9c69801c10631fab56d35a0c0aaed5883
2015-02-04Fix a thread lost bug in frame parallel decode.hkuang
After syncing the frame worker thread, avaiable thread count should increase by 1 even the worker thread does not have displayable frame to output. Change-Id: I9eeb87720fed82dfe38555286833ff88e8a8e746
2015-01-30Try again to merge branch 'frame-parallel' into master branch.hkuang
In frame parallel decode, libvpx decoder decodes several frames on all cpus in parallel fashion. If not being flushed, it will only return frame when all the cpus are busy. If getting flushed, it will return all the frames in the decoder. Compare with current serial decode mode in which libvpx decoder is idle between decode calls, libvpx decoder is busy between decode calls. Current frame parallel decode will only speed up the decoding for frame parallel encoded videos. For non frame parallel encoded videos, frame parallel decode is slower than serial decode due to lack of loopfilter worker thread. There are still some known issues that need to be addressed. For example: decode frame parallel videos with segmentation enabled is not right sometimes. * frame-parallel: Add error handling for frame parallel decode and unit test for that. Fix a bug in frame parallel decode and add a unit test for that. Add two test vectors to test frame parallel decode. Add key frame seeking to webmdec and webm_video_source. Implement frame parallel decode for VP9. Increase the thread test range to cover 5, 6, 7, 8 threads. Fix a bug in adding frame parallel unit test. Add VP9 frame-parallel unit test. Manually pick "Make the api behavior conform to api spec." from master branch. Move vp9_dec_build_inter_predictors_* to decoder folder. Add segmentation map array for current and last frame segmentation. Include the right header for VP9 worker thread. Move vp9_thread.* to common. ctrl_get_reference does not need user_priv. Seperate the frame buffers from VP9 encoder/decoder structure. Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:""" Conflicts: test/codec_factory.h test/decode_test_driver.cc test/decode_test_driver.h test/invalid_file_test.cc test/test-data.sha1 test/test.mk test/test_vectors.cc vp8/vp8_dx_iface.c vp9/common/vp9_alloccommon.c vp9/common/vp9_entropymode.c vp9/common/vp9_loopfilter_thread.c vp9/common/vp9_loopfilter_thread.h vp9/common/vp9_mvref_common.c vp9/common/vp9_onyxc_int.h vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decodeframe.h vp9/decoder/vp9_decodemv.c vp9/decoder/vp9_decoder.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_encoder.c vp9/encoder/vp9_pickmode.c vp9/encoder/vp9_rdopt.c vp9/vp9_cx_iface.c vp9/vp9_dx_iface.c This reverts commit a18da9760a74d9ce6fb9f875706dc639c95402f5. Change-Id: I361442ffec1586d036ea2e0ee97ce4f077585f02
2015-01-23Revert "Merge branch 'frame-parallel' to enable frame parallel decode in ↵Johann
master branch." This reverts commit bde04ce5039cbcf86c8b34bdb4127e18d7e1d0c7 Change-Id: I053dae04c761b04a36dc239558503905a14d2470
2015-01-22Merge branch 'frame-parallel' to enable frame parallel decode in master branch.hkuang
In frame parallel decode, libvpx decoder decodes several frames on all cpus in parallel fashion. If not being flushed, it will only return frame when all the cpus are busy. If getting flushed, it will return all the frames in the decoder. Compare with current serial decode mode in which libvpx decoder is idle between decode calls, libvpx decoder is busy between decode calls. VP9 frame parallel decode is >30% faster than serial decode with tile parallel threading which will makes devices play 1080P VP9 videos more easily. * frame-parallel: Add error handling for frame parallel decode and unit test for that. Fix a bug in frame parallel decode and add a unit test for that. Add two test vectors to test frame parallel decode. Add key frame seeking to webmdec and webm_video_source. Implement frame parallel decode for VP9. Increase the thread test range to cover 5, 6, 7, 8 threads. Fix a bug in adding frame parallel unit test. Add VP9 frame-parallel unit test. Manually pick "Make the api behavior conform to api spec." from master branch. Move vp9_dec_build_inter_predictors_* to decoder folder. Add segmentation map array for current and last frame segmentation. Include the right header for VP9 worker thread. Move vp9_thread.* to common. ctrl_get_reference does not need user_priv. Seperate the frame buffers from VP9 encoder/decoder structure. Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:""" Conflicts: test/codec_factory.h test/decode_test_driver.cc test/decode_test_driver.h test/invalid_file_test.cc test/test-data.sha1 test/test.mk test/test_vectors.cc vp8/vp8_dx_iface.c vp9/common/vp9_alloccommon.c vp9/common/vp9_entropymode.c vp9/common/vp9_loopfilter_thread.c vp9/common/vp9_loopfilter_thread.h vp9/common/vp9_mvref_common.c vp9/common/vp9_onyxc_int.h vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decodeframe.h vp9/decoder/vp9_decodemv.c vp9/decoder/vp9_decoder.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_encoder.c vp9/encoder/vp9_pickmode.c vp9/encoder/vp9_rdopt.c vp9/vp9_cx_iface.c vp9/vp9_dx_iface.c Change-Id: Ib92eb35851c172d0624970e312ed515054e5ca64
2015-01-16vp9_ethread: add parallel loopfilterYunqing Wang
1. Added row-based loopfilter in encoder; 2. Moved common multi-threaded loopfilter functions from decoder to common; 3. Merged multi-threaded loopfilter code, and made encoder/ decoder call same function to reduce code duplication. Encoder tests showed that 1% - 2% speedup was seen for good-quality 2-pass mode(at speed 3); 1% - 3% speedup using 2 threads and 4% - 6% speedup using 4 threads were seen for real-time mode(at speed 7). Change-Id: I8a4ac51c2ad9bab9fa7b864e90743931c53ec1c4
2014-12-19vp9: add per-tile longjmp error handlingJames Zern
this avoids longjmp'ing from another thread on error which will cause undesired behavior Change-Id: Ic9074ed8cc4243944bf2539d6e482f213f4e8c86
2014-10-22Implement frame parallel decode for VP9.Hangyu Kuang
Using 4 threads, frame parallel decode is ~3x faster than single thread decode and around 30% faster than tile parallel decode for frame parallel encoded video on both Android and desktop with 4 threads. Decode speed is scalable to threads too which means decode could be even faster with more threads. Change-Id: Ia0a549aaa3e83b5a17b31d8299aa496ea4f21e3e
2014-10-16move LFWorkerData allocation to VP9LfSyncJames Zern
this removes an assumption that worker->data1 would be pointing to a TileWorkerData allocation. additionally, within the multi-threaded loopfilter pass VP9LfSync as a parameter to the worker hook, removing the need for a shadow pointer in LFWorkerData. Change-Id: Ic7b2faa34e3eb59dbcb8a7c67f333448fa047c88
2014-10-16vp9_loop_filter_frame_mt: remove pbi dependencyJames Zern
Change-Id: I44d42a5098305a2d050ce8ff3c76baf7798c48af
2014-10-16vp9_loop_filter_frame_mt: pass planes directlyJames Zern
one less dependency on pbi Change-Id: I3f3a392416d3523f4aea6682c3965885baf85197
2014-10-16vp9_loop_filter_frame_mt: pass VP9LfSync directlyJames Zern
a step towards removing the pbi dependency Change-Id: I10747b325e81c172f5e67031ea5159159fc26e91
2014-09-08vp9_loop_filter_alloc: reorder parametersJames Zern
VP9LfSync lf_sync is being operated on, make it the first parameter as in dealloc Change-Id: Id3cdf6b6a48157627780ae0d5d4b7dfa94a78078
2014-08-29vp9: fix m/t loop filter invalid freeJames Zern
store the number of allocated rows in VP9LfSync, the calculated values can not be relied on when dealing with corrupt material. Change-Id: I13b8bcec9738c299a71df726772ab7ac05511e5b
2014-07-11Move vp9_thread.* to common.hkuang
Prepare for frame parallel decoding, the reference count buffers need to be protected by mutex. Move vp9_thread.* to common folder so that those buffers could use cross-platform mutex from vp9_thread.*. (cherry picked from commit 337e8015c9deaf8ab7e8d0c3c132160a77dd1590) Change-Id: I0587a08447925f4554d7788686a31483c2ae3f37
2014-07-07Move vp9_thread.* to common.hkuang
Prepare for frame parallel decoding, the reference count buffers need to be protected by mutex. Move vp9_thread.* to common folder so that those buffers could use cross-platform mutex from vp9_thread.*. Change-Id: I541277cf15eefed6641555944f67f4a0bcdc8154
2014-07-02Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""hkuang
This reverts commit 749e0c7b2883139afa14b4886bbd6a940d021f4f. Change-Id: I0c63a152baf94d38496dd925a40040366153bf4f
2014-06-27Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:""James Zern
This reverts commit b336356198b8ada50fbb59f04f11cefceaf5ff95. This causes a hang in: VP9/InvalidFileTest.ReturnCode/3 the change to test/user_priv_test.cc remains with a minor update Change-Id: I4a8a272ca37ea329b0f413f0b1cd827a238bd9fd
2014-06-25Revert "Revert 3 patches from Hangyu to get Chrome to build:"hkuang
This patch reverts the previous revert from Jim and also add a variable user_priv in the FrameWorker to save the user_priv passed from the application. In the decoder_get_frame function, the user_priv will be binded with the img. This change is needed or it will fail the unit test added here: https://gerrit.chromium.org/gerrit/#/c/70610/ This reverts commit 9be46e4565f553460a1bbbf58d9f99067d3242ce. Change-Id: I376d9a12ee196faffdf3c792b59e6137c56132c1
2014-06-21Revert 3 patches from Hangyu to get Chrome to build:Jim Bankoski
Avoids failures: MSE_ClearKey/EncryptedMediaTest.Playback_VP9Video_WebM/0 MSE_ClearKey_Prefixed/EncryptedMediaTest.Playback_VP9Video_WebM/0 MSE_ExternalClearKey_Prefixed/EncryptedMediaTest.Playback_VP9Video_WebM/0 MSE_ExternalClearKey/EncryptedMediaTest.Playback_VP9Video_WebM/0 MSE_ExternalClearKeyDecryptOnly/EncryptedMediaTest.Playback_VP9Video_WebM/0 MSE_ExternalClearKeyDecryptOnly_Prefixed/EncryptedMediaTest.Playback_VP9Video_WebM/0 SRC_ExternalClearKey/EncryptedMediaTest.Playback_VP9Video_WebM/0 SRC_ExternalClearKey_Prefixed/EncryptedMediaTest.Playback_VP9Video_WebM/0 SRC_ClearKey_Prefixed/EncryptedMediaTest.Playback_VP9Video_WebM/0 Patches are This reverts commit 9bc040859b0ca6869d31bc0efa223e8684eef37a This reverts commit 6f5aba069a2c7ffb293ddce70219a9ab4a037441 This reverts commit 9bc040859b0ca6869d31bc0efa223e8684eef37a I1f250441 Revert "Refactor the vp9_get_frame code for frame parallel." Ibfdddce5 Revert "Delay decreasing reference count in frame-parallel decoding." I00ce6771 Revert "Introduce FrameWorker for decoding." Need better testing in libvpx for these commits Change-Id: Ifa1f279b0cabf4b47c051ec26018f9301c1e130e
2014-06-20Introduce FrameWorker for decoding.hkuang
When decoding in serial mode, there will be only one FrameWorker doing decoding. When decoding in parallel mode, there will be several FrameWorkers doing decoding in parallel. Change-Id: If53fc5c49c7a0bf5e773f1ce7008b8a62fdae257
2014-05-15cleanup -wextra warnings:Yaowu Xu
vp9_decoder.c vp9_dthread.c Change-Id: Iaafe941545db98e9e3559096a955894646084ac2
2014-05-12Moving loopfilter call to vp9_decode_frame().Dmitry Kovalev
Inline loopfilter has been already handled in vp9_decode_frame(). Collecting all similar code in one place now. Change-Id: I358a0280fc7c2b27cca520bc1e8c16c4eb6491dd
2014-04-10Cleaning up vp9_dthread.{c, h}.Dmitry Kovalev
Change-Id: If33087462293605f79d9281af133091fff33a876
2014-04-08Renaming VP9D_COMP & VP9Decompressor to VP9Decoder.Dmitry Kovalev
Change-Id: Ieb9b455b8aaef9884391021b7f640ef24c554687
2014-03-28Moving dqcoeff array to MACROBLOCKD in decoder.Dmitry Kovalev
Change-Id: I3e20c0cdb9d2437bddf21afb255855f2dead8e02
2014-01-31Rename a loopfilter parameterYunqing Wang
As pointed out by Dmitry and James, "partial" is a Microsoft- specific c++ keyword, and it is renamed. Change-Id: Ia0fc11ceb89e54b3195287f89f7e26edbbe9beb8
2014-01-31vp9 decoder: row-based multi-threaded loopfilterYunqing Wang
Implemented parallel loopfiltering, which uses existing tile- decoding threads. Each thread works on one row, and when that row is loopfiltered, it moves to next unattended row. To ensure the correct filtering order, threads are synchronized and one superblock is filtered only if the superblocks it depends on are filtered already. To reduce synchronization overhead and speed up the decoder, we use nsync > 1 for high resolution. Performance tests: 1. on desktop: 8-tile 4k video using 8 threads, speedup: 70% - 80% 4-tile HD video using 4 threads, speedup: ~35% 2. on mobile device(Nexus 7): 4-tile 1080p video using 4 threads, speedup: 18% - 25% 4-tile 1080p video using 2 threads, speedup: 10% - 15% Change-Id: If54b4a11960dd706c22d5ad145ad94156031f36a