libvpx.git

Age	Commit message	Author
2015-02-01	Merge "Optimize coef update"	Yaowu Xu

2015-01-30	Try again to merge branch 'frame-parallel' into master branch.	hkuang
	In frame parallel decode, the libvpx decoder decodes several frames on all CPUs in parallel. If it is not flushed, it returns a frame only when all the CPUs are busy. If it is flushed, it returns all the frames in the decoder. Compared with the current serial decode mode, in which the libvpx decoder is idle between decode calls, the frame parallel decoder stays busy between decode calls.
	Current frame parallel decode only speeds up decoding of frame parallel encoded videos. For non frame parallel encoded videos, frame parallel decode is slower than serial decode because it lacks a loopfilter worker thread.
	There are still some known issues that need to be addressed. For example, decoding frame parallel videos with segmentation enabled is sometimes incorrect.
	* frame-parallel:
	Add error handling for frame parallel decode and unit test for that.
	Fix a bug in frame parallel decode and add a unit test for that.
	Add two test vectors to test frame parallel decode.
	Add key frame seeking to webmdec and webm_video_source.
	Implement frame parallel decode for VP9.
	Increase the thread test range to cover 5, 6, 7, 8 threads.
	Fix a bug in adding frame parallel unit test.
	Add VP9 frame-parallel unit test.
	Manually pick "Make the api behavior conform to api spec." from master branch.
	Move vp9_dec_build_inter_predictors_* to decoder folder.
	Add segmentation map array for current and last frame segmentation.
	Include the right header for VP9 worker thread.
	Move vp9_thread.* to common.
	ctrl_get_reference does not need user_priv.
	Separate the frame buffers from VP9 encoder/decoder structure.
	Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""
	Conflicts: test/codec_factory.h test/decode_test_driver.cc test/decode_test_driver.h test/invalid_file_test.cc test/test-data.sha1 test/test.mk test/test_vectors.cc vp8/vp8_dx_iface.c vp9/common/vp9_alloccommon.c vp9/common/vp9_entropymode.c vp9/common/vp9_loopfilter_thread.c vp9/common/vp9_loopfilter_thread.h vp9/common/vp9_mvref_common.c vp9/common/vp9_onyxc_int.h vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decodeframe.h vp9/decoder/vp9_decodemv.c vp9/decoder/vp9_decoder.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_encoder.c vp9/encoder/vp9_pickmode.c vp9/encoder/vp9_rdopt.c vp9/vp9_cx_iface.c vp9/vp9_dx_iface.c
	This reverts commit a18da9760a74d9ce6fb9f875706dc639c95402f5.
	Change-Id: I361442ffec1586d036ea2e0ee97ce4f077585f02
2015-01-30	vp9: rename 'near' parameters	James Zern
	+ nearest for consistency
	near is a reserved word in Windows builds, so using it as a parameter name may cause build failures with some configurations.
	Change-Id: Iddf1d4ecdb39843f14e95dbfd9dca55f07f81403
2015-01-30	Optimize coef update	Yaowu Xu
	1. Move the check of the USE_TX_8X8 search method up one level to avoid the operations of build_tree_distributions().
	2. Count the tx sizes used and avoid computation for the coef update when one size is not used at all.
	Change-Id: Ia3e54a2588aa531c41377a1bfaa64385d04a592c
2015-01-27	Add vp9_sad32x32x4d_neon Neon intrinsic function.	Frank Galligan
	On Nexus 7 speed -6 saw ~18% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. BUG=https://code.google.com/p/webm/issues/detail?id=908 Change-Id: I70ccdea0326750552ed946fb004507d6efe02d5c
2015-01-27	Add vp9_sad16x16x4d_neon Neon intrinsic function.	Frank Galligan
	On Nexus 7 speed -6 saw ~15% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. BUG=https://code.google.com/p/webm/issues/detail?id=908 Change-Id: I4b2006b644c488f42bf06d8a22ef0e6120a96bf9
2015-01-27	Add vp9_sad64x64x4d_neon Neon intrinsic function.	Frank Galligan
	On Nexus 7 speed -6 saw ~30% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. BUG=https://code.google.com/p/webm/issues/detail?id=908 Change-Id: Id12af7d1883243c23e6692e898aea82299633d58
2015-01-24	Add Neon intrinsic vp9_fdct8x8_quant_neon	Frank Galligan
	On Nexus 7 speed -5 got ~2%, -6 got ~15%, -7 and -8 got ~30% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: I83246d63b96674d170098a572fa4fe28a05aaf51
2015-01-23	Merge "Replace divide with look-up"	Yaowu Xu

2015-01-23	Merge "SSE2 code for the filter in MFQE."	JackyChen

2015-01-23	Replace divide with look-up	Yaowu Xu
	This commit replaces an integer divide with a table-lookup. It is to improve decoding speed, and at the same time, to reduce possible complications with a bug in AMD Family 12h processors: "665 Integer Divide Instruction May Cause Unpredictable Behavior" Change-Id: I678b707a538798a923850bac467e66e847e6def7
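	A minimal sketch of the general technique, not the code from this commit: when the divisor is a constant and the dividend is clamped to a small range, every quotient can be precomputed and read from a table. The constants and function names below are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical constants for illustration; not taken from the commit. */
#define COUNT_SAT 20
#define MAX_UPDATE_FACTOR 128

/* Reference version: performs one integer divide per call. */
static int update_factor_div(int count) {
  assert(count >= 0);
  if (count > COUNT_SAT) count = COUNT_SAT;
  return MAX_UPDATE_FACTOR * count / COUNT_SAT;
}

/* Entry i holds MAX_UPDATE_FACTOR * i / COUNT_SAT, computed ahead of time. */
static const int kUpdateFactor[COUNT_SAT + 1] = {
  0,  6,  12, 19, 25, 32, 38, 44,  51,  57,  64,
  70, 76, 83, 89, 96, 102, 108, 115, 121, 128
};

/* Table version: same result for every count, with no divide instruction. */
static int update_factor_lut(int count) {
  assert(count >= 0);
  if (count > COUNT_SAT) count = COUNT_SAT;
  return kUpdateFactor[count];
}
```

	The table costs COUNT_SAT + 1 entries of read-only data, so the approach only pays off when the clamped range is small, as assumed here.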
2015-01-23	Revert "Merge branch 'frame-parallel' to enable frame parallel decode in master branch."	Johann
	This reverts commit bde04ce5039cbcf86c8b34bdb4127e18d7e1d0c7 Change-Id: I053dae04c761b04a36dc239558503905a14d2470
2015-01-22	Merge branch 'frame-parallel' to enable frame parallel decode in master branch.	hkuang
	In frame parallel decode, the libvpx decoder decodes several frames on all CPUs in parallel. If it is not flushed, it returns a frame only when all the CPUs are busy. If it is flushed, it returns all the frames in the decoder. Compared with the current serial decode mode, in which the libvpx decoder is idle between decode calls, the frame parallel decoder stays busy between decode calls.
	VP9 frame parallel decode is >30% faster than serial decode with tile parallel threading, which will make it easier for devices to play 1080P VP9 videos.
	* frame-parallel:
	Add error handling for frame parallel decode and unit test for that.
	Fix a bug in frame parallel decode and add a unit test for that.
	Add two test vectors to test frame parallel decode.
	Add key frame seeking to webmdec and webm_video_source.
	Implement frame parallel decode for VP9.
	Increase the thread test range to cover 5, 6, 7, 8 threads.
	Fix a bug in adding frame parallel unit test.
	Add VP9 frame-parallel unit test.
	Manually pick "Make the api behavior conform to api spec." from master branch.
	Move vp9_dec_build_inter_predictors_* to decoder folder.
	Add segmentation map array for current and last frame segmentation.
	Include the right header for VP9 worker thread.
	Move vp9_thread.* to common.
	ctrl_get_reference does not need user_priv.
	Separate the frame buffers from VP9 encoder/decoder structure.
	Revert "Revert "Revert "Revert 3 patches from Hangyu to get Chrome to build:"""
	Conflicts: test/codec_factory.h test/decode_test_driver.cc test/decode_test_driver.h test/invalid_file_test.cc test/test-data.sha1 test/test.mk test/test_vectors.cc vp8/vp8_dx_iface.c vp9/common/vp9_alloccommon.c vp9/common/vp9_entropymode.c vp9/common/vp9_loopfilter_thread.c vp9/common/vp9_loopfilter_thread.h vp9/common/vp9_mvref_common.c vp9/common/vp9_onyxc_int.h vp9/common/vp9_reconinter.c vp9/decoder/vp9_decodeframe.c vp9/decoder/vp9_decodeframe.h vp9/decoder/vp9_decodemv.c vp9/decoder/vp9_decoder.c vp9/decoder/vp9_decoder.h vp9/encoder/vp9_encoder.c vp9/encoder/vp9_pickmode.c vp9/encoder/vp9_rdopt.c vp9/vp9_cx_iface.c vp9/vp9_dx_iface.c
	Change-Id: Ib92eb35851c172d0624970e312ed515054e5ca64
2015-01-20	Merge "Add Neon intrinsics for vp9_avg_8x8_neon"	Frank Galligan

2015-01-20	Add non420 code in multi-threaded loopfilter	Yunqing Wang
	Added non420 part back to make it consistent with single thread code in vp9_loopfilter.c. Change-Id: I8ca255d73bffebae294d2627d6655eafe535cb90
2015-01-18	SSE2 code for the filter in MFQE.	JackyChen
	The SSE2 code comes from VP8 MFQE and is reused in VP9. No change on the VP8 side. In our testing, this change gave a 2x speedup. Change-Id: Ib2b14144ae57c892005c1c4b84e3379d02e56716
2015-01-16	vp9_ethread: add parallel loopfilter	Yunqing Wang
	1. Added row-based loopfilter in encoder;
	2. Moved common multi-threaded loopfilter functions from decoder to common;
	3. Merged multi-threaded loopfilter code, and made encoder/decoder call the same function to reduce code duplication.
	Encoder tests showed a 1% - 2% speedup for good-quality 2-pass mode (at speed 3). For real-time mode (at speed 7), they showed a 1% - 3% speedup using 2 threads and a 4% - 6% speedup using 4 threads.
	Change-Id: I8a4ac51c2ad9bab9fa7b864e90743931c53ec1c4
2015-01-15	Add Neon intrinsics for vp9_avg_8x8_neon	Frank Galligan
	On Nexus 7 speed -5, -6, -7, and -8 saw about a 1% increase in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 1.5% increase in perf for 720p. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: Ibf17ebfd952a6aec941719bd8306df8ec4574bee
2015-01-14	Merge "Add encoder control for setting color space"	Yaowu Xu

2015-01-14	Add encoder control for setting color space	Yaowu Xu
	This commit adds encoder side control for vp9 to set color space info in the output compressed bitstream. It also amends the "vp9_encoder_params_get_to_decoder" test to verify the correct color space information is passed from the encoder end to decoder end. Change-Id: Ibf5fba2edcb2a8dc37557f6fae5c7816efa52650
2015-01-14	Add 64x64 sub_pel_variance Neon function	Frank Galligan
	On Nexus 7 speed -5, -6, -7, and -8 saw about a 15% increase in perf for 480p. Speeds -5, -6, -7, and -8 saw about a 10% increase in perf for 720p. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: I2fa5315845e3021c9a6e2ea47e52e68b398d8334
2015-01-13	Add 64x variance Neon functions	Frank Galligan
	Add optimized Neon functions of: vp9_variance32x64 vp9_variance64x32 vp9_variance64x64 On Nexus 7 speed -5 and -6 saw about a 4% increase in perf. Speeds -7 and -8 saw about a 6% increase in perf. Tested on Nexus 7, built with ndk r10d, gcc 4.9. Change-Id: I5a81f13c9897eb927fa39662530f5524a0f768fa
2015-01-08	Merge "vp9: add per-tile longjmp error handling"	James Zern

2015-01-08	Merge "Disable vp9 _8_ loopfilters"	Johann

2015-01-07	Merge "Refactor calculation of tile_cols"	Yaowu Xu

2015-01-07	Merge "Use qdiff to adjust the threshold of sad and variance in MFQE."	JackyChen

2015-01-07	Refactor calculation of tile_cols	Yaowu Xu
	Change-Id: I2c38ea2bcf6d221a0b6b2fb9be4cebbee21006a3
2015-01-07	Use qdiff to adjust the threshold of sad and variance in MFQE.	JackyChen
	When qdiff is larger, the SAD/variance threshold should also be higher, which makes MFQE act more aggressively. Change-Id: I44c5c93572805458d4f87fdc7619cc9d8a522185
2015-01-06	Disable vp9 _8_ loopfilters	Johann
	Investigating https://code.google.com/p/chromium/issues/detail?id=443839 Change-Id: Ibb7485d835c5aa5e1d40f31715596ba8d208eedb
2015-01-06	Rearrange loopfilter functions	Johann
	Separate functions and rename files. This will make it easier to disable some functions later to help work around a compiler issue in chromium. Change-Id: I7f30e109f77c4cd22e2eda7bd006672f090c1dc5
2015-01-06	Merge "Enable coefficient range checking for 10-/12-bit"	Deb Mukherjee

2015-01-06	Enable coefficient range checking for 10-/12-bit	Deb Mukherjee
	Also fixes a broken build with --enable-coefficient-range-checking configuration option. Change-Id: Icc536f53088e8cec59dfb8f635668555fdb9125e
2015-01-05	Adopt weighted averaging in MFQE.	JackyChen
	By using weighted averaging in the calculation of the frames to be displayed, we get an average gain of more than 1 dB for key frames whose base qp is 20 higher than that of non-key frames. Change-Id: I7bcb2e7b9c6420ea3f73f33204d18b072dffd17c
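	As a rough illustration of weighted averaging between two frames, the sketch below blends a source block into a destination block with fixed-point weights. It is an assumption-laden example, not the MFQE code from this commit: the function name, the 4-bit weight precision, and the flat pixel array are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical precision: the two weights always sum to 1 << WEIGHT_BITS. */
#define WEIGHT_BITS 4

/* Blend n pixels of src into dst in place.
 * src_weight is in [0, 16]; dst receives the remaining weight. */
static void blend_weighted(const uint8_t *src, uint8_t *dst, int n,
                           int src_weight) {
  const int dst_weight = (1 << WEIGHT_BITS) - src_weight;
  const int rounding = 1 << (WEIGHT_BITS - 1);
  int i;
  assert(src_weight >= 0 && src_weight <= (1 << WEIGHT_BITS));
  for (i = 0; i < n; ++i) {
    /* The weighted sum is at most 255 * 16 + 8, so it fits in an int and
     * the shifted result fits in a uint8_t. */
    dst[i] = (uint8_t)((src[i] * src_weight + dst[i] * dst_weight + rounding) >>
                       WEIGHT_BITS);
  }
}
```

	With src_weight set to 8 this reduces to a plain rounded average; other values shift the result toward one of the two frames.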
2014-12-23	WIP: Remove giant value cost table	Jim Bankoski
	Change-Id: Iabe8a8868a747626c24bb13f1796f4c7827af367
2014-12-22	Revert "Revert "Removal of legacy zbin_extra / zbin_oq_value.""	Jingning Han
	This reverts commit 9946ee23e0a4c158e26a505b162a072f81b8a3be. Fix the ssse3 asm function. Change-Id: I07f77a63aa98087626e45c4e87aa5dcafc0b0b07
2014-12-19	vp9: add per-tile longjmp error handling	James Zern
	This avoids longjmp'ing from another thread on error, which would cause undesired behavior. Change-Id: Ic9074ed8cc4243944bf2539d6e482f213f4e8c86
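	The idea can be sketched as follows, assuming each tile worker owns its own jump buffer so that an error unwinds only to the setjmp call made by the same thread. The type and function names are hypothetical, the sketch runs single-threaded for brevity, and it is not the code added by this commit.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical per-tile error state; one instance per worker. */
typedef struct {
  jmp_buf jmp;   /* jump target owned by this tile worker */
  int jmp_valid; /* nonzero only while jmp is armed */
  int has_error; /* read by the main thread after the worker finishes */
} TileError;

/* Record the error and unwind to the setjmp in the same worker. */
static void tile_fail(TileError *err) {
  assert(err != NULL);
  err->has_error = 1;
  if (err->jmp_valid) longjmp(err->jmp, 1);
}

/* Stand-in for real tile decoding; fails when asked to. */
static void decode_tile(TileError *err, int corrupt) {
  if (corrupt) tile_fail(err);
}

/* Worker entry point: returns 1 on success, 0 if decoding bailed out. */
static int tile_worker(TileError *err, int corrupt) {
  err->has_error = 0;
  if (setjmp(err->jmp)) {
    /* Reached via longjmp from tile_fail on this same thread. */
    err->jmp_valid = 0;
    return 0;
  }
  err->jmp_valid = 1;
  decode_tile(err, corrupt);
  err->jmp_valid = 0;
  return 1;
}
```

	In a threaded decoder the main thread would inspect has_error for each worker after joining, rather than having a worker jump into a buffer set up on another thread's stack.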
2014-12-19	Revert "Removal of legacy zbin_extra / zbin_oq_value."	Paul Wilkins
	This reverts commit e9b586e21bb899e247346e82bccf5afb42604910. Change-Id: I5b36e6727da6c05278d97e2c37b80c109f79bed4
2014-12-19	Merge "Removal of legacy zbin_extra / zbin_oq_value."	Paul Wilkins

2014-12-18	Merge "make vp9 encoder static initializers thread safe"	James Zern

2014-12-18	make vp9 encoder static initializers thread safe	Jim Bankoski
	Change-Id: If2d0888d13ebe52bc7c3b16f16319408a86ab6de
2014-12-18	Removal of legacy zbin_extra / zbin_oq_value.	Paul Wilkins
	zbin extra / zbin_oq_value was widely passed around, hence removal touches a lot of code. Change-Id: Idc94359735b60c38a160e4385ae09d5ca8b6b8e5
2014-12-17	Let YUV plane share the same dqcoeff buffer.	hkuang
	Remove the unnecessary dqcoeff from macroblockd, which reduces the size of macroblockd by 16384 bytes. Change-Id: Ia379a703b4fee81c8fd4698b52488a85a90c9bc2
2014-12-17	Merge "Add rectangle block support for MFQE."	JackyChen

2014-12-17	Merge "Use bit_depth in VP9Common as the flag of highbit."	JackyChen

2014-12-17	Merge "Remove reset mode_info array per frame"	Jingning Han

2014-12-16	Use bit_depth in VP9Common as the flag of highbit.	JackyChen
	Change-Id: I881aefbe68f9c10bb4629a2a5ee1e42a225d5ab7
2014-12-16	VP9 common for ARMv8 by using NEON intrinsics 15	James Yu
	Re-write - vp9_lpf_horizontal_4_dual_neon in vp9_loopfilter_16_neon.c Change-Id: Ie14f63d352f9564ad01db3939a61d91cf6d21a31 Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-16	Merge "Use defines for inline and __builtin_prefetch"	Johann

2014-12-16	Add rectangle block support for MFQE.	JackyChen
	Only for the rectangle blocks larger than 16X16, SAD and Variance are still based on the internal square blocks. Change-Id: I3754da1b0254147313f86a0140dbf4f980f06a5a
2014-12-16	Merge "VP9 common for ARMv8 by using NEON intrinsics 16"	Johann