libvpx.git

Age	Commit message	Author
2014-12-08	Fix the comments.	hkuang
	Change-Id: I9789476865a1b24dad54115d8f7edb4fed780b90
2014-12-08	Merge "Improve the performance by caching the left_mi and right_mi in macroblockd."	hkuang
2014-12-05	Improve the performance by caching the left_mi and right_mi in macroblockd.	hkuang
	This improves the decode performance by ~2% on Nexus 7 2013. Change-Id: Ie9c4ba0371a149eb7fddc687a6a291c17298d6c3
2014-12-05	Merge "Merge set_prev_mi function into encoder function."	hkuang

2014-12-04	Use the RTC optimizations when in high bitdepth mode.	Peter de Rivaz
	Change 72193 made the encoder behave differently when configured with and without high bitdepth. This change means the same algorithm is used for both. Change-Id: I707a44a94afca773a9e0c2f7ebeeea83030257c5
2014-12-04	Merge set_prev_mi function into encoder function.	hkuang
	Change-Id: Ifcf2efbb232ea4cabcdebbe77e0820d121e4a6da
2014-12-03	Enable non-rd mode coding on key frame, for speed 6.	Marco
	For key frames at speed 6: enable the non-rd mode selection in the speed settings and use the (non-rd) variance_based partition. Adjust some logic/thresholds in variance partition selection for key frames only (no change to delta frames), mainly to bias towards selecting smaller prediction blocks, and also set the max tx size to 16x16. Key frame quality drops by ~0.6-0.7dB compared to rd coding, but key frame encoding is at least 6x faster. Average PSNR/SSIM metrics over RTC clips go down by ~1-2% for speed 6. Change-Id: Ie4845e0127e876337b9c105aa37e93b286193405
2014-12-02	Added high bitdepth sse2 transform functions	Peter de Rivaz
	Also removes some spurious changes in common/vp9_blockd.h which were introduced by a rebase issue between the nextgen and master branches. Change-Id: If359f0e9a71bca9c2ba685a87a355873536bb282 (cherry picked from commit 005d80cd05269a299cd2f7ddbc3d4d8b791aebba) (cherry picked from commit 08d2f548007fd8d6fd41da8ef7fdb488b6485af3) (cherry picked from commit 4230c2306c194c058f56433a5275aa02a2e71d56)
2014-11-24	Fix a tautological assert.	Alex Converse
	Change-Id: I90ad08823e1d038384536fa9f458caadc2c87f38
2014-11-24	Merge "Refactored idct routines and headers"	Debargha Mukherjee

2014-11-24	Refactored idct routines and headers	Peter de Rivaz
	This change is made in preparation for a subsequent patch which adds acceleration for the highbitdepth transform functions. The highbitdepth transform functions attempt to use 16/32-bit SSE instructions where possible, but fall back to the C implementations if potential overflow is detected. For this reason the dct routines are made global so they can be called from the acceleration functions in the subsequent patch. Change-Id: Ia921f191bf6936ccba4f13e8461624b120c1f665 (cherry picked from commit 454342d4e77dbb67f4a3c10f97a57a6fcb46d9a0)
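The fallback idea described in this commit message can be illustrated with a minimal sketch. This is not the libvpx code: the function names (`fits_in_16_bits`, `sum_squares_narrow`, `sum_squares_c`, `sum_squares`) and the sum-of-squares workload are hypothetical stand-ins, and the "narrow" path is plain C standing in for a 16-bit SIMD routine. It only shows the dispatch pattern: check the input range first, then take the narrow path when it is safe and the wide C path otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range check: returns 1 if every value fits in int16_t. */
static int fits_in_16_bits(const int32_t *coeffs, size_t n) {
  for (size_t i = 0; i < n; ++i) {
    if (coeffs[i] < INT16_MIN || coeffs[i] > INT16_MAX) return 0;
  }
  return 1;
}

/* Reference path: wide arithmetic, safe for any int32_t input. */
static int64_t sum_squares_c(const int32_t *coeffs, size_t n) {
  int64_t acc = 0;
  for (size_t i = 0; i < n; ++i) acc += (int64_t)coeffs[i] * coeffs[i];
  return acc;
}

/* Stand-in for a narrow (16-bit lane) accelerated path. It is only
 * correct when the inputs fit in 16 bits, so the 16x16 product fits
 * in 32 bits. */
static int64_t sum_squares_narrow(const int32_t *coeffs, size_t n) {
  int64_t acc = 0;
  for (size_t i = 0; i < n; ++i) {
    const int16_t c = (int16_t)coeffs[i];
    acc += (int32_t)c * c;
  }
  return acc;
}

/* Dispatcher: use the narrow path when safe, else fall back to C.
 * *used_fast (optional) reports which path ran. */
int64_t sum_squares(const int32_t *coeffs, size_t n, int *used_fast) {
  assert(coeffs != NULL);
  const int fast = fits_in_16_bits(coeffs, n);
  if (used_fast != NULL) *used_fast = fast;
  return fast ? sum_squares_narrow(coeffs, n) : sum_squares_c(coeffs, n);
}
```

Keeping the reference routine callable from the dispatcher is presumably why the commit makes the dct routines global.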
2014-11-21	Merge "Added highbitdepth sse2 acceleration for quantize"	Debargha Mukherjee

2014-11-19	Added highbitdepth sse2 acceleration for quantize	Peter de Rivaz
	Also includes block error. (This patch is mostly cherry picked from commit db7192e0b014a331a1dcb102c8a1148e9f0e1081) Change-Id: Idef18f90b111a0d0c9546543d3347e551908fd78
2014-11-19	Enable ssse3 version of vp9_fdct8x8_quant	Jingning Han
	It improves the speed performance of vp9_fdct8x8_quant_sse2 by about 5%. Change-Id: I74b093ba4d81df64caf71ac7693f3d917f673097
2014-11-19	Merge "Combine fdct8x8 and quantization process"	Jingning Han

2014-11-19	Merge "Add sse2 version for vp9_quantize_fp"	Jingning Han

2014-11-18	Combine fdct8x8 and quantization process	Jingning Han
	This commit reworks the forward transform and quantization process for 8x8 block coding. It combines the two operations in a single function to save a store/load stage of the original transform coefficients. Overall, speed -6 is slightly faster (around 1%). The compression performance of speed -6 is improved by 3.4%. Change-Id: Id6628daef123f3e4649248735ec2ad7423629387
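A minimal sketch of the fusion idea, under loud assumptions: the "transform" below is a toy pairwise butterfly and the quantizer is a plain integer division, neither of which matches the real 8x8 DCT or the VP9 quantizer, and all names are hypothetical. The two-pass version stores every coefficient to an intermediate buffer and reloads it; the fused version quantizes each coefficient as soon as it is produced, which is the store/load stage the commit message refers to.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_N 8 /* number of samples; must be even */

/* Toy quantizer: truncating division by the step size. */
static int16_t toy_quantize_one(int32_t coeff, int step) {
  return (int16_t)(coeff / step);
}

/* Two passes: transform into a coefficient buffer, then quantize it. */
void transform_then_quantize(const int16_t *in, int step, int16_t *q) {
  int32_t coeff[TOY_N]; /* intermediate store that fusion avoids */
  assert(step > 0);
  for (int i = 0; i < TOY_N; i += 2) {
    coeff[i] = in[i] + in[i + 1];
    coeff[i + 1] = in[i] - in[i + 1];
  }
  for (int i = 0; i < TOY_N; ++i) q[i] = toy_quantize_one(coeff[i], step);
}

/* One pass: each coefficient is quantized while still in a local. */
void transform_quantize_fused(const int16_t *in, int step, int16_t *q) {
  assert(step > 0);
  for (int i = 0; i < TOY_N; i += 2) {
    const int32_t sum = in[i] + in[i + 1];
    const int32_t diff = in[i] - in[i + 1];
    q[i] = toy_quantize_one(sum, step);
    q[i + 1] = toy_quantize_one(diff, step);
  }
}
```

Both functions produce identical output for the same input; only the memory traffic differs.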
2014-11-18	Add sse2 version for vp9_quantize_fp	Jingning Han
	vp9_quantize_fp is the quantization process used by the rtc coding mode. This commit adds an sse2 implementation of it, adapted from vp9_quantize_b_sse2. No speed difference from the ssse3 version. Change-Id: I24949c5b27df160b4f35117d28858d269454e64a
2014-11-17	change to call vp9_refining_search_sad() directly	Yaowu Xu
	The function pointer in the compressor instance does not change, so this commit calls the function directly instead. Change-Id: I9c9c460e3475711c384b74c9842f0b4f3d037cc5
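The difference between the two call styles can be sketched as follows. Everything here is hypothetical: `ToyCompressor`, `toy_refining_search`, and the toy cost function are illustrative stand-ins, not the libvpx types or the signature of vp9_refining_search_sad(). When a pointer stored in the instance always refers to the same function, calling that function by name gives the same result and lets the compiler see the callee.

```c
#include <assert.h>

typedef int (*search_fn)(int start, int range);

/* Toy cost with its minimum at x == 7. */
static int toy_cost(int x) {
  const int d = x - 7;
  return d < 0 ? -d : d;
}

/* Toy refining search: best position within +/- range of start. */
int toy_refining_search(int start, int range) {
  assert(range >= 0);
  int best = start;
  for (int x = start - range; x <= start + range; ++x) {
    if (toy_cost(x) < toy_cost(best)) best = x;
  }
  return best;
}

/* Hypothetical compressor instance holding a never-changing pointer. */
typedef struct {
  search_fn refine;
} ToyCompressor;

/* Before: indirect call through the instance. */
int search_via_pointer(const ToyCompressor *cpi, int start, int range) {
  return cpi->refine(start, range);
}

/* After: direct call, no pointer load, and the call can be inlined. */
int search_direct(int start, int range) {
  return toy_refining_search(start, range);
}
```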
2014-11-14	Added sse2 acceleration for highbitdepth variance	Peter de Rivaz
	Change-Id: I446bdf3a405e4e9d2aa633d6281d66ea0cdfd79f (cherry picked from commit d7422b2b1eb9f0011a8c379c2be680d6892b16bc) (cherry picked from commit 6d741e4d76a7d9ece69ca117d1d9e2f9ee48ef8c)
2014-11-12	Merge "Added highbitdepth sse2 SAD acceleration and tests"	Debargha Mukherjee

2014-11-12	Added highbitdepth sse2 SAD acceleration and tests	Peter de Rivaz
	Change-Id: I1a74a1b032b198793ef9cc526327987f7799125f (cherry picked from commit b1a6f6b9cb47eafe0ce86eaf0318612806091fe5)
2014-11-07	Iadst transforms to use internal low precision	Deb Mukherjee
	Change-Id: I266777d40c300bc53b45b205144520b85b0d6e58 (cherry picked from commit a1b726117f5470f227bc90cd030b7d25045dc510)
2014-11-07	Merge "Change the use of a reserved color space entry"	Yaowu Xu

2014-11-06	Change the use of a reserved color space entry	Yaowu Xu
	This commit renames a reserved color space entry to BT_2020. It is intended to let a VP9 bitstream carry the color space type defined in BT.2020 (Rec. 2020). Note that this entry has no effect on encoding/decoding behavior; it only allows applications to pass the information along from the encoding end to the decoding end. Change-Id: I4678520e89141ea5e8900f7bd1c0e95b710b7091
2014-11-06	Modify the frame context memory deallocation	Yunqing Wang
	This patch fixes the vpxdec fuzzing3 test failure. When an error occurs, setjmp() is invoked, which calls the decoder removing routine. In a multi-threaded situation, other threads could try to access the frame context memory that is already deallocated, thus causing a segfault. An invalid unit test was added for this issue. Change-Id: Ida7442154f3d89759483f0f4fe0324041fffb952
2014-11-05	Merge "Totally remove prev_mi in VP9 decoder."	hkuang

2014-11-05	Totally remove prev_mi in VP9 decoder.	hkuang
	This saves memory and improves decode speed by removing the unnecessary memset of the big prev_mi array for all key frames. Decoding an all-key-frame 1080p video shows a speed improvement of around 2%. Change-Id: I6284a445c1291056e3c15135c3c20d502f791c10
2014-11-05	Fix visual studio 2013 compiler warnings	Yaowu Xu
	For builds configured with --enable-vp9-highbitdepth. Change-Id: I2b181519d7192f8d7a241ad5760c3578255f24e6
2014-11-04	Fix the memory leak due to missing free frame_mvs.	hkuang
	Change-Id: I2ceee7341d906259002c0ea31ea009ae32c04bfd
2014-11-03	Merge "WORKAROUND FIX FOR GCC4.9.1"	Yunqing Wang

2014-11-01	WORKAROUND FIX FOR GCC4.9.1	levytamar82
	In the function mb_lpf_horizontal_edge_w_avx2_16, the use of the intrinsic _mm256_cvtepu8_epi16 triggers a compiler bug in gcc 4.9.1. Until it is fixed, I created a workaround that performs the up-conversion using broadcast128+shuffle. The bug was reported here: https://code.google.com/p/webm/issues/detail?id=867 Change-Id: I73452e6806f42e0fadcde96b804ea3afa7eeb351
2014-10-31	Bind motion vectors with frame buffer structure.	hkuang
	This saves a lot of memory in the decoder by removing prev_mi. prev_mi is still needed in the encoder, so this slightly increases the encoder's memory use. Change-Id: I24b2f1a423ebffa55a9bd2fcee1077dac995b2ed
2014-10-30	Merge "Move the definition of switchable filter numbers into enum INTERP_FILTER; Modify the macro ADD_MV_REF_LIST and IF_DIFF_REF_FRAME_ADD_MV."	Hui Su
2014-10-24	Merge changes I8a9c9019,Ic7b2faa3,I44d42a50,I3f3a3924,I10747b32,I31b49c9e	James Zern
	* changes:
	  add vp9_loop_filter_data_reset
	  move LFWorkerData allocation to VP9LfSync
	  vp9_loop_filter_frame_mt: remove pbi dependency
	  vp9_loop_filter_frame_mt: pass planes directly
	  vp9_loop_filter_frame_mt: pass VP9LfSync directly
	  vp9: store TileWorkerData allocations separately
2014-10-23	add vp9_loop_filter_data_reset	James Zern
	Change-Id: I8a9c9019242ec10fa499a78db322221bf96a0275
2014-10-22	Merge "vp9_ethread: allocate frame contexts outside VP9_COMMON struct"	Yunqing Wang

2014-10-22	vp9_ethread: allocate frame contexts outside VP9_COMMON struct	Yunqing Wang
	This patch allocates frame contexts outside VP9_COMMON. This allows multiple threads to share the same copy of the frame contexts and reduces the overhead. It also guarantees the correct update of these contexts during bitstream packing. This patch doesn't change the encoding result. Change-Id: Ic181a2460b891d1d587278a6d02d8057b9dbd353
2014-10-22	Fix Neon convolve profiling	Frank Galligan
	When profiling, gprof can't distinguish between matching labels in different files. Change-Id: I56770df212ed314a0d8568071fa8157624ef1e8f
2014-10-21	Move the definition of switchable filter numbers into enum INTERP_FILTER; Modify the macro ADD_MV_REF_LIST and IF_DIFF_REF_FRAME_ADD_MV.	Hui Su
	Change-Id: Ic36c9eb6ccb8ec324d991f7241e42b40b60b1dcb
2014-10-20	Merge "SAD32xh and SAD64xh for AVX2"	Yunqing Wang

2014-10-19	SAD32xh and SAD64xh for AVX2	levytamar82
	All SAD functions that process 32 or more consecutive elements are optimized for AVX2: vp9_sad64x64, vp9_sad64x32, vp9_sad32x64, vp9_sad32x32, vp9_sad32x16, vp9_sad64x64_avg, vp9_sad64x32_avg, vp9_sad32x64_avg, vp9_sad32x32_avg, vp9_sad32x16_avg. The functions that appeared as hotspots are vp9_sad32x32 and vp9_sad64x64. vp9_sad32x32 was optimized by 68% and vp9_sad64x64 by 90%; together they gave an overall ~2.3% user-level gain. Change-Id: Iccf86b375a2b54c5fbbe685902ead0c9a561b9fd
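For readers unfamiliar with the operation being accelerated, here is a scalar sketch of a sum of absolute differences (SAD) over a width x height block. It is a plain C reference written for illustration, not the AVX2 code from this commit; the name `sad_wxh` and its parameter list are assumptions, not the libvpx signature. A SIMD version computes the same sum but handles many bytes per instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar reference SAD: sums |src - ref| over a width x height block.
 * The strides are the distances in bytes between the starts of
 * consecutive rows in each buffer. */
unsigned int sad_wxh(const uint8_t *src, int src_stride,
                     const uint8_t *ref, int ref_stride,
                     int width, int height) {
  unsigned int sad = 0;
  assert(width > 0 && height > 0);
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      sad += (unsigned int)abs(src[x] - ref[x]);
    }
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}
```

A 32x16 block, for example, would be evaluated as `sad_wxh(src, src_stride, ref, ref_stride, 32, 16)`.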
2014-10-17	Add highbitdepth function for vp9_avg_8x8	Peter de Rivaz
	Cherry-picked from https://gerrit.chromium.org/gerrit/#/c/71914/ (a92f987a6b7819ae5c62a429e126e1c26bdb1b71) on highbitdepth branch. Change-Id: I6903e4e4cb57d90590725c8a1c64c23da7ae65e8
2014-10-16	move LFWorkerData allocation to VP9LfSync	James Zern
	this removes an assumption that worker->data1 would be pointing to a TileWorkerData allocation. additionally, within the multi-threaded loopfilter pass VP9LfSync as a parameter to the worker hook, removing the need for a shadow pointer in LFWorkerData. Change-Id: Ic7b2faa34e3eb59dbcb8a7c67f333448fa047c88
2014-10-14	Merge "Add a 32-bit friendly sse2 quantizer."	Alex Converse

2014-10-14	Add a 32-bit friendly sse2 quantizer.	Alex Converse
	This is based on the 64-bit ssse3 quantizer. 1.1x speedup for screen content at speed 7. Change-Id: I57d15415ef97c49165954bbe3daaaf9318e37448
2014-10-14	Merge "Remove extra line."	hkuang

2014-10-14	Merge "Remove mi_grid_base_array from VP9_COMMON (unused)"	Adrian Grange

2014-10-13	Use pre increment.	hkuang
	Change-Id: I016b4e77d8268e189473f4c382603afe1ae1750f
2014-10-13	Remove mi_grid_base_array from VP9_COMMON (unused)	Adrian Grange
	Change-Id: I4b4764463f5a7cdc01ec004b882c6237466c74b0