libvpx.git

Age	Commit message	Author
2017-03-01	Improve idct32x32_34_add SSSE3 intrinsics performance	Yi Luo
	- Split the transform into first half and second half. - Reschedule the instructions to avoid stack spillover. - Function level speed improves ~16%. Change-Id: I166889840d23aa8a273eca00f6fbdae8b4566f35
2017-02-24	get_prob(): rationalize int types	James Zern
	Promote the unsigned int calculation to uint64_t rather than int64_t for type consistency. Change-Id: Ic34dee1dc707d9faf6a3ae250bfe39b60bef3438
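The promotion described above can be sketched in C as follows. This is a minimal illustration, not the libvpx implementation: the name `get_prob_sketch`, the `uint8_t` return type, the scaling by 256 with rounding, and the clamp to [1, 255] are all assumptions made for the example. Only the idea of widening an unsigned calculation to `uint64_t` comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: scale num/den to an 8-bit probability.
 * The intermediate product num * 256 can exceed 32 bits, so it is
 * computed in uint64_t. Both operands are unsigned, so an unsigned
 * 64-bit type keeps the whole expression in one signedness. */
static uint8_t get_prob_sketch(unsigned int num, unsigned int den) {
  assert(den != 0);
  /* Round to nearest by adding half the denominator before dividing. */
  const uint64_t p = ((uint64_t)num * 256 + (den >> 1)) / den;
  if (p > 255) return 255; /* assumed upper clamp */
  if (p < 1) return 1;     /* assumed lower clamp */
  return (uint8_t)p;
}
```

With a signed `int64_t` cast the arithmetic would still fit, but the expression would mix signed and unsigned operands; keeping everything unsigned avoids that.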
2017-02-22	Merge "Fix segmentation fault caused by denoiser working with spatial SVC."	Jerome Jiang

2017-02-21	Following SSSE3 intrinsics functions also work for HBD	Yi Luo
	- vpx_idct8x8_12_add_ssse3, vpx_idct8x8_64_add_ssse3, vpx_idct32x32_34_add_ssse3, vpx_idct32x32_135_add_ssse3, vpx_idct32x32_1024_add_ssse3. - Turn on unit tests. Change-Id: I788b2b3b2074a6f3ab6a0e6f469c1327a123eff7
2017-02-21	Fix segmentation fault caused by denoiser working with spatial SVC.	Jerome Jiang
	Re-enable the affected test. BUG=webm:1374 Change-Id: I98cd49403927123546d1d0056660b98c9cb8babb
2017-02-17	Fix idct8x8 SSSE3 SingleExtremeCoeff unit tests	Yi Luo
	- In the SSSE3 optimization, 16-bit addition and subtraction would overflow when the input coefficients are extreme 16-bit signed values. - Function-level timing increases (in ms): idct8x8_64: 284 -> 294; idct8x8_12: 145 -> 158. BUG=webm:1332 Change-Id: I1e4bf9d30a6d4112b8cac5823729565bf145e40b
2017-02-17	Merge "Add vpx_highbd_idct16x16_10_add_neon()"	James Zern

2017-02-16	Replace idct32x32_1024_add_ssse3 assembly with intrinsics	Yi Luo
	- Encoding/decoding test with BQTerrace_1920x1080_60.y4m on an i7-6700: no obvious user-level performance degradation. - Passed unit tests. Change-Id: I20688e0dd3731021ec8fb4404734336f1a426bfc
2017-02-16	Merge "block error avx2: use tran_low_t"	Johann Koenig

2017-02-16	Add vpx_highbd_idct16x16_10_add_neon()	Linfeng Zhang
	BUG=webm:1301 Change-Id: If686c8144764c4162458f0bc4bb1bbf6555c48ab
2017-02-16	Merge "Fix mips vpx_post_proc_down_and_across_mb_row_msa function"	James Zern

2017-02-16	Merge "correct bitdepth_conversion_sse2.h header guard"	Johann Koenig

2017-02-16	correct bitdepth_conversion_sse2.h header guard	Johann
	Change-Id: Ic4ffd861608e67fe59bcb3a86010ce3ef11a5519
2017-02-16	Merge "Add idct32x32_135_add SSSE3 intrinsics"	Yi Luo

2017-02-16	block error avx2: use tran_low_t	Johann
	Change-Id: Ic5f3a1f569d6f82afeaf4fcd7235374bb460db3c
2017-02-16	Add idct32x32_135_add SSSE3 intrinsics	Yi Luo
	- Replace the corresponding assembly code. - No user-level performance degradation. - Unit tests passed. Change-Id: Idd0c5a4bad4976f1617c34100cb46e75e3b961e5
2017-02-16	quantize_fp highbd ssse3: use tran_low_t for coeff	Johann
	Change-Id: Iebade0efc0efbb0a80a0f3adbef4962e3a2f25e8
2017-02-16	bitdepth conversion: really use num elements	Johann
	The previous implementation confused bits/bytes/elements. It was using '32' as the multiplier, but that was mistakenly adopted because a 32x32 transform embedded the stride. Change-Id: Ieeb867a332416b9a40580b5e7c9b20088e9e691a
2017-02-16	Fix mips vpx_post_proc_down_and_across_mb_row_msa function	Kaustubh Raste
	Added a fix to handle the case where cols is not a multiple of 16, for size 16. Change-Id: If3a6d772d112077c5e0a9be9e612e1148f04338c
2017-02-16	Merge "Use 'packssdw' for loading tran_low_t values"	Johann Koenig

2017-02-15	cosmetics,dsp/inv_txfm.c: reorder functions	Linfeng Zhang
	Change-Id: Ie0f7689ebe230c68eadb22a32b14838c1a7543a6
2017-02-15	Add vpx_highbd_idct16x16_38_add_neon()	Linfeng Zhang
	BUG=webm:1301 Change-Id: Ic6cd8c1e63e1b7a997cbed221e20fff4c599e0fe
2017-02-14	Add vpx_highbd_idct16x16_38_add_c()	Linfeng Zhang
	When eob is less than or equal to 38 for high-bitdepth 16x16 idct, call this function. BUG=webm:1301 Change-Id: I09167f89d29c401f9c36710b0fd2d02644052060
2017-02-14	Use 'packssdw' for loading tran_low_t values	Johann
	This matches bitdepth_conversion_sse2.asm and produces substantially better assembly. The old way had lots of 'movzwl' and 'shl' and storing back to memory before loading into an xmm register. Change-Id: Ib33e35354dfd691a4f8b1e39f4dbcbb14cd5302b
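As a rough illustration of the approach named above, the sketch below narrows eight 32-bit coefficients to 16-bit lanes with `_mm_packs_epi32`, the SSE2 intrinsic that compiles to `packssdw`. It is not the libvpx helper: the function name `load_coeffs_as_int16` is hypothetical, and it assumes the coefficients are stored as `int32_t` (as `tran_low_t` would be in a high-bitdepth build). A scalar fallback with the same saturating behavior is included so the sketch also builds on non-x86 targets.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: convert eight 32-bit coefficients to int16_t,
 * saturating values that do not fit in 16 bits. */
static void load_coeffs_as_int16(const int32_t *coeff, int16_t out[8]) {
#if defined(__SSE2__)
  /* Two unaligned 128-bit loads, four 32-bit lanes each. */
  const __m128i lo = _mm_loadu_si128((const __m128i *)coeff);
  const __m128i hi = _mm_loadu_si128((const __m128i *)(coeff + 4));
  /* packssdw: lanes of lo fill the low half, lanes of hi the high half. */
  _mm_storeu_si128((__m128i *)out, _mm_packs_epi32(lo, hi));
#else
  /* Scalar model of the same signed saturation. */
  for (int i = 0; i < 8; ++i) {
    int32_t v = coeff[i];
    if (v > INT16_MAX) v = INT16_MAX;
    if (v < INT16_MIN) v = INT16_MIN;
    out[i] = (int16_t)v;
  }
#endif
}
```

The pack instruction keeps the data in registers, which is consistent with the commit's remark about avoiding the store back to memory before loading into an xmm register.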
2017-02-14	Replace 14 with DCT_CONST_BITS in idct NEON functions' shifts	Linfeng Zhang
	Change-Id: I2a39a3bb87516b04d273bc1c0f4a634e3fb6f0f6
2017-02-14	apply clang-format	clang-format
	Change-Id: I75e4a9e0b37bd4586f26c8d6c1fa27f3f6ff1bce
2017-02-14	Merge "Replace idct32x32_34_add_ssse3 assembly with intrinsics"	Yi Luo

2017-02-14	Replace idct32x32_34_add_ssse3 assembly with intrinsics	Yi Luo
	- No user-level performance change. - Passes unit tests. Change-Id: Idfc598e00f354265e41f6b3219f4734216c115c6
2017-02-14	Merge "Add vpx_highbd_idct16x16_256_add_neon()"	Linfeng Zhang

2017-02-13	Add vpx_highbd_idct16x16_256_add_neon()	Linfeng Zhang
	BUG=webm:1301 Change-Id: I6bb755552a39bdd26eef3f449601f6a9766c65ec
2017-02-13	fdct8x8 highbd neon: use tran_low_t for output	Johann
	Change-Id: I100c4a1955d80bec4d28e82796b3e7f57e84d0ba
2017-02-13	Add vpx_highbd_idct{16x16,32x32}_1_add_neon()	Linfeng Zhang
	and update vpx_highbd_idct8x8_1_add_neon() BUG=webm:1301 Change-Id: I18d1a0cbe98ba822d5194c1b4e13a4c29c5c75f4
2017-02-11	Merge "Add vpx_idct16x16_38_add_neon()"	James Zern

2017-02-08	Add vpx_idct16x16_38_add_neon()	Linfeng Zhang
	The RunQuantCheck() test on it exposes 16-bit overflow in stage 7 of pass 2. Change to use saturating add/sub for both vpx_idct16x16_38_add_neon() and vpx_idct16x16_256_add_neon() for high bitdepth. Change-Id: Ibf4c107a887553a52852cc582e28d38a5a5a2712
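The difference between a wrapping and a saturating 16-bit add, which is the kind of change described above, can be modeled in scalar C. This is only a sketch of the arithmetic: the function names are hypothetical, and the actual commit operates on NEON vectors rather than single values.

```c
#include <stdint.h>

/* Hypothetical scalar model of a plain 16-bit add: the sum wraps around.
 * Converting an out-of-range value to int16_t is implementation-defined
 * in C; common compilers reduce it modulo 2^16. */
static int16_t wrapping_add16(int16_t a, int16_t b) {
  return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}

/* Hypothetical scalar model of a saturating 16-bit add: the sum is
 * computed in 32 bits and clamped to the int16_t range. */
static int16_t saturating_add16(int16_t a, int16_t b) {
  const int32_t sum = (int32_t)a + (int32_t)b;
  if (sum > INT16_MAX) return INT16_MAX;
  if (sum < INT16_MIN) return INT16_MIN;
  return (int16_t)sum;
}
```

With extreme inputs the wrapping version flips sign, while the saturating version stays pinned at the nearest representable value.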
2017-02-08	Replace idct8x8_12_add_ssse3 assembly code with intrinsics	Yi Luo
	- Performance matches the assembly. - Unit tests pass. Change-Id: I6eacfbbd826b3946c724d78fbef7948af6406ccd
2017-02-07	Add vpx_idct16x16_38_add_c()	Linfeng Zhang
	When eob is less than or equal to 38 for 16x16 idct, call this function. Change-Id: Ief6f3fb16a49ace3c92cebf4e220bf5bf52a6087
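The eob-based selection described above might look like the following sketch. Only the "eob less than or equal to 38" rule comes from the commit message; the other thresholds (1 and 10) are assumptions inferred from function names that appear elsewhere in this log (the `_1_add`, `_10_add`, and `_256_add` variants), and the enum and function names are hypothetical.

```c
/* Hypothetical labels for the 16x16 inverse-transform variants. */
typedef enum {
  IDCT16_1,   /* DC only (assumed threshold) */
  IDCT16_10,  /* eob <= 10 (assumed threshold) */
  IDCT16_38,  /* eob <= 38, per the commit message */
  IDCT16_256  /* full transform */
} Idct16Variant;

/* Pick the cheapest variant that covers all nonzero coefficients.
 * eob is the end-of-block position, i.e. the count of coefficients
 * up to and including the last nonzero one in scan order. */
static Idct16Variant pick_idct16x16(int eob) {
  if (eob == 1) return IDCT16_1;
  if (eob <= 10) return IDCT16_10;
  if (eob <= 38) return IDCT16_38;
  return IDCT16_256;
}
```

A smaller eob means more of the input block is known to be zero, so a specialized variant can skip the corresponding work.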
2017-02-07	Merge "Update 16x16 8-bit idct NEON intrinsics"	Linfeng Zhang

2017-02-06	highbd x86: consolidate tran_low_t conversions	Johann
	Create new helper files specifically for converting tran_low_t types. Change-Id: I7c4c458ef910f3b3d10a3cfbf9df4de7682fd905
2017-02-02	Merge "Add SSSE3 intrinsic 8x8 inverse 2D-DCT"	Jingning Han

2017-02-02	Merge "Add mips msa sum_squares_2d_i16 function"	Kaustubh Raste

2017-02-02	Merge "Remove neon assembly for idct 16x16 and 8x8"	Johann Koenig

2017-02-02	Merge changes I43521ad3,I013659f6	Johann Koenig
	* changes: satd highbd neon: use tran_low_t for coeff; satd highbd sse2: use tran_low_t for coeff
2017-02-01	Update 16x16 8-bit idct NEON intrinsics	Linfeng Zhang
	Remove redundant memory accesses. Change-Id: I8049074bdba5f49eab7e735b2b377423a69cd4c8
2017-02-01	Add SSSE3 intrinsic 8x8 inverse 2D-DCT	Jingning Han
	The intrinsic version reduces the average cycles from 183 to 175. Change-Id: I7c1bcdb0a830266e93d8347aed38120fb3be0e03
2017-02-01	Merge changes I374dfc08,I7e15192e,Ica414007	Johann Koenig
	* changes: hadamard highbd ssse3: use tran_low_t for coeff; hadamard highbd neon: use tran_low_t for coeff; hadamard highbd sse2: use tran_low_t for coeff
2017-02-01	Merge "deblock: annotate postproc parameters"	Johann Koenig

2017-02-01	satd highbd neon: use tran_low_t for coeff	Johann
	BUG=webm:1365 Change-Id: I43521ad32b6c96737a8ef2b8c327f901fd7eaf84
2017-02-01	satd highbd sse2: use tran_low_t for coeff	Johann
	BUG=webm:1365 Change-Id: I013659f6b9fbf9cc52ab840eae520fe0b5f883fb
2017-02-01	hadamard highbd ssse3: use tran_low_t for coeff	Johann
	BUG=webm:1365 Change-Id: I374dfc08732932382043905f128e928b08cb4f57
2017-02-01	hadamard highbd neon: use tran_low_t for coeff	Johann
	BUG=webm:1365 Change-Id: I7e15192ead3a3631755b386f102c979f06e26279