libvpx.git

Age	Commit message	Author
2022-10-12	[NEON] Add highbd FDCT 8x8 function	Konstantinos Margaritis
	50% faster than C version in best/rt profiles Change-Id: I0f9504ed52b5d5f7722407e91108ed4056d66bc2
2022-10-12	[NEON] Add highbd FDCT 4x4 function	Konstantinos Margaritis
	~80% faster than C version for both best/rt profiles. Change-Id: Ibb3c8e1862131d2a020922420d53c66b31d5c2c3
2022-10-12	[NEON] Move helper functions for reuse	Konstantinos Margaritis
	Move all butterfly functions to fdct_neon.h Slightly optimize load/scale/cross functions in fdct 16x16. These will be reused in highbd variants. Change-Id: I28b6e0cc240304bab6b94d9c3f33cca77b8cb073
2022-10-10	[NEON] move transpose_8x8 to reuse	Konstantinos Margaritis
	Change-Id: I3915b6c9971aedaac9c23f21fdb88bc271216208
2022-10-10	Merge "[NEON] highbd partial DCT functions" into main	James Zern

2022-10-10	[NEON] highbd partial DCT functions	Konstantinos Margaritis
	Change-Id: I7dd4e698469562f5b1f948cc36f8403b490dcb6a
2022-10-07	Add vpx_highbd_sad64x{64,32}_avx2.	Scott LaVarnway
	~2.8x faster than the sse2 version. Bug: b/245917257 Change-Id: Ibc8e5d030ec145c9a9b742fff98fbd9131c9ede4
2022-10-06	Add vpx_highbd_sad32x{64,32,16}_avx2.	Scott LaVarnway
	2.7x to 3.1x faster than the sse2 version. Bug: b/245917257 Change-Id: Idff3284932f7ee89d036f38893205bf622a159a3
2022-10-05	Add vpx_highbd_sad16x{32,16,8}_avx2.	Scott LaVarnway
	1.9x to 2.4x faster than the sse2 version. Bug: b/245917257 Change-Id: I686452772f9b72233930de2207af36a0cd72e0bb
2022-09-30	vpx_subpixel_8t_intrin_avx2.c: quiet -Wuninitialized	Scott LaVarnway
	warning: ‘s2[3]’ may be used uninitialized
	and
	warning: ‘s1[3]’ may be used uninitialized
	The warnings exposed unused code.
	Change-Id: I75cf1f9db75e811cb42e2f143be1ad76f3e4dee9
2022-09-26	quantize: standardize vp9_quantize_fp_sse2	Johann
	Match style for vpx_quantize_b_sse2 and prepare to rewrite ssse3 version in intrinsics. Need to evaluate the value of threshold breakout before going further. Change-Id: I9cfceb1bb0dc237cd6b73fc8d41d78bba444a15b
2022-09-23	quantize: increase iscan by 1	Johann
	All of the assembly adds 1 to iscan to convert from a 0 based array to the EOB value. Add 1 to all iscan values and remove the extra instructions from the assembly. Change-Id: I219dd7f2bd10533ab24b206289565703176dc5e9
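The effect of the iscan change can be sketched in plain C. This is a minimal illustration, not the libvpx code: the 4-entry tables and the names `eob_from_zero_based` and `eob_from_one_based` are hypothetical, and the real kernels do this with SIMD max operations rather than a scalar loop.

```c
#include <stdint.h>

/* Hypothetical 4-coefficient inverse scan tables: iscan[pos] is the scan
 * order index of the coefficient stored at position pos. */
static const int16_t kIscanZeroBased[4] = { 0, 2, 1, 3 };
static const int16_t kIscanOneBased[4] = { 1, 3, 2, 4 }; /* same table, +1 */

/* Old scheme: every kernel adds 1 to turn a 0-based index into an EOB. */
static int eob_from_zero_based(const int16_t *qcoeff, int n) {
  int eob = 0;
  for (int i = 0; i < n; ++i) {
    const int candidate = kIscanZeroBased[i] + 1; /* the extra instruction */
    if (qcoeff[i] != 0 && candidate > eob) eob = candidate;
  }
  return eob;
}

/* New scheme: the table already stores EOB values, so no add is needed. */
static int eob_from_one_based(const int16_t *qcoeff, int n) {
  int eob = 0;
  for (int i = 0; i < n; ++i) {
    if (qcoeff[i] != 0 && kIscanOneBased[i] > eob) eob = kIscanOneBased[i];
  }
  return eob;
}
```

Both functions return the same end-of-block value; the second simply moves the `+ 1` out of the inner loop and into the table.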
2022-09-21	Merge "post_proc_sse2.c: quiet -Wuninitialized" into main	Scott LaVarnway

2022-09-21	post_proc_sse2.c: quiet -Wuninitialized	Scott LaVarnway
	In file included from ../libvpx/vpx_dsp/x86/post_proc_sse2.c:12:
	In function ‘_mm_add_epi16’,
	    inlined from ‘vpx_mbpost_proc_down_sse2’ at ../libvpx/vpx_dsp/x86/post_proc_sse2.c:88:13:
	/usr/lib/gcc/x86_64-linux-gnu/12/include/emmintrin.h:1060:35: warning: ‘below_context’ may be used uninitialized [-Wmaybe-uninitialized]
	 1060 |   return (__m128i) ((__v8hu)__A + (__v8hu)__B);
	      |                                   ^~~~~~~~~~~
	../libvpx/vpx_dsp/x86/post_proc_sse2.c: In function ‘vpx_mbpost_proc_down_sse2’:
	../libvpx/vpx_dsp/x86/post_proc_sse2.c:39:13: note: ‘below_context’ was declared here
	   39 |   __m128i below_context;
	Change-Id: I2fc592f121c4e85d0aff1640014c3444f5eb09fd
2022-09-17	fwd_txfm: remove avx2 file from non-hbd	Johann
	Resolves warning on OS X: file: libvpx_g.a(fwd_txfm_avx2.c.o) has no symbols Change-Id: Ie8b290bb3ed329656beb883d552c98353f1ed5e5
2022-09-14	Add vpx_highbd_sad64x{64,32}x4d_avx2.	Scott LaVarnway
	~2x faster than the sse2 version. Bug: b/245917257 Change-Id: I4742950ab7b90d7f09e8d4687e1e967138acee39
2022-09-13	Add vpx_highbd_sad32x{64,32,16}x4d_avx2.	Scott LaVarnway
	~2.4x faster than the sse2 version. Bug: b/245917257 Change-Id: I6df2bd62b46e5e175c8ad80daa6de3a1c313db0f
2022-09-09	Add vpx_highbd_sad16x{32,16,8}x4d_avx2.	Scott LaVarnway
	1.98x to 2.3x faster than the sse2 version. Bug: b/245917257 Change-Id: Ie4f9bb942ffaf4af7d395fb5a5978b41aabfc93c
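For orientation, a scalar sketch of what a high bit depth "x4d" SAD kernel computes: the sum of absolute differences between one source block and four reference blocks. This is an illustration under assumptions, not the libvpx reference code: the name `highbd_sad_x4d_sketch` is hypothetical, and samples are taken directly as `uint16_t` pointers with the block size passed as parameters.

```c
#include <stdint.h>
#include <stdlib.h>

/* Computes sad[k] = sum over the block of |src - ref[k]| for k = 0..3.
 * Samples are 16-bit to hold 10- or 12-bit video. */
static void highbd_sad_x4d_sketch(const uint16_t *src, int src_stride,
                                  const uint16_t *const ref[4], int ref_stride,
                                  int width, int height, uint32_t sad[4]) {
  for (int k = 0; k < 4; ++k) {
    uint32_t sum = 0;
    for (int r = 0; r < height; ++r) {
      for (int c = 0; c < width; ++c) {
        const int diff = src[r * src_stride + c] - ref[k][r * ref_stride + c];
        sum += (uint32_t)abs(diff);
      }
    }
    sad[k] = sum;
  }
}
```

A SIMD version such as the AVX2 one processes many 16-bit samples per instruction, which is where speedups of the kind quoted above would come from.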
2022-09-06	Merge "x86,cosmetics: prefer _mm_setzero_si128/_mm256_setzero_si256" into main	James Zern

2022-09-02	sad_neon: enable UDOT implementation w/aarch32	James Zern
	Change-Id: Ia28305ec5c61518b732cbacbd102acd2cb7f9d82
2022-09-02	variance_neon.cc: simplify __ARM_FEATURE_DOTPROD check	James Zern
	missed in
	447e27588 vpx_dsp,neon: simplify __ARM_FEATURE_DOTPROD check
	+ fix #if comments
	only check that the macro is defined, the value doesn't have any effect.
	from https://arm-software.github.io/acle/main/acle.html:
	5.5.7.7. Dot Product extension
	__ARM_FEATURE_DOTPROD is defined if the dot product data manipulation instructions are supported and the vector intrinsics are available. Note that this implies:
	- __ARM_NEON == 1
	Change-Id: I098b96421b7de5928bb3b11612ca1f32e7b6cbc4
2022-09-02	x86,cosmetics: prefer _mm_setzero_si128/_mm256_setzero_si256	James Zern
	over _set1_(0) Change-Id: I136e1798a2ce286480ebb9418db67a2f1e92b9a2
2022-09-02	vpx_dsp,neon: simplify __ARM_FEATURE_DOTPROD check	James Zern
	only check that the macro is defined, the value doesn't have any effect.
	from https://arm-software.github.io/acle/main/acle.html:
	5.5.7.7. Dot Product extension
	__ARM_FEATURE_DOTPROD is defined if the dot product data manipulation instructions are supported and the vector intrinsics are available. Note that this implies:
	- __ARM_NEON == 1
	Change-Id: I164fe121ccefda99050a9b6a99738a2b518520f3
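A minimal sketch of the simplified check, assuming a hypothetical 16-byte SAD helper named `sad16_sketch` (not a libvpx function). The guard tests only whether `__ARM_FEATURE_DOTPROD` is defined; since the macro implies NEON, no separate `__ARM_NEON` test or value comparison is made. The fallback branch is plain C so the sketch builds on any target.

```c
#include <stdint.h>

#if defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>
#endif

/* Sum of absolute differences over 16 bytes. */
static uint32_t sad16_sketch(const uint8_t *a, const uint8_t *b) {
#if defined(__ARM_FEATURE_DOTPROD)
  /* UDOT path: a dot product with all-ones sums the absolute differences. */
  const uint8x16_t abs_diff = vabdq_u8(vld1q_u8(a), vld1q_u8(b));
  const uint32x4_t acc = vdotq_u32(vdupq_n_u32(0), abs_diff, vdupq_n_u8(1));
  uint32x2_t sum = vadd_u32(vget_low_u32(acc), vget_high_u32(acc));
  sum = vpadd_u32(sum, sum);
  return vget_lane_u32(sum, 0);
#else
  /* Generic fallback when the dot product extension is unavailable. */
  uint32_t sum = 0;
  for (int i = 0; i < 16; ++i) {
    sum += (uint32_t)(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
  }
  return sum;
#endif /* defined(__ARM_FEATURE_DOTPROD) */
}
```

The horizontal add uses `vadd_u32`/`vpadd_u32` rather than an AArch64-only reduction so that the same branch could, in principle, also serve a 32-bit Arm build.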
2022-09-01	neon,load_unaligned_*: use dup for lane 0	James Zern
	this produces better assembly with gcc (11.3.0-3); no change in assembly using clang from the r24 android sdk (Android (8075178, based on r437112b) clang version 14.0.1 (https://android.googlesource.com/toolchain/llvm-project 8671348b81b95fc603505dfc881b45103bee1731) Change-Id: Ifec252d4f499f23be1cd94aa8516caf6b3fbbc11
2022-08-26	highbd_variance_neon,cosmetics: reorder a few lines	James Zern
	Change-Id: Ia6fa54652d7f94687e64108482bb0f28ca06cf49
2022-08-26	Merge "[NEON] Add highbd variance functions" into main	James Zern

2022-08-25	[NEON] Add highbd variance functions	Konstantinos Margaritis
	Total gain for 12-bit encoding:
	* ~7.2% for best profile
	* ~5.8% for rt profile
	Change-Id: I5b70415fb89d1bbb02a0c139eb317ba6b08adede
2022-08-24	Merge "[NEON] Improve vpx_quantize_b* functions" into main	James Zern

2022-08-23	.clang-format: update to clang-format-11	clang-format
	only store the deltas from --style Google in the file and reapply using Debian clang-format version 11.1.0-6+build1 Bug: b/229626362 Change-Id: I3e18a2e7c17a90a48405b3cf1b37ebc652aba0db
2022-08-23	[NEON] Improve vpx_quantize_b* functions	Konstantinos Margaritis
	Slight optimization, prefetch gives a 1% improvement in 1st pass Change-Id: Iba4664964664234666406ab53893e02d481fbe61
2022-08-22	Merge "highbd_quantize_neon.c: remove unneeded assert.h" into main	James Zern

2022-08-22	Merge changes Iabed118b,I60a384b2 into main	James Zern
	* changes:
	  use VPX_NO_UNSIGNED_SHIFT_CHECK with entropy functions
	  compiler_attributes.h: add VPX_NO_UNSIGNED_SHIFT_CHECK
2022-08-22	[NEON] Add vpx_highbd_subtract_block function	Konstantinos Margaritis
	Total gain for 12-bit encoding:
	* ~1% for best and rt profile
	Change-Id: I4039120dc570baab1ae519a5e38b1acff38d81f0
2022-08-22	[NEON] Added vpx_highbd_sad* functions	Konstantinos Margaritis
	Total gain for 12-bit encoding:
	* ~7.8% for best profile
	* ~10% for rt profile
	Change-Id: I89eda5c4372a5b628c9df84cdeb4c8486fc44789
2022-08-22	highbd_quantize_neon.c: remove unneeded assert.h	James Zern
	Change-Id: I041f5fb23b856a2b519669b5bf8a40d3772b4a6e
2022-08-20	[NEON] Added vpx_highbd_quantize_b* functions	Konstantinos Margaritis
	Total gain for 12-bit encoding:
	* ~4.8% for best profile
	* ~6.2% for rt profile
	Change-Id: I61e646ab7aedf06a25db1365d6d1cf7b05101c21
2022-08-18	use VPX_NO_UNSIGNED_SHIFT_CHECK with entropy functions	James Zern
	these shift values off the most significant bit as part of the process; vp8_regular_quantize_b_sse4_1 is included here for a special case of mask creation
	quiets warnings of the form:
	vp8/decoder/dboolhuff.h:81:11: runtime error: left shift of 2373679303235599696 by 3 places cannot be represented in type 'VP8_BD_VALUE' (aka 'unsigned long')
	vp8/encoder/bitstream.c:257:18: runtime error: left shift of 2147493041 by 1 places cannot be represented in type 'unsigned int'
	vp8/encoder/x86/quantize_sse4.c:114:18: runtime error: left shift of 4294967294 by 1 places cannot be represented in type 'unsigned int'
	vp9/encoder/vp9_pickmode.c:1632:41: runtime error: left shift of 4294967295 by 1 places cannot be represented in type 'unsigned int'
	Bug: b/229626362
	Change-Id: Iabed118b2a094232783e5ad0e586596d874103ca
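A sketch of how such an annotation might look. The macro name `SKETCH_NO_UNSIGNED_SHIFT_CHECK` and the function `shift_out_msb` are hypothetical; the actual definition of `VPX_NO_UNSIGNED_SHIFT_CHECK` in compiler_attributes.h is not shown in this log and may differ. The sketch assumes clang's `no_sanitize("unsigned-shift-base")` attribute and expands to nothing on other compilers.

```c
#include <stdint.h>

/* Disable the unsigned shift sanitizer check for one function when the
 * compiler supports it; otherwise the macro is empty. */
#if defined(__clang__) && defined(__has_attribute)
#if __has_attribute(no_sanitize)
#define SKETCH_NO_UNSIGNED_SHIFT_CHECK \
  __attribute__((no_sanitize("unsigned-shift-base")))
#endif
#endif
#ifndef SKETCH_NO_UNSIGNED_SHIFT_CHECK
#define SKETCH_NO_UNSIGNED_SHIFT_CHECK
#endif

/* Deliberately shifts the top bits out of the value. This is well defined
 * for unsigned types in C, but the integer sanitizer reports it unless the
 * function is annotated. count must be in [0, 31]. */
static SKETCH_NO_UNSIGNED_SHIFT_CHECK uint32_t shift_out_msb(uint32_t value,
                                                             int count) {
  return value << count;
}
```

Annotating only the functions that discard high bits on purpose keeps the check active everywhere else.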
2022-08-18	loopfilter.c: normalize flat func param type	James Zern
	flat/flat2 are stored as int8_t as returned by the filter_mask* functions. this quiets integer sanitizer warnings of the form: vpx_dsp/loopfilter.c:197:28: runtime error: implicit conversion from type 'int8_t' (aka 'signed char') of value -1 (8-bit, signed) to type 'uint8_t' (aka 'unsigned char') changed the value to 255 (8-bit, unsigned) Bug: b/229626362 Change-Id: Iacb6ae052d4cb2b6e0ebccbacf59ece9501d3b5f
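The type mismatch can be illustrated with a small sketch. The helpers `flat_mask_sketch` and `select_pixel` are hypothetical stand-ins, not the libvpx functions: the mask is produced as an `int8_t` holding 0 or -1, so a consumer that takes it as `int8_t` avoids the implicit -1 to 255 conversion the sanitizer reports.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns -1 (all bits set) when both neighbour steps are within thresh,
 * otherwise 0. */
static int8_t flat_mask_sketch(uint8_t thresh, uint8_t p1, uint8_t p0,
                               uint8_t q0, uint8_t q1) {
  int8_t not_flat = 0;
  not_flat |= (abs(p1 - p0) > thresh) * -1;
  not_flat |= (abs(q1 - q0) > thresh) * -1;
  return (int8_t)~not_flat;
}

/* Taking flat as int8_t matches the producer's type, so passing -1 does
 * not change value on the way in. */
static uint8_t select_pixel(int8_t flat, uint8_t filtered, uint8_t original) {
  return flat ? filtered : original;
}
```

Had `select_pixel` declared `flat` as `uint8_t`, the result would be identical, but each call with a set mask would trigger the implicit conversion warning.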
2022-08-16	highbd_quantize_intrin_sse2: quiet int sanitizer warnings	James Zern
	add a missing cast in ^ operations; quiets warnings of the form: implicit conversion from type 'int' of value -1 (32-bit, signed) to type 'unsigned int' changed the value to 4294967295 (32-bit, unsigned) Bug: b/229626362 Change-Id: I56f74981050b2c9d00bad20e68f1b73ce7454729
2022-08-16	load_unaligned_u32: use an int w/_mm_cvtsi32_si128	James Zern
	this matches the type of the function parameter; quiets integer sanitizer warnings of the form: implicit conversion from type 'uint32_t' (aka 'unsigned int') of value 3215646151 (32-bit, unsigned) to type 'int' changed the value to -1079321145 (32-bit, signed) Bug: b/229626362 Change-Id: Ia9a5dc5e1f57cbf4f8f8fa457bb674ef43369d37
2022-08-16	variance_sse2.c: add some missing casts	James Zern
	quiets integer sanitizer warnings of the form: ../vpx_dsp/x86/variance_sse2.c:100:10: runtime error: implicit conversion from type 'unsigned int' of value 4294966272 (32-bit, unsigned) to type 'int' changed the value to -1024 (32-bit, signed) Bug: b/229626362 Change-Id: I150cc0a6a6b85143c3bf96886686fe3a40897db5
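A scalar sketch of where explicit casts of this kind tend to go in a variance computation. The function `variance4x4_sketch` is hypothetical and is not the SSE2 code touched by the commit; it only shows the signed sum and unsigned sum of squares being combined with the conversions written out.

```c
#include <stdint.h>

/* Variance of a 4x4 block: sse - sum^2 / 16. Also stores sse through *sse. */
static uint32_t variance4x4_sketch(const uint8_t *src, int src_stride,
                                   const uint8_t *ref, int ref_stride,
                                   uint32_t *sse) {
  int sum = 0;          /* signed: differences may be negative */
  uint32_t sq_sum = 0;  /* unsigned: squares are non-negative */
  for (int r = 0; r < 4; ++r) {
    for (int c = 0; c < 4; ++c) {
      const int diff = src[r * src_stride + c] - ref[r * ref_stride + c];
      sum += diff;
      sq_sum += (uint32_t)(diff * diff); /* explicit int -> unsigned */
    }
  }
  *sse = sq_sum;
  /* Square in 64 bits, divide by 16 pixels, then cast back explicitly. */
  return sq_sum - (uint32_t)(((int64_t)sum * sum) >> 4);
}
```

With the casts spelled out, every signed to unsigned transition is intentional and visible, which is what the sanitizer otherwise flags.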
2022-08-09	VPX: Fix vp9_quantize_fp_avx2() VS build error.	Scott LaVarnway
	Add build fix for _mm256_extract_epi16() being undefined. Bug: b/237714063 Change-Id: I855b1828ce1b6b2b2f063fe097999481881bf074
2022-08-05	VPX: Add vpx_subtract_block_avx2().	Scott LaVarnway
	~1.3x faster than vpx_subtract_block_sse2(). Based on aom_subtract_block_avx2(). Bug: b/241580104 Change-Id: I17da036363f213d53c6546c3e858e4c3cba44a5b
2022-07-29	Provide Arm SDOT optimizations for SAD functions	Konstantinos Margaritis
	Change-Id: I497ee1c45d1fc4d643cefad7d87e5aaacd77869c
2022-07-27	x86: normalize type with _mm_cvtsi128_si32	James Zern
	prefer int in most cases w/clang -fsanitize=integer fixes warnings of the form: implicit conversion from type 'int' of value -809931979 (32-bit, signed) to type 'uint32_t' (aka 'unsigned int') changed the value to 3485035317 (32-bit, unsigned) Bug: b/229626362 Change-Id: I0c6604efc188f2660c531eddfc7aa10060637813
2022-07-27	variance_avx2.c: fix implicit conversion warnings	James Zern
	w/clang -fsanitize=integer fixes warnings of the form: implicit conversion from type 'int' of value -1323 (32-bit, signed) to type 'unsigned int' changed the value to 4294965973 (32-bit, unsigned) Bug: b/229626362 Change-Id: I7291d9bd5cacea0d88d9f4c4624c096764f4a472
2022-07-26	VPX: Add vpx_highbd_quantize_b_32x32_avx2().	Scott LaVarnway
	Up to 11.78x faster than vpx_quantize_b_32x32_sse2() for full calculations. ~1.7% overall encoder improvement for the test clip used. Bug: b/237714063 Change-Id: Ib759056db94d3487239cb2748ffef1184a89ae18
2022-07-25	VPX: Add vpx_highbd_quantize_b_avx2().	Scott LaVarnway
	Up to 3.61x faster than vpx_highbd_quantize_b_sse2() for full calculations. ~2.3% overall encoder improvement for the test clip used. Bug: b/237714063 Change-Id: I23f88d2a7f96aaa4103778372f4f552207f73cee
2022-07-25	Merge "VPX: Add vpx_quantize_b_32x32_avx2()." into main	Scott LaVarnway

2022-07-20	avg_intrin_avx2: rm dead store in highbd_hadamard_8x8	James Zern
	missed in: 53dd1e8e7 avg_intrin_{sse2,avx2}: rm dead store in hadamard_8x8 Change-Id: I378e4a388ceb193a4cfee4d9d317fc62fcc4b39e