author | Noah Goldstein <goldstein.w.n@gmail.com> | 2022-11-08 17:38:41 -0800 |
---|---|---|
committer | Noah Goldstein <goldstein.w.n@gmail.com> | 2022-11-08 19:22:33 -0800 |
commit | 52cf11004eb10f8ebbc193fbdf4094cfecb3dbff (patch) | |
tree | dc3c5cf41a53bd42de548c3e0d04f37dac95b72b /sysdeps/x86_64/wcpncpy.S | |
parent | 64b8b6516b3cba19dba4c8f4f9b97daa0556fd98 (diff) | |
x86: Add avx2 optimized functions for the wchar_t strcpy family
Implemented:
wcscat-avx2 (+ 744 bytes)
wcscpy-avx2 (+ 539 bytes)
wcpcpy-avx2 (+ 577 bytes)
wcsncpy-avx2 (+1108 bytes)
wcpncpy-avx2 (+1214 bytes)
wcsncat-avx2 (+1085 bytes)
Performance Changes:
Times are from N = 10 runs of the benchmark suite and are reported
as geometric mean of all ratios of New Implementation / Best Old
Implementation. Best Old Implementation was determined with the
highest ISA implementation.
wcscat-avx2 -> 0.975
wcscpy-avx2 -> 0.591
wcpcpy-avx2 -> 0.698
wcsncpy-avx2 -> 0.730
wcpncpy-avx2 -> 0.711
wcsncat-avx2 -> 0.954
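The summary figures above are geometric means of per-benchmark New / Best Old ratios. As a minimal sketch of that calculation, assuming per-benchmark timings are available as two arrays: the helper name `geomean_ratio` is hypothetical and not part of the glibc benchmark suite, and bisection is used here only to keep the example free of libm.

```c
#include <assert.h>
#include <stddef.h>

/* Geometric mean of new_t[i] / old_t[i] for i in [0, n).
   All timings must be positive.  Hypothetical helper, not glibc code.  */
double
geomean_ratio (const double *new_t, const double *old_t, size_t n)
{
  assert (n > 0);

  /* The geometric mean lies between the smallest and largest ratio.  */
  double lo = new_t[0] / old_t[0];
  double hi = lo;
  for (size_t i = 1; i < n; i++)
    {
      double r = new_t[i] / old_t[i];
      if (r < lo)
        lo = r;
      if (r > hi)
        hi = r;
    }

  /* Bisect on g until prod (ratio_i / g) == 1.  This avoids libm; the
     closed form is exp (mean (log (ratio_i))).  */
  for (int iter = 0; iter < 100; iter++)
    {
      double mid = 0.5 * (lo + hi);
      double prod = 1.0;
      for (size_t i = 0; i < n; i++)
        prod *= (new_t[i] / old_t[i]) / mid;
      if (prod > 1.0)
        lo = mid;   /* mid is below the geometric mean.  */
      else
        hi = mid;   /* mid is at or above it.  */
    }
  return 0.5 * (lo + hi);
}
```

Because each ratio is new time over old time, a result below 1.0 means the new implementation is faster on balance. A benchmark that halves the time (0.5) and one that doubles it (2.0) cancel out to 1.0, which an arithmetic mean would not show.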
Code Size Changes:
This change increases the size of libc.so by ~5.5kb. For
reference the patch optimizing the normal strcpy family functions
decreases libc.so by ~5.2kb.
Full check passes on x86-64 and build succeeds for all ISA levels w/
and w/o multiarch.
Diffstat (limited to 'sysdeps/x86_64/wcpncpy.S')
-rw-r--r-- | sysdeps/x86_64/wcpncpy.S | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/sysdeps/x86_64/wcpncpy.S b/sysdeps/x86_64/wcpncpy.S
index b4e531473e..0e0f432fbb 100644
--- a/sysdeps/x86_64/wcpncpy.S
+++ b/sysdeps/x86_64/wcpncpy.S
@@ -24,11 +24,12 @@
 #include <isa-level.h>
-#if MINIMUM_X86_ISA_LEVEL >= 4
+#if MINIMUM_X86_ISA_LEVEL >= 3
 # define WCPNCPY __wcpncpy
 # define DEFAULT_IMPL_V4 "multiarch/wcpncpy-evex.S"
+# define DEFAULT_IMPL_V3 "multiarch/wcpncpy-avx2.S"
 /* isa-default-impl.h expects DEFAULT_IMPL_V1 to be defined but it should never be used from here. */
 # define DEFAULT_IMPL_V1 "ERROR -- Invalid ISA IMPL"
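The effect of lowering the guard to level 3 and adding DEFAULT_IMPL_V3 can be illustrated with a small standalone model. This is only a sketch: isa-default-impl.h is not shown in this diff, so the selection chain below is an assumption about how a per-level default might be picked, and `selected_impl` is a hypothetical helper. The three DEFAULT_IMPL_* strings are copied from the patched file.

```c
#include <stddef.h>

/* Hypothetical stand-in for the build-time value from <isa-level.h>.  */
#ifndef MINIMUM_X86_ISA_LEVEL
# define MINIMUM_X86_ISA_LEVEL 3
#endif

#if MINIMUM_X86_ISA_LEVEL >= 3

/* Values copied from the patched wcpncpy.S.  */
# define DEFAULT_IMPL_V4 "multiarch/wcpncpy-evex.S"
# define DEFAULT_IMPL_V3 "multiarch/wcpncpy-avx2.S"
# define DEFAULT_IMPL_V1 "ERROR -- Invalid ISA IMPL"

/* Assumed selection: use the entry matching the minimum ISA level.  */
# if MINIMUM_X86_ISA_LEVEL >= 4
#  define ISA_DEFAULT_IMPL DEFAULT_IMPL_V4
# else
#  define ISA_DEFAULT_IMPL DEFAULT_IMPL_V3
# endif

#endif

/* Returns the file name this model would pull in, or NULL when the
   build's minimum ISA level falls outside the guard.  */
const char *
selected_impl (void)
{
#ifdef ISA_DEFAULT_IMPL
  return ISA_DEFAULT_IMPL;
#else
  return NULL;
#endif
}
```

Under these assumptions, a build with a minimum ISA level of 3 resolves to the avx2 file. Before the patch, the `>= 4` guard excluded such a build from this block entirely. Compiling the sketch with `-DMINIMUM_X86_ISA_LEVEL=4` selects the evex file instead.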