author | Noah Goldstein <goldstein.w.n@gmail.com> | 2022-11-08 17:38:39 -0800 |
---|---|---|
committer | Noah Goldstein <goldstein.w.n@gmail.com> | 2022-11-08 19:22:33 -0800 |
commit | 642933158e7cf072d873231b1a9bb03291f2b989 (patch) | |
tree | 352c3956cef706e683d0ac26ef85d165d1adcceb /sysdeps/x86_64/multiarch/stpncpy-avx2.S | |
parent | f049f52dfeed8129c11ab1641a815705d09ff7e8 (diff) | |
x86: Optimize and shrink st{r|p}{n}{cat|cpy}-avx2 functions
Optimizations are:
1. Use more overlapping stores to avoid branches.
2. Reduce how unrolled the aligning copies are (this is more of a
code-size saving; it's a negative for some sizes in terms of
perf).
3. For st{r|p}n{cat|cpy} re-order the branches to minimize the
number that are taken.
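The overlapping-store idea in item 1 can be illustrated in scalar C. This is a minimal sketch, not code from the patch: the helper name `copy_8_to_16` and the 8 to 16 byte range are assumptions chosen for illustration, whereas the actual implementation works on AVX2 vector registers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): copy n bytes, 8 <= n <= 16,
   using two 8-byte stores that may overlap, instead of branching on the
   exact length.  */
static void
copy_8_to_16 (unsigned char *dst, const unsigned char *src, size_t n)
{
  uint64_t head, tail;

  /* Load both ends first so neither store can affect a later load.  */
  memcpy (&head, src, 8);
  memcpy (&tail, src + n - 8, 8);

  /* The first store covers bytes [0, 8), the second covers [n - 8, n).
     For n < 16 the two ranges overlap, which is harmless because the
     overlapping bytes carry identical values.  */
  memcpy (dst, &head, 8);
  memcpy (dst + n - 8, &tail, 8);
}
```

A single code path then serves every length in the range, which is the trade the commit message describes: a few redundant bytes written in exchange for fewer branches.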
Performance Changes:
Times are from N = 10 runs of the benchmark suite and are
reported as geometric mean of all ratios of
New Implementation / Old Implementation.
strcat-avx2 -> 0.998
strcpy-avx2 -> 0.937
stpcpy-avx2 -> 0.971
strncpy-avx2 -> 0.793
stpncpy-avx2 -> 0.775
strncat-avx2 -> 0.962
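As a reading aid, the summary statistic above can be written out as follows. The symbols are introduced here and are not taken from the commit: r_i is assumed to be the ratio of new to old time for benchmark configuration i, and k is the number of configurations.

```latex
G = \left( \prod_{i=1}^{k} r_i \right)^{1/k},
\qquad
r_i = \frac{t_i^{\mathrm{new}}}{t_i^{\mathrm{old}}}
```

Under that reading, a value below 1 means the new implementation is faster on average. For example, 0.793 for strncpy-avx2 corresponds to roughly 21% less time.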
Code Size Changes:
function -> Bytes New / Bytes Old -> Ratio
strcat-avx2 -> 685 / 1639 -> 0.418
strcpy-avx2 -> 560 / 903 -> 0.620
stpcpy-avx2 -> 592 / 939 -> 0.630
strncpy-avx2 -> 1176 / 2390 -> 0.492
stpncpy-avx2 -> 1268 / 2438 -> 0.520
strncat-avx2 -> 1042 / 2563 -> 0.407
Notes:
1. Because of the significant difference between the
implementations they are split into three files.
strcpy-avx2.S -> strcpy, stpcpy, strcat
strncpy-avx2.S -> strncpy
strncat-avx2.S -> strncat
I couldn't find a way to merge them without making the
ifdefs incredibly difficult to follow.
Full check passes on x86-64 and build succeeds for all ISA levels w/
and w/o multiarch.
Diffstat (limited to 'sysdeps/x86_64/multiarch/stpncpy-avx2.S')
-rw-r--r-- | sysdeps/x86_64/multiarch/stpncpy-avx2.S | 5 |
1 files changed, 2 insertions, 3 deletions
diff --git a/sysdeps/x86_64/multiarch/stpncpy-avx2.S b/sysdeps/x86_64/multiarch/stpncpy-avx2.S
index b2f8c19143..a46a8edbe2 100644
--- a/sysdeps/x86_64/multiarch/stpncpy-avx2.S
+++ b/sysdeps/x86_64/multiarch/stpncpy-avx2.S
@@ -3,6 +3,5 @@
 #endif
 
 #define USE_AS_STPCPY
-#define USE_AS_STRNCPY
-#define STRCPY STPNCPY
-#include "strcpy-avx2.S"
+#define STRNCPY STPNCPY
+#include "strncpy-avx2.S"
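
For context on what this wrapper builds: it keeps USE_AS_STPCPY defined and now includes strncpy-avx2.S, which presumably selects the stpncpy-style return value inside the shared template (an assumption, since the template is not shown in this diff). The contract the resulting function has to meet can be sketched in portable C. The name `ref_stpncpy` is hypothetical and this is a plain reference loop, not the AVX2 code.

```c
#include <stddef.h>

/* Hypothetical reference for the stpncpy contract (not the AVX2 code):
   copy at most n bytes from src, zero-fill the rest of dst up to n, and
   return a pointer to the first null written, or dst + n if src did not
   fit.  */
static char *
ref_stpncpy (char *dst, const char *src, size_t n)
{
  size_t i = 0;

  /* Copy until n bytes are written or the terminator of src is reached.  */
  while (i < n && src[i] != '\0')
    {
      dst[i] = src[i];
      i++;
    }

  /* This is the stpncpy return value; strncpy would return dst instead.  */
  char *end = dst + i;

  /* Pad the remainder of the n-byte window with null bytes.  */
  for (; i < n; i++)
    dst[i] = '\0';

  return end;
}
```

The only difference from strncpy is the return value, which is consistent with the two functions sharing one source file behind a macro.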