author: Noah Goldstein <goldstein.w.n@gmail.com> 2022-11-08 17:38:39 -0800
committer: Noah Goldstein <goldstein.w.n@gmail.com> 2022-11-08 19:22:33 -0800
commit: 642933158e7cf072d873231b1a9bb03291f2b989
tree: 352c3956cef706e683d0ac26ef85d165d1adcceb /sysdeps/x86_64/multiarch/stpncpy-avx2.S
parent: f049f52dfeed8129c11ab1641a815705d09ff7e8
x86: Optimize and shrink st{r|p}{n}{cat|cpy}-avx2 functions

Optimizations are:
    1. Use more overlapping stores to avoid branches.
    2. Reduce how unrolled the aligning copies are (this is more of a
       code-size save, it's a negative for some sizes in terms of
       perf).
    3. For st{r|p}n{cat|cpy} re-order the branches to minimize the
       number that are taken.

Performance Changes:

    Times are from N = 10 runs of the benchmark suite and are
    reported as geometric mean of all ratios of
    New Implementation / Old Implementation.

    strcat-avx2      -> 0.998
    strcpy-avx2      -> 0.937
    stpcpy-avx2      -> 0.971

    strncpy-avx2     -> 0.793
    stpncpy-avx2     -> 0.775

    strncat-avx2     -> 0.962

Code Size Changes:
    function         -> Bytes New / Bytes Old -> Ratio

    strcat-avx2      ->  685 / 1639 -> 0.418
    strcpy-avx2      ->  560 /  903 -> 0.620
    stpcpy-avx2      ->  592 /  939 -> 0.630

    strncpy-avx2     -> 1176 / 2390 -> 0.492
    stpncpy-avx2     -> 1268 / 2438 -> 0.520

    strncat-avx2     -> 1042 / 2563 -> 0.407

Notes:
    1. Because of the significant difference between the
       implementations they are split into three files.

           strcpy-avx2.S    -> strcpy, stpcpy, strcat
           strncpy-avx2.S   -> strncpy
           strncat-avx2.S   -> strncat

       I couldn't find a way to merge them without making the ifdefs
       incredibly difficult to follow.

Full check passes on x86-64 and build succeeds for all ISA levels w/
and w/o multiarch.
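
For readers unfamiliar with the "overlapping stores" idea in optimization 1, the following is a minimal C sketch of the general technique, not code taken from this commit. The helper name `copy_8_to_16` and the 8-to-16-byte size class are assumptions chosen for illustration; the actual AVX2 code works on vector registers and its size classes may differ. The point is that any length in the range is covered by one load/store pair anchored at the start and one anchored at the end, so no branch on the exact length is needed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): copy n bytes, where
   8 <= n <= 16, without branching on the exact value of n.
   Both loads happen before either store, so the result is still
   correct if src and dst overlap.  */
static void
copy_8_to_16 (char *dst, const char *src, size_t n)
{
  uint64_t head, tail;

  /* memcpy with a constant size is the portable way to express an
     unaligned 8-byte load/store; compilers typically emit one move.  */
  memcpy (&head, src, 8);          /* first 8 bytes */
  memcpy (&tail, src + n - 8, 8);  /* last 8 bytes, may overlap head */

  memcpy (dst, &head, 8);
  memcpy (dst + n - 8, &tail, 8);  /* overlapping bytes are rewritten
                                      with identical values */
}
```

When n is 16 the two stores are adjacent; for smaller n they overlap by 16 - n bytes. The cost is a few redundant byte writes in exchange for removing length-dependent branches.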
Diffstat (limited to 'sysdeps/x86_64/multiarch/stpncpy-avx2.S')
-rw-r--r-- sysdeps/x86_64/multiarch/stpncpy-avx2.S | 5
1 files changed, 2 insertions, 3 deletions
diff --git a/sysdeps/x86_64/multiarch/stpncpy-avx2.S b/sysdeps/x86_64/multiarch/stpncpy-avx2.S
index b2f8c19143..a46a8edbe2 100644
--- a/sysdeps/x86_64/multiarch/stpncpy-avx2.S
+++ b/sysdeps/x86_64/multiarch/stpncpy-avx2.S
@@ -3,6 +3,5 @@
 #endif
 #define USE_AS_STPCPY
-#define USE_AS_STRNCPY
-#define STRCPY STPNCPY
-#include "strcpy-avx2.S"
+#define STRNCPY STPNCPY
+#include "strncpy-avx2.S"
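
The diff above makes stpncpy a thin wrapper: it sets USE_AS_STPCPY, renames the entry point, and includes the shared strncpy body. As a rough C illustration of why one body can serve both functions, the sketch below uses a single reference loop whose only variation is the return value. The names `copy_n_ref`, `my_strncpy`, and `my_stpncpy` are hypothetical, and the run-time flag stands in for what the assembly selects at build time with a macro; this is a behavioural model under those assumptions, not the glibc implementation.

```c
#include <stddef.h>

/* Hypothetical shared body: copy at most n bytes of src into dst and
   zero-fill the remainder, as strncpy and stpncpy both require.
   return_end selects which pointer the caller gets back.  */
static char *
copy_n_ref (char *dst, const char *src, size_t n, int return_end)
{
  size_t i = 0;

  /* Copy bytes up to, but not including, the terminating NUL.  */
  for (; i < n && src[i] != '\0'; i++)
    dst[i] = src[i];

  /* stpncpy returns the first NUL written, or dst + n if none fit.  */
  char *end = dst + i;

  /* Pad the rest of the destination with NUL bytes.  */
  for (; i < n; i++)
    dst[i] = '\0';

  return return_end ? end : dst;
}

/* strncpy flavour: returns the start of the destination.  */
static char *
my_strncpy (char *dst, const char *src, size_t n)
{
  return copy_n_ref (dst, src, n, 0);
}

/* stpncpy flavour: same copy, different return value.  */
static char *
my_stpncpy (char *dst, const char *src, size_t n)
{
  return copy_n_ref (dst, src, n, 1);
}
```

Copying "abc" into an 8-byte buffer yields the same buffer contents from both functions; `my_strncpy` returns the buffer start and `my_stpncpy` returns the buffer start plus 3.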