author | Noah Goldstein <goldstein.w.n@gmail.com> | 2022-10-18 17:44:05 -0700 |
---|---|---|
committer | Noah Goldstein <goldstein.w.n@gmail.com> | 2022-10-19 17:31:03 -0700 |
commit | b79f8ff26aa6151d2d2167afcddcd1ec46cfbc81 (patch) | |
tree | 3d956430a93cb6412a7a0a55a67cb4d82ccae86d /sysdeps/unix/sysv/linux/pause.c | |
parent | 69717709ec5c2769322678e96a7672d1e270de3a (diff) | |
x86: Optimize strnlen-evex.S and implement with VMM headers
Optimizations are:
1. Use the fact that bsf(0) leaves the destination unchanged to save a
   branch in the short string case.
2. Restructure code so that small strings are given the hot path.
	- This is a net-zero on the benchmark suite but in general makes
	  sense as smaller sizes are far more common.
3. Use more code-size efficient instructions.
	- tzcnt ...     -> bsf ...
	- vpcmpb $0 ... -> vpcmpeq ...
4. Align labels less aggressively, especially if it doesn't save fetch
   blocks / causes the basic-block to span extra cache-lines.
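As a rough illustration of point 1, the following C sketch models the idea in scalar code. It is not the glibc implementation: `bsf_keep`, `zero_mask` and `short_strnlen` are hypothetical names, the 32-byte limit is an assumption standing in for one vector width, and `__builtin_ctz` is a GCC/Clang builtin. The destination is preloaded with `maxlen`, so when the compare mask is zero (no NUL found) the "unchanged" result is already the right answer and no separate branch on the mask is needed in the assembly. The ternary below only models that hardware behaviour.

```c
#include <stddef.h>
#include <stdint.h>

/* Models the bsf behaviour the commit relies on: a zero source leaves
   the destination as it was; otherwise the destination receives the
   index of the lowest set bit.  Hypothetical helper, not glibc code.  */
static unsigned bsf_keep(unsigned dst, uint32_t src)
{
    return src ? (unsigned)__builtin_ctz(src) : dst;
}

/* Scalar stand-in for a vector compare against zero.  Bit i is set when
   block[i] is the first NUL within the first n bytes (n capped at 32).
   It stops at the first NUL so it never reads past the terminator; only
   the lowest set bit matters to bsf anyway.  */
static uint32_t zero_mask(const char *block, size_t n)
{
    for (size_t i = 0; i < n && i < 32; i++)
        if (block[i] == '\0')
            return (uint32_t)1 << i;
    return 0;
}

/* strnlen for maxlen <= 32: preload the result with maxlen, then let
   the bsf model overwrite it only if a NUL was found.  */
static size_t short_strnlen(const char *s, size_t maxlen)
{
    uint32_t mask = zero_mask(s, maxlen);
    return bsf_keep((unsigned)maxlen, mask);
}
```

With this model, `short_strnlen("abcdefgh", 4)` yields the preloaded `maxlen` because the mask is zero, while a string with a NUL inside the window yields the NUL's index.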
The optimizations (especially for point 2) make the strnlen and
strlen code essentially incompatible so split strnlen-evex
to a new file.
Code Size Changes:
strlen-evex.S : -23 bytes
strnlen-evex.S : -167 bytes
Net perf changes:
Reported as geometric mean of all improvements / regressions from N=10
runs of the benchtests. Value is New Time / Old Time, so < 1.0 is an
improvement and > 1.0 is a regression.
strlen-evex.S : 0.992 (No real change)
strnlen-evex.S : 0.947
Full results attached in email.
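For readers unfamiliar with the metric, here is a minimal C sketch of a geometric mean of New Time / Old Time ratios. It is not the script used to produce the numbers above; `nth_root` and `geomean_ratio` are hypothetical names, and the n-th root is taken with Newton's method only to keep the example free of libm (a typical implementation would more likely average logarithms).

```c
#include <stddef.h>

/* n-th root of p (p > 0, n >= 1) by Newton's method on x^n - p.  */
static double nth_root(double p, size_t n)
{
    double x = 1.0;
    for (int iter = 0; iter < 200; iter++) {
        double xpow = 1.0;              /* x^(n-1) */
        for (size_t k = 1; k < n; k++)
            xpow *= x;
        x = ((double)(n - 1) * x + p / xpow) / (double)n;
    }
    return x;
}

/* Geometric mean of new_t[i] / old_t[i] over n benchmark cases.
   A result below 1.0 means the new code is faster on balance;
   above 1.0 means it is slower.  Returns 1.0 for an empty input.  */
static double geomean_ratio(const double *new_t, const double *old_t,
                            size_t n)
{
    if (n == 0)
        return 1.0;
    double product = 1.0;
    for (size_t i = 0; i < n; i++)
        product *= new_t[i] / old_t[i];
    return nth_root(product, n);
}
```

Unlike an arithmetic mean, the geometric mean treats a 2x speedup and a 2x slowdown as cancelling out, which is why it is a common choice for summarizing ratios.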
Full check passes on x86-64.