author | Adhemerval Zanella <adhemerval.zanella@linaro.org> | 2024-02-08 10:08:39 -0300 |
---|---|---|
committer | H.J. Lu <hjl.tools@gmail.com> | 2024-02-13 08:49:13 -0800 |
commit | 272708884cb750f12f5c74a00e6620c19dc6d567 (patch) | |
tree | f6c6fdc807ba59154f501111a69c4c6ef812e9d9 /sysdeps | |
parent | 0c0d39fe4aeb0f69b26e76337c5dfd5530d5d44e (diff) | |
x86: Do not prefer ERMS for memset on Zen3+
For AMD Zen3+ architecture, the performance of the vectorized loop is
slightly better than ERMS.
Checked on x86_64-linux-gnu on Zen3.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Diffstat (limited to 'sysdeps')
-rw-r--r-- | sysdeps/x86/dl-cacheinfo.h | 5 |
1 files changed, 5 insertions, 0 deletions
diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
index f34d12846c..5a98f70364 100644
--- a/sysdeps/x86/dl-cacheinfo.h
+++ b/sysdeps/x86/dl-cacheinfo.h
@@ -1021,6 +1021,11 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
      minimum value is fixed.  */
   rep_stosb_threshold = TUNABLE_GET (x86_rep_stosb_threshold,
 				     long int, NULL);
+  if (cpu_features->basic.kind == arch_kind_amd
+      && !TUNABLE_IS_INITIALIZED (x86_rep_stosb_threshold))
+    /* For AMD Zen3+ architecture, the performance of the vectorized loop is
+       slightly better than ERMS.  */
+    rep_stosb_threshold = SIZE_MAX;
 
   TUNABLE_SET_WITH_BOUNDS (x86_data_cache_size, data, 0, SIZE_MAX);
   TUNABLE_SET_WITH_BOUNDS (x86_shared_cache_size, shared, 0, SIZE_MAX);
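The selection rule added by this patch can be modelled outside glibc as a small standalone sketch. Everything named here (`cpu_vendor`, `struct tunable`, `pick_rep_stosb_threshold`, and the example values 2048 and 4096) is hypothetical and only stands in for `cpu_features->basic.kind`, `TUNABLE_GET` and `TUNABLE_IS_INITIALIZED`; it is not glibc code. As in the hunk, the check is on the CPU vendor and on whether the user set the tunable explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for glibc's cpu_features->basic.kind.  */
enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_OTHER };

/* Hypothetical model of a tunable: its current value and whether the
   user set it explicitly (the role of TUNABLE_IS_INITIALIZED).  */
struct tunable
{
  size_t value;
  bool user_set;
};

/* Return the memset size threshold for switching to `rep stosb`.
   SIZE_MAX means the threshold is never reached, so the vectorized
   loop is always used.  */
size_t
pick_rep_stosb_threshold (enum cpu_vendor vendor, struct tunable t)
{
  size_t threshold = t.value;

  /* Mirror of the patched condition: override only on AMD and only
     when the user did not choose a value.  */
  if (vendor == VENDOR_AMD && !t.user_set)
    threshold = SIZE_MAX;

  return threshold;
}

/* Usage sketch: the three cases the condition distinguishes.  */
void
rep_stosb_threshold_examples (void)
{
  struct tunable unset = { 2048, false };	/* assumed default value */
  struct tunable chosen = { 4096, true };	/* user-supplied value */

  assert (pick_rep_stosb_threshold (VENDOR_AMD, unset) == SIZE_MAX);
  assert (pick_rep_stosb_threshold (VENDOR_AMD, chosen) == 4096);
  assert (pick_rep_stosb_threshold (VENDOR_INTEL, unset) == 2048);
}
```

In this model an explicit user setting always wins, which matches the `!TUNABLE_IS_INITIALIZED` guard in the hunk: the override applies only when no value was supplied for `x86_rep_stosb_threshold`.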