author: H.J. Lu <hjl.tools@gmail.com> 2023-08-17 09:42:29 -0700
committer: H.J. Lu <hjl.tools@gmail.com> 2023-08-21 10:44:26 -0700
commit: a8ecb126d4c26c52f4ad828c566afe4043a28155
tree: bfab51ea467c0717fc322be816770598edfcca63 /sysdeps/ieee754
parent: ce99601fa883a8916cb902c7bcd2125046a4a39d
x86_64: Add log1p with FMA
On Skylake, it changes log1p bench performance by:

        Before     After      Improvement
max     63.349     58.347       8%
min     4.448      5.651      -30%
mean    12.0674    10.336      14%

The minimum code path is

  if (hx < 0x3FDA827A) /* x < 0.41422 */
    {
      if (__glibc_unlikely (ax >= 0x3ff00000)) /* x <= -1.0 */
        {
          ...
        }
      if (__glibc_unlikely (ax < 0x3e200000)) /* |x| < 2**-29 */
        {
          math_force_eval (two54 + x); /* raise inexact */
          if (ax < 0x3c900000) /* |x| < 2**-54 */
            {
              ...
            }
          else
            return x - x * x * 0.5;

FMA and non-FMA code sequences look similar.  Non-FMA version is slightly
faster.

Since log1p is called by asinh and atanh, it improves asinh performance by:

        Before     After      Improvement
max     75.645     63.135      16%
min     10.074     10.071       0%
mean    15.9483    14.9089      6%

and improves atanh performance by:

        Before     After      Improvement
max     91.768     75.081      18%
min     15.548     13.883      10%
mean    18.3713    16.8011      8%
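
As a rough illustration of the thresholds quoted in the commit message, the standalone C sketch below decodes the high 32 bits of a double and reports which of the quoted branches would be selected. The enum labels and the helpers `high_word`, `classify_log1p_path`, and `log1p_quadratic` are hypothetical names that do not appear in glibc. The branch bodies elided with `...` in the message are deliberately not reproduced.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical labels for the branches quoted in the commit message.
   These are not glibc identifiers.  */
enum log1p_path
{
  PATH_X_LE_MINUS_ONE,  /* labeled "x <= -1.0" in the message */
  PATH_BELOW_2M54,      /* |x| < 2**-54 */
  PATH_QUADRATIC,       /* 2**-54 <= |x| < 2**-29: x - x * x * 0.5 */
  PATH_GENERAL          /* any input not covered by the quoted excerpt */
};

/* Return the upper 32 bits of the IEEE 754 binary64 encoding of x,
   reinterpreted as a signed integer (assumes two's complement, as on
   GCC and Clang).  */
static int32_t
high_word (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return (int32_t) (uint32_t) (bits >> 32);
}

/* Apply the same comparisons as the quoted excerpt, in the same order.  */
static enum log1p_path
classify_log1p_path (double x)
{
  int32_t hx = high_word (x);
  int32_t ax = hx & 0x7fffffff;   /* high word of |x| */

  if (hx < 0x3FDA827A)            /* x < 0.41422 */
    {
      if (ax >= 0x3ff00000)
        return PATH_X_LE_MINUS_ONE;
      if (ax < 0x3e200000)        /* |x| < 2**-29 */
        return ax < 0x3c900000 ? PATH_BELOW_2M54 : PATH_QUADRATIC;
    }
  return PATH_GENERAL;
}

/* The expression returned on the quadratic path in the excerpt.  */
static double
log1p_quadratic (double x)
{
  return x - x * x * 0.5;
}
```

For example, `classify_log1p_path (1e-10)` selects the quadratic path, since 1e-10 lies between 2**-54 and 2**-29. Whether a compiler contracts `x - x * x * 0.5` into a fused multiply-add depends on the target and flags, which this sketch does not control.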
Diffstat (limited to 'sysdeps/ieee754')
-rw-r--r-- sysdeps/ieee754/dbl-64/s_log1p.c | 5
1 files changed, 5 insertions, 0 deletions
diff --git a/sysdeps/ieee754/dbl-64/s_log1p.c b/sysdeps/ieee754/dbl-64/s_log1p.c
index e6476a8260..eeb0af859f 100644
--- a/sysdeps/ieee754/dbl-64/s_log1p.c
+++ b/sysdeps/ieee754/dbl-64/s_log1p.c
@@ -99,6 +99,11 @@ static const double
 static const double zero = 0.0;
+#ifndef SECTION
+# define SECTION
+#endif
+
+SECTION
 double
 __log1p (double x)
 {
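
The `SECTION` hook added by this hunk expands to nothing unless a file that includes the generic source defines it first. The self-contained sketch below shows that pattern under some assumptions: the section name `.text.fma`, the function `demo_half`, and the single-file layout are illustrative only, and the `section` attribute syntax assumes GCC or Clang targeting ELF. It is not the wrapper file from the glibc tree.

```c
/* Wrapper side (hypothetical): pick a section before the generic code
   is seen.  The name ".text.fma" is an assumption for illustration.  */
#define SECTION __attribute__ ((section (".text.fma")))

/* Generic side: the same hook as in the patch.  If no wrapper defined
   SECTION, it expands to nothing and the function lands in the default
   text section.  */
#ifndef SECTION
# define SECTION
#endif

SECTION
double
demo_half (double x)
{
  /* Placeholder body; stands in for the real function.  */
  return x * 0.5;
}
```

With this arrangement the generic file builds unchanged on its own, while a variant build can group its copy of the function into a separate section by defining `SECTION` before the include.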