author | H.J. Lu <hjl.tools@gmail.com> | 2023-08-17 09:42:29 -0700 |
---|---|---|
committer | H.J. Lu <hjl.tools@gmail.com> | 2023-08-21 10:44:26 -0700 |
commit | a8ecb126d4c26c52f4ad828c566afe4043a28155 (patch) | |
tree | bfab51ea467c0717fc322be816770598edfcca63 /sysdeps/pthread/tst-cond14.c | |
parent | ce99601fa883a8916cb902c7bcd2125046a4a39d (diff) | |
x86_64: Add log1p with FMA
On Skylake, it changes log1p bench performance by:

            Before     After      Improvement
    max     63.349     58.347       8%
    min     4.448      5.651      -30%
    mean    12.0674    10.336      14%

The minimum code path is:

    if (hx < 0x3FDA827A)                          /* x < 0.41422  */
      {
        if (__glibc_unlikely (ax >= 0x3ff00000))  /* x <= -1.0 */
          {
            ...
          }
        if (__glibc_unlikely (ax < 0x3e200000))   /* |x| < 2**-29 */
          {
            math_force_eval (two54 + x);          /* raise inexact */
            if (ax < 0x3c900000)                  /* |x| < 2**-54 */
              {
                ...
              }
            else
              return x - x * x * 0.5;

The FMA and non-FMA code sequences for this path look similar, and the
non-FMA version is slightly faster. Since log1p is called by asinh and
atanh, the change also improves asinh performance by:

            Before     After      Improvement
    max     75.645     63.135      16%
    min     10.074     10.071       0%
    mean    15.9483    14.9089      6%

and improves atanh performance by:

            Before     After      Improvement
    max     91.768     75.081      18%
    min     15.548     13.883      10%
    mean    18.3713    16.8011      8%

Diffstat (limited to 'sysdeps/pthread/tst-cond14.c')
0 files changed, 0 insertions, 0 deletions