author | Wilco Dijkstra <wdijkstr@arm.com> | 2018-04-03 16:24:29 +0100 |
---|---|---|
committer | Wilco Dijkstra <wdijkstr@arm.com> | 2018-04-03 16:52:16 +0100 |
commit | 19a8b9a300f2f1f0012aff0f2b70b09430f50d9e (patch) | |
tree | 95242c2116141cead4b7b6a6f3a74d607aba08a6 /sysdeps/x86_64/fpu | |
parent | f72aa11d7e3008d608e1092abade16101fed8f35 (diff) | |
[PATCH 1/7] sin/cos slow paths: avoid slow paths for small inputs
This series of patches removes the slow paths from sin, cos and sincos.
Besides greatly simplifying the implementation, the new version is also much
faster for inputs up to PI (41% faster) and for large inputs needing range
reduction (27% faster).
ULP is ~0.55 with no errors found after testing 1.6 billion inputs across most
of the range with mpsin and mpcos. The number of incorrectly rounded results
(i.e. ULP > 0.5) is at most ~2750 per million inputs between 0.125 and 0.5;
the average is ~850 per million between 0 and PI.
Tested on AArch64 and x86_64 with no regressions.
The first patch removes the slow paths for the cases where the input is small
and doesn't require range reduction. Update ULP tables for sin, cos and sincos
on AArch64 and x86_64.
* sysdeps/aarch64/libm-test-ulps: Update ULP for sin, cos, sincos.
* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Remove slow paths for small
inputs.
(__cos): Likewise.
* sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sin, cos, sincos.
Diffstat (limited to 'sysdeps/x86_64/fpu')
-rw-r--r-- | sysdeps/x86_64/fpu/libm-test-ulps | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
index 48e53f7ef2..bbb8a4d075 100644
--- a/sysdeps/x86_64/fpu/libm-test-ulps
+++ b/sysdeps/x86_64/fpu/libm-test-ulps
@@ -1262,7 +1262,9 @@ ildouble: 1
 ldouble: 1
 
 Function: "cos":
+double: 1
 float128: 1
+idouble: 1
 ifloat128: 1
 ildouble: 1
 ldouble: 1
@@ -2528,7 +2530,9 @@ Function: "pow_vlen8_avx2":
 float: 3
 
 Function: "sin":
+double: 1
 float128: 1
+idouble: 1
 ifloat128: 1
 ildouble: 1
 ldouble: 1
@@ -2578,7 +2582,9 @@ Function: "sin_vlen8_avx2":
 float: 1
 
 Function: "sincos":
+double: 1
 float128: 1
+idouble: 1
 ifloat128: 1
 ildouble: 1
 ldouble: 1