Age | Commit message (Collapse) | Author |
|
|
|
|
|
|
|
This patch fix the missing symbol __fe_nomask_env from commit
41e8926aa4b7f17bc95984737ee82a254ad0911c for GLIBC_2.1.
|
|
This patch helps some math functions performance by adding the libc_fexxx
variant of inline functions to handle both FPU round and exception set/restore
and by using them on the libc_fexxx_ctx functions. It is based on already coded
fexxx family functions for PPC with fpu.
Here is the summary of performance improvements due this patch (measured on a
POWER7 machine):
Before:
cos(): ITERS:9.5895e+07: TOTAL:5116.03Mcy, MAX:77.6cy, MIN:49.792cy, 18744 calls/Mcy
exp(): ITERS:2.827e+07: TOTAL:5187.15Mcy, MAX:494.018cy, MIN:38.422cy, 5450.01 calls/Mcy
pow(): ITERS:6.1705e+07: TOTAL:5144.26Mcy, MAX:171.95cy, MIN:29.935cy, 11994.9 calls/Mcy
sin(): ITERS:8.6898e+07: TOTAL:5117.06Mcy, MAX:83.841cy, MIN:46.582cy, 16982 calls/Mcy
tan(): ITERS:2.9473e+07: TOTAL:5115.39Mcy, MAX:191.017cy, MIN:172.352cy, 5761.63 calls/Mcy
After:
cos(): ITERS:2.05265e+08: TOTAL:5111.37Mcy, MAX:78.754cy, MIN:24.196cy, 40158.5 calls/Mcy
exp(): ITERS:3.341e+07: TOTAL:5170.84Mcy, MAX:476.317cy, MIN:15.574cy, 6461.23 calls/Mcy
pow(): ITERS:7.6153e+07: TOTAL:5129.1Mcy, MAX:147.5cy, MIN:30.916cy, 14847.2 calls/Mcy
sin(): ITERS:1.58816e+08: TOTAL:5115.11Mcy, MAX:1490.39cy, MIN:22.341cy, 31048.4 calls/Mcy
tan(): ITERS:3.4964e+07: TOTAL:5114.18Mcy, MAX:177.422cy, MIN:146.115cy, 6836.68 calls/Mcy
|
|
This patch does not export __fe_mask_env anymore, only providing a
compatibility symbol. It fixes BZ#14143.
|
|
http://sourceware.org/ml/libc-alpha/2013-07/msg00202.html
Another little-endian fix.
* sysdeps/powerpc/fpu_control.h (_FPU_GETCW): Rewrite using
64-bit int/double union.
(_FPU_SETCW): Likewise.
* sysdeps/powerpc/fpu/tst-setcontext-fpscr.c (_GET_DI_FPSCR): Likewise.
(_SET_DI_FPSCR, _GET_SI_FPSCR, _SET_SI_FPSCR): Likewise.
|
|
http://sourceware.org/ml/libc-alpha/2013-07/msg00201.html
These two functions oddly test x+1>0 when a double x is >= 0.0, and
similarly when x is negative. I don't see the point of that since the
test should always be true. I also don't see any need to convert x+1
to integer rather than simply using xr+1. Note that the standard
allows these functions to return any value when the input is outside
the range of long long, but it's not too hard to prevent xr+1
overflowing so that's what I've done.
(With rounding mode FE_UPWARD, x+1 can be a lot more than what you
might naively expect, but perhaps that situation was covered by the
x - xrf < 1.0 test.)
* sysdeps/powerpc/fpu/s_llround.c (__llround): Rewrite.
* sysdeps/powerpc/fpu/s_llroundf.c (__llroundf): Rewrite.
|
|
http://sourceware.org/ml/libc-alpha/2013-07/msg00200.html
This works around the fact that vsx is disabled in current
little-endian gcc. Also, float constants take 4 bytes in memory
vs. 16 bytes for vector constants, and we don't need to write one lot
of masks for double (register format) and another for float (mem
format).
* sysdeps/powerpc/fpu/s_float_bitwise.h (__float_and_test28): Don't
use vector int constants.
(__float_and_test24, __float_and8, __float_get_exp): Likewise.
|
|
http://sourceware.org/ml/libc-alpha/2013-07/msg00199.html
Corrects floating-point environment code for little-endian.
* sysdeps/powerpc/fpu/fenv_libc.h (fenv_union_t): Replace int
array with long long.
* sysdeps/powerpc/fpu/e_sqrt.c (__slow_ieee754_sqrt): Adjust.
* sysdeps/powerpc/fpu/e_sqrtf.c (__slow_ieee754_sqrtf): Adjust.
* sysdeps/powerpc/fpu/fclrexcpt.c (__feclearexcept): Adjust.
* sysdeps/powerpc/fpu/fedisblxcpt.c (fedisableexcept): Adjust.
* sysdeps/powerpc/fpu/feenablxcpt.c (feenableexcept): Adjust.
* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Adjust.
* sysdeps/powerpc/fpu/feholdexcpt.c (feholdexcept): Adjust.
* sysdeps/powerpc/fpu/fesetenv.c (__fesetenv): Adjust.
* sysdeps/powerpc/fpu/feupdateenv.c (__feupdateenv): Adjust.
* sysdeps/powerpc/fpu/fgetexcptflg.c (__fegetexceptflag): Adjust.
* sysdeps/powerpc/fpu/fraiseexcpt.c (__feraiseexcept): Adjust.
* sysdeps/powerpc/fpu/fsetexcptflg.c (__fesetexceptflag): Adjust.
* sysdeps/powerpc/fpu/ftestexcept.c (fetestexcept): Adjust.
|
|
http://sourceware.org/ml/libc-alpha/2013-08/msg00083.html
Further replacement of ieee854 macros and unions. These files also
have some optimisations for comparison against 0.0L, infinity and nan.
Since the ABI specifies that the high double of an IBM long double
pair is the value rounded to double, a high double of 0.0 means the
low double must also be 0.0. The ABI also says that infinity and
nan are encoded in the high double, with the low double unspecified.
This means that tests for 0.0L, +/-Infinity and +/-NaN need only check
the high double.
* sysdeps/ieee754/ldbl-128ibm/e_atan2l.c (__ieee754_atan2l): Rewrite
all uses of ieee854 long double macros and unions. Simplify tests
for long doubles that are fully specified by the high double.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_ilogbl.c (__ieee754_ilogbl): Likewise.
Remove dead code too.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
(__ieee754_ynl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_log10l.c (__ieee754_log10l): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_logl.c (__ieee754_logl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise.
Remove dead code too.
* sysdeps/ieee754/ldbl-128ibm/k_tanl.c (__kernel_tanl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_frexpl.c (__frexpl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_isinf_nsl.c (__isinf_nsl): Likewise.
Simplify.
* sysdeps/ieee754/ldbl-128ibm/s_isinfl.c (___isinfl): Likewise.
Simplify.
* sysdeps/ieee754/ldbl-128ibm/s_log1pl.c (__log1pl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_modfl.c (__modfl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nextafterl.c (__nextafterl): Likewise.
Comment on variable precision.
* sysdeps/ieee754/ldbl-128ibm/s_nexttoward.c (__nexttoward): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c (__nexttowardf):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_scalblnl.c (__scalblnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (__scalbnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_tanhl.c (__tanhl): Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Adjust tan_towardzero ulps.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This patch fixes hypot/hypotf spurious floating-point exceptions
generate by internal operations.
|
|
|
|
|
|
|
|
|
|
|
|
This patch fixes a sqrtl ABI issue when building for powerpc64.
|
|
|
|
|
|
|
|
This patch removes redudant definition from PowerPC specific
math_ldbl, using the definitions from ieee754 math_ldbl.h.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
IBM long double fixes and POWER ulps update.
|
|
|
|
This patch fixes some sinf/cosf calculations that generated unexpected
underflows exceptions.
|
|
|
|
|
|
Adjustments for libm ulps added with commit d8b82cad1b525bdcbfff88d218c7c45032e4a3af,
495fd99f3a119e5c0c542ccc6cf9c93b1fb9e892, and 5ba3cc691c856e5c67a7d4cd4713f20a79f7ba81.
I also adjusted some exp10 ulps definition that was higher than needed.
|
|
|