aboutsummaryrefslogtreecommitdiff
path: root/math
AgeCommit message (Collapse)Author
2023-12-19x86: Do not raises floating-point exception traps on fesetexceptflag (BZ 30990)Bruno Haible
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set floating-point exception flags without raising a trap (unlike feraiseexcept, which is supposed to raise a trap if feenableexcept was called with the appropriate argument). The flags can be set in the 387 unit or in the SSE unit. When we need to clear a flag, we need to do so in both units, due to the way fetestexcept is implemented. When we need to set a flag, it is sufficient to do it in the SSE unit, because that is guaranteed to not trap. However, on i386 CPUs that have only a 387 unit, set the flags in the 387, as long as this cannot trap. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-12-19i686: Do not raise exception traps on fesetexcept (BZ 30989)Adhemerval Zanella
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set floating-point exception flags without raising a trap (unlike feraiseexcept, which is supposed to raise a trap if feenableexcept was called with the appropriate argument). The flags can be set in the 387 unit or in the SSE unit. To set a flag, it is sufficient to do it in the SSE unit, because that is guaranteed to not trap. However, on i386 CPUs that have only a 387 unit, set the flags in the 387, as long as this cannot trap. Checked on i686-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-12-19powerpc: Do not raise exception traps for fesetexcept/fesetexceptflag (BZ 30988)Adhemerval Zanella
According to ISO C23 (7.6.4.4), fesetexcept is supposed to set floating-point exception flags without raising a trap (unlike feraiseexcept, which is supposed to raise a trap if feenableexcept was called with the appropriate argument). This is a side-effect of how we implement the GNU extension feenableexcept, where feenableexcept/fesetenv/fesetmode/feupdateenv might issue prctl (PR_SET_FPEXC, PR_FP_EXC_PRECISE) depending of the argument. And on PR_FP_EXC_PRECISE, setting a floating-point exception flag triggers a trap. To make the both functions follow the C23, fesetexcept and fesetexceptflag now fail if the argument may trigger a trap. The math tests now check for an value different than 0, instead of bail out as unsupported for EXCEPTION_SET_FORCES_TRAP. Checked on powerpc64le-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-11-20aarch64: Add vector implementations of expm1 routinesJoe Ramsay
May discard sign of 0 - auto tests for -0 and -0x1p-10000 updated accordingly.
2023-11-10aarch64: Add vector implementations of log1p routinesJoe Ramsay
May discard sign of zero.
2023-10-23aarch64: Add vector implementations of tan routinesJoe Ramsay
This includes some utility headers for evaluating polynomials using various schemes.
2023-09-18math: Add a no-mathvec flag for sin (-0.0)Wilco Dijkstra
Add support for a no-mathvec flag to gen-auto-libm-tests.c. Update input test sin (-0.0) to be skipped in vector math libraries and regenerate testcases. Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2023-06-30Stop applying a GCC-specific workaround on clang [BZ #30550]Tulio Magno Quites Machado Filho
GCC was the only compiler affected by the issue with __builtin_isinf_sign and float128. Fix BZ #30550. Reported-by: Qiu Chaofan <qiucofan@cn.ibm.com> Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-06-02Fix all the remaining misspellings -- BZ 25337Paul Pluzhnikov
2023-04-03math: Remove the error handling wrapper from fmod and fmodfAdhemerval Zanella Netto
The error handling is moved to sysdeps/ieee754 version with no SVID support. The compatibility symbol versions still use the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). The ia64 is unchanged, since it still uses the arch specific __libm_error_region on its implementation. For both i686 and m68k, which provive arch specific implementation, wrappers are added so no new symbol are added (which would require to change the implementations). It shows an small improvement, the results for fmod: Architecture | Input | master | patch -----------------|-----------------|----------|-------- x86_64 (Ryzen 9) | subnormals | 12.5049 | 9.40992 x86_64 (Ryzen 9) | normal | 296.939 | 296.738 x86_64 (Ryzen 9) | close-exponents | 16.0244 | 13.119 aarch64 (N1) | subnormal | 6.81778 | 4.33313 aarch64 (N1) | normal | 155.620 | 152.915 aarch64 (N1) | close-exponents | 8.21306 | 5.76138 armhf (N1) | subnormal | 15.1083 | 14.5746 armhf (N1) | normal | 244.833 | 241.738 armhf (N1) | close-exponents | 21.8182 | 22.457 Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2023-04-03math: Improve fmodfAdhemerval Zanella Netto
This uses a new algorithm similar to already proposed earlier [1]. With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers), the simplest implementation is: mx * 2^ex == 2 * mx * 2^(ex - 1) while (ex > ey) { mx *= 2; --ex; mx %= my; } With mx/my being mantissa of double floating pointer, on each step the argument reduction can be improved 8 (which is sizeof of uint32_t minus MANTISSA_WIDTH plus the signal bit): while (ex > ey) { mx << 8; ex -= 8; mx %= my; } */ The implementation uses builtin clz and ctz, along with shifts to convert hx/hy back to doubles. Different than the original patch, this path assume modulo/divide operation is slow, so use multiplication with invert values. I see the following performance improvements using fmod benchtests (result only show the 'mean' result): Architecture | Input | master | patch -----------------|-----------------|----------|-------- x86_64 (Ryzen 9) | subnormals | 17.2549 | 12.0318 x86_64 (Ryzen 9) | normal | 85.4096 | 49.9641 x86_64 (Ryzen 9) | close-exponents | 19.1072 | 15.8224 aarch64 (N1) | subnormal | 10.2182 | 6.81778 aarch64 (N1) | normal | 60.0616 | 20.3667 aarch64 (N1) | close-exponents | 11.5256 | 8.39685 I also see similar improvements on arm-linux-gnueabihf when running on the N1 aarch64 chips, where it a lot of soft-fp implementation (for modulo, and multiplication): Architecture | Input | master | patch -----------------|-----------------|----------|-------- armhf (N1) | subnormal | 11.6662 | 10.8955 armhf (N1) | normal | 69.2759 | 34.1524 armhf (N1) | close-exponents | 13.6472 | 18.2131 Instead of using the math_private.h definitions, I used the math_config.h instead which is used on newer math implementations. Co-authored-by: kirill <kirill.okhotnikov@gmail.com> [1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2023-04-03math: Improve fmodAdhemerval Zanella Netto
This uses a new algorithm similar to already proposed earlier [1]. With x = mx * 2^ex and y = my * 2^ey (mx, my, ex, ey being integers), the simplest implementation is: mx * 2^ex == 2 * mx * 2^(ex - 1) while (ex > ey) { mx *= 2; --ex; mx %= my; } With mx/my being mantissa of double floating pointer, on each step the argument reduction can be improved 11 (which is sizeo of uint64_t minus MANTISSA_WIDTH plus the signal bit): while (ex > ey) { mx << 11; ex -= 11; mx %= my; } */ The implementation uses builtin clz and ctz, along with shifts to convert hx/hy back to doubles. Different than the original patch, this path assume modulo/divide operation is slow, so use multiplication with invert values. I see the following performance improvements using fmod benchtests (result only show the 'mean' result): Architecture | Input | master | patch -----------------|-----------------|----------|-------- x86_64 (Ryzen 9) | subnormals | 19.1584 | 12.5049 x86_64 (Ryzen 9) | normal | 1016.51 | 296.939 x86_64 (Ryzen 9) | close-exponents | 18.4428 | 16.0244 aarch64 (N1) | subnormal | 11.153 | 6.81778 aarch64 (N1) | normal | 528.649 | 155.62 aarch64 (N1) | close-exponents | 11.4517 | 8.21306 I also see similar improvements on arm-linux-gnueabihf when running on the N1 aarch64 chips, where it a lot of soft-fp implementation (for modulo, clz, ctz, and multiplication): Architecture | Input | master | patch -----------------|-----------------|----------|-------- armhf (N1) | subnormal | 15.908 | 15.1083 armhf (N1) | normal | 837.525 | 244.833 armhf (N1) | close-exponents | 16.2111 | 21.8182 Instead of using the math_private.h definitions, I used the math_config.h instead which is used on newer math implementations. Co-authored-by: kirill <kirill.okhotnikov@gmail.com> [1] https://sourceware.org/pipermail/libc-alpha/2020-November/119794.html Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2023-02-14update auto-libm-test-out-hypotPaul Zimmermann
This change was forgotten in commit cf7ffdd. Reviewed-by: Florian Weimer <fweimer@redhat.com>
2023-02-14added pair of inputs for hypotf in binary32Paul Zimmermann
This pair yields an error of 1 ulp in binary32, whereas the current maximal known error for hypotf on x86_64 is zero: Checking hypot with glibc-2.37 hypot 0 -1 -0x1.003222p-20,-0x1.6a2d58p-32 [0.501] 0.500001 0.500000001392678 libm gives 0x1.003224p-20 mpfr gives 0x1.003222p-20 See https://sourceware.org/pipermail/libc-alpha/2023-February/145432.html and https://sourceware.org/pipermail/libc-alpha/2023-February/145442.html Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-01-06Update copyright dates with scripts/update-copyrightsJoseph Myers
2023-01-06C2x semantics for <tgmath.h>Joseph Myers
<tgmath.h> implements semantics for integer generic arguments that handle cases involving _FloatN / _FloatNx types as specified in TS 18661-3 plus some defect fixes. C2x has further changes to the semantics for <tgmath.h> macros with such types, which should also be considered defect fixes (although handled through the integration of TS 18661-3 in C2x rather than through an issue tracking process). Specifically, the rules were changed because of problems raised with using the macros with the evaluation format types such as float_t and _Float32_t: the older version of the rules didn't allow passing _FloatN / _FloatNx types to the narrowing macros returning float or double, or passing float / double / long double to the narrowing macros returning _FloatN / _FloatNx, which was a problem with the evaluation format types which could be either kind of type depending on the value of FLT_EVAL_METHOD. Thus the new rules allow cases of mixing types which were not allowed before, and, as part of the changes, the handling of integer arguments was also changed: if there is any _FloatNx generic argument, integer generic arguments are treated as _Float32x (not double), while the rule about treating integer arguments to narrowing macros returning _FloatN or _FloatNx as _Float64 not double was removed (no longer needed now double is a valid argument to such macros). I've implemented the changes in GCC's __builtin_tgmath, which thus requires updates to glibc's test expectations so that the tests continue to build with GCC 13 (the test is also updated to test the argument types that weren't allowed before but are now valid under C2x rules). Given those test changes, it's then also necessary to fix the implementations in <tgmath.h> to have appropriate semantics with older GCC so that the tests pass with GCC versions before GCC 13 as well. For some cases (non-narrowing macros with two or three generic arguments; narrowing macros returning _Float32x), the older version of __builtin_tgmath doesn't correspond sufficiently well to C2x semantics, so in those cases <tgmath.h> is adjusted to use the older macro implementation instead of __builtin_tgmath. The older macro implementation is itself adjusted to give the desired semantics, with GCC 7 and later. (It's not possible to get the right semantics in all cases for the narrowing macros with GCC 6 and before when the _FloatN / _FloatNx names are typedefs rather than distinct types.) Tested as follows: with the full glibc testsuite for x86_64, GCC 6, 7, 11, 13; with execution of the math/tests for aarch64, arm, powerpc and powerpc64le, GCC 6, 7, 12 and 13 (powerpc64le only with GCC 12 and 13); with build-many-glibcs.py with GCC 6, 7, 12 and 13.
2022-11-01Disable use of -fsignaling-nans if compiler does not support itAdhemerval Zanella
Reviewed-by: Fangrui Song <maskray@google.com>
2022-10-31Fix build with GCC 13 _FloatN, _FloatNx built-in functionsJoseph Myers
GCC 13 has added more _FloatN and _FloatNx versions of existing <math.h> and <complex.h> built-in functions, for use in libstdc++-v3. This breaks the glibc build because of how those functions are defined as aliases to functions with the same ABI but different types. Add appropriate -fno-builtin-* options for compiling relevant files, as already done for the case of long double functions aliasing double ones and based on the list of files used there. I fixed some mistakes in that list of double files that I noticed while implementing this fix, but there may well be more such (harmless) cases, in this list or the new one (files that don't actually exist or don't define the named functions as aliases so don't need the options). I did try to exclude cases where glibc doesn't define certain functions for _FloatN or _FloatNx types at all from the new uses of -fno-builtin-* options. As with the options for double files (see the commit message for commit 49348beafe9ba150c9bd48595b3f372299bddbb0, "Fix build with GCC 10 when long double = double."), it's deliberate that the options are used even if GCC currently doesn't have a built-in version of a given functions, so providing some level of future-proofing against more such built-in functions being added in future. Tested with build-many-glibcs.py for aarch64-linux-gnu powerpc-linux-gnu powerpc64le-linux-gnu x86_64-linux-gnu (compilers and glibcs builds) with GCC mainline.
2022-09-30Fix iseqsig for _FloatN and _FloatNx in C++ with GCC 13Joseph Myers
With GCC 13, _FloatN and _FloatNx types, when they exist, are distinct types like they are in C with GCC 7 and later, rather than typedefs for types such as float, double or long double. This breaks the templated iseqsig implementation for C++ in <math.h>, when used with types that were previously implemented as aliases. Add the necessary definitions for _Float32, _Float64, _Float128 (when the same format as long double), _Float32x and _Float64x in this case, so that iseqsig can be used with such types in C++ with GCC 13 as it could with previous GCC versions. Also add tests for calling iseqsig in C++ with arguments of such types (more minimal than existing tests, so that they can work with older GCC versions and without relying on any C++ library support for the types or on hardcoding details of their formats). The LDBL_MANT_DIG != 106 conditionals on some tests are because the type-generic comparison macros have undefined behavior when neither argument has a type whose set of values is a subset of those for the type of the other argument, which applies when one argument is IBM long double and the other is an IEEE format wider than binary64. Tested with build-many-glibcs.py glibcs build for aarch64-linux-gnu i686-linux-gnu mips-linux-gnu mips64-linux-gnu-n32 powerpc-linux-gnu powerpc64le-linux-gnu x86_64-linux-gnu.
2022-09-22Use '%z' instead of '%Z' on printf functionsAdhemerval Zanella Netto
The Z modifier is a nonstandard synonymn for z (that predates z itself) and compiler might issue an warning for in invalid conversion specifier. Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-02-24math: Add more input to atanh accuracy testsSunil K Pandey
This patch adds following input to atanh accuracy test. 0x1.f80094p-8 Tested on x86-64 and i686 platforms. Other platforms may have to regenerate ulps file. Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2022-01-14math: Add more inputs to atan2 accuracy tests [BZ #28765]Sunil K Pandey
This patch adds following inputs: 0x1.bcab29da0e947p-54 0x1.bc41f4d2294b8p-54 0x1.a11891ec004d4p-348 0x1.814830510be26p-348 0x1.b836ed678be29p-588 0x1.b7be6f5a03a8cp-588 0x1.a83f842ef3f73p-633 0x1.a799d8a6677ep-633 to atan2 tests and updates x86_64 double atan2 ulps. This fixes BZ #28765. Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2022-01-10math: Fix float conversion regressions with gcc-12 [BZ #28713]Szabolcs Nagy
Converting double precision constants to float is now affected by the runtime dynamic rounding mode instead of being evaluated at compile time with default rounding mode (except static object initializers). This can change the computed result and cause performance regression. The known correctness issues (increased ulp errors) are already fixed, this patch fixes remaining cases of unnecessary runtime conversions. Add float M_* macros to math.h as new GNU extension API. To avoid conversions the new M_* macros are used and instead of casting double literals to float, use float literals (only required if the conversion is inexact). The patch was tested on aarch64 where the following symbols had new spurious conversion instructions that got fixed: __clog10f __gammaf_r_finite@GLIBC_2.17 __j0f_finite@GLIBC_2.17 __j1f_finite@GLIBC_2.17 __jnf_finite@GLIBC_2.17 __kernel_casinhf __lgamma_negf __log1pf __y0f_finite@GLIBC_2.17 __y1f_finite@GLIBC_2.17 cacosf cacoshf casinhf catanf catanhf clogf gammaf_positive Fixes bug 28713. Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2022-01-01Update copyright dates with scripts/update-copyrightsPaul Eggert
I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 7061 files FOO. I then removed trailing white space from math/tgmath.h, support/tst-support-open-dev-null-range.c, and sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following obscure pre-commit check failure diagnostics from Savannah. I don't know why I run into these diagnostics whereas others evidently do not. remote: *** 912-#endif remote: *** 913: remote: *** 914- remote: *** error: lines with trailing whitespace found ... remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2021-12-30x86-64: Add vector tan/tanf implementation to libmvecSunil K Pandey
Implement vectorized tan/tanf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector tan/tanf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-30x86-64: Add vector erfc/erfcf implementation to libmvecSunil K Pandey
Implement vectorized erfc/erfcf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector erfc/erfcf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector asinh/asinhf implementation to libmvecSunil K Pandey
Implement vectorized asinh/asinhf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector asinh/asinhf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector tanh/tanhf implementation to libmvecSunil K Pandey
Implement vectorized tanh/tanhf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector tanh/tanhf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector erf/erff implementation to libmvecSunil K Pandey
Implement vectorized erf/erff containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector erf/erff with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector acosh/acoshf implementation to libmvecSunil K Pandey
Implement vectorized acosh/acoshf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector acosh/acoshf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector atanh/atanhf implementation to libmvecSunil K Pandey
Implement vectorized atanh/atanhf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector atanh/atanhf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector log1p/log1pf implementation to libmvecSunil K Pandey
Implement vectorized log1p/log1pf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector log1p/log1pf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector log2/log2f implementation to libmvecSunil K Pandey
Implement vectorized log2/log2f containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector log2/log2f with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector log10/log10f implementation to libmvecSunil K Pandey
Implement vectorized log10/log10f containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector log10/log10f with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector atan2/atan2f implementation to libmvecSunil K Pandey
Implement vectorized atan2/atan2f containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector atan2/atan2f with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector cbrt/cbrtf implementation to libmvecSunil K Pandey
Implement vectorized cbrt/cbrtf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector cbrt/cbrtf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector sinh/sinhf implementation to libmvecSunil K Pandey
Implement vectorized sinh/sinhf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector sinh/sinhf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector expm1/expm1f implementation to libmvecSunil K Pandey
Implement vectorized expm1/expm1f containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector expm1/expm1f with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector cosh/coshf implementation to libmvecSunil K Pandey
Implement vectorized cosh/coshf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector cosh/coshf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector exp10/exp10f implementation to libmvecSunil K Pandey
Implement vectorized exp10/exp10f containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector exp10/exp10f with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector exp2/exp2f implementation to libmvecSunil K Pandey
Implement vectorized exp2/exp2f containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector exp2/exp2f with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector hypot/hypotf implementation to libmvecSunil K Pandey
Implement vectorized hypot/hypotf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector hypot/hypotf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector asin/asinf implementation to libmvecSunil K Pandey
Implement vectorized asin/asinf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector asin/asinf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29x86-64: Add vector atan/atanf implementation to libmvecSunil K Pandey
Implement vectorized atan/atanf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector atan/atanf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-23math: Properly cast X_TLOSS to float [BZ #28713]H.J. Lu
Add #define AS_FLOAT_CONSTANT_1(x) x##f #define AS_FLOAT_CONSTANT(x) AS_FLOAT_CONSTANT_1(x) to cast X_TLOSS to float at compile-time to fix: FAIL: math/test-float-j0 FAIL: math/test-float-jn FAIL: math/test-float-y0 FAIL: math/test-float-y1 FAIL: math/test-float-yn FAIL: math/test-float32-j0 FAIL: math/test-float32-jn FAIL: math/test-float32-y0 FAIL: math/test-float32-y1 FAIL: math/test-float32-yn when compiling with GCC 12. Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
2021-12-22x86-64: Add vector acos/acosf implementation to libmvecSunil K Pandey
Implement vectorized acos/acosf containing SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI. It also contains accuracy and ABI tests for vector acos/acosf with regenerated ulps. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-13math: Remove the error handling wrapper from hypot and hypotfAdhemerval Zanella
The error handling is moved to sysdeps/ieee754 version with no SVID support. The compatibility symbol versions still use the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). Only ia64 is unchanged, since it still uses the arch specific __libm_error_region on its implementation. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu.
2021-12-13math: Add math-use-builtinds-fmin.hAdhemerval Zanella
It allows the architecture to use the builtin instead of generic implementation.
2021-12-13math: Add math-use-builtinds-fmax.hAdhemerval Zanella
It allows the architecture to use the builtin instead of generic implementation.
2021-10-06math: Also xfail the new j0f tests for ibm128-libgccAdhemerval Zanella
From commit 6bbf7298323bf31b. Checked on powerpc64-linux-gnu.