author | Adhemerval Zanella <adhemerval.zanella@linaro.org> | 2019-03-18 20:18:49 +0000 |
---|---|---|
committer | Adhemerval Zanella <adhemerval.zanella@linaro.org> | 2019-07-08 17:22:22 -0300 |
commit | 931c616eedc303d48fdd3b05bc063b354a133c74 (patch) | |
tree | e820e2f8c8e7d7d9e0120e147643995b4ed62790 | |
parent | 69461d989669d3da051a2bfdae8d5b0ff3dc0749 (diff) | |
download | glibc-931c616eedc303d48fdd3b05bc063b354a133c74.tar glibc-931c616eedc303d48fdd3b05bc063b354a133c74.tar.gz glibc-931c616eedc303d48fdd3b05bc063b354a133c74.tar.bz2 glibc-931c616eedc303d48fdd3b05bc063b354a133c74.zip |
powerpc: Refactor modf{f}
The modf{f} optimization is not an optimization for ISA 2.07+. This
patch moves the IFUNC for powerpc64 only, moves the power5+ implementation
to the generic location, and includes the generic implementation for ISA 2.07+.
The performance changes are based on modf benchtests:
* POWER9 - ppc64
"modf": {
"": {
"duration": 4.97057e+09,
"iterations": 1.00688e+09,
"max": 28.76,
"min": 4.912,
"mean": 4.9366
}
}
* POWER9 - power5+
"modf": {
"": {
"duration": 4.98291e+09,
"iterations": 9.32818e+08,
"max": 15.058,
"min": 5.107,
"mean": 5.34178
}
}
* POWER8 - ppc64
"modf": {
"": {
"duration": 5.05329e+09,
"iterations": 8.38814e+08,
"max": 518.051,
"min": 5.79,
"mean": 6.02433
}
}
* POWER8 - power5+
"modf": {
"": {
"duration": 5.05573e+09,
"iterations": 8.35254e+08,
"max": 63.141,
"min": 5.873,
"mean": 6.05293
}
}
* POWER7 - ppc64
"modf": {
"": {
"duration": 4.89818e+09,
"iterations": 1.08408e+09,
"max": 57.556,
"min": 3.953,
"mean": 4.51827
}
}
* POWER7 - power5+
"modf": {
"": {
"duration": 4.83789e+09,
"iterations": 1.33409e+09,
"max": 46.608,
"min": 2.224,
"mean": 3.62636
}
}
Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cpu and with --with-cpu=power5+
and --disable-multi-arch).
* sysdeps/powerpc/power5+/fpu/s_modf.c: Move to ...
* sysdeps/powerpc/fpu/s_modf.c: ... here. Add ISA 2.07 optimization.
* sysdeps/powerpc/power5+/fpu/s_modff.c: Move to ...
* sysdeps/powerpc/fpu/s_modff.c: ... here. Add ISA 2.07 optimization.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c:
Adjust include.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c:
Likewise.
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile (sysdep_calls,
sysdep_routines): Add s_modf* objects.
(CFLAGS-s_modf-power5+.c, CFLAGS-s_modff-power5+.c,
CFLAGS-s_modf-ppc64.c, CFLAGS-s_modff-ppc64.c): New rule.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-ppc64.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-ppc64.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf.c: ... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-power5+.c: Move
to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-power5+.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-ppc64.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-ppc64.c:
... here.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c: Move to ...
* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff.c: ... here.
Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
-rw-r--r-- | ChangeLog | 32 |
-rw-r--r-- | sysdeps/powerpc/fpu/s_modf.c (renamed from sysdeps/powerpc/power5+/fpu/s_modf.c) | 17 |
-rw-r--r-- | sysdeps/powerpc/fpu/s_modff.c (renamed from sysdeps/powerpc/power5+/fpu/s_modff.c) | 13 |
-rw-r--r-- | sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c | 13 |
-rw-r--r-- | sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c | 9 |
-rw-r--r-- | sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile | 19 |
-rw-r--r-- | sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c (renamed from sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c) | 3 |
-rw-r--r-- | sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-ppc64.c (renamed from sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-ppc64.c) | 0 |
-rw-r--r-- | sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf.c (renamed from sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c) | 0 |
-rw-r--r-- | sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-power5+.c (renamed from sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-power5+.c) | 3 |
-rw-r--r-- | sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-ppc64.c (renamed from sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-ppc64.c) | 0 |
-rw-r--r-- | sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff.c (renamed from sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c) | 0 |
-rw-r--r-- | sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile | 13 |
13 files changed, 80 insertions, 42 deletions
@@ -1,5 +1,37 @@
 2019-07-08 Adhemerval Zanella <adhemerval.zanella@linaro.org>
 
+	* sysdeps/powerpc/power5+/fpu/s_modf.c: Move to ...
+	* sysdeps/powerpc/fpu/s_modf.c: ... here. Add ISA 2.07 optimization.
+	* sysdeps/powerpc/power5+/fpu/s_modff.c: Move to ...
+	* sysdeps/powerpc/fpu/s_modff.c: ... here. Add ISA 2.07 optimization.
+	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c:
+	Adjust include.
+	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c:
+	Likewise.
+	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile (sysdep_calls,
+	sysdep_routines): Add s_modf* objects.
+	(CFLAGS-s_modf-power5+.c, CFLAGS-s_modff-power5+.c,
+	CFLAGS-s_modf-ppc64.c, CFLAGS-s_modff-ppc64.c): New rule.
+	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c: Move
+	to ...
+	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c:
+	... here.
+	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c: Movo
+	to ...
+	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c: Move
+	... here.
+	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c: Move to ...
+	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf.c: ... here.
+	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-power5+.c: Move
+	to ...
+	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-power5+.c:
+	... here.
+	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-ppc64.c: Move to ...
+	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-ppc64.c:
+	... here.
+	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c: Move to ...
+	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff.c: ... here.
+
 	* sysdeps/powerpc/fpu/e_hypot.c (two60, two500, two600, two1022, twoM500,
 	twoM600, two60factor, pdnum): Remove.
 	(TEST_INFO_NAN, GET_TW0_HIGH_WORD): Remove macro.
diff --git a/sysdeps/powerpc/power5+/fpu/s_modf.c b/sysdeps/powerpc/fpu/s_modf.c
index dbb11652e1..2304fc48ed 100644
--- a/sysdeps/powerpc/power5+/fpu/s_modf.c
+++ b/sysdeps/powerpc/fpu/s_modf.c
@@ -15,9 +15,15 @@
    License along with the GNU C Library; see the file COPYING.LIB. If
    not, see <http://www.gnu.org/licenses/>. */
 
-#include <math.h>
-#include <math_ldbl_opt.h>
-#include <libm-alias-double.h>
+/* ISA 2.07 provides fast GPR to FP instruction (mfvsr{d,wz}) which make
+   generic implementation faster. Also disables for old ISAs that do not
+   have ceil/floor instructions. */
+#if defined(_ARCH_PWR8) || !defined(_ARCH_PWR5X)
+# include <sysdeps/ieee754/ldbl-opt/s_modf.c>
+#else
+# include <math.h>
+# include <math_ldbl_opt.h>
+# include <libm-alias-double.h>
 
 double
 __modf (double x, double *iptr)
@@ -44,7 +50,10 @@ __modf (double x, double *iptr)
       return copysign (x - *iptr, x);
     }
 }
+# ifndef __modf
 libm_alias_double (__modf, modf)
-#if LONG_DOUBLE_COMPAT (libc, GLIBC_2_0)
+# if LONG_DOUBLE_COMPAT (libc, GLIBC_2_0)
 compat_symbol (libc, __modf, modfl, GLIBC_2_0);
+# endif
+# endif
 #endif
diff --git a/sysdeps/powerpc/power5+/fpu/s_modff.c b/sysdeps/powerpc/fpu/s_modff.c
index 87c9f020f7..2a0f114b20 100644
--- a/sysdeps/powerpc/power5+/fpu/s_modff.c
+++ b/sysdeps/powerpc/fpu/s_modff.c
@@ -15,8 +15,14 @@
    License along with the GNU C Library; see the file COPYING.LIB. If
    not, see <http://www.gnu.org/licenses/>. */
 
-#include <math.h>
-#include <libm-alias-float.h>
+/* ISA 2.07 provides fast GPR to FP instruction (mfvsr{d,wz}) which make
+   generic implementation faster. Also disables for old ISAs that do not
+   have ceil/floor instructions. */
+#if defined(_ARCH_PWR8) || !defined(_ARCH_PWR5X)
+# include <sysdeps/ieee754/flt-32/s_modff.c>
+#else
+# include <math.h>
+# include <libm-alias-float.h>
 
 float
 __modff (float x, float *iptr)
@@ -43,4 +49,7 @@ __modff (float x, float *iptr)
       return copysignf (x - *iptr, x);
     }
 }
+# ifndef __modff
 libm_alias_float (__modf, modf)
+# endif
+#endif
diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c
index b1d0540b31..6f93c2b652 100644
--- a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c
+++ b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c
@@ -16,16 +16,5 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>. */
 
-#include <math.h>
-#include <math_ldbl_opt.h>
-
-#undef weak_alias
-#define weak_alias(a,b)
-#undef strong_alias
-#define strong_alias(a,b)
-#undef compat_symbol
-#define compat_symbol(a,b,c,d)
-
 #define __modf __modf_power5plus
-
-#include <sysdeps/powerpc/power5+/fpu/s_modf.c>
+#include <sysdeps/powerpc/fpu/s_modf.c>
diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c
index 8b333eae0d..2e701881e8 100644
--- a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c
+++ b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c
@@ -16,12 +16,5 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>. */
 
-#include <math.h>
-#include <math_ldbl_opt.h>
-
-#undef weak_alias
-#define weak_alias(a,b)
-
 #define __modff __modff_power5plus
-
-#include <sysdeps/powerpc/power5+/fpu/s_modff.c>
+#include <sysdeps/powerpc/fpu/s_modff.c>
diff --git a/sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
index f542e89520..f5fa357d57 100644
--- a/sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
@@ -1,4 +1,13 @@
 ifeq ($(subdir),math)
+# These functions are built both for libc and libm because they're required
+# by printf. While the libc objects have the prefix s_, the libm ones are
+# prefixed with m_.
+sysdep_calls := s_modf-power5+ \
+		s_modf-ppc64 \
+		s_modff-power5+ \
+		s_modff-ppc64
+
+sysdep_routines += $(sysdep_calls)
 libm-sysdep_routines += s_ceil-power5+ \
 			s_ceil-ppc64 \
 			s_ceilf-power5+ \
@@ -22,7 +31,8 @@ libm-sysdep_routines += s_ceil-power5+ \
 			s_llround-power6x \
 			s_llround-power5+ \
 			s_llround-ppc64 \
-			s_llroundf-ppc64
+			s_llroundf-ppc64 \
+			$(sysdep_calls:s_%=m_%)
 
 CFLAGS-s_ceil-power5+.c = -mcpu=power5+
 CFLAGS-s_ceilf-power5+.c = -mcpu=power5+
@@ -37,4 +47,11 @@ CFLAGS-s_llrint-power6x.c += -mcpu=power6x
 CFLAGS-s_llround-power8.c += -mcpu=power8
 CFLAGS-s_llround-power6x.c += -mcpu=power6x
 CFLAGS-s_llround-power5+.c += -mcpu=power5+
+
+CFLAGS-s_modf-power5+.c += -mcpu=power5+
+CFLAGS-s_modff-power5+.c += -mcpu=power5+
+# These files quiet sNaNs in a way that is optimized away without
+# -fsignaling-nans.
+CFLAGS-s_modf-ppc64.c += -fsignaling-nans
+CFLAGS-s_modff-ppc64.c += -fsignaling-nans
 endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c
index 1a958de178..6f93c2b652 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c
+++ b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c
@@ -16,4 +16,5 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>. */
 
-#include <sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c>
+#define __modf __modf_power5plus
+#include <sysdeps/powerpc/fpu/s_modf.c>
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-ppc64.c b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-ppc64.c
index 06f0e87281..06f0e87281 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-ppc64.c
+++ b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-ppc64.c
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf.c
index af57aaaf85..af57aaaf85 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c
+++ b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf.c
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-power5+.c b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-power5+.c
index 4939d4bc2b..2e701881e8 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-power5+.c
+++ b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-power5+.c
@@ -16,4 +16,5 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>. */
 
-#include <sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c>
+#define __modff __modff_power5plus
+#include <sysdeps/powerpc/fpu/s_modff.c>
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-ppc64.c b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-ppc64.c
index 5af1887d3b..5af1887d3b 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-ppc64.c
+++ b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-ppc64.c
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff.c
index f048f4abcf..f048f4abcf 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c
+++ b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff.c
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
index 534d5a7133..d7ad1e2724 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
@@ -1,10 +1,4 @@
 ifeq ($(subdir),math)
-# These functions are built both for libc and libm because they're required
-# by printf. While the libc objects have the prefix s_, the libm ones are
-# prefixed with m_.
-sysdep_calls := s_modf-power5+ s_modf-ppc64 \
-		s_modff-power5+ s_modff-ppc64
-
 sysdep_routines += $(sysdep_calls)
 libm-sysdep_routines += s_logb-power7 s_logbf-power7 \
 			s_logbl-power7 s_logb-ppc64 s_logbf-ppc64 \
@@ -14,11 +8,4 @@ libm-sysdep_routines += s_logb-power7 s_logbf-power7 \
 CFLAGS-s_logbf-power7.c = -mcpu=power7
 CFLAGS-s_logbl-power7.c = -mcpu=power7
 CFLAGS-s_logb-power7.c = -mcpu=power7
-CFLAGS-s_modf-power5+.c = -mcpu=power5+
-CFLAGS-s_modff-power5+.c = -mcpu=power5+
-
-# These files quiet sNaNs in a way that is optimized away without
-# -fsignaling-nans.
-CFLAGS-s_modf-ppc64.c += -fsignaling-nans
-CFLAGS-s_modff-ppc64.c += -fsignaling-nans
 endif