Age | Commit message (Collapse) | Author |
|
|
|
Besides the option being gcc specific, this approach is still fragile
and not future proof since we do not know if this will be the only
optimization option gcc will add that transforms loops to memset
(or any libcall).
This patch adds a new header, dl-symbol-redir-ifunc.h, that can b
used to redirect the compiler generated libcalls to port the generic
memset implementation if required.
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
The builtin from generic code generates similar compliant sequence.
Checked on sparc64-linux-gnu.
|
|
For sparc64 is the same as the generic implementation, while for
sparc32 the builtin generates the same code.
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
|
|
The symbol is not present in current POSIX specification and compiler
already generates memset call.
|
|
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
|
|
This patch adds the narrowing fused multiply-add functions from TS
18661-1 / TS 18661-3 / C2X to glibc's libm: ffma, ffmal, dfmal,
f32fmaf64, f32fmaf32x, f32xfmaf64 for all configurations; f32fmaf64x,
f32fmaf128, f64fmaf64x, f64fmaf128, f32xfmaf64x, f32xfmaf128,
f64xfmaf128 for configurations with _Float64x and _Float128;
__f32fmaieee128 and __f64fmaieee128 aliases in the powerpc64le case
(for calls to ffmal and dfmal when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.
The changes are mostly similar to those for the other narrowing
functions previously added, especially that for sqrt, so the
description of those generally applies to this patch as well. As with
sqrt, I reused the same test inputs in auto-libm-test-in as for
non-narrowing fma rather than adding extra or separate inputs for
narrowing fma. The tests in libm-test-narrow-fma.inc also follow
those for non-narrowing fma.
The non-narrowing fma has a known bug (bug 6801) that it does not set
errno on errors (overflow, underflow, Inf * 0, Inf - Inf). Rather
than fixing this or having narrowing fma check for errors when
non-narrowing does not (complicating the cases when narrowing fma can
otherwise be an alias for a non-narrowing function), this patch does
not attempt to check for errors from narrowing fma and set errno; the
CHECK_NARROW_FMA macro is still present, but as a placeholder that
does nothing, and this missing errno setting is considered to be
covered by the existing bug rather than needing a separate open bug.
missing-errno annotations are duly added to many of the
auto-libm-test-in test inputs for fma.
This completes adding all the new functions from TS 18661-1 to glibc,
so will be followed by corresponding stdc-predef.h changes to define
__STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, as the support
for TS 18661-1 will be at a similar level to that for C standard
floating-point facilities up to C11 (pragmas not implemented, but
library functions done). (There are still further changes to be done
to implement changes to the types of fromfp functions from N2548.)
Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float). The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
|
|
include/math.h has a mechanism to redirect internal calls to various
libm functions, that can often be inlined by the compiler, to call
non-exported __* names for those functions in the case when the calls
aren't inlined, with the redirection being disabled when
NO_MATH_REDIRECT. Add fma to the functions to which this mechanism is
applied.
At present, libm-internal fma calls (generally to __builtin_fma*
functions) are only done when it's known the call will be inlined,
with alternative code not relying on an fma operation being used in
the caller otherwise. This patch is in preparation for adding the TS
18661 / C2X narrowing fma functions to glibc; it will be natural for
the narrowing function implementations to call the underlying fma
functions unconditionally, with this either being inlined or resulting
in an __fma* call. (Using two levels of round-to-odd computation like
that, in the case where there isn't an fma hardware instruction, isn't
optimal but is certainly a lot simpler for the initial implementation
than writing different narrowing fma implementations for all the
various pairs of formats.)
Tested with build-many-glibcs.py that installed stripped shared
libraries are unchanged by the patch (using
<https://sourceware.org/pipermail/libc-alpha/2021-September/130991.html>
to fix installed library stripping in build-many-glibcs.py). Also
tested for x86_64.
|
|
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date. Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.
Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions. These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.
The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively. These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:
https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc
https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
|
|
|
|
This patch removes the arch-specific atomic instruction, relying on
compiler builtins. The __sparc32_atomic_locks support is removed
and a configure check is added to check if compiler uses libatomic
to implement CAS.
It also removes the sparc specific sem_* and pthread_barrier_*
implementations. It in turn allows buidling against a LEON3/LEON4
sparcv8 target, although it will still be incompatible with generic
sparcv9.
Checked on sparcv9-linux-gnu and sparc64-linux-gnu. I also checked
with build against sparcv8-linux-gnu with -mcpu=leon3.
Tested-by: Andreas Larsson <andreas@gaisler.com>
|
|
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:
sed -ri '
s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
$(find $(git ls-files) -prune -type f \
! -name '*.po' \
! -name 'ChangeLog*' \
! -path COPYING ! -path COPYING.LIB \
! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
! -path manual/texinfo.tex ! -path scripts/config.guess \
! -path scripts/config.sub ! -path scripts/install-sh \
! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
! -path INSTALL ! -path locale/programs/charmap-kw.h \
! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
! '(' -name configure \
-execdir test -f configure.ac -o -f configure.in ';' ')' \
! '(' -name preconfigure \
-execdir test -f preconfigure.ac ';' ')' \
-print)
and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:
chmod a+x sysdeps/unix/sysv/linux/riscv/configure
# Omit irrelevant whitespace and comment-only changes,
# perhaps from a slightly-different Autoconf version.
git checkout -f \
sysdeps/csky/configure \
sysdeps/hppa/configure \
sysdeps/riscv/configure \
sysdeps/unix/sysv/linux/csky/configure
# Omit changes that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
git checkout -f \
sysdeps/powerpc/powerpc64/ppc-mcount.S \
sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
# Omit change that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
|
|
This patch refactor how hp-timing is used on loader code for statistics
report. The HP_TIMING_AVAIL and HP_SMALL_TIMING_AVAIL are removed and
HP_TIMING_INLINE is used instead to check for hp-timing avaliability.
For alpha, which only defines HP_SMALL_TIMING_AVAIL, the HP_TIMING_INLINE
is set iff for IS_IN(rtld).
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu. I also
checked the builds for all afected ABIs.
* benchtests/bench-timing.h: Replace HP_TIMING_AVAIL with
HP_TIMING_INLINE.
* nptl/descr.h: Likewise.
* elf/rtld.c (RLTD_TIMING_DECLARE, RTLD_TIMING_NOW, RTLD_TIMING_DIFF,
RTLD_TIMING_ACCUM_NT, RTLD_TIMING_SET): Define.
(dl_start_final_info, _dl_start_final, dl_main, print_statistics):
Abstract hp-timing usage with RTLD_* macros.
* sysdeps/alpha/hp-timing.h (HP_TIMING_INLINE): Define iff IS_IN(rtld).
(HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL): Remove.
* sysdeps/generic/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL,
HP_TIMING_NONAVAIL): Likewise.
* sysdeps/ia64/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL):
Likewise.
* sysdeps/powerpc/powerpc32/power4/hp-timing.h (HP_TIMING_AVAIL,
HP_SMALL_TIMING_AVAIL): Likewise.
* sysdeps/powerpc/powerpc64/hp-timing.h (HP_TIMING_AVAIL,
HP_SMALL_TIMING_AVAIL): Likewise.
* sysdeps/sparc/sparc32/sparcv9/hp-timing.h (HP_TIMING_AVAIL,
HP_SMALL_TIMING_AVAIL): Likewise.
* sysdeps/sparc/sparc64/hp-timing.h (HP_TIMING_AVAIL,
HP_SMALL_TIMING_AVAIL): Likewise.
* sysdeps/x86/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL):
Likewise.
* sysdeps/generic/hp-timing-common.h: Update comment with
HP_TIMING_AVAIL removal.
|
|
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
Continuing the move to use, within libm, public names for libm
functions that can be inlined as built-in functions on many
architectures, this patch moves calls to __copysign functions to call
the corresponding copysign names instead, with asm redirection to
__copysign when the calls are not inlined (all cases are inlined
except for IBM long double for powerpc soft-float / e500v1). This
eliminates the need for an inline function defining __copysign in
terms of __builtin_copysign.
Tested for x86_64, and with build-many-glibcs.py.
* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT]
(MATH_REDIRECT_BINARY_ARGS): New macro.
[!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
&& !NO_MATH_REDIRECT] (copysign): Redirect using MATH_REDIRECT.
* sysdeps/alpha/fpu/s_copysign.c: Define NO_MATH_REDIRECT before
header inclusion.
* sysdeps/alpha/fpu/s_copysignf.c: Likewise.
* sysdeps/ieee754/dbl-64/s_copysign.c: Likewise.
* sysdeps/ieee754/float128/s_copysignf128.c: Likewise.
* sysdeps/ieee754/flt-32/s_copysignf.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_copysignl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_copysignl.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c:
Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Likewise.
* sysdeps/riscv/rvd/s_copysign.c: Likewise.
* sysdeps/riscv/rvf/s_copysignf.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.c:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf.c:
Likewise.
* sysdeps/generic/math_private_calls.h
[!__MATH_DECLARING_LONG_DOUBLE || !NO_LONG_DOUBLE] (__copysign):
Do not declare and define as an inline function.
* math/divtc3.c (__divtc3): Use copysign functions instead of
__copysign variants.
* math/multc3.c (__multc3): Likewise.
* sysdeps/generic/math-type-macros.h (M_COPYSIGN): Likewise.
* sysdeps/ieee754/dbl-64/e_atan2.c (signArctan2): Likewise.
* sysdeps/ieee754/dbl-64/e_atanh.c (__ieee754_atanh): Likewise.
* sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r):
Likewise.
* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise.
(__ieee754_yn): Likewise.
* sysdeps/ieee754/dbl-64/s_asinh.c (__asinh): Likewise.
* sysdeps/ieee754/dbl-64/s_atan.c (__signArctan): Likewise.
* sysdeps/ieee754/dbl-64/s_scalbln.c (__scalbln): Likewise.
* sysdeps/ieee754/dbl-64/s_scalbn.c (__scalbn): Likewise.
* sysdeps/ieee754/dbl-64/s_sin.c (do_sin): Likewise.
(__sin): Likewise.
* sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise.
* sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c (__nearbyint):
Likewise.
* sysdeps/ieee754/dbl-64/wordsize-64/s_scalbln.c (__scalbln):
Likewise.
* sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c (__scalbn):
Likewise.
* sysdeps/ieee754/flt-32/e_atanhf.c (__ieee754_atanhf): Likewise.
* sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r):
Likewise.
* sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise.
(__ieee754_ynf): Likewise.
* sysdeps/ieee754/flt-32/s_asinhf.c (__asinhf): Likewise.
* sysdeps/ieee754/flt-32/s_scalbnf.c (__scalbnf): Likewise.
* sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise.
* sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
(__ieee754_ynl): Likewise.
* sysdeps/ieee754/ldbl-128/s_scalblnl.c (__scalblnl): Likewise.
* sysdeps/ieee754/ldbl-128/s_scalbnl.c (__scalbnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise.
(__ieee754_ynl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_fmal.c (__fmal): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_scalblnl.c (__scalblnl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (__scalbnl): Likewise.
* sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise.
(__ieee754_ynl)
* sysdeps/ieee754/ldbl-96/s_asinhl.c (__asinhl): Likewise.
* sysdeps/ieee754/ldbl-96/s_scalblnl.c (__scalblnl): Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-copysign.c (copysignl): Likewise.
* sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise.
* sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.
|
|
Continuing the move to use, within libm, public names for libm
functions that can be inlined as built-in functions on many
architectures, this patch moves calls to __rint functions to call the
corresponding rint names instead, with asm redirection to __rint when
the calls are not inlined. The x86_64 math_private.h is removed as no
longer useful after this patch.
This patch is relative to a tree with my floor patch
<https://sourceware.org/ml/libc-alpha/2018-09/msg00148.html> applied,
and much the same considerations arise regarding possibly replacing an
IFUNC call with a direct inline expansion.
Tested for x86_64, and with build-many-glibcs.py.
* include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ &&
__FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (rint): Redirect
using MATH_REDIRECT.
* sysdeps/aarch64/fpu/s_rint.c: Define NO_MATH_REDIRECT before
header inclusion.
* sysdeps/aarch64/fpu/s_rintf.c: Likewise.
* sysdeps/alpha/fpu/s_rint.c: Likewise.
* sysdeps/alpha/fpu/s_rintf.c: Likewise.
* sysdeps/i386/fpu/s_rintl.c: Likewise.
* sysdeps/ieee754/dbl-64/s_rint.c: Likewise.
* sysdeps/ieee754/dbl-64/wordsize-64/s_rint.c: Likewise.
* sysdeps/ieee754/float128/s_rintf128.c: Likewise.
* sysdeps/ieee754/flt-32/s_rintf.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_rintl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise.
* sysdeps/m68k/coldfire/fpu/s_rint.c: Likewise.
* sysdeps/m68k/coldfire/fpu/s_rintf.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_rint.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_rintf.c: Likewise.
* sysdeps/m68k/m680x0/fpu/s_rintl.c: Likewise.
* sysdeps/powerpc/fpu/s_rint.c: Likewise.
* sysdeps/powerpc/fpu/s_rintf.c: Likewise.
* sysdeps/riscv/rv64/rvd/s_rint.c: Likewise.
* sysdeps/riscv/rvf/s_rintf.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.c: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_rint.c: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_rintf.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise.
* sysdeps/x86_64/fpu/math_private.h: Remove file.
* math/e_scalb.c (invalid_fn): Use rint functions instead of
__rint variants.
* math/e_scalbf.c (invalid_fn): Likewise.
* math/e_scalbl.c (invalid_fn): Likewise.
* sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r):
Likewise.
* sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r):
Likewise.
* sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise.
* sysdeps/ieee754/k_standardl.c (__kernel_standard_l): Likewise.
* sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Likewise.
* sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise.
|
|
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
Support added to identify Sparc M7/T7/S7/M8/T8 processor capability.
Performance tests run on Sparc S7 using new code and old niagara4 code.
Optimizations for memset also apply to bzero as they share code.
For memset/bzero, performance comparison with niagara4 code:
For memset nonzero data,
256-1023 bytes - 60-90% gain (in cache); 5% gain (out of cache)
1K+ bytes - 80-260% gain (in cache); 40-80% gain (out of cache)
For memset zero data (and bzero),
256-1023 bytes - 80-120% gain (in cache), 0% gain (out of cache)
1024+ bytes - 2-4x gain (in cache), 10-35% gain (out of cache)
Tested in sparcv9-*-* and sparc64-*-* targets in both multi and
non-multi arch configurations.
Patrick McGehearty <patrick.mcgehearty@oracle.com>
Adhemerval Zanella <adhemerval.zanella@linaro.org>
* sysdeps/sparc/sparc32/sparcv9/multiarch/Makefile
(sysdeps_routines): Add memset-niagara7.
* sysdeps/sparc/sparc64/multiarch/Makefile (sysdes_rotuines):
Likewise.
* sysdeps/sparc/sparc32/sparcv9/multiarch/memset-niagara7.S: New
file.
* sysdeps/sparc/sparc64/multiarch/memset-niagara7.S: Likewise.
* sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add __bzero_niagara7 and __memset_niagara7.
* sysdeps/sparc/sparc64/multiarch/ifunc-memset.h (IFUNC_SELECTOR):
Add niagara7 option.
* NEWS: Mention sparc m7 optimized memcpy, mempcpy, memmove, and
memset.
|
|
Support added to identify Sparc M7/T7/S7/M8/T8 processor capability.
Performance tests run on Sparc S7 using new code and old niagara4 code.
Optimizations for memcpy also apply to mempcpy and memmove
where they share code. Optimizations for memset also apply
to bzero as they share code.
For memcpy/mempcpy/memmove, performance comparison with niagara4 code:
Long word aligned data
0-127 bytes - minimal changes
128-1023 bytes - 7-30% gain
1024+ bytes - 1-7% gain (in cache); 30-100% gain (out of cache)
Word aligned data
0-127 bytes - 50%+ gain
128-1023 bytes - 10-200% gain
1024+ bytes - 0-15% gain (in cache); 5-50% gain (out of cache)
Unaligned data
0-127 bytes - 0-70%+ gain
128-447 bytes - 40-80%+ gain
448-511 bytes - 1-3% loss
512-4096 bytes - 2-3% gain (in cache); 0-20% gain (out of cache)
4096+ bytes - ± 3% (in cache); 20-50% gain (out of cache)
Tested in sparcv9-*-* and sparc64-*-* targets in both multi and
non-multi arch configurations.
Patrick McGehearty <patrick.mcgehearty@oracle.com>
Adhemerval Zanella <adhemerval.zanella@linaro.org>
* sysdeps/sparc/sparc32/sparcv9/multiarch/Makefile
(sysdeps_routines): Add memcpy-memmove-niagara7 and memmove-ultra1.
* sysdeps/sparc/sparc64/multiarch/Makefile (sysdeps_routines):
Likewise.
* sysdeps/sparc/sparc32/sparcv9/multiarch/memcpy-memmove-niagara7.S:
New file.
* sysdeps/sparc/sparc32/sparcv9/multiarch/memmove-ultra1.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/multiarch/rtld-memmove.c: Likewise.
* sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add __memcpy_niagara7, __mempcpy_niagara7,
and __memmove_niagara7.
* sysdeps/sparc/sparc64/multiarch/ifunc-memcpy.h (IFUNC_SELECTOR):
Add niagara7 option.
* sysdeps/sparc/sparc64/multiarch/memmove.c: New file.
* sysdeps/sparc/sparc64/multiarch/ifunc-memmove.h: Likewise.
* sysdeps/sparc/sparc64/multiarch/memcpy-memmove-niagara7.S: Likewise.
* sysdeps/sparc/sparc64/multiarch/memmove-ultra1.S: Likewise.
* sysdeps/sparc/sparc64/multiarch/rtld-memmove.c: Likewise.
|
|
Tested in sparcv9-*-* and sparc64-*-* targets in both non-multi-arch and
multi-arch configurations.
* sysdeps/sparc/sparc32/sparcv9/memmove.S: New file.
* sysdeps/sparc/sparc32/sparcv9/rtld-memmove.c: Likewise.
* sysdeps/sparc/sparc64/memmove.S: Likewise.
* sysdeps/sparc/sparc64/rtld-memmove.c: Likewise.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
Fix build caused by 5b4e5e78690c4938de312a8b176f4b14eb7bea4a.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.c: Fix build
due redirect macro.
|
|
* sysdeps/sparc/sparc64/cpu_relax.c: New file.
* sysdeps/sparc/sparc32/sparcv9/cpu_relax.c: Likewise.
* sysdeps/sparc/sparc64/cpu_relax.S: Remove file.
* sysdeps/sparc/sparc32/sparcv9/cpu_relax.S: Likewise.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
This patch refactors the sparc32 ifunc selector to a C implementation.
Also, the generic symbol is moved to its own implementation file
s_nearbyint{f}-generic.S).
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
(libm-sysdep_routines): Add s_nearbyintf-generic and
s_nearbyint-generic.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-generic.S:
New file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-generic.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf.c:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint.S: Remove
file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf.S:
Likewise.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
This patch refactors the sparc32 ifunc selector to a C implementation.
Also, the generic symbol is moved to its own implementation file
s_rint{f}-generic.S).
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
(libm-sysdep_routines): Add s_rintf-generic and s_rint-generic.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint-generic.S: New
file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf-generic.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.S: Remove file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.S: Likewise.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
This patch refactors the sparc32 ifunc selector to a C implementation.
Also, the generic symbol is moved to its own implementation file
s_llrint{f}-generic.S).
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
(libm-sysdep_routines): Add s_llrintf-generic and s_llrint-generic.
* sysdeps/sparc/sparcv9/fpu/multiarch/s_llrint-generic.S: New
file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrint.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrintf-generic.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrintf.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrint.S: Remove file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrintf.S: Likewise.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
This patch refactors the sparc32 ifunc selector to a C implementation.
Also, the generic symbol is moved to its own implementation file
s_fabs{f}-generic.S).
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
(libm-sysdep_routines): Add s_fabsf-generic and s_fabs-generic.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabs-generic.S: New
file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabs.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabsf-generic.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabsf.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabs.S: Remove file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabsf.S: Likewise.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
This patch refactors the sparc32 ifunc selector to a C implementation.
Also, the generic symbol is moved to its own implementation file
s_copysign{f}-generic.S).
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
(sysdep_calls): New rule.
(sysdep_routines): Use sysdep_calls as base.
(libm-sysdep_routines): Add generic rule for symbols shared with
libc. Add s_copysign-generic and s_copysign-generic objects.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign-generic.S:
New file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf-generic.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.S: Remove file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf.S: Likewise.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
The sparc32/sparcv9/fpu/multiarch implementations of llrint / llrintf
have aliases lllrint / lllrintf. No such function is exported from or
used in libm and these aliases should not be there; I expect they
arose accidentally in the course of converting a 64-bit implementation
(where lrint and llrint can be aliases) to a 32-bit llrint
implementation. This patch removes those spurious aliases.
Tested (compilation only) with build-many-glibcs.py for
sparcv9-linux-gnu.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrint.S
(__lllrint): Remove alias.
(lllrint): Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrintf.S
(__lllrintf): Likewise.
(lllrintf): Likewise.
|
|
Continuing the preparation for additional _FloatN / _FloatNx function
aliases, this patch makes sparc libm function implementations use
libm_alias_float to define function aliases.
Tested with build-many-glibcs.py for all its sparc configurations that
installed stripped shared libraries are unchanged by the patch.
* sysdeps/sparc/sparc32/fpu/s_copysignf.S: Include
<libm-alias-float.h>.
(copysignf): Define using libm_alias_float.
* sysdeps/sparc/sparc32/fpu/s_fabsf.S: Include
<libm-alias-float.h>.
(fabsf): Define using libm_alias_float.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf.S:
Include <libm-alias-float.h>.
(copysignf): Define using libm_alias_float.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabsf.S: Include
<libm-alias-float.h>.
(fabsf): Define using libm_alias_float.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdimf.c: Include
<libm-alias-float.h>.
(fdimf): Define using libm_alias_float.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmaf.c: Include
<libm-alias-float.h>.
(fmaf): Define using libm_alias_float.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrintf.S: Include
<libm-alias-float.h>.
(llrintf): Define using libm_alias_float.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf.S:
Include <libm-alias-float.h>.
(nearbyintf): Define using libm_alias_float.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.S: Include
<libm-alias-float.h>.
(rintf): Define using libm_alias_float.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_llrintf.S: Include
<libm-alias-float.h>.
(llrintf): Define using libm_alias_float.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_lrintf.S: Include
<libm-alias-float.h>.
(lrintf): Define using libm_alias_float.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyintf.S: Include
<libm-alias-float.h>.
(nearbyintf): Define using libm_alias_float.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_rintf.S: Include
<libm-alias-float.h>.
(rintf): Define using libm_alias_float.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.c: Include
<libm-alias-float.h>.
(ceilf): Define using libm_alias_float.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Include
<libm-alias-float.h>.
(floorf): Define using libm_alias_float.
* sysdeps/sparc/sparc64/fpu/multiarch/s_fmaf.c: Include
<libm-alias-float.h>.
(fmaf): Define using libm_alias_float.
* sysdeps/sparc/sparc64/fpu/multiarch/s_lrintf.c: Include
<libm-alias-float.h>.
(lrintf): Define using libm_alias_float.
(llrintf): Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_nearbyintf.c: Include
<libm-alias-float.h>.
(nearbyintf): Define using libm_alias_float.
* sysdeps/sparc/sparc64/fpu/multiarch/s_rintf.c: Include
<libm-alias-float.h>.
(rintf): Define using libm_alias_float.
* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.c: Include
<libm-alias-float.h>.
(truncf): Define using libm_alias_float.
* sysdeps/sparc/sparc64/fpu/s_copysignf.S: Include
<libm-alias-float.h>.
(copysignf): Define using libm_alias_float.
* sysdeps/sparc/sparc64/fpu/s_fabsf.c: Include
<libm-alias-float.h>.
(fabsf): Define using libm_alias_float.
* sysdeps/sparc/sparc64/fpu/s_lrintf.S: Include
<libm-alias-float.h>.
(lrintf): Define using libm_alias_float.
(llrintf): Likewise.
* sysdeps/sparc/sparc64/fpu/s_nearbyintf.S: Include
<libm-alias-float.h>.
(nearbyintf): Define using libm_alias_float.
* sysdeps/sparc/sparc64/fpu/s_rintf.S: Include
<libm-alias-float.h>.
(rintf): Define using libm_alias_float.
|
|
Continuing the preparation for additional _FloatN / _FloatNx function
aliases, this patch makes sparc libm function implementations use
libm_alias_double to define function aliases (with consequent
simplification where compat symbol handling is now done by those
macros rather than locally in architecture-specific code).
Tested with build-many-glibcs.py for all its sparc configurations that
installed stripped shared libraries are unchanged by the patch.
* sysdeps/sparc/sparc32/fpu/s_copysign.S: Include
<libm-alias-double.h>.
(copysign): Define using libm_alias_double.
* sysdeps/sparc/sparc32/fpu/s_fabs.S: Include
<libm-alias-double.h>.
(fabs): Define using libm_alias_double.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.S:
Include <libm-alias-double.h>.
(copysign): Define using libm_alias_double.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabs.S: Include
<libm-alias-double.h>.
(fabs): Define using libm_alias_double.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim.c: Include
<libm-alias-double.h>.
(fdim): Define using libm_alias_double.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fma.c: Include
<libm-alias-double.h>.
(fma): Define using libm_alias_double.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_llrint.S: Include
<libm-alias-double.h>.
(llrint): Define using libm_alias_double.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint.S:
Include <libm-alias-double.h>.
(nearbyint): Define using libm_alias_double.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.S: Include
<libm-alias-double.h>.
(rint): Define using libm_alias_double.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_fabs.S: Include
<libm-alias-double.h>.
(fabs): Define using libm_alias_double.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_llrint.S: Include
<libm-alias-double.h>.
(llrint): Define using libm_alias_double.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyint.S: Include
<libm-alias-double.h>.
(nearbyint): Define using libm_alias_double.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_rint.S: Include
<libm-alias-double.h>.
(rint): Define using libm_alias_double.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.c: Include
<libm-alias-double.h>.
(ceil): Define using libm_alias_double.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Include
<libm-alias-double.h>.
(floor): Define using libm_alias_double.
* sysdeps/sparc/sparc64/fpu/multiarch/s_fma.c: Include
<libm-alias-double.h>.
(fma): Define using libm_alias_double.
* sysdeps/sparc/sparc64/fpu/multiarch/s_lrint.c: Include
<libm-alias-double.h>.
(lrint): Define using libm_alias_double.
(llrint): Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_nearbyint.c: Include
<libm-alias-double.h>.
(nearbyint): Define using libm_alias_double.
* sysdeps/sparc/sparc64/fpu/multiarch/s_rint.c: Include
<libm-alias-double.h>.
(rint): Define using libm_alias_double.
* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.c: Include
<libm-alias-double.h>.
(trunc): Define using libm_alias_double.
* sysdeps/sparc/sparc64/fpu/s_copysign.S: Include
<libm-alias-double.h>.
(copysign): Define using libm_alias_double.
* sysdeps/sparc/sparc64/fpu/s_fabs.c: Include
<libm-alias-double.h>.
(fabs): Define using libm_alias_double.
* sysdeps/sparc/sparc64/fpu/s_lrint.S: Include
<libm-alias-double.h>.
(lrint): Define using libm_alias_double.
(llrint): Likewise.
* sysdeps/sparc/sparc64/fpu/s_nearbyint.S: Include
<libm-alias-double.h>.
(nearbyint): Define using libm_alias_double.
* sysdeps/sparc/sparc64/fpu/s_rint.S: Include
<libm-alias-double.h>.
(rint): Define using libm_alias_double.
|
|
The --disable-multi-arch case of sparcv9 libm is missing a fabsl
compat symbol for when long double had the same ABI as double. This
patch adds the missing compat symbol to this implementation. As my
fix for other instances of this missing compat symbol postdates the
last release, I'm considering this as being part of bug 22229 that was
missing from my previous fix rather than as a separate issue, and so
as not needing a new bug report in Bugzilla.
Tested (compilation only) with build-many-glibcs.py for
sparcv9-linux-gnu --disable-multi-arch.
[BZ #22229]
* sysdeps/sparc/sparc32/sparcv9/fpu/s_fabs.S: Include
<math_ldbl_opt.h>.
(fabsl): Define as compat symbol at version GLIBC_2_0 for libm.
|
|
This patch assumes VIS3 support by binutils, which is supported since
version 2.22. This leads to some code simplification, mostly on
multiarch build where there is only one variant instead of previously
two (whether binutils supports VIS3 instructions or not).
For multiarch files where HAVE_AS_VIS3_SUPPORT was checked and
the default implementation was built with a different name, a new
file with (implementation with -generic appended) is added.
Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
* config.h.in (HAVE_AS_VIS3_SUPPORT): Remove check for VIS3 support.
* sysdeps/sparc/configure.ac (HAVE_AS_VIS3_SUPPORT): Likewise.
* sysdeps//sparc/sparc32/sparcv9/fpu/multiarch/s_fdim.c: Likewise.
* sysdeps//sparc/sparc32/sparcv9/fpu/multiarch/s_fdimf.c: Likewise.
* sysdeps//sparc/sparc32/sparcv9/fpu/multiarch/s_fma.c: Likewise.
* sysdeps//sparc/sparc32/sparcv9/fpu/multiarch/s_fmaf.c: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.c: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.c: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_fma.c: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_fmaf.c: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.c: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.c: Likewise.
* sysdeps/sparc/sparc-ifunc.h [!HAVE_AS_VIS3_SUPPORT]
(SPARC_ASM_VIS3_IFUNC, SPARC_ASM_VIS3_VIS2_IFUNC): Remove macros.
* sysdeps/sparc/sparc32/sparcv9/Makefile [$(have-as-vis3) != yes]
(ASFLAGS.o, ASFLAGS-.os, ASFLAGS-.op, ASFLAGS-.oS): Remove rules.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
($(have-as-vis3) == yes): Remove conditional.
* sysdeps/sparc/sparc64/Makefile (($(have-as-vis3) == yes)):
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim-generic.c: New
file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdimf-generic.c: New
file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fma-generic.c: New
file.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmaf-generic.c: New
file.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceil-generic.c: New file.
* sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf-generic.c: New file.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floor-generic.c: New file.
* sysdeps/sparc/sparc64/fpu/multiarch/s_floorf-generic.c: New file.
* sysdeps/sparc/sparc64/fpu/multiarch/s_fma-generic.c: New file.
* sysdeps/sparc/sparc64/fpu/multiarch/s_fmaf-generic.c: New file.
* sysdeps/sparc/sparc64/fpu/multiarch/s_trunc-generic.c: New file.
* sysdeps/sparc/sparc64/fpu/multiarch/s_truncf-generic.c: New file.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
This patch refactor the SPARC64 ifunc selector to a C implementation.
No functional change is expected, including ifunc resolution rules.
Checked on sparc64-linux-gnu, sparcv9-linux-gnu and x86_64-linux-gnu.
* sysdeps/sparc/sparc32/sparcv9/multiarch/Makefile
[$(subdir) = string] (sysdep_routines): Add memset-ultra1.
* sysdeps/sparc/sparc64/multiarch/Makefile [$(subdir) = string]
(sysdep_routines): Add memset-ultra1.
* sysdeps/sparc/sparc32/sparcv9/multiarch/memset-ultra1.S: New
file.
* sysdeps/sparc/sparc32/sparcv9/multiarch/memset.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c: Likewise.
* sysdeps/sparc/sparc64/multiarch/ifunc-memset.h: Likewise.
* sysdeps/sparc/sparc64/multiarch/memset-ultra1.S: Likewise.
* sysdeps/sparc/sparc64/multiarch/memset.c: Likewise.
* sysdeps/sparc/sparc64/multiarch/bzero.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/multiarch/memset.S: Remove file.
* sysdeps/sparc/sparc64/multiarch/memset.S: Likewise.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
This patch refactor the SPARC64 ifunc selector to a C implementation.
The x86_64 implementation is used as default, which resulted in common
definitions (ifunc-init.h) used on both architectures. No functional
change is expected, including ifunc resolution rules.
Checked on sparc64-linux-gnu, sparcv9-linux-gnu and x86_64-linux-gnu.
* sysdeps/sparc/sparc32/sparcv9/multiarch/memcpy-ultra1.S: New
file.
* sysdeps/sparc/sparc32/sparcv9/multiarch/memcpy.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/multiarch/mempcpy.c: Likewise.
* sysdeps/sparc/sparc64/multiarch/ifunc-memcpy.h: Likewise.
* sysdeps/sparc/sparc64/multiarch/memcpy-ultra1.S: Likewise.
* sysdeps/sparc/sparc64/multiarch/memcpy.c: Likewise.
* sysdeps/sparc/sparc64/multiarch/mempcpy.c: Likewise.
* sysdeps/sparc/sparc-ifunc.h (sparc_libc_ifunc_redirected): New
macro.
* sysdeps/sparc/sparc32/sparcv9/multiarch/Makefile
[$(subdir) = string] (sysdep_routines): Add memcpy-ultra1.
* sysdeps/sparc/sparc64/multiarch/Makefile [$(subdir) = string]
(sysdep_routines): Add memcpy-ultra1.
* sysdeps/sparc/sparc64/multiarch/memcpy.S: Remove file.
* sysdeps/sparc/sparc32/sparcv9/multiarch/memcpy.S: Likewise.
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
32-bit SPARC libm should have compat symbols for copysignl
(GLIBC_2.0), fabsl (GLIBC_2.0), fmal (GLIBC_2.1), pointing to the
double functions; they were present in glibc 2.8, for example, but are
now missing, probably when optimized SPARC function implementations
were added without appropriate compat symbol handling. The same
applies to copysignl in libc. This patch restores those compat
symbols.
Tested with build-many-glibcs.py for sparcv9-linux-gnu.
[BZ #22229]
* sysdeps/sparc/sparc32/fpu/s_copysign.S: Include
<math_ldbl_opt.h>
(copysignl): Define as compat symbol at version GLIBC_2_0 for libm
and libc.
* sysdeps/sparc/sparc32/fpu/s_fabs.S: Include <math_ldbl_opt.h>.
(fabsl): Define as compat symbol at version GLIBC_2_0 for libm.
* sysdeps/sparc/sparc32/fpu/s_fma.c: Include <math_ldbl_opt.h>.
(fmal): Define as compat symbol at version GLIBC_2_1 for libm.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.S:
Include <math_ldbl_opt.h>
(copysignl): Define as compat symbol at version GLIBC_2_0 for libm
and libc.
(compat_symbol): Undefine and redefine.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fabs.S: Include
<math_ldbl_opt.h>
(fabsl): Define as compat symbol at version GLIBC_2_0 for libm.
(compat_symbol): Undefine and redefine.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fma.c
[HAVE_AS_VIS3_SUPPORT]: Include <math_ldbl_opt.h>.
[HAVE_AS_VIS3_SUPPORT] (fmal): Define as compat symbol at version
GLIBC_2_1 for libm.
* sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Add
GLIBC_2.0 copysignl symbol.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Add
GLIBC_2.0 copysignl and fabsl and GLIBC_2.1 fmal symbols.
|
|
Continuing the process of setting up common macros for libm function
aliases, with a view to using them to define _FloatN / _FloatNx
aliases in future, this patch adds a libm_alias_double macro and uses
it in the type-generic templates.
This macro handles defining aliases for double, and for long double in
the NO_LONG_DOUBLE case. It also handles defining compat symbols for
long double = double for architectures that changed their long double
format. By so doing, it eliminates the need for the
M_LIBM_NEED_COMPAT and declare_mgen_libm_compat macros; the single
declare_mgen_alias call in each template now suffices to define all
required compat symbols. When used for more double functions (not
based on type-generic templates), I expect it will eliminate the need
for most ldbl-opt wrappers for such functions.
A few special cases are needed. __clog10l is a public symbol (for
historical reasons) so needs to be given appropriate compat versions
for architectures that changed their long double format, but is not
defined as an alias using the normal macros since __clog10* are *not*
public symbols for _FloatN / _FloatNx types. For scalbn, scalbln and
log1p, the changes adding errno setting support for those functions
left compat symbols pointing directly to the non-errno-setting
implementations. There is no requirement for the compat symbols not
to set errno; that just made for the simplest patches at that time.
Now, with these common macros, it's natural to redirect the compat
symbols to the errno-setting wrappers, which I intend to do in a
separate patch.
Tested for x86_64, and with build-many-glibcs.py. For ldbl-opt
platforms the stripped libm.so binaries are changed (disassembly
unchanged) because the details of how the clog10l compat symbol is
created mean it ceases to be weak as it was before; for other
platforms, stripped libm.so binaries are unchanged.
2017-09-13 Joseph Myers <joseph@codesourcery.com>
* sysdeps/generic/libm-alias-double.h: New file.
* sysdeps/ieee754/ldbl-opt/libm-alias-double.h: Likewise.
* sysdeps/generic/math-type-macros-double.h: Include
<libm-alias-double.h>.
[declare_mgen_alias] (declare_mgen_alias): Define to use
libm_alias_double.
* sysdeps/generic/math-type-macros.h [!M_LIBM_NEED_COMPAT]
(M_LIBM_NEED_COMPAT): Remove macro.
[!M_LIBM_NEED_COMPAT] (declare_mgen_libm_compat): Likewise.
* sysdeps/ieee754/ldbl-opt/math-type-macros-double.h: Remove.
* math/cabs_template.c [M_LIBM_NEED_COMPAT]: Remove conditional
code.
* math/carg_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/cimag_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/conj_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/creal_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_cacos_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_cacosh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_casin_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_casinh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_catan_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_catanh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_ccos_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_ccosh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_cexp_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_clog10_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_clog_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_cpow_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_cproj_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_csin_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_csinh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_csqrt_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_ctan_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_ctanh_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_fdim_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_fmax_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_fmin_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/s_nan_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* math/w_ilogb_template.c [M_LIBM_NEED_COMPAT]: Likewise.
* sysdeps/ieee754/ldbl-opt/s_clog10.c: New file.
* sysdeps/ieee754/ldbl-opt/s_ldexp.c (M_LIBM_NEED_COMPAT): Remove
macro.
(declare_mgen_alias): New macro.
* sysdeps/ieee754/ldbl-opt/w_log1p.c: New file.
* sysdeps/ieee754/ldbl-opt/w_scalbln.c: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim-vis3.c
(M_LIBM_NEED_COMPAT): Remove macro.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fdim.c
[HAVE_AS_VIS3_SUPPORT]: Include <math_ldbl_opt.h> and
<first-versions.h>.
[HAVE_AS_VIS3_SUPPORT && LONG_DOUBLE_COMPAT (libm,
FIRST_VERSION_libm_fdiml)]: Define fdiml as compat symbol.
|
|
This patch removes the SPARC-specific wrappers for sqrt and sqrtf.
These wrappers, by adding architecture-specific uses of _LIB_VERSION
and __kernel_standard, unnecessarily complicate cleanups of libm error
handling. They also do not serve a useful optimization purpose. GCC
knows about sqrt as a built-in function, and can generate direct calls
to a hardware square root instruction, either on its own, in the
-fno-math-errno case, or together with an inline check for the
argument being negative and a call to the out-of-line sqrt function
for error handling only in that case (and has been able to do so for a
long time). Thus in practice the wrapper will only be called only in
the case of negative arguments, which is not a case it is useful to
optimize for.
The removal of the wrappers also uncovers, and fixes, an old bug.
32-bit SPARC libm used (checked with glibc 2.8 binaries) to have a
sqrtl compat symbol, version GLIBC_2.0, for old binaries when sqrtl
was an alias of sqrt (I don't have pre-glibc-2.4 binaries for SPARC to
hand to check for the sqrtl symbol in those). This disappeared,
probably with:
commit 8847f0377003fbfe9cbe951ce9f8717d74f26247
Author: David S. Miller <davem@davemloft.net>
Date: Tue Feb 28 22:37:58 2012 -0800
Add sparc optimized sqrt{,f}.
Removing the wrappers brings back the generic ldbl-opt logic for
creating such compat symbols, and so restores the compat symbol that
should be there. This could of course also be fixed in the wrappers -
but as noted above, the wrappers are optimizing a case it's not useful
to optimize, so the bug of the missing compat symbol serves to
illustrate the risks involved with the extra complexity of
architecture-specific function versions where not needed.
Tested with build-many-glibcs.py.
[BZ #21973]
* sysdeps/sparc/sparc32/fpu/w_sqrt_compat.S: Remove file.
* sysdeps/sparc/sparc32/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrt_compat-vis3.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrt_compat.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrtf_compat-vis3.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrtf_compat.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrt_compat.S : Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/sparc/sparc64/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/sparc/sparc64/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Add
GLIBC_2.0 sqrtl symbol.
|
|
This patch obsoletes support for SVID libm error handling (the system
where a user-defined function matherr is called on a libm function
error; only enabled if you also set _LIB_VERSION = _SVID_ or
_LIB_VERSION = _XOPEN_) and the use of the _LIB_VERSION global
variable to control libm error handling. matherr and _LIB_VERSION are
made into compat symbols, not supported for new ports or for static
linking. The libieee.a object file (which sets _LIB_VERSION = _IEEE_,
so disabling errno setting for some functions) is also removed, and
all the related definitions are removed from math.h.
The manual already recommends against using matherr, and it's already
not supported for _Float128 functions (those use new wrappers that
don't support matherr, only errno) - this patch means that it becomes
possible to e.g. add sinf32 as an alias to sinf without that resulting
in undesired matherr support in sinf32 for existing glibc ports.
matherr support is not part of any standard supported by glibc (it was
removed in XPG4).
Because matherr is a function to be defined by the user, of course
user programs defining such a function will still continue to link; it
just quietly won't be used. If they try to write to the library's
copy of _LIB_VERSION to enable SVID error handling, however, they will
get a link error (but if they define their own _LIB_VERSION variable,
they won't).
I expect the most likely case of build failures from this patch to be
programs with unconditional cargo-culted uses of -lieee (based on a
notion of "I want IEEE floating point", not any actual requirement for
that library).
Ideally, the new-port-or-static-linking case would use the new
wrappers used for _Float128. This is not implemented in this patch,
because of the complication of architecture-specific (powerpc32 and
sparc) sqrt wrappers that use _LIB_VERSION and __kernel_standard
directly. Thus, the old wrappers and __kernel_standard are still
built unconditionally, and _LIB_VERSION still exists in static libm.
But when the old wrappers and __kernel_standard are built in the
non-compat case, _LIB_VERSION and matherr are defined as macros so
code to support those features isn't actually built into static libm
or new ports' shared libm after this patch.
I intend to move to the new wrappers for static libm and new ports in
followup patches. I believe the sqrt wrappers for powerpc32 and sparc
can reasonably be removed. GCC already optimizes the normal case of
sqrt by generating code that uses a hardware instruction and only
calls the sqrt function if the argument was negative (if
-fno-math-errno, of course, it just uses the hardware instruction
without any check for negative argument being needed). Thus those
wrappers will only actually get called in the case of negative
arguments, which is not a case it makes sense to optimize for. But
even without removing the powerpc32 and sparc wrappers it should still
be possible to move to the new wrappers for static libm and new ports,
just without having those dubious architecture-specific optimizations
in static libm.
Everything said about matherr equally applies to matherrf and matherrl
(IA64-specific, undocumented), except that the structure of IA64 libm
means it won't be converted to using the new wrappers (it doesn't use
the old ones either, but its own error-handling code instead).
As with other tests of compat symbols, I expect test-matherr and
test-matherr-2 to need to become appropriately conditional once we
have a system for disabling such tests for ports too new to have the
relevant symbols.
Tested for x86_64 and x86, and with build-many-glibcs.py.
* math/math.h [__USE_MISC] (_LIB_VERSION_TYPE): Remove.
[__USE_MISC] (_LIB_VERSION): Likewise.
[__USE_MISC] (struct exception): Likewise.
[__USE_MISC] (matherr): Likewise.
[__USE_MISC] (DOMAIN): Likewise.
[__USE_MISC] (SING): Likewise.
[__USE_MISC] (OVERFLOW): Likewise.
[__USE_MISC] (UNDERFLOW): Likewise.
[__USE_MISC] (TLOSS): Likewise.
[__USE_MISC] (PLOSS): Likewise.
[__USE_MISC] (HUGE): Likewise.
[__USE_XOPEN] (MAXFLOAT): Define even if [__USE_MISC].
* math/math-svid-compat.h: New file.
* conform/linknamespace.pl (@whitelist): Remove matherr, matherrf
and matherrl.
* include/math.h [!_ISOMAC] (__matherr): Remove.
* manual/arith.texi (FP Exceptions): Do not document matherr.
* math/Makefile (tests): Change test-matherr to test-matherr-3.
(tests-internal): New variable.
(install-lib): Do not add libieee.a.
(non-lib.a): Likewise.
(extra-objs): Do not add libieee.a and ieee-math.o.
(CPPFLAGS-s_lib_version.c): Remove variable.
($(objpfx)libieee.a): Remove rule.
($(addprefix $(objpfx), $(tests-internal)): Depend on $(libm).
* math/ieee-math.c: Remove.
* math/libm-test-support.c (matherr): Remove.
* math/test-matherr.c: Use <support/test-driver.c>. Add copyright
and license notices. Include <math-svid-compat.h> and
<shlib-compat.h>.
(matherr): Undefine as macro. Use compat_symbol_reference.
(_LIB_VERSION): Likewise.
* math/test-matherr-2.c: New file.
* math/test-matherr-3.c: Likewise.
* sysdeps/generic/math_private.h (__kernel_standard): Remove
declaration.
(__kernel_standard_f): Likewise.
(__kernel_standard_l): Likewise.
* sysdeps/ieee754/s_lib_version.c: Do not include <math.h> or
<math_private.h>. Include <math-svid-compat.h>.
(_LIB_VERSION): Undefine as macro.
(_LIB_VERSION_INTERNAL): Always initialize to _POSIX_. Define
only if [LIBM_SVID_COMPAT || !defined SHARED]. If
[LIBM_SVID_COMPAT], use compat_symbol.
* sysdeps/ieee754/s_matherr.c: Do not include <math.h> or
<math_private.h>. Include <math-svid-compat.h>.
(matherr): Undefine as macro.
(__matherr): Define only if [LIBM_SVID_COMPAT]. Use
compat_symbol.
* sysdeps/ia64/fpu/libm_error.c: Include <math-svid-compat.h>.
[_LIBC && LIBM_SVID_COMPAT] (matherrf): Use
compat_symbol_reference.
[_LIBC && LIBM_SVID_COMPAT] (matherrl): Likewise.
[_LIBC && !LIBM_SVID_COMPAT] (matherrf): Define as macro.
[_LIBC && !LIBM_SVID_COMPAT] (matherrl): Likewise.
* sysdeps/ia64/fpu/libm_support.h: Include <math-svid-compat.h>.
(MATHERR_D): Remove declaration.
[!_LIBC] (_LIB_VERSION_TYPE): Likewise
[!LIBM_BUILD] (_LIB_VERSIONIMF): Likewise.
[LIBM_BUILD] (pmatherrf): Likewise.
[LIBM_BUILD] (pmatherr): Likewise.
[LIBM_BUILD] (pmatherrl): Likewise.
(DOMAIN): Likewise.
(SING): Likewise.
(OVERFLOW): Likewise.
(UNDERFLOW): Likewise.
(TLOSS): Likewise.
(PLOSS): Likewise.
* sysdeps/ia64/fpu/s_matherrf.c: Include <math-svid-compat.h>.
(__matherrf): Define only if [LIBM_SVID_COMPAT]. Use
compat_symbol.
* sysdeps/ia64/fpu/s_matherrl.c: Include <math-svid-compat.h>.
(__matherrl): Define only if [LIBM_SVID_COMPAT]. Use
compat_symbol.
* math/lgamma-compat.h: Include <math-svid-compat.h>.
* math/w_acos_compat.c: Likewise.
* math/w_acosf_compat.c: Likewise.
* math/w_acosh_compat.c: Likewise.
* math/w_acoshf_compat.c: Likewise.
* math/w_acoshl_compat.c: Likewise.
* math/w_acosl_compat.c: Likewise.
* math/w_asin_compat.c: Likewise.
* math/w_asinf_compat.c: Likewise.
* math/w_asinl_compat.c: Likewise.
* math/w_atan2_compat.c: Likewise.
* math/w_atan2f_compat.c: Likewise.
* math/w_atan2l_compat.c: Likewise.
* math/w_atanh_compat.c: Likewise.
* math/w_atanhf_compat.c: Likewise.
* math/w_atanhl_compat.c: Likewise.
* math/w_cosh_compat.c: Likewise.
* math/w_coshf_compat.c: Likewise.
* math/w_coshl_compat.c: Likewise.
* math/w_exp10_compat.c: Likewise.
* math/w_exp10f_compat.c: Likewise.
* math/w_exp10l_compat.c: Likewise.
* math/w_exp2_compat.c: Likewise.
* math/w_exp2f_compat.c: Likewise.
* math/w_exp2l_compat.c: Likewise.
* math/w_fmod_compat.c: Likewise.
* math/w_fmodf_compat.c: Likewise.
* math/w_fmodl_compat.c: Likewise.
* math/w_hypot_compat.c: Likewise.
* math/w_hypotf_compat.c: Likewise.
* math/w_hypotl_compat.c: Likewise.
* math/w_j0_compat.c: Likewise.
* math/w_j0f_compat.c: Likewise.
* math/w_j0l_compat.c: Likewise.
* math/w_j1_compat.c: Likewise.
* math/w_j1f_compat.c: Likewise.
* math/w_j1l_compat.c: Likewise.
* math/w_jn_compat.c: Likewise.
* math/w_jnf_compat.c: Likewise.
* math/w_jnl_compat.c: Likewise.
* math/w_lgamma_main.c: Likewise.
* math/w_lgamma_r_compat.c: Likewise.
* math/w_lgammaf_main.c: Likewise.
* math/w_lgammaf_r_compat.c: Likewise.
* math/w_lgammal_main.c: Likewise.
* math/w_lgammal_r_compat.c: Likewise.
* math/w_log10_compat.c: Likewise.
* math/w_log10f_compat.c: Likewise.
* math/w_log10l_compat.c: Likewise.
* math/w_log2_compat.c: Likewise.
* math/w_log2f_compat.c: Likewise.
* math/w_log2l_compat.c: Likewise.
* math/w_log_compat.c: Likewise.
* math/w_logf_compat.c: Likewise.
* math/w_logl_compat.c: Likewise.
* math/w_pow_compat.c: Likewise.
* math/w_powf_compat.c: Likewise.
* math/w_powl_compat.c: Likewise.
* math/w_remainder_compat.c: Likewise.
* math/w_remainderf_compat.c: Likewise.
* math/w_remainderl_compat.c: Likewise.
* math/w_scalb_compat.c: Likewise.
* math/w_scalbf_compat.c: Likewise.
* math/w_scalbl_compat.c: Likewise.
* math/w_sinh_compat.c: Likewise.
* math/w_sinhf_compat.c: Likewise.
* math/w_sinhl_compat.c: Likewise.
* math/w_sqrt_compat.c: Likewise.
* math/w_sqrtf_compat.c: Likewise.
* math/w_sqrtl_compat.c: Likewise.
* math/w_tgamma_compat.c: Likewise.
* math/w_tgammaf_compat.c: Likewise.
* math/w_tgammal_compat.c: Likewise.
* sysdeps/ieee754/dbl-64/w_exp_compat.c: Likewise.
* sysdeps/ieee754/flt-32/w_expf_compat.c: Likewise.
* sysdeps/ieee754/k_standard.c: Likewise.
* sysdeps/ieee754/k_standardf.c: Likewise.
* sysdeps/ieee754/k_standardl.c: Likewise.
* sysdeps/ieee754/ldbl-128/w_expl_compat.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/w_expl_compat.c: Likewise.
* sysdeps/ieee754/ldbl-96/w_expl_compat.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/powerpc/powerpc32/power4/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/powerpc/powerpc32/power5/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/powerpc/powerpc32/power5/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/sparc/sparc32/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/sparc/sparc32/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrt_compat-vis3.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrtf_compat-vis3.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrtf_compat.S: Likewise.
* sysdeps/sparc/sparc64/fpu/w_sqrt_compat.S: Likewise.
* sysdeps/sparc/sparc64/fpu/w_sqrtf_compat.S: Likewise.
|
|
This patch optimizes the generic spinlock code.
The type pthread_spinlock_t is a typedef to volatile int on all archs.
Passing a volatile pointer to the atomic macros which are not mapped to the
C11 atomic builtins can lead to extra stores and loads to stack if such
a macro creates a temporary variable by using "__typeof (*(mem)) tmp;".
Thus, those macros which are used by spinlock code - atomic_exchange_acquire,
atomic_load_relaxed, atomic_compare_exchange_weak - have to be adjusted.
According to the comment from Szabolcs Nagy, the type of a cast expression is
unqualified (see http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_423.htm):
__typeof ((__typeof (*(mem)) *(mem)) tmp;
Thus from spinlock perspective the variable tmp is of type int instead of
type volatile int. This patch adjusts those macros in include/atomic.h.
With this construct GCC >= 5 omits the extra stores and loads.
The atomic macros are replaced by the C11 like atomic macros and thus
the code is aligned to it. The pthread_spin_unlock implementation is now
using release memory order instead of sequentially consistent memory order.
The issue with passed volatile int pointers applies to the C11 like atomic
macros as well as the ones used before.
I've added a glibc_likely hint to the first atomic exchange in
pthread_spin_lock in order to return immediately to the caller if the lock is
free. Without the hint, there is an additional jump if the lock is free.
I've added the atomic_spin_nop macro within the loop of plain reads.
The plain reads are also realized by C11 like atomic_load_relaxed macro.
The new define ATOMIC_EXCHANGE_USES_CAS determines if the first try to acquire
the spinlock in pthread_spin_lock or pthread_spin_trylock is an exchange
or a CAS. This is defined in atomic-machine.h for all architectures.
The define SPIN_LOCK_READS_BETWEEN_CMPXCHG is now removed.
There is no technical reason for throwing in a CAS every now and then,
and so far we have no evidence that it can improve performance.
If that would be the case, we have to adjust other spin-waiting loops
elsewhere, too! Using a CAS loop without plain reads is not a good idea
on many targets and wasn't used by one. Thus there is now no option to
do so.
Architectures are now using the generic spinlock automatically if they
do not provide an own implementation. Thus the pthread_spin_lock.c files
in sysdeps folder are deleted.
ChangeLog:
* NEWS: Mention new spinlock implementation.
* include/atomic.h:
(__atomic_val_bysize): Cast type to omit volatile qualifier.
(atomic_exchange_acq): Likewise.
(atomic_load_relaxed): Likewise.
(ATOMIC_EXCHANGE_USES_CAS): Check definition.
* nptl/pthread_spin_init.c (pthread_spin_init):
Use atomic_store_relaxed.
* nptl/pthread_spin_lock.c (pthread_spin_lock):
Use C11-like atomic macros.
* nptl/pthread_spin_trylock.c (pthread_spin_trylock):
Likewise.
* nptl/pthread_spin_unlock.c (pthread_spin_unlock):
Use atomic_store_release.
* sysdeps/aarch64/nptl/pthread_spin_lock.c: Delete File.
* sysdeps/arm/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/hppa/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/m68k/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/microblaze/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/mips/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/nios2/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/aarch64/atomic-machine.h (ATOMIC_EXCHANGE_USES_CAS): Define.
* sysdeps/alpha/atomic-machine.h: Likewise.
* sysdeps/arm/atomic-machine.h: Likewise.
* sysdeps/i386/atomic-machine.h: Likewise.
* sysdeps/ia64/atomic-machine.h: Likewise.
* sysdeps/m68k/coldfire/atomic-machine.h: Likewise.
* sysdeps/m68k/m680x0/m68020/atomic-machine.h: Likewise.
* sysdeps/microblaze/atomic-machine.h: Likewise.
* sysdeps/mips/atomic-machine.h: Likewise.
* sysdeps/powerpc/powerpc32/atomic-machine.h: Likewise.
* sysdeps/powerpc/powerpc64/atomic-machine.h: Likewise.
* sysdeps/s390/atomic-machine.h: Likewise.
* sysdeps/sparc/sparc32/atomic-machine.h: Likewise.
* sysdeps/sparc/sparc32/sparcv9/atomic-machine.h: Likewise.
* sysdeps/sparc/sparc64/atomic-machine.h: Likewise.
* sysdeps/tile/tilegx/atomic-machine.h: Likewise.
* sysdeps/tile/tilepro/atomic-machine.h: Likewise.
* sysdeps/unix/sysv/linux/hppa/atomic-machine.h: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h: Likewise.
* sysdeps/unix/sysv/linux/nios2/atomic-machine.h: Likewise.
* sysdeps/unix/sysv/linux/sh/atomic-machine.h: Likewise.
* sysdeps/x86_64/atomic-machine.h: Likewise.
|
|
With the removal of divdi3 object from sparcv9-linux-gnu build, its
definition came from libgcc and its functions internall calls .udiv.
Since glibc also exports these symbols for compatibility reasons, it
will end up creating PLT calls internally in libc.so.
To avoid it, this patch uses the linker option --wrap to replace all
the internal libc.so .udiv calls to the wrapper __wrap_.udiv. Along
with strong alias in the udiv implementations, it makes linker do
local calls.
Checked on sparcv9-linux-gnu.
* sysdeps/sparc/sparc32/Makefile (libc.so-gnulib): New rule.
* sysdeps/sparc/sparc32/sparcv8/udiv.S (.udiv): Make a strong_alias
to __wrap_.udiv.
* sysdeps/sparc/sparc32/sparcv9/udiv.S (.udiv): Likewise.
* sysdeps/sparc/sparc32/udiv.S (.udiv): Likewise.
|
|
famx{,f}/fmin{,f} and 32-bit lrint cause math testsuite failures
either because they generate incorrect results or they fail to signal
the proper exceptions.
* sysdeps/sparc/sparc64/fpu/multiarch/s_fmax-vis3.S: Remove file.
* sysdeps/sparc/sparc64/fpu/multiarch/s_fmax.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_fmaxf-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_fmaxf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_fmin-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_fmin.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_fminf-vis3.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/s_fminf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/multiarch/Makefile
(libm-sysdep_routines): Update.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_fmax.S: Remove file.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_fmaxf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_fmin.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_fminf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_lrint.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_fmax.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_fmaxf.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_fmin.S: Likewise.
* sysdeps/sparc/sparc64/fpu/s_fminf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmax-vis3.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmax.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmaxf-vis3.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmaxf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmin-vis3.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fmin.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fminf-vis3.S:
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_fminf.S: Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/Makefile
(libm-sysdep_routines): Update.
|
|
This commit moves one step towards the deprecation of wrappers that
use _LIB_VERSION / matherr / __kernel_standard functionality, by
adding the suffix '_compat' to their filenames and adjusting Makefiles
and #includes accordingly.
New template wrappers that do not use such functionality will be added
by future patches and will be first used by the float128 wrappers.
|
|
|
|
This was used by --enable-omitfp, and the bulk of it was removed in this
commit:
commit bdeba1354b7364d9b7857a048286a71ddbcdff86
Author: Ulrich Drepper <drepper@gmail.com>
Date: Sat Jan 7 11:29:31 2012 -0500
Remove --enable-omitfp support
|
|
Current sparc32 sem_init and default one only differ on sem.newsem.pad
initialization. This patch removes sparc32 and sparc32v9 sem_init arch
specific implementation and set sparc32 to use nptl default one.
The default implementation sets the required sem.newsem.pad to 0 (which
is ununsed in other architectures).
I checked on i686 and a sparc32v9 build.
* nptl/sem_init.c (sem_init): Init pad value to 0.
* sysdeps/sparc/sparc32/sem_init.c: Remove file.
* sysdeps/sparc/sparc32/sparcv9/sem_init.c: Likewise.
|
|
This patch removes the sparc32 sem_wait.c implementation since it is
identical to default nptl one. The sparcv9 is no longer required with
the removal.
Checked with a sparcv9 build.
* sysdeps/sparc/sparc32/sem_wait.c: Remove file.
* sysdeps/sparc/sparc32/sparcv9/sem_wait.c: Likewise.
|
|
Current sparc32 sem_open and default one only differ on:
1. Default one contains a 'futex_supports_pshared' check.
2. sem.newsem.pad is initialized to zero.
This patch removes sparc32 and sparc32v9 sem_open arch specific
implementation and instead set sparc32 to use nptl default one.
Using 1. is fine since it should always evaluate 0 for Linux
(an optimized away by the compiler). Adding 2. to default
implementation should be ok since 'pad' field is used mainly
on sparc32 code.
I checked on i686 and checked a sparc32v9 build.
* nptl/sem_open.c (sem_open): Init pad value to 0.
* sysdeps/sparc/sparc32/sem_open.c: Remove file.
* sysdeps/sparc/sparc32/sparcv9/sem_open.c: Likewise.
|
|
The only difference is the usage of math_narrow_eval when
building s_fdiml.c. This should be harmless for long double,
but I did observe some code generation changes on m68k, but
lack the resources to test it.
Likewise, to more easily support overriding symbol generation,
the aliasing macros are always conditionally defined on their
absence to reduce boilerplate.
I also ran builds for i486, ppc64, sparcv9, aarch64,
s390x and observed no changes to s_fdim* objects.
|
|
Use s_fdim.c from sysdeps/ieee754/ldbl-opt/ instead of
math/ to ensure a compat symbol for fdiml is created.
|