aboutsummaryrefslogtreecommitdiff
path: root/sysdeps/tile/tilegx
AgeCommit message (Collapse)Author
2017-12-20Simplify tilegx sysdeps folderAdhemerval Zanella
With tilepro support removal we can now simplify internal tile support by moving the directory structure to avoid the unnecessary directory levels in tile/tilegx both on generic and linux folders. Checked with a build for tilegx-linux-gnu and tilegx-linux-gnu-32 with and without the patch, there is no difference in generated binary with a dissassemble. * stdlib/bug-getcontext.c (do_test): Remove tilepro mention in comment. * sysdeps/tile/preconfigure: Remove tilegx folder. * sysdeps/tile/tilegx/Implies: Move definitions to ... * sysdeps/tile/Implies: ... here. * sysdeps/tile/tilegx/Makefile: Move rules to ... * sysdeps/tile/Makefile: ... here. * sysdeps/tile/tilegx/atomic-machine.h: Move definitions to ... * sysdeps/tile/atomic-machine.h: ... here. Add include guards. * sysdeps/tile/tilegx/bits/wordsize.h: Move to ... * sysdeps/tile/bits/wordsize.h: ... here. * sysdeps/tile/tilegx/*: Move to ... * sysdeps/tile/*: ... here. * sysdeps/tile/tilegx/tilegx32/Implies: Move to ... * sysdeps/tile/tilegx32/Implies: ... here. * sysdeps/tile/tilegx/tilegx64/Implies: Move to ... * sysdeps/tile/tilegx64/Implies: ... here. * sysdeps/unix/sysv/linux/tile/tilegx/Makefile: Move definitions to ... * sysdeps/unix/sysv/linux/tile/Makefile: ... here. * sysdeps/unix/sysv/linux/tile/tilegx/*: Move to ... * sysdeps/unix/sysv/linux/tile/*: ... here. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/*: Move to ... * sysdeps/unix/sysv/linux/tile/tilegx32/*: ... here. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/*: Move to ... * sysdeps/unix/sysv/linux/tile/tilegx64/*: ... here.
2017-12-05tilegx: tag __insn_OP builtin issue with gcc bugzilla #Chris Metcalf
2017-12-05tilegx: work around vector insn bug in gccChris Metcalf
Avoid an issue in gcc where some of the vector (aka SIMD) ops will sometimes end up getting wrongly optimized out. We use these instructions in many of the string implementations. If/when we have an upstreamed fix for this problem in gcc we can conditionalize the use of the extended assembly workaround in glibc.
2017-06-06Optimize generic spinlock code and use C11 like atomic macros.Stefan Liebler
This patch optimizes the generic spinlock code. The type pthread_spinlock_t is a typedef to volatile int on all archs. Passing a volatile pointer to the atomic macros which are not mapped to the C11 atomic builtins can lead to extra stores and loads to stack if such a macro creates a temporary variable by using "__typeof (*(mem)) tmp;". Thus, those macros which are used by spinlock code - atomic_exchange_acquire, atomic_load_relaxed, atomic_compare_exchange_weak - have to be adjusted. According to the comment from Szabolcs Nagy, the type of a cast expression is unqualified (see http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_423.htm): __typeof ((__typeof (*(mem)) *(mem)) tmp; Thus from spinlock perspective the variable tmp is of type int instead of type volatile int. This patch adjusts those macros in include/atomic.h. With this construct GCC >= 5 omits the extra stores and loads. The atomic macros are replaced by the C11 like atomic macros and thus the code is aligned to it. The pthread_spin_unlock implementation is now using release memory order instead of sequentially consistent memory order. The issue with passed volatile int pointers applies to the C11 like atomic macros as well as the ones used before. I've added a glibc_likely hint to the first atomic exchange in pthread_spin_lock in order to return immediately to the caller if the lock is free. Without the hint, there is an additional jump if the lock is free. I've added the atomic_spin_nop macro within the loop of plain reads. The plain reads are also realized by C11 like atomic_load_relaxed macro. The new define ATOMIC_EXCHANGE_USES_CAS determines if the first try to acquire the spinlock in pthread_spin_lock or pthread_spin_trylock is an exchange or a CAS. This is defined in atomic-machine.h for all architectures. The define SPIN_LOCK_READS_BETWEEN_CMPXCHG is now removed. There is no technical reason for throwing in a CAS every now and then, and so far we have no evidence that it can improve performance. If that would be the case, we have to adjust other spin-waiting loops elsewhere, too! Using a CAS loop without plain reads is not a good idea on many targets and wasn't used by one. Thus there is now no option to do so. Architectures are now using the generic spinlock automatically if they do not provide an own implementation. Thus the pthread_spin_lock.c files in sysdeps folder are deleted. ChangeLog: * NEWS: Mention new spinlock implementation. * include/atomic.h: (__atomic_val_bysize): Cast type to omit volatile qualifier. (atomic_exchange_acq): Likewise. (atomic_load_relaxed): Likewise. (ATOMIC_EXCHANGE_USES_CAS): Check definition. * nptl/pthread_spin_init.c (pthread_spin_init): Use atomic_store_relaxed. * nptl/pthread_spin_lock.c (pthread_spin_lock): Use C11-like atomic macros. * nptl/pthread_spin_trylock.c (pthread_spin_trylock): Likewise. * nptl/pthread_spin_unlock.c (pthread_spin_unlock): Use atomic_store_release. * sysdeps/aarch64/nptl/pthread_spin_lock.c: Delete File. * sysdeps/arm/nptl/pthread_spin_lock.c: Likewise. * sysdeps/hppa/nptl/pthread_spin_lock.c: Likewise. * sysdeps/m68k/nptl/pthread_spin_lock.c: Likewise. * sysdeps/microblaze/nptl/pthread_spin_lock.c: Likewise. * sysdeps/mips/nptl/pthread_spin_lock.c: Likewise. * sysdeps/nios2/nptl/pthread_spin_lock.c: Likewise. * sysdeps/aarch64/atomic-machine.h (ATOMIC_EXCHANGE_USES_CAS): Define. * sysdeps/alpha/atomic-machine.h: Likewise. * sysdeps/arm/atomic-machine.h: Likewise. * sysdeps/i386/atomic-machine.h: Likewise. * sysdeps/ia64/atomic-machine.h: Likewise. * sysdeps/m68k/coldfire/atomic-machine.h: Likewise. * sysdeps/m68k/m680x0/m68020/atomic-machine.h: Likewise. * sysdeps/microblaze/atomic-machine.h: Likewise. * sysdeps/mips/atomic-machine.h: Likewise. * sysdeps/powerpc/powerpc32/atomic-machine.h: Likewise. * sysdeps/powerpc/powerpc64/atomic-machine.h: Likewise. * sysdeps/s390/atomic-machine.h: Likewise. * sysdeps/sparc/sparc32/atomic-machine.h: Likewise. * sysdeps/sparc/sparc32/sparcv9/atomic-machine.h: Likewise. * sysdeps/sparc/sparc64/atomic-machine.h: Likewise. * sysdeps/tile/tilegx/atomic-machine.h: Likewise. * sysdeps/tile/tilepro/atomic-machine.h: Likewise. * sysdeps/unix/sysv/linux/hppa/atomic-machine.h: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h: Likewise. * sysdeps/unix/sysv/linux/nios2/atomic-machine.h: Likewise. * sysdeps/unix/sysv/linux/sh/atomic-machine.h: Likewise. * sysdeps/x86_64/atomic-machine.h: Likewise.
2017-01-16tile: Check for pointer add overflow in memchrChris Metcalf
As was done in b224637928e9, check for large size causing an overflow in the loop that walks over the array. Branching out of line here is the fastest approach for handling this problem, since tile can bundle the instructions to compute the branch test in parallel with doing the required memchr loop setup computation. Unfortunately, the existing saturated ops (e.g. tilegx addxsc) are all signed saturing ops, so don't help with unsigned saturation.
2017-01-01Update copyright dates with scripts/update-copyrights.Joseph Myers
2016-11-04Define wordsize.h macros everywhereSteve Ellcey
* bits/wordsize.h: Add documentation. * sysdeps/aarch64/bits/wordsize.h : New file * sysdeps/generic/stdint.h (PTRDIFF_MIN, PTRDIFF_MAX): Update definitions. (SIZE_MAX): Change ifdef to if in __WORDSIZE32_SIZE_ULONG check. * sysdeps/gnu/bits/utmp.h (__WORDSIZE_TIME64_COMPAT32): Check with #if instead of #ifdef. * sysdeps/gnu/bits/utmpx.h (__WORDSIZE_TIME64_COMPAT32): Ditto. * sysdeps/mips/bits/wordsize.h (__WORDSIZE32_SIZE_ULONG, __WORDSIZE32_PTRDIFF_LONG, __WORDSIZE_TIME64_COMPAT32): Add or change defines. * sysdeps/powerpc/powerpc32/bits/wordsize.h: Likewise. * sysdeps/powerpc/powerpc64/bits/wordsize.h: Likewise. * sysdeps/s390/s390-32/bits/wordsize.h: Likewise. * sysdeps/s390/s390-64/bits/wordsize.h: Likewise. * sysdeps/sparc/sparc32/bits/wordsize.h: Likewise. * sysdeps/sparc/sparc64/bits/wordsize.h: Likewise. * sysdeps/tile/tilegx/bits/wordsize.h: Likewise. * sysdeps/tile/tilepro/bits/wordsize.h: Likewise. * sysdeps/unix/sysv/linux/alpha/bits/wordsize.h: Likewise. * sysdeps/unix/sysv/linux/powerpc/bits/wordsize.h: Likewise. * sysdeps/unix/sysv/linux/sparc/bits/wordsize.h: Likewise. * sysdeps/wordsize-32/bits/wordsize.h: Likewise. * sysdeps/wordsize-64/bits/wordsize.h: Likewise. * sysdeps/x86/bits/wordsize.h: Likewise.
2016-01-04Update copyright dates with scripts/update-copyrights.Joseph Myers
2015-09-11Move bits/atomic.h to atomic-machine.h (bug 14912).Joseph Myers
It was noted in <https://sourceware.org/ml/libc-alpha/2012-09/msg00305.html> that the bits/*.h naming scheme should only be used for installed headers. This patch renames bits/atomic.h to atomic-machine.h to follow that convention. This is the only change in this series that needs to change the filename rather than simply removing a directory level (because both atomic.h and bits/atomic.h exist at present). Tested for x86_64 (testsuite, and that installed stripped shared libraries are unchanged by the patch). [BZ #14912] * sysdeps/aarch64/bits/atomic.h: Move to ... * sysdeps/aarch64/atomic-machine.h: ...here. (_AARCH64_BITS_ATOMIC_H): Rename macro to _AARCH64_ATOMIC_MACHINE_H. * sysdeps/alpha/bits/atomic.h: Move to ... * sysdeps/alpha/atomic-machine.h: ...here. * sysdeps/arm/bits/atomic.h: Move to ... * sysdeps/arm/atomic-machine.h: ...here. Update comments. * bits/atomic.h: Move to ... * sysdeps/generic/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/i386/bits/atomic.h: Move to ... * sysdeps/i386/atomic-machine.h: ...here. * sysdeps/ia64/bits/atomic.h: Move to ... * sysdeps/ia64/atomic-machine.h: ...here. * sysdeps/m68k/coldfire/bits/atomic.h: Move to ... * sysdeps/m68k/coldfire/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/m68k/m680x0/m68020/bits/atomic.h: Move to ... * sysdeps/m68k/m680x0/m68020/atomic-machine.h: ...here. * sysdeps/microblaze/bits/atomic.h: Move to ... * sysdeps/microblaze/atomic-machine.h: ...here. * sysdeps/mips/bits/atomic.h: Move to ... * sysdeps/mips/atomic-machine.h: ...here. (_MIPS_BITS_ATOMIC_H): Rename macro to _MIPS_ATOMIC_MACHINE_H. * sysdeps/powerpc/bits/atomic.h: Move to ... * sysdeps/powerpc/atomic-machine.h: ...here. Update comments. * sysdeps/powerpc/powerpc32/bits/atomic.h: Move to ... * sysdeps/powerpc/powerpc32/atomic-machine.h: ...here. Update comments. Include <atomic-machine.h> instead of <bits/atomic.h>. * sysdeps/powerpc/powerpc64/bits/atomic.h: Move to ... * sysdeps/powerpc/powerpc64/atomic-machine.h: ...here. Include <atomic-machine.h> instead of <bits/atomic.h>. * sysdeps/s390/bits/atomic.h: Move to ... * sysdeps/s390/atomic-machine.h: ...here. * sysdeps/sparc/sparc32/bits/atomic.h: Move to ... * sysdeps/sparc/sparc32/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/sparc/sparc32/sparcv9/bits/atomic.h: Move to ... * sysdeps/sparc/sparc32/sparcv9/atomic-machine.h: ...here. * sysdeps/sparc/sparc64/bits/atomic.h: Move to ... * sysdeps/sparc/sparc64/atomic-machine.h: ...here. * sysdeps/tile/bits/atomic.h: Move to ... * sysdeps/tile/atomic-machine.h: ...here. * sysdeps/tile/tilegx/bits/atomic.h: Move to ... * sysdeps/tile/tilegx/atomic-machine.h: ...here. Include <sysdeps/tile/atomic-machine.h> instead of <sysdeps/tile/bits/atomic.h>. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/tile/tilepro/bits/atomic.h: Move to ... * sysdeps/tile/tilepro/atomic-machine.h: ...here. Include <sysdeps/tile/atomic-machine.h> instead of <sysdeps/tile/bits/atomic.h>. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/unix/sysv/linux/arm/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/arm/atomic-machine.h: ...here. Include <sysdeps/arm/atomic-machine.h> instead of <sysdeps/arm/bits/atomic.h>. * sysdeps/unix/sysv/linux/hppa/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/hppa/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/unix/sysv/linux/m68k/coldfire/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h: ...here. (_BITS_ATOMIC_H): Rename macro to _ATOMIC_MACHINE_H. * sysdeps/unix/sysv/linux/nios2/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/nios2/atomic-machine.h: ...here. (_NIOS2_BITS_ATOMIC_H): Rename macro to _NIOS2_ATOMIC_MACHINE_H. * sysdeps/unix/sysv/linux/sh/bits/atomic.h: Move to ... * sysdeps/unix/sysv/linux/sh/atomic-machine.h: ...here. * sysdeps/x86_64/bits/atomic.h: Move to ... * sysdeps/x86_64/atomic-machine.h: ...here. * include/atomic.h: Include <atomic-machine.h> instead of <bits/atomic.h>.
2015-06-02Use libc_hidden_proto / libc_hidden_def with __strnlen.Joseph Myers
Various code in glibc uses __strnlen instead of strnlen for namespace reasons. However, __strnlen does not use libc_hidden_proto / libc_hidden_def (as is normally done for any function defined and called within the same library, whether or not exported from the library and whatever namespace it is in), so the compiler does not know that those calls are to a function within libc. This patch uses libc_hidden_proto / libc_hidden_def with __strnlen. On x86_64, it makes no difference to the installed stripped shared libraries. On 32-bit x86, it causes __strnlen calls to go to the same place as strnlen calls (the fallback strnlen implementation), rather than through a PLT entry for the strnlen IFUNC; I'm not sure of the logic behind when calls from within libc should use IFUNCs versus when they should go direct to a particular function implementation, but clearly it doesn't make sense for strnlen and __strnlen to be handled differently in this regard. Tested for x86_64 and x86 (testsuite, and comparison of installed shared libraries as described above). * string/strnlen.c [!STRNLEN] (__strnlen): Use libc_hidden_def. * include/string.h (__strnlen): Use libc_hidden_proto. * sysdeps/aarch64/strnlen.S (__strnlen): Use libc_hidden_def. * sysdeps/i386/i686/multiarch/strnlen-c.c [SHARED] (libc_hidden_def): Define __GI___strnlen as well as __GI_strnlen. * sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-power7.S (libc_hidden_def): Undefine and redefine. * sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c [SHARED] (libc_hidden_def): Define __GI___strnlen as well as __GI_strnlen. * sysdeps/powerpc/powerpc32/power7/strnlen.S (__strnlen): Use libc_hidden_def. * sysdeps/tile/tilegx/strnlen.c (__strnlen): Likewise.
2015-01-28tilegx32: set __HAVE_64B_ATOMICS to 0Chris Metcalf
This is because of alignment issues in the sem_t support. tilegx32 does in fact support 64-bit atomics and we will need to revisit this after the 2.21 freeze.
2015-01-02Update copyright dates with scripts/update-copyrights.Joseph Myers
2014-12-23tilegx: enable wordsize-64 support for ieee745 dbl-64.Chris Metcalf
I missed this during the initial port. Some testing shows that enabling this mode does, unsurprisingly, yield some nice speedups on the math functions in question.
2014-12-22tilegx: remove implicit boolean conversion in strstr.Chris Metcalf
[BZ #17746] The __builtin_expect() truncated a uint64_t to a 32-bit long in ILP32 mode, discarding the high 32 bits, and potentially missing the NUL terminator that we were searching for with SIMD operations. Explicitly compare to zero to fix the problem.
2014-12-19tilegx: fix strstr to build and link betterChris Metcalf
The two_way_short_needle() routine included from str-two-way.h is not used, so mark it so to avoid compiler warnings. Calling strnlen() breaks linknamespace tests, so change it to __strnlen().
2014-12-11tile: add inhibit_loop_to_libcall to string functionsChris Metcalf
Without this, on gcc 4.8.2 the built glibc crashes when memcpy or memset are invoked, since they call themselves recursively. See commit 85c2e6110c9a01ec for the generic inhibit_loop_to_libcall.
2014-11-20Add arch-specific configuration for C11 atomics support.Torvald Riegel
This sets __HAVE_64B_ATOMICS if provided. It also sets USE_ATOMIC_COMPILER_BUILTINS to true if the existing atomic ops use the __atomic* builtins (aarch64, mips partially) or if this has been tested (x86_64); otherwise, this is set to false so that C11 atomics will be based on the existing atomic operations.
2014-10-06tile: fix copyright header blocks in just-committed filesChris Metcalf
I accidentally committed versions not following the conventions.
2014-10-06tilegx: provide optimized strnlen, strstr, and strcasestrChris Metcalf
strnlen() is based on the existing tile strlen() with length checking added. It speeds up by up to 5x, but on average across the benchtest corpus by around 35%. No regressions are seen. strstr() does 8-byte aligned loads and compares using a 2-byte filter on the first two bytes of the needle and then testing the remaining bytes in needle using memcmp(). It speeds up about 5x in the best case (for "found" needles), about 2x looking at benchtest as a whole, with some slowdowns as much as 45%. on a few cases (including the "fail" case for 128KB search). strcasestr() is based on strstr() but uses a SIMD tolower routine to convert 8-bytes to lower case in 5 instructions. It also uses a 2-byte filter and then strncasecmp() for the remaining bytes. strncasecmp() is not optimized for SIMD, so there is futher room for improvement. However, it is still up to 16x faster for "found" needles, averaging 2x faster on the whole corpus of benchtests. It does slow down by up to 35% on a few cases, similarly to strstr().
2014-10-06tilegx: optimize string copy_byte() internal functionChris Metcalf
We can use one "shufflebytes" instruction instead of 3 "bfins" instructions to optimize the string functions.
2014-06-28Fix Wundef warning for MEMCPY_OK_FOR_FWD_MEMMOVESiddhesh Poyarekar
Define MEMCPY_OK_FOR_FWD_MEMMOVE in memcopy.h and let arch-specific implementations of that file override the value if necessary. This override is only useful for tile and moving this macro to memcopy.h allows us to remove the tile-specific memmove.c.
2014-02-10Move tilegx, tilepro, and linux-generic from ports to libc.Chris Metcalf
I've moved the TILE-Gx and TILEPro ports to the main sysdeps hierarchy, along with the linux-generic ports infrastructure. Beyond the README update, the move was just git mv ports/sysdeps/tile sysdeps/tile git mv ports/sysdeps/unix/sysv/linux/tile \ sysdeps/unix/sysv/linux/tile git mv ports/sysdeps/unix/sysv/linux/generic \ sysdeps/unix/sysv/linux/generic I updated the relevant ChangeLogs along the lines of the ARM move in commit c6bfe5c4d75 and tested the 64-bit tilegx build to confirm that there were no changes in "objdump -dr" output in the shared objects.