path: root/sysdeps/powerpc/powerpc64/power8
2019-01-01Update copyright dates with scripts/update-copyrights.Joseph Myers
* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.
2018-08-20powerpc: Remove powerpc specific sinf and cosf optimizationRajalakshmi Srinivasaraghavan
The new generic optimization of sinf and cosf introduced by commit 599cf3976679e1b345307d9c02057f02aa95528f shows an improvement over the powerpc-specific assembly version. Hence remove the powerpc assembly versions and use the generic code.
2018-04-27powerpc64*: fix the order of implied sysdeps directoriesGabriel F. T. Gomes
The creation of the divergent sysdeps directory for powerpc64le commit 2f7f3cd8cd302bb10908c86f3f7b349df0a78e6a Author: Paul E. Murphy <murphyp@linux.vnet.ibm.com> Date: Fri Jul 15 18:04:40 2016 -0500 powerpc64le: Create divergent sysdep directory for powerpc64le. allowed float128 to be enabled for powerpc64le (little-endian) and not for powerpc64 (big-endian). Since the only intended difference between them was the presence or absence of the float128 interface, the sysdeps directory for powerpc64le explicitly reused the files from powerpc64 (through the use of Implies files). Although this works, it also means that files under the powerpc64 directory might be preferred over files under powerpc64le. For instance, on a build for powerpc64le with target set to power9, a file from powerpc64/power5 might get built, even though a file with the same name exists in powerpc64le/power8. That happens because the processor hierarchy was only defined in the sysdeps directory for powerpc64 (and borrowed by powerpc64le). This patch fixes this behavior, by creating new subdirectories under powerpc64 (i.e.: powerpc64/be and powerpc64/le) and creating new Implies files to provide the hierarchy of processors for powerpc64 and powerpc64le separately. These changes have no effect on installed, stripped binaries (which remain unchanged). Tested that installed stripped binaries are unchanged and that there are no regressions on powerpc64 and powerpc64le.
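The precedence problem described in this commit can be pictured as a first-match search over an ordered list of directories. The sketch below is only a model: the two search orders, the file name `example.S`, and the helper `first_provider` are hypothetical and do not reproduce the real Implies machinery or the directory names created by the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model: one file provided by one sysdeps directory.  */
struct sysdeps_file
{
  const char *dir;
  const char *name;
};

/* Return the first directory in ORDER that provides NAME, or NULL.
   This mirrors the "first directory in the search path wins" rule.  */
static const char *
first_provider (const char *const *order, size_t n_dirs,
                const struct sysdeps_file *files, size_t n_files,
                const char *name)
{
  for (size_t d = 0; d < n_dirs; d++)
    for (size_t f = 0; f < n_files; f++)
      if (strcmp (files[f].dir, order[d]) == 0
          && strcmp (files[f].name, name) == 0)
        return order[d];
  return NULL;
}

/* Illustrative data only: the same file name exists in both trees.  */
static const struct sysdeps_file example_files[] = {
  { "powerpc64/power5", "example.S" },
  { "powerpc64le/power8", "example.S" },
};

/* Assumed order when the processor hierarchy is borrowed from powerpc64.  */
static const char *const borrowed_order[] = {
  "powerpc64/power5", "powerpc64le/power8"
};

/* Assumed order once each endianness has its own processor hierarchy.  */
static const char *const separate_order[] = {
  "powerpc64le/power8", "powerpc64/power5"
};
```

With the borrowed order the power5 file is picked even though a power8 file of the same name exists; with a separate hierarchy the power8 file is found first.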
2018-01-01Update copyright dates with scripts/update-copyrights.Joseph Myers
* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.
2017-12-15powerpc: st{r,p}cpy optimization for aligned stringsRajalakshmi Srinivasaraghavan
This patch makes use of vectors for aligned inputs. Improvements of up to 30% are seen for larger aligned inputs. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
2017-12-05Use libm_alias_float for powerpc.Joseph Myers
Continuing the preparation for additional _FloatN / _FloatNx function aliases, this patch makes powerpc libm function implementations use libm_alias_float to define function aliases. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged for all its hard-float powerpc configurations. * sysdeps/powerpc/fpu/s_cosf.c: Include <libm-alias-float.h>. (cosf): Define using libm_alias_float. * sysdeps/powerpc/fpu/s_fabs.S: Include <libm-alias-float.h>. (fabsf): Define using libm_alias_float. * sysdeps/powerpc/fpu/s_fmaf.S: Include <libm-alias-float.h>. (fmaf): Define using libm_alias_float. * sysdeps/powerpc/fpu/s_rintf.c: Include <libm-alias-float.h>. (rintf): Define using libm_alias_float. * sysdeps/powerpc/fpu/s_sinf.c: Include <libm-alias-float.h>. (sinf): Define using libm_alias_float. * sysdeps/powerpc/power5+/fpu/s_modff.c: Include <libm-alias-float.h>. (modff): Define using libm_alias_float. * sysdeps/powerpc/power7/fpu/s_logbf.c: Include <libm-alias-float.h>. (logbf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_ceilf.S: Include <libm-alias-float.h>. (ceilf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Include <libm-alias-float.h>. (copysignf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_floorf.S: Include <libm-alias-float.h>. (floorf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_llrintf.c: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_llroundf.c: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_lrint.S: Include <libm-alias-float.h>. (lrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_lround.S: Include <libm-alias-float.h>. (lroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_nearbyintf.S: Include <libm-alias-float.h>. (nearbyintf): Define using libm_alias_float. 
* sysdeps/powerpc/powerpc32/fpu/s_rintf.S: Include <libm-alias-float.h>. (rintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_roundf.S: Include <libm-alias-float.h>. (roundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_truncf.S: Include <libm-alias-float.h>. (truncf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf.c: Include <libm-alias-float.h>. (ceilf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c: Include <libm-alias-float.h>. (copysignf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf.c: Include <libm-alias-float.h>. (floorf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf.c: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llroundf.c: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbf.c: Include <libm-alias-float.h>. (logbf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrintf.c: Include <libm-alias-float.h>. (lrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lroundf.c: Include <libm-alias-float.h>. (lroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff.c: Include <libm-alias-float.h>. (modff): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_roundf.c: Include <libm-alias-float.h>. (roundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_truncf.c: Include <libm-alias-float.h>. (truncf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/s_llrintf.S: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. 
* sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power5+/fpu/s_ceilf.S: Include <libm-alias-float.h>. (ceilf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power5+/fpu/s_floorf.S: Include <libm-alias-float.h>. (floorf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S: Include <libm-alias-float.h>. (lroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power5+/fpu/s_roundf.S: Include <libm-alias-float.h>. (roundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power5+/fpu/s_truncf.S: Include <libm-alias-float.h>. (truncf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power6/fpu/s_copysign.S: Include <libm-alias-float.h>. (copysignf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power6/fpu/s_llrintf.S: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power6x/fpu/s_lrint.S: Include <libm-alias-float.h>. (lrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power6x/fpu/s_lround.S: Include <libm-alias-float.h>. (lroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Include <libm-alias-float.h>. (ceilf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Include <libm-alias-float.h>. (copysignf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf.c: Include <libm-alias-float.h>. (cosf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Include <libm-alias-float.h>. (floorf): Define using libm_alias_float. 
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrintf.c: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llroundf.c: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf.c: Include <libm-alias-float.h>. (logbf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c: Include <libm-alias-float.h>. (modff): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Include <libm-alias-float.h>. (roundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_sinf.c: Include <libm-alias-float.h>. (sinf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Include <libm-alias-float.h>. (truncf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_ceilf.S: Include <libm-alias-float.h>. (ceilf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Include <libm-alias-float.h>. (copysignf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_floorf.S: Include <libm-alias-float.h>. (floorf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_llroundf.S: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S: Include <libm-alias-float.h>. (nearbyintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_rintf.S: Include <libm-alias-float.h>. (rintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_roundf.S: Include <libm-alias-float.h>. (roundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Include <libm-alias-float.h>. (truncf): Define using libm_alias_float. 
* sysdeps/powerpc/powerpc64/power5+/fpu/s_ceilf.S: Include <libm-alias-float.h>. (ceilf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power5+/fpu/s_floorf.S: Include <libm-alias-float.h>. (floorf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power5+/fpu/s_roundf.S: Include <libm-alias-float.h>. (roundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power5+/fpu/s_truncf.S: Include <libm-alias-float.h>. (truncf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Include <libm-alias-float.h>. (copysignf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power8/fpu/s_cosf.S: Include <libm-alias-float.h>. (cosf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power8/fpu/s_llround.S: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power8/fpu/s_sinf.S: Include <libm-alias-float.h>. (sinf): Define using libm_alias_float.
2017-12-02Use libm_alias_double for remaining powerpc functions.Joseph Myers
Continuing the preparation for additional _FloatN / _FloatNx function aliases, this patch makes the remaining double powerpc functions use libm_alias_double to define function aliases (with consequent removal of the need for local compat symbol handling). Previous cleanups avoid this patch changing installed stripped shared libraries for any build-many-glibcs.py configuration (there are still some functions in this patch for which the order of double and float aliases changes within an individual source file, but in this case this doesn't result in changes to the final library). Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged for all its hard-float powerpc configurations. * sysdeps/powerpc/power7/fpu/s_logb.c: Include <libm-alias-double.h>. (logb): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Include <libm-alias-double.h>. (copysign): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/fpu/s_llrint.c: Include <libm-alias-double.h>. (llrint): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/fpu/s_llround.c: Include <libm-alias-double.h>. (llround): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/fpu/s_lrint.S: Include <libm-alias-double.h>. (lrint): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/fpu/s_lround.S: Include <libm-alias-double.h>. (lround): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c: Include <libm-alias-double.h>. (copysign): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint.c: Include <libm-alias-double.h>. (llrint): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround.c: Include <libm-alias-double.h>. (llround): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logb.c: Include <libm-alias-double.h>. (logb): Define using libm_alias_double. 
* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrint.c: Include <libm-alias-double.h>. (lrint): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround.c: Include <libm-alias-double.h>. (lround): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/power4/fpu/s_llrint.S: Include <libm-alias-double.h>. (llrint): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S: Include <libm-alias-double.h>. (llround): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S: Include <libm-alias-double.h>. (llround): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S: Include <libm-alias-double.h>. (lround): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/power6/fpu/s_copysign.S: Include <libm-alias-double.h>. (copysign): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/power6/fpu/s_llrint.S: Include <libm-alias-double.h>. (llrint): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S: Include <libm-alias-double.h>. (llround): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/power6x/fpu/s_lrint.S: Include <libm-alias-double.h>. (lrint): Define using libm_alias_double. * sysdeps/powerpc/powerpc32/power6x/fpu/s_lround.S: Include <libm-alias-double.h>. (lround): Define using libm_alias_double. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Include <libm-alias-double.h>. (copysign): Define using libm_alias_double. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint.c: Include <libm-alias-double.h>. (llrint): Define using libm_alias_double. (lrint): Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround.c: Include <libm-alias-double.h>. (llround): Define using libm_alias_double. (lround): Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb.c: Include <libm-alias-double.h>. (logb): Define using libm_alias_double. 
* sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Include <libm-alias-double.h>. (copysign): Define using libm_alias_double. * sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Include <libm-alias-double.h>. (llrint): Define using libm_alias_double. (lrint): Likewise. * sysdeps/powerpc/powerpc64/fpu/s_llround.S: Include <libm-alias-double.h>. (llround): Define using libm_alias_double. (lround): Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S: Include <libm-alias-double.h>. (llround): Define using libm_alias_double. (lround): Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Include <libm-alias-double.h>. (copysign): Define using libm_alias_double. * sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: Include <libm-alias-double.h>. (llrint): Define using libm_alias_double. (lrint): Likewise. * sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S: Include <libm-alias-double.h>. (llround): Define using libm_alias_double. (lround): Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S: Include <libm-alias-double.h>. (llrint): Define using libm_alias_double. (lrint): Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_llround.S: Include <libm-alias-double.h>. (llround): Define using libm_alias_double. (lround): Likewise.
2017-10-23PowerPC64 power8 strncpy cfi fixesAlan Modra
cfi info for stack adjust needs to be on the insn doing the adjust. cfi describing register saves can be anywhere after the save insn but before the reg is altered. Fewer locations with cfi result in smaller cfi programs and possibly slightly faster exception handling. Thus the LR cfi_offset move. The idea behind adjusting sp after restoring regs is to break a register dependency chain, in this case by not using r1 immediately after it is modified. The missing LR cfi_restore meant that code after the blr, unaligned_lt_16 and other labels, would have cfi that said LR was at cfa+16, but that code is reached without LR being saved. * sysdeps/powerpc/powerpc64/power8/strncpy.S: Move LR cfi. Adjust stack after restoring regs. Add missing LR cfi_restore. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
2017-10-06powerpc: Fix IFUNC for memrchrRajalakshmi Srinivasaraghavan
Recent commit 59ba2d2b5421 failed to add __memrchr_power8 to the ifunc list. This patch also discards unwanted bytes for unaligned inputs in the power8 optimization. 2017-10-05 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> * sysdeps/powerpc/powerpc64/multiarch/memrchr-ppc64.c: Revert back to powerpc32 file. * sysdeps/powerpc/powerpc64/multiarch/memrchr.c (memrchr): Add __memrchr_power8 to ifunc list. * sysdeps/powerpc/powerpc64/power8/memrchr.S: Mask extra bytes for unaligned inputs.
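The "mask extra bytes" step can be illustrated in portable C. When a 16-byte aligned chunk is examined but only part of it belongs to the buffer, matches in the other bytes have to be discarded before the last match is picked. The sketch below is not the power8 assembly: the scalar loop stands in for a vector compare, and the names `match_mask16` and `last_match_in_chunk` are hypothetical.

```c
/* Bit i of the result is set when chunk[i] == c.
   This loop stands in for a single vector compare instruction.  */
static unsigned
match_mask16 (const unsigned char chunk[16], unsigned char c)
{
  unsigned m = 0;
  for (unsigned i = 0; i < 16; i++)
    if (chunk[i] == c)
      m |= 1u << i;
  return m;
}

/* Index of the last match among the first VALID bytes of CHUNK,
   or -1 if there is none.  Bytes at index >= VALID lie outside the
   caller's buffer, so their match bits are masked off first.  */
static int
last_match_in_chunk (const unsigned char chunk[16], unsigned char c,
                     unsigned valid)
{
  unsigned m = match_mask16 (chunk, c);
  if (valid < 16)
    m &= (1u << valid) - 1u;   /* discard bytes past the end */
  for (int i = 15; i >= 0; i--)
    if (m & (1u << i))
      return i;
  return -1;
}
```

Without the masking step, a stray matching byte beyond the requested length would be reported as the result.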
2017-10-02Do not wrap expf and exp2fSzabolcs Nagy
The new generic expf and exp2f code doesn't need wrappers any more because it sets errno inline, so only use the wrappers on targets that need them. (If the wrapper is needed, then the top level wrapper code is included, otherwise an empty w_exp*f.c is used to suppress the wrapper.) A powerpc64 expf implementation includes the expf C code directly, which needed some changes. * sysdeps/ieee754/flt-32/e_exp2f.c (__exp2f): Define without wrapper. * sysdeps/ieee754/flt-32/e_expf.c (__expf): Likewise. * sysdeps/ieee754/flt-32/w_exp2f.c: New file. * sysdeps/ieee754/flt-32/w_expf.c: New file. * sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Update for the new expf code. * sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: New file. * sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp2f.c: New file. * sysdeps/m68k/m680x0/fpu/w_expf.c: New file. * sysdeps/i386/fpu/w_exp2f.c: New file. * sysdeps/i386/fpu/w_expf.c: New file. * sysdeps/i386/i686/fpu/multiarch/w_expf.c: New file. * sysdeps/x86_64/fpu/w_expf.c: New file.
2017-10-02powerpc: Optimize memrchr for power8Rajalakshmi Srinivasaraghavan
Vectorized loops are used for sizes greater than 32B to improve performance over the power7 optimization. This shows an average improvement of 25%, depending on the position of the search character. Performance is the same for shorter strings.
2017-09-19powerpc: Avoid misaligned stores in memsetRajalakshmi Srinivasaraghavan
As per section "3.1.4.2 Alignment Interrupts" of the "POWER8 Processor User's Manual for the Single-Chip Module", an alignment interrupt is reported for misaligned stores to Caching-inhibited storage. As memset is used in some drivers for DMA (like xorg), this patch avoids misaligned stores for sizes less than 8 in memset.
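The idea of avoiding misaligned stores for short lengths can be sketched in C. This is an illustration under assumptions, not the power8 assembly: the function name `memset_sketch` is hypothetical, and the 8-byte aligned store is expressed with `memcpy` so that the C code stays well defined.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static void *
memset_sketch (void *dst, int c, size_t n)
{
  unsigned char *p = dst;
  unsigned char b = (unsigned char) c;

  if (n < 8)
    {
      /* Short fills use byte stores only; a byte store can never
         be misaligned.  */
      for (; n > 0; n--)
        *p++ = b;
      return dst;
    }

  /* Head: byte stores until P is 8-byte aligned (at most 7 bytes,
     so N stays positive because N >= 8 here).  */
  while (((uintptr_t) p & 7) != 0)
    {
      *p++ = b;
      n--;
    }

  /* Body: 8-byte stores to aligned addresses only.  */
  uint64_t w = 0x0101010101010101ULL * b;
  for (; n >= 8; p += 8, n -= 8)
    memcpy (p, &w, sizeof w);

  /* Tail: remaining bytes, again with byte stores.  */
  for (; n > 0; n--)
    *p++ = b;
  return dst;
}
```

Every multi-byte store in this sketch targets an 8-byte aligned address, which is the property the commit is after for caching-inhibited storage.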
2017-08-08Do not use __ptr_t.Joseph Myers
sys/cdefs.h has a macro __ptr_t, which a few places in glibc use instead of void *. void * is a well-understood standard type for that purpose and in a post-C89 context there is no need for a macro for it; this patch changes those places to use void * directly instead. Unlike __long_double_t, __ptr_t is widely used outside glibc (or at least has many hits on codesearch.debian.net). I don't know how many of those uses would break if sys/cdefs.h ceased to define the macro, but there's enough risk that this patch leaves the definition and just removes the uses within glibc; removal of the definition can be considered separately if desired. Tested for x86_64, and with build-many-glibcs.py. * malloc/mcheck.c (old_free_hook): Use void * instead of __ptr_t. (old_malloc_hook): Likewise. (old_memalign_hook): Likewise. (old_realloc_hook): Likewise. (struct hdr): Likewise. (flood): Likewise. (freehook): Likewise. (mallochook): Likewise. (memalignhook): Likewise. (reallochook): Likewise. (mprobe): Likewise. * malloc/mtrace.c (mallwatch): Likewise. (tr_old_free_hook): Likewise. (tr_old_malloc_hook): Likewise. (tr_old_realloc_hook): Likewise. (tr_old_memalign_hook): Likewise. (tr_where): Likewise. (lock_and_info): Likewise. (tr_freehook): Likewise. (tr_mallochook): Likewise. (tr_reallochook): Likewise. (tr_memalignhook): Likewise. * misc/err.h [!__GNUC_VA_LIST] (__gnuc_va_list): Likewise. * misc/mmap.c (__mmap): Likewise. * misc/mmap64.c (__mmap64): Likewise. * misc/mprotect.c (__mprotect): Likewise. * misc/msync.c (msync): Likewise. * misc/munmap.c (__munmap): Likewise. * posix/posix_madvise.c (posix_madvise): Likewise. * socket/send.c (__send): Likewise. * socket/sendto.c (__sendto): Likewise. * socket/setsockopt.c (__setsockopt): Likewise. * string/memcmp.c (__ptr_t): Remove macro. (MEMCMP): Use void * instead of ptr_t. * string/memrchr.c (__ptr_t): Remove macro. (__memrchr): Use void * instead of ptr_t. * sysdeps/mach/hurd/dl-sysdep.c (__mmap): Likewise. 
* sysdeps/mach/hurd/mmap.c (__mmap): Likewise. * sysdeps/mach/hurd/mmap64.c (__mmap64): Likewise. * sysdeps/mach/mprotect.c (__mprotect): Likewise. * sysdeps/mach/msync.c (msync): Likewise. * sysdeps/mach/munmap.c (__munmap): Likewise. * sysdeps/mips/bits/setjmp.h (struct __jmp_buf_internal_tag): Likewise. * sysdeps/posix/getcwd.c (__getcwd): Likewise. * sysdeps/powerpc/powerpc32/memset.S (memset): Likewise. * sysdeps/powerpc/powerpc32/power4/memcpy.S (memcpy): Likewise. * sysdeps/powerpc/powerpc32/power4/memset.S (memset): Likewise. * sysdeps/powerpc/powerpc32/power6/memcpy.S (memcpy): Likewise. * sysdeps/powerpc/powerpc32/power6/memset.S (memset): Likewise. * sysdeps/powerpc/powerpc32/power7/memcpy.S (memcpy): Likewise. * sysdeps/powerpc/powerpc32/power7/mempcpy.S (__mempcpy): Likewise. * sysdeps/powerpc/powerpc32/power7/memset.S (memset): Likewise. * sysdeps/powerpc/powerpc64/memcpy.S (memcpy): Likewise. * sysdeps/powerpc/powerpc64/memset.S (memset): Likewise. * sysdeps/powerpc/powerpc64/power4/memcpy.S (memcpy): Likewise. * sysdeps/powerpc/powerpc64/power4/memset.S (memset): Likewise. * sysdeps/powerpc/powerpc64/power6/memcpy.S (memcpy): Likewise. * sysdeps/powerpc/powerpc64/power6/memset.S (memset): Likewise. * sysdeps/powerpc/powerpc64/power7/memcpy.S (memcpy): Likewise. * sysdeps/powerpc/powerpc64/power7/mempcpy.S (__mempcpy): Likewise. * sysdeps/powerpc/powerpc64/power7/memset.S (memset): Likewise. * sysdeps/powerpc/powerpc64/power8/memset.S (memset): Likewise. * sysdeps/tile/memcmp.c (__ptr_t): Remove macro. (MEMCMP): Use void * instead of ptr_t. * sysdeps/unix/sysv/linux/alpha/oldglob.c (old_glob_t): Likewise. * sysdeps/unix/sysv/linux/mmap.c (__mmap): Likewise.
2017-07-03powerpc: Clean up strlen and strnlen for power8Rajalakshmi Srinivasaraghavan
To align a quadword-aligned address to 64 bytes, at most three 16-byte loads are needed in the worst case, rather than four.
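The bound of three loads follows from simple address arithmetic, sketched here in C. The helper name `quadwords_to_64` is hypothetical; it only counts the 16-byte steps and does not reproduce the strlen code.

```c
#include <stdint.h>

/* Number of 16-byte steps needed to advance a 16-byte aligned
   address to the next 64-byte boundary (0 if already aligned).
   The offset within a 64-byte block is 0, 16, 32 or 48, so the
   result is 0, 3, 2 or 1: never four.  */
static unsigned
quadwords_to_64 (uintptr_t addr)
{
  return (unsigned) (((64 - (addr & 63)) & 63) / 16);
}
```

For example, an address ending in 0x10 is 48 bytes short of the next 64-byte boundary and needs three 16-byte loads, which is the worst case.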
2017-06-23powerpc: refactor strrchr IFUNCRajalakshmi Srinivasaraghavan
As done in commit 6d15a5c2e9450a1e926d5b4991759e1cfa50fccf, clean up the IFUNC implementation for power8 in order to remove unneeded macro definitions.
2017-06-23powerpc: Add optimized version of [l]lroundfRajalakshmi Srinivasaraghavan
This patch uses the optimized double version of llround for single precision, as both versions return a [long] long type.
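Reusing the double routine works because every float value converts to double exactly, so rounding the widened value gives the same integer. A minimal sketch, assuming inputs whose magnitude is far below the range of long long; `llround_sketch` is a hypothetical stand-in written without libm, not the optimized assembly.

```c
/* Round to nearest, ties away from zero.  Assumes |x| is small
   enough that the conversion to long long is well defined and the
   subtraction below is exact (true for |x| < 2^52).  */
static long long
llround_sketch (double x)
{
  long long t = (long long) x;     /* truncates toward zero */
  double frac = x - (double) t;
  if (frac >= 0.5)
    t++;
  else if (frac <= -0.5)
    t--;
  return t;
}

/* The float variant only widens its argument; float to double
   conversion is exact, so no extra rounding error is introduced.  */
static long long
llroundf_sketch (float x)
{
  return llround_sketch ((double) x);
}
```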
2017-06-21powerpc: Optimize memchr for power8Rajalakshmi Srinivasaraghavan
Vectorized loops are used for sizes greater than 32B to improve performance over the power7 optimization.
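The structure "simple path for short inputs, chunked loop for longer ones" can be sketched in portable C. Note the substitution: the real code uses POWER8 vector instructions on 16-byte registers, while this sketch uses an 8-byte word-at-a-time test as a stand-in. The names `has_zero_byte` and `memchr_sketch` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any byte of W is zero (classic word-at-a-time test).  */
static int
has_zero_byte (uint64_t w)
{
  return ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) != 0;
}

static const void *
memchr_sketch (const void *s, int c, size_t n)
{
  const unsigned char *p = s;
  unsigned char ch = (unsigned char) c;

  if (n > 32)
    {
      /* Chunked loop: XOR with a word of repeated CH turns matching
         bytes into zero bytes, which has_zero_byte detects.  */
      uint64_t pattern = 0x0101010101010101ULL * ch;
      while (n >= 8)
        {
          uint64_t w;
          memcpy (&w, p, sizeof w);   /* alignment-safe load */
          if (has_zero_byte (w ^ pattern))
            break;                    /* a match is in this chunk */
          p += 8;
          n -= 8;
        }
    }

  /* Short inputs, the chunk containing the match, and the tail are
     all handled byte by byte.  */
  for (; n > 0; p++, n--)
    if (*p == ch)
      return p;
  return NULL;
}
```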
2017-06-21powerpc: Add optimized version of [l]lrintfRajalakshmi Srinivasaraghavan
This patch uses the optimized double version of llrint for single precision, as both versions return a [long] long type.
2017-06-14PowerPC64 ENTRY_TOCLESSAlan Modra
A number of functions in the sysdeps/powerpc/powerpc64/ tree don't use or change r2, yet declare a global entry that sets up r2. This patch fixes that problem, and consolidates the ENTRY and EALIGN macros. * sysdeps/powerpc/powerpc64/sysdep.h: Formatting. (NOPS, ENTRY_3): New macros. (ENTRY): Rewrite. (ENTRY_TOCLESS): Define. (EALIGN, EALIGN_W_0, EALIGN_W_1, EALIGN_W_2, EALIGN_W_4, EALIGN_W_5, EALIGN_W_6, EALIGN_W_7, EALIGN_W_8): Delete. * sysdeps/powerpc/powerpc64/a2/memcpy.S: Replace EALIGN with ENTRY. * sysdeps/powerpc/powerpc64/dl-trampoline.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_ceil.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_ceilf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_floor.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_floorf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_rint.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_rintf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_round.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_roundf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_trunc.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Likewise. * sysdeps/powerpc/powerpc64/memset.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strstr.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/e_expf.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_cosf.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_sinf.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strcasestr.S: Likewise. * sysdeps/powerpc/powerpc64/addmul_1.S: Use ENTRY_TOCLESS. * sysdeps/powerpc/powerpc64/cell/memcpy.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_copysignl.S: Likewise. 
* sysdeps/powerpc/powerpc64/fpu/s_fabsl.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_llrintf.S: Likewise. * sysdeps/powerpc/powerpc64/lshift.S: Likewise. * sysdeps/powerpc/powerpc64/memcpy.S: Likewise. * sysdeps/powerpc/powerpc64/mul_1.S: Likewise. * sysdeps/powerpc/powerpc64/power4/memcmp.S: Likewise. * sysdeps/powerpc/powerpc64/power4/memcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power4/memset.S: Likewise. * sysdeps/powerpc/powerpc64/power4/strncmp.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_ceil.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_ceilf.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_floor.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_floorf.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_round.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_roundf.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_trunc.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_truncf.S: Likewise. * sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power6/memcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power6/memset.S: Likewise. * sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: Likewise. * sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc64/power7/add_n.S: Likewise. * sysdeps/powerpc/powerpc64/power7/memchr.S: Likewise. * sysdeps/powerpc/powerpc64/power7/memcmp.S: Likewise. * sysdeps/powerpc/powerpc64/power7/memcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power7/memmove.S: Likewise. * sysdeps/powerpc/powerpc64/power7/mempcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power7/memrchr.S: Likewise. 
* sysdeps/powerpc/powerpc64/power7/memset.S: Likewise. * sysdeps/powerpc/powerpc64/power7/rawmemchr.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strcasecmp.S (strcasecmp_l): Likewise. * sysdeps/powerpc/powerpc64/power7/strchr.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strchrnul.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strcmp.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strlen.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strncmp.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strncpy.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strnlen.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strrchr.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc64/power8/memcmp.S: Likewise. * sysdeps/powerpc/powerpc64/power8/memset.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strchr.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strcmp.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strlen.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strncmp.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strncpy.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strnlen.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strrchr.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strspn.S: Likewise. * sysdeps/powerpc/powerpc64/power9/strcmp.S: Likewise. * sysdeps/powerpc/powerpc64/power9/strncmp.S: Likewise. * sysdeps/powerpc/powerpc64/strchr.S: Likewise. * sysdeps/powerpc/powerpc64/strcmp.S: Likewise. * sysdeps/powerpc/powerpc64/strlen.S: Likewise. * sysdeps/powerpc/powerpc64/strncmp.S: Likewise. * sysdeps/powerpc/powerpc64/ppc-mcount.S: Store LR earlier. Don't add nop when SHARED. * sysdeps/powerpc/powerpc64/start.S: Fix comment. 
* sysdeps/powerpc/powerpc64/multiarch/strrchr-power8.S (ENTRY): Don't define. (ENTRY_TOCLESS): Define. * sysdeps/powerpc/powerpc32/sysdep.h (ENTRY_TOCLESS): Define. * sysdeps/powerpc/fpu/s_fma.S: Use ENTRY_TOCLESS. * sysdeps/powerpc/fpu/s_fmaf.S: Likewise.
2017-06-14PowerPC64 strncpy, stpncpy and strstr fixesAlan Modra
Makes __stpncpy_power8 call __memset_power8 directly rather than via an IFUNC. Fixes a missing _mcount, and removes some redundant NOPS. The *_is_local defines are also used in a followup patch. * sysdeps/powerpc/powerpc64/multiarch/strncpy-power7.S: Define MEMSET_is_local. * sysdeps/powerpc/powerpc64/multiarch/strncpy-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/stpncpy-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/stpncpy-power8.S: Likewise. Define MEMSET. * sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: Define STRLEN_is_local, STRNLEN_is_local, and STRCHR_is_local. * sysdeps/powerpc/powerpc64/power7/strstr.S: Likewise. Don't add nop after local calls. * sysdeps/powerpc/powerpc64/power7/strncpy.S: Define MEMSET_is_local. Don't add nop after local call. * sysdeps/powerpc/powerpc64/power8/strncpy.S: Likewise. Add missing CALL_MCOUNT.
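The difference between calling a fill routine through a runtime-selected pointer and calling a specific variant directly can be sketched in C. Everything here is hypothetical: `fill_power8` merely stands in for __memset_power8, `select_fill` stands in for an IFUNC resolver, and the byte loops are placeholders for the optimized assembly.

```c
#include <stddef.h>

typedef void *(*fill_fn) (void *, int, size_t);

static void *
fill_bytes (void *dst, int c, size_t n)
{
  unsigned char *p = dst;
  for (size_t i = 0; i < n; i++)
    p[i] = (unsigned char) c;
  return dst;
}

/* Two placeholder variants of the same routine.  */
static void *fill_generic (void *d, int c, size_t n) { return fill_bytes (d, c, n); }
static void *fill_power8 (void *d, int c, size_t n) { return fill_bytes (d, c, n); }

/* Stand-in for a resolver: picks a variant from a CPU feature flag.  */
static fill_fn
select_fill (int cpu_is_power8)
{
  return cpu_is_power8 ? fill_power8 : fill_generic;
}

/* A power8-specific stpncpy-like routine already knows which CPU it
   targets, so it can call the power8 fill directly instead of going
   through the selected pointer.  */
static char *
stpncpy_power8_sketch (char *dst, const char *src, size_t n)
{
  size_t i = 0;
  while (i < n && src[i] != '\0')
    {
      dst[i] = src[i];
      i++;
    }
  fill_power8 (dst + i, 0, n - i);   /* direct call, zero padding */
  return dst + i;
}
```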
2017-05-18powerpc: Improve memcmp performance for POWER8Rajalakshmi Srinivasaraghavan
Vectorization improves performance over the current implementation. Tested on powerpc64 and powerpc64le.
2017-05-17powerpc: Add a POWER8-optimized version of cosf()Paul Clarke
This implementation is based on the one already used at sysdeps/powerpc/powerpc64/fpu/multiarch/s_sinf-power8.S. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile [$(subdir) = math] (libm-sysdep_routines): Add s_cosf-power8 and s_cosf-ppc64. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf-power8.S: New file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf.c: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_cosf.S: Likewise.
2017-04-18powerpc64: strrchr optimization for power8Rajalakshmi Srinivasaraghavan
The POWER7 code is used for strings of up to 32 bytes; longer strings use vectorized loops. This gives an average improvement of about 25%, depending on the position of the search character. Performance is unchanged for shorter strings. Tested on ppc64 and ppc64le.
2017-04-11powerpc: refactor memset IFUNC.Wainer dos Santos Moschetta
Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without the --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/memset-power4.S: Define the implementation-specific function name and remove unneeded macro definitions. * sysdeps/powerpc/powerpc64/multiarch/memset-power6.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memset-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memset-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/memset.S: Set a default function name if not defined and pass it as a parameter to the macros accordingly. * sysdeps/powerpc/powerpc64/power4/memset.S: Likewise. * sysdeps/powerpc/powerpc64/power6/memset.S: Likewise. * sysdeps/powerpc/powerpc64/power7/memset.S: Likewise. * sysdeps/powerpc/powerpc64/power8/memset.S: Likewise.
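The "default function name" pattern described in this entry can be illustrated with a minimal C sketch. The real files are assembly and pass the name to entry/exit macros; the names `sketch_memset_power8` and `sketch_memset` below are hypothetical, and the wrapper and the shared implementation are shown in one file only so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* A multiarch wrapper would define MEMSET and then include the shared
   implementation file.  The name is hypothetical. */
#define MEMSET sketch_memset_power8

/* ---- what the shared implementation file would contain ---- */
#ifndef MEMSET
# define MEMSET sketch_memset   /* default name when no wrapper set one */
#endif

/* The function name comes from the macro, so the same body can be
   built under any implementation-specific name. */
void *MEMSET(void *dst, int c, size_t n)
{
    assert(dst != NULL || n == 0);
    unsigned char *d = dst;
    for (size_t i = 0; i < n; i++)
        d[i] = (unsigned char)c;
    return dst;
}
```

With this layout the wrapper needs only the one `#define`, which is the kind of simplification the commit message describes.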
2017-04-11powerpc: refactor strcasestr and strstr IFUNC.Wainer dos Santos Moschetta
Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without the --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/strcasestr-power8.S: Define the strcasestr implementation name and remove unneeded macro definitions. * sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: Define strstr implementation name and remove unneeded macro definitions. * sysdeps/powerpc/powerpc64/power7/strstr.S: Set a default function name if not defined and pass it as a parameter to the macros accordingly. * sysdeps/powerpc/powerpc64/power8/strcasestr.S: Likewise.
2017-04-11powerpc: refactor strchr, strchrnul, and strrchr IFUNC.Wainer dos Santos Moschetta
Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without the --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/strchr-power7.S: Define the implementation-specific function name and remove unneeded macro definitions. * sysdeps/powerpc/powerpc64/multiarch/strchr-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strchr-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strchrnul-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strchrnul-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strrchr-power7.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strchr.S: Set a default function name if not defined and pass it as a parameter to the macros accordingly. * sysdeps/powerpc/powerpc64/power7/strchrnul.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strrchr.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strchr.S: Likewise. * sysdeps/powerpc/powerpc64/strchr.S: Likewise.
2017-04-11powerpc: refactor strnlen and strlen IFUNC.Wainer dos Santos Moschetta
Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without the --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/strlen-power7.S: Define the strlen implementation name and remove unneeded macro definitions. * sysdeps/powerpc/powerpc64/multiarch/strlen-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strlen-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strnlen-power7.S: Define the strnlen implementation name and remove unneeded macro definitions. * sysdeps/powerpc/powerpc64/power7/strlen.S: Set a default function name if not defined and pass it as a parameter to the macros accordingly. * sysdeps/powerpc/powerpc64/power7/strnlen.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strlen.S: Likewise. * sysdeps/powerpc/powerpc64/strlen.S: Likewise.
2017-04-11powerpc: refactor strcasecmp, strcmp, and strncmp IFUNC.Wainer dos Santos Moschetta
Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without the --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/strcasecmp_l-power7.S: Define the implementation-specific function name and remove unneeded macro definitions. * sysdeps/powerpc/powerpc64/multiarch/strcmp-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcmp-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcmp-power9.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcmp-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncmp-power4.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncmp-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncmp-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncmp-power9.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncmp-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/power4/strncmp.S: Set a default function name if not defined and pass it as a parameter to the macros accordingly. * sysdeps/powerpc/powerpc64/power7/strcmp.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strncmp.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strcmp.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strncmp.S: Likewise. * sysdeps/powerpc/powerpc64/power9/strcmp.S: Likewise. * sysdeps/powerpc/powerpc64/power9/strncmp.S: Likewise. * sysdeps/powerpc/powerpc64/strcmp.S: Likewise. * sysdeps/powerpc/powerpc64/strncmp.S: Likewise.
2017-04-11powerpc: refactor stpcpy, stpncpy, strcpy, and strncpy IFUNC.Wainer dos Santos Moschetta
Clean up the IFUNC implementations for powerpc in order to remove unneeded macro definitions. Tested on ppc64le with and without the --disable-multi-arch flag. * sysdeps/powerpc/powerpc64/multiarch/stpcpy-power8.S: Define the implementation-specific function name and remove unneeded macro definitions. * sysdeps/powerpc/powerpc64/multiarch/stpncpy-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/stpncpy-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strcpy-power8.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncpy-power7.S: Likewise. * sysdeps/powerpc/powerpc64/multiarch/strncpy-power8.S: Likewise. * sysdeps/powerpc/powerpc64/power7/strncpy.S: Set a default function name if not defined. * sysdeps/powerpc/powerpc64/power8/strcpy.S: Likewise. * sysdeps/powerpc/powerpc64/power8/strncpy.S: Likewise.
2017-04-05powerpc64: Add POWER8 strnlenWainer dos Santos Moschetta
Add a POWER8 strnlen optimized for long strings. It delivers the same performance as the POWER7 implementation for short strings. This takes advantage of reasonably performing unaligned loads and bit permutes to check the first 1-16 bytes until quadword aligned, then checks in 64-byte strides until unsafe, then in 16-byte steps, truncating the count if need be. The POWER7 code is reused for strings shorter than 32 bytes. Tested on ppc64 and ppc64le. * sysdeps/powerpc/powerpc64/multiarch/Makefile (sysdep_routines): Add strnlen-power8. * sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c (strnlen): Add __strnlen_power8 to list of strnlen functions. * sysdeps/powerpc/powerpc64/multiarch/strnlen-power8.S: New file. * sysdeps/powerpc/powerpc64/multiarch/strnlen.c (__strnlen): Add __strnlen_power8 to ifunc list. * sysdeps/powerpc/powerpc64/power8/strnlen.S: New file.
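The three phases described above (head up to quadword alignment, 64-byte strides, 16-byte tail) can be sketched in portable C. This is only an illustration of the control flow: the helper `scan_block` and the name `sketch_strnlen` are hypothetical, and the byte loop stands in for the vector compares and bit permutes the assembly uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: index of the first NUL in [p, p + len),
   or len when there is none. */
static size_t scan_block(const char *p, size_t len)
{
    size_t i = 0;
    while (i < len && p[i] != '\0')
        i++;
    return i;
}

size_t sketch_strnlen(const char *s, size_t maxlen)
{
    assert(s != NULL);

    /* Phase 1: the 1-16 bytes up to the next 16-byte boundary. */
    size_t head = 16 - ((uintptr_t)s & 15);
    if (head > maxlen)
        head = maxlen;
    size_t n = scan_block(s, head);
    if (n < head || head == maxlen)
        return n;
    size_t done = head;

    /* Phase 2: 64-byte strides while a whole stride fits in the count. */
    while (maxlen - done >= 64) {
        n = scan_block(s + done, 64);
        if (n < 64)
            return done + n;
        done += 64;
    }

    /* Phase 3: 16-byte steps, truncating the last one to the count. */
    while (done < maxlen) {
        size_t step = maxlen - done < 16 ? maxlen - done : 16;
        n = scan_block(s + done, step);
        if (n < step)
            return done + n;
        done += step;
    }
    return maxlen;
}
```

The sketch never reads past `maxlen` bytes or past the terminating NUL, which is the property the "until unsafe" wording in the commit message is protecting.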
2017-02-07powerpc: Improve strcmp performance for shorter stringsRajalakshmi Srinivasaraghavan
For strings longer than 16 bytes and shorter than 32 bytes, the existing algorithm takes more time than the default implementation when the strings are placed close to the end of a page. This is due to the byte-by-byte access used to handle page crossings. It is improved by following the >32B code path, where the address is adjusted to aligned memory before doing a load doubleword operation instead of loading bytes. Tested on powerpc64 and powerpc64le.
2017-01-01Update copyright dates with scripts/update-copyrights.Joseph Myers
2016-12-28powerpc64: strchr/strchrnul optimization for power8Rajalakshmi Srinivasaraghavan
The POWER7 code is used for strings of up to 32 bytes; longer strings use vectorized loops. This gives an average improvement of about 25%, depending on the position of the search character. Performance is unchanged for shorter strings. Tested on ppc64 and ppc64le.
2016-07-05powerpc: Fix return code of strcasecmp for unaligned inputsRajalakshmi Srinivasaraghavan
If the input values are unaligned and there are null characters in the memory before the starting address of the input values, strcasecmp returns an incorrect value. Fix it by masking the bits that are not part of the string.
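A minimal C sketch of the masking idea, under the assumption that the string is examined through the aligned 8-byte word that contains its first byte: bytes in front of the string are forced to 0xFF before testing for a NUL, so stale zeros there are ignored. The function name `word_has_nul_from` is hypothetical, and the caller must guarantee that the whole aligned word is readable.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the aligned 8-byte word containing p holds a NUL at or
   after p, 0 otherwise.  Bytes before p are masked out first. */
int word_has_nul_from(const char *p)
{
    assert(p != NULL);
    size_t off = (uintptr_t)p & 7;      /* bytes in front of the string */
    const char *base = p - off;         /* start of the aligned word */

    unsigned char mask_bytes[8] = {0};
    memset(mask_bytes, 0xFF, off);      /* 0xFF over the leading bytes */

    uint64_t word, mask;
    memcpy(&word, base, 8);
    memcpy(&mask, mask_bytes, 8);       /* same byte order as word */
    word |= mask;                       /* leading bytes can no longer be 0 */

    /* Classic zero-byte test: nonzero iff some byte of word is 0. */
    return ((word - 0x0101010101010101ULL) & ~word
            & 0x8080808080808080ULL) != 0;
}
```

Building the mask through a byte array keeps the sketch independent of endianness; the assembly achieves the same effect with shifts on the loaded register.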
2016-06-30powerpc: Add a POWER8-optimized version of sinf()Anton Blanchard
This uses the implementation of sinf() in sysdeps/x86_64/fpu/s_sinf.S as inspiration.
2016-06-30powerpc: Add a POWER8-optimized version of expf()Tulio Magno Quites Machado Filho
This implementation is based on the one already used at sysdeps/x86_64/fpu/e_expf.S. This implementation improves the performance by ~14% on average in synthetic benchmarks at the cost of decreasing accuracy to 1 ULP.
2016-06-14powerpc: strcasecmp/strncasecmp optmization for power8raji
This implementation uses vectors to improve performance compared to the current byte-by-byte implementation for POWER7. The performance improvement is up to 4x. This patch is tested on powerpc64 and powerpc64le.
2016-06-06powerpc: Fix --disable-multi-arch build on POWER8Tulio Magno Quites Machado Filho
Add missing symbols of stpncpy and strcasestr when multi-arch is disabled. Fix memset call from strncpy/stpncpy when multi-arch is disabled.
2016-05-04powerpc: Fix operand prefixesGabriel F. T. Gomes
The file sysdeps/powerpc/sysdeps.h defines aliases for condition register operands. E.g.: 'cr7' means condition register 7. On the one hand, this increases readability, as it makes it easier for readers to know whether the operand is a condition register, a general purpose register or an immediate. On the other hand, it allows condition registers to be written as if they were general purpose registers, and vice versa, thus reducing the readability of the code. This commit removes some of these unintentional misuses. The changes have no effect on the final code. Checked with objdump.
2016-04-29powerpc: Zero pad using memset in strncpy/stpncpyGabriel F. T. Gomes
Call __memset_power8 to pad, with zeros, the remaining bytes in the dest string on __strncpy_power8 and __stpncpy_power8. This improves performance when n is larger than the input string, giving ~30% gain for larger strings without much impact on shorter strings.
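In C terms, the approach amounts to copying the string and then delegating the zero padding to memset in one call. The sketch below uses the standard `memcpy` and `memset` rather than the POWER8-specific routines, and the name `sketch_strncpy` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

char *sketch_strncpy(char *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    /* Length of src, capped at n, without reading past the NUL. */
    size_t len = 0;
    while (len < n && src[len] != '\0')
        len++;

    memcpy(dst, src, len);
    /* Pad the remaining n - len bytes with zeros in a single call. */
    memset(dst + len, 0, n - len);
    return dst;
}
```

A stpncpy variant would differ only in returning `dst + len`.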
2016-04-25powerpc: Add optimized strcspn for P8Paul E. Murphy
A few minor adjustments to the P8 strspn gives us an almost equally optimized P8 strcspn.
2016-04-22powerpc: strcasestr optmization for power8Rajalakshmi Srinivasaraghavan
This patch optimizes the strcasestr function for POWER8 and later systems. It compares 16 bytes at a time using vector instructions, for an average improvement of ~40%. This patch is tested on powerpc64 and powerpc64le.
2016-04-15powerpc: Optimization for strlen for POWER8.Carlos Eduardo Seo
This implementation takes advantage of vectorization to improve performance of the loop over the current strlen implementation for POWER7.
2016-04-07powerpc: Add optimized P8 strspnPaul E. Murphy
This uses vectors and bitmasks. For a small needle and a large haystack, the performance improvement is up to 8x. For short strings (0-4B), the cost of computing the bitmask dominates, and it is a tad slower.
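The bitmask part of the technique can be sketched in scalar C: build a 256-bit set with one bit per byte value in the accept string, then test each byte of the input against it. The name `sketch_strspn` is hypothetical, and the vector processing of the assembly version is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

size_t sketch_strspn(const char *s, const char *accept)
{
    assert(s != NULL && accept != NULL);
    uint64_t bits[4] = {0, 0, 0, 0};    /* 256 bits, one per byte value */

    /* Setup cost: one pass over accept.  This is the part that
       dominates for very short inputs. */
    for (const unsigned char *a = (const unsigned char *)accept; *a; a++)
        bits[*a >> 6] |= (uint64_t)1 << (*a & 63);

    /* Scan: a constant-time membership test per input byte. */
    size_t n = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p; p++, n++)
        if (!((bits[*p >> 6] >> (*p & 63)) & 1))
            break;
    return n;
}
```

Testing for a clear bit gives strspn; stopping on a set bit instead would give the complementary strcspn behaviour.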
2016-01-04Update copyright dates with scripts/update-copyrights.Joseph Myers
2015-10-01PowerPC: Add comments to optimized strncpyGabriel F. T. Gomes
* sysdeps/powerpc/powerpc64/power8/strncpy.S: Added comments to some assembly instructions.
2015-10-01PowerPC: Fix operand prefixesGabriel F. T. Gomes
The file sysdeps/powerpc/sysdeps.h defines aliases for register operands, which add the letter 'r' as a prefix to a register name. E.g.: register 20 can be written as 'r20', instead of '20'. On the one hand, this increases readability, as it makes it easier for readers to know whether the operand is a register or an immediate. On the other hand, it allows immediate operands to be written as if they were registers, and vice versa, thus reducing the readability of the code. This commit removes some of these unintentional misuses. This commit also increases readability of the code by adding the prefix 'cr' to some uses of the condition register. Both changes have no effect on the final code. Checked with objdump. * sysdeps/powerpc/powerpc64/power8/strncpy.S: Remove or add register prefix from operands.
2015-01-24powerpc: Fix powerpc64 build failure with binutils 2.22Adhemerval Zanella
The GLIBC memset optimization for POWER8 uses the '.machine power8' directive, which is only officially supported on binutils 2.24+. This causes a build failure on older binutils. Since .machine power8 is required only to correctly assemble the 'mtvsrd' instruction, and that is already handled by the MTVSRD_V1_R4 macro, there is no real need to use it. The patch replaces power8 with power7 in the .machine directive. It fixes BZ#17869.
2015-01-13powerpc: Optimized strncmp for POWER8/PPC64Adhemerval Zanella
This patch adds an optimized POWER8 strncmp. The implementation focuses on speeding up unaligned cases, following the ideas of the power8 strcmp. The algorithm first checks the initial 16 bytes, then aligns the first function source and uses unaligned loads on the second argument only. Additional checks for page boundaries are done for unaligned cases (where the source alignments are different).
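The page-boundary check mentioned above can be illustrated with a small C predicate. It assumes a 4096-byte page and a 16-byte load; both constants and the name `load_crosses_page` are assumptions for this sketch, not values taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: smallest page size */
#define SKETCH_LOAD_SIZE 16u    /* assumption: bytes read per load */

/* Returns 1 if a SKETCH_LOAD_SIZE-byte load starting at p would touch
   the next page, 0 if it stays within the page containing p. */
int load_crosses_page(const void *p)
{
    /* The offset computation relies on a power-of-two page size. */
    assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0);
    uintptr_t off = (uintptr_t)p & (SKETCH_PAGE_SIZE - 1);
    return off > SKETCH_PAGE_SIZE - SKETCH_LOAD_SIZE;
}
```

When the predicate is true, an unaligned load could fault on an unmapped page even though the string ends earlier, so a slower, safe path is needed for those bytes.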
2015-01-13powerpc: Optimized strcmp for POWER8/PPC64Adhemerval Zanella
This patch adds an optimized POWER8 strcmp using unaligned accesses. The algorithm first checks the initial 16 bytes, then aligns the first function source and uses unaligned loads on the second argument only. Additional checks for page boundaries are done for unaligned cases