Age | Commit message | Author |
|
|
|
|
|
Syncs up with generic code.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This includes the overridden mpa.c in power4.
|
|
|
|
Fixed comment style and used clearer wording in some cases.
|
|
|
|
|
|
The power4-specific mpa.c depended on some global variables that were
removed by earlier patches. Also, it did not define mpone and mptwo.
|
|
|
|
|
|
|
|
|
|
Initially based on the versions found in wcsmbs/*; these files have
been changed by hand-unrolling the loops and adding some variables
to allow some read-ahead to occur, which relieves some of the
wait-for-increment/wait-for-load/wait-for-compare-results pressure
that was slowing down every iteration of the while-loop.
For 64-bit Power7, these changes give an approximately 20% throughput
boost for the wcschr and wcsrchr functions and an approximately 40%
boost for the wcscpy function. 32-bit improvements appear to be slightly
better, at ~30% and ~45% respectively. Results for Power6 closely match
those for Power7.
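
The unrolling and read-ahead described above might look roughly like the
following C sketch. It is not the code from this commit: the name
sketch_wcschr and the unroll factor of two are assumptions made purely for
illustration. Each load is issued only after the previous character is known
not to be the terminator, so the read-ahead never runs past the end of the
string.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical illustration only; not the glibc implementation.
   Search S for WC, handling two characters per loop iteration.  */
static wchar_t *
sketch_wcschr (const wchar_t *s, wchar_t wc)
{
  /* c0 always holds the character at s[0]; it is loaded ahead of the
     iteration that tests it.  */
  wchar_t c0 = s[0];

  for (;;)
    {
      if (c0 == wc)
        return (wchar_t *) s;
      if (c0 == L'\0')
        return NULL;

      /* s[0] is not the terminator, so s[1] is safe to read.  */
      wchar_t c1 = s[1];
      if (c1 == wc)
        return (wchar_t *) (s + 1);
      if (c1 == L'\0')
        return NULL;

      /* Read ahead for the next iteration before advancing.  */
      c0 = s[2];
      s += 2;
    }
}
```

Keeping the loaded values in separate variables (c0, c1) is what gives the
processor independent loads and compares to overlap; the actual gain depends
on the target and the compiler.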
|
|
Assorted tweaking, twisting and tuning to squeeze a few additional cycles
out of the memchr code. Changes include bypassing the shift pairs
(sld,srd) when they are not required, and unrolling the small_loop that
handles short and trailing strings.
Per scrollpipe data measuring aligned strings for 64-bit, these changes
save between five and eight cycles (9-13% overall) for short strings (<32).
Longer aligned strings see a slight improvement of 1-3% due to bypassing the
shifts and rearranging the instructions.
|
|
|
|
|
|
Split PowerPC definitions into PPC32 and PPC64 headers.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
* sysdeps/powerpc/powerpc32/dl-irel.h (elf_ifunc_invoke): Pass
dl_hwcap to ifunc resolver.
* sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela): Use
elf_ifunc_invoke.
* sysdeps/powerpc/powerpc64/dl-irel.h (elf_ifunc_invoke): Pass
dl_hwcap to ifunc resolver.
* sysdeps/powerpc/powerpc64/dl-machine.h (resolve_ifunc): Likewise.
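
A resolver that receives the hardware capability word can select an
implementation from it. The following is a minimal sketch in plain C, with
no ifunc attribute, so that it stays self-contained. SKETCH_HWCAP_FAST, the
strlen variants and sketch_resolve_strlen are all hypothetical names and
values, not part of glibc or the kernel headers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical capability bit, for illustration only.  */
#define SKETCH_HWCAP_FAST 0x1UL

typedef size_t (*sketch_strlen_fn) (const char *);

/* Baseline implementation, usable on any processor.  */
static size_t
sketch_strlen_generic (const char *s)
{
  size_t n = 0;
  while (s[n] != '\0')
    n++;
  return n;
}

/* Stand-in for a processor-specific variant.  */
static size_t
sketch_strlen_fast (const char *s)
{
  return strlen (s);
}

/* Resolver: choose an implementation from the hwcap value passed in,
   rather than reading it from a global.  */
static sketch_strlen_fn
sketch_resolve_strlen (unsigned long hwcap)
{
  return (hwcap & SKETCH_HWCAP_FAST)
         ? sketch_strlen_fast
         : sketch_strlen_generic;
}
```

Taking hwcap as an argument means the resolver does not need to reach into
loader state, which may not be relocated yet when the resolver runs.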
|
|
Update libm abilist for POWER6 and POWER7.
|
|
|
|
In the past the "-ftree-loop-linear" switch provided a measurable
improvement in performance for certain functions. At some point it
was assigned as the responsibility of Graphite in GCC. It has been
found that even with Graphite enabled this switch no longer provides
any appreciable improvement over the baseline.
Graphite now has some open bugs which need to be fixed in order for it
to provide measurable performance improvements but it lacks active
development. As a result some compiler distributors may disable
Graphite. If Graphite is disabled then building GLIBC will fail if
the "-ftree-loop-linear" switch is used.
This patch removes the use of "-ftree-loop-linear" as unnecessary.
|
|
|
|
|
|
|
|
This patch provides optimized logb (1.2x on PPC32 and 2.5x on PPC64),
logbf (1.1x on PPC32 and 2.2x on PPC64), and logbl (1.3x on PPC32 and
50% on PPC64) for the POWER7 processor.
|
|
|
|
This fix replaces switch statements that contain individual
[fwd|bwd]_align_merge (<constant>) calls with a single [fwd|bwd]_align_merge
(align) call.
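
The shape of that change can be sketched as follows. The macro below is a
hypothetical stand-in for the [fwd|bwd]_align_merge macros, not their actual
definition: it assumes 32-bit words, big-endian byte numbering, and an align
value of 1 to 3.

```c
#include <stdint.h>

/* Hypothetical stand-in: form the word that starts ALIGN bytes into W0
   from two adjacent aligned words W0 and W1.  Valid for ALIGN 1..3.  */
#define SKETCH_ALIGN_MERGE(w0, w1, align)          \
  (((uint32_t) (w0) << ((align) * 8))              \
   | ((uint32_t) (w1) >> (32 - (align) * 8)))

/* Before: one call per constant, selected by a switch.  */
static uint32_t
sketch_merge_switch (uint32_t w0, uint32_t w1, unsigned int align)
{
  switch (align)
    {
    case 1:
      return SKETCH_ALIGN_MERGE (w0, w1, 1);
    case 2:
      return SKETCH_ALIGN_MERGE (w0, w1, 2);
    case 3:
      return SKETCH_ALIGN_MERGE (w0, w1, 3);
    default:
      return w0;
    }
}

/* After: a single call that passes the variable through.  */
static uint32_t
sketch_merge_direct (uint32_t w0, uint32_t w1, unsigned int align)
{
  /* align == 0 needs no merge, and would shift by the full word width.  */
  return align == 0 ? w0 : SKETCH_ALIGN_MERGE (w0, w1, align);
}
```

Both forms compute the same result; the single call simply removes the
duplicated cases.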
|
|
|
|
* elf/dynamic-link.h (_ELF_DYNAMIC_DO_RELOC): Reduce down to one
definition.
* sysdeps/powerpc/powerpc32/dl-machine.h
(ELF_MACHINE_PLTREL_OVERLAP): Delete.
* sysdeps/s390/s390-32/dl-machine.h
(ELF_MACHINE_PLTREL_OVERLAP): Likewise.
* sysdeps/sparc/sparc32/dl-machine.h
(ELF_MACHINE_PLTREL_OVERLAP): Likewise.
* sysdeps/sparc/sparc64/dl-machine.h
(ELF_MACHINE_PLTREL_OVERLAP): Likewise.
|
|
|
|
* sysdeps/powerpc/powerpc32/elf/bzero.S: Moved to ...
* sysdeps/powerpc/powerpc32/bzero.S: ... here.
* sysdeps/powerpc/powerpc32/elf/start.S: Moved to ...
* sysdeps/powerpc/powerpc32/start.S: ... here.
* sysdeps/powerpc/powerpc32/elf/configure.in: Merge into ...
* sysdeps/powerpc/powerpc32/configure.in: ... this.
* sysdeps/powerpc/powerpc32/elf/configure: Delete file.
|
|
|
|
|
|
|
|
Entire tree edited via find | grep | sed.
|
|
|
|
|