author | Wilco Dijkstra <wdijkstr@arm.com> | 2016-06-20 17:38:13 +0100 |
---|---|---|
committer | Wilco Dijkstra <wdijkstr@arm.com> | 2016-06-20 17:41:33 +0100 |
commit | b998e16e71c8617746b7c39500e925d28ff22ed8 (patch) | |
tree | 6cca1efccbf603c0664547d4fd66d625806d6396 /ChangeLog | |
parent | aca1daef298b43bd7b1987b31f5aabcf6c2f6021 (diff) | |
This is an optimized memcpy/memmove for AArch64. Copies are split into 3 main
cases: small copies of up to 16 bytes; medium copies of 17..96 bytes, which are
fully unrolled; and large copies of more than 96 bytes, which align the
destination and use an unrolled loop processing 64 bytes per iteration. In
order to share code with memmove, small and medium copies read all data before
writing, allowing any kind of overlap. All memmoves except for the large
backwards case fall into memcpy for optimal performance. On a random copy test,
memcpy/memmove are 40% faster on Cortex-A57 and 28% faster on Cortex-A53.
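
The size dispatch described above can be illustrated with a rough C sketch. This is not the code in sysdeps/aarch64/memcpy.S, which is hand-written assembly; the names `sketch_copy` and `copy_via_tmp` are hypothetical, and the sketch leaves out the destination alignment and loop unrolling that the real large-copy path performs. It only shows why reading a whole block before writing it makes small and medium copies overlap-safe, and why only the large backwards case needs its own path.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read all n bytes (n <= 96) into a temporary,
   then write them. Because every load happens before any store,
   src and dst may overlap in any way. */
static void copy_via_tmp(unsigned char *dst, const unsigned char *src, size_t n)
{
    unsigned char tmp[96];
    memcpy(tmp, src, n);   /* tmp never overlaps src or dst */
    memcpy(dst, tmp, n);
}

/* Hypothetical sketch of the three-way split; behaves like memmove. */
static void *sketch_copy(void *dstv, const void *srcv, size_t n)
{
    unsigned char *dst = dstv;
    const unsigned char *src = srcv;

    /* Small (<= 16) and medium (17..96): one read-all-then-write step.
       The assembly uses separate unrolled sequences for these cases. */
    if (n <= 96) {
        copy_via_tmp(dst, src, n);
        return dstv;
    }

    /* Large: 64 bytes per iteration. A forward loop is safe unless dst
       lies inside [src, src + n); the unsigned difference wraps to a
       large value when dst is below src. */
    if ((uintptr_t)dst - (uintptr_t)src >= n) {
        size_t i = 0;
        while (n - i >= 64) {
            copy_via_tmp(dst + i, src + i, 64);
            i += 64;
        }
        if (i < n)
            copy_via_tmp(dst + i, src + i, n - i);
    } else {
        /* Large backwards case: walk from the end towards the start. */
        size_t end = n;
        while (end >= 64) {
            end -= 64;
            copy_via_tmp(dst + end, src + end, 64);
        }
        if (end > 0)
            copy_via_tmp(dst, src, end);
    }
    return dstv;
}
```

In this sketch the forward path doubles as the memcpy path, so a memmove call only diverges when the copy is large and the destination starts inside the source range.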
* sysdeps/aarch64/memcpy.S (memcpy):
Rewrite of optimized memcpy and memmove.
* sysdeps/aarch64/memmove.S (memmove): Remove
memmove code (merged into memcpy.S).
Diffstat (limited to 'ChangeLog')
-rw-r--r-- | ChangeLog | 7 |
1 files changed, 7 insertions, 0 deletions
    @@ -1,3 +1,10 @@
    +2016-06-20 Wilco Dijkstra <wdijkstr@arm.com>
    +
    +	* sysdeps/aarch64/memcpy.S (memcpy):
    +	Rewrite of optimized memcpy and memmove.
    +	* sysdeps/aarch64/memmove.S (memmove): Remove
    +	memmove code (merged into memcpy.S).
    +
     2016-06-20 Florian Weimer <fweimer@redhat.com>
     
     	Consolidate machine-agnostic DTV definitions in <dl-dtv.h>.