author | Adhemerval Zanella <azanella@linux.vnet.ibm.com> | 2014-12-31 11:47:41 -0500 |
---|---|---|
committer | Adhemerval Zanella <azanella@linux.vnet.ibm.com> | 2015-01-13 11:28:44 -0500 |
commit | f06a4faf8a2b4d046eb40e94b47948cc47d79902 (patch) | |
tree | 846d1fc4c0ce0be53ef275c227b2accab263bbda /NEWS | |
parent | 9f2f36e5a91c2ce6edba5415e176155eb1008ae1 (diff) | |
powerpc: Optimized st{r,p}ncpy for POWER8/PPC64
This patch adds an optimized POWER8 st{r,p}ncpy using unaligned accesses.
It shows a 10%-80% improvement over the optimized POWER7 one, which uses
only aligned accesses, especially on unaligned inputs.

The algorithm first reads and checks 16 bytes (if the inputs do not cross a
4K page boundary). Then it realigns the source to 16 bytes and issues a
16-byte read-and-compare loop to speed up null byte checks for large
strings. Also, unlike the POWER7 optimization, the null padding is done
inline in the implementation, using possibly unaligned accesses, instead of
relying on a memset call. A special case is added for page-crossing reads.
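
The control flow described above can be modelled in portable C. This is only a rough sketch, not the POWER8 assembly added by the patch: the names `sketch_strncpy`, `copy_chunk`, `crosses_page`, `SKETCH_CHUNK` and `SKETCH_PAGE` are hypothetical, and each 16-byte block is scanned byte by byte here where the real code would presumably use wide loads and compares. It assumes the usual strncpy contract (source readable up to the null byte or `n` bytes, destination writable for `n` bytes).

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CHUNK 16u   /* block size named in the commit message */
#define SKETCH_PAGE  4096u /* 4K page named in the commit message */

/* Nonzero if a 16-byte read starting at p would run past the end of
   the 4K page containing p. */
static int crosses_page(const char *p)
{
    return ((uintptr_t)p & (SKETCH_PAGE - 1)) > SKETCH_PAGE - SKETCH_CHUNK;
}

/* Copy at most `limit` bytes, stopping after a null byte has been copied.
   Returns the number of bytes copied and reports whether a null was seen.
   Stand-in for one wide load plus compare (an assumption of this sketch). */
static size_t copy_chunk(char *dst, const char *src, size_t limit, int *found_nul)
{
    size_t i;
    *found_nul = 0;
    for (i = 0; i < limit; i++) {
        dst[i] = src[i];
        if (src[i] == '\0') {
            *found_nul = 1;
            return i + 1;
        }
    }
    return i;
}

char *sketch_strncpy(char *dst, const char *src, size_t n)
{
    size_t done;
    int found_nul = 0;

    /* Step 1: first block. Normally 16 bytes; if that read would cross
       a page, only go up to the page end (the special case). */
    size_t first = SKETCH_CHUNK;
    if (crosses_page(src))
        first = SKETCH_PAGE - (size_t)((uintptr_t)src & (SKETCH_PAGE - 1));
    if (first > n)
        first = n;
    done = copy_chunk(dst, src, first, &found_nul);

    /* Step 2: realign the source to 16 bytes by backing up to the first
       aligned address after src. The overlapping bytes were already
       checked and are simply copied again. */
    if (!found_nul && done < n && done == SKETCH_CHUNK)
        done = SKETCH_CHUNK - (size_t)((uintptr_t)src & (SKETCH_CHUNK - 1));

    /* Step 3: 16-byte read-and-compare loop on the aligned source. */
    while (!found_nul && done < n) {
        size_t len = n - done < SKETCH_CHUNK ? n - done : SKETCH_CHUNK;
        done += copy_chunk(dst + done, src + done, len, &found_nul);
    }

    /* Step 4: null padding done inline rather than through memset. */
    while (done < n)
        dst[done++] = '\0';

    return dst;
}
```

Because every block after the first starts on a 16-byte boundary and 4096 is a multiple of 16, no block in the main loop can straddle a page, which is why only the first read needs the page-cross check in this model.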
Diffstat (limited to 'NEWS')
-rw-r--r-- | NEWS | 3 |
1 files changed, 2 insertions, 1 deletions
@@ -19,7 +19,8 @@ Version 2.21
   17744, 17745, 17746, 17747, 17748, 17775, 17777, 17780, 17781, 17782,
   17791, 17793, 17796, 17797, 17803, 17806, 17834
 
-* Optimized strcpy and stpcpy implementations for powerpc64/powerpc64le.
+* Optimized strcpy, stpcpy, strncpy, stpncpy implementations for
+  powerpc64/powerpc64le.
 
 * Added support for TSX lock elision of pthread mutexes on powerpc32, powerpc64
   and powerpc64le. This may improve lock scaling of existing programs on