Age | Commit message | Author | |
---|---|---|---|
2010-07-31 | Add support for SSSE3 and SSE4.2 versions of strcasecmp on x86-64. | Ulrich Drepper | |
2010-07-30 | Pretty printing x86-64 SSE4.2 strcmp. | Ulrich Drepper | |
2010-07-30 | Implement optimized strcasecmp for x86-64. | Ulrich Drepper | |
2010-07-30 | Fix tolower operation in strcasestr. | Ulrich Drepper | |
2010-07-27 | Avoid compiling unneeded file in ld.so. | Ulrich Drepper | |
2010-07-26 | Add optimized x86-64 implementation of strnlen. | Ulrich Drepper | |
While at it, beef up the test suite for strnlen and add performance tests for it, too. | |||
2010-07-24 | Speed up x86-64 strcasestr a bit more. | Ulrich Drepper | |
Using the new SSE4.2 instructions is cool but not really the fastest. Some older SSE instructions can do the trick faster. | |||
2010-07-21 | Add strcasestr-nonascii to i386 build | Andreas Schwab | |
2010-07-16 | Fix non-ASCII case of SSE4.2 strcasestr. | Ulrich Drepper | |
2010-07-16 | Speed up SSE4.2 strcasestr by avoiding indirect function call. | Ulrich Drepper | |
2010-06-30 | Improve 64bit memcpy/memmove for Atom, Core 2 and Core i7 | H.J. Lu | |
This patch includes optimized 64bit memcpy/memmove for Atom, Core 2 and Core i7. It improves memcpy by up to 3X on Atom, up to 4X on Core 2 and up to 1X on Core i7. It also improves memmove by up to 3X on Atom, up to 4X on Core 2 and up to 2X on Core i7. | |||
2010-05-27 | Incorrect x86 CPU family and model check. | H.J. Lu | |
2010-04-14 | Whitespace fix. | Ulrich Drepper | |
2010-04-14 | Add x86-32 FMA support | H.J. Lu | |
2010-04-14 | Check DATA_CACHE_SIZE_HALF | H.J. Lu | |
2010-04-14 | Optimize x86-64 SSE4 memcmp for unaligned data. | H.J. Lu | |
2010-04-14 | x86-64 SSE4 optimized memcmp | H.J. Lu | |
This is 64bit SSE4 optimized memcmp. It improves memcmp by up to 3X on Intel Core i7. | |||
2010-04-13 | Update x86-64 cpu multiarch selection header. | Ulrich Drepper | |
2010-04-04 | Fix concurrent handling of __cpu_features. | Ulrich Drepper | |
2010-03-24 | Don't define __strpbrk_sse42 in static library | H.J. Lu | |
2010-03-04 | Fix R_X86_64_PC32 overflow detection | Richard Guenther | |
2010-02-24 | We can use the 64-bit register versions of the double functions. | Ulrich Drepper | |
2010-02-09 | Avoid PLT call to fegetenv on s390 | Andreas Schwab | |
2010-01-14 | Prevent silent errors should x86-64 strncmp be needed outside libc. | Ulrich Drepper | |
2010-01-13 | Unroll the loop x86-64 SSE4.2 strlen. | H.J. Lu | |
2010-01-12 | Optimize 32bit memset/memcpy with SSE2/SSSE3. | H.J. Lu | |
2009-12-13 | Define bit_SSE2 and index_SSE2. | H.J. Lu | |
2009-12-13 | Define bit_XXX and index_XXX. | H.J. Lu | |
This patch defines bit_XXX and index_XXX and uses them to check processor features in assembly code. This can prevent typos in processor feature checks. | |||
2009-10-22 | Fix whitespaces. | Ulrich Drepper | |
2009-10-22 | Implement SSE4.2 optimized strchr and strrchr. | H.J. Lu | |
2009-10-06 | Clean up unnecessary libc_hidden_builtin_def fiddling in x86 multiarch definitions. | Roland McGrath | |
2009-10-06 | Clean up x86 multiarch HAS_FOO macros. | Roland McGrath | |
2009-09-15 | configure tweaks, support $libc_add_on_config_subdirs | Roland McGrath | |
2009-09-02 | Fix strstr/strcasestr/fma/fmaf on x86_64. | Jakub Jelinek | |
2009-09-01 | Fix x86_64 bits/mathinline.h for -m32 compilation. | Jakub Jelinek | |
2009-08-31 | Fix parse error in bits/mathinline.h with --std=c99 | Andreas Schwab | |
2009-08-28 | Remove ENABLE_SSSE3_ON_ATOM. | H.J. Lu | |
It turns out that SSSE3 isn't slow on Atom. The problem is bsf. This patch removes ENABLE_SSSE3_ON_ATOM. | |||
2009-08-25 | Optimize out duplicated scalbln code for x86-64. | Ulrich Drepper | |
2009-08-25 | Optimized signbit{,f} for x86-64. | Ulrich Drepper | |
2009-08-25 | Handle AVX saving on x86-64 in interrupted symbol lookups. | Ulrich Drepper | |
If a signal arrived during a symbol lookup and the signal handler also required a symbol lookup, the end of the lookup in the signal handler reset the flag whether restoring AVX/SSE registers is needed. Resetting means in this case that the tail part of the outer lookup code will try to restore the registers and this can fail miserably. We now restore to the previous value which makes nesting calls possible. | |||
2009-08-24 | Add ceil implementation for 64-bit machines. | Ulrich Drepper | |
On 64-bit machines we should not split doubles into two 32-bit integers and handle the words separately. We have wide registers. This patch implements a 64-bit ceil version. Ideally all other functions will be converted over time. | |||
2009-08-24 | Optimize float construction/extraction on x86-64. | Ulrich Drepper | |
2009-08-24 | Optimize x86-64 signbit{,f} a bit. | Ulrich Drepper | |
2009-08-08 | Support mixed SSE/AVX audit and check AVX only once. | H.J. Lu | |
This patch fixes mixed SSE/AVX audit and checks AVX only once in _dl_runtime_profile. When an AVX or SSE register value in pltenter is modified, we have to make sure that the SSE part value is the same in both lr_xmm and lr_vector fields so that pltexit will get the correct value from either lr_xmm or lr_vector fields. AVX-enabled pltenter should update both lr_xmm and lr_vector fields to support stacked AVX/SSE pltenter functions. | |||
2009-08-08 | Move SSE4.2 functions together. | Ulrich Drepper | |
2009-08-07 | Add SSSE3-optimized implementation of str{,n}cmp for x86-64. | Ulrich Drepper | |
2009-08-07 | Avoid warning through fake initialization. | Ulrich Drepper | |
2009-08-07 | Fix whitespaces in last checkin. | Ulrich Drepper | |
2009-08-07 | Properly count number of logical processors on Intel CPUs. | H.J. Lu | |
The meaning of bits 25-14 in EAX returned from cpuid with EAX = 4 has changed from "the maximum number of threads sharing the cache" to "the maximum number of addressable IDs for logical processors sharing the cache" if cpuid supports EAX = 11. We need to use results from both EAX = 4 and EAX = 11 to get the number of threads sharing the cache. Bits 25-14 in EAX on Core i7 are 15 although the number of logical processors is 8. Here is a white paper on this: http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/ This patch correctly counts the number of logical processors on Intel CPUs with EAX = 11 support on cpuid. Tested on Dunnington, Core i7 and Nehalem EX/EP. It also fixes the Pentium D workaround since EBX may not have the right value returned from cpuid with EAX = 1. | |||
2009-08-04 | Add x86 32-bit SSE4.2 string functions. | H.J. Lu | |
This patch adds 32bit SSE4.2 string functions. It uses -16L instead of 0xfffffffffffffff0L, which works for both 32bit and 64bit long. Tested on 32bit Core i7 and Core 2. |
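
The last entry's point about -16L versus 0xfffffffffffffff0L can be illustrated with a small C sketch. This does not reproduce the patch: the helper names (`align_down_16`, `mask_for_32bit_long`, `mask_for_64bit_long`, `check_masks`) are hypothetical, and fixed-width types stand in for a 32-bit and a 64-bit `long`. The idea is only that -16 converts to "all ones with the low four bits clear" at any width, whereas the hexadecimal constant is written for 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from glibc: round an address down to a
   16-byte boundary, as SSE code typically does before an aligned load.  */
static uintptr_t
align_down_16 (uintptr_t addr)
{
  /* Converting -16 to an unsigned type is done modulo 2^N, so the mask
     has every bit set except the low four, whatever N is.  */
  return addr & (uintptr_t) -16L;
}

/* Assumption for illustration: int32_t models a 32-bit long.  */
static uint32_t
mask_for_32bit_long (void)
{
  return (uint32_t) (int32_t) -16;
}

/* Assumption for illustration: int64_t models a 64-bit long.  */
static uint64_t
mask_for_64bit_long (void)
{
  return (uint64_t) (int64_t) -16;
}

/* Self-check: the same source constant yields the right mask at both widths.  */
static void
check_masks (void)
{
  assert (mask_for_32bit_long () == UINT32_C (0xfffffff0));
  assert (mask_for_64bit_long () == UINT64_C (0xfffffffffffffff0));
  assert (align_down_16 ((uintptr_t) 0x1237) == (uintptr_t) 0x1230);
}
```

Calling `check_masks ()` from a test program exercises both widths. Writing the mask as a negative constant keeps one source line valid for 32-bit and 64-bit builds, which is the property the commit message describes.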