Age | Commit message (Collapse) | Author | |
---|---|---|---|
2019-01-01 | Update copyright dates with scripts/update-copyrights. | Joseph Myers | |
* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise. | |||
2018-06-01 | x86-64: Optimize strcmp/wcscmp and strncmp/wcsncmp with AVX2 | Leonardo Sandoval | |
Optimize x86-64 strcmp/wcscmp and strncmp/wcsncmp with AVX2. It uses vector comparison as much as possible. Peak performance observed on a SkyLake machine: 9x, 3x, 2.5x and 5.5x for strcmp, strncmp, wcscmp and wcsncmp, respectively. The larger the comparison length, the more benefit using avx2 functions, except on the strcmp, where peak is observed at length == 32 bytes. Select AVX2 strcmp/wcscmp on AVX2 machines where vzeroupper is preferred and AVX unaligned load is fast. NB: It uses TZCNT instead of BSF since TZCNT produces the same result as BSF for non-zero input. TZCNT is faster than BSF and is executed as BSF if machine doesn't support TZCNT. * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add strcmp-avx2, strncmp-avx2, wcscmp-avx2, wcscmp-sse2, wcsncmp-avx2 and wcsncmp-sse2. * sysdeps/x86_64/multiarch/ifunc-impl-list.c (__libc_ifunc_impl_list): Add tests for __strcmp_avx2, __strncmp_avx2, __wcscmp_avx2, __wcsncmp_avx2, __wcscmp_sse2 and __wcsncmp_sse2. * sysdeps/x86_64/multiarch/strcmp.c (OPTIMIZE (avx2)): (IFUNC_SELECTOR): Return OPTIMIZE (avx2) on AVX 2 machines if AVX unaligned load is fast and vzeroupper is preferred. * sysdeps/x86_64/multiarch/strncmp.c: Likewise. * sysdeps/x86_64/multiarch/strcmp-avx2.S: New file. * sysdeps/x86_64/multiarch/strncmp-avx2.S: Likewise. * sysdeps/x86_64/multiarch/wcscmp-avx2.S: Likewise. * sysdeps/x86_64/multiarch/wcscmp-sse2.S: Likewise. * sysdeps/x86_64/multiarch/wcscmp.c: Likewise. * sysdeps/x86_64/multiarch/wcsncmp-avx2.S: Likewise. * sysdeps/x86_64/multiarch/wcsncmp-sse2.c: Likewise. * sysdeps/x86_64/multiarch/wcsncmp.c: Likewise. * sysdeps/x86_64/wcscmp.S (__wcscmp): Add alias only if __wcscmp is undefined. | |||
2018-01-01 | Update copyright dates with scripts/update-copyrights. | Joseph Myers | |
* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise. | |||
2017-01-01 | Update copyright dates with scripts/update-copyrights. | Joseph Myers | |
2016-01-04 | Update copyright dates with scripts/update-copyrights. | Joseph Myers | |
2015-06-09 | Fix regcomp wcscoll, wcscmp namespace (bug 18497). | Joseph Myers | |
regcomp brings in references to wcscoll, which isn't in all the standards that contain regcomp. In turn, wcscoll brings in references to wcscmp, also not in all those standards. This patch fixes this by making those functions into weak aliases of __wcscoll and __wcscmp and calling those names instead as needed. Tested for x86_64 and x86 (testsuite, and that disassembly of installed shared libraries is unchanged by the patch). [BZ #18497] * wcsmbs/wcscmp.c [!WCSCMP] (WCSCMP): Define as __wcscmp instead of wcscmp. (wcscmp): Define as weak alias of WCSCMP. * wcsmbs/wcscoll.c (STRCOLL): Define as __wcscoll instead of wcscoll. (USE_HIDDEN_DEF): Define. [!USE_IN_EXTENDED_LOCALE_MODEL] (wcscoll): Define as weak alias of __wcscoll. Don't use libc_hidden_weak. * wcsmbs/wcscoll_l.c (STRCMP): Define as __wcscmp instead of wcscmp. * sysdeps/i386/i686/multiarch/wcscmp-c.c [SHARED] (libc_hidden_def): Define __GI___wcscmp instead of __GI_wcscmp. (weak_alias): Undefine and redefine. * sysdeps/i386/i686/multiarch/wcscmp.S (wcscmp): Rename to __wcscmp and define as weak alias of __wcscmp. * sysdeps/x86_64/wcscmp.S (wcscmp): Likewise. * include/wchar.h (__wcscmp): Declare. Use libc_hidden_proto. (__wcscoll): Likewise. (wcscmp): Don't use libc_hidden_proto. (wcscoll): Likewise. * posix/regcomp.c (build_range_exp): Call __wcscoll instead of wcscoll. * posix/regexec.c (check_node_accept_bytes): Likewise. * conform/Makefile (test-xfail-XPG3/regex.h/linknamespace): Remove variable. (test-xfail-XPG4/regex.h/linknamespace): Likewise. (test-xfail-POSIX/regex.h/linknamespace): Likewise. | |||
2015-01-02 | Update copyright dates with scripts/update-copyrights. | Joseph Myers | |
2014-01-01 | Update copyright notices with scripts/update-copyrights | Allan McRae | |
2013-01-02 | Update copyright notices with scripts/update-copyrights. | Joseph Myers | |
2012-02-09 | Replace FSF snail mail address with URLs. | Paul Eggert | |
2011-10-23 | Fix WS | Ulrich Drepper | |
2011-10-23 | Fix signedness in wcscmp comparison | Liubov Dmitrieva | |
2011-09-06 | Remove now-wrong comment | Ulrich Drepper | |
2011-09-05 | Add optimized x86-64 wcscmp | Ulrich Drepper | |