path: root/string/strstr.c
Age | Commit message | Author
2019-06-12 | Improve performance of strstr | Wilco Dijkstra
This patch significantly improves performance of strstr using a novel modified Horspool algorithm. Needles up to size 256 use a bad-character table indexed by hashed pairs of characters to quickly skip past mismatches. Long needles use a self-adapting filtering step to avoid comparing the whole needle repeatedly.

By limiting the needle length to 256, the shift table only requires 8 bits per entry, lowering preprocessing overhead and minimizing cache effects. This limit also implies worst-case performance is linear.

Small needles up to size 3 use a dedicated linear search. Very long needles use the Two-Way algorithm.

The performance gain using the improved bench-strstr on Cortex-A72 is 5.8 times basic_strstr and 3.7 times twoway_strstr.

Tested against GLIBC testsuite, randomized tests and the GNULIB strstr test (https://git.savannah.gnu.org/cgit/gnulib.git/tree/tests/test-strstr.c).

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

* string/str-two-way.h (two_way_short_needle): Add inline to avoid warning.
(two_way_long_needle): Block inlining.
* string/strstr.c (strstr2): Add new function.
(strstr3): Likewise.
(STRSTR): Completely rewrite strstr to improve performance.
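The pair-hashed bad-character table described in this commit message can be illustrated with a minimal sketch. This is not the code from the patch: the function name `horspool_pairs`, the table size, and the mixing function in `hash2` are assumptions chosen for illustration, and the self-adapting filtering step for long needles is left out. It shows why capping the needle at 256 bytes lets every shift fit in a `uint8_t`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 256u /* illustrative size, not taken from the patch */

/* Hash the byte pair (p[-1], p[0]) into a table index.
   The mixing function is an assumption made for this sketch. */
static size_t hash2(const unsigned char *p)
{
    return (p[0] - ((size_t)p[-1] << 3)) % TABLE_SIZE;
}

/* Horspool-style search keyed on hashed byte pairs.
   Handles needles with 2 <= nlen <= 256; returns NULL otherwise
   or when the needle does not occur in the haystack. */
const char *horspool_pairs(const char *haystack, size_t hlen,
                           const char *needle, size_t nlen)
{
    const unsigned char *hs = (const unsigned char *)haystack;
    const unsigned char *ne = (const unsigned char *)needle;
    uint8_t skip[TABLE_SIZE];

    if (nlen < 2 || nlen > 256 || hlen < nlen)
        return NULL;

    size_t m1 = nlen - 1; /* at most 255, so every shift fits in 8 bits */

    /* A pair that never occurs in the needle allows the maximum shift. */
    memset(skip, (int)m1, sizeof skip);

    /* For each pair ending at needle index i, record the distance to the
       needle end; later (rightmost) pairs overwrite with smaller shifts. */
    for (size_t i = 1; i < m1; i++)
        skip[hash2(ne + i)] = (uint8_t)(m1 - i);

    /* Shift to apply after the end pair matched but the needle did not. */
    size_t end_skip = skip[hash2(ne + m1)];
    skip[hash2(ne + m1)] = 0; /* 0 marks "candidate match, verify" */

    size_t pos = 0;
    while (pos <= hlen - nlen) {
        size_t s = skip[hash2(hs + pos + m1)];
        if (s != 0) {
            pos += s; /* pair cannot end the needle here: skip ahead */
            continue;
        }
        if (memcmp(hs + pos, ne, nlen) == 0)
            return haystack + pos;
        pos += end_skip;
    }
    return NULL;
}
```

Hash collisions only make the recorded shift smaller, never larger, so they cost speed but not correctness; every candidate is still confirmed with `memcmp`.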
2019-01-01 | Update copyright dates with scripts/update-copyrights. | Joseph Myers
* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
2018-09-19 | Fix strstr bug with huge needles (bug 23637) | Wilco Dijkstra
The generic strstr in GLIBC 2.28 fails to match huge needles. The optimized AVAILABLE macro reads ahead a large fixed amount to reduce the overhead of repeatedly checking for the end of the string. However, if the needle length is larger than this amount, two_way_long_needle may mistake the end of the readahead window for the end of the string and return NULL. This is fixed by adding the needle length to the amount to read ahead.

[BZ #23637]
* string/test-strstr.c (pr23637): New function.
(test_main): Add tests with longer needles.
* string/strcasestr.c (AVAILABLE): Fix readahead distance.
* string/strstr.c (AVAILABLE): Likewise.
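The readahead fix can be sketched as a small function. This is an illustration only, not the AVAILABLE macro itself: the name `available`, the helper `bounded_strlen` (a stand-in for strnlen), and the constant `READAHEAD` are assumptions. The key point is that the readahead distance includes the needle length, so a needle longer than the fixed amount cannot make the check fail while the string still has enough bytes.

```c
#include <stdbool.h>
#include <stddef.h>

#define READAHEAD 512 /* illustrative fixed readahead, an assumption */

/* Stand-in for strnlen: length of s, but never scanning past max bytes. */
static size_t bounded_strlen(const char *s, size_t max)
{
    size_t n = 0;
    while (n < max && s[n] != '\0')
        n++;
    return n;
}

/* Return true if needle_len bytes are available at offset j of h.
   *known_len is the haystack length discovered so far; it is extended
   lazily. Reading ahead needle_len + READAHEAD (not just READAHEAD)
   guarantees that a false result really means the string ended. */
static bool available(const char *h, size_t *known_len,
                      size_t j, size_t needle_len)
{
    if (j + needle_len <= *known_len)
        return true;
    *known_len += bounded_strlen(h + *known_len, needle_len + READAHEAD);
    return j + needle_len <= *known_len;
}
```

With a readahead of only `READAHEAD`, a 600-byte needle checked against a 1000-byte haystack would discover just 512 bytes on the first call and wrongly report that the needle does not fit.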
2018-08-03 | Simplify and speedup strstr/strcasestr first match | Wilco Dijkstra
Looking at the benchtests, both strstr and strcasestr spend a lot of time in a slow initialization loop handling one character per iteration. This can be simplified to use the much faster strlen/strnlen/strchr/memcmp. Read ahead a few cachelines to reduce the number of strnlen calls, which improves performance by ~3-4%. This patch improves the time taken for the full strstr benchtest by >40%.

* string/strcasestr.c (STRCASESTR): Simplify and speedup first match.
* string/strstr.c (AVAILABLE): Likewise.
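A first-match fast path built from strchr, a bounded length check, and memcmp might look like the sketch below. The function name `first_match_fast_path`, its return convention, the helper `bounded_strlen` (standing in for strnlen), and the 256-byte readahead are all assumptions for illustration; the commit message only says "a few cachelines".

```c
#include <stddef.h>
#include <string.h>

#define READAHEAD 256 /* assumed value for "a few cachelines" */

/* Stand-in for strnlen: length of s, but never scanning past max bytes. */
static size_t bounded_strlen(const char *s, size_t max)
{
    size_t n = 0;
    while (n < max && s[n] != '\0')
        n++;
    return n;
}

/* Try to settle the search without any per-needle preprocessing.
   Returns 1 when *out holds the final answer (a match or NULL).
   Returns 0 when a full search is still needed; *out then points at
   the first occurrence of needle[0], where that search can start. */
int first_match_fast_path(const char *haystack, const char *needle,
                          const char **out)
{
    if (needle[0] == '\0') {
        *out = haystack; /* empty needle matches at the start */
        return 1;
    }

    /* Jump straight to the first candidate position. */
    haystack = strchr(haystack, needle[0]);
    *out = haystack;
    if (haystack == NULL || needle[1] == '\0')
        return 1;

    /* The haystack must be at least as long as the needle. Only a
       bounded prefix is measured, so a huge haystack is not scanned. */
    size_t needle_len = strlen(needle);
    size_t hay_len = bounded_strlen(haystack, needle_len + READAHEAD);
    if (hay_len < needle_len) {
        *out = NULL;
        return 1;
    }

    /* An immediate match avoids the setup cost of the main algorithm. */
    return memcmp(haystack, needle, needle_len) == 0;
}
```

A caller would fall through to the main search algorithm only when the function returns 0.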
2018-07-16 | Improve strstr performance | Wilco Dijkstra
Improve strstr performance. Strstr tends to be slow because it uses many calls to memchr and a slow byte loop to scan for the next match. Performance is significantly improved by using strnlen on larger blocks and using strchr to search for the next matching character. strcasestr can also use strnlen to scan ahead, and memmem can use memchr to check for the next match.

On the GLIBC bench tests the performance gains on Cortex-A72 are:
strstr: +25%
strcasestr: +4.3%
memmem: +18%

On a 256KB dataset strstr performance improves by 67%, strcasestr by 47%.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
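The idea of letting strchr find the next candidate, instead of stepping through the haystack one byte at a time, can be shown with a deliberately simple sketch. The function name `scan_with_strchr` is hypothetical, and this loop is quadratic in the worst case; it illustrates only the candidate-skipping step, not the full algorithm used in glibc.

```c
#include <stddef.h>
#include <string.h>

/* Find needle in haystack by jumping between occurrences of the
   needle's first character. strchr is typically heavily optimized,
   so positions that cannot start a match are skipped quickly. */
const char *scan_with_strchr(const char *haystack, const char *needle)
{
    size_t needle_len = strlen(needle);
    if (needle_len == 0)
        return haystack;

    for (const char *p = strchr(haystack, needle[0]);
         p != NULL;
         p = strchr(p + 1, needle[0])) {
        /* strncmp stops at the haystack's terminator, so this never
           reads past the end of the string. */
        if (strncmp(p, needle, needle_len) == 0)
            return p;
    }
    return NULL;
}
```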
2018-01-01 | Update copyright dates with scripts/update-copyrights. | Joseph Myers
* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
2017-01-01 | Update copyright dates with scripts/update-copyrights. | Joseph Myers
2016-01-04 | Update copyright dates with scripts/update-copyrights. | Joseph Myers
2015-01-02 | Update copyright dates with scripts/update-copyrights. | Joseph Myers
2014-01-01 | Update copyright notices with scripts/update-copyrights | Allan McRae
2013-01-02 | Update copyright notices with scripts/update-copyrights. | Joseph Myers
2012-10-08 | Fix BZ #14602: strstr and strcasestr return wrong result. | Maxim Kuvyrkov
2012-08-21 | Micro-optimize critical path of strstr, strcase and memmem. | Maxim Kuvyrkov
2012-08-21 | Detect EOL on-the-fly in strstr, strcasestr and memmem. | Maxim Kuvyrkov
2012-02-09 | Replace FSF snail mail address with URLs. | Paul Eggert
2009-07-20 | SSE4.2 strstr/strcasestr for x86-64. | H.J. Lu
This patch implements SSE4.2 strstr/strcasestr, using the Knuth-Morris-Pratt string searching algorithm.
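For readers unfamiliar with Knuth-Morris-Pratt, a plain scalar sketch follows. It is not the SSE4.2 code from this patch and uses no vector instructions; the function names `kmp_build` and `kmp_search` and the caller-provided failure table are assumptions made for illustration.

```c
#include <stddef.h>

/* Fill fail[i] with the length of the longest proper prefix of
   needle[0..i] that is also a suffix of it. fail must have room
   for n entries, and n must be at least 1. */
void kmp_build(const char *needle, size_t n, size_t *fail)
{
    size_t k = 0;
    fail[0] = 0;
    for (size_t i = 1; i < n; i++) {
        while (k > 0 && needle[i] != needle[k])
            k = fail[k - 1]; /* fall back to a shorter border */
        if (needle[i] == needle[k])
            k++;
        fail[i] = k;
    }
}

/* Search a NUL-terminated haystack for needle (length n) using a
   failure table produced by kmp_build. The haystack pointer never
   moves backwards, which gives the linear running time. */
const char *kmp_search(const char *haystack, const char *needle,
                       size_t n, const size_t *fail)
{
    if (n == 0)
        return haystack;

    size_t k = 0; /* number of needle bytes currently matched */
    for (const char *p = haystack; *p != '\0'; p++) {
        while (k > 0 && *p != needle[k])
            k = fail[k - 1];
        if (*p == needle[k])
            k++;
        if (k == n)
            return p - n + 1;
    }
    return NULL;
}
```

For the needle "abac" the table is {0, 0, 1, 0}: after matching "aba" and then failing, the search resumes with one byte ("a") already matched instead of restarting from scratch.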
2008-05-15 | * string/Makefile (distribute): Add str-two-way.h. [cvs/fedora-glibc-20080515T0735] | Ulrich Drepper
2008-03-29 Eric Blake <ebb9@byu.net>

Rewrite string searches to O(n) rather than O(n^2).
* string/str-two-way.h: New file. For linear fixed-allocation string searching.
* string/memmem.c: New implementation.
* string/strstr.c: New implementation.
* string/strcasestr.c: New implementation.
* sysdeps/posix/getaddrinfo.c (getaddrinfo): Call _res_hconf_init
2005-12-14 | Moved to csu/errno-loc.c. | Ulrich Drepper
2004-12-22 | (CFLAGS-tst-align.c): Add -mpreferred-stack-boundary=4. | Ulrich Drepper
2007-07-12 | 2.5-18.1 | Jakub Jelinek