public inbox for glibc-bugs@sourceware.org help / color / mirror / Atom feed
* [Bug libc/11607] New: strstr function very slow. @ 2010-05-17 9:27 xoswald at debian dot org 2010-05-17 9:28 ` [Bug libc/11607] " xoswald at debian dot org ` (4 more replies) 0 siblings, 5 replies; 12+ messages in thread From: xoswald at debian dot org @ 2010-05-17 9:27 UTC (permalink / raw) To: glibc-bugs I think tere are a huge regression for the strstr function between version 2.7 and 2.10. In our code we use the 'strstr' function a lot and became to the fact that strstr is having a strange behaviour. I attach 2 file so you can test it. $gcc test-strstr.c -o test $time test 10000 Some results (User Time): ================= Ubuntu 8.04 : 10s (glibc 2.7) Ubuntu 9.04 : 1m20s (glibc 2.10) Ubuntu 10.04 : 1m20s (glibc 2.10) Debian lenny : 8s (glibc 2.7) Debian Squeeze : 1m10s (glibc 2.10) The results are the same on AMD64 and i386 architectures. -- Summary: strstr function very slow. Product: glibc Version: 2.10 Status: NEW Severity: normal Priority: P2 Component: libc AssignedTo: drepper at redhat dot com ReportedBy: xoswald at debian dot org CC: glibc-bugs at sources dot redhat dot com http://sourceware.org/bugzilla/show_bug.cgi?id=11607 ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug libc/11607] strstr function very slow. 2010-05-17 9:27 [Bug libc/11607] New: strstr function very slow xoswald at debian dot org @ 2010-05-17 9:28 ` xoswald at debian dot org 2010-05-17 9:29 ` xoswald at debian dot org ` (3 subsequent siblings) 4 siblings, 0 replies; 12+ messages in thread From: xoswald at debian dot org @ 2010-05-17 9:28 UTC (permalink / raw) To: glibc-bugs ------- Additional Comments From xoswald at debian dot org 2010-05-17 09:28 ------- Created an attachment (id=4798) --> (http://sourceware.org/bugzilla/attachment.cgi?id=4798&action=view) Source code -- http://sourceware.org/bugzilla/show_bug.cgi?id=11607 ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug libc/11607] strstr function very slow. 2010-05-17 9:27 [Bug libc/11607] New: strstr function very slow xoswald at debian dot org 2010-05-17 9:28 ` [Bug libc/11607] " xoswald at debian dot org @ 2010-05-17 9:29 ` xoswald at debian dot org 2010-05-17 9:33 ` xoswald at debian dot org ` (2 subsequent siblings) 4 siblings, 0 replies; 12+ messages in thread From: xoswald at debian dot org @ 2010-05-17 9:29 UTC (permalink / raw) To: glibc-bugs ------- Additional Comments From xoswald at debian dot org 2010-05-17 09:28 ------- Created an attachment (id=4799) --> (http://sourceware.org/bugzilla/attachment.cgi?id=4799&action=view) Data file -- http://sourceware.org/bugzilla/show_bug.cgi?id=11607 ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug libc/11607] strstr function very slow. 2010-05-17 9:27 [Bug libc/11607] New: strstr function very slow xoswald at debian dot org 2010-05-17 9:28 ` [Bug libc/11607] " xoswald at debian dot org 2010-05-17 9:29 ` xoswald at debian dot org @ 2010-05-17 9:33 ` xoswald at debian dot org 2010-05-17 9:42 ` jakub at redhat dot com 2010-05-17 12:14 ` eblake at redhat dot com 4 siblings, 0 replies; 12+ messages in thread From: xoswald at debian dot org @ 2010-05-17 9:33 UTC (permalink / raw) To: glibc-bugs ------- Additional Comments From xoswald at debian dot org 2010-05-17 09:33 ------- in test-strstr.c, just try to match a string a the end of the file test-strstr.txt I tested with a needle > 16 and < 16, it doesn't change anything. -- http://sourceware.org/bugzilla/show_bug.cgi?id=11607 ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug libc/11607] strstr function very slow. 2010-05-17 9:27 [Bug libc/11607] New: strstr function very slow xoswald at debian dot org ` (2 preceding siblings ...) 2010-05-17 9:33 ` xoswald at debian dot org @ 2010-05-17 9:42 ` jakub at redhat dot com 2010-05-17 12:14 ` eblake at redhat dot com 4 siblings, 0 replies; 12+ messages in thread From: jakub at redhat dot com @ 2010-05-17 9:42 UTC (permalink / raw) To: glibc-bugs -- What |Removed |Added ---------------------------------------------------------------------------- CC| |ebb9 at byu dot net http://sourceware.org/bugzilla/show_bug.cgi?id=11607 ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug libc/11607] strstr function very slow. 2010-05-17 9:27 [Bug libc/11607] New: strstr function very slow xoswald at debian dot org ` (3 preceding siblings ...) 2010-05-17 9:42 ` jakub at redhat dot com @ 2010-05-17 12:14 ` eblake at redhat dot com 4 siblings, 0 replies; 12+ messages in thread From: eblake at redhat dot com @ 2010-05-17 12:14 UTC (permalink / raw) To: glibc-bugs ------- Additional Comments From eblake at redhat dot com 2010-05-17 12:13 ------- In glibc 2.7, the strstr() implementation has worst-case quadratic complexity of O(m*n) (m the length of the haystack, n the length of the string); and gnulib introduced a test case that demonstrated this quadratic complexity could be used as a denial of service attack on any program that searches for a large needle within a large string. With glibc 2.10, the strstr() has a guaranteed linear complexity of O(m+n), but that reduced algorithmic complexity comes with a larger startup cost of factoring the needle, where the shorter the needle is, the larger proportion of the overall strstr() is spent on that factoring. Furthermore, the glibc 2.7 implementation has had years of tweaking to be cache-line friendly and tuned for various architectures, while my 2.10 implementation is relatively new, and probably still has some room for improvements (at any rate, I know that when I wrote it, I was catering to large needles and large haystacks, and was focused on the algorithmic improvement, rather than the cache-line effects of small needles). With appropriate benchmarking, it may be worth altering the implementation to be a hybrid, where short-enough needles are handled with the older brute force implementation - even if it is quadratic, less startup cost might be better for short needles even if it results in more comparisons against the haystack for some of those needles. Furthermore, strstr() has a lousy interface when compared with regex searching. That is, if you repeatedly search for the same needle, the use of re_compile() is spent only once, while you can reuse that compiled needle across multiple re_search() calls. But with strstr(), the cost of factoring the needle is spent on every call. Perhaps we should consider exposing some new interfaces that separate the strstr() factorization startup code from the body that uses the factored needle for the actual search. -- What |Removed |Added ---------------------------------------------------------------------------- CC|ebb9 at byu dot net |eblake at redhat dot com http://sourceware.org/bugzilla/show_bug.cgi?id=11607 ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <bug-11607-131@http.sourceware.org/bugzilla/>]
* [Bug libc/11607] strstr function very slow. [not found] <bug-11607-131@http.sourceware.org/bugzilla/> @ 2012-06-28 22:23 ` maxim.kuvyrkov at gmail dot com 2012-06-28 22:24 ` maxim.kuvyrkov at gmail dot com ` (4 subsequent siblings) 5 siblings, 0 replies; 12+ messages in thread From: maxim.kuvyrkov at gmail dot com @ 2012-06-28 22:23 UTC (permalink / raw) To: glibc-bugs http://sourceware.org/bugzilla/show_bug.cgi?id=11607 Maxim Kuvyrkov <maxim.kuvyrkov at gmail dot com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |ASSIGNED CC| |maxim.kuvyrkov at gmail dot | |com --- Comment #5 from Maxim Kuvyrkov <maxim.kuvyrkov at gmail dot com> 2012-06-28 22:22:51 UTC --- We expect this to be fixed in 2.17. Maxim's patches [1] fix performance regression for small needles -- which is what the testcase for this bug exercises. Ondrej's patches [2] appear to further improve performance for x86. [1] http://sourceware.org/ml/libc-alpha/2012-05/msg01228.html [2] http://sourceware.org/ml/libc-alpha/2012-06/msg00027.html -- Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug libc/11607] strstr function very slow. [not found] <bug-11607-131@http.sourceware.org/bugzilla/> 2012-06-28 22:23 ` maxim.kuvyrkov at gmail dot com @ 2012-06-28 22:24 ` maxim.kuvyrkov at gmail dot com 2012-06-28 22:40 ` eblake at redhat dot com ` (3 subsequent siblings) 5 siblings, 0 replies; 12+ messages in thread From: maxim.kuvyrkov at gmail dot com @ 2012-06-28 22:24 UTC (permalink / raw) To: glibc-bugs http://sourceware.org/bugzilla/show_bug.cgi?id=11607 Maxim Kuvyrkov <maxim.kuvyrkov at gmail dot com> changed: What |Removed |Added ---------------------------------------------------------------------------- AssignedTo|drepper.fsp at gmail dot |maxim.kuvyrkov at gmail dot |com |com -- Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug libc/11607] strstr function very slow. [not found] <bug-11607-131@http.sourceware.org/bugzilla/> 2012-06-28 22:23 ` maxim.kuvyrkov at gmail dot com 2012-06-28 22:24 ` maxim.kuvyrkov at gmail dot com @ 2012-06-28 22:40 ` eblake at redhat dot com 2012-09-04 15:14 ` carlos_odonell at mentor dot com ` (2 subsequent siblings) 5 siblings, 0 replies; 12+ messages in thread From: eblake at redhat dot com @ 2012-06-28 22:40 UTC (permalink / raw) To: glibc-bugs http://sourceware.org/bugzilla/show_bug.cgi?id=11607 --- Comment #6 from Eric Blake <eblake at redhat dot com> 2012-06-28 22:40:23 UTC --- (In reply to comment #5) > We expect this to be fixed in 2.17. > > Maxim's patches [1] fix performance regression for small needles -- which is > what the testcase for this bug exercises. Ondrej's patches [2] appear to > further improve performance for x86. > > [1] http://sourceware.org/ml/libc-alpha/2012-05/msg01228.html > [2] http://sourceware.org/ml/libc-alpha/2012-06/msg00027.html Note that [2] was just a summary of Ondrej's benchmarking; searching for an actual patch of his turns up this link: http://sourceware.org/ml/libc-help/2011-11/msg00011.html -- Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug libc/11607] strstr function very slow. [not found] <bug-11607-131@http.sourceware.org/bugzilla/> ` (2 preceding siblings ...) 2012-06-28 22:40 ` eblake at redhat dot com @ 2012-09-04 15:14 ` carlos_odonell at mentor dot com 2012-09-04 20:02 ` maxim.kuvyrkov at gmail dot com 2014-06-30 18:04 ` fweimer at redhat dot com 5 siblings, 0 replies; 12+ messages in thread From: carlos_odonell at mentor dot com @ 2012-09-04 15:14 UTC (permalink / raw) To: glibc-bugs http://sourceware.org/bugzilla/show_bug.cgi?id=11607 Carlos O'Donell <carlos_odonell at mentor dot com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |carlos_odonell at mentor | |dot com --- Comment #7 from Carlos O'Donell <carlos_odonell at mentor dot com> 2012-09-04 15:14:02 UTC --- Maxim, I thought this had already been checked in and fixed? -- Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug libc/11607] strstr function very slow. [not found] <bug-11607-131@http.sourceware.org/bugzilla/> ` (3 preceding siblings ...) 2012-09-04 15:14 ` carlos_odonell at mentor dot com @ 2012-09-04 20:02 ` maxim.kuvyrkov at gmail dot com 2014-06-30 18:04 ` fweimer at redhat dot com 5 siblings, 0 replies; 12+ messages in thread From: maxim.kuvyrkov at gmail dot com @ 2012-09-04 20:02 UTC (permalink / raw) To: glibc-bugs http://sourceware.org/bugzilla/show_bug.cgi?id=11607 Maxim Kuvyrkov <maxim.kuvyrkov at gmail dot com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |RESOLVED Resolution| |FIXED --- Comment #8 from Maxim Kuvyrkov <maxim.kuvyrkov at gmail dot com> 2012-09-04 20:02:10 UTC --- Hm, I thought I marked this FIXED. Apparently, I didn't. Thanks Carlos. -- Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug libc/11607] strstr function very slow. [not found] <bug-11607-131@http.sourceware.org/bugzilla/> ` (4 preceding siblings ...) 2012-09-04 20:02 ` maxim.kuvyrkov at gmail dot com @ 2014-06-30 18:04 ` fweimer at redhat dot com 5 siblings, 0 replies; 12+ messages in thread From: fweimer at redhat dot com @ 2014-06-30 18:04 UTC (permalink / raw) To: glibc-bugs https://sourceware.org/bugzilla/show_bug.cgi?id=11607 Florian Weimer <fweimer at redhat dot com> changed: What |Removed |Added ---------------------------------------------------------------------------- Flags| |security- -- You are receiving this mail because: You are on the CC list for the bug. ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2014-06-30 18:04 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2010-05-17 9:27 [Bug libc/11607] New: strstr function very slow xoswald at debian dot org 2010-05-17 9:28 ` [Bug libc/11607] " xoswald at debian dot org 2010-05-17 9:29 ` xoswald at debian dot org 2010-05-17 9:33 ` xoswald at debian dot org 2010-05-17 9:42 ` jakub at redhat dot com 2010-05-17 12:14 ` eblake at redhat dot com [not found] <bug-11607-131@http.sourceware.org/bugzilla/> 2012-06-28 22:23 ` maxim.kuvyrkov at gmail dot com 2012-06-28 22:24 ` maxim.kuvyrkov at gmail dot com 2012-06-28 22:40 ` eblake at redhat dot com 2012-09-04 15:14 ` carlos_odonell at mentor dot com 2012-09-04 20:02 ` maxim.kuvyrkov at gmail dot com 2014-06-30 18:04 ` fweimer at redhat dot com
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).