* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
2010-10-05 5:56 [Bug libc/12092] New: strstr broken for some inputs on pre-SSE4 machines ppluzhnikov at google dot com
@ 2010-10-05 5:58 ` ppluzhnikov at google dot com
2010-10-05 17:08 ` ppluzhnikov at google dot com
` (14 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: ppluzhnikov at google dot com @ 2010-10-05 5:58 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
Paul Pluzhnikov <ppluzhnikov at google dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #5035|application/octet-stream |text/plain
mime type| |
--
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
From: ppluzhnikov at google dot com @ 2010-10-05 17:08 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
--- Comment #1 from Paul Pluzhnikov <ppluzhnikov at google dot com> 2010-10-05 17:08:33 UTC ---
Additional analysis from iant@google.com:
I'm not completely sure, but this is what I see so far. The bug can only occur
when the second argument to strstr (the needle) is periodic, which is to say
that it consists entirely of some repeated string. When that happens, the code
can fail to match if the first argument to strstr (the haystack) contains two
or more repetitions of the needle's periodic string, but fewer repetitions than
the needle itself contains. In that case strstr can sometimes
return a pointer to the smaller number of repetitions, when it should properly
return NULL or a later pointer. Also, the needle has to be 32 bytes or more.
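The conditions described above can be turned into a small cross-check. What follows is only a sketch, not part of the original report: `naive_strstr` and `strstr_agrees` are hypothetical helper names, and the input strings are borrowed from the reproducer posted later in this thread (an 11-byte period repeated three times gives a 33-byte needle, and the haystack never holds more than two intact periods in a row).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One period of the needle: 11 bytes, so three copies make 33 bytes. */
#define PERIOD_STR ":012345678-"

/* Haystack containing at most two intact periods in a row, never three. */
static const char hay_two_periods[] =
  ";" ":013245678-" PERIOD_STR PERIOD_STR ":012345678." PERIOD_STR
  ":012345678." PERIOD_STR;

/* Periodic needle: three back-to-back copies of the period. */
static const char needle_three_periods[] = PERIOD_STR PERIOD_STR PERIOD_STR;

/* Hypothetical reference helper: quadratic, but simple enough to trust. */
static const char *
naive_strstr (const char *hay, const char *needle)
{
  assert (hay != NULL && needle != NULL);
  size_t hay_len = strlen (hay);
  size_t needle_len = strlen (needle);
  if (needle_len > hay_len)
    return NULL;
  /* Try every starting offset that leaves room for the whole needle. */
  for (size_t i = 0; i + needle_len <= hay_len; i++)
    if (memcmp (hay + i, needle, needle_len) == 0)
      return hay + i;
  return NULL;
}

/* Returns 1 when the C library's strstr agrees with the reference search. */
static int
strstr_agrees (const char *hay, const char *needle)
{
  return strstr (hay, needle) == naive_strstr (hay, needle);
}
```

Calling `strstr_agrees (hay_two_periods, needle_three_periods)` on an affected library would be expected to return 0, because the reference search finds no match while the buggy search reports one.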
* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
From: ppluzhnikov at google dot com @ 2010-10-05 17:36 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
Paul Pluzhnikov <ppluzhnikov at google dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #5035|0 |1
is obsolete| |
--- Comment #2 from Paul Pluzhnikov <ppluzhnikov at google dot com> 2010-10-05 17:36:31 UTC ---
Created attachment 5037
--> http://sourceware.org/bugzilla/attachment.cgi?id=5037
slightly simplified test case
* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
From: ian at airs dot com @ 2010-10-05 18:17 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
Ian Lance Taylor <ian at airs dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |ian at airs dot com
--- Comment #3 from Ian Lance Taylor <ian at airs dot com> 2010-10-05 18:17:44 UTC ---
I think the problem is the Boyer-Moore shift in two_way_long_needle in
str-two-way.h. It does not correctly update MEMORY. I think we need something
like
      if (memory && shift < period)
        {
          /* Since needle is periodic, but the last period has
             a byte out of place, there can be no match until
             after the mismatch.  */
          shift = needle_len - period;
          memory = 0;
        }
      else if (memory > shift)
        memory = memory - shift;
      else
        memory = 0;
* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
From: ian at airs dot com @ 2010-10-05 18:18 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
Ian Lance Taylor <ian at airs dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |ebb9 at byu dot net
* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
From: eblake at redhat dot com @ 2010-10-05 18:31 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
Eric Blake <eblake at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC|ebb9 at byu dot net |eblake at redhat dot com
--- Comment #4 from Eric Blake <eblake at redhat dot com> 2010-10-05 18:31:36 UTC ---
Yep, resetting 'memory' after a large shift is required; I'm testing your idea
now, but think you have the right patch in mind.
* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
From: eblake at redhat dot com @ 2010-10-05 22:10 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
--- Comment #5 from Eric Blake <eblake at redhat dot com> 2010-10-05 22:10:16 UTC ---
Your test for (memory > shift) will never be reached. Other than the
assignment added by your proposed patch, memory is only ever assigned to be 0
or needle_len - period. And for a periodic needle, shift is either needle_len
or < period, by virtue of how the shift table is constructed. Therefore, if
memory is non-zero but shift >= period, then shift is necessarily > memory at
that point.
Which means your code can be reduced to this simpler patch:
diff --git i/string/str-two-way.h w/string/str-two-way.h
index 502af47..76044b3 100644
--- i/string/str-two-way.h
+++ w/string/str-two-way.h
@@ -350,8 +350,8 @@ two_way_long_needle (const unsigned char *haystack,
                     a byte out of place, there can be no match until
                     after the mismatch.  */
                  shift = needle_len - period;
-                 memory = 0;
                }
+             memory = 0;
              j += shift;
              continue;
            }
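The argument above rests on two facts: for a periodic needle every bad-character shift is either needle_len or smaller than period, and after the patch memory is cleared on every nonzero shift. The sketch below illustrates both under stated assumptions. The helper names (`build_shift_table`, `shifts_are_full_or_subperiod`, `advance_after_shift`) are hypothetical, and the table construction follows my reading of str-two-way.h (distance from a byte's last occurrence to the end of the needle) rather than a verbatim copy.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TABLE_SIZE (1U << CHAR_BIT)

/* Fill TABLE with bad-character shifts: needle_len for bytes absent from
   the needle, otherwise the distance from the byte's last occurrence to
   the end of the needle.  */
static void
build_shift_table (const unsigned char *needle, size_t needle_len,
                   size_t table[TABLE_SIZE])
{
  for (size_t c = 0; c < TABLE_SIZE; c++)
    table[c] = needle_len;
  for (size_t i = 0; i < needle_len; i++)
    table[needle[i]] = needle_len - i - 1;
}

/* Returns 1 if every entry is either needle_len or smaller than PERIOD,
   which is the property the simplification relies on.  */
static int
shifts_are_full_or_subperiod (const size_t table[TABLE_SIZE],
                              size_t needle_len, size_t period)
{
  for (size_t c = 0; c < TABLE_SIZE; c++)
    if (table[c] != needle_len && table[c] >= period)
      return 0;
  return 1;
}

/* Bookkeeping after a nonzero bad-character shift, following the reduced
   patch: widen a short shift when a prefix was remembered, then always
   forget the remembered prefix.  Returns the distance to advance.  */
static size_t
advance_after_shift (size_t shift, size_t period, size_t needle_len,
                     size_t *memory)
{
  assert (memory != NULL);
  if (*memory && shift < period)
    shift = needle_len - period;
  *memory = 0;
  return shift;
}
```

With a 33-byte needle of period 11, memory can only be 0 or 22, so a shift of 33 always exceeds it, which is why the `memory > shift` branch in the earlier proposal is dead code.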
* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
From: eblake at redhat dot com @ 2010-10-05 22:24 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
--- Comment #6 from Eric Blake <eblake at redhat dot com> 2010-10-05 22:23:54 UTC ---
Created attachment 5039
--> http://sourceware.org/bugzilla/attachment.cgi?id=5039
fix strstr and memmem
* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
From: jakub at redhat dot com @ 2010-10-05 22:29 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
Jakub Jelinek <jakub at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at redhat dot com
--- Comment #7 from Jakub Jelinek <jakub at redhat dot com> 2010-10-05 22:29:09 UTC ---
Please add a testcase and post to libc-alpha@sources.redhat.com.
* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
From: eblake at redhat dot com @ 2010-10-06 15:44 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
--- Comment #8 from Eric Blake <eblake at redhat dot com> 2010-10-06 15:43:59 UTC ---
Interestingly enough:
strstr() and strcasestr() are only broken pre-SSE4, but memmem() is broken even
on SSE4 machines.
On the other hand, on SSE4 machines, strstr() and strcasestr() are quadratic in
behavior; in other words, the use of an assembly implementation has actually
caused a performance regression over the fix for
http://sourceware.org/bugzilla/show_bug.cgi?id=5514
$ cat foo.c
#define _GNU_SOURCE
#include <string.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
#define P ":012345678-"
static void quit (int sig) { exit (sig + 128); }
int main(int argc, char **argv)
{
  const char *hay = ";" ":013245678-" P P ":012345678." P ":012345678." P;
  const char *needle = P P P;
  size_t m = 1000000;
  char *largehay = malloc (2 * m + 2);
  char *largeneedle = malloc (m + 2);
  signal (SIGALRM, quit);
  alarm (5);
  if (!largehay || !largeneedle)
    return 2;
  memset (largehay, 'A', 2 * m);
  largehay[2 * m] = 'B';
  largehay[2 * m + 1] = 0;
  memset (largeneedle, 'A', m);
  largeneedle[m] = 'B';
  largeneedle[m + 1] = 0;
  switch (argc > 1 ? atoi (argv[1]) : 0)
    {
      /* Demonstrate str-two-way.h bug.  */
    case 1:
      return !!memmem (hay, strlen (hay), needle, strlen (needle));
    case 2:
      return !!strstr (hay, needle);
    case 3:
      return !!strcasestr (hay, needle);
      /* Demonstrate quadratic behavior.  */
    case 4:
      return !memmem (largehay, strlen (largehay),
                      largeneedle, strlen (largeneedle));
    case 5:
      return !strstr (largehay, largeneedle);
    case 6:
      return !strcasestr (largehay, largeneedle);
      /* Usage error.  */
    default:
      return 2;
    }
}
$ for i in $(seq 6); do ./foo $i; echo $?; done
1
0
0
0
142
142
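In the output above, 142 is 128 + SIGALRM (14 on Linux), so cases 5 and 6 were killed by the 5-second alarm rather than finishing. To watch how the running time grows instead of relying on a timeout, a scaled-down timing helper along the following lines may be useful. It is only a sketch: `make_a_run_then_b` and `timed_search` are hypothetical names, and it uses the same input shape as the test program (m 'A' bytes plus 'B' searched for in 2m 'A' bytes plus 'B').

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Allocate a string of COUNT 'A' bytes followed by one 'B' and a NUL. */
static char *
make_a_run_then_b (size_t count)
{
  char *s = malloc (count + 2);
  if (s == NULL)
    return NULL;
  memset (s, 'A', count);
  s[count] = 'B';
  s[count + 1] = '\0';
  return s;
}

/* Search for A^m B inside A^(2m) B with strstr.  Stores the CPU time of
   the search in *SECONDS and returns the match offset, or (size_t) -1 if
   allocation failed or nothing was found.  */
static size_t
timed_search (size_t m, double *seconds)
{
  assert (seconds != NULL);
  char *hay = make_a_run_then_b (2 * m);
  char *needle = make_a_run_then_b (m);
  size_t offset = (size_t) -1;
  *seconds = 0.0;
  if (hay != NULL && needle != NULL)
    {
      clock_t start = clock ();
      const char *hit = strstr (hay, needle);
      clock_t stop = clock ();
      *seconds = (double) (stop - start) / CLOCKS_PER_SEC;
      if (hit != NULL)
        offset = (size_t) (hit - hay);
    }
  free (hay);
  free (needle);
  return offset;
}
```

Calling `timed_search` with m and then 2m gives a rough signal: a linear-time search should take about twice as long, while a quadratic one should take about four times as long. The only match is at offset m, so the return value doubles as a correctness check.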
* [Bug libc/12092] memmem broken on some inputs on all machines; strstr and strcasestr likewise broken on pre-SSE4 machines
From: eblake at redhat dot com @ 2010-10-06 15:54 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
Eric Blake <eblake at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Version|unspecified |2.9
Summary|strstr broken for some |memmem broken on some
|inputs on pre-SSE4 machines |inputs on all machines;
| |strstr and strcasestr
| |likewise broken on pre-SSE4
| |machines
* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
From: drepper.fsp at gmail dot com @ 2010-10-06 17:50 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
Ulrich Drepper <drepper.fsp at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Version|2.9 |unspecified
Resolution| |FIXED
Summary|memmem broken on some |strstr broken for some
|inputs on all machines; |inputs on pre-SSE4 machines
|strstr and strcasestr |
|likewise broken on pre-SSE4 |
|machines |
--- Comment #9 from Ulrich Drepper <drepper.fsp at gmail dot com> 2010-10-06 17:50:05 UTC ---
Fixed in git.
* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
From: eblake at redhat dot com @ 2010-10-06 17:59 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
--- Comment #10 from Eric Blake <eblake at redhat dot com> 2010-10-06 17:58:36 UTC ---
(In reply to comment #9)
> Fixed in git.
The incorrect results of memmem() and of non-SSE4 strstr() are fixed. However,
the glibc 2.11 regression of reintroducing quadratic behavior for SSE4 strstr
is not yet fixed.
* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
From: pasky at suse dot cz @ 2010-10-11 20:07 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
Petr Baudis <pasky at suse dot cz> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |glibc_2.11, glibc_2.12
CC| |pasky at suse dot cz
* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
From: pasky at suse dot cz @ 2010-11-01 21:36 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=12092
Petr Baudis <pasky at suse dot cz> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords|glibc_2.11, glibc_2.12 |
* [Bug libc/12092] strstr broken for some inputs on pre-SSE4 machines
From: fweimer at redhat dot com @ 2014-06-30 7:54 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=12092
Florian Weimer <fweimer at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |fweimer at redhat dot com