public inbox for glibc-bugs-regex@sourceware.org
help / color / mirror / Atom feed
* [Bug regex/558] New: regcomp and regexec bug
@ 2004-11-18 2:06 vprodan dot hosting at artstyle dot ru
2004-11-22 9:15 ` [Bug regex/558] " bonzini at gnu dot org
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: vprodan dot hosting at artstyle dot ru @ 2004-11-18 2:06 UTC (permalink / raw)
To: glibc-bugs-regex
Some regular expressions make regcomp to eat CPU and memory.
Try run date |egrep '.{1,2048}' and enjoy the result if you're patient enough.
Test case:
#include <stdio.h>
#include <regex.h>
int main(void) {
char s[] = "/.{1,2048}.{1,2048}.{1,2048}.{1,2048}.{1,2048}/";
regex_t reg;
int res;
fprintf(stderr, "starting\n");
res = regcomp(®, s, REG_EXTENDED);
fprintf(stderr, "regcomp(®, '%s', REG_EXTENDED) = %d\n", s, res);
}
This bug appeared in between 2.2.5 and 2.3:
> date: 2002/02/27 19:00:56; author: drepper; state: Exp; lines: +7 -8399
> Check in complete rewrite.
--
Summary: regcomp and regexec bug
Product: glibc
Version: unspecified
Status: NEW
Severity: normal
Priority: P2
Component: regex
AssignedTo: gotom at debian dot or dot jp
ReportedBy: vprodan dot hosting at artstyle dot ru
CC: glibc-bugs-regex at sources dot redhat dot com,glibc-
bugs at sources dot redhat dot com,vprodan dot hosting
at artstyle dot ru
GCC build triplet: i686-pc-linux-gnu
http://sources.redhat.com/bugzilla/show_bug.cgi?id=558
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug regex/558] regcomp and regexec bug
2004-11-18 2:06 [Bug regex/558] New: regcomp and regexec bug vprodan dot hosting at artstyle dot ru
@ 2004-11-22 9:15 ` bonzini at gnu dot org
2004-11-23 12:31 ` bonzini at gnu dot org
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: bonzini at gnu dot org @ 2004-11-22 9:15 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From bonzini at gnu dot org 2004-11-22 09:15 -------
It can be fixed by lowering for example .{1,5} to (.(.(.(..?)?)?)?)? instead
of .?.?.?.?.? --- of course the brackets are not capturing, they're only there
to show what is the question mark applied to.
This has the property of making all the period nodes epsilon-transit to the end
of the braced expression, rather than to the next period. The epsilon closure
thus is much smaller.
Another improvement is to use a special version of re_node_set_insert in
calc_inveclosure, since we know that the node is being added at the end of the
set. I have a patch to do that and it obtains a 30% improvement on this test
case. It does not cure the complexity problem though, so I am not submitting
it yet.
--
What |Removed |Added
----------------------------------------------------------------------------
AssignedTo|gotom at debian dot or dot |bonzini at gnu dot org
|jp |
Status|NEW |ASSIGNED
http://sources.redhat.com/bugzilla/show_bug.cgi?id=558
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug regex/558] regcomp and regexec bug
2004-11-18 2:06 [Bug regex/558] New: regcomp and regexec bug vprodan dot hosting at artstyle dot ru
2004-11-22 9:15 ` [Bug regex/558] " bonzini at gnu dot org
@ 2004-11-23 12:31 ` bonzini at gnu dot org
2004-11-23 12:32 ` bonzini at gnu dot org
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: bonzini at gnu dot org @ 2004-11-23 12:31 UTC (permalink / raw)
To: glibc-bugs-regex
--
What |Removed |Added
----------------------------------------------------------------------------
OtherBugsDependingO| |500
nThis| |
http://sources.redhat.com/bugzilla/show_bug.cgi?id=558
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug regex/558] regcomp and regexec bug
2004-11-18 2:06 [Bug regex/558] New: regcomp and regexec bug vprodan dot hosting at artstyle dot ru
2004-11-22 9:15 ` [Bug regex/558] " bonzini at gnu dot org
2004-11-23 12:31 ` bonzini at gnu dot org
@ 2004-11-23 12:32 ` bonzini at gnu dot org
2004-12-10 16:56 ` bonzini at gnu dot org
2005-01-27 19:08 ` drepper at redhat dot com
4 siblings, 0 replies; 6+ messages in thread
From: bonzini at gnu dot org @ 2004-11-23 12:32 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From bonzini at gnu dot org 2004-11-23 12:32 -------
Created an attachment (id=289)
--> (http://sources.redhat.com/bugzilla/attachment.cgi?id=289&action=view)
patch to speed up testcase by 50%
Reduce complexity of calc_eclosure in this case, by modifying the way .{1,2048}
is lowered. calc_inveclosure is sped up but still expensive.
--
http://sources.redhat.com/bugzilla/show_bug.cgi?id=558
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug regex/558] regcomp and regexec bug
2004-11-18 2:06 [Bug regex/558] New: regcomp and regexec bug vprodan dot hosting at artstyle dot ru
` (2 preceding siblings ...)
2004-11-23 12:32 ` bonzini at gnu dot org
@ 2004-12-10 16:56 ` bonzini at gnu dot org
2005-01-27 19:08 ` drepper at redhat dot com
4 siblings, 0 replies; 6+ messages in thread
From: bonzini at gnu dot org @ 2004-12-10 16:56 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From bonzini at gnu dot org 2004-12-10 16:56 -------
Patch applied. I'm going through some more changes in the regex matcher so I'm
delaying the inveclosure work for a while.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |SUSPENDED
http://sources.redhat.com/bugzilla/show_bug.cgi?id=558
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug regex/558] regcomp and regexec bug
2004-11-18 2:06 [Bug regex/558] New: regcomp and regexec bug vprodan dot hosting at artstyle dot ru
` (3 preceding siblings ...)
2004-12-10 16:56 ` bonzini at gnu dot org
@ 2005-01-27 19:08 ` drepper at redhat dot com
4 siblings, 0 replies; 6+ messages in thread
From: drepper at redhat dot com @ 2005-01-27 19:08 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From drepper at redhat dot com 2005-01-27 19:07 -------
Updated patch sent to libc-hacker checked into cvs.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|SUSPENDED |RESOLVED
Resolution| |FIXED
http://sources.redhat.com/bugzilla/show_bug.cgi?id=558
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2005-01-27 19:08 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-11-18 2:06 [Bug regex/558] New: regcomp and regexec bug vprodan dot hosting at artstyle dot ru
2004-11-22 9:15 ` [Bug regex/558] " bonzini at gnu dot org
2004-11-23 12:31 ` bonzini at gnu dot org
2004-11-23 12:32 ` bonzini at gnu dot org
2004-12-10 16:56 ` bonzini at gnu dot org
2005-01-27 19:08 ` drepper at redhat dot com
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).