public inbox for glibc-bugs-regex@sourceware.org
help / color / mirror / Atom feed
* [Bug regex/611] New: regex with a long character sequence requires huge stack space
@ 2004-12-17 8:34 bonzini at gnu dot org
2004-12-17 8:38 ` [Bug regex/611] " bonzini at gnu dot org
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: bonzini at gnu dot org @ 2004-12-17 8:34 UTC (permalink / raw)
To: glibc-bugs-regex
The parse tree produced by concat nodes is extremely unbalanced. With an 8 MB
stack,
perl -e "print 's/' . ('a' x 300000) . '//'" > huge
sed -f huge
segfaults in calc_first. Of course with a lower stack size it takes much
shorter regexes to trigger the bug, and anyway this is not nice behavior (ok,
regex is a memory hog but at least it usually gives REG_ESPACE).
With a 512 KB stack, even compiling a{32767} segfaults.
--
Summary: regex with a long character sequence requires huge stack
space
Product: glibc
Version: unspecified
Status: NEW
Severity: normal
Priority: P2
Component: regex
AssignedTo: bonzini at gnu dot org
ReportedBy: bonzini at gnu dot org
CC: glibc-bugs-regex at sources dot redhat dot com,glibc-
bugs at sources dot redhat dot com
http://sources.redhat.com/bugzilla/show_bug.cgi?id=611
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug regex/611] regex with a long character sequence requires huge stack space
2004-12-17 8:34 [Bug regex/611] New: regex with a long character sequence requires huge stack space bonzini at gnu dot org
@ 2004-12-17 8:38 ` bonzini at gnu dot org
2004-12-20 10:22 ` bonzini at gnu dot org
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: bonzini at gnu dot org @ 2004-12-17 8:38 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From bonzini at gnu dot org 2004-12-17 08:37 -------
Created an attachment (id=312)
--> (http://sources.redhat.com/bugzilla/attachment.cgi?id=312&action=view)
patch fixing this bug
This patch is a big one that completely separates parse tree creation and NFA
construction. It was posted on
http://sources.redhat.com/ml/libc-alpha/2004-12/msg00045.html first; this
version is identical except that it includes the bug 605 changes on which the
libc-alpha patch is based.
Improvements in this patch include:
1) It eliminates recursion on CONCAT nodes from regcomp.c; with an 8 MB
stack,
perl -e "print 's/' . ('a' x 300000) . '//'" | sed -f -
segfaults in analyze_tree without the patch. Of course with a lower
stack size it takes much shorter regexes to trigger the bug, and anyway
this is not nice behavior (ok, regex is a memory hog but at least it
usually gives REG_ESPACE). I don't have the time to fix this in a
different way -- as I said, I am not doing this just for fun.
2) It provides a useful infrastructure while actually removing code from
regcomp.c (the patched file is ~60 lines shorter). This infrastructure
can help in turn removing code from regexec.c and making it faster (see
for example the star-normal form mentioned in libc-alpha).
--
http://sources.redhat.com/bugzilla/show_bug.cgi?id=611
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug regex/611] regex with a long character sequence requires huge stack space
2004-12-17 8:34 [Bug regex/611] New: regex with a long character sequence requires huge stack space bonzini at gnu dot org
2004-12-17 8:38 ` [Bug regex/611] " bonzini at gnu dot org
@ 2004-12-20 10:22 ` bonzini at gnu dot org
2004-12-27 9:57 ` bonzini at gnu dot org
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: bonzini at gnu dot org @ 2004-12-20 10:22 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From bonzini at gnu dot org 2004-12-20 10:22 -------
I'm putting pending patches at P1, just as a convention.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
Priority|P2 |P1
http://sources.redhat.com/bugzilla/show_bug.cgi?id=611
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug regex/611] regex with a long character sequence requires huge stack space
2004-12-17 8:34 [Bug regex/611] New: regex with a long character sequence requires huge stack space bonzini at gnu dot org
2004-12-17 8:38 ` [Bug regex/611] " bonzini at gnu dot org
2004-12-20 10:22 ` bonzini at gnu dot org
@ 2004-12-27 9:57 ` bonzini at gnu dot org
2004-12-27 15:11 ` bonzini at gnu dot org
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: bonzini at gnu dot org @ 2004-12-27 9:57 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From bonzini at gnu dot org 2004-12-27 09:57 -------
This patch, introducing generic visitors, may also help in the elimination of
CONCAT nodes, which are a memory hog. They could be replaced (together with
the calc_first pass) with a linked list of the binary trees that are
concatenated.
--
http://sources.redhat.com/bugzilla/show_bug.cgi?id=611
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug regex/611] regex with a long character sequence requires huge stack space
2004-12-17 8:34 [Bug regex/611] New: regex with a long character sequence requires huge stack space bonzini at gnu dot org
` (2 preceding siblings ...)
2004-12-27 9:57 ` bonzini at gnu dot org
@ 2004-12-27 15:11 ` bonzini at gnu dot org
2004-12-27 16:43 ` drepper at redhat dot com
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: bonzini at gnu dot org @ 2004-12-27 15:11 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From bonzini at gnu dot org 2004-12-27 15:11 -------
Created an attachment (id=324)
--> (http://sources.redhat.com/bugzilla/attachment.cgi?id=324&action=view)
version of the patch not including the changes in BZ605 (same as posted to
libc-alpha)
--
http://sources.redhat.com/bugzilla/show_bug.cgi?id=611
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug regex/611] regex with a long character sequence requires huge stack space
2004-12-17 8:34 [Bug regex/611] New: regex with a long character sequence requires huge stack space bonzini at gnu dot org
` (3 preceding siblings ...)
2004-12-27 15:11 ` bonzini at gnu dot org
@ 2004-12-27 16:43 ` drepper at redhat dot com
2004-12-28 16:34 ` bonzini at gnu dot org
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: drepper at redhat dot com @ 2004-12-27 16:43 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From drepper at redhat dot com 2004-12-27 16:43 -------
I'm waiting to be able to apply the patch in bug 605.
Also, always use unified diffs, not this horrible, unreadable context diff format.
--
http://sources.redhat.com/bugzilla/show_bug.cgi?id=611
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug regex/611] regex with a long character sequence requires huge stack space
2004-12-17 8:34 [Bug regex/611] New: regex with a long character sequence requires huge stack space bonzini at gnu dot org
` (4 preceding siblings ...)
2004-12-27 16:43 ` drepper at redhat dot com
@ 2004-12-28 16:34 ` bonzini at gnu dot org
2005-01-26 22:28 ` drepper at redhat dot com
2005-01-26 22:43 ` drepper at redhat dot com
7 siblings, 0 replies; 9+ messages in thread
From: bonzini at gnu dot org @ 2004-12-28 16:34 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From bonzini at gnu dot org 2004-12-28 16:34 -------
Ok. I usually prefer context diffs for very large patches, but I'll remember this.
I've updated BZ 605.
Paolo
--
What |Removed |Added
----------------------------------------------------------------------------
BugsThisDependsOn| |605
http://sources.redhat.com/bugzilla/show_bug.cgi?id=611
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug regex/611] regex with a long character sequence requires huge stack space
2004-12-17 8:34 [Bug regex/611] New: regex with a long character sequence requires huge stack space bonzini at gnu dot org
` (5 preceding siblings ...)
2004-12-28 16:34 ` bonzini at gnu dot org
@ 2005-01-26 22:28 ` drepper at redhat dot com
2005-01-26 22:43 ` drepper at redhat dot com
7 siblings, 0 replies; 9+ messages in thread
From: drepper at redhat dot com @ 2005-01-26 22:28 UTC (permalink / raw)
To: glibc-bugs-regex
--
Bug 611 depends on bug 605, which changed state.
Bug 605 Summary: regex goes uselessly through slow paths
http://sources.redhat.com/bugzilla/show_bug.cgi?id=605
What |Old Value |New Value
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
http://sources.redhat.com/bugzilla/show_bug.cgi?id=611
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug regex/611] regex with a long character sequence requires huge stack space
2004-12-17 8:34 [Bug regex/611] New: regex with a long character sequence requires huge stack space bonzini at gnu dot org
` (6 preceding siblings ...)
2005-01-26 22:28 ` drepper at redhat dot com
@ 2005-01-26 22:43 ` drepper at redhat dot com
7 siblings, 0 replies; 9+ messages in thread
From: drepper at redhat dot com @ 2005-01-26 22:43 UTC (permalink / raw)
To: glibc-bugs-regex
------- Additional Comments From drepper at redhat dot com 2005-01-26 22:43 -------
Applied to cvs after adjusted to current sources.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
http://sources.redhat.com/bugzilla/show_bug.cgi?id=611
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2005-01-26 22:43 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-12-17 8:34 [Bug regex/611] New: regex with a long character sequence requires huge stack space bonzini at gnu dot org
2004-12-17 8:38 ` [Bug regex/611] " bonzini at gnu dot org
2004-12-20 10:22 ` bonzini at gnu dot org
2004-12-27 9:57 ` bonzini at gnu dot org
2004-12-27 15:11 ` bonzini at gnu dot org
2004-12-27 16:43 ` drepper at redhat dot com
2004-12-28 16:34 ` bonzini at gnu dot org
2005-01-26 22:28 ` drepper at redhat dot com
2005-01-26 22:43 ` drepper at redhat dot com
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).