public inbox for glibc-bugs-regex@sourceware.org
* [Bug regex/20095] New: parse_dup_op duplicates the tree exponentially when using repeated +
From: dualbus at gmail dot com @ 2016-05-14  1:08 UTC
  To: glibc-bugs-regex

https://sourceware.org/bugzilla/show_bug.cgi?id=20095

            Bug ID: 20095
           Summary: parse_dup_op duplicates the tree exponentially when
                    using repeated +
           Product: glibc
           Version: 2.24
            Status: NEW
          Severity: normal
          Priority: P2
         Component: regex
          Assignee: unassigned at sourceware dot org
          Reporter: dualbus at gmail dot com
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

For every repeated + in an extended regex, parse_dup_op seems to duplicate the
parse tree.

dualbus@hp:~/v$ ulimit -a | grep cpu
cpu time               (seconds, -t) 1
dualbus@hp:~/v$ grep -E '.++++++++++++++++++++++++++++++++' <<< .
Killed

This seems to be special to +, since * doesn't behave that way.

My guess is that:

.+ is expanded to ..*

So

.+++ is expanded to ........*

And so on. Is this documented somewhere?
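
The guess above can be turned into a back-of-the-envelope model. This is only a sketch of that hypothesis, not glibc's actual parse_dup_op: it assumes X+ is rewritten as CONCAT(X, STAR(copy of X)), while X* only wraps X in a single STAR node. The function names and the node-counting scheme are invented for illustration.

```python
def nodes_after_plus(levels: int, atom_nodes: int = 1) -> int:
    """Node count after stacking `levels` '+' operators on one atom.

    Assumed rewrite (hypothetical): X+ -> CONCAT(X, STAR(copy of X)).
    """
    n = atom_nodes
    for _ in range(levels):
        # Two copies of the operand, plus one CONCAT and one STAR node.
        n = 2 * n + 2
    return n


def nodes_after_star(levels: int, atom_nodes: int = 1) -> int:
    """Node count after stacking `levels` '*' operators on one atom.

    Assumed rewrite (hypothetical): X* -> STAR(X), with no copy of X.
    """
    return atom_nodes + levels


if __name__ == "__main__":
    # Columns: number of stacked operators, nodes with '+', nodes with '*'.
    for k in (1, 2, 4, 8, 16, 32):
        print(k, nodes_after_plus(k), nodes_after_star(k))
```

Under these assumptions the '+' case grows as 3 * 2**k - 2 nodes for k stacked operators, while the '*' case grows linearly, which would be consistent with the difference observed between + and *.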

-- 
You are receiving this mail because:
You are on the CC list for the bug.

* [Bug regex/20095] parse_dup_op duplicates the tree exponentially when using repeated +
From: dualbus at gmail dot com @ 2016-05-14  1:10 UTC
  To: glibc-bugs-regex

https://sourceware.org/bugzilla/show_bug.cgi?id=20095

Eduardo Bustamante <dualbus at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dualbus at gmail dot com

--- Comment #1 from Eduardo Bustamante <dualbus at gmail dot com> ---
This seems to be related to bug 17150


* [Bug regex/20381] New: different results between whether fastmap is available or not
From: fweimer at redhat dot com @ 2016-07-19 10:01 UTC
  To: glibc-bugs-regex

https://sourceware.org/bugzilla/show_bug.cgi?id=20381

            Bug ID: 20381
           Summary: different results between whether fastmap is available
                    or not
           Product: glibc
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: regex
          Assignee: unassigned at sourceware dot org
          Reporter: noritnk at kcn dot ne.jp
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---
             Flags: security-

--
LC_ALL=el_GR.iso88597

<U03A3>  /xd3  GREEK CAPITAL LETTER SIGMA
<U03C2>  /xf2  GREEK SMALL LETTER FINAL SIGMA
<U03C3>  /xf3  GREEK SMALL LETTER SIGMA

toupper
<U03C3>,<U03A3>
<U03C2>,<U03A3>

tolower
<U03A3>,<U03C3>

totitle
<U03C3>,<U03A3>
<U03C2>,<U03A3>
--

If fastmap is not available, each of the three characters matches itself and
the other two, without exception.

However, if fastmap is available, GREEK SMALL LETTER FINAL SIGMA matches
neither itself nor the other two characters.
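
The asymmetry in the mapping tables above can be illustrated with a short sketch. It uses Python's built-in Unicode case mappings as a stand-in for the el_GR.iso88597 locale tables and does not exercise glibc's regex code at all; the report does not state the cause of the fastmap difference, so this only shows that FINAL SIGMA does not survive an upper-then-lower round trip. The helper names are hypothetical.

```python
import unicodedata

SIGMA_CAPITAL = "\u03a3"  # byte 0xd3 in ISO 8859-7
SIGMA_FINAL = "\u03c2"    # byte 0xf2 in ISO 8859-7
SIGMA_SMALL = "\u03c3"    # byte 0xf3 in ISO 8859-7
SIGMAS = (SIGMA_CAPITAL, SIGMA_FINAL, SIGMA_SMALL)


def round_trip(ch: str) -> str:
    """Map a single character to upper case and back to lower case."""
    return ch.upper().lower()


def same_case_class(a: str, b: str) -> bool:
    """Treat two characters as case-equivalent if they share an upper-case form."""
    return a.upper() == b.upper()


if __name__ == "__main__":
    for ch in SIGMAS:
        print(
            ch.encode("iso8859_7").hex(),
            unicodedata.name(ch),
            "round trip ->",
            unicodedata.name(round_trip(ch)),
        )
```

Grouping by upper-case form puts all three characters in one class, whereas a table built only from one-way tolower or round-trip lookups never yields FINAL SIGMA. Whether that is what happens in the fastmap path is an assumption, not something the report confirms.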


* [Bug regex/20381] different results between whether fastmap is available or not
From: noritnk at kcn dot ne.jp @ 2016-07-24  3:59 UTC
  To: glibc-bugs-regex

https://sourceware.org/bugzilla/show_bug.cgi?id=20381

--- Comment #1 from Norihiro Tanaka <noritnk at kcn dot ne.jp> ---
Created attachment 9401
  --> https://sourceware.org/bugzilla/attachment.cgi?id=9401&action=edit
test case for this bug


* [Bug regex/20095] parse_dup_op duplicates the tree exponentially when using repeated +
From: dpmendenhall at gmail dot com @ 2020-04-15 15:53 UTC
  To: glibc-bugs-regex

https://sourceware.org/bugzilla/show_bug.cgi?id=20095

David Mendenhall <dpmendenhall at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dpmendenhall at gmail dot com

--- Comment #2 from David Mendenhall <dpmendenhall at gmail dot com> ---
*** Bug 25814 has been marked as a duplicate of this bug. ***


* [Bug regex/20095] parse_dup_op duplicates the tree exponentially when using repeated +
From: jwakely.gcc at gmail dot com @ 2023-08-24 15:14 UTC
  To: glibc-bugs-regex

https://sourceware.org/bugzilla/show_bug.cgi?id=20095

Jonathan Wakely <jwakely.gcc at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jwakely.gcc at gmail dot com

--- Comment #3 from Jonathan Wakely <jwakely.gcc at gmail dot com> ---
This behaviour can rapidly exhaust memory (Bug 25814, Bug 28864, Bug 20095,
Bug 29642), which seems unhelpful when ".++" is not even a valid regex. POSIX
says it's undefined:
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04_06

Why doesn't regcomp just fail to compile it with REG_BADRPT?

Similarly for ".**" etc.

GNU grep seems to have tests for these that expect BADRPT:
https://git.savannah.gnu.org/cgit/grep.git/tree/tests/tests?h=v3.11#n234
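
As a rough illustration of the REG_BADRPT idea, the sketch below scans an extended regex for a repetition operator that directly follows another one. It is a hypothetical pre-check, not code from glibc, gnulib, or GNU grep: the function names are invented, and the scanner only understands backslash escapes, bracket expressions, and {m,n} intervals.

```python
import re

# An ERE interval body such as "2", "2," or "2,5" (the text between braces).
_INTERVAL = re.compile(r"\d+(,\d*)?")


def _skip_bracket(pattern: str, start: int) -> int:
    """Return the index just past the bracket expression opening at `start`."""
    n = len(pattern)
    i = start + 1
    if i < n and pattern[i] == "^":
        i += 1
    if i < n and pattern[i] == "]":
        i += 1  # a leading ']' is a literal member of the set
    while i < n and pattern[i] != "]":
        if pattern[i] == "[" and i + 1 < n and pattern[i + 1] in ":.=":
            # Skip [:class:], [.coll.] and [=equiv=] as one unit.
            close = pattern.find(pattern[i + 1] + "]", i + 2)
            if close == -1:
                return n
            i = close + 2
        else:
            i += 1
    return min(i + 1, n)


def find_stacked_repetition(pattern: str) -> int:
    """Return the index of the first repetition operator that directly
    follows another repetition operator, or -1 if there is none."""
    i = 0
    prev_was_repeat = False
    while i < len(pattern):
        c = pattern[i]
        if c == "\\":
            i += 2  # escaped character is an ordinary atom
            prev_was_repeat = False
        elif c == "[":
            i = _skip_bracket(pattern, i)
            prev_was_repeat = False
        elif c in "*+?":
            if prev_was_repeat:
                return i
            prev_was_repeat = True
            i += 1
        elif c == "{":
            end = pattern.find("}", i)
            if end != -1 and _INTERVAL.fullmatch(pattern[i + 1:end]):
                if prev_was_repeat:
                    return i
                prev_was_repeat = True
                i = end + 1
            else:
                prev_was_repeat = False  # treat a stray '{' as a literal
                i += 1
        else:
            prev_was_repeat = False
            i += 1
    return -1
```

A caller could reject the pattern whenever the result is not -1, for example find_stacked_repetition(".++") reports index 2. Note that this also flags "a+?", which is a lazy quantifier in Perl-style syntax but a stacked repetition in POSIX ERE.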


* [Bug regex/20095] parse_dup_op duplicates the tree exponentially when using repeated +
From: adhemerval.zanella at linaro dot org @ 2023-08-24 20:21 UTC
  To: glibc-bugs-regex

https://sourceware.org/bugzilla/show_bug.cgi?id=20095

Adhemerval Zanella <adhemerval.zanella at linaro dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |adhemerval.zanella at linaro dot org

--- Comment #4 from Adhemerval Zanella <adhemerval.zanella at linaro dot org> ---
I am not sure how exactly GNU grep handles this, since it also uses the gnulib
regex code. Is this code really being tested by GNU grep? I deleted the file, ran
make check, and it did not seem to affect the test's outcome.

I also tried to sync with gnulib master, but it also does not help.


* [Bug regex/20095] parse_dup_op duplicates the tree exponentially when using repeated +
From: jwakely.gcc at gmail dot com @ 2023-08-24 20:28 UTC
  To: glibc-bugs-regex

https://sourceware.org/bugzilla/show_bug.cgi?id=20095

--- Comment #5 from Jonathan Wakely <jwakely.gcc at gmail dot com> ---
(In reply to Adhemerval Zanella from comment #4)
> Is this code really being tested by GNU grep?

No idea - I grepped the grep code for cases of '++' and found those tests, I
don't know if they're actually run or not.

FWIW, Solaris 2.11 and AIX 7.3 both have the same behaviour for "a++"

jwakely@gcc-solaris11:~$ /usr/xpg4/bin/grep -E  a++ <<< a
a

$ /usr/bin/grep -E  a++ <<< a
a


So maybe POSIX says it's undefined to allow for this traditional/common
behaviour.

Glibc's support for it seems poor though, given the memory exhaustion problems.


* [Bug regex/20095] parse_dup_op duplicates the tree exponentially when using repeated +
From: sh200105 at mail dot ru @ 2023-08-24 22:43 UTC
  To: glibc-bugs-regex

https://sourceware.org/bugzilla/show_bug.cgi?id=20095

Alexander Kernozhitsky <sh200105 at mail dot ru> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sh200105 at mail dot ru
