public inbox for glibc-bugs-regex@sourceware.org
* [Bug regex/25814] New: Consecutive + operators accepted but have no effect except consuming more memory
@ 2020-04-12  4:06 dpmendenhall at gmail dot com
  2020-04-12 11:55 ` [Bug regex/25814] " schwab@linux-m68k.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: dpmendenhall at gmail dot com @ 2020-04-12  4:06 UTC (permalink / raw)
  To: glibc-bugs-regex

https://sourceware.org/bugzilla/show_bug.cgi?id=25814

            Bug ID: 25814
           Summary: Consecutive + operators accepted but have no effect
                    except consuming more memory
           Product: glibc
           Version: 2.27
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: regex
          Assignee: unassigned at sourceware dot org
          Reporter: dpmendenhall at gmail dot com
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

Created attachment 12452
  --> https://sourceware.org/bugzilla/attachment.cgi?id=12452&action=edit
test program

The attached test program takes a regex pattern and a string on the
command line and reports whether the pattern matches.
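
The attachment itself is not reproduced in this thread. As a rough sketch of
what such a program's core might look like, assuming it uses the POSIX
regcomp()/regexec() interface with REG_EXTENDED (run_regex_test is a
hypothetical name, and the error messages are illustrative only):

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper, not the code from the attachment.
   Compiles `pattern` as a POSIX ERE, matches it against `string`,
   prints the outcome, and returns 0 on match, 1 on no match,
   2 on a compile or match error. */
int run_regex_test(const char *pattern, const char *string)
{
    regex_t re;
    char errbuf[256];
    int rc;

    printf("pattern: %s\nstring: %s\n\n", pattern, string);

    rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
    if (rc != 0) {
        /* regerror() turns the regcomp() error code into a message. */
        regerror(rc, &re, errbuf, sizeof errbuf);
        printf("regcomp failed: %s\n", errbuf);
        return 2;
    }

    rc = regexec(&re, string, 0, NULL, 0);
    regfree(&re);

    if (rc == 0) {
        puts("regex matched");
        return 0;
    }
    if (rc == REG_NOMATCH) {
        puts("regex did not match");
        return 1;
    }
    puts("regexec failed");
    return 2;
}
```

A main() that forwards argv[1] and argv[2] to this helper would complete the
command-line tool.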

$ gcc -o regex regex.c
$ ./regex 0+ 0
pattern: 0+
string: 0

regex matched
$ ./regex 0++ 0
pattern: 0++
string: 0

regex matched
$ ./regex 0++++++++++++++++++++++++++++++++++ 0
pattern: 0++++++++++++++++++++++++++++++++++
string: 0
<hangs consuming all system memory>

I'm not even sure what consecutive + operators are supposed to mean, so I don't
know why "0++" accepts "0".

I tested this against the bionic/NetBSD regex implementation and compilation of
"0++" fails with REG_BADRPT, which makes more sense.
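
To try the long pattern without exhausting system memory, the compile and
match can be run in a child process under a memory and time limit. This is
only a sketch under assumptions: try_regex_limited and its return codes are
hypothetical, the limits are arbitrary, and it relies on fork(), setrlimit()
and alarm() being available.

```c
#include <regex.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper. Runs regcomp()/regexec() in a child process whose
   address space is capped at max_bytes and whose run time is capped at
   max_seconds. Returns the child's exit status (0 match, 1 no match,
   2 regcomp error, 3 setrlimit failure), 128 + signal number if the
   child was killed by a signal, or -1 if fork() or waitpid() failed. */
int try_regex_limited(const char *pattern, const char *string,
                      rlim_t max_bytes, unsigned max_seconds)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct rlimit lim;
        regex_t re;
        int rc;

        lim.rlim_cur = max_bytes;
        lim.rlim_max = max_bytes;
        if (setrlimit(RLIMIT_AS, &lim) != 0)
            _exit(3);

        /* The default action for SIGALRM terminates the child. */
        alarm(max_seconds);

        if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
            _exit(2);
        rc = regexec(&re, string, 0, NULL, 0);
        _exit(rc == 0 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

For example, try_regex_limited(pattern, "0", 256UL * 1024 * 1024, 5) caps the
child at 256 MiB and 5 seconds; what it returns for the long pattern depends
on the regex implementation.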

-- 
You are receiving this mail because:
You are on the CC list for the bug.

* [Bug regex/25814] Consecutive + operators accepted but have no effect except consuming more memory
From: schwab@linux-m68k.org @ 2020-04-12 11:55 UTC (permalink / raw)
  To: glibc-bugs-regex

--- Comment #1 from Andreas Schwab <schwab@linux-m68k.org> ---
RE_CONTEXT_INVALID_DUP is only part of RE_SYNTAX_POSIX_BASIC, but not
RE_SYNTAX_POSIX_EXTENDED.

* [Bug regex/25814] Consecutive + operators accepted but have no effect except consuming more memory
From: schwab@linux-m68k.org @ 2020-04-12 12:15 UTC (permalink / raw)
  To: glibc-bugs-regex

--- Comment #2 from Andreas Schwab <schwab@linux-m68k.org> ---
POSIX says that multiple adjacent duplication symbols produce undefined results
(both BRE and ERE).
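
Since the result is undefined, a caller that accepts untrusted patterns cannot
count on regcomp() to reject them. A minimal sketch of a pre-check follows,
assuming ERE syntax. The function names are hypothetical, interval
expressions such as {m,n} are not examined, and this is not how glibc or
bionic handle the case.

```c
#include <stdbool.h>
#include <stddef.h>

/* True for the single-character ERE duplication symbols. */
static bool is_dup(char c)
{
    return c == '*' || c == '+' || c == '?';
}

/* Given the index of a '[' in p, return the index of the ']' that closes
   the bracket expression, or the index of the last character before the
   terminating NUL if the expression is unterminated. */
static size_t skip_bracket(const char *p, size_t i)
{
    size_t j = i + 1;

    if (p[j] == '^')
        j++;
    if (p[j] == ']')            /* a leading ']' is a literal */
        j++;
    while (p[j] != '\0' && p[j] != ']') {
        if (p[j] == '[' &&
            (p[j + 1] == ':' || p[j + 1] == '.' || p[j + 1] == '=')) {
            /* Skip a nested [:class:], [.coll.] or [=equiv=] element. */
            char kind = p[j + 1];
            j += 2;
            while (p[j] != '\0' && !(p[j] == kind && p[j + 1] == ']'))
                j++;
            if (p[j] != '\0')
                j += 2;
        } else {
            j++;
        }
    }
    return p[j] == '\0' ? j - 1 : j;
}

/* Return true if the ERE pattern contains two adjacent unescaped
   duplication symbols outside a bracket expression. */
bool ere_has_adjacent_dups(const char *p)
{
    bool prev_dup = false;

    for (size_t i = 0; p[i] != '\0'; i++) {
        char c = p[i];

        if (c == '\\') {
            /* An escaped character is an ordinary character. */
            if (p[i + 1] != '\0')
                i++;
            prev_dup = false;
        } else if (c == '[') {
            i = skip_bracket(p, i);
            prev_dup = false;
        } else if (is_dup(c)) {
            if (prev_dup)
                return true;
            prev_dup = true;
        } else {
            prev_dup = false;
        }
    }
    return false;
}
```

A caller would check ere_has_adjacent_dups(pattern) and refuse the pattern
before it ever reaches regcomp().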

* [Bug regex/25814] Consecutive + operators accepted but have no effect except consuming more memory
From: dpmendenhall at gmail dot com @ 2020-04-13  2:58 UTC (permalink / raw)
  To: glibc-bugs-regex

--- Comment #3 from David Mendenhall <dpmendenhall at gmail dot com> ---
Created attachment 12453
  --> https://sourceware.org/bugzilla/attachment.cgi?id=12453&action=edit
Same as regex.c, but BRE syntax

* [Bug regex/25814] Consecutive + operators accepted but have no effect except consuming more memory
From: dpmendenhall at gmail dot com @ 2020-04-13  2:59 UTC (permalink / raw)
  To: glibc-bugs-regex

--- Comment #4 from David Mendenhall <dpmendenhall at gmail dot com> ---
> RE_CONTEXT_INVALID_DUP is only part of RE_SYNTAX_POSIX_BASIC, but not RE_SYNTAX_POSIX_EXTENDED

The same behavior is reproducible with basic syntax. See regex-basic.c
attached.

$ gcc -o regex-basic regex-basic.c
$ ./regex-basic 0\\+ 0
pattern: 0\+
string: 0

regex matched
$ ./regex-basic 0\\+\\+ 0
pattern: 0\+\+
string: 0

regex matched
./regex-basic
0\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+\\+
0
pattern: 0\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+\+
string: 0
<hangs consuming all system memory>

* [Bug regex/25814] Consecutive + operators accepted but have no effect except consuming more memory
From: dpmendenhall at gmail dot com @ 2020-04-13  3:02 UTC (permalink / raw)
  To: glibc-bugs-regex

--- Comment #5 from David Mendenhall <dpmendenhall at gmail dot com> ---
> POSIX says that multiple adjacent duplication symbols produce undefined results (both BRE and ERE).

Thanks for pointing this out. I was unaware.

So do you think this bug should be closed as INVALID or WONTFIX, or is there
value in investigating the excessive memory consumption and/or rejecting on
compilation like bionic does?

* [Bug regex/25814] Consecutive + operators accepted but have no effect except consuming more memory
From: schwab@linux-m68k.org @ 2020-04-13 11:05 UTC (permalink / raw)
  To: glibc-bugs-regex

--- Comment #6 from Andreas Schwab <schwab@linux-m68k.org> ---
0\+ is not a valid BRE.

* [Bug regex/25814] Consecutive + operators accepted but have no effect except consuming more memory
From: dpmendenhall at gmail dot com @ 2020-04-13 15:35 UTC (permalink / raw)
  To: glibc-bugs-regex

--- Comment #7 from David Mendenhall <dpmendenhall at gmail dot com> ---
Really? The regex.h comments suggest otherwise.

#define RE_SYNTAX_POSIX_BASIC                                           \
  (_RE_SYNTAX_POSIX_COMMON | RE_BK_PLUS_QM | RE_CONTEXT_INVALID_DUP)

/* If this bit is not set, then + and ? are operators, and \+ and \? are
     literals.
   If set, then \+ and \? are operators and + and ? are literals.  */
# define RE_BK_PLUS_QM (RE_BACKSLASH_ESCAPE_IN_LISTS << 1)
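
The two positions can be reconciled: as far as I understand POSIX, a backslash
before an ordinary character such as + has undefined meaning in a BRE, while
the header comment above describes glibc treating \+ as an operator under its
basic syntax. A small sketch for probing this through regcomp() without
REG_EXTENDED (bre_search is a hypothetical helper name):

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper. Compiles `pattern` as a POSIX BRE (no REG_EXTENDED)
   and searches `string` for it. Returns 1 on match, 0 on no match,
   -1 if regcomp() rejects the pattern. */
int bre_search(const char *pattern, const char *string)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, string, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

In a BRE an unescaped + is an ordinary character, so bre_search("^0+$", "0+")
matches and bre_search("^0+$", "00") does not. The portable BRE spellings of
"one or more zeros" are "00*" and "0\{1,\}"; what "0\+" does is up to the
implementation.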

* [Bug regex/25814] Consecutive + operators accepted but have no effect except consuming more memory
From: dpmendenhall at gmail dot com @ 2020-04-15 15:53 UTC (permalink / raw)
  To: glibc-bugs-regex

David Mendenhall <dpmendenhall at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #8 from David Mendenhall <dpmendenhall at gmail dot com> ---
Duplicate of bug 20095.

*** This bug has been marked as a duplicate of bug 20095 ***
