public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/111129] New: std::regex incorrectly matches quantifiers with plus appended
From: gcc at octaforge dot org @ 2023-08-24 11:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129

            Bug ID: 111129
           Summary: std::regex incorrectly matches quantifiers with plus
                    appended
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gcc at octaforge dot org
  Target Milestone: ---

Example code:

```
#include <cstdio>
#include <string>
#include <regex>

int main(void) {
    std::smatch matches;
    auto re = std::regex(R"(a++)", std::regex::icase);
    std::string inp = "aaa";
    std::regex_search(inp, matches, re);
    for (auto &match: matches) {
        printf("%s\n", match.str().data());
    }
}
```

With libstdc++, this does not crash, and outputs 'aaa'.


This gives people the false impression that libstdc++ implements possessive
quantifiers (see e.g. https://github.com/wwmm/easyeffects/pull/2536), even
though neither the documentation nor the code refers to any such extension
(and the C++ standard does not mention one either). You can verify that the
semantics are not possessive by changing the pattern to 'a++a': with possessive
semantics this should not match anything, but with libstdc++ it produces the
same match as before.

With libc++, this correctly fails with: libc++abi: terminating due to uncaught
exception of type std::__1::regex_error: One of *?+{ was not preceded by a
valid regular expression.


* [Bug libstdc++/111129] std::regex incorrectly matches quantifiers with plus appended
From: redi at gcc dot gnu.org @ 2023-08-24 12:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-08-24
     Ever confirmed|0                           |1

--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Confirmed. There is no such extension in libstdc++, this is just a bug.


* [Bug libstdc++/111129] std::regex incorrectly matches quantifiers with plus appended
From: redi at gcc dot gnu.org @ 2023-08-24 15:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timshen at gcc dot gnu.org

--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
This changed with r0-127716-g053eb1f31ede72

            * include/bits/regex_compiler.h (_Compiler<>::_M_quantifier()):
            Fix parse error of multiple consecutive quantifiers like "a**".
            * include/bits/regex_compiler.tcc (_Compiler<>::_M_quantifier()):
            Likewise.
            * testsuite/28_regex/basic_regex/multiple_quantifiers.cc: New.

That added the following tests:

  regex re1("a++");
  regex re2("(a+)+");

The second one is fine, but the first is invalid. The "a**" case mentioned in
the commit message is invalid too.

I don't know why Tim wanted to accept the first one. ECMAScript and POSIX don't
accept it. My GNU grep does, but I think that's just a bug in glibc's regcomp:
https://sourceware.org/bugzilla/show_bug.cgi?id=20095


* [Bug libstdc++/111129] std::regex incorrectly matches quantifiers with plus appended
From: redi at gcc dot gnu.org @ 2023-08-31  9:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
I have a patch to reject "a++" and "a**" but it doesn't work for "a+?" which is
invalid for POSIX REs, but valid for ECMAScript.


* [Bug libstdc++/111129] std::regex incorrectly matches quantifiers with plus appended
From: redi at gcc dot gnu.org @ 2023-10-07 11:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hewillk at gmail dot com

--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
*** Bug 111713 has been marked as a duplicate of this bug. ***
