public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/111129] New: std::regex incorrectly matches quantifiers with plus appended
From: gcc at octaforge dot org @ 2023-08-24 11:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129
Bug ID: 111129
Summary: std::regex incorrectly matches quantifiers with plus
appended
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: gcc at octaforge dot org
Target Milestone: ---
Example code:
```
#include <cstdio>
#include <string>
#include <regex>
int main(void) {
    std::smatch matches;
    auto re = std::regex(R"(a++)", std::regex::icase);
    std::string inp = "aaa";
    std::regex_search(inp, matches, re);
    for (auto &match: matches) {
        printf("%s\n", match.str().data());
    }
}
```
With libstdc++, this does not throw, and prints 'aaa'.
This gives people the false impression that libstdc++ implements possessive
quantifiers (see e.g. https://github.com/wwmm/easyeffects/pull/2536), even
though neither the documentation nor the code refers to any such extension (and
the C++ standard does not mention it either). You can verify that the semantics
are not possessive by changing the pattern to 'a++a': with possessive semantics
it should not match anything, because 'a++' would consume all of 'aaa' and never
give a character back for the trailing 'a', but with libstdc++ the match is
identical to the one before.
With libc++, this correctly fails with: libc++abi: terminating due to uncaught
exception of type std::__1::regex_error: One of *?+{ was not preceded by a
valid regular expression.
* [Bug libstdc++/111129] std::regex incorrectly matches quantifiers with plus appended
From: redi at gcc dot gnu.org @ 2023-08-24 12:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129
Jonathan Wakely <redi at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-08-24
     Ever confirmed|0                           |1
--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Confirmed. There is no such extension in libstdc++, this is just a bug.
* [Bug libstdc++/111129] std::regex incorrectly matches quantifiers with plus appended
From: redi at gcc dot gnu.org @ 2023-08-24 15:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129
Jonathan Wakely <redi at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timshen at gcc dot gnu.org
--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
This changed with r0-127716-g053eb1f31ede72
* include/bits/regex_compiler.h (_Comipler<>::_M_quantifier()):
Fix parse error of multiple consecutive quantifiers like "a**".
* include/bits/regex_compiler.tcc (_Comipler<>::_M_quantifier()):
Likewise.
* testsuite/28_regex/basic_regex/multiple_quantifiers.cc: New.
That added the following tests:
regex re1("a++");
regex re2("(a+)+");
The second one is fine, but the first is invalid. The "a**" case mentioned in
the commit message is invalid too.
I don't know why Tim wanted to accept the first one. ECMAScript and POSIX don't
accept it. My GNU grep does, but I think that's just a bug in glibc's regcomp:
https://sourceware.org/bugzilla/show_bug.cgi?id=20095
* [Bug libstdc++/111129] std::regex incorrectly matches quantifiers with plus appended
From: redi at gcc dot gnu.org @ 2023-08-31 9:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129
--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
I have a patch to reject "a++" and "a**" but it doesn't work for "a+?" which is
invalid for POSIX REs, but valid for ECMAScript.
* [Bug libstdc++/111129] std::regex incorrectly matches quantifiers with plus appended
From: redi at gcc dot gnu.org @ 2023-10-07 11:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129
Jonathan Wakely <redi at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hewillk at gmail dot com
--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
*** Bug 111713 has been marked as a duplicate of this bug. ***