public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/111129] New: std::regex incorrectly matches quantifiers with plus appended
From: gcc at octaforge dot org @ 2023-08-24 11:37 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129

            Bug ID: 111129
           Summary: std::regex incorrectly matches quantifiers with plus
                    appended
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gcc at octaforge dot org
  Target Milestone: ---

Example code:

```
#include <cstdio>
#include <string>
#include <regex>

int main(void) {
    std::smatch matches;
    auto re = std::regex(R"(a++)", std::regex::icase);
    std::string inp = "aaa";
    std::regex_search(inp, matches, re);
    for (auto &match: matches) {
        printf("%s\n", match.str().data());
    }
}
```

With libstdc++, this does not crash, and outputs 'aaa'. This gives people a
false idea that libstdc++ implements possessive quantifiers (see e.g.
https://github.com/wwmm/easyeffects/pull/2536) despite the documentation or
code having no references to any such extension (and the C++ standard likewise
not mentioning it).

You can verify that the semantics are not possessive by changing the pattern to
'a++a', which should with possessive semantics not match anything, but with
libstdc++ it's an identical match as before.

With libc++, this correctly fails with:

libc++abi: terminating due to uncaught exception of type std::__1::regex_error:
One of *?+{ was not preceded by a valid regular expression.
* [Bug libstdc++/111129] std::regex incorrectly matches quantifiers with plus appended
From: redi at gcc dot gnu.org @ 2023-08-24 12:22 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-08-24
     Ever confirmed|0                           |1

--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Confirmed. There is no such extension in libstdc++, this is just a bug.
* [Bug libstdc++/111129] std::regex incorrectly matches quantifiers with plus appended
From: redi at gcc dot gnu.org @ 2023-08-24 15:16 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timshen at gcc dot gnu.org

--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
This changed with r0-127716-g053eb1f31ede72

        * include/bits/regex_compiler.h (_Comipler<>::_M_quantifier()):
        Fix parse error of multiple consecutive quantifiers like "a**".
        * include/bits/regex_compiler.tcc (_Comipler<>::_M_quantifier()):
        Likewise.
        * testsuite/28_regex/basic_regex/multiple_quantifiers.cc: New.

That added the following tests:

  regex re1("a++");
  regex re2("(a+)+");

The second one is fine, but the first is invalid. The "a**" case mentioned in
the commit message is invalid too.

I don't know why Tim wanted to accept the first one. ECMAScript and POSIX don't
accept it. My GNU grep does, but I think that's just a bug in glibc's regcomp:
https://sourceware.org/bugzilla/show_bug.cgi?id=20095
* [Bug libstdc++/111129] std::regex incorrectly matches quantifiers with plus appended
From: redi at gcc dot gnu.org @ 2023-08-31 9:24 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
I have a patch to reject "a++" and "a**" but it doesn't work for "a+?" which is
invalid for POSIX REs, but valid for ECMAScript.
* [Bug libstdc++/111129] std::regex incorrectly matches quantifiers with plus appended
From: redi at gcc dot gnu.org @ 2023-10-07 11:32 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111129

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hewillk at gmail dot com

--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
*** Bug 111713 has been marked as a duplicate of this bug. ***