public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/61424] New: std::regex matches right to left, not leftmost longest
From: redi at gcc dot gnu.org @ 2014-06-05 20:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61424
Bug ID: 61424
Summary: std::regex matches right to left, not leftmost longest
Product: gcc
Version: 4.9.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: redi at gcc dot gnu.org
#include <regex>
#include <iostream>

using namespace std;

int main()
{
  regex_constants::syntax_option_type grammar[] = {
    regex_constants::ECMAScript, regex_constants::extended,
    regex_constants::awk, regex_constants::egrep
  };

  for (auto g : grammar)
  {
    regex re("tournament|tour", g);
    const char str[] = "tournament";
    cmatch m;
    regex_search(str, m, re);
    cout << m[0] << endl;
  }
}
This prints:
tour
tour
tour
tour
ECMAScript should try the alternatives left to right, and POSIX has the
leftmost, longest rule
(http://www.boost.org/doc/libs/1_55_0/libs/regex/doc/html/boost_regex/syntax/leftmost_longest_rule.html),
so every grammar should print "tournament" here.
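The two rules can be sketched without std::regex at all. The following is only an illustration for an alternation of non-empty literal strings; the function names ecmascript_choice and posix_choice are made up for this sketch and are not part of any library. An empty return value means "no match".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if `alt` occurs in `subject` at offset `pos`.
static bool matches_at(const std::string& subject, std::size_t pos,
                       const std::string& alt)
{
  return subject.compare(pos, alt.size(), alt) == 0;
}

// ECMAScript-style rule: at the leftmost position where anything matches,
// take the first alternative, in pattern order, that matches there.
std::string ecmascript_choice(const std::string& subject,
                              const std::vector<std::string>& alts)
{
  for (std::size_t pos = 0; pos <= subject.size(); ++pos)
    for (const auto& alt : alts)
      if (matches_at(subject, pos, alt))
        return alt;
  return std::string();
}

// POSIX-style rule: at the leftmost position where anything matches,
// take the longest alternative that matches there.
std::string posix_choice(const std::string& subject,
                         const std::vector<std::string>& alts)
{
  for (std::size_t pos = 0; pos <= subject.size(); ++pos)
    {
      std::string best;
      for (const auto& alt : alts)
        if (matches_at(subject, pos, alt) && alt.size() > best.size())
          best = alt;
      if (!best.empty())
        return best;
    }
  return std::string();
}
```

For the pattern "tournament|tour" both functions pick "tournament" on the input "tournament"; the rules only disagree when a shorter alternative is listed before a longer one.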
* [Bug libstdc++/61424] std::regex matches right to left, not leftmost longest
From: redi at gcc dot gnu.org @ 2014-06-05 21:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61424
--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
A slight variation:
#include <regex>
#include <iostream>

using namespace std;

int main()
{
  regex_constants::syntax_option_type grammar[] = {
    regex_constants::ECMAScript, regex_constants::extended,
    regex_constants::awk, regex_constants::egrep
  };

  for (auto g : grammar)
  {
    regex re("tour|tournament|tourn", g);
    const char str[] = "tournament";
    cmatch m;
    if (regex_search(str, m, re))
      cout << m[0] << endl;
    else
      cout << "-" << endl;
  }
}
ECMAScript should match "tour"; the POSIX ERE grammars should match
"tournament".
Instead we match "tourn" for all grammars.
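The expectation stated above can be turned into a self-checking helper. This is a sketch only: first_match, expected_for and grammar_behaves are names invented here, and whether a given standard library passes the check depends on its version.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the text of the first match of `pattern` in `subject` under the
// given grammar, or "-" if nothing matches (same convention as above).
std::string first_match(const std::string& pattern,
                        std::regex_constants::syntax_option_type grammar,
                        const std::string& subject)
{
  std::regex re(pattern, grammar);
  std::smatch m;
  return std::regex_search(subject, m, re) ? m.str(0) : std::string("-");
}

// Expected result for "tour|tournament|tourn" on "tournament": the first
// alternative for ECMAScript, the longest one for the POSIX grammars.
std::string expected_for(std::regex_constants::syntax_option_type grammar)
{
  return grammar == std::regex_constants::ECMAScript ? "tour" : "tournament";
}

// True when the library under test agrees with the expectation.
bool grammar_behaves(std::regex_constants::syntax_option_type grammar)
{
  return first_match("tour|tournament|tourn", grammar, "tournament")
         == expected_for(grammar);
}
```

Calling grammar_behaves for each of ECMAScript, extended, awk and egrep gives one pass/fail result per grammar instead of output that has to be compared by eye.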
* [Bug libstdc++/61424] std::regex matches right to left, not leftmost longest
From: timshen at gcc dot gnu.org @ 2014-06-05 22:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61424
--- Comment #2 from Tim Shen <timshen at gcc dot gnu.org> ---
Sorry, which alternative of "|" is preferred is actually arbitrary at the
moment. I'll fix it later.
* [Bug libstdc++/61424] std::regex matches right to left, not leftmost longest
From: timshen at gcc dot gnu.org @ 2014-07-01 2:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61424
--- Comment #3 from Tim Shen <timshen at gcc dot gnu.org> ---
Author: timshen
Date: Tue Jul 1 02:10:31 2014
New Revision: 212184
URL: https://gcc.gnu.org/viewcvs?rev=212184&root=gcc&view=rev
Log:
    PR libstdc++/61424
    * include/bits/regex.tcc (__regex_algo_impl<>): Use DFS for ECMAScript,
    not just regex containing back-references.
    * include/bits/regex_compiler.tcc (_Compiler<>::_M_disjunction):
    Exchange _M_next and _M_alt for alternative operator,
    making matching from left to right.
    * include/bits/regex_executor.h (_State_info<>::_M_get_sol_pos):
    Add position tracking for DFS.
    * include/bits/regex_executor.tcc (_Executor<>::_M_main_dispatch,
    _Executor<>::_M_dfs): Likewise.
    * include/bits/regex_scanner.h: Remove unused enum entry.
    * testsuite/28_regex/algorithms/regex_search/61424.cc: New
    testcase from PR.

Added:
    trunk/libstdc++-v3/testsuite/28_regex/algorithms/regex_search/61424.cc
Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/include/bits/regex.tcc
    trunk/libstdc++-v3/include/bits/regex_compiler.tcc
    trunk/libstdc++-v3/include/bits/regex_executor.h
    trunk/libstdc++-v3/include/bits/regex_executor.tcc
    trunk/libstdc++-v3/include/bits/regex_scanner.h
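
Why the order of the two branches matters for a depth-first executor can be shown with a toy NFA. This is not the libstdc++ implementation: the State layout, the field names next and alt, and the helper functions are assumptions made for this sketch, and the match is anchored at offset 0 for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy NFA state; the field names `next` and `alt` are illustrative only.
struct State
{
  enum Kind { Char, Alternative, Accept } kind;
  char ch;   // Char: the character to consume
  int next;  // Char: following state; Alternative: branch tried first
  int alt;   // Alternative: branch tried second
};

// Depth-first search: stops at the first accepting path it reaches and
// stores the end offset of that match in `end`.
bool dfs(const std::vector<State>& nfa, int s, const std::string& in,
         std::size_t pos, std::size_t& end)
{
  const State& st = nfa[s];
  switch (st.kind)
    {
    case State::Accept:
      end = pos;
      return true;
    case State::Char:
      return pos < in.size() && in[pos] == st.ch
             && dfs(nfa, st.next, in, pos + 1, end);
    case State::Alternative:
      return dfs(nfa, st.next, in, pos, end)
             || dfs(nfa, st.alt, in, pos, end);
    }
  return false;
}

// Appends a chain of Char states spelling `lit` and leading to `accept`;
// returns the index of the chain's entry state.
int add_literal(std::vector<State>& nfa, const std::string& lit, int accept)
{
  int entry = accept;
  for (std::size_t i = lit.size(); i-- > 0; )
    {
      nfa.push_back({State::Char, lit[i], entry, -1});
      entry = static_cast<int>(nfa.size()) - 1;
    }
  return entry;
}

// Builds `left|right` and matches it at the start of `in`. With
// left_first == false the branches are stored the other way round, so the
// right-hand alternative is tried first.
std::string match_alternation(const std::string& left, const std::string& right,
                              const std::string& in, bool left_first)
{
  std::vector<State> nfa;
  nfa.push_back({State::Accept, '\0', -1, -1});  // state 0
  int l = add_literal(nfa, left, 0);
  int r = add_literal(nfa, right, 0);
  nfa.push_back({State::Alternative, '\0',
                 left_first ? l : r, left_first ? r : l});
  int start = static_cast<int>(nfa.size()) - 1;
  std::size_t end = 0;
  return dfs(nfa, start, in, 0, end) ? in.substr(0, end) : std::string();
}
```

With "tournament|tour" on "tournament", storing the left alternative as the first branch yields "tournament", while the swapped order yields "tour", the output from the original report. A leftmost-longest search would instead have to keep exploring after the first accepting path and remember the furthest end position.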
* [Bug libstdc++/61424] std::regex matches right to left, not leftmost longest
From: pierreblavy at yahoo dot fr @ 2015-02-10 14:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61424
pierreblavy at yahoo dot fr changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pierreblavy at yahoo dot fr
--- Comment #4 from pierreblavy at yahoo dot fr ---
*** Bug 64936 has been marked as a duplicate of this bug. ***
Thread overview: 5+ messages
2014-06-05 20:11 [Bug libstdc++/61424] New: std::regex matches right to left, not leftmost longest redi at gcc dot gnu.org
2014-06-05 21:06 ` [Bug libstdc++/61424] " redi at gcc dot gnu.org
2014-06-05 22:33 ` timshen at gcc dot gnu.org
2014-07-01 2:11 ` timshen at gcc dot gnu.org
2015-02-10 14:43 ` pierreblavy at yahoo dot fr