* [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior
@ 2014-11-18 19:25 Tim Shen
2014-11-18 19:29 ` Paolo Carlini
2014-11-24 11:50 ` Jonathan Wakely
0 siblings, 2 replies; 10+ messages in thread
From: Tim Shen @ 2014-11-18 19:25 UTC (permalink / raw)
To: libstdc++, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 58 bytes --]
Bootstrapped and tested.
Thanks!
--
Regards,
Tim Shen
[-- Attachment #2: a.diff --]
[-- Type: text/plain, Size: 2815 bytes --]
commit f17155183b9ae1283d04f3bbdb61d05d9279ebe4
Author: timshen <timshen@google.com>
Date: Tue Nov 18 00:07:28 2014 -0800
PR libstdc++/63920
* include/bits/regex_executor.h: Make _M_begin non const.
* include/bits/regex_executor.tcc (_Executor<>::_M_search): Increase
_M_begin in search algorithm, so that _M_begin is treated as
"current start position" for each search iteration.
* testsuite/28_regex/algorithms/regex_search/ecma/flags.cc: New
testcase.
diff --git a/libstdc++-v3/include/bits/regex_executor.h b/libstdc++-v3/include/bits/regex_executor.h
index b26992c..2d7796f 100644
--- a/libstdc++-v3/include/bits/regex_executor.h
+++ b/libstdc++-v3/include/bits/regex_executor.h
@@ -205,7 +205,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
public:
_ResultsVec _M_cur_results;
_BiIter _M_current;
- const _BiIter _M_begin;
+ _BiIter _M_begin;
const _BiIter _M_end;
const _RegexT& _M_re;
const _NFAT& _M_nfa;
diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index 38d4781..a973667 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -39,17 +39,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
bool _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
_M_search()
{
+ if (_M_search_from_first())
+ return true;
if (_M_flags & regex_constants::match_continuous)
- return _M_search_from_first();
- auto __cur = _M_begin;
- do
+ return false;
+ _M_flags |= regex_constants::match_prev_avail;
+ while (_M_begin != _M_end)
{
- _M_current = __cur;
- if (_M_main(_Match_mode::_Prefix))
+ ++_M_begin;
+ if (_M_search_from_first())
return true;
}
- // Continue when __cur == _M_end
- while (__cur++ != _M_end);
return false;
}
diff --git a/libstdc++-v3/testsuite/28_regex/algorithms/regex_search/ecma/flags.cc b/libstdc++-v3/testsuite/28_regex/algorithms/regex_search/ecma/flags.cc
index 45ad0f6..6b038ee 100644
--- a/libstdc++-v3/testsuite/28_regex/algorithms/regex_search/ecma/flags.cc
+++ b/libstdc++-v3/testsuite/28_regex/algorithms/regex_search/ecma/flags.cc
@@ -65,6 +65,8 @@ test01()
regex_constants::match_prev_avail));
VERIFY( regex_search_debug("ba"+1, regex("\\Ba"),
regex_constants::match_prev_avail));
+ // PR libstdc++/63920
+ VERIFY(!regex_search_debug("a", regex("b*"), regex_constants::match_not_null));
}
int
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior
2014-11-18 19:25 [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior Tim Shen
@ 2014-11-18 19:29 ` Paolo Carlini
2014-11-19 8:54 ` Tim Shen
2014-11-24 11:50 ` Jonathan Wakely
1 sibling, 1 reply; 10+ messages in thread
From: Paolo Carlini @ 2014-11-18 19:29 UTC (permalink / raw)
To: Tim Shen, libstdc++, gcc-patches
Hi,
On 11/18/2014 08:12 PM, Tim Shen wrote:
> Bootstrapped and tested.
Jonathan lately is following your work much better than me, but naively
seems weird that _M_begin is non-const and _M_end is const, a different
type anyway.
Paolo.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior
2014-11-18 19:29 ` Paolo Carlini
@ 2014-11-19 8:54 ` Tim Shen
2014-11-19 16:25 ` Paolo Carlini
0 siblings, 1 reply; 10+ messages in thread
From: Tim Shen @ 2014-11-19 8:54 UTC (permalink / raw)
To: Paolo Carlini; +Cc: libstdc++, gcc-patches
On Tue, Nov 18, 2014 at 11:19 AM, Paolo Carlini
<paolo.carlini@oracle.com> wrote:
> Jonathan lately is following your work much better than me, but naively
> seems weird that _M_begin is non-const and _M_end is const, a different type
> anyway.
Hmm. The current regex_search algorithm is implemented as try match
starting from _M_begin; if it doesn't match, start over from
_M_begin+1, ...
As we can tell, _M_begin is never changed, and it's const.
The problem is when the executer reaches the accept state (which
indicates a match) we use _M_current == _M_begin to verify if it's an
empty match. It is possible that, when we are not in the first
iteration, say, in the second iteration actually, _M_current is
initialized with _M_begin+1. It turns out even _M_current has never
been increased (no chars are eaten, aka empty match), _M_current !=
_M_begin is still true.
This fix is making each regex_search iteration more thorough, with
increased _M_begin, as if it's a new regex _M_search_from_first.
I've carefully (admittedly, after sending this patch) inspect
everywhere when _M_begin is used. It turns out _M_begin is under
well-defined (the initial position of _M_current when current
iteration starts) invariants (see _Executor<>::_M_at_begin), indicated
by the use of regex_constants::match_prev_avail. This flag actually
implies that __begin iterator passed into regex_search is not always
"the physical boundary" who matches "^". Boost (and we) conforms this
behavior:
std::regex_search("asdf", std::regex("^asdf"),
std::regex_constants::match_prev_avail)
returns false.
It's more elegant to move _Executor::_M_search out of its class and
make _M_begin still const, but _Executor costs too much to initialize.
--
Regards,
Tim Shen
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior
2014-11-19 8:54 ` Tim Shen
@ 2014-11-19 16:25 ` Paolo Carlini
2014-11-19 18:54 ` Tim Shen
0 siblings, 1 reply; 10+ messages in thread
From: Paolo Carlini @ 2014-11-19 16:25 UTC (permalink / raw)
To: Tim Shen; +Cc: libstdc++, gcc-patches
Hi Tim,
On 11/19/2014 08:27 AM, Tim Shen wrote:
> On Tue, Nov 18, 2014 at 11:19 AM, Paolo Carlini
> <paolo.carlini@oracle.com> wrote:
>> Jonathan lately is following your work much better than me, but naively
>> seems weird that _M_begin is non-const and _M_end is const, a different type
>> anyway.
> Hmm. The current regex_search algorithm is implemented as try match
> starting from _M_begin; if it doesn't match, start over from
> _M_begin+1, ...
>
> As we can tell, _M_begin is never changed, and it's const.
>
> The problem is when the executer reaches the accept state (which
> indicates a match) we use _M_current == _M_begin to verify if it's an
> empty match. It is possible that, when we are not in the first
> iteration, say, in the second iteration actually, _M_current is
> initialized with _M_begin+1. It turns out even _M_current has never
> been increased (no chars are eaten, aka empty match), _M_current !=
> _M_begin is still true.
>
> This fix is making each regex_search iteration more thorough, with
> increased _M_begin, as if it's a new regex _M_search_from_first.
>
> I've carefully (admittedly, after sending this patch) inspect
> everywhere when _M_begin is used. It turns out _M_begin is under
> well-defined (the initial position of _M_current when current
> iteration starts) invariants (see _Executor<>::_M_at_begin), indicated
> by the use of regex_constants::match_prev_avail. This flag actually
> implies that __begin iterator passed into regex_search is not always
> "the physical boundary" who matches "^". Boost (and we) conforms this
> behavior:
>
> std::regex_search("asdf", std::regex("^asdf"),
> std::regex_constants::match_prev_avail)
>
> returns false.
>
> It's more elegant to move _Executor::_M_search out of its class and
> make _M_begin still const, but _Executor costs too much to initialize.
Good. To be clear, not having carefully analyzed whatsoever, my point
was more about changing _M_end too, to non-const, than about not
touching _M_begin. Would that make sense?
Thanks,
Paolo.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior
2014-11-19 16:25 ` Paolo Carlini
@ 2014-11-19 18:54 ` Tim Shen
2014-11-19 19:03 ` Daniel Krügler
0 siblings, 1 reply; 10+ messages in thread
From: Tim Shen @ 2014-11-19 18:54 UTC (permalink / raw)
To: Paolo Carlini; +Cc: libstdc++, gcc-patches
On Wed, Nov 19, 2014 at 8:16 AM, Paolo Carlini <paolo.carlini@oracle.com> wrote:
> Good. To be clear, not having carefully analyzed whatsoever, my point was
> more about changing _M_end too, to non-const, than about not touching
> _M_begin. Would that make sense?
Currently we never mutate _M_end. I *believe* that we still won't in
the future, since _M_end is not as volatile as _M_begin.
--
Regards,
Tim Shen
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior
2014-11-19 18:54 ` Tim Shen
@ 2014-11-19 19:03 ` Daniel Krügler
2014-11-19 19:05 ` Paolo Carlini
0 siblings, 1 reply; 10+ messages in thread
From: Daniel Krügler @ 2014-11-19 19:03 UTC (permalink / raw)
To: Tim Shen; +Cc: Paolo Carlini, libstdc++, gcc-patches
2014-11-19 19:42 GMT+01:00 Tim Shen <timshen@google.com>:
> On Wed, Nov 19, 2014 at 8:16 AM, Paolo Carlini <paolo.carlini@oracle.com> wrote:
>> Good. To be clear, not having carefully analyzed whatsoever, my point was
>> more about changing _M_end too, to non-const, than about not touching
>> _M_begin. Would that make sense?
>
> Currently we never mutate _M_end. I *believe* that we still won't in
> the future, since _M_end is not as volatile as _M_begin.
I agree with Tim here, why shouldn't the const member and the
non-const member coexist?
- Daniel
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior
2014-11-19 19:03 ` Daniel Krügler
@ 2014-11-19 19:05 ` Paolo Carlini
2014-11-19 19:52 ` Tim Shen
0 siblings, 1 reply; 10+ messages in thread
From: Paolo Carlini @ 2014-11-19 19:05 UTC (permalink / raw)
To: Daniel Krügler, Tim Shen; +Cc: libstdc++, gcc-patches
Hi,
On 11/19/2014 07:43 PM, Daniel Krügler wrote:
> 2014-11-19 19:42 GMT+01:00 Tim Shen <timshen@google.com>:
>> On Wed, Nov 19, 2014 at 8:16 AM, Paolo Carlini <paolo.carlini@oracle.com> wrote:
>>> Good. To be clear, not having carefully analyzed whatsoever, my point was
>>> more about changing _M_end too, to non-const, than about not touching
>>> _M_begin. Would that make sense?
>> Currently we never mutate _M_end. I *believe* that we still won't in
>> the future, since _M_end is not as volatile as _M_begin.
> I agree with Tim here, why shouldn't the const member and the
> non-const member coexist?
I was just aiming for consistency, from a very, very, general point of
view. Jon will review the substance of the patch, anyway.
Thanks!
Paolo.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior
2014-11-19 19:05 ` Paolo Carlini
@ 2014-11-19 19:52 ` Tim Shen
0 siblings, 0 replies; 10+ messages in thread
From: Tim Shen @ 2014-11-19 19:52 UTC (permalink / raw)
To: Paolo Carlini; +Cc: Daniel Krügler, libstdc++, gcc-patches
On Wed, Nov 19, 2014 at 10:54 AM, Paolo Carlini
<paolo.carlini@oracle.com> wrote:
> I was just aiming for consistency, from a very, very, general point of view.
Not a problem. Thank you actually, since it's obviously good for us :)
--
Regards,
Tim Shen
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior
2014-11-18 19:25 [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior Tim Shen
2014-11-18 19:29 ` Paolo Carlini
@ 2014-11-24 11:50 ` Jonathan Wakely
2014-11-25 8:19 ` Tim Shen
1 sibling, 1 reply; 10+ messages in thread
From: Jonathan Wakely @ 2014-11-24 11:50 UTC (permalink / raw)
To: Tim Shen; +Cc: libstdc++, gcc-patches
On 18/11/14 11:12 -0800, Tim Shen wrote:
>Bootstrapped and tested.
>
>Thanks!
>
>
>--
>Regards,
>Tim Shen
>commit f17155183b9ae1283d04f3bbdb61d05d9279ebe4
>Author: timshen <timshen@google.com>
>Date: Tue Nov 18 00:07:28 2014 -0800
>
> PR libstdc++/63920
> * include/bits/regex_executor.h: Make _M_begin non const.
> * include/bits/regex_executor.tcc (_Executor<>::_M_search): Increase
> _M_begin in search algorithm, so that _M_begin is treated as
> "current start position" for each search iteration.
> * testsuite/28_regex/algorithms/regex_search/ecma/flags.cc: New
> testcase.
OK for trunk - thanks.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior
2014-11-24 11:50 ` Jonathan Wakely
@ 2014-11-25 8:19 ` Tim Shen
0 siblings, 0 replies; 10+ messages in thread
From: Tim Shen @ 2014-11-25 8:19 UTC (permalink / raw)
To: Jonathan Wakely; +Cc: libstdc++, gcc-patches
On Mon, Nov 24, 2014 at 3:17 AM, Jonathan Wakely <jwakely@redhat.com> wrote:
> OK for trunk - thanks.
Committed. :)
Thanks!
--
Regards,
Tim Shen
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2014-11-25 8:04 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-18 19:25 [Patch, libstdc++/63920] Fix regex_constants::match_not_null behavior Tim Shen
2014-11-18 19:29 ` Paolo Carlini
2014-11-19 8:54 ` Tim Shen
2014-11-19 16:25 ` Paolo Carlini
2014-11-19 18:54 ` Tim Shen
2014-11-19 19:03 ` Daniel Krügler
2014-11-19 19:05 ` Paolo Carlini
2014-11-19 19:52 ` Tim Shen
2014-11-24 11:50 ` Jonathan Wakely
2014-11-25 8:19 ` Tim Shen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).