public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/102667] New: Inconsistent result of std::regex_match
@ 2021-10-09 17:33 fchelnokov at gmail dot com
  2021-10-09 18:41 ` [Bug libstdc++/102667] " redi at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: fchelnokov at gmail dot com @ 2021-10-09 17:33 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102667

            Bug ID: 102667
           Summary: Inconsistent result of std::regex_match
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: fchelnokov at gmail dot com
  Target Milestone: ---

This program:
```
#include <cassert>
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string input("4321");
    std::regex rg("^([0-9])");
    std::smatch sm;

    // regex_match requires the whole input to match, so this fails.
    bool found = std::regex_match(input, sm, rg);

    // size() and empty() agree with each other: both report no matches.
    assert(!sm.size() == sm.empty());

    std::cout << "ready: " << sm.ready() << ", found: " << found
              << ", size: " << sm.size() << std::endl;

    // Yet the [begin(), end()) range is not empty.
    for (auto it = sm.begin(); it != sm.end(); ++it)
    {
        std::cout << "iterate '" << *it << "'\n";
    }
}
```
prints a rather unexpected result:
```
ready: 1, found: 0, size: 0
iterate ''
iterate ''
iterate ''
```
So iterating over the std::smatch yields 3 entries even though its size() is zero.

Expected result:
```
ready: 1, found: 0, size: 0
```
Demo: https://gcc.godbolt.org/z/Wfh1vaPqq

Related discussion: https://stackoverflow.com/q/66611132/7325599

* [Bug libstdc++/102667] Inconsistent result of std::regex_match
From: redi at gcc dot gnu.org @ 2021-10-09 18:41 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102667

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-10-09
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |redi at gcc dot gnu.org

--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
The match_results sequence contains extra "hidden" elements at the end, which
should not be visible when traversing the container using begin() and end().

The bug is in the end() function, which returns the position after the hidden
elements, not after the (zero length) sequence of matches.

* [Bug libstdc++/102667] Inconsistent result of std::regex_match
From: cvs-commit at gcc dot gnu.org @ 2021-10-11 19:37 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102667

--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:84088dc4bb6a546c896a068dc201463493babf43

commit r12-4325-g84088dc4bb6a546c896a068dc201463493babf43
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Mon Oct 11 09:07:15 2021 +0100

    libstdc++: Fix std::match_results::end() for failed matches [PR102667]

    The end() function needs to consider whether the underlying vector is
    empty, not whether the match_results object is empty. That's because the
    underlying vector will always contain at least three elements for a
    match_results object that is "ready". It contains three extra elements
    which are stored in the vector but are not considered part of the sequence,
    and so should not be part of the [begin(),end()) range.

    libstdc++-v3/ChangeLog:

            PR libstdc++/102667
            * include/bits/regex.h (match_results::empty()): Optimize by
            calling the base function directly.
            (match_results::end()): Check _Base_type::empty() not empty().
            * testsuite/28_regex/match_results/102667.C: New test.

* [Bug libstdc++/102667] Inconsistent result of std::regex_match
From: cvs-commit at gcc dot gnu.org @ 2021-10-12 10:58 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102667

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:

https://gcc.gnu.org/g:2560bab6ceb7c1eb7c5cdadb5f0a608ac166b829

commit r11-9099-g2560bab6ceb7c1eb7c5cdadb5f0a608ac166b829
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Mon Oct 11 09:07:15 2021 +0100

    libstdc++: Fix std::match_results::end() for failed matches [PR102667]

    (cherry picked from commit 84088dc4bb6a546c896a068dc201463493babf43)

* [Bug libstdc++/102667] Inconsistent result of std::regex_match
From: cvs-commit at gcc dot gnu.org @ 2021-10-12 16:27 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102667

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:

https://gcc.gnu.org/g:be3fbe792444bc750a8a1b37481bac9a84528949

commit r10-10181-gbe3fbe792444bc750a8a1b37481bac9a84528949
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Mon Oct 11 09:07:15 2021 +0100

    libstdc++: Fix std::match_results::end() for failed matches [PR102667]

    (cherry picked from commit 84088dc4bb6a546c896a068dc201463493babf43)

* [Bug libstdc++/102667] Inconsistent result of std::regex_match
From: cvs-commit at gcc dot gnu.org @ 2021-10-13 19:42 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102667

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:

https://gcc.gnu.org/g:081f08b80db2ce8a10375b3af118db008308affc

commit r9-9774-g081f08b80db2ce8a10375b3af118db008308affc
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Mon Oct 11 09:07:15 2021 +0100

    libstdc++: Fix std::match_results::end() for failed matches [PR102667]

    (cherry picked from commit 84088dc4bb6a546c896a068dc201463493babf43)

* [Bug libstdc++/102667] Inconsistent result of std::regex_match
From: redi at gcc dot gnu.org @ 2021-10-13 19:44 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102667

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
   Target Milestone|---                         |9.5
         Resolution|---                         |FIXED

--- Comment #6 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Fixed for 9.5, 10.4 and 11.3, thanks for the report.
