public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/84110] Null character in regex throws std::regex_error
From: redi at gcc dot gnu.org @ 2021-09-22 10:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84110

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-09-22

* [Bug libstdc++/84110] Null character in regex throws std::regex_error
From: redi at gcc dot gnu.org @ 2021-09-24 20:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84110

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |redi at gcc dot gnu.org

* [Bug libstdc++/84110] Null character in regex throws std::regex_error
From: redi at gcc dot gnu.org @ 2021-09-25 10:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84110

--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
I agree ECMAScript should treat NUL as an ordinary char. POSIX doesn't though:

"The interfaces specified in POSIX.1-2017 do not permit the inclusion of a NUL
character in an RE or in the string to be matched. If during the operation of a
standard utility a NUL is included in the text designated to be matched, that
NUL may designate the end of the text string for the purposes of matching."

I'm not sure how that is supposed to translate to std::regex.
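
A minimal sketch of how the question above could be probed from user code, assuming nothing beyond the standard `<regex>` interface. The helper names `compiles` and `nul_pattern` are hypothetical and exist only for illustration. The result for a pattern containing NUL depends on the standard library version and on the grammar flag, so no expected outcome is claimed for that case.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: reports whether constructing a std::regex from
// `pattern` with the given grammar flag succeeds or throws regex_error.
bool compiles(const std::string& pattern, std::regex::flag_type grammar)
{
    try {
        std::regex re(pattern, grammar);
        (void)re; // only construction is of interest here
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}

// A three-character pattern with an embedded NUL: 'a', '\0', 'b'.
// The explicit length stops the std::string constructor from
// truncating the literal at the NUL.
std::string nul_pattern()
{
    return std::string("a\0b", 3);
}
```

Calling `compiles(nul_pattern(), std::regex::ECMAScript)` and `compiles(nul_pattern(), std::regex::basic)` side by side would show how a given implementation treats the two grammars.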

* [Bug libstdc++/84110] Null character in regex throws std::regex_error
From: cvs-commit at gcc dot gnu.org @ 2021-09-29 12:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84110

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:b701e1f8f6870c0f8cb4050674da489101dd05a5

commit r12-3961-gb701e1f8f6870c0f8cb4050674da489101dd05a5
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Wed Sep 29 13:48:11 2021 +0100

    libstdc++: std::basic_regex should treat '\0' as an ordinary char [PR84110]

    When the input sequence contains a _CharT(0) character, the strchr call
    in _Scanner<_CharT>::_M_scan_normal() will search for '\0' and so return
    a pointer to the terminating null at the end of the string. This makes
    the scanner think it's found a special character. Because it doesn't
    match any of the actual special characters, we fall off the end of the
    function (or assert in debug mode).

    We should check for a null character explicitly and either treat it as
    an ordinary character (for the ECMAScript grammar) or an error (for all
    others). I'm not 100% sure that's right, but it seems consistent with
    the POSIX RE rules where a '\0' means the end of the regex pattern or
    the end of the sequence being matched.

    Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

    libstdc++-v3/ChangeLog:

            PR libstdc++/84110
            * include/bits/regex_error.h (regex_constants::_S_null): New
            error code for internal use.
            * include/bits/regex_scanner.tcc (_Scanner::_M_scan_normal()):
            Check for null character.
            * testsuite/28_regex/basic_regex/84110.cc: New test.
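
The strchr behaviour described in the commit message can be reproduced in isolation. This is a sketch only, not the libstdc++ scanner code: the `specials` table and both function names are hypothetical, and the real scanner's table may differ. It relies on the fact that strchr treats the terminating null as part of the string, so searching for '\0' always succeeds.

```cpp
#include <cstring>

// Hypothetical stand-in for a scanner's table of special characters.
const char* const specials = "^$\\.*+?()[]{}|";

// Naive check: also returns true for c == '\0', because strchr finds
// the terminating null at the end of `specials`.
bool is_special_naive(char c)
{
    return std::strchr(specials, c) != nullptr;
}

// Guarded check: rule out the null character before calling strchr,
// so '\0' can be handled separately by the caller.
bool is_special_guarded(char c)
{
    return c != '\0' && std::strchr(specials, c) != nullptr;
}
```

With the guard in place, the caller is free to decide what a null character means (an ordinary character or an error) instead of having it misclassified as special.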

* [Bug libstdc++/84110] Null character in regex throws std::regex_error
From: redi at gcc dot gnu.org @ 2021-09-29 15:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84110

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |12.0

--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Fixed for GCC 12. I might backport this after some soak time on trunk.

* [Bug libstdc++/84110] Null character in regex throws std::regex_error
From: cvs-commit at gcc dot gnu.org @ 2022-07-07 23:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84110

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:

https://gcc.gnu.org/g:5df21c00aedb7878b8854901e95d7eda70266d31

commit r11-10118-g5df21c00aedb7878b8854901e95d7eda70266d31
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Wed Sep 29 13:48:11 2021 +0100

    libstdc++: std::basic_regex should treat '\0' as an ordinary char [PR84110]

    When the input sequence contains a _CharT(0) character, the strchr call
    in _Scanner<_CharT>::_M_scan_normal() will search for '\0' and so return
    a pointer to the terminating null at the end of the string. This makes
    the scanner think it's found a special character. Because it doesn't
    match any of the actual special characters, we fall off the end of the
    function (or assert in debug mode).

    We should check for a null character explicitly and either treat it as
    an ordinary character (for the ECMAScript grammar) or an error (for all
    others). I'm not 100% sure that's right, but it seems consistent with
    the POSIX RE rules where a '\0' means the end of the regex pattern or
    the end of the sequence being matched.

    Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

    libstdc++-v3/ChangeLog:

            PR libstdc++/84110
            * include/bits/regex_error.h (regex_constants::_S_null): New
            error code for internal use.
            * include/bits/regex_scanner.tcc (_Scanner::_M_scan_normal()):
            Check for null character.
            * testsuite/28_regex/basic_regex/84110.cc: New test.

    (cherry picked from commit b701e1f8f6870c0f8cb4050674da489101dd05a5)

* [Bug libstdc++/84110] Null character in regex throws std::regex_error
From: redi at gcc dot gnu.org @ 2022-07-07 23:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84110

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|12.0                        |11.4

--- Comment #6 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Backported for 11.4.

* [Bug libstdc++/84110] Null character in regex throws std::regex_error
From: cvs-commit at gcc dot gnu.org @ 2023-06-23 16:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84110

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:

https://gcc.gnu.org/g:d877bf3bdf46b5c996505fc247d170e79fbfa4bf

commit r10-11461-gd877bf3bdf46b5c996505fc247d170e79fbfa4bf
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Wed Sep 29 13:48:11 2021 +0100

    libstdc++: std::basic_regex should treat '\0' as an ordinary char [PR84110]

    When the input sequence contains a _CharT(0) character, the strchr call
    in _Scanner<_CharT>::_M_scan_normal() will search for '\0' and so return
    a pointer to the terminating null at the end of the string. This makes
    the scanner think it's found a special character. Because it doesn't
    match any of the actual special characters, we fall off the end of the
    function (or assert in debug mode).

    We should check for a null character explicitly and either treat it as
    an ordinary character (for the ECMAScript grammar) or an error (for all
    others). I'm not 100% sure that's right, but it seems consistent with
    the POSIX RE rules where a '\0' means the end of the regex pattern or
    the end of the sequence being matched.

    Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

    libstdc++-v3/ChangeLog:

            PR libstdc++/84110
            * include/bits/regex_error.h (regex_constants::_S_null): New
            error code for internal use.
            * include/bits/regex_scanner.tcc (_Scanner::_M_scan_normal()):
            Check for null character.
            * testsuite/28_regex/basic_regex/84110.cc: New test.

    (cherry picked from commit b701e1f8f6870c0f8cb4050674da489101dd05a5)

* [Bug libstdc++/84110] Null character in regex throws std::regex_error
From: redi at gcc dot gnu.org @ 2023-06-23 16:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84110

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|11.4                        |10.5

--- Comment #8 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Backported for 10.5 too.
