public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/103664] New: std::regex_replace bug if the string contains \0
@ 2021-12-11 21:52 artur77 at freemail dot hu
  2021-12-12  2:39 ` [Bug libstdc++/103664] " pinskia at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: artur77 at freemail dot hu @ 2021-12-11 21:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103664

            Bug ID: 103664
           Summary: std::regex_replace bug if the string contains \0
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: artur77 at freemail dot hu
  Target Milestone: ---

I think I found a bug in std::regex_replace.

The following code should write "1a\0b2" (length 5, with an embedded null
character), but it writes "1a2" with length 3.


#include <iostream>
#include <regex>
#include <string>

using namespace std;
int main()
{
    // The replacement string has three characters: 'a', '\0', 'b'.
    string a = regex_replace("1<sn>2", std::regex("<sn>"), string("a\0b", 3));

    cout << "a: " << a << "\n";
    cout << a.length() << "\n";

    return 0;
}

* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
  2021-12-11 21:52 [Bug libstdc++/103664] New: std::regex_replace bug if the string contains \0 artur77 at freemail dot hu
@ 2021-12-12  2:39 ` pinskia at gcc dot gnu.org
  2021-12-12 12:26 ` redi at gcc dot gnu.org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-12-12  2:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103664

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note LLVM's libc++ also has the same bug.

* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
  2021-12-11 21:52 [Bug libstdc++/103664] New: std::regex_replace bug if the string contains \0 artur77 at freemail dot hu
  2021-12-12  2:39 ` [Bug libstdc++/103664] " pinskia at gcc dot gnu.org
@ 2021-12-12 12:26 ` redi at gcc dot gnu.org
  2021-12-12 12:34 ` redi at gcc dot gnu.org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: redi at gcc dot gnu.org @ 2021-12-12 12:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103664

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-12-12

--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
The problem is that the overload of regex_replace taking a basic_string is
implemented in terms of the one taking a null-terminated string:

  regex_replace(_Out_iter __out, _Bi_iter __first, _Bi_iter __last,
                const basic_regex<_Ch_type, _Rx_traits>& __e,
                const basic_string<_Ch_type, _St, _Sa>& __fmt,
                regex_constants::match_flag_type __flags
                = regex_constants::match_default)
  {
    return regex_replace(__out, __first, __last, __e, __fmt.c_str(), __flags);
  }

* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
  2021-12-11 21:52 [Bug libstdc++/103664] New: std::regex_replace bug if the string contains \0 artur77 at freemail dot hu
  2021-12-12  2:39 ` [Bug libstdc++/103664] " pinskia at gcc dot gnu.org
  2021-12-12 12:26 ` redi at gcc dot gnu.org
@ 2021-12-12 12:34 ` redi at gcc dot gnu.org
  2021-12-13 11:16 ` cvs-commit at gcc dot gnu.org
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: redi at gcc dot gnu.org @ 2021-12-12 12:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103664

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |redi at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
I have a patch.

* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
  2021-12-11 21:52 [Bug libstdc++/103664] New: std::regex_replace bug if the string contains \0 artur77 at freemail dot hu
                   ` (2 preceding siblings ...)
  2021-12-12 12:34 ` redi at gcc dot gnu.org
@ 2021-12-13 11:16 ` cvs-commit at gcc dot gnu.org
  2021-12-13 11:50 ` redi at gcc dot gnu.org
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-12-13 11:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103664

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:

https://gcc.gnu.org/g:ef5d671cd80a4afa4f74c3dfe2904c63f51fcfde

commit r12-5924-gef5d671cd80a4afa4f74c3dfe2904c63f51fcfde
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Sun Dec 12 21:15:17 2021 +0000

    libstdc++: Fix std::regex_replace for strings with embedded null [PR103664]

    The overload of std::regex_replace that takes a std::basic_string as the
    fmt argument (for the replacement string) is implemented in terms of the
    one taking a const C*, which uses std::char_traits to find the length.
    That means it stops at a null character, even though the basic_string
    might have additional characters beyond that.

    Rather than duplicate the implementation of the const C* one for the
    std::basic_string case, this moves that implementation to a new
    __regex_replace function which takes a const C* and a length. Then both
    the std::basic_string and const C* overloads can call that (with the
    latter using char_traits to find the length to pass to the new
    function).

    libstdc++-v3/ChangeLog:

            PR libstdc++/103664
            * include/bits/regex.h (__regex_replace): Declare.
            (regex_replace): Use it.
            * include/bits/regex.tcc (__regex_replace): Replace regex_replace
            definition with __regex_replace.
            * testsuite/28_regex/algorithms/regex_replace/char/103664.cc: New
            test.

* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
  2021-12-11 21:52 [Bug libstdc++/103664] New: std::regex_replace bug if the string contains \0 artur77 at freemail dot hu
                   ` (3 preceding siblings ...)
  2021-12-13 11:16 ` cvs-commit at gcc dot gnu.org
@ 2021-12-13 11:50 ` redi at gcc dot gnu.org
  2022-07-07 23:32 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: redi at gcc dot gnu.org @ 2021-12-13 11:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103664

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |102445

--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Fixed on trunk so far.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102445
[Bug 102445] [meta-bug] std::regex issues

* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
  2021-12-11 21:52 [Bug libstdc++/103664] New: std::regex_replace bug if the string contains \0 artur77 at freemail dot hu
                   ` (4 preceding siblings ...)
  2021-12-13 11:50 ` redi at gcc dot gnu.org
@ 2022-07-07 23:32 ` cvs-commit at gcc dot gnu.org
  2023-06-09  9:47 ` redi at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-07-07 23:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103664

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:

https://gcc.gnu.org/g:8bee3c458ec14ca3e3a429a08694740a894e0c96

commit r11-10125-g8bee3c458ec14ca3e3a429a08694740a894e0c96
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Sun Dec 12 21:15:17 2021 +0000

    libstdc++: Fix std::regex_replace for strings with embedded null [PR103664]

    (cherry picked from commit ef5d671cd80a4afa4f74c3dfe2904c63f51fcfde)

* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
  2021-12-11 21:52 [Bug libstdc++/103664] New: std::regex_replace bug if the string contains \0 artur77 at freemail dot hu
                   ` (5 preceding siblings ...)
  2022-07-07 23:32 ` cvs-commit at gcc dot gnu.org
@ 2023-06-09  9:47 ` redi at gcc dot gnu.org
  2023-06-23 16:12 ` cvs-commit at gcc dot gnu.org
  2023-06-23 16:18 ` redi at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: redi at gcc dot gnu.org @ 2023-06-09  9:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103664

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.4
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #7 from Jonathan Wakely <redi at gcc dot gnu.org> ---
I'm not planning to backport this to gcc-10, so it's fixed for 11.4 and 12.1

* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
  2021-12-11 21:52 [Bug libstdc++/103664] New: std::regex_replace bug if the string contains \0 artur77 at freemail dot hu
                   ` (6 preceding siblings ...)
  2023-06-09  9:47 ` redi at gcc dot gnu.org
@ 2023-06-23 16:12 ` cvs-commit at gcc dot gnu.org
  2023-06-23 16:18 ` redi at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-06-23 16:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103664

--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:

https://gcc.gnu.org/g:364cb498c472790e14561f7672dc5ab4a9222287

commit r10-11462-g364cb498c472790e14561f7672dc5ab4a9222287
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Sun Dec 12 21:15:17 2021 +0000

    libstdc++: Fix std::regex_replace for strings with embedded null [PR103664]

    (cherry picked from commit ef5d671cd80a4afa4f74c3dfe2904c63f51fcfde)

* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
  2021-12-11 21:52 [Bug libstdc++/103664] New: std::regex_replace bug if the string contains \0 artur77 at freemail dot hu
                   ` (7 preceding siblings ...)
  2023-06-23 16:12 ` cvs-commit at gcc dot gnu.org
@ 2023-06-23 16:18 ` redi at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: redi at gcc dot gnu.org @ 2023-06-23 16:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103664

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|11.4                        |10.5

--- Comment #9 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Jonathan Wakely from comment #7)
> I'm not planning to backport this to gcc-10, so it's fixed for 11.4 and 12.1

Change of plans. Backported for 10.5 now too.
