public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/103664] New: std::regex_replace bug if the string contains \0
From: artur77 at freemail dot hu @ 2021-12-11 21:52 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103664
Bug ID: 103664
Summary: std::regex_replace bug if the string contains \0
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: artur77 at freemail dot hu
Target Milestone: ---
I think I found a bug in std::regex_replace.
The following code should write "1a\0b2" with length 5, but it writes "1a2" with
length 3.
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main()
{
    // The replacement string holds three characters: 'a', '\0', 'b'.
    string a = regex_replace("1<sn>2", std::regex("<sn>"), string("a\0b", 3));
    cout << "a: " << a << "\n";
    cout << a.length();
    return 0;
}
* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
From: pinskia at gcc dot gnu.org @ 2021-12-12 2:39 UTC
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note LLVM's libc++ also has the same bug.
* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
From: redi at gcc dot gnu.org @ 2021-12-12 12:26 UTC
Jonathan Wakely <redi at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-12-12
--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
The problem is that the overload of regex_replace taking a basic_string is
implemented in terms of the one taking a null-terminated string:
regex_replace(_Out_iter __out, _Bi_iter __first, _Bi_iter __last,
              const basic_regex<_Ch_type, _Rx_traits>& __e,
              const basic_string<_Ch_type, _St, _Sa>& __fmt,
              regex_constants::match_flag_type __flags
              = regex_constants::match_default)
{
  return regex_replace(__out, __first, __last, __e, __fmt.c_str(), __flags);
}
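To make the truncation concrete, here is a minimal sketch of what is lost once only the pointer from c_str() is passed on. The helper names length_via_c_str and report_fmt are made up for illustration and are not part of libstdc++; the only library facility used is std::char_traits<char>::length, which the commit message in this thread names as the way the const C* overload finds the length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length recoverable from a null-terminated pointer alone, which is all
// a const char* overload has to work with. (Hypothetical helper name.)
inline std::size_t length_via_c_str(const std::string& s)
{
    const std::size_t n = std::char_traits<char>::length(s.c_str());
    assert(n <= s.size()); // the first '\0' is never past the terminator
    return n;
}

// The replacement string from the bug report: 'a', '\0', 'b'.
inline std::string report_fmt()
{
    return std::string("a\0b", 3);
}
```

For the string from the report, report_fmt().size() is 3 while length_via_c_str(report_fmt()) is 1, so only "a" reaches the replacement step and the result comes out as "1a2".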
* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
From: redi at gcc dot gnu.org @ 2021-12-12 12:34 UTC
Jonathan Wakely <redi at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org   |redi at gcc dot gnu.org
             Status|NEW                         |ASSIGNED
--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> ---
I have a patch.
* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
From: cvs-commit at gcc dot gnu.org @ 2021-12-13 11:16 UTC
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:
https://gcc.gnu.org/g:ef5d671cd80a4afa4f74c3dfe2904c63f51fcfde
commit r12-5924-gef5d671cd80a4afa4f74c3dfe2904c63f51fcfde
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Sun Dec 12 21:15:17 2021 +0000

    libstdc++: Fix std::regex_replace for strings with embedded null [PR103664]

    The overload of std::regex_replace that takes a std::basic_string as the
    fmt argument (for the replacement string) is implemented in terms of the
    one taking a const C*, which uses std::char_traits to find the length.
    That means it stops at a null character, even though the basic_string
    might have additional characters beyond that.

    Rather than duplicate the implementation of the const C* one for the
    std::basic_string case, this moves that implementation to a new
    __regex_replace function which takes a const C* and a length. Then both
    the std::basic_string and const C* overloads can call that (with the
    latter using char_traits to find the length to pass to the new
    function).

    libstdc++-v3/ChangeLog:

            PR libstdc++/103664
            * include/bits/regex.h (__regex_replace): Declare.
            (regex_replace): Use it.
            * include/bits/regex.tcc (__regex_replace): Replace regex_replace
            definition with __regex_replace.
            * testsuite/28_regex/algorithms/regex_replace/char/103664.cc: New
            test.
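The shape of the fix described in the commit message (one implementation taking a pointer and a length, with both overloads delegating to it) can be sketched roughly as follows. This is not the libstdc++ code: replace_impl and replace_all are hypothetical names, the sketch is limited to std::string and char, and it builds the result from std::sregex_iterator plus the iterator-range overload of match_results::format instead of the library's internal logic.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical single implementation: the replacement text is passed as
// pointer + length, so embedded '\0' characters are preserved.
inline std::string replace_impl(const std::string& input, const std::regex& re,
                                const char* fmt, std::size_t len)
{
    assert(fmt != nullptr || len == 0);
    std::string out;
    auto tail = input.begin();
    for (std::sregex_iterator it(input.begin(), input.end(), re), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        // Copy the unmatched text before this match, then the formatted
        // replacement using the explicit [fmt, fmt + len) range.
        out.append(m.prefix().first, m.prefix().second);
        m.format(std::back_inserter(out), fmt, fmt + len);
        tail = m.suffix().first;
    }
    out.append(tail, input.end()); // text after the last match
    return out;
}

// std::string overload: the length comes from the string itself.
inline std::string replace_all(const std::string& input, const std::regex& re,
                               const std::string& fmt)
{
    return replace_impl(input, re, fmt.data(), fmt.size());
}

// const char* overload: the length can only come from char_traits, so it
// still stops at the first '\0', as is expected for a C string.
inline std::string replace_all(const std::string& input, const std::regex& re,
                               const char* fmt)
{
    return replace_impl(input, re, fmt, std::char_traits<char>::length(fmt));
}
```

With this arrangement, replace_all("1<sn>2", re, std::string("a\0b", 3)) for a regex re built from "<sn>" keeps all three replacement characters and yields a string of length 5, while the const char* overload keeps its C-string semantics.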
* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
From: redi at gcc dot gnu.org @ 2021-12-13 11:50 UTC
Jonathan Wakely <redi at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |102445
--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Fixed on trunk so far.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102445
[Bug 102445] [meta-bug] std::regex issues
* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
From: cvs-commit at gcc dot gnu.org @ 2022-07-07 23:32 UTC
--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:
https://gcc.gnu.org/g:8bee3c458ec14ca3e3a429a08694740a894e0c96
commit r11-10125-g8bee3c458ec14ca3e3a429a08694740a894e0c96
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Sun Dec 12 21:15:17 2021 +0000
libstdc++: Fix std::regex_replace for strings with embedded null [PR103664]
(cherry picked from commit ef5d671cd80a4afa4f74c3dfe2904c63f51fcfde)
* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
From: redi at gcc dot gnu.org @ 2023-06-09 9:47 UTC
Jonathan Wakely <redi at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.4
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED
--- Comment #7 from Jonathan Wakely <redi at gcc dot gnu.org> ---
I'm not planning to backport this to gcc-10, so it's fixed for 11.4 and 12.1
* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
From: cvs-commit at gcc dot gnu.org @ 2023-06-23 16:12 UTC
--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:
https://gcc.gnu.org/g:364cb498c472790e14561f7672dc5ab4a9222287
commit r10-11462-g364cb498c472790e14561f7672dc5ab4a9222287
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Sun Dec 12 21:15:17 2021 +0000
libstdc++: Fix std::regex_replace for strings with embedded null [PR103664]
(cherry picked from commit ef5d671cd80a4afa4f74c3dfe2904c63f51fcfde)
* [Bug libstdc++/103664] std::regex_replace bug if the string contains \0
From: redi at gcc dot gnu.org @ 2023-06-23 16:18 UTC
Jonathan Wakely <redi at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|11.4                        |10.5
--- Comment #9 from Jonathan Wakely <redi at gcc dot gnu.org> ---
(In reply to Jonathan Wakely from comment #7)
> I'm not planning to backport this to gcc-10, so it's fixed for 11.4 and 12.1
Change of plans. Backported for 10.5 now too.