public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libstdc++/102592] New: heap-use-after-free when constructing std::filesystem::path from iterator pair
@ 2021-10-04 12:54 dean.gcc.bugzilla at gmail dot com
2021-10-04 13:22 ` [Bug libstdc++/102592] [11/12 Regression] " redi at gcc dot gnu.org
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: dean.gcc.bugzilla at gmail dot com @ 2021-10-04 12:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102592
Bug ID: 102592
Summary: heap-use-after-free when constructing
std::filesystem::path from iterator pair
Product: gcc
Version: 11.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: dean.gcc.bugzilla at gmail dot com
Target Milestone: ---
The std::filesystem::path constructor that takes an iterator pair produces a
heap-use-after-free error (address sanitizer) in case the iterators are
non-contiguous and value_type != char.
Full self-contained reproducible example on Compiler Explorer:
https://godbolt.org/z/3nxqq5Ef8 I've also included the reproduction code and
error below for completeness.
Affected versions: 11.x, trunk
Not affected: 10.x
As far as I've been able to track this in the std::filesystem::path
constructor, for the case of non-contiguous, non-char iterators, a temporary
container is created and then assigned to a std::string_view. However that temp
container has a shorter lifetime than the std::string_view.
The code:
#include <filesystem>
namespace fs = std::filesystem;
fs::path to_path(std::string_view sv) {
struct AsU8 {
std::string_view::const_iterator it;
using iterator_category = std::forward_iterator_tag;
using difference_type = std::ptrdiff_t;
using value_type = char8_t;
using reference = const value_type&;
using pointer = const value_type*;
char8_t operator*() const { return static_cast<char8_t>(*it); }
AsU8& operator++() { return ++it, *this; }
bool operator==(const AsU8&) const = default;
};
return {AsU8{sv.begin()}, AsU8{sv.end()}};
}
int main() {
return
to_path("/long/path/to/a/file/to/avoid/small/string/opt").string().size();
}
=1==ERROR: AddressSanitizer: heap-use-after-free on address 0x604000000010 at
pc 0x7fed85f999c0 bp 0x7ffdddc9ccc0 sp 0x7ffdddc9c470
READ of size 46 at 0x604000000010 thread T0
#0 0x7fed85f999bf in __interceptor_memcpy
(/opt/compiler-explorer/gcc-11.2.0/lib64/libasan.so.6+0x3c9bf)
#1 0x4039e5 in std::char_traits<char>::copy(char*, char const*, unsigned
long)
/opt/compiler-explorer/gcc-11.2.0/include/c++/11.2.0/bits/char_traits.h:409
#2 0x405214 in std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::_S_copy(char*, char const*, unsigned long)
/opt/compiler-explorer/gcc-11.2.0/include/c++/11.2.0/bits/basic_string.h:359
#3 0x4058f1 in std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::_S_copy_chars(char*, char const*, char const*)
/opt/compiler-explorer/gcc-11.2.0/include/c++/11.2.0/bits/basic_string.h:406
#4 0x405109 in void std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>(char
const*, char const*, std::forward_iterator_tag)
/opt/compiler-explorer/gcc-11.2.0/include/c++/11.2.0/bits/basic_string.tcc:225
#5 0x40578b in void std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> >::_M_construct_aux<char
const*>(char const*, char const*, std::__false_type)
/opt/compiler-explorer/gcc-11.2.0/include/c++/11.2.0/bits/basic_string.h:255
#6 0x404a8e in void std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>(char
const*, char const*)
/opt/compiler-explorer/gcc-11.2.0/include/c++/11.2.0/bits/basic_string.h:274
#7 0x404119 in std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::basic_string(char const*, unsigned long,
std::allocator<char> const&)
/opt/compiler-explorer/gcc-11.2.0/include/c++/11.2.0/bits/basic_string.h:521
#8 0x404e66 in std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::basic_string(std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> >::__sv_wrapper,
std::allocator<char> const&)
/opt/compiler-explorer/gcc-11.2.0/include/c++/11.2.0/bits/basic_string.h:154
#9 0x404373 in std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::basic_string<std::basic_string_view<char,
std::char_traits<char> >, void>(std::basic_string_view<char,
std::char_traits<char> > const&, std::allocator<char> const&)
/opt/compiler-explorer/gcc-11.2.0/include/c++/11.2.0/bits/basic_string.h:664
#10 0x402c63 in path<to_path(std::string_view)::AsU8>
/opt/compiler-explorer/gcc-11.2.0/include/c++/11.2.0/bits/fs_path.h:292
#11 0x4024ae in to_path[abi:cxx11](std::basic_string_view<char,
std::char_traits<char> >) /app/example.cpp:19
#12 0x40265c in main /app/example.cpp:23
#13 0x7fed85a140b2 in __libc_start_main
(/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
#14 0x40226d in _start (/app/output.s+0x40226d)
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug libstdc++/102592] [11/12 Regression] heap-use-after-free when constructing std::filesystem::path from iterator pair
2021-10-04 12:54 [Bug libstdc++/102592] New: heap-use-after-free when constructing std::filesystem::path from iterator pair dean.gcc.bugzilla at gmail dot com
@ 2021-10-04 13:22 ` redi at gcc dot gnu.org
2021-10-04 13:27 ` redi at gcc dot gnu.org
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: redi at gcc dot gnu.org @ 2021-10-04 13:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102592
Jonathan Wakely <redi at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Summary|heap-use-after-free when |[11/12 Regression]
|constructing |heap-use-after-free when
|std::filesystem::path from |constructing
|iterator pair |std::filesystem::path from
| |iterator pair
Ever confirmed|0 |1
Assignee|unassigned at gcc dot gnu.org |redi at gcc dot gnu.org
Keywords| |wrong-code
Last reconfirmed| |2021-10-04
Known to work| |10.3.1
Status|UNCONFIRMED |ASSIGNED
Known to fail| |11.2.1, 12.0
--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
The problem is the last line of this function:
template<typename _Tp>
static auto
_S_convert(const _Tp& __str)
{
if constexpr (is_same_v<_Tp, string_type>)
return __str;
else if constexpr (is_same_v<_Tp, basic_string_view<value_type>>)
return __str;
else if constexpr (is_same_v<typename _Tp::value_type, value_type>)
return basic_string_view<value_type>(__str.data(), __str.size());
else
return _S_convert(__str.data(), __str.data() + __str.size());
}
That returns a basic_string_view<char8_t> referring to the contents of __str,
but that is an rvalue basic_string<char8_t> that goes out of scope before the
result is used.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug libstdc++/102592] [11/12 Regression] heap-use-after-free when constructing std::filesystem::path from iterator pair
2021-10-04 12:54 [Bug libstdc++/102592] New: heap-use-after-free when constructing std::filesystem::path from iterator pair dean.gcc.bugzilla at gmail dot com
2021-10-04 13:22 ` [Bug libstdc++/102592] [11/12 Regression] " redi at gcc dot gnu.org
@ 2021-10-04 13:27 ` redi at gcc dot gnu.org
2021-10-13 19:39 ` cvs-commit at gcc dot gnu.org
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: redi at gcc dot gnu.org @ 2021-10-04 13:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102592
Jonathan Wakely <redi at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P3 |P2
Target Milestone|--- |11.3
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug libstdc++/102592] [11/12 Regression] heap-use-after-free when constructing std::filesystem::path from iterator pair
2021-10-04 12:54 [Bug libstdc++/102592] New: heap-use-after-free when constructing std::filesystem::path from iterator pair dean.gcc.bugzilla at gmail dot com
2021-10-04 13:22 ` [Bug libstdc++/102592] [11/12 Regression] " redi at gcc dot gnu.org
2021-10-04 13:27 ` redi at gcc dot gnu.org
@ 2021-10-13 19:39 ` cvs-commit at gcc dot gnu.org
2021-10-13 19:39 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-10-13 19:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102592
--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:
https://gcc.gnu.org/g:85b24e32dc27ec2e70b853713e0713cbc1ff08c3
commit r12-4380-g85b24e32dc27ec2e70b853713e0713cbc1ff08c3
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Wed Oct 13 17:02:59 2021 +0100
libstdc++: Fix dangling string_view in filesystem::path [PR102592]
When creating a path from a pair of non-contiguous iterators we pass the
iterators to _S_convert(Iter, Iter). That function passes the iterators
to __string_from_range to get a contiguous sequence of characters, and
then calls _S_convert(const C*, const C*) to perform the encoding
conversions. If the value type, C, is char8_t, then no conversion is
needed and the _S_convert<char8_t>(const char8_t*, const char8_t*)
specialization casts the pointer to const char* and returns a
std::string_view that refs to the char8_t sequence. However, that
sequence is owned by the std::u8string rvalue returned by
__string_from_range, which goes out of scope when _S_convert(Iter, Iter)
returns. That means the std::string_view is dangling and we get
undefined behaviour when parsing it as a path.
The same problem does not exist for the path members taking a "Source"
argument, because those functions all convert a non-contiguous range
into a basic_string<C> immediately, using __effective_range(__source).
That means that the rvalue string returned by that function is still in
scope for the full expression, so the string_view does not dangle.
The solution for the buggy functions is to do the same thing, and call
__string_from_range immediately, so that the returned rvalue is still in
scope for the lifetime of the string_view returned by _S_convert. To
avoid reintroducing the same problem, remove the _S_convert(Iter, Iter)
overload that calls __string_from_range and returns a dangling view.
libstdc++-v3/ChangeLog:
PR libstdc++/102592
* include/bits/fs_path.h (path::path(Iter, Iter, format))
(path::append(Iter, Iter), path::concat(Iter, Iter)): Call
__string_from_range directly, instead of two-argument overload
of _S_convert.
(path::_S_convert(Iter, Iter)): Remove.
* testsuite/27_io/filesystem/path/construct/102592.C: New test.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug libstdc++/102592] [11/12 Regression] heap-use-after-free when constructing std::filesystem::path from iterator pair
2021-10-04 12:54 [Bug libstdc++/102592] New: heap-use-after-free when constructing std::filesystem::path from iterator pair dean.gcc.bugzilla at gmail dot com
` (2 preceding siblings ...)
2021-10-13 19:39 ` cvs-commit at gcc dot gnu.org
@ 2021-10-13 19:39 ` cvs-commit at gcc dot gnu.org
2021-10-13 20:54 ` cvs-commit at gcc dot gnu.org
2021-10-13 20:55 ` redi at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-10-13 19:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102592
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:
https://gcc.gnu.org/g:b83b810ac440f72e7551b6496539e60ac30c0d8a
commit r12-4381-gb83b810ac440f72e7551b6496539e60ac30c0d8a
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Wed Oct 13 17:19:57 2021 +0100
libstdc++: Refactor filesystem::path encoding conversions
Adjust the __detail::__effective_range overloads so they always return a
string or string view using std::char_traits, because we don't care
about the traits of an incoming string.
Use std::contiguous_iterator in the __effective_range(const Source&)
overload, to allow returning a basic_string_view in more cases. For the
non-contiguous casecl in both __effective_range and __string_from_range,
return a std::string instead of std::u8string when the value type of the
range is char8_t. These changes avoid unnecessary basic_string
temporaries.
Also simplify __string_from_range(Iter, Iter) to not need
std::__to_address for the contiguous case.
Combine the _S_convert(string_type) and _S_convert(const T&) overloads
into a single _S_convert(T) function which also avoids the dangling
view problem of PR 102592 (should that recur somehow).
libstdc++-v3/ChangeLog:
* include/bits/fs_path.h (__detail::__is_contiguous): New
variable template to identify contiguous iterators.
(__detail::__unified_char8_t): New alias template to decide when
to treat char8_t as char without encoding conversion.
(__detail::__effective_range(const basic_string<C,T>&)): Use
std::char_traits<C> for returned string view.
(__detail::__effective_range(const basic_string_view<C,T>&)):
Likewise.
(__detail::__effective_range(const Source&)): Use
__is_contiguous to detect mode cases of contiguous iterators.
Use __unified_char8_t to return a std::string instead of
std::u8string.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug libstdc++/102592] [11/12 Regression] heap-use-after-free when constructing std::filesystem::path from iterator pair
2021-10-04 12:54 [Bug libstdc++/102592] New: heap-use-after-free when constructing std::filesystem::path from iterator pair dean.gcc.bugzilla at gmail dot com
` (3 preceding siblings ...)
2021-10-13 19:39 ` cvs-commit at gcc dot gnu.org
@ 2021-10-13 20:54 ` cvs-commit at gcc dot gnu.org
2021-10-13 20:55 ` redi at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-10-13 20:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102592
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:
https://gcc.gnu.org/g:9ef31bab15a426a6829acf0e7ae5420d3b10cbae
commit r11-9149-g9ef31bab15a426a6829acf0e7ae5420d3b10cbae
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Wed Oct 13 17:02:59 2021 +0100
libstdc++: Fix dangling string_view in filesystem::path [PR102592]
When creating a path from a pair of non-contiguous iterators we pass the
iterators to _S_convert(Iter, Iter). That function passes the iterators
to __string_from_range to get a contiguous sequence of characters, and
then calls _S_convert(const C*, const C*) to perform the encoding
conversions. If the value type, C, is char8_t, then no conversion is
needed and the _S_convert<char8_t>(const char8_t*, const char8_t*)
specialization casts the pointer to const char* and returns a
std::string_view that refs to the char8_t sequence. However, that
sequence is owned by the std::u8string rvalue returned by
__string_from_range, which goes out of scope when _S_convert(Iter, Iter)
returns. That means the std::string_view is dangling and we get
undefined behaviour when parsing it as a path.
The same problem does not exist for the path members taking a "Source"
argument, because those functions all convert a non-contiguous range
into a basic_string<C> immediately, using __effective_range(__source).
That means that the rvalue string returned by that function is still in
scope for the full expression, so the string_view does not dangle.
The solution for the buggy functions is to do the same thing, and call
__string_from_range immediately, so that the returned rvalue is still in
scope for the lifetime of the string_view returned by _S_convert. To
avoid reintroducing the same problem, remove the _S_convert(Iter, Iter)
overload that calls __string_from_range and returns a dangling view.
libstdc++-v3/ChangeLog:
PR libstdc++/102592
* include/bits/fs_path.h (path::path(Iter, Iter, format))
(path::append(Iter, Iter), path::concat(Iter, Iter)): Call
__string_from_range directly, instead of two-argument overload
of _S_convert.
(path::_S_convert(Iter, Iter)): Remove.
* testsuite/27_io/filesystem/path/construct/102592.C: New test.
(cherry picked from commit 85b24e32dc27ec2e70b853713e0713cbc1ff08c3)
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug libstdc++/102592] [11/12 Regression] heap-use-after-free when constructing std::filesystem::path from iterator pair
2021-10-04 12:54 [Bug libstdc++/102592] New: heap-use-after-free when constructing std::filesystem::path from iterator pair dean.gcc.bugzilla at gmail dot com
` (4 preceding siblings ...)
2021-10-13 20:54 ` cvs-commit at gcc dot gnu.org
@ 2021-10-13 20:55 ` redi at gcc dot gnu.org
5 siblings, 0 replies; 7+ messages in thread
From: redi at gcc dot gnu.org @ 2021-10-13 20:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102592
Jonathan Wakely <redi at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
--- Comment #5 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Fixed for 11.3, thanks for the report.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2021-10-13 20:55 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-04 12:54 [Bug libstdc++/102592] New: heap-use-after-free when constructing std::filesystem::path from iterator pair dean.gcc.bugzilla at gmail dot com
2021-10-04 13:22 ` [Bug libstdc++/102592] [11/12 Regression] " redi at gcc dot gnu.org
2021-10-04 13:27 ` redi at gcc dot gnu.org
2021-10-13 19:39 ` cvs-commit at gcc dot gnu.org
2021-10-13 19:39 ` cvs-commit at gcc dot gnu.org
2021-10-13 20:54 ` cvs-commit at gcc dot gnu.org
2021-10-13 20:55 ` redi at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).