From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 2181)
	id 069A53858D28; Wed,  3 May 2023 12:19:44 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 069A53858D28
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1683116384;
	bh=EvwvYlYd2ajNPuzrSW+Qw6Bw/NOwYE1nCPjqB5dZ9m8=;
	h=From:To:Subject:Date:From;
	b=JsWvmOd9zY0EyyQLj56af8Qbcm6j82uyhg9Zgw417gT7ToyJ2x0Vyf+jr+84k4V/r
	 yQ8MCsHtferxQobCK9Ey8IhGrrwFvHkJkkHoooXkoq7aEaKd7AHqr05gHLYxPTitge
	 plnUvRb/sUw9YJaZhSp4A6nrIVIa5Hl+TwfnibK0=
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
From: Jonathan Wakely
To: gcc-cvs@gcc.gnu.org, libstdc++-cvs@gcc.gnu.org
Subject: [gcc r14-430] libstdc++: Set _M_string_length before calling _M_dispose() [PR109703]
X-Act-Checkin: gcc
X-Git-Author: Kefu Chai
X-Git-Refname: refs/heads/master
X-Git-Oldrev: 203f3060dd363361b172f7295f42bb6bf5ac0b3b
X-Git-Newrev: cbf6c7a1d16490a1e63e9a5ce00e9a5c44c4c2f2
Message-Id: <20230503121944.069A53858D28@sourceware.org>
Date: Wed,  3 May 2023 12:19:44 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:cbf6c7a1d16490a1e63e9a5ce00e9a5c44c4c2f2

commit r14-430-gcbf6c7a1d16490a1e63e9a5ce00e9a5c44c4c2f2
Author: Kefu Chai
Date:   Mon May 1 21:24:26 2023 +0100

    libstdc++: Set _M_string_length before calling _M_dispose() [PR109703]

    This always sets _M_string_length in the constructor for ranges of input
    iterators, such as stream iterators.

    We copy from the source range to the local buffer, and then repeatedly
    reallocate a larger one if necessary. When disposing the old buffer,
    _M_is_local() is used to tell if the buffer is the local one or not (and
    so must be deallocated).
    In addition to comparing the buffer address with the local buffer,
    _M_is_local() has an optimization hint so that the compiler knows that
    for a string using the local buffer, there is an invariant that
    _M_string_length <= _S_local_capacity (added for PR109299 via
    r13-6915-gbf78b43873b0b7). But we failed to set _M_string_length in the
    constructor taking a pair of iterators, so the invariant might not hold,
    and __builtin_unreachable() is reached. This causes UBsan errors, and
    potentially misoptimization.

    To ensure the invariant holds, _M_string_length is initialized to zero
    before doing anything else, so that _M_is_local() doesn't see an
    uninitialized value.

    This issue only surfaces when constructing a string from a range of
    input iterators, and the uninitialized _M_string_length happens to be
    greater than _S_local_capacity, i.e., 15 for the std::string
    specialization.

    libstdc++-v3/ChangeLog:

            PR libstdc++/109703
            * include/bits/basic_string.h (basic_string(Iter, Iter, Alloc)):
            Initialize _M_string_length.

    Signed-off-by: Kefu Chai
    Co-authored-by: Jonathan Wakely

Diff:
---
 libstdc++-v3/include/bits/basic_string.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index 8247ee6bdc6..b16b2898b62 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -760,7 +760,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
         _GLIBCXX20_CONSTEXPR
         basic_string(_InputIterator __beg, _InputIterator __end,
                      const _Alloc& __a = _Alloc())
-        : _M_dataplus(_M_local_data(), __a)
+        : _M_dataplus(_M_local_data(), __a), _M_string_length(0)
         {
 #if __cplusplus >= 201103L
           _M_construct(__beg, __end, std::__iterator_category(__beg));
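
The code path described in the commit message can be exercised with a short program. The following is a minimal sketch, not the test case from the PR: the helper names string_from_input_range and check_round_trip are hypothetical and chosen only for illustration. It builds a std::string from a pair of std::istreambuf_iterator<char>, which are input iterators, so the constructor cannot know the length in advance and has to grow from the local buffer as it reads.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: construct a std::string through the
// basic_string(InputIterator, InputIterator) constructor.
// std::istreambuf_iterator<char> is only an input iterator, so the
// implementation cannot compute the length up front and must append
// character by character, moving to a larger buffer when needed.
std::string string_from_input_range(const std::string& text)
{
  std::istringstream in(text);
  std::istreambuf_iterator<char> first(in), last;  // default-constructed == end
  return std::string(first, last);
}

// Hypothetical self-check: the string read back through the input
// iterators must match the source, both below and above the local
// buffer capacity (15 for std::string in libstdc++, per the commit).
void check_round_trip(std::size_t n)
{
  const std::string text(n, 'x');
  const std::string copy = string_from_input_range(text);
  assert(copy.size() == n);
  assert(copy == text);
}
```

Calling check_round_trip with lengths such as 0, 15, 16 and 1000 covers both the local-buffer case and the reallocation case. Building with g++ -fsanitize=undefined is one way to look for the reported UBsan error; on an unfixed libstdc++ the failure depends on whatever value the uninitialized _M_string_length happens to hold, so a clean run there does not prove the absence of the bug.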