From: Jonathan Wakely <jwakely@redhat.com>
To: Kefu Chai <tchaikov@gmail.com>
Cc: libstdc++@gcc.gnu.org, Kefu Chai <kefu.chai@scylladb.com>
Subject: Re: [PATCH v1 1/1] libstdc++: Set _M_string_length before calling _M_dispose()
Date: Tue, 2 May 2023 15:06:32 +0100 [thread overview]
Message-ID: <CACb0b4kr4b56NsEGpHPDgkZhbgtmkKaWgbh6oHe_LhDbr09MMQ@mail.gmail.com> (raw)
In-Reply-To: <20230501070622.847749-2-tchaikov@gmail.com>
[-- Attachment #1.1: Type: text/plain, Size: 2957 bytes --]
On Mon, 1 May 2023 at 08:06, Kefu Chai via Libstdc++ <libstdc++@gcc.gnu.org>
wrote:
> This always sets _M_string_length in the constructor specialized for
> range of input_iterator, for the cases like istringstream.
>
> We copy from source range to the local buffer, and then reallocate
> to larger one if necessary, when disposing the old buffer. And the
> old buffer could be provisioned by the local buffer or an allocated
> buffer. _M_is_local() is used to tell if the buffer is the local one
> or not. In addition to comparing the buffer address with the local buffer,
> this function also performs the sanity check if _M_string_length is
> greater than _S_local_capacity, if the check fails
> __builtin_unreachable() is called. But we failed to set _M_string_length
> in this constructor is specialized for std::input_iterator. So,
> if UBSan is enabled when compiling the source, there are chances that
> the uninitialized data in _M_string_length is greater than
> _S_local_capacity, and the application aborts a runtime error or
> exception emitted by the UBSan.
>
> In this change, to avoid the false alarm, _M_string_length is
> updated with the length of number of bytes copied to local buffer, so
> that _M_is_local() is able to check based on the correct length.
>
> This issue only surfaces when constructing a string with a range of
> input_iterator, and the uninitialized _M_string_length is greater than
> _S_local_capacity, i.e., 15.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/basic_string.tcc (_M_construct): Set
> _M_string_length before calling _M_dispose().
>
> Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
> ---
> libstdc++-v3/include/bits/basic_string.tcc | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/libstdc++-v3/include/bits/basic_string.tcc
> b/libstdc++-v3/include/bits/basic_string.tcc
> index 99fdbeee5ad..ec2198ee20b 100644
> --- a/libstdc++-v3/include/bits/basic_string.tcc
> +++ b/libstdc++-v3/include/bits/basic_string.tcc
> @@ -177,6 +177,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> __p[__len++] = *__beg;
> ++__beg;
> }
> + _M_length(__len);
>
> struct _Guard
> {
> --
> 2.40.1
>
Thanks for finding the issue, and providing the patch.
N.B. patches for libstdc++ (like all GCC patches) need to be sent to the
gcc-patches list. For libstdc++ they should also be sent to the libstdc++
list (as you did) but they still need to be CC'd to gcc-patches.
I think I'd prefer to do it this way instead, just initializing to zero at
the start of the constructor. What do you think?
Init'ing to an immediate zero should always be cheap. For the
ForwardIterator case the initialization can probably be eliminated as a
dead store. For the InputIterator case it avoids the bug (it doesn't matter
that _M_string_length isn't correct when _M_is_local() checks it, only that
it's not uninitialized to a value greater than the local capacity).
[-- Attachment #2: patch.txt --]
[-- Type: text/plain, Size: 2458 bytes --]
commit 8b173f993683acdb5ca8048a0a229660c18f45ab
Author: Kefu Chai <kefu.chai@scylladb.com>
Date: Mon May 1 21:24:26 2023
libstdc++: Set _M_string_length before calling _M_dispose()
This always sets _M_string_length in the constructor specialized for
range of input_iterator, for the cases like istringstream.
We copy from source range to the local buffer, and then reallocate
to larger one if necessary, when disposing the old buffer. And the
old buffer could be provisioned by the local buffer or an allocated
buffer. _M_is_local() is used to tell if the buffer is the local one
or not. In addition to comparing the buffer address with the local buffer,
this function also performs the sanity check if _M_string_length is
greater than _S_local_capacity, if the check fails
__builtin_unreachable() is called. But we failed to set _M_string_length
in this constructor is specialized for std::input_iterator. So,
if UBSan is enabled when compiling the source, there are chances that
the uninitialized data in _M_string_length is greater than
_S_local_capacity, and the application aborts a runtime error or
exception emitted by the UBSan.
In this change, to avoid the false alarm, _M_string_length is
initialized to zero before doing anything else, so that _M_is_local()
doesn't see an uninitialized value.
This issue only surfaces when constructing a string with a range of
input_iterator, and the uninitialized _M_string_length is greater than
_S_local_capacity, i.e., 15.
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h (basic_string(Iter, Iter, Alloc)):
Initialize _M_string_length.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index 8247ee6bdc6..b16b2898b62 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -760,7 +760,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
_GLIBCXX20_CONSTEXPR
basic_string(_InputIterator __beg, _InputIterator __end,
const _Alloc& __a = _Alloc())
- : _M_dataplus(_M_local_data(), __a)
+ : _M_dataplus(_M_local_data(), __a), _M_string_length(0)
{
#if __cplusplus >= 201103L
_M_construct(__beg, __end, std::__iterator_category(__beg));
next prev parent reply other threads:[~2023-05-02 14:06 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-01 7:06 [PATCH v1 0/1] " Kefu Chai
2023-05-01 7:06 ` [PATCH v1 1/1] libstdc++: " Kefu Chai
2023-05-02 14:06 ` Jonathan Wakely [this message]
2023-05-03 2:17 ` [PATCH v2 0/1] " Kefu Chai
2023-05-03 2:17 ` [PATCH v2 1/1] libstdc++: Set _M_string_length before calling _M_dispose() [PR109703] Kefu Chai
2023-05-03 12:22 ` [PATCH v2 0/1] Set _M_string_length before calling _M_dispose() Jonathan Wakely
2023-05-03 12:30 ` kefu chai
2023-05-01 20:04 ` [PATCH v1 " Jonathan Wakely
2023-05-01 20:05 ` Jonathan Wakely
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CACb0b4kr4b56NsEGpHPDgkZhbgtmkKaWgbh6oHe_LhDbr09MMQ@mail.gmail.com \
--to=jwakely@redhat.com \
--cc=kefu.chai@scylladb.com \
--cc=libstdc++@gcc.gnu.org \
--cc=tchaikov@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).