From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id C806C3857719 for ; Tue, 2 May 2023 14:06:47 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C806C3857719 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1683036407; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=WovsSvvTtBroE8A5QBHMy0w9tqtha5A96fXgO+AwiMg=; b=ZB6mI2GYErDPKnmiG8pGNr3BJILdnr+1tD5KaNjs/v9tVJwF1uyqOJe0oeAhJzOIieuq6n IWOEtIamruxvp2cDGiKKAMfiH3D9UxYsLLFcLdOdGUB9Pva2eSxw6FuhpsO8CL1apX2WEb V0wHirbizyh7rN4LSM64S7l1dxBM/b0= Received: from mail-lf1-f69.google.com (mail-lf1-f69.google.com [209.85.167.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-631-oCS5MuVhMzKfR2CbKSPzTw-1; Tue, 02 May 2023 10:06:45 -0400 X-MC-Unique: oCS5MuVhMzKfR2CbKSPzTw-1 Received: by mail-lf1-f69.google.com with SMTP id 2adb3069b0e04-4ec790b902bso2235656e87.1 for ; Tue, 02 May 2023 07:06:45 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683036404; x=1685628404; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=27XvA+KWp6UYfqeOwkrT+NUF3TMSMs9DDA4rrtnjYBs=; b=MCYjujhDv74GZEi7PrRZx/O8wm2kCTrIlDJ3dGlHUEOf/HAvykUpSNypanQQeeYuyL XBogJ1crA6rKSYUC7nzdM/SRo8q9Y3KaERy+KJ+py6H52T0nCo5wrZch/bY3bv1Jniqu 6tYuQaQuvFHm5g+1wibOb4H3E3YwQvO1vu8EMMtrndxDLdApOSGopkLlFFLsWrd96cXt gZRVBTNC1SW92X+CRMf8NMwHydzOIdFuDhxUVZ8OH/unzn5OEyoGabqIlGLpHW95GIoF h6m3jkXt7zuoXFVkfkLoWNFOydjPsCQT5ewAjferS/cuBnP5HjTaIehWwB55OmFpzwVG gi0A== X-Gm-Message-State: AC+VfDyQnQkWe1GoJHbJ7NQdVl8+AqOLv47m62dOF5td6n9LQn39ECo6 dLWinersGDvh/Er7Pb505x+KOrMcl8Ab7LshqF5YA+w/oQITP+NcwQdaTuFS2B2yypRI5ZpxreI EhZfvUzoWQIxhvWy51kugIDqFwMSId/s= X-Received: by 2002:ac2:5973:0:b0:4ed:c494:32d0 with SMTP id h19-20020ac25973000000b004edc49432d0mr7456lfp.59.1683036403973; Tue, 02 May 2023 07:06:43 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7HOzuH5NckZa+UtxocczoQrEQau82x8tmR0qOt/UWROyCPOUKrg3CPL07ocgO8XIMCP1x7Mq0A0Wn0T0ahmpg= X-Received: by 2002:ac2:5973:0:b0:4ed:c494:32d0 with SMTP id h19-20020ac25973000000b004edc49432d0mr7447lfp.59.1683036403673; Tue, 02 May 2023 07:06:43 -0700 (PDT) MIME-Version: 1.0 References: <20230501070622.847749-1-tchaikov@gmail.com> <20230501070622.847749-2-tchaikov@gmail.com> In-Reply-To: <20230501070622.847749-2-tchaikov@gmail.com> From: Jonathan Wakely Date: Tue, 2 May 2023 15:06:32 +0100 Message-ID: Subject: Re: [PATCH v1 1/1] libstdc++: Set _M_string_length before calling _M_dispose() To: Kefu Chai Cc: libstdc++@gcc.gnu.org, Kefu Chai X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: multipart/mixed; boundary="000000000000fc67ae05fab672f5" X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: --000000000000fc67ae05fab672f5 Content-Type: multipart/alternative; boundary="000000000000fc67ac05fab672f3" --000000000000fc67ac05fab672f3 Content-Type: text/plain; charset="UTF-8" On Mon, 1 May 2023 at 08:06, Kefu Chai via Libstdc++ wrote: > This always sets _M_string_length in the constructor specialized for > range of input_iterator, for the cases like istringstream. > > We copy from source range to the local buffer, and then reallocate > to larger one if necessary, when disposing the old buffer. And the > old buffer could be provisioned by the local buffer or an allocated > buffer. _M_is_local() is used to tell if the buffer is the local one > or not. In addition to comparing the buffer address with the local buffer, > this function also performs the sanity check if _M_string_length is > greater than _S_local_capacity, if the check fails > __builtin_unreachable() is called. But we failed to set _M_string_length > in this constructor is specialized for std::input_iterator. So, > if UBSan is enabled when compiling the source, there are chances that > the uninitialized data in _M_string_length is greater than > _S_local_capacity, and the application aborts a runtime error or > exception emitted by the UBSan. > > In this change, to avoid the false alarm, _M_string_length is > updated with the length of number of bytes copied to local buffer, so > that _M_is_local() is able to check based on the correct length. > > This issue only surfaces when constructing a string with a range of > input_iterator, and the uninitialized _M_string_length is greater than > _S_local_capacity, i.e., 15. > > libstdc++-v3/ChangeLog: > > * include/bits/basic_string.tcc (_M_construct): Set > _M_string_length before calling _M_dispose(). > > Signed-off-by: Kefu Chai > --- > libstdc++-v3/include/bits/basic_string.tcc | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/libstdc++-v3/include/bits/basic_string.tcc > b/libstdc++-v3/include/bits/basic_string.tcc > index 99fdbeee5ad..ec2198ee20b 100644 > --- a/libstdc++-v3/include/bits/basic_string.tcc > +++ b/libstdc++-v3/include/bits/basic_string.tcc > @@ -177,6 +177,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > __p[__len++] = *__beg; > ++__beg; > } > + _M_length(__len); > > struct _Guard > { > -- > 2.40.1 > Thanks for finding the issue, and providing the patch. N.B. patches for libstdc++ (like all GCC patches) need to be sent to the gcc-patches list. For libstdc++ they should also be sent to the libstdc++ list (as you did) but they still need to be CC'd to gcc-patches. I think I'd prefer to do it this way instead, just initializing to zero at the start of the constructor. What do you think? Init'ing to an immediate zero should always be cheap. For the ForwardIterator case the initialization can probably be eliminated as a dead store. For the InputIterator case it avoids the bug (it doesn't matter that _M_string_length isn't correct when _M_is_local() checks it, only that it's not uninitialized to a value greater than the local capacity). --000000000000fc67ac05fab672f3-- --000000000000fc67ae05fab672f5 Content-Type: text/plain; charset="US-ASCII"; name="patch.txt" Content-Disposition: attachment; filename="patch.txt" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_lh6cbzwo0 Y29tbWl0IDhiMTczZjk5MzY4M2FjZGI1Y2E4MDQ4YTBhMjI5NjYwYzE4ZjQ1 YWIKQXV0aG9yOiBLZWZ1IENoYWkgPGtlZnUuY2hhaUBzY3lsbGFkYi5jb20+ CkRhdGU6ICAgTW9uIE1heSAxIDIxOjI0OjI2IDIwMjMKCiAgICBsaWJzdGRj Kys6IFNldCBfTV9zdHJpbmdfbGVuZ3RoIGJlZm9yZSBjYWxsaW5nIF9NX2Rp c3Bvc2UoKQogICAgCiAgICBUaGlzIGFsd2F5cyBzZXRzIF9NX3N0cmluZ19s ZW5ndGggaW4gdGhlIGNvbnN0cnVjdG9yIHNwZWNpYWxpemVkIGZvcgogICAg cmFuZ2Ugb2YgaW5wdXRfaXRlcmF0b3IsIGZvciB0aGUgY2FzZXMgbGlrZSBp c3RyaW5nc3RyZWFtLgogICAgCiAgICBXZSBjb3B5IGZyb20gc291cmNlIHJh bmdlIHRvIHRoZSBsb2NhbCBidWZmZXIsIGFuZCB0aGVuIHJlYWxsb2NhdGUK ICAgIHRvIGxhcmdlciBvbmUgaWYgbmVjZXNzYXJ5LCB3aGVuIGRpc3Bvc2lu ZyB0aGUgb2xkIGJ1ZmZlci4gQW5kIHRoZQogICAgb2xkIGJ1ZmZlciBjb3Vs ZCBiZSBwcm92aXNpb25lZCBieSB0aGUgbG9jYWwgYnVmZmVyIG9yIGFuIGFs bG9jYXRlZAogICAgYnVmZmVyLiBfTV9pc19sb2NhbCgpIGlzIHVzZWQgdG8g dGVsbCBpZiB0aGUgYnVmZmVyIGlzIHRoZSBsb2NhbCBvbmUKICAgIG9yIG5v dC4gSW4gYWRkaXRpb24gdG8gY29tcGFyaW5nIHRoZSBidWZmZXIgYWRkcmVz cyB3aXRoIHRoZSBsb2NhbCBidWZmZXIsCiAgICB0aGlzIGZ1bmN0aW9uIGFs c28gcGVyZm9ybXMgdGhlIHNhbml0eSBjaGVjayBpZiBfTV9zdHJpbmdfbGVu Z3RoIGlzCiAgICBncmVhdGVyIHRoYW4gX1NfbG9jYWxfY2FwYWNpdHksIGlm IHRoZSBjaGVjayBmYWlscwogICAgX19idWlsdGluX3VucmVhY2hhYmxlKCkg aXMgY2FsbGVkLiBCdXQgd2UgZmFpbGVkIHRvIHNldCBfTV9zdHJpbmdfbGVu Z3RoCiAgICBpbiB0aGlzIGNvbnN0cnVjdG9yIGlzIHNwZWNpYWxpemVkIGZv ciBzdGQ6OmlucHV0X2l0ZXJhdG9yLiBTbywKICAgIGlmIFVCU2FuIGlzIGVu YWJsZWQgd2hlbiBjb21waWxpbmcgdGhlIHNvdXJjZSwgdGhlcmUgYXJlIGNo YW5jZXMgdGhhdAogICAgdGhlIHVuaW5pdGlhbGl6ZWQgZGF0YSBpbiBfTV9z dHJpbmdfbGVuZ3RoIGlzIGdyZWF0ZXIgdGhhbgogICAgX1NfbG9jYWxfY2Fw YWNpdHksIGFuZCB0aGUgYXBwbGljYXRpb24gYWJvcnRzIGEgcnVudGltZSBl cnJvciBvcgogICAgZXhjZXB0aW9uIGVtaXR0ZWQgYnkgdGhlIFVCU2FuLgog ICAgCiAgICBJbiB0aGlzIGNoYW5nZSwgdG8gYXZvaWQgdGhlIGZhbHNlIGFs YXJtLCBfTV9zdHJpbmdfbGVuZ3RoIGlzCiAgICBpbml0aWFsaXplZCB0byB6 ZXJvIGJlZm9yZSBkb2luZyBhbnl0aGluZyBlbHNlLCBzbyB0aGF0IF9NX2lz X2xvY2FsKCkKICAgIGRvZXNuJ3Qgc2VlIGFuIHVuaW5pdGlhbGl6ZWQgdmFs dWUuCiAgICAKICAgIFRoaXMgaXNzdWUgb25seSBzdXJmYWNlcyB3aGVuIGNv bnN0cnVjdGluZyBhIHN0cmluZyB3aXRoIGEgcmFuZ2Ugb2YKICAgIGlucHV0 X2l0ZXJhdG9yLCBhbmQgdGhlIHVuaW5pdGlhbGl6ZWQgX01fc3RyaW5nX2xl bmd0aCBpcyBncmVhdGVyIHRoYW4KICAgIF9TX2xvY2FsX2NhcGFjaXR5LCBp LmUuLCAxNS4KICAgIAogICAgbGlic3RkYysrLXYzL0NoYW5nZUxvZzoKICAg IAogICAgICAgICAgICAqIGluY2x1ZGUvYml0cy9iYXNpY19zdHJpbmcuaCAo YmFzaWNfc3RyaW5nKEl0ZXIsIEl0ZXIsIEFsbG9jKSk6CiAgICAgICAgICAg IEluaXRpYWxpemUgX01fc3RyaW5nX2xlbmd0aC4KICAgIAogICAgU2lnbmVk LW9mZi1ieTogS2VmdSBDaGFpIDxrZWZ1LmNoYWlAc2N5bGxhZGIuY29tPgog ICAgQ28tYXV0aG9yZWQtYnk6IEpvbmF0aGFuIFdha2VseSA8andha2VseUBy ZWRoYXQuY29tPgoKZGlmZiAtLWdpdCBhL2xpYnN0ZGMrKy12My9pbmNsdWRl L2JpdHMvYmFzaWNfc3RyaW5nLmggYi9saWJzdGRjKystdjMvaW5jbHVkZS9i aXRzL2Jhc2ljX3N0cmluZy5oCmluZGV4IDgyNDdlZTZiZGM2Li5iMTZiMjg5 OGI2MiAxMDA2NDQKLS0tIGEvbGlic3RkYysrLXYzL2luY2x1ZGUvYml0cy9i YXNpY19zdHJpbmcuaAorKysgYi9saWJzdGRjKystdjMvaW5jbHVkZS9iaXRz L2Jhc2ljX3N0cmluZy5oCkBAIC03NjAsNyArNzYwLDcgQEAgX0dMSUJDWFhf QkVHSU5fTkFNRVNQQUNFX0NYWDExCiAJX0dMSUJDWFgyMF9DT05TVEVYUFIK ICAgICAgICAgYmFzaWNfc3RyaW5nKF9JbnB1dEl0ZXJhdG9yIF9fYmVnLCBf SW5wdXRJdGVyYXRvciBfX2VuZCwKIAkJICAgICBjb25zdCBfQWxsb2MmIF9f YSA9IF9BbGxvYygpKQotCTogX01fZGF0YXBsdXMoX01fbG9jYWxfZGF0YSgp LCBfX2EpCisJOiBfTV9kYXRhcGx1cyhfTV9sb2NhbF9kYXRhKCksIF9fYSks IF9NX3N0cmluZ19sZW5ndGgoMCkKIAl7CiAjaWYgX19jcGx1c3BsdXMgPj0g MjAxMTAzTAogCSAgX01fY29uc3RydWN0KF9fYmVnLCBfX2VuZCwgc3RkOjpf X2l0ZXJhdG9yX2NhdGVnb3J5KF9fYmVnKSk7Cg== --000000000000fc67ae05fab672f5--