From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by sourceware.org (Postfix) with ESMTP id 002903857C6C for ; Thu, 6 Aug 2020 13:14:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 002903857C6C Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-459-lN0oA6lgPyCig5hE3hNpVw-1; Thu, 06 Aug 2020 09:14:53 -0400 X-MC-Unique: lN0oA6lgPyCig5hE3hNpVw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 06785100AA22; Thu, 6 Aug 2020 13:14:53 +0000 (UTC) Received: from localhost (unknown [10.33.36.145]) by smtp.corp.redhat.com (Postfix) with ESMTP id A691F5DA7B; Thu, 6 Aug 2020 13:14:49 +0000 (UTC) Date: Thu, 6 Aug 2020 14:14:48 +0100 From: Jonathan Wakely To: Martin Sebor Cc: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org, Jakub Jelinek Subject: Re: [committed] libstdc++: Replace operator>>(istream&, char*) [LWG 2499] Message-ID: <20200806131448.GY3400@redhat.com> References: <20200805212500.GA2013665@redhat.com> <5a18ba7e-e550-72b9-150f-ce7abed339fd@redhat.com> MIME-Version: 1.0 In-Reply-To: <5a18ba7e-e550-72b9-150f-ce7abed339fd@redhat.com> X-Clacks-Overhead: GNU Terry Pratchett X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline X-Spam-Status: No, score=-9.3 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Aug 2020 13:14:57 -0000 On 05/08/20 16:31 -0600, Martin Sebor via Libstdc++ wrote: >On 8/5/20 3:25 PM, Jonathan Wakely wrote: >>P0487R1 resolved LWG 2499 for C++20 by removing the operator>> overloads >>that have high risk of buffer overflows. They were replaced by >>equivalents that only accept a reference to an array, and so can >>guarantee not to write past the end of the array. >> >>In order to support both the old and new functionality, this patch >>introduces a new overloaded __istream_extract function which takes a >>maximum length. The new operator>> overloads use the array size as the >>maximum length. The old overloads now use __builtin_object_size to >>determine the available buffer size if available (which requires -O2) or >>use numeric_limits::max()/sizeof(char_type) otherwise. This >>is a change in behaviour, as the old overloads previously always used >>numeric_limits::max(), without considering sizeof(char_type) >>and without attempting to prevent overflows. >> >>Because they now do little more than call __istream_extract, the old >>operator>> overloads are very small inline functions. This means there >>is no advantage to explicitly instantiating them in the library (in fact >>that would prevent the __builtin_object_size checks from ever working). >>As a result, the explicit instantiation declarations can be removed from >>the header. The explicit instantiation definitions are still needed, for >>backwards compatibility with existing code that expects to link to the >>definitions in the library. >> >>While working on this change I noticed that src/c++11/istream-inst.cc >>has the following explicit instantiation definition: >> template istream& operator>>(istream&, char*); >>This had no effect (and so should not have been present in that file), >>because there was an explicit specialization declared in and >>defined in src/++98/istream.cc. However, this change removes the >>explicit specialization, and now the explicit instantiation definition >>is necessary to ensure the symbol gets defined in the library. >> >>libstdc++-v3/ChangeLog: >> >> * config/abi/pre/gnu.ver (GLIBCXX_3.4.29): Export new symbols. >> * include/bits/istream.tcc (__istream_extract): New function >> template implementing both of operator>>(istream&, char*) and >> operator>>(istream&, char(&)[N]). Add explicit instantiation >> declaration for it. Remove explicit instantiation declarations >> for old function templates. >> * include/std/istream (__istream_extract): Declare. >> (operator>>(basic_istream&, C*)): Define inline and simply >> call __istream_extract. >> (operator>>(basic_istream&, signed char*)): Likewise. >> (operator>>(basic_istream&, unsigned char*)): Likewise. >> (operator>>(basic_istream&, C(7)[N])): Define for LWG 2499. >> (operator>>(basic_istream&, signed char(&)[N])): >> Likewise. >> (operator>>(basic_istream&, unsigned char(&)[N])): >> Likewise. >> * include/std/streambuf (basic_streambuf): Declare char overload >> of __istream_extract as a friend. >> * src/c++11/istream-inst.cc: Add explicit instantiation >> definition for wchar_t overload of __istream_extract. Remove >> explicit instantiation definitions of old operator>> overloads >> for versioned-namespace build. >> * src/c++98/istream.cc (operator>>(istream&, char*)): Replace >> with __istream_extract(istream&, char*, streamsize). >> * testsuite/27_io/basic_istream/extractors_character/char/3.cc: >> Do not use variable-length array. >> * testsuite/27_io/basic_istream/extractors_character/char/4.cc: >> Do not run test for C++20. >> * testsuite/27_io/basic_istream/extractors_character/char/9555-ic.cc: >> Do not test writing to pointers for C++20. >> * testsuite/27_io/basic_istream/extractors_character/char/9826.cc: >> Use array instead of pointer. >> * testsuite/27_io/basic_istream/extractors_character/wchar_t/3.cc: >> Do not use variable-length array. >> * testsuite/27_io/basic_istream/extractors_character/wchar_t/4.cc: >> Do not run test for C++20. >> * testsuite/27_io/basic_istream/extractors_character/wchar_t/9555-ic.cc: >> Do not test writing to pointers for C++20. >> * testsuite/27_io/basic_istream/extractors_character/char/lwg2499.cc: >> New test. >> * testsuite/27_io/basic_istream/extractors_character/char/lwg2499_neg.cc: >> New test. >> * testsuite/27_io/basic_istream/extractors_character/char/overflow.cc: >> New test. >> * testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499.cc: >> New test. >> * testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499_neg.cc: >> New test. >> >>Tested powerpc64le-linux. Committed to trunk. >> >>Martin, Jakub, could you please double-check the usage of >>__builtin_object_size? (line 805 in libstdc++-v3/include/std/istream) >>Do you see any problems with using it here? If it can't tell the size >>then we just assume it's larger than the string to be extracted, which >>is what the old code did anyway. I do have an idea for stronger >>checking in debug mode, which I'll post in a minute. > >I've always found the second argument to __builtin_object_size >confusing for types above 1. I don't see anything wrong in >the diff but I believe the most useful results are with type 1 >for string functions and type 0 for raw memory functions like >memcpy (that's what _FORTIFY_SOURCE uses for the two sets of >functions). In type 2 when the result is zero it means one of >two things: either the size of the array couldn't be determined >or it really is zero. That's less than helpful in cases like: > > char a[8]; > strcpy (a + 8, s); > >where it prevents detecting the buffer overflow. > >I would suggest to use type 1 in the iostream functions instead. > >Adding attribute access to the __istream_extract overloads should >also let GCC check for out-of-bounds accesses by the function and >issue warnings. This goes beyond what __builtin_object_size makes >possible (it considers also variable sizes in known ranges). I tried the attached patch with a testcase like: #include void test01(std::istream& in) { char pc[1]; in >> (pc + 1); // { dg-warning "writing 1 byte into a region of size 0" } } I get a -Wstringop-overflow warning with -O0, but not at -O2. Is that because the call to operator>> gets inlined, and __builtin_object_size returns 0, and we call __istream_extract(in, s, 0) which says we won't write anything, so it's now considered safe? That's unfortunate, because actually __istream_extract will always write at least one byte as a null terminator. So I tried various ways toi try and get the warning back at -O2 and nothing worked e.g. when __builtin_object_size(__s, 0) returns 0 I tried calling __istream_extract(__in, __s, 1) which should say I'm going to write a character, and so should warn if the size is known to be zero. But no wanring. So now I'm considering this: template __attribute__((__nonnull__(2), __access__(__write_only__, 2))) inline basic_istream<_CharT, _Traits>& operator>>(basic_istream<_CharT, _Traits>& __in, _CharT* __s) { size_t __n = __builtin_object_size(__s, 0); if (__builtin_expect(__n < sizeof(_CharT), false)) { // not even space for null terminator __glibcxx_assert(__n >= sizeof(_CharT)); __in.width(0); __in.setstate(ios_base::failbit); } else { if (__n == (size_t)-1) __n = __gnu_cxx::__numeric_traits::__max; std::__istream_extract(__in, __s, __n / sizeof(_CharT)); } return __in; } This will give a -Wstringop-overflow warning at -O0 and then overflow the buffer, with undefined behaviour. And it will give no warning but avoid the overflow when optimising. This isn't my preferred outcome, I'd prefer to always get a warning, *and* be able to avoid the overflow when optimising and the size is known.