From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi1-x22b.google.com (mail-oi1-x22b.google.com [IPv6:2607:f8b0:4864:20::22b]) by sourceware.org (Postfix) with ESMTPS id 5232D3858403; Fri, 10 Dec 2021 16:35:53 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 5232D3858403 Received: by mail-oi1-x22b.google.com with SMTP id t19so13988815oij.1; Fri, 10 Dec 2021 08:35:53 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=l6Z6agJGGjTHS4kqDVNUdMaYigjeJH1L7+muhuLDvP4=; b=fQA2nWEKyKUX+kHCfaJFii5iZ70s6L7UA/dCO2KpZggEKr81QzXQdeUrLMlqLuKtrr Crom32GcqvctMPSc2TgnDnx08pAcC5g6ZRaQTgzgtEyQTlnpyn1MA8mSwZa8mO8lFoiP JfI3XTSvAPuyBM90dBl5yghK77eb1HjptTcq36S3Og3T18M3HT1v/BLPCTyO095zaNic iTqkeWRnFjwgr3R9zRtAi46DRPYyEgVAQfitGp1fRojNc5kqsOqAyFK0hk4wMh27J760 enJUZ+b8JA5Y+q8iBfW+ktFxx4+O8zqaDyW4ILR1yzYIhjyYHNrZWWr/0l34iyS2dwPz N+MQ== X-Gm-Message-State: AOAM532L1p+1ZVD5Qcrz0Pe+7qSwTkAKswMhmlNAEdvIFLAgZzMBFcBf 33vbFFXzpJNd4z57JNkNN/yXF/Y6E4Q= X-Google-Smtp-Source: ABdhPJz//D79hOivZhRP6Nc/d8lB200LXcqUEWME22hf+8rBNY5UGtp/Et0c34xZSmQ++QQt1ncZVw== X-Received: by 2002:aca:b382:: with SMTP id c124mr13075716oif.169.1639154152550; Fri, 10 Dec 2021 08:35:52 -0800 (PST) Received: from [192.168.0.41] (184-96-227-137.hlrn.qwest.net. [184.96.227.137]) by smtp.gmail.com with ESMTPSA id t12sm608483ood.22.2021.12.10.08.35.51 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 10 Dec 2021 08:35:52 -0800 (PST) Subject: Re: [committed] libstdc++: Disable over-zealous warnings about std::string copies [PR103332] To: Jonathan Wakely Cc: Jonathan Wakely , libstdc++ , gcc-patches References: <20211209232453.1568609-1-jwakely@redhat.com> From: Martin Sebor Message-ID: <8a43115a-fab5-e2e7-6737-e0fbc43ebb59@gmail.com> Date: Fri, 10 Dec 2021 09:35:50 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Dec 2021 16:35:55 -0000 On 12/10/21 3:12 AM, Jonathan Wakely wrote: > On Fri, 10 Dec 2021 at 01:50, Martin Sebor via Libstdc++ > wrote: >> >> On 12/9/21 5:38 PM, Martin Sebor wrote: >>> On 12/9/21 4:24 PM, Jonathan Wakely via Gcc-patches wrote: >>>> These warnings are triggered by perfectly valid code using std::string. >>>> They're particularly bad when --enable-fully-dynamic-string is used, >>>> because even std::string().begin() will give a warning. >>>> >>>> Use pragmas to stop the troublesome warnings for copies done by >>>> std::char_traits. >>> >>> I'm still experimenting with some of the approaches we discussed >>> last week, but based on my findings so far this was going to be >>> my suggestion at lest for now, until or unless the problem turns >>> out to affect more code than just std::string. >> >> Just minutes after I wrote this I tried following the clue >> in the note printed for the test case from PR 103534 with >> an enhancement I'm experimenting with: >> >> /build/gcc-master/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:426:56: >> warning: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’ >> specified size between 18446744073709551600 and 18446744073709551615 >> exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=] >> 426 | return static_cast(__builtin_memcpy(__s1, >> __s2, __n)); >> | >> ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ >> /build/gcc-master/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:426:56: >> note: when >> ‘(.std::__cxx11::basic_string::_M_string_length > >> 18446744073709551599)’ >> >> and adding an assert to string::size(): >> >> constexpr >> size_type >> size() const noexcept >> { >> if (_M_string_length >= -1LU >> 1) > > I think that will be wrong for 64-bit Windows, where long is 32-bit, > but the string's max size is closer to -1LLU >> 2, and we don't want > to just use -1LLU because that's wrong for 32-bit targets. > > Technically the max size is (N - 1)/2 where N is the allocator's max > size, which is PTRDIFF_MAX for std::allocator, SIZE_MAX for allocators > that use the default value from std::allocator_traits, or some other > value that the allocator defines. And the value is reduced > appropriately if sizeof(char_type) > 1. > > But if we call this->max_size() which calls _Alloc_traits::max_size() > which potentially calls allocator_type::max_size(), will the compiler > actually make those calls to mark the path unreachable, or will it > just ignore the unreachable if it can't fold all those calls at > compile time? The above was just a quick proof of concept experiment. You're of course right that the final solution can't be so crude(*). But if the required functions are always_inline (I think member functions defined in the body of the class implicitly are) then I'd expect them to be folded at compile time at all optimization levels. That's what makes -Winvalid-memory-order work in C++ even at -O0 in the patch I posted earlier this week. If I still remember my C++-library-fu, even though the standard requires containers to call max_size() etc., since std::string is (or can be) an explicit specialization, there shouldn't be a way for a conforming program to find out if it does, so it could take shortcuts. That makes me wonder if it could even call malloc instead of operator new to get around the problem with the operator's replaceability. > > We could just use __SIZE_MAX__/2 which might be larger than the real > maximum, but will never be smaller. > > Jason made a suggestion last week which was that if the warning code > has a range like [2,INT_MAX] or [2UL,ULONG_MAX] that it should assume > that is really [2,infinity] and so not warn. If the upper bound is the > maximum possible value for the type, that's effectively unbounded, and > the warning could assume it's simply not got enough information to > give a useful warning. If it has a range like [2,300] for a buffer of > length 100, that should warn. But [2,infinity] is effectively ranger > saying "I have no idea what this value is" and the warning machinery > should not be issuing warnings on the basis of "I have no idea what > this value is". The warnings use the lower bound of the access size and ignore the upper bound. They use the upper bound of the space remaining in the destination(s) and ignore the lower bound. In a call to memcpy (P + I, s, N) the access size is N, and the space remaining in A is the upper bound of the size of the largest array A that P points to (it might point to any number of them, and some might be dynamically allocated with a size in some range) minus the lower bound of I or zero, whichever is greater. This is the most conservative strategy. There are other strategies where we could get more true positives but also more false positives. Those might be appropriate for the "maybe" kind of warnings. Martin >> __builtin_unreachable (); >> return _M_string_length; >> } >> >> That gets rid of the false positive in this PR. I realize >> the others happen for other reasons but this approach at least >> suggests that there might be other ways to suppress them than >> the #pragma. Unlike it, the alternative approaches should also >> improve codegen.