From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 35E39385840F for ; Thu, 17 Mar 2022 15:31:56 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 35E39385840F Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-625-FhEmtTu1NxmkTAV7Hnn7Cw-1; Thu, 17 Mar 2022 11:31:54 -0400 X-MC-Unique: FhEmtTu1NxmkTAV7Hnn7Cw-1 Received: by mail-qt1-f198.google.com with SMTP id a24-20020ac81098000000b002e1e06a72aeso3762316qtj.6 for ; Thu, 17 Mar 2022 08:31:54 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to:user-agent; bh=9V49sF2rEQdG1l7aWCEEy4xmwsjwfbV21uMmjvCVHbo=; b=v82FSMhiIdUcgspzg8ibBgHGploybbXvtYvMnmcoc7I6IhNTtUTUtDDcv0iN3is4Nb O2nPth6onh4dpLvcLdsyU1nNXtdFQoZHoGI7IuMbir+cRmvRfv6FNQ2hkAexsPBTu2GX 1w96kqJcYql+7UJr0GjpbDq6y1pbWTU3E8Q1fBsaI3GwT+b3nBhHWABtNJXT9v15Hsmx DhNQlgjDV3SxN8eGkY0VnQvRNNnfP/Ot4tU5odBPsRJ6hc56yQd/eCLLJAWyDVZGHK7f Iuz6ihT/wx/tjSPUuCAWP768duYxIRToVK5yYE6+McoqKAiFwuQsqcO4/vsRgdu0PNPE PdFw== X-Gm-Message-State: AOAM530YGHg5PG4F3QOKSJFGAOq/Z0+yx2tyaYAeOGYlowuy557V854m 7KOHO72OXAbo+n3TL1sWF4+pOmBY2LzvSyeyzNMOuFEjcVxvHMhYDQfkj2SwCqHZGv3yuV74FEG CRNJcpv1dAjgyEYLQ/Q== X-Received: by 2002:a05:622a:100c:b0:2e1:b261:f66d with SMTP id d12-20020a05622a100c00b002e1b261f66dmr4207924qte.381.1647531113674; Thu, 17 Mar 2022 08:31:53 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwpr4ynJe1BVntMW5MmlhSlKaj71WS9MuKgYw2z88gycnPGkSCHUnYluc4cFGQZ7wsRETqxaQ== X-Received: by 2002:a05:622a:100c:b0:2e1:b261:f66d with SMTP id d12-20020a05622a100c00b002e1b261f66dmr4207882qte.381.1647531113255; Thu, 17 Mar 2022 08:31:53 -0700 (PDT) Received: from redhat.com ([2601:184:4780:4310::3f37]) by smtp.gmail.com with ESMTPSA id q6-20020a37a706000000b0067e1bd58a77sm2662817qke.136.2022.03.17.08.31.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 17 Mar 2022 08:31:52 -0700 (PDT) Date: Thu, 17 Mar 2022 11:31:50 -0400 From: Marek Polacek To: Siddhesh Poyarekar Cc: Martin Sebor , gcc-patches@gcc.gnu.org, jakub@redhat.com Subject: Re: [PATCH v2] middle-end/104854: Limit strncmp overread warnings Message-ID: References: <20220311071236.2349381-1-siddhesh@gotplt.org> <20220315053158.1486555-1-siddhesh@gotplt.org> <01c49ecb-cd91-3bcc-c5a0-97e81281e1dd@gotplt.org> <4104ab01-c36c-12e9-5601-6d99019f6bef@redhat.com> MIME-Version: 1.0 In-Reply-To: User-Agent: Mutt/2.1.5 (2021-12-30) X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-7.2 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Mar 2022 15:31:57 -0000 On Wed, Mar 16, 2022 at 06:54:51AM +0530, Siddhesh Poyarekar wrote: > On 16/03/2022 02:06, Martin Sebor wrote: > > The intended use of the strncmp bound is to limit the comparison to > > at most the size of the arrays or (in a subset of cases) the length > > of an initial substring. Providing an arbitrary bound that's not > > related to the sizes as you describe sounds very much like a misuse. > > Nothing in the standard says that the bound is related to the sizes of input > buffers. I don't think deducing that intent makes sense either, nor > concluding that any other use case is misuse. > > > As a historical note, strncmp was first introduced in UNIX v7 where > > its purpose, alongside strncpy, was to manipulate (potentially) > > unterminated character arrays like file names stored in fixed size > > arrays (typically 14 bytes).  Strncpy would fill the buffers with > > ASCII data up to their size and pad the rest with nuls only if there > > was room. > > > > Strncmp was then used to compare these potentially unterminated > > character arrays (e.g., archive headers in ld and ranlib).  The bound > > was the size of the fixed size array.  Its other use case was to compare > > leading portions of strings (e.g, when looking for an environment > > variable or when stripping "./" from path names). > > Thanks for sharing the historical perspective. > > > Since the early UNIX days, both strncpy and to a lesser extent strncmp > > have been widely misused and, along with many other functions in > > , a frequent source of bugs due to common misunderstanding > > of their intended purpose.  The aim of these warnings is to detect > > the common (and sometimes less common) misuses and bugs. > > They're all valid uses however since they do not violate the standard. If we > find at compile time that the strings don't terminate at the bounds, > emitting the warning is OK but the more pessimistic check seems like > overkill. I think I agree, I've tried #include char a[] = "abc"; char b[] = "abcd"; int f (void) { return strncmp (a, b, 8); } where I get t.c:7:10: warning: ‘strncmp’ specified bound 8 exceeds source size 5 [-Wstringop-overread] 7 | return strncmp (a, b, 8); // -Wstringop-overread | ^~~~~~~~~~~~~~~~~ even without -Wall. strncmp sees that a[3] is '\0' so it stops comparing and there's no UB. GCC 11 didn't emit that, so +1 for dialing this warning down. > > I haven't seen these so I can't very well comment on them.  But I can > > assure you that warning for the code above is intentional.  Whether > > or not the arrays are nul-terminated, the expected way to call > > the function is with a bound no greater than their size (some coding > > guidelines are explicit about this; see for example the CERT C Secure > > Coding standard rule ARR38-C). > > > > (Granted, the manual makes it sound like -Wstringop-overread only > > detects provable past-the-end reads.  That's a mistake in > > the documentation that should be fixed.  The warning was never quite > > so limited, nor was it intended to be.) > > The contention is not that it's not provable, it's more that it's doesn't > even pass the "based on available information this is definitely buggy" > assertion, making it more a strong suggestion than a warning that something > is definitely amiss. Which is why IMO it is more suitable as an analyzer > check than a warning. > > Thanks, > Siddhesh > Marek