From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 102273 invoked by alias); 20 Aug 2018 10:12:46 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 102264 invoked by uid 89); 20 Aug 2018 10:12:45 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.8 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=investigations, proxy, his X-HELO: mail-lj1-f196.google.com Received: from mail-lj1-f196.google.com (HELO mail-lj1-f196.google.com) (209.85.208.196) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 20 Aug 2018 10:12:43 +0000 Received: by mail-lj1-f196.google.com with SMTP id q127-v6so11127509ljq.11 for ; Mon, 20 Aug 2018 03:12:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=yLGHbx9RBidI2mAU6YOdLKjwvRJTJVvFXuaOhwvQJdk=; b=L0stZk+az1nh+ygG0L7WWodUPMygPwXOE0byOkavkNed9sUXFa8JfdJdkjBnAPCdJ8 4MNjboYIHutLZFz3qkMbcQbXUle0Bz9EETcWwfY2VcSAjcFxpo+R6AwpkaDQgOdlA3rp ieBeK5Dj3dv5GZE27ZJ0NjIwKuw2j87vh56Q0xRlADRQAIUPxL0zmGOpK6rYKo/t6xqj 2MCfpraAcgREWUaQAor0V+yst7oSU9AS0k9nsbGQUYOMhqKESc7mNtyHASovyy6EA8Ke pmC/lUjMuMPSNZIUgGdnCdv53pfvi0p9VteaWKBblz77vQDCcU5ozULQ3N6jiumgez1/ Ad/w== MIME-Version: 1.0 References: <28fed157-7221-f517-4d2a-0d3f74b19e29@redhat.com> <93caaaa6-d6d1-0d4d-c735-b4d9d5bcce07@gmail.com> <8b0e06a1-eea4-418e-35df-c394766bea10@gmail.com> <20180731063839.GC17988@tucnak> <3d6899a7-4536-253e-e082-819301e6ab38@gmail.com> <20180731154812.GF17988@tucnak> <933a1c4a-8cd0-a538-1e7e-d481b7d6ce80@gmail.com> <908f3c26-6456-d6fa-3ad0-aefdb021f8b1@gmail.com> <6da9021b-9b90-2d34-f94e-73f98648a640@gmail.com> <20eb46c4-6480-a6dd-cc91-9142cf9da3c4@redhat.com> In-Reply-To: <20eb46c4-6480-a6dd-cc91-9142cf9da3c4@redhat.com> From: Richard Biener Date: Mon, 20 Aug 2018 10:12:00 -0000 Message-ID: Subject: Re: [PATCH] Make strlen range computations more conservative To: Jeff Law Cc: Martin Sebor , Bernd Edlinger , GCC Patches , Richard Guenther , Jakub Jelinek Content-Type: text/plain; charset="UTF-8" X-IsSubscribed: yes X-SW-Source: 2018-08/txt/msg01123.txt.bz2 On Wed, Aug 15, 2018 at 6:39 AM Jeff Law wrote: > > On 08/10/2018 10:56 AM, Martin Sebor wrote: > > On 08/08/2018 11:36 PM, Jeff Law wrote: > >> On 08/02/2018 09:42 AM, Martin Sebor wrote: > >> > >>> The warning bits are definitely not okay by me. The purpose > >>> of the warnings (-W{format,sprintf}-{overflow,truncation} is > >>> to detect buffer overflows. When a warning doesn't have access > >>> to string length information for dynamically created strings > >>> (like the strlen pass does) it uses array sizes as a proxy. > >>> This is useful both to detect possible buffer overflows and > >>> to prevent false positives for overflows that cannot happen > >>> in correctly written programs. > >> So how much of this falling-back to array sizes as a proxy would become > >> unnecessary if sprintf had access to the strlen pass as an analysis > >> module? > >> > >> As you know we've been kicking that around and from my investigations > >> that doesn't really look hard to do. Encapsulate the data structures in > >> a class, break up the statement handling into analysis and optimization > >> and we should be good to go. I'm hoping to start prototyping this week. > >> > >> If we think that has a reasonable chance to eliminate the array-size > >> fallback, then that seems like the most promising path forward. > > > > We discussed this idea this morning so let me respond here and > > reiterate the answer. Using the strlen data will help detect > > buffer overflow where the array size isn't available but it > > cannot replace the array size heuristic. Here's a simple > > example: > > > > struct S { char a[8]; }; > > > > char d[8]; > > void f (struct S *s, int i) > > { > > sprintf (d, "%s-%i", s[i].a, i); > > } > > > > We don't know the length of s->a but we do know that it can > > be up to 7 bytes long (assuming it's nul-terminated of course) > > so we know the sprintf call can overflow. Conversely, if > > the size of the destination is increased to 20 the sprintf > > call cannot overflow so the diagnostic can be avoided. > > > > Removing the array size heuristic would force us to either give > > up on diagnosing the first case or issue false positives for > > the second case. I think the second alternative would make > > the warning too noisy to be useful. > > > > The strlen pass will help detect buffer overflows in cases > > where the array size isn't known (e.g., with dynamically > > allocated buffers) but where the length of the string store > > in the array is known. It will also help avoid false positives > > in cases where the string stored in an array of known size is > > known to be too short to cause an overflow. For instance here: > > > > struct S { char a[8]; }; > > > > char d[8]; > > void f (struct S *s, int i) > > { > > if (strlen (s->a) < 4 && i >= 0 && i < 100) > > sprintf (d, "%s-%i", s->a, i); > > } > ACK. Thanks for explaining things here too. I can't speak for others, > but seeing examples along with the explanation is easier for me to absorb. > > For Bernd and others -- after kicking things around a bit with Martin, > what we're recommending is that compute_string_length we compute the > bounds using both GIMPLE and C semantics and return both. But you can't do this because GIMPLE did transforms that are not valid in C, thus you can't interpret the GIMPLE IL as "C", you can only interpret it as GIMPLE. What you'd do is return GIMPLE semantics length and "foobar" semantics length which doesn't match the original source. > > Anything which influences code generation or optimization must use the > GIMPLE semantics. Warnings may use the C semantics in an effort to > improve preciseness. > > Martin has some other stuff to flush out of his queue, then he'll be > focused on the changes to compute_string_length. > Jeff