From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 100022 invoked by alias); 3 Aug 2018 03:06:49 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 100008 invoked by uid 89); 3 Aug 2018 03:06:48 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=stricter, tracks, built-ins X-HELO: mail-qt0-f171.google.com Received: from mail-qt0-f171.google.com (HELO mail-qt0-f171.google.com) (209.85.216.171) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 03 Aug 2018 03:06:46 +0000 Received: by mail-qt0-f171.google.com with SMTP id h4-v6so4719816qtj.7 for ; Thu, 02 Aug 2018 20:06:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:references:cc:from:message-id:date:user-agent :mime-version:in-reply-to:content-transfer-encoding; bh=neNcPEGhViBmNTA5oby9yXP3miHXzyUOhI6scqQFKes=; b=XCnZggo99GNSDIkvBAwNLrsFRW+Gu+Wh7bEOZpd54wD3heiqU9bikuK2eYdBLSKZUF DKYEO+ZyHVqz51oRQuYvapyQZJDnbe79+pzIBrwK56nj5NzrOlwZUUDaFvaDQSgeoZyL yHzNhh6rhhAGA+5/VELMukcjO6xUkP7xEPQJdCTPyQ8xnQg+1qqDaHop/ety/Fmm45qr RLCoQBniiGqMJfydJ4XBsmBCKbsMzy1tXlzhsMfkfvld6zAM+1RqDfVCgCui6oK/JYH4 Zy7QNRsLkrwwSPbznZaF2eT5qZp+PgFiP4KgYM6Y5P8URBOEjown0W2fe+AQSw+ujfpW zoaw== Return-Path: Received: from localhost.localdomain (97-118-124-30.hlrn.qwest.net. [97.118.124.30]) by smtp.gmail.com with ESMTPSA id c11-v6sm2795217qkb.22.2018.08.02.20.06.43 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 02 Aug 2018 20:06:43 -0700 (PDT) Subject: Re: [PATCH] Make strlen range computations more conservative To: Bernd Edlinger , GCC Patches References: <28fed157-7221-f517-4d2a-0d3f74b19e29@redhat.com> <93caaaa6-d6d1-0d4d-c735-b4d9d5bcce07@gmail.com> <8b0e06a1-eea4-418e-35df-c394766bea10@gmail.com> <20180731063839.GC17988@tucnak> <3d6899a7-4536-253e-e082-819301e6ab38@gmail.com> <20180731154812.GF17988@tucnak> <933a1c4a-8cd0-a538-1e7e-d481b7d6ce80@gmail.com> <908f3c26-6456-d6fa-3ad0-aefdb021f8b1@gmail.com> <6da9021b-9b90-2d34-f94e-73f98648a640@gmail.com> <62d84546-b41c-21a3-4cf1-7737ed7d8cd6@gmail.com> Cc: Richard Biener , Jakub Jelinek , Jeff Law From: Martin Sebor Message-ID: <0d53c60a-2b96-f02f-72e1-5727b514ac4b@gmail.com> Date: Fri, 03 Aug 2018 03:06:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-IsSubscribed: yes X-SW-Source: 2018-08/txt/msg00252.txt.bz2 On 08/02/2018 12:15 PM, Bernd Edlinger wrote: > On 08/02/18 19:00, Martin Sebor wrote: >> As an alternate approach I have been thinking about, if >> there is a strong feeling that allowing strlen to iterate >> past the subobject boundary is necessary (I don't believe >> it is.) >> >> Rather than indiscriminately expanding the provenance of >> the subobject regardless of what members follow it in >> the enclosing structure, only consider doing that if >> the next member is an array of the same type. E.g., >> >> struct S { char a[4], b[3], c[2], d; }; >> extern struct S *p; >> >> strlen (p->a); // consider p->a's bounds to be char[9] >> > > No, initially I thought in the same direction, > but looking at the way how X-server is broken, > I realized that will probably not be sufficient. > > Maybe it would be good to have one set of optimistic range infos > that follow the standards, and can be used to control the warnings, > and another set of pessimistic range infos, that control optimizations. > > Consider > extern char garbage[10]; > char x[10]; > memcpy(x, garbage, 10); > x[9] = 0; > > strlen(x) will be 0..9 no matter what was in garbage. > That information can be safely used for optimizations. > > But if we have > extern char garbage[10]; > char x[10]; > memcpy(x, garbage, 10); > char *y = x; > if (strlen(y) < 10) <= may not be removed pessimistically This example involves a whole object. There should be no question that running off the end of a full object is undefined and a bug. It's far preferable to avoid such bugs. Emitting a library call that can return an arbitrary value or crash is the least secure and least user-friendly solution. > char z[10]; > sprintf(z, "%s", y); <= omitting warning would be okay optimistically. > > It is not really easy to do but possible. I see three cases here: 1) y points to a constant array whose initializer we know 2) y points to an array with contents (length) known to the strlen pass 3) we don't know what y points to or its contents (1) can be handled easily by extending the approach in my patch for bug 86552 to other built-ins. I have a patch that does that for sprintf. (2) can be handled by extending the strlen pass to also track the sizes of array objects into which it tracks stores. I'm working on a patch for GCC 9. (3) can only be handled by adding an even stricter setting for -Wformat-overflow by warning for all %s arguments of unknown length. I haven't considered this yet. > > In the moment I would like to concentrate exclusively on wrong code issues The only wrong code is in programs that run off the end of unterminated buffers. Those are the ones I believe we should focus on helping detect and prevent. Warnings are the fist step. Enhancing _FORTIFY_SOURCE and the sanitizers to also detect such reads would be another solution (in addition to warnings). Emitting traps (say under an option) would be yet another. But changing GCC to silently accept such programs is, in my opinion, the worst possible approach. Martin