From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 77270 invoked by alias); 2 Nov 2017 11:48:36 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 77199 invoked by uid 89); 2 Nov 2017 11:48:29 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD,SPF_PASS autolearn=ham version=3.3.2 spammy=sk:indisti, that!, forcing, msebor@gmail.com X-HELO: mx2.suse.de Received: from mx2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 02 Nov 2017 11:48:26 +0000 Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 15F04AAC1; Thu, 2 Nov 2017 11:48:19 +0000 (UTC) Date: Thu, 02 Nov 2017 11:48:00 -0000 From: Richard Biener To: Martin Sebor cc: Gcc Patch List , Jeff Law Subject: Re: [PATCH] enhance -Warray-bounds to detect out-of-bounds offsets (PR 82455) In-Reply-To: <1064925a-ac64-502e-d0dd-85c27e7432f6@gmail.com> Message-ID: References: <79634da6-bf31-b7f0-15f5-0436fc21a51a@gmail.com> <5FF222AD-B155-434B-9C65-721009D1964E@suse.de> <1064925a-ac64-502e-d0dd-85c27e7432f6@gmail.com> User-Agent: Alpine 2.20 (LSU 67 2015-01-07) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-SW-Source: 2017-11/txt/msg00072.txt.bz2 On Mon, 30 Oct 2017, Martin Sebor wrote: > On 10/30/2017 02:56 PM, Richard Biener wrote: > > On October 30, 2017 9:13:04 PM GMT+01:00, Martin Sebor > > wrote: > > > On 10/30/2017 01:53 PM, Richard Biener wrote: > > > > On October 30, 2017 4:19:25 PM GMT+01:00, Martin Sebor > > > wrote: > > > > > On 10/30/2017 05:45 AM, Richard Biener wrote: > > > > > > On Sun, 29 Oct 2017, Martin Sebor wrote: > > > > > > > > > > > > > In my work on -Wrestrict, to issue meaningful warnings, I found > > > > > > > it important to detect both out of bounds array indices as well > > > > > > > as offsets in calls to restrict-qualified functions like strcpy. > > > > > > > GCC already detects some of these cases but my tests for > > > > > > > the enhanced warning exposed a few gaps. > > > > > > > > > > > > > > The attached patch enhances -Warray-bounds to detect more > > > instances > > > > > > > out-of-bounds indices and offsets to member arrays and non-array > > > > > > > members. For example, it detects the out-of-bounds offset in the > > > > > > > call to strcpy below. > > > > > > > > > > > > > > The patch is meant to be applied on top posted here but not yet > > > > > > > committed: > > > > > > > https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01304.html > > > > > > > > > > > > > > Richard, since this also touches tree-vrp.c I look for your > > > > > comments. > > > > > > > > > > > > You fail to tell what you are changing and why - I have to reverse > > > > > > engineer this from the patch which a) isn't easy in this case, b) > > > > > feels > > > > > > like a waste of time. Esp. since the patch does many things. > > > > > > > > > > > > My first question is why do you add a warning from forwprop? It > > > > > > _feels_ like you're trying to warn about arbitrary out-of-bound > > > > > > addresses at the point they are folded to MEM_REFs. And it looks > > > > > > like you're warning about pointer arithmetic like &p->a + 6. > > > > > > That doesn't look correct to me. Pointer arithmetic in GIMPLE > > > > > > is not restricted to operate within fields that are appearantly > > > > > > accessed here - the only restriction is with respect to the > > > > > > whole underlying pointed-to-object. > > > > > > > > > > > > By doing the warning from forwprop you'll run into all such cases > > > > > > introduced by GCC itself during quite late optimization passes. > > > > > > > > > > I haven't run into any such cases. What would a more appropriate > > > > > place to detect out-of-bounds offsets? I'm having a hard time > > > > > distinguishing what is appropriate and what isn't. For instance, > > > > > if it's okay to detect some out of bounds offsets/indices in vrp > > > > > why is it wrong to do a better job of it in forwprop? > > > > > > > > > > > > > > > > > You're trying to re-do __builtin_object_size even when that wasn't > > > > > > used. > > > > > > > > > > That's not the quite my intent, although it is close. > > > > > > > > > > > > > > > > > So it looks like you're on the wrong track. Yes, > > > > > > > > > > > > strcpy (p->a + 6, "y"); > > > > > > > > > > > > _may_ be "invalid" C (I'm not even sure about that!) but it > > > > > > is certainly not invalid GIMPLE. > > > > > > > > > > Adding (or subtracting) an integer to/from a pointer to an array > > > > > is defined in both C and C++ only if the result points to an element > > > > > of the array or just past the last element of the array. Otherwise > > > > > it's undefined. (A non-array object is considered an array of one > > > > > for this purpose.) > > > > > > > > On GIMPLE this is indistinguishable from (short *) (p->a) + 3. > > > > > > Sure, they're both the same: > > > > > > pa_3 = &p_2(D)->a; > > > _1 = pa_3 + 6; > > > > > > and that's fine because the implementation of the warning sees and > > > uses the byte offset from the beginning of a, so I don't understand > > > the problem you are pointing out. Can you clarify what you mean? > > > > It does not access the array but the underlying object. On GIMPLE it is just > > an address calculation without constraints. > > But the computation starts with the subobject and so is only > valid within the bounds of the subobject. Or are you saying > that GCC emits such GIMPLE expressions for valid code? If so, > can you give an example of such code? There were elaborate transforms of ptr + CST to ptr->a.b.c[3] in the past. We have ripped out _most_ of them because of bad interaction with dependence analysis and _b_o_s warnings. But for example PRE might still end up propagating _1 = &ptr->a.b.c; _2 = (*_1)[i_3]; as _2 = ptr->a.b.c[i_3]; But it's not so much GCC building up GIMPLE expressions that would cause false positives but "valid" C code and "invalid" C code represented exactly the same in GCC. Let's say struct S { short s[4]; int i; }; char foo (struct S *p) { *((char *)p->s + 8); } for example which I think is perfectly valid but ends up as foo (struct S * p) { short int[4] * _1; char * _2; [0.00%] [count: INV]: _1 = &p_3(D)->s; _2 = _1 + 8; return; in GIMPLE. You might say well, I should have used (char *)p, not (char *)p->s, but I believe such code is more common than you think (and we have the different _FORITIFY_SOURCE levels for a reason! You are forcing the most strict interpretation on all address computations!) > Or, if that's not it, what exactly is your concern with this > enhancement? If it's that it's implemented in forwprop, what > would be a better place, e.g., earlier in the optimization > phase? If it's something something else, I'd appreciate it > if you could explain what. For one implementing this in forwprop looks like a move in the wrong direction. I'd like to have separate warning passes or at most amend warnings from optimization passes, not add new ones. Richard.