From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 57468 invoked by alias); 3 Jun 2019 17:24:59 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 57417 invoked by uid 89); 3 Jun 2019 17:24:58 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS autolearn=ham version=3.3.1 spammy=usedef, Though, use-def X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 03 Jun 2019 17:24:56 +0000 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 83E9367F; Mon, 3 Jun 2019 17:24:47 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-57.rdu2.redhat.com [10.10.112.57]) by smtp.corp.redhat.com (Postfix) with ESMTP id 70626600CC; Mon, 3 Jun 2019 17:24:44 +0000 (UTC) Subject: Re: [PATCH] warn on returning alloca and VLA (PR 71924, 90549) To: Richard Biener Cc: Martin Sebor , gcc-patches References: <7dc3d2b2-7bd4-b82f-bc48-9e5878e27475@redhat.com> <0a1e0b3c-df54-6562-adf9-6facbc6dbd9f@gmail.com> <244d2a97-a753-4296-df9c-f53feaadd70d@redhat.com> <030f897f-a074-8319-96ae-c86e6c9a8551@gmail.com> <66d21304-8dd3-cd97-5ee8-317413c45ffb@redhat.com> From: Jeff Law Openpgp: preference=signencrypt Message-ID: <4062d590-b889-9567-6af6-e798aca65909@redhat.com> Date: Mon, 03 Jun 2019 17:24:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-IsSubscribed: yes X-SW-Source: 2019-06/txt/msg00106.txt.bz2 On 6/3/19 3:37 AM, Richard Biener wrote: > On Fri, May 31, 2019 at 5:35 PM Jeff Law wrote:> >> On 5/30/19 4:56 PM, Martin Sebor wrote: >>> On 5/30/19 10:15 AM, Jeff Law wrote: >>>> On 5/30/19 9:34 AM, Martin Sebor wrote: >>>> >>>>>>> If the alias oracle can be used to give the same results without >>>>>>> excessive false positives then I think it would be fine to make >>>>>>> use of it. Is that something you consider a prerequisite for >>>>>>> this change or should I look into it as a followup? >>>>>> I think we should explore it a bit before making a final decision. It >>>>>> may guide us for other work in this space (like detecting escaping >>>>>> locals). I think a dirty prototype to see if it's even in the right >>>>>> ballpark would make sense. >>>>> >>>>> Okay, let me look into it. >>>> Sounds good. Again, go with a quick prototype to see if it's likely >>>> feasible. The tests you've added should dramatically help evaluating if >>>> the oracle is up to the task. >>> >>> So to expand on what I said on the phone when we spoke: the problem >>> I quickly ran into with the prototype is that I wasn't able to find >>> a way to identify pointers to alloca/VLA storage. >> Your analysis matches my very quick read of the aliasing code. It may >> be the case that the Steensgaard patent got in the way here. >> >>> >>> In the the points-to solution for the pointer being returned they >>> both have the vars_contains_escaped_heap flag set. That seems like >>> an omission that shouldn't be hard to fix, but on its own, I don't >>> think it would be sufficient. >> RIght. In theory the result of an alloca call shouldn't alias anything >> in the heap -- but there were some implementations of alloca that were >> built on top of malloc (ugh). That flag may be catering to that case. >> >> But in the case of a __builtin_alloca that shouldn't apply. Hmm. That >> ultimately might be a bug. >> >>> >>> In the IL a VLA is represented as a pointer to an array, but when >>> returning a pointer into a VLA (at some offset so it's an SSA_NAME), >>> the pointer's point-to solution doesn't include the VLA pointer or >>> (AFAICS) make it possible to tell even that it is a VLA. For example >>> here: >>> >>> f (int n) >>> { >>> int * p; >>> int[0:D.1912] * a.1; >>> sizetype _1; >>> void * saved_stack.3_3; >>> sizetype _6; >>> >>> [local count: 1073741824]: >>> saved_stack.3_3 = __builtin_stack_save (); >>> _1 = (sizetype) n_2(D); >>> _6 = _1 * 4; >>> a.1_8 = __builtin_alloca_with_align (_6, 32); >>> p_9 = a.1_8 + _6; >>> __builtin_stack_restore (saved_stack.3_3); >>> return p_9; >>> } >>> >>> p_9's solution's is: >>> >>> p_9, points-to vars: { D.1925 } (escaped, escaped heap) >>> >>> I couldn't find out how to determine that D.1925 is a VLA (or even >>> what it is). It's not among the function's local variables that >>> FOR_EACH_LOCAL_DECL iterates over. >> It's possible that decl was created internally as part of the alias >> oracle's analysis. > > Yes. Note that only the UID was reserved the fake decl doesn't > live on. > > Note that for the testcase above the "local" alloca storage escapes > which means you run into a catch-22 here given points-to computes > a conservative correct solution and you want to detect escaping > locals. Usually detecting a pointer to local storage can be done > by using ptr_deref_may_alias_global_p but of course in this > case the storage was marked global by PTA itself (and our PTA > is not flow-sensitive and it doesn't distinguish an escape through > a return stmt from an escape through a call which is relevant > even for local storage). Good point. The inability to tell why something escaped is another significant hurdle. > > Feature-wise the PTA solver is missing sth like a "filter" > we could put in front of return stmts that doesn't let > addresses of locals leak. The simplest way of implementing > this might be to not include 'returns' in the constraints at all > (in non-IPA mode) and handle them by post-processing the > solver result. That gets us some additional flow-sensitivity > and a way to filter locals. Let me see if I can cook up this. Another thought would be to somehow flag how the pointer escaped. ie, was it as an argument, return value or stored to memory? Though in the end that level of detail may not be useful, so collecting it may not be worth the effort. > > That may ultimatively also help the warning code where you > then can use ptr_deref_may_alias_global_p. Another thought would be to use the alias oracle to prune what we look at. If we ask the oracle and the value doesn't escape, then it's not worth walking up the use-def chain to see where it came from. Jeff