From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 50372 invoked by alias); 30 May 2019 22:56:31 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 50364 invoked by uid 89); 30 May 2019 22:56:31 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-7.8 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy= X-HELO: mail-ot1-f67.google.com Received: from mail-ot1-f67.google.com (HELO mail-ot1-f67.google.com) (209.85.210.67) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 30 May 2019 22:56:29 +0000 Received: by mail-ot1-f67.google.com with SMTP id j49so7284003otc.13 for ; Thu, 30 May 2019 15:56:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-language:content-transfer-encoding; bh=C1CsiTwWWCJFMCD4boQPwkceColims0GQ6G7rsvCS4A=; b=toigdsV6d85nQALKwPtWpZaSkzDi3EK3eDBxDSnS5sMNmiOLR93XB0y6Sig9FMXgou fS9AfMuLktVvf6L1F6swwcuwW02jthTx5o5g/k/iP0uqempvLS+dmqX8wq2Jc0fVKhIO DT1nrkzeebYhFeLsRDzqO52k4QCp/x4Rx88Ukv4PMS0y3MrT2Fx7mCeiKJnnh4GSJuhp ifV3lc4pjTdlvI400ZxoqZFEdN7nJDd/Nv4FnDoCbwfyPCo+qEqmWbc+Bm+zhG28ajaG 7LJ0Sg8xwT1qeeCNM5SH13HQZT8JXDxW8y0iz29noiB97SlQfoRS92WKyAp9TWCGFUDg tgRA== Return-Path: Received: from [192.168.0.41] (97-118-125-210.hlrn.qwest.net. [97.118.125.210]) by smtp.gmail.com with ESMTPSA id q3sm1539349oig.7.2019.05.30.15.56.25 (version=TLS1_3 cipher=AEAD-AES128-GCM-SHA256 bits=128/128); Thu, 30 May 2019 15:56:25 -0700 (PDT) Subject: Re: [PATCH] warn on returning alloca and VLA (PR 71924, 90549) To: Jeff Law , gcc-patches References: <7dc3d2b2-7bd4-b82f-bc48-9e5878e27475@redhat.com> <0a1e0b3c-df54-6562-adf9-6facbc6dbd9f@gmail.com> <244d2a97-a753-4296-df9c-f53feaadd70d@redhat.com> From: Martin Sebor Message-ID: <030f897f-a074-8319-96ae-c86e6c9a8551@gmail.com> Date: Thu, 30 May 2019 23:35:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-IsSubscribed: yes X-SW-Source: 2019-05/txt/msg02067.txt.bz2 On 5/30/19 10:15 AM, Jeff Law wrote: > On 5/30/19 9:34 AM, Martin Sebor wrote: > >>>> If the alias oracle can be used to give the same results without >>>> excessive false positives then I think it would be fine to make >>>> use of it.  Is that something you consider a prerequisite for >>>> this change or should I look into it as a followup? >>> I think we should explore it a bit before making a final decision.  It >>> may guide us for other work in this space (like detecting escaping >>> locals).   I think a dirty prototype to see if it's even in the right >>> ballpark would make sense. >> >> Okay, let me look into it. > Sounds good. Again, go with a quick prototype to see if it's likely > feasible. The tests you've added should dramatically help evaluating if > the oracle is up to the task. So to expand on what I said on the phone when we spoke: the problem I quickly ran into with the prototype is that I wasn't able to find a way to identify pointers to alloca/VLA storage. In the the points-to solution for the pointer being returned they both have the vars_contains_escaped_heap flag set. That seems like an omission that shouldn't be hard to fix, but on its own, I don't think it would be sufficient. In the IL a VLA is represented as a pointer to an array, but when returning a pointer into a VLA (at some offset so it's an SSA_NAME), the pointer's point-to solution doesn't include the VLA pointer or (AFAICS) make it possible to tell even that it is a VLA. For example here: f (int n) { int * p; int[0:D.1912] * a.1; sizetype _1; void * saved_stack.3_3; sizetype _6; [local count: 1073741824]: saved_stack.3_3 = __builtin_stack_save (); _1 = (sizetype) n_2(D); _6 = _1 * 4; a.1_8 = __builtin_alloca_with_align (_6, 32); p_9 = a.1_8 + _6; __builtin_stack_restore (saved_stack.3_3); return p_9; } p_9's solution's is: p_9, points-to vars: { D.1925 } (escaped, escaped heap) I couldn't find out how to determine that D.1925 is a VLA (or even what it is). It's not among the function's local variables that FOR_EACH_LOCAL_DECL iterates over. By replacing the VLA in the test case with a call to malloc I get this for the returned p_7: p_7, points-to NULL, points-to vars: { D.1916 } (escaped, escaped heap) Again, I see no way to quickly tell that this pointer points into the block returned from malloc. I'm sure Richard will correct me if there is a simple way to do it but I couldn't find one. If there is a way to identify that the returned pointer is for a VLA (or alloca), for parity with the current patch we also need to find the location of the VLA declaration or the alloca call so that the warning could point to it. Unless there's a straight path from the mysterious D.1925 above to that decl/call statement, finding it would likely require some sor of traversal. At that point I wouldn't be surprised if the complexity of the solution didn't approach or even exceed the current approach. Martin >>>> FWIW, I'm working on enhancing this to detect returning freed >>>> pointers (under a different option).  That seems like a good >>>> opportunity to also look into making use of the alias oracle. >>> Yes.  I think there's two interesting cases here to ponder.  If we free >>> a pointer that must point to a local, then we can warn & trap.  If we >>> free a pointer that may point to a local, then we can only warn (unless >>> we can isolate the path). >> >> I wasn't actually thinking of freeing locals but it sounds like >> a useful enhancement as well.  Thanks for the suggestion! :) >> >> To be clear, what I'm working on is detecting: >> >>   void* f (void *p) >>   { >>     free (p);   // note: pointer was freed here >>     // ... >>     return p;   // warning: returning a freed pointer >>   } > Ah, we were talking about different things. Though what you're doing > might be better modeled in a true global static analyzer as a > use-after-free problem. My sense is that translation-unit local version > of that problem really isn't that useful in practice. THough I guess > there isn't anything bad with having a TU local version. > > >> >>>> Besides these enhancements, there's also a request to diagnose >>>> dereferencing pointers to compound literals whose lifetime has >>>> ended (PR 89990), or more generally, those to any such local >>>> object.  It's about preventing essentially the same problem >>>> (accessing bad data or corrupting others) but it seems that >>>> both the analysis and the handling will be sufficiently >>>> different to consider implementing it somewhere else.  What >>>> are your thoughts on that? >>> I think the tough problem here is we lose binding scopes as we lower >>> into gimple, so solving it in the general case may be tough.  We've >>> started adding some clobbers into the IL to denote object death points, >>> but I'm not sure if they're sufficient to implement this kind of warning. >> >> I was afraid that might be a problem. > Way back in the early days of tree-ssa we kept the binding scopes. But > that proved problematical in various ways. Mostly they just got in the > way of analysis an optimization and we spent far too much time in the > optimizers working around them or removing those which were empty. > > They'd be helpful in this kind of analysis, stack slot sharing vs the > inliner and a couple other problems. I don't know if the pendulum has > moved far enough to revisit :-) > > jeff >