From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 20257 invoked by alias); 30 May 2019 15:14:00 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 20249 invoked by uid 89); 30 May 2019 15:14:00 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-14.3 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,SPF_HELO_PASS autolearn=ham version=3.3.1 spammy= X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 30 May 2019 15:13:58 +0000 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 811C681135; Thu, 30 May 2019 15:13:57 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-56.rdu2.redhat.com [10.10.112.56]) by smtp.corp.redhat.com (Postfix) with ESMTP id B359738E06; Thu, 30 May 2019 15:13:56 +0000 (UTC) Subject: Re: [PATCH] warn on returning alloca and VLA (PR 71924, 90549) To: Martin Sebor , gcc-patches References: <7dc3d2b2-7bd4-b82f-bc48-9e5878e27475@redhat.com> <0a1e0b3c-df54-6562-adf9-6facbc6dbd9f@gmail.com> From: Jeff Law Openpgp: preference=signencrypt Message-ID: <244d2a97-a753-4296-df9c-f53feaadd70d@redhat.com> Date: Thu, 30 May 2019 15:21:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0 MIME-Version: 1.0 In-Reply-To: <0a1e0b3c-df54-6562-adf9-6facbc6dbd9f@gmail.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-IsSubscribed: yes X-SW-Source: 2019-05/txt/msg02019.txt.bz2 On 5/30/19 8:52 AM, Martin Sebor wrote: > On 5/29/19 11:45 AM, Jeff Law wrote: >> On 5/22/19 3:34 PM, Martin Sebor wrote: >>> -Wreturn-local-addr detects a subset of instances of returning >>> the address of a local object from a function but the warning >>> doesn't try to handle alloca or VLAs, or some non-trivial cases >>> of ordinary automatic variables[1]. >>> >>> The attached patch extends the implementation of the warning to >>> detect those.  It still doesn't detect instances where the address >>> is the result of a built-in such strcpy[2]. >>> >>> Tested on x86_64-linux. >>> >>> Martin >>> >>> [1] For example, this is only diagnosed with the patch: >>> >>>    void* f (int i) >>>    { >>>      struct S { int a[2]; } s[2]; >>>      return &s->a[i]; >>>    } >>> >>> [2] The following is not diagnosed even with the patch: >>> >>>    void sink (void*); >>> >>>    void* f (int i) >>>    { >>>      char a[6]; >>>      char *p = __builtin_strcpy (a, "123"); >>>      sink (p); >>>      return p; >>>    } >>> >>> I would expect detecting to be possible and useful.  Maybe as >>> a follow-up. >>> >>> gcc-71924.diff >>> >>> PR middle-end/71924 - missing -Wreturn-local-addr returning alloca >>> result >>> PR middle-end/90549 - missing -Wreturn-local-addr maybe returning an >>> address of a local array plus offset >>> >>> gcc/ChangeLog: >>> >>>     PR c/71924 >>>     * gimple-ssa-isolate-paths.c (is_addr_local): New function. >>>     (warn_return_addr_local_phi_arg, warn_return_addr_local): Same. >>>     (find_implicit_erroneous_behavior): Call >>> warn_return_addr_local_phi_arg. >>>     (find_explicit_erroneous_behavior): Call warn_return_addr_local. >>> >>> gcc/testsuite/ChangeLog: >>> >>>     PR c/71924 >>>     * gcc.dg/Wreturn-local-addr-2.c: New test. >>>     * gcc.dg/Walloca-4.c: Prune expected warnings. >>>     * gcc.dg/pr41551.c: Same. >>>     * gcc.dg/pr59523.c: Same. >>>     * gcc.dg/tree-ssa/pr88775-2.c: Same. >>>     * gcc.dg/winline-7.c: Same. >>> >>> diff --git a/gcc/gimple-ssa-isolate-paths.c >>> b/gcc/gimple-ssa-isolate-paths.c >>> index 33fe352bb23..2933ecf502e 100644 >>> --- a/gcc/gimple-ssa-isolate-paths.c >>> +++ b/gcc/gimple-ssa-isolate-paths.c >>> @@ -341,6 +341,135 @@ stmt_uses_0_or_null_in_undefined_way (gimple >>> *stmt) >>>     return false; >>>   } >>>   +/* Return true if EXPR is a expression of pointer type that refers >>> +   to the address of a variable with automatic storage duration. >>> +   If so, set *PLOC to the location of the object or the call that >>> +   allocated it (for alloca and VLAs).  When PMAYBE is non-null, >>> +   also consider PHI statements and set *PMAYBE when some but not >>> +   all arguments of such statements refer to local variables, and >>> +   to clear it otherwise.  */ >>> + >>> +static bool >>> +is_addr_local (tree exp, location_t *ploc, bool *pmaybe = NULL, >>> +           hash_set *visited = NULL) >>> +{ >>> +  if (TREE_CODE (exp) == ADDR_EXPR) >>> +    { >>> +      tree baseaddr = get_base_address (TREE_OPERAND (exp, 0)); >>> +      if (TREE_CODE (baseaddr) == MEM_REF) >>> +    return is_addr_local (TREE_OPERAND (baseaddr, 0), ploc, pmaybe, >>> visited); >>> + >>> +      if ((!VAR_P (baseaddr) >>> +       || is_global_var (baseaddr)) >>> +      && TREE_CODE (baseaddr) != PARM_DECL) >>> +    return false; >>> + >>> +      *ploc = DECL_SOURCE_LOCATION (baseaddr); >>> +      return true; >>> +    } >>> + >>> +  if (TREE_CODE (exp) == SSA_NAME) >>> +    { >>> +      gimple *def_stmt = SSA_NAME_DEF_STMT (exp); >>> +      enum gimple_code code = gimple_code (def_stmt); >>> + >>> +      if (is_gimple_assign (def_stmt)) >>> +    { >>> +      tree type = TREE_TYPE (gimple_assign_lhs (def_stmt)); >>> +      if (POINTER_TYPE_P (type)) >>> +        { >>> +          tree ptr = gimple_assign_rhs1 (def_stmt); >>> +          return is_addr_local (ptr, ploc, pmaybe, visited); >>> +        } >>> +      return false; >>> +    } >>> + >>> +      if (code == GIMPLE_CALL >>> +      && gimple_call_builtin_p (def_stmt)) >>> +    { >>> +      tree fn = gimple_call_fndecl (def_stmt); >>> +      int code = DECL_FUNCTION_CODE (fn); >>> +      if (code != BUILT_IN_ALLOCA >>> +          && code != BUILT_IN_ALLOCA_WITH_ALIGN) >>> +        return false; >>> + >>> +      *ploc = gimple_location (def_stmt); >>> +      return true; >>> +    } >>> + >>> +      if (code == GIMPLE_PHI && pmaybe) >>> +    { >>> +      unsigned count = 0; >>> +      gphi *phi_stmt = as_a (def_stmt); >>> + >>> +      unsigned nargs = gimple_phi_num_args (phi_stmt); >>> +      for (unsigned i = 0; i < nargs; ++i) >>> +        { >>> +          if (!visited->add (phi_stmt)) >>> +        { >>> +          tree arg = gimple_phi_arg_def (phi_stmt, i); >>> +          if (is_addr_local (arg, ploc, pmaybe, visited)) >>> +            ++count; >>> +        } >>> +        } >>> + >>> +      *pmaybe = count && count < nargs; >>> +      return count != 0; >>> +    } >>> +    } >>> + >>> +  return false; >>> +} >> Is there some reason you didn't query the alias oracle here?  It would >> seem a fairly natural fit.  Ultimately given a pointer (which will be an >> SSA_NAME) you want to ask whether or not it conclusively points into the >> stack. >> >> That would seem to dramatically simplify is_addr_local. > > I did think about it but decided against changing the existing > design (iterating over PHI arguments), for a couple of reasons: > > 1) It feels like a bigger change than my simple "bug fix" calls >    for. I suspect the net result will be simpler though ;-) I think you just get a pt solution and iterate over the things the solution says the pointer can point to. > 2) I'm not familiar enough with the alias oracle to say for sure >    if it can be used to give the same results as the existing >    implementation.  I.e., make it possible to identify and >    isolate each path that returns a local address (rather than >    just answering: yes, this pointer may point to some local >    in this function). Precision of the oracle is certainly the big question. > > If the alias oracle can be used to give the same results without > excessive false positives then I think it would be fine to make > use of it.  Is that something you consider a prerequisite for > this change or should I look into it as a followup? I think we should explore it a bit before making a final decision. It may guide us for other work in this space (like detecting escaping locals). I think a dirty prototype to see if it's even in the right ballpark would make sense. > > FWIW, I'm working on enhancing this to detect returning freed > pointers (under a different option).  That seems like a good > opportunity to also look into making use of the alias oracle. Yes. I think there's two interesting cases here to ponder. If we free a pointer that must point to a local, then we can warn & trap. If we free a pointer that may point to a local, then we can only warn (unless we can isolate the path). > > Besides these enhancements, there's also a request to diagnose > dereferencing pointers to compound literals whose lifetime has > ended (PR 89990), or more generally, those to any such local > object.  It's about preventing essentially the same problem > (accessing bad data or corrupting others) but it seems that > both the analysis and the handling will be sufficiently > different to consider implementing it somewhere else.  What > are your thoughts on that? I think the tough problem here is we lose binding scopes as we lower into gimple, so solving it in the general case may be tough. We've started adding some clobbers into the IL to denote object death points, but I'm not sure if they're sufficient to implement this kind of warning. Jeff