public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Martin Sebor <msebor@gmail.com>
To: Jeff Law <law@redhat.com>, gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] warn on returning alloca and VLA (PR 71924, 90549)
Date: Thu, 30 May 2019 14:58:00 -0000	[thread overview]
Message-ID: <0a1e0b3c-df54-6562-adf9-6facbc6dbd9f@gmail.com> (raw)
In-Reply-To: <7dc3d2b2-7bd4-b82f-bc48-9e5878e27475@redhat.com>

On 5/29/19 11:45 AM, Jeff Law wrote:
> On 5/22/19 3:34 PM, Martin Sebor wrote:
>> -Wreturn-local-addr detects a subset of instances of returning
>> the address of a local object from a function but the warning
>> doesn't try to handle alloca or VLAs, or some non-trivial cases
>> of ordinary automatic variables[1].
>>
>> The attached patch extends the implementation of the warning to
>> detect those.  It still doesn't detect instances where the address
>> is the result of a built-in such strcpy[2].
>>
>> Tested on x86_64-linux.
>>
>> Martin
>>
>> [1] For example, this is only diagnosed with the patch:
>>
>>    void* f (int i)
>>    {
>>      struct S { int a[2]; } s[2];
>>      return &s->a[i];
>>    }
>>
>> [2] The following is not diagnosed even with the patch:
>>
>>    void sink (void*);
>>
>>    void* f (int i)
>>    {
>>      char a[6];
>>      char *p = __builtin_strcpy (a, "123");
>>      sink (p);
>>      return p;
>>    }
>>
>> I would expect detecting to be possible and useful.  Maybe as
>> a follow-up.
>>
>> gcc-71924.diff
>>
>> PR middle-end/71924 - missing -Wreturn-local-addr returning alloca result
>> PR middle-end/90549 - missing -Wreturn-local-addr maybe returning an address of a local array plus offset
>>
>> gcc/ChangeLog:
>>
>> 	PR c/71924
>> 	* gimple-ssa-isolate-paths.c (is_addr_local): New function.
>> 	(warn_return_addr_local_phi_arg, warn_return_addr_local): Same.
>> 	(find_implicit_erroneous_behavior): Call warn_return_addr_local_phi_arg.
>> 	(find_explicit_erroneous_behavior): Call warn_return_addr_local.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 	PR c/71924
>> 	* gcc.dg/Wreturn-local-addr-2.c: New test.
>> 	* gcc.dg/Walloca-4.c: Prune expected warnings.
>> 	* gcc.dg/pr41551.c: Same.
>> 	* gcc.dg/pr59523.c: Same.
>> 	* gcc.dg/tree-ssa/pr88775-2.c: Same.
>> 	* gcc.dg/winline-7.c: Same.
>>
>> diff --git a/gcc/gimple-ssa-isolate-paths.c b/gcc/gimple-ssa-isolate-paths.c
>> index 33fe352bb23..2933ecf502e 100644
>> --- a/gcc/gimple-ssa-isolate-paths.c
>> +++ b/gcc/gimple-ssa-isolate-paths.c
>> @@ -341,6 +341,135 @@ stmt_uses_0_or_null_in_undefined_way (gimple *stmt)
>>     return false;
>>   }
>>   
>> +/* Return true if EXPR is a expression of pointer type that refers
>> +   to the address of a variable with automatic storage duration.
>> +   If so, set *PLOC to the location of the object or the call that
>> +   allocated it (for alloca and VLAs).  When PMAYBE is non-null,
>> +   also consider PHI statements and set *PMAYBE when some but not
>> +   all arguments of such statements refer to local variables, and
>> +   to clear it otherwise.  */
>> +
>> +static bool
>> +is_addr_local (tree exp, location_t *ploc, bool *pmaybe = NULL,
>> +	       hash_set<gphi *> *visited = NULL)
>> +{
>> +  if (TREE_CODE (exp) == ADDR_EXPR)
>> +    {
>> +      tree baseaddr = get_base_address (TREE_OPERAND (exp, 0));
>> +      if (TREE_CODE (baseaddr) == MEM_REF)
>> +	return is_addr_local (TREE_OPERAND (baseaddr, 0), ploc, pmaybe, visited);
>> +
>> +      if ((!VAR_P (baseaddr)
>> +	   || is_global_var (baseaddr))
>> +	  && TREE_CODE (baseaddr) != PARM_DECL)
>> +	return false;
>> +
>> +      *ploc = DECL_SOURCE_LOCATION (baseaddr);
>> +      return true;
>> +    }
>> +
>> +  if (TREE_CODE (exp) == SSA_NAME)
>> +    {
>> +      gimple *def_stmt = SSA_NAME_DEF_STMT (exp);
>> +      enum gimple_code code = gimple_code (def_stmt);
>> +
>> +      if (is_gimple_assign (def_stmt))
>> +	{
>> +	  tree type = TREE_TYPE (gimple_assign_lhs (def_stmt));
>> +	  if (POINTER_TYPE_P (type))
>> +	    {
>> +	      tree ptr = gimple_assign_rhs1 (def_stmt);
>> +	      return is_addr_local (ptr, ploc, pmaybe, visited);
>> +	    }
>> +	  return false;
>> +	}
>> +
>> +      if (code == GIMPLE_CALL
>> +	  && gimple_call_builtin_p (def_stmt))
>> +	{
>> +	  tree fn = gimple_call_fndecl (def_stmt);
>> +	  int code = DECL_FUNCTION_CODE (fn);
>> +	  if (code != BUILT_IN_ALLOCA
>> +	      && code != BUILT_IN_ALLOCA_WITH_ALIGN)
>> +	    return false;
>> +
>> +	  *ploc = gimple_location (def_stmt);
>> +	  return true;
>> +	}
>> +
>> +      if (code == GIMPLE_PHI && pmaybe)
>> +	{
>> +	  unsigned count = 0;
>> +	  gphi *phi_stmt = as_a <gphi *> (def_stmt);
>> +
>> +	  unsigned nargs = gimple_phi_num_args (phi_stmt);
>> +	  for (unsigned i = 0; i < nargs; ++i)
>> +	    {
>> +	      if (!visited->add (phi_stmt))
>> +		{
>> +		  tree arg = gimple_phi_arg_def (phi_stmt, i);
>> +		  if (is_addr_local (arg, ploc, pmaybe, visited))
>> +		    ++count;
>> +		}
>> +	    }
>> +
>> +	  *pmaybe = count && count < nargs;
>> +	  return count != 0;
>> +	}
>> +    }
>> +
>> +  return false;
>> +}
> Is there some reason you didn't query the alias oracle here?  It would
> seem a fairly natural fit.  Ultimately given a pointer (which will be an
> SSA_NAME) you want to ask whether or not it conclusively points into the
> stack.
> 
> That would seem to dramatically simplify is_addr_local.

I did think about it but decided against changing the existing
design (iterating over PHI arguments), for a couple of reasons:

1) It feels like a bigger change than my simple "bug fix" calls
    for.
2) I'm not familiar enough with the alias oracle to say for sure
    if it can be used to give the same results as the existing
    implementation.  I.e., make it possible to identify and
    isolate each path that returns a local address (rather than
    just answering: yes, this pointer may point to some local
    in this function).

If the alias oracle can be used to give the same results without
excessive false positives then I think it would be fine to make
use of it.  Is that something you consider a prerequisite for
this change or should I look into it as a followup?

FWIW, I'm working on enhancing this to detect returning freed
pointers (under a different option).  That seems like a good
opportunity to also look into making use of the alias oracle.

Besides these enhancements, there's also a request to diagnose
dereferencing pointers to compound literals whose lifetime has
ended (PR 89990), or more generally, those to any such local
object.  It's about preventing essentially the same problem
(accessing bad data or corrupting others) but it seems that
both the analysis and the handling will be sufficiently
different to consider implementing it somewhere else.  What
are your thoughts on that?

Martin

> 
> The rest looks pretty reasonable.
> 
> Jeff
> 

  reply	other threads:[~2019-05-30 14:52 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-22 21:34 Martin Sebor
2019-05-29 18:08 ` Jeff Law
2019-05-30 14:58   ` Martin Sebor [this message]
2019-05-30 15:21     ` Jeff Law
2019-05-30 15:48       ` Martin Sebor
2019-05-30 16:20         ` Jeff Law
2019-05-30 17:27           ` Jason Merrill
2019-05-31  0:26             ` Jeff Law
2019-05-30 23:35           ` Martin Sebor
2019-05-31 15:50             ` Jeff Law
2019-06-03  9:37               ` Richard Biener
2019-06-03 11:28                 ` Richard Biener
2019-06-04 11:45                   ` Richard Biener
2019-06-03 17:24                 ` Jeff Law
2019-05-31 21:19 ` Jeff Law
2019-06-03 23:24   ` Martin Sebor
2019-06-04 19:40     ` Martin Sebor
     [not found]       ` <224f8161-e370-bcbc-3ee6-5cff5835e016@redhat.com>
2019-06-19  3:19         ` Martin Sebor
2019-06-26 14:59           ` [PING] " Martin Sebor
2019-06-27  0:12           ` Jeff Law
2019-06-30 21:50             ` Martin Sebor
2019-07-02 20:59               ` Jeff Law
2019-07-11  6:45                 ` Martin Sebor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0a1e0b3c-df54-6562-adf9-6facbc6dbd9f@gmail.com \
    --to=msebor@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).