public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Martin Sebor <msebor@gmail.com>
To: Jeff Law <law@redhat.com>, Richard Biener <richard.guenther@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)
Date: Fri, 21 Sep 2018 17:40:00 -0000	[thread overview]
Message-ID: <7579c7b4-6594-9dbc-9dd1-505361aa0608@gmail.com> (raw)
In-Reply-To: <afd7c487-ffd9-044f-0f6b-1e8ce28190f4@redhat.com>

On 09/17/2018 07:30 PM, Jeff Law wrote:
> On 8/28/18 6:12 PM, Martin Sebor wrote:
>>>> Sadly, dstbase is the PARM_DECL for d.  That's where things are going
>>>> "wrong".  Not sure why you're getting the PARM_DECL in that case.  I'd
>>>> debug get_addr_base_and_unit_offset to understand what's going on.
>>>> Essentially you're getting different results of
>>>> get_addr_base_and_unit_offset in a case where they arguably should be
>>>> the same.
>>>
>>> Probably get_attr_nonstring_decl has the same "mistake" and returns
>>> the PARM_DECL instead of the SSA name pointer.  So we're comparing
>>> apples and oranges here.
>>
>> Returning the SSA_NAME_VAR from get_attr_nonstring_decl() is
>> intentional but the function need not (perhaps should not)
>> also set *REF to it.
>>
>>>
>>> Yeah:
>>>
>>> /* If EXPR refers to a character array or pointer declared attribute
>>>    nonstring return a decl for that array or pointer and set *REF to
>>>    the referenced enclosing object or pointer.  Otherwise returns
>>>    null.  */
>>>
>>> tree
>>> get_attr_nonstring_decl (tree expr, tree *ref)
>>> {
>>>   tree decl = expr;
>>>   if (TREE_CODE (decl) == SSA_NAME)
>>>     {
>>>       gimple *def = SSA_NAME_DEF_STMT (decl);
>>>
>>>       if (is_gimple_assign (def))
>>>         {
>>>           tree_code code = gimple_assign_rhs_code (def);
>>>           if (code == ADDR_EXPR
>>>               || code == COMPONENT_REF
>>>               || code == VAR_DECL)
>>>             decl = gimple_assign_rhs1 (def);
>>>         }
>>>       else if (tree var = SSA_NAME_VAR (decl))
>>>         decl = var;
>>>     }
>>>
>>>   if (TREE_CODE (decl) == ADDR_EXPR)
>>>     decl = TREE_OPERAND (decl, 0);
>>>
>>>   if (ref)
>>>     *ref = decl;
>>>
>>> I see a lot of "magic" here again in the attempt to "propagate"
>>> a nonstring attribute.
>>
>> That's the function's purpose: to look for the attribute.  Is
>> there a better way to do this?
>>
>>> Note
>>>
>>> foo (char *p __attribute__(("nonstring")))
>>> {
>>>   p = "bar";
>>>   strlen (p); // or whatever is necessary to call get_attr_nonstring_decl
>>> }
>>>
>>> is perfectly valid and p as passed to strlen is _not_ nonstring(?).
>>
>> I don't know if you're saying that it should get a warning or
>> shouldn't.  Right now it doesn't because the strlen() call is
>> folded before we check for nonstring.
>>
>> I could see an argument for diagnosing it but I suspect you
>> wouldn't like it because it would mean more warning from
>> the folder.  I could also see an argument against it because,
>> as you said, it's safe.
>>
>> If you take the assignment to p away then a warning is issued,
>> and that's because p is declared with attribute nonstring.
>> That's also why get_attr_nonstring_decl looks at SSA_NAME_VAR.
>>
>>> I think in your code comparing bases you want to look at the _original_
>>> argument to the string function rather than what get_attr_nonstring_decl
>>> returned as ref.
>>
>> I've adjusted get_attr_nonstring_decl() to avoid setting *REF
>> to SSA_NAME_VAR.  That let me remove the GIMPLE_NOP code from
>> the patch.  I've also updated the comment above SSA_NAME_VAR
>> to clarify its purpose per Jeff's comments.
>>
>> Attached is an updated revision with these changes.
>>
>> Martin
>>
>> gcc-87028.diff
>>
>> PR tree-optimization/87028 - false positive -Wstringop-truncation strncpy with global variable source string
>> gcc/ChangeLog:
>>
>> 	PR tree-optimization/87028
>> 	* calls.c (get_attr_nonstring_decl): Avoid setting *REF to
>> 	SSA_NAME_VAR.
>> 	* gimple-fold.c (gimple_fold_builtin_strncpy): Avoid folding
>> 	when statement doesn't belong to a basic block.
>> 	* tree.h (SSA_NAME_VAR): Update comment.
>> 	* tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Simplify.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 	PR tree-optimization/87028
>> 	* c-c++-common/Wstringop-truncation.c: Remove xfails.
>> 	* gcc.dg/Wstringop-truncation-5.c: New test.
>>
>
>> Index: gcc/calls.c
>> ===================================================================
>> --- gcc/calls.c	(revision 263928)
>> +++ gcc/calls.c	(working copy)
>> @@ -1503,6 +1503,7 @@ tree
>>  get_attr_nonstring_decl (tree expr, tree *ref)
>>  {
>>    tree decl = expr;
>> +  tree var = NULL_TREE;
>>    if (TREE_CODE (decl) == SSA_NAME)
>>      {
>>        gimple *def = SSA_NAME_DEF_STMT (decl);
>> @@ -1515,17 +1516,25 @@ get_attr_nonstring_decl (tree expr, tree *ref)
>>  	      || code == VAR_DECL)
>>  	    decl = gimple_assign_rhs1 (def);
>>  	}
>> -      else if (tree var = SSA_NAME_VAR (decl))
>> -	decl = var;
>> +      else
>> +	var = SSA_NAME_VAR (decl);
>>      }
>>
>>    if (TREE_CODE (decl) == ADDR_EXPR)
>>      decl = TREE_OPERAND (decl, 0);
>>
>> +  /* To simplify calling code, store the referenced DECL regardless of
>> +     the attribute determined below, but avoid storing the SSA_NAME_VAR
>> +     obtained above (it's not useful for dataflow purposes).  */
>>    if (ref)
>>      *ref = decl;
>>
>> -  if (TREE_CODE (decl) == ARRAY_REF)
>> +  /* Use the SSA_NAME_VAR that was determined above to see if it's
>> +     declared nonstring.  Otherwise drill down into the referenced
>> +     DECL.  */
>> +  if (var)
>> +    decl = var;
>> +  else if (TREE_CODE (decl) == ARRAY_REF)
>>      decl = TREE_OPERAND (decl, 0);
>>    else if (TREE_CODE (decl) == COMPONENT_REF)
>>      decl = TREE_OPERAND (decl, 1);
> The more I look at this the more I think what we really want to be doing
> is real propagation of the property either via the alias oracle or a
> propagation engine.   You can't even guarantee that if you've got an
> SSA_NAME that the value it holds has any relation to its underlying
> SSA_NAME_VAR -- the value in the SSA_NAME could well have been copied
> from a some other SSA_NAME with a different underlying SSA_NAME_VAR.
>
> I'm not going to insist on it, but I think if we find ourselves
> extending this again in a way that is really working around lack of
> propagation of the property then we should go back and fix the
> propagation problem.

We talked about improving this back in the GCC 8 cycle.  I've
been collecting input (and test cases) from Miguel Ojeda from
the adoption of the attribute in the Linux kernel.  There are
a number of issues I was hoping to get to in stage 1 but that
has been derailed by all the strlen back and forth.  I'm still
hoping to be able to fix some of the false positives here in
stage 3 but, IIUC the constraints, a redesign along the lines
you suggest would be considered overly intrusive.  (If not,
I'm willing to look into it.)

That said, I had the impression from Richard's comments that
implementing the propagation in points-to analysis would come
at a cost and have its own downsides:

   https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01954.html

So I wasn't sure it was necessarily an endorsement of
the approach as the ideal solution or just a passing thought.

>> Index: gcc/gimple-fold.c
>> ===================================================================
>> --- gcc/gimple-fold.c	(revision 263925)
>> +++ gcc/gimple-fold.c	(working copy)
>> @@ -1702,6 +1702,11 @@ gimple_fold_builtin_strncpy (gimple_stmt_iterator
>>    if (tree_int_cst_lt (ssize, len))
>>      return false;
>>
>> +  /* Defer warning (and folding) until the next statement in the basic
>> +     block is reachable.  */
>> +  if (!gimple_bb (stmt))
>> +    return false;
>> +
>>    /* Diagnose truncation that leaves the copy unterminated.  */
>>    maybe_diag_stxncpy_trunc (*gsi, src, len);
> I thought Richi wanted the guard earlier (maybe_fold_stmt) -- it wasn't
> entirely clear to me if the subsequent comments about needing to fold
> early where meant to raise issues with guarding earlier or not.

I'm fine with moving it if that's preferable.

Moving the test to maybe_fold_stmt() would, IMO, be the right
change to make in general, at least for library built-ins.
I have been meaning to suggest it independently of this fix
but because of its pervasive impact I've been holding off,
expecting it to be controversial.  If there is consensus I'm
happy to make this change but I would prefer to do it separately
since it causes a number of regressions in tests that expect
built-ins to be folded very early on (i.e., look for evidence
of the folding in the output of -fdump-tree-gimple or
-fdump-tree-ccp1).  Some of the regression would go away if
maybe_fold_stmt() only avoided folding of library built-in
functions.  Resolving the others would require adjusting
the tests to either use optimization or look for the evidence
of folding in later passes than gimple or ccp1).  I think all
that is reasonable and won't impact the efficiency of
the emitted object code, but it's obviously a much bigger
change than a simple fix for a false positive warning.

If that sounds reasonable, is the patch acceptable as is?

The latest version is here:

   https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html

Martin

  reply	other threads:[~2018-09-21 17:13 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-24 15:58 Martin Sebor
2018-08-26  5:25 ` Jeff Law
2018-08-27  8:30   ` Richard Biener
2018-08-27 15:32     ` Jeff Law
2018-08-27 15:43       ` Richard Biener
2018-10-04 15:51         ` Jeff Law
2018-10-04 15:55           ` Martin Sebor
2018-10-08 10:14             ` Richard Biener
2018-10-08 21:40               ` Martin Sebor
2018-10-16 22:42             ` Jeff Law
2018-10-21  8:17               ` Martin Sebor
2018-10-31 17:07                 ` [PING #3][PATCH] " Martin Sebor
2018-11-16  3:12                   ` [PING #4][PATCH] " Martin Sebor
2018-11-16  9:07                     ` Richard Biener
2018-11-29 20:34                       ` Martin Sebor
2018-11-29 23:07                         ` Jeff Law
2018-11-29 23:43                           ` Martin Sebor
2018-11-30  2:02                             ` Jeff Law
2018-11-30  8:05                               ` Richard Biener
2018-11-30  8:30                                 ` Jakub Jelinek
2018-12-05 23:11                             ` Jeff Law
2018-12-06 13:00                               ` Christophe Lyon
2018-12-06 13:52                                 ` Jeff Law
2018-11-30  7:57                         ` Richard Biener
2018-11-30 15:51                           ` Martin Sebor
2018-11-07 21:28                 ` [PATCH] " Jeff Law
2018-11-09  1:25                   ` Martin Sebor
2018-10-04 19:55           ` Joseph Myers
2018-08-27 16:27     ` Martin Sebor
2018-08-28  4:27       ` Jeff Law
2018-08-28  9:56         ` Richard Biener
2018-08-28  9:57           ` Richard Biener
2018-08-29  0:12           ` Martin Sebor
2018-08-29  7:29             ` Richard Biener
2018-08-29 15:43               ` Martin Sebor
2018-08-30  0:27             ` Jeff Law
2018-08-30  8:48               ` Richard Biener
2018-09-12 15:50             ` Martin Sebor
2018-09-18  1:56             ` Jeff Law
2018-09-21 17:40               ` Martin Sebor [this message]
2018-10-01 21:31                 ` [PING] " Martin Sebor
2018-10-08 22:15                   ` Martin Sebor
2018-10-04 15:52             ` Jeff Law
2018-08-28 20:44         ` Martin Sebor
2018-08-28 22:17           ` Jeff Law
2018-08-27 20:31   ` Martin Sebor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7579c7b4-6594-9dbc-9dd1-505361aa0608@gmail.com \
    --to=msebor@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).