public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Martin Sebor <msebor@gmail.com>
To: Jeff Law <law@redhat.com>, Richard Biener <richard.guenther@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PING] [PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)
Date: Mon, 08 Oct 2018 22:15:00 -0000	[thread overview]
Message-ID: <a97b8510-80b4-afb0-dd95-fcb9e9097bb9@gmail.com> (raw)
In-Reply-To: <163d7d1f-97cd-0c8d-3a72-54865b57cd8b@gmail.com>

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html

As with the other patch (bug 84561), there may be ways to redesign
the warning, but I don't have the cycles to undertake it before
stage 1 ends.  Unless someone has a simpler suggestion for how
to avoid this false positive now can we please accept this patch
for GCC 9 and consider the more ambitious approaches for GCC 10?

On 10/01/2018 03:24 PM, Martin Sebor wrote:
> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html
>
> On 09/21/2018 11:13 AM, Martin Sebor wrote:
>> On 09/17/2018 07:30 PM, Jeff Law wrote:
>>> On 8/28/18 6:12 PM, Martin Sebor wrote:
>>>>>> Sadly, dstbase is the PARM_DECL for d.  That's where things are going
>>>>>> "wrong".  Not sure why you're getting the PARM_DECL in that case.
>>>>>> I'd
>>>>>> debug get_addr_base_and_unit_offset to understand what's going on.
>>>>>> Essentially you're getting different results of
>>>>>> get_addr_base_and_unit_offset in a case where they arguably should be
>>>>>> the same.
>>>>>
>>>>> Probably get_attr_nonstring_decl has the same "mistake" and returns
>>>>> the PARM_DECL instead of the SSA name pointer.  So we're comparing
>>>>> apples and oranges here.
>>>>
>>>> Returning the SSA_NAME_VAR from get_attr_nonstring_decl() is
>>>> intentional but the function need not (perhaps should not)
>>>> also set *REF to it.
>>>>
>>>>>
>>>>> Yeah:
>>>>>
>>>>> /* If EXPR refers to a character array or pointer declared attribute
>>>>>    nonstring return a decl for that array or pointer and set *REF to
>>>>>    the referenced enclosing object or pointer.  Otherwise returns
>>>>>    null.  */
>>>>>
>>>>> tree
>>>>> get_attr_nonstring_decl (tree expr, tree *ref)
>>>>> {
>>>>>   tree decl = expr;
>>>>>   if (TREE_CODE (decl) == SSA_NAME)
>>>>>     {
>>>>>       gimple *def = SSA_NAME_DEF_STMT (decl);
>>>>>
>>>>>       if (is_gimple_assign (def))
>>>>>         {
>>>>>           tree_code code = gimple_assign_rhs_code (def);
>>>>>           if (code == ADDR_EXPR
>>>>>               || code == COMPONENT_REF
>>>>>               || code == VAR_DECL)
>>>>>             decl = gimple_assign_rhs1 (def);
>>>>>         }
>>>>>       else if (tree var = SSA_NAME_VAR (decl))
>>>>>         decl = var;
>>>>>     }
>>>>>
>>>>>   if (TREE_CODE (decl) == ADDR_EXPR)
>>>>>     decl = TREE_OPERAND (decl, 0);
>>>>>
>>>>>   if (ref)
>>>>>     *ref = decl;
>>>>>
>>>>> I see a lot of "magic" here again in the attempt to "propagate"
>>>>> a nonstring attribute.
>>>>
>>>> That's the function's purpose: to look for the attribute.  Is
>>>> there a better way to do this?
>>>>
>>>>> Note
>>>>>
>>>>> foo (char *p __attribute__(("nonstring")))
>>>>> {
>>>>>   p = "bar";
>>>>>   strlen (p); // or whatever is necessary to call
>>>>> get_attr_nonstring_decl
>>>>> }
>>>>>
>>>>> is perfectly valid and p as passed to strlen is _not_ nonstring(?).
>>>>
>>>> I don't know if you're saying that it should get a warning or
>>>> shouldn't.  Right now it doesn't because the strlen() call is
>>>> folded before we check for nonstring.
>>>>
>>>> I could see an argument for diagnosing it but I suspect you
>>>> wouldn't like it because it would mean more warning from
>>>> the folder.  I could also see an argument against it because,
>>>> as you said, it's safe.
>>>>
>>>> If you take the assignment to p away then a warning is issued,
>>>> and that's because p is declared with attribute nonstring.
>>>> That's also why get_attr_nonstring_decl looks at SSA_NAME_VAR.
>>>>
>>>>> I think in your code comparing bases you want to look at the
>>>>> _original_
>>>>> argument to the string function rather than what
>>>>> get_attr_nonstring_decl
>>>>> returned as ref.
>>>>
>>>> I've adjusted get_attr_nonstring_decl() to avoid setting *REF
>>>> to SSA_NAME_VAR.  That let me remove the GIMPLE_NOP code from
>>>> the patch.  I've also updated the comment above SSA_NAME_VAR
>>>> to clarify its purpose per Jeff's comments.
>>>>
>>>> Attached is an updated revision with these changes.
>>>>
>>>> Martin
>>>>
>>>> gcc-87028.diff
>>>>
>>>> PR tree-optimization/87028 - false positive -Wstringop-truncation
>>>> strncpy with global variable source string
>>>> gcc/ChangeLog:
>>>>
>>>>     PR tree-optimization/87028
>>>>     * calls.c (get_attr_nonstring_decl): Avoid setting *REF to
>>>>     SSA_NAME_VAR.
>>>>     * gimple-fold.c (gimple_fold_builtin_strncpy): Avoid folding
>>>>     when statement doesn't belong to a basic block.
>>>>     * tree.h (SSA_NAME_VAR): Update comment.
>>>>     * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Simplify.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>>     PR tree-optimization/87028
>>>>     * c-c++-common/Wstringop-truncation.c: Remove xfails.
>>>>     * gcc.dg/Wstringop-truncation-5.c: New test.
>>>>
>>>
>>>> Index: gcc/calls.c
>>>> ===================================================================
>>>> --- gcc/calls.c    (revision 263928)
>>>> +++ gcc/calls.c    (working copy)
>>>> @@ -1503,6 +1503,7 @@ tree
>>>>  get_attr_nonstring_decl (tree expr, tree *ref)
>>>>  {
>>>>    tree decl = expr;
>>>> +  tree var = NULL_TREE;
>>>>    if (TREE_CODE (decl) == SSA_NAME)
>>>>      {
>>>>        gimple *def = SSA_NAME_DEF_STMT (decl);
>>>> @@ -1515,17 +1516,25 @@ get_attr_nonstring_decl (tree expr, tree *ref)
>>>>            || code == VAR_DECL)
>>>>          decl = gimple_assign_rhs1 (def);
>>>>      }
>>>> -      else if (tree var = SSA_NAME_VAR (decl))
>>>> -    decl = var;
>>>> +      else
>>>> +    var = SSA_NAME_VAR (decl);
>>>>      }
>>>>
>>>>    if (TREE_CODE (decl) == ADDR_EXPR)
>>>>      decl = TREE_OPERAND (decl, 0);
>>>>
>>>> +  /* To simplify calling code, store the referenced DECL regardless of
>>>> +     the attribute determined below, but avoid storing the
>>>> SSA_NAME_VAR
>>>> +     obtained above (it's not useful for dataflow purposes).  */
>>>>    if (ref)
>>>>      *ref = decl;
>>>>
>>>> -  if (TREE_CODE (decl) == ARRAY_REF)
>>>> +  /* Use the SSA_NAME_VAR that was determined above to see if it's
>>>> +     declared nonstring.  Otherwise drill down into the referenced
>>>> +     DECL.  */
>>>> +  if (var)
>>>> +    decl = var;
>>>> +  else if (TREE_CODE (decl) == ARRAY_REF)
>>>>      decl = TREE_OPERAND (decl, 0);
>>>>    else if (TREE_CODE (decl) == COMPONENT_REF)
>>>>      decl = TREE_OPERAND (decl, 1);
>>> The more I look at this the more I think what we really want to be doing
>>> is real propagation of the property either via the alias oracle or a
>>> propagation engine.   You can't even guarantee that if you've got an
>>> SSA_NAME that the value it holds has any relation to its underlying
>>> SSA_NAME_VAR -- the value in the SSA_NAME could well have been copied
>>> from a some other SSA_NAME with a different underlying SSA_NAME_VAR.
>>>
>>> I'm not going to insist on it, but I think if we find ourselves
>>> extending this again in a way that is really working around lack of
>>> propagation of the property then we should go back and fix the
>>> propagation problem.
>>
>> We talked about improving this back in the GCC 8 cycle.  I've
>> been collecting input (and test cases) from Miguel Ojeda from
>> the adoption of the attribute in the Linux kernel.  There are
>> a number of issues I was hoping to get to in stage 1 but that
>> has been derailed by all the strlen back and forth.  I'm still
>> hoping to be able to fix some of the false positives here in
>> stage 3 but, IIUC the constraints, a redesign along the lines
>> you suggest would be considered overly intrusive.  (If not,
>> I'm willing to look into it.)
>>
>> That said, I had the impression from Richard's comments that
>> implementing the propagation in points-to analysis would come
>> at a cost and have its own downsides:
>>
>>   https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01954.html
>>
>> So I wasn't sure it was necessarily an endorsement of
>> the approach as the ideal solution or just a passing thought.
>>
>>>> Index: gcc/gimple-fold.c
>>>> ===================================================================
>>>> --- gcc/gimple-fold.c    (revision 263925)
>>>> +++ gcc/gimple-fold.c    (working copy)
>>>> @@ -1702,6 +1702,11 @@ gimple_fold_builtin_strncpy
>>>> (gimple_stmt_iterator
>>>>    if (tree_int_cst_lt (ssize, len))
>>>>      return false;
>>>>
>>>> +  /* Defer warning (and folding) until the next statement in the basic
>>>> +     block is reachable.  */
>>>> +  if (!gimple_bb (stmt))
>>>> +    return false;
>>>> +
>>>>    /* Diagnose truncation that leaves the copy unterminated.  */
>>>>    maybe_diag_stxncpy_trunc (*gsi, src, len);
>>> I thought Richi wanted the guard earlier (maybe_fold_stmt) -- it wasn't
>>> entirely clear to me if the subsequent comments about needing to fold
>>> early where meant to raise issues with guarding earlier or not.
>>
>> I'm fine with moving it if that's preferable.
>>
>> Moving the test to maybe_fold_stmt() would, IMO, be the right
>> change to make in general, at least for library built-ins.
>> I have been meaning to suggest it independently of this fix
>> but because of its pervasive impact I've been holding off,
>> expecting it to be controversial.  If there is consensus I'm
>> happy to make this change but I would prefer to do it separately
>> since it causes a number of regressions in tests that expect
>> built-ins to be folded very early on (i.e., look for evidence
>> of the folding in the output of -fdump-tree-gimple or
>> -fdump-tree-ccp1).  Some of the regression would go away if
>> maybe_fold_stmt() only avoided folding of library built-in
>> functions.  Resolving the others would require adjusting
>> the tests to either use optimization or look for the evidence
>> of folding in later passes than gimple or ccp1).  I think all
>> that is reasonable and won't impact the efficiency of
>> the emitted object code, but it's obviously a much bigger
>> change than a simple fix for a false positive warning.
>>
>> If that sounds reasonable, is the patch acceptable as is?
>>
>> The latest version is here:
>>
>>   https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html
>>
>> Martin
>

  reply	other threads:[~2018-10-08 21:46 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-24 15:58 Martin Sebor
2018-08-26  5:25 ` Jeff Law
2018-08-27  8:30   ` Richard Biener
2018-08-27 15:32     ` Jeff Law
2018-08-27 15:43       ` Richard Biener
2018-10-04 15:51         ` Jeff Law
2018-10-04 15:55           ` Martin Sebor
2018-10-08 10:14             ` Richard Biener
2018-10-08 21:40               ` Martin Sebor
2018-10-16 22:42             ` Jeff Law
2018-10-21  8:17               ` Martin Sebor
2018-10-31 17:07                 ` [PING #3][PATCH] " Martin Sebor
2018-11-16  3:12                   ` [PING #4][PATCH] " Martin Sebor
2018-11-16  9:07                     ` Richard Biener
2018-11-29 20:34                       ` Martin Sebor
2018-11-29 23:07                         ` Jeff Law
2018-11-29 23:43                           ` Martin Sebor
2018-11-30  2:02                             ` Jeff Law
2018-11-30  8:05                               ` Richard Biener
2018-11-30  8:30                                 ` Jakub Jelinek
2018-12-05 23:11                             ` Jeff Law
2018-12-06 13:00                               ` Christophe Lyon
2018-12-06 13:52                                 ` Jeff Law
2018-11-30  7:57                         ` Richard Biener
2018-11-30 15:51                           ` Martin Sebor
2018-11-07 21:28                 ` [PATCH] " Jeff Law
2018-11-09  1:25                   ` Martin Sebor
2018-10-04 19:55           ` Joseph Myers
2018-08-27 16:27     ` Martin Sebor
2018-08-28  4:27       ` Jeff Law
2018-08-28  9:56         ` Richard Biener
2018-08-28  9:57           ` Richard Biener
2018-08-29  0:12           ` Martin Sebor
2018-08-29  7:29             ` Richard Biener
2018-08-29 15:43               ` Martin Sebor
2018-08-30  0:27             ` Jeff Law
2018-08-30  8:48               ` Richard Biener
2018-09-12 15:50             ` Martin Sebor
2018-09-18  1:56             ` Jeff Law
2018-09-21 17:40               ` Martin Sebor
2018-10-01 21:31                 ` [PING] " Martin Sebor
2018-10-08 22:15                   ` Martin Sebor [this message]
2018-10-04 15:52             ` Jeff Law
2018-08-28 20:44         ` Martin Sebor
2018-08-28 22:17           ` Jeff Law
2018-08-27 20:31   ` Martin Sebor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a97b8510-80b4-afb0-dd95-fcb9e9097bb9@gmail.com \
    --to=msebor@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).