From: Martin Sebor <msebor@gmail.com>
To: Jeff Law <law@redhat.com>, Richard Biener <richard.guenther@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PING] [PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)
Date: Mon, 08 Oct 2018 22:15:00 -0000
Message-ID: <a97b8510-80b4-afb0-dd95-fcb9e9097bb9@gmail.com>
In-Reply-To: <163d7d1f-97cd-0c8d-3a72-54865b57cd8b@gmail.com>
Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html

As with the other patch (bug 84561), there may be ways to redesign
the warning, but I don't have the cycles to undertake that before
stage 1 ends. Unless someone has a simpler suggestion for how
to avoid this false positive now, can we please accept this patch
for GCC 9 and consider the more ambitious approaches for GCC 10?

On 10/01/2018 03:24 PM, Martin Sebor wrote:
> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html
>
> On 09/21/2018 11:13 AM, Martin Sebor wrote:
>> On 09/17/2018 07:30 PM, Jeff Law wrote:
>>> On 8/28/18 6:12 PM, Martin Sebor wrote:
>>>>>> Sadly, dstbase is the PARM_DECL for d. That's where things are going
>>>>>> "wrong". Not sure why you're getting the PARM_DECL in that case. I'd
>>>>>> debug get_addr_base_and_unit_offset to understand what's going on.
>>>>>> Essentially you're getting different results of
>>>>>> get_addr_base_and_unit_offset in a case where they arguably should be
>>>>>> the same.
>>>>>
>>>>> Probably get_attr_nonstring_decl has the same "mistake" and returns
>>>>> the PARM_DECL instead of the SSA name pointer. So we're comparing
>>>>> apples and oranges here.
>>>>
>>>> Returning the SSA_NAME_VAR from get_attr_nonstring_decl() is
>>>> intentional but the function need not (perhaps should not)
>>>> also set *REF to it.
>>>>
>>>>>
>>>>> Yeah:
>>>>>
>>>>> /* If EXPR refers to a character array or pointer declared attribute
>>>>>    nonstring return a decl for that array or pointer and set *REF to
>>>>>    the referenced enclosing object or pointer.  Otherwise returns
>>>>>    null.  */
>>>>>
>>>>> tree
>>>>> get_attr_nonstring_decl (tree expr, tree *ref)
>>>>> {
>>>>>   tree decl = expr;
>>>>>   if (TREE_CODE (decl) == SSA_NAME)
>>>>>     {
>>>>>       gimple *def = SSA_NAME_DEF_STMT (decl);
>>>>>
>>>>>       if (is_gimple_assign (def))
>>>>>         {
>>>>>           tree_code code = gimple_assign_rhs_code (def);
>>>>>           if (code == ADDR_EXPR
>>>>>               || code == COMPONENT_REF
>>>>>               || code == VAR_DECL)
>>>>>             decl = gimple_assign_rhs1 (def);
>>>>>         }
>>>>>       else if (tree var = SSA_NAME_VAR (decl))
>>>>>         decl = var;
>>>>>     }
>>>>>
>>>>>   if (TREE_CODE (decl) == ADDR_EXPR)
>>>>>     decl = TREE_OPERAND (decl, 0);
>>>>>
>>>>>   if (ref)
>>>>>     *ref = decl;
>>>>>
>>>>> I see a lot of "magic" here again in the attempt to "propagate"
>>>>> a nonstring attribute.
>>>>
>>>> That's the function's purpose: to look for the attribute. Is
>>>> there a better way to do this?
>>>>
>>>>> Note
>>>>>
>>>>> foo (char *p __attribute__ ((nonstring)))
>>>>> {
>>>>>   p = "bar";
>>>>>   strlen (p); // or whatever is necessary to call get_attr_nonstring_decl
>>>>> }
>>>>>
>>>>> is perfectly valid and p as passed to strlen is _not_ nonstring(?).
>>>>
>>>> I don't know if you're saying that it should get a warning or
>>>> shouldn't. Right now it doesn't because the strlen() call is
>>>> folded before we check for nonstring.
>>>>
>>>> I could see an argument for diagnosing it but I suspect you
>>>> wouldn't like it because it would mean more warnings from
>>>> the folder. I could also see an argument against it because,
>>>> as you said, it's safe.
>>>>
>>>> If you take the assignment to p away then a warning is issued,
>>>> and that's because p is declared with attribute nonstring.
>>>> That's also why get_attr_nonstring_decl looks at SSA_NAME_VAR.
>>>>
>>>>> I think in your code comparing bases you want to look at the
>>>>> _original_ argument to the string function rather than what
>>>>> get_attr_nonstring_decl returned as ref.
>>>>
>>>> I've adjusted get_attr_nonstring_decl() to avoid setting *REF
>>>> to SSA_NAME_VAR. That let me remove the GIMPLE_NOP code from
>>>> the patch. I've also updated the comment above SSA_NAME_VAR
>>>> to clarify its purpose per Jeff's comments.
>>>>
>>>> Attached is an updated revision with these changes.
>>>>
>>>> Martin
>>>>
>>>> gcc-87028.diff
>>>>
>>>> PR tree-optimization/87028 - false positive -Wstringop-truncation
>>>> strncpy with global variable source string
>>>> gcc/ChangeLog:
>>>>
>>>> PR tree-optimization/87028
>>>> * calls.c (get_attr_nonstring_decl): Avoid setting *REF to
>>>> SSA_NAME_VAR.
>>>> * gimple-fold.c (gimple_fold_builtin_strncpy): Avoid folding
>>>> when statement doesn't belong to a basic block.
>>>> * tree.h (SSA_NAME_VAR): Update comment.
>>>> * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Simplify.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> PR tree-optimization/87028
>>>> * c-c++-common/Wstringop-truncation.c: Remove xfails.
>>>> * gcc.dg/Wstringop-truncation-5.c: New test.
>>>>
>>>
>>>> Index: gcc/calls.c
>>>> ===================================================================
>>>> --- gcc/calls.c (revision 263928)
>>>> +++ gcc/calls.c (working copy)
>>>> @@ -1503,6 +1503,7 @@ tree
>>>>  get_attr_nonstring_decl (tree expr, tree *ref)
>>>>  {
>>>>    tree decl = expr;
>>>> +  tree var = NULL_TREE;
>>>>    if (TREE_CODE (decl) == SSA_NAME)
>>>>      {
>>>>        gimple *def = SSA_NAME_DEF_STMT (decl);
>>>> @@ -1515,17 +1516,25 @@ get_attr_nonstring_decl (tree expr, tree *ref)
>>>>                || code == VAR_DECL)
>>>>              decl = gimple_assign_rhs1 (def);
>>>>          }
>>>> -      else if (tree var = SSA_NAME_VAR (decl))
>>>> -        decl = var;
>>>> +      else
>>>> +        var = SSA_NAME_VAR (decl);
>>>>      }
>>>>
>>>>    if (TREE_CODE (decl) == ADDR_EXPR)
>>>>      decl = TREE_OPERAND (decl, 0);
>>>>
>>>> +  /* To simplify calling code, store the referenced DECL regardless of
>>>> +     the attribute determined below, but avoid storing the SSA_NAME_VAR
>>>> +     obtained above (it's not useful for dataflow purposes).  */
>>>>    if (ref)
>>>>      *ref = decl;
>>>>
>>>> -  if (TREE_CODE (decl) == ARRAY_REF)
>>>> +  /* Use the SSA_NAME_VAR that was determined above to see if it's
>>>> +     declared nonstring.  Otherwise drill down into the referenced
>>>> +     DECL.  */
>>>> +  if (var)
>>>> +    decl = var;
>>>> +  else if (TREE_CODE (decl) == ARRAY_REF)
>>>>      decl = TREE_OPERAND (decl, 0);
>>>>    else if (TREE_CODE (decl) == COMPONENT_REF)
>>>>      decl = TREE_OPERAND (decl, 1);
>>> The more I look at this the more I think what we really want to be doing
>>> is real propagation of the property either via the alias oracle or a
>>> propagation engine. You can't even guarantee that if you've got an
>>> SSA_NAME that the value it holds has any relation to its underlying
>>> SSA_NAME_VAR -- the value in the SSA_NAME could well have been copied
>>> from some other SSA_NAME with a different underlying SSA_NAME_VAR.
>>>
>>> I'm not going to insist on it, but I think if we find ourselves
>>> extending this again in a way that is really working around lack of
>>> propagation of the property then we should go back and fix the
>>> propagation problem.
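
To make the concern above concrete, here is a minimal C sketch. The helper
name length_after_copy and the NONSTRING macro are hypothetical and not part
of the patch. The parameter p is declared nonstring, but the value that
reaches strlen() was copied from q, which carries no such attribute, so the
declaration alone says little about the value. Whether and where GCC
diagnoses such a call depends on the version and on when the call is folded,
so the sketch makes no claim about the warning itself.

```c
#include <stddef.h>
#include <string.h>

/* Use attribute nonstring only where the compiler advertises it;
   NONSTRING is an illustrative macro, not a GCC-provided name.  */
#if defined __has_attribute
#  if __has_attribute (nonstring)
#    define NONSTRING __attribute__ ((nonstring))
#  endif
#endif
#ifndef NONSTRING
#  define NONSTRING
#endif

/* P is declared nonstring, yet by the time strlen is called it holds
   the value of Q, an ordinary NUL-terminated string pointer.  */
size_t
length_after_copy (char *p NONSTRING, char *q)
{
  p = q;
  return strlen (p);
}
```

Calling length_after_copy (0, buf) with buf holding "bar" returns the length
of buf, regardless of how p was declared.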
>>
>> We talked about improving this back in the GCC 8 cycle. I've
>> been collecting input (and test cases) from Miguel Ojeda from
>> the adoption of the attribute in the Linux kernel. There are
>> a number of issues I was hoping to get to in stage 1 but that
>> has been derailed by all the strlen back and forth. I'm still
>> hoping to be able to fix some of the false positives here in
>> stage 3 but, IIUC the constraints, a redesign along the lines
>> you suggest would be considered overly intrusive. (If not,
>> I'm willing to look into it.)
>>
>> That said, I had the impression from Richard's comments that
>> implementing the propagation in points-to analysis would come
>> at a cost and have its own downsides:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01954.html
>>
>> So I wasn't sure it was necessarily an endorsement of
>> the approach as the ideal solution or just a passing thought.
>>
>>>> Index: gcc/gimple-fold.c
>>>> ===================================================================
>>>> --- gcc/gimple-fold.c (revision 263925)
>>>> +++ gcc/gimple-fold.c (working copy)
>>>> @@ -1702,6 +1702,11 @@ gimple_fold_builtin_strncpy (gimple_stmt_iterator
>>>>    if (tree_int_cst_lt (ssize, len))
>>>>      return false;
>>>>
>>>> +  /* Defer warning (and folding) until the next statement in the basic
>>>> +     block is reachable.  */
>>>> +  if (!gimple_bb (stmt))
>>>> +    return false;
>>>> +
>>>>    /* Diagnose truncation that leaves the copy unterminated.  */
>>>>    maybe_diag_stxncpy_trunc (*gsi, src, len);
>>> I thought Richi wanted the guard earlier (maybe_fold_stmt) -- it wasn't
>>> entirely clear to me if the subsequent comments about needing to fold
>>> early were meant to raise issues with guarding earlier or not.
>>
>> I'm fine with moving it if that's preferable.
>>
>> Moving the test to maybe_fold_stmt() would, IMO, be the right
>> change to make in general, at least for library built-ins.
>> I have been meaning to suggest it independently of this fix
>> but because of its pervasive impact I've been holding off,
>> expecting it to be controversial. If there is consensus I'm
>> happy to make this change but I would prefer to do it separately
>> since it causes a number of regressions in tests that expect
>> built-ins to be folded very early on (i.e., look for evidence
>> of the folding in the output of -fdump-tree-gimple or
>> -fdump-tree-ccp1). Some of the regressions would go away if
>> maybe_fold_stmt() only avoided folding of library built-in
>> functions. Resolving the others would require adjusting
>> the tests to either use optimization or look for the evidence
>> of folding in later passes than gimple or ccp1. I think all
>> that is reasonable and won't impact the efficiency of
>> the emitted object code, but it's obviously a much bigger
>> change than a simple fix for a false positive warning.
>>
>> If that sounds reasonable, is the patch acceptable as is?
>>
>> The latest version is here:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html
>>
>> Martin
>
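
For readers less familiar with the warning discussed in this thread, the
following C sketch shows the two idioms involved. It is not the PR 87028 test
case; struct record, set_tag, set_name and the NONSTRING macro are
hypothetical names chosen for illustration. As I understand the documented
behavior of -Wstringop-truncation, a strncpy into a nonstring field, or one
followed directly by an explicit NUL store, is treated as intentional, which
is why the patch wants the next statement to be reachable before diagnosing.

```c
#include <string.h>

/* Use attribute nonstring only where the compiler advertises it;
   NONSTRING is an illustrative macro, not a GCC-provided name.  */
#if defined __has_attribute
#  if __has_attribute (nonstring)
#    define NONSTRING __attribute__ ((nonstring))
#  endif
#endif
#ifndef NONSTRING
#  define NONSTRING
#endif

/* Hypothetical record: TAG is a fixed-width field that need not be
   NUL-terminated, NAME is an ordinary string.  */
struct record
{
  char tag[4] NONSTRING;
  char name[8];
};

/* Filling all of TAG without a terminating NUL is intended here.  */
void
set_tag (struct record *r, const char *s)
{
  strncpy (r->tag, s, sizeof r->tag);
}

/* Copy at most 7 characters, then terminate on the very next statement.  */
void
set_name (struct record *r, const char *s)
{
  strncpy (r->name, s, sizeof r->name - 1);
  r->name[sizeof r->name - 1] = '\0';
}
```

After set_name the name field is always a valid string, while tag must only
be read with length-bounded functions such as memcmp.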