From: Bernd Edlinger <bernd.edlinger@hotmail.de>
To: Jeff Law <law@redhat.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
Richard Biener <rguenther@suse.de>,
Martin Sebor <msebor@gmail.com>
Subject: Re: [PATCH] Cleanup strcpy/stpcpy no nul warning code
Date: Tue, 18 Sep 2018 12:06:00 -0000 [thread overview]
Message-ID: <VI1PR0701MB286291A978C4E3291AA389C6E41D0@VI1PR0701MB2862.eurprd07.prod.outlook.com> (raw)
In-Reply-To: <469fc374-4dc5-e279-c08c-39ec2240b059@redhat.com>
On 09/18/18 07:31, Jeff Law wrote:
> On 9/17/18 1:18 PM, Bernd Edlinger wrote:
>> On 09/17/18 20:32, Jeff Law wrote:
>>> On 9/17/18 12:20 PM, Bernd Edlinger wrote:
>>>> On 09/17/18 19:33, Jeff Law wrote:
>>>>> On 9/16/18 1:58 PM, Bernd Edlinger wrote:
>>>>>> Hi,
>>>>>>
>>>>>> this is a cleanup of the recently added strlen/strcpy/stpcpy
>>>>>> no nul warning code.
>>>>>>
>>>>>> Most importantly it moves the SSA_NAME handling from
>>>>>> unterminated_array to string_constant, thereby fixing
>>>>>> another round of xfails in the strlen and stpcpy test cases.
>>>>>>
>>>>>> I need to say that the fix for bug 86622 is relying in
>>>>>> type info on the pointer which is probably not safe in
>>>>>> GIMPLE in the light of the recent discussion.
>>>>>>
>>>>>> I had to add two further exceptions, which should
>>>>>> be safe in general: that is a pointer arithmentic on a string
>>>>>> literal is okay, and arithmetic on a string constant
>>>>>> that is exactly the size of the whole DECL, cannot
>>>>>> access an adjacent member.
>>>>>>
>>>>>>
>>>>>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>>>>>> Is it OK for trunk?
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Bernd.
>>>>>>
>>>>>>
>>>>>> patch-cleanup-no-nul.diff
>>>>>>
>>>>>> gcc:
>>>>>> 2018-09-16 Bernd Edlinger <bernd.edlinger@hotmail.de>
>>>>>>
>>>>>> * builtins.h (unterminated_array): Remove prototype.
>>>>>> * builtins.c (unterminated_array): Simplify. Make static.
>>>>>> (expand_builtin_strcpy_args): Don't use TREE_NO_WARNING here.
>>>>>> (expand_builtin_stpcpy_1): Remove warn_string_no_nul without effect.
>>>>>> * expr.c (string_constant): Handle SSA_NAME. Add more exceptions
>>>>>> where pointer arithmetic is safe.
>>>>>> * gimple-fold.c (get_range_strlen): Handle nonstr like in c_strlen.
>>>>>> (get_max_strlen): Remove the unnecessary mynonstr handling.
>>>>>> (gimple_fold_builtin_strcpy): Simplify.
>>>>>> (gimple_fold_builtin_stpcpy): Simplify.
>>>>>> (gimple_fold_builtin_sprintf): Remove NO_WARNING propagation
>>>>>> without effect.
>>>>>> (gimple_fold_builtin_strlen): Simplify.
>>>>>>
>>>>>> testsuite:
>>>>>> 2018-09-16 Bernd Edlinger <bernd.edlinger@hotmail.de>
>>>>>>
>>>>>> * gcc.dg/warn-stpcpy-no-nul.c: Remove xfails.
>>>>>> * gcc.dg/warn-strlen-no-nul.c: Remove xfails.
>>>>>>
>>>>>> Index: gcc/builtins.c
>>>>>> ===================================================================
>>>>>> --- gcc/builtins.c (revision 264342)
>>>>>> +++ gcc/builtins.c (working copy)
>>>>>> @@ -567,31 +567,12 @@ warn_string_no_nul (location_t loc, const char *fn
>>>>>> the declaration of the object of which the array is a member or
>>>>>> element. Otherwise return null. */
>>>>>>
>>>>>> -tree
>>>>>> +static tree
>>>>>> unterminated_array (tree exp)
>>>>>> {
>>>>>> - if (TREE_CODE (exp) == SSA_NAME)
>>>>>> - {
>>>>>> - gimple *stmt = SSA_NAME_DEF_STMT (exp);
>>>>>> - if (!is_gimple_assign (stmt))
>>>>>> - return NULL_TREE;
>>>>>> -
>>>>>> - tree rhs1 = gimple_assign_rhs1 (stmt);
>>>>>> - tree_code code = gimple_assign_rhs_code (stmt);
>>>>>> - if (code == ADDR_EXPR
>>>>>> - && TREE_CODE (TREE_OPERAND (rhs1, 0)) == ARRAY_REF)
>>>>>> - rhs1 = rhs1;
>>>>>> - else if (code != POINTER_PLUS_EXPR)
>>>>>> - return NULL_TREE;
>>>>>> -
>>>>>> - exp = rhs1;
>>>>>> - }
>>>>>> -
>>>>>> tree nonstr = NULL;
>>>>>> - if (c_strlen (exp, 1, &nonstr, 1) == NULL && nonstr)
>>>>>> - return nonstr;
>>>>>> -
>>>>>> - return NULL_TREE;
>>>>>> + c_strlen (exp, 1, &nonstr);
>>>>>> + return nonstr;
>>>>>> }
>>>>> Sigh. This is going to conflict with patch #6 from Martin's kit.
>>>>>
>>>>
>>>> Sorry, it just feels wrong to do this folding here instead of in
>>>> string_constant.
>>>>
>>>> I think the handling of POINTER_PLUS_EXPR above, is faulty,
>>>> because it is ignoring the offset parameter, which typically
>>>> contains the offset part, may add an offset to a different
>>>> structure member or another array index. That is what happened
>>>> in PR 86622.
>>>>
>>>> So you are likely looking at the wrong index, or even the wrong
>>>> structure member.
>>> I'm not disagreeing that something's wrong here -- the whole concept
>>> that we extract the rhs and totally ignore the offset seems wrong. I've
>>> stumbled over it working through issues with either patch #4 or #6 from
>>> Martin. But I felt I needed to go back and reevaluate any assumptions I
>>> had about how the code was supposed to be used before I called it out.
>>>
>>>
>>> Your expr.c changes may be worth pushing through independently of the
>>> rest. AFAICT they're really just exposing more cases where we can
>>> determine that we're working with a stirng constant.
>>>
>>
>> I think that moves just the folding from here to expr.c,
>> I don't see how the change in part #6 on unterminated_array
>> is supposed to work, I quote it here and add some comments:
>>
> Essentially there's a couple of concepts he wants to get in
> unterminated_array.
>
> First, given an address, if there's a variable part we want to clear
> exact so we can adjust the text of the message in the caller. ie
> "exceeds the size" vs "may exceed the size".
>
> Second, he really wants LEN to be set to the length of the string for
> whatever the constant part of the address might be, even if it's
> potentially not properly terminated.
>
> So given:
>
> const char b[][5] = { /* { dg-message "declared here" } */
> "12", "345", "6789", "abcde"
> };
>
> A reference like:
>
> T (&b[3][1] + v0, bsz);
>
> We want the length of the string at &b[3][1] into *LEN. That's used to
> provide the potential length of the string in the message.
>
> I don't think we can get that from c_strlen right now. Right now we've
> defined c_strlen as (reasonably) returning NULL for the length when
> we've got an string without proper termination.
>
> I'm hesitant to provide a boolean argument that essentially says give me
> whatever length you've computed, even if the string isn't properly
> terminated. But conceptually that's one of the things we'd need. It's
> also needed for patch #4.
>
> Jeff
>
> ps. Yes there's still some issues in the code. I'm mostly trying to
> highlight how it's being used and what requirements we'd have if we were
> going to do something like you've suggested in your patch to
> unterminated_array.
>
Hmm, you know, I wrote a while ago the following:
https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00411.html
Where I suggested to change c_strlen parameter nonstr to a
structure, where additional information could go.
There are a lot more informations that could be relevant
for warnings.
So it would be easy to convey the unterminated string length
there, also I could imagine to diagnose unaligned wide string
constants, if we have a decl with a too low alignment factor,
or a known unaligned offset for instance. Also the location
of the expression on which c_strlen found the unterminated
string, could be of interest, especially when get_range_strlen
follows phi nodes.
Well, I have the impression that the 6/6 patch should be split
up in a part that uses c_strlen in its current form, and a
follow-up patch that adds any additional info.
I prefer doing things step-by-step, you know.
Note that the example given is probably unsafe:
> T (&b[3][1] + v0, bsz);
This does not pass the check in string_constant because it
looks like _2 = (char*)&b + _1; _3 = strlen (_2);
I have no way to prove that v0 is not larger than sizeof (b[3]).
However T(&a[1] + v0, bsz) will definitely work.
So the warning will likely get completely bogus data out
of unterminated_array.
By the way test test case for pr87053 is an example
of a false positive warning (it comes twice even though
TREE_NO_WARNING is used).
gcc pr87053.c
pr87053.c: In function 'main':
pr87053.c:15:26: warning: 'strlen' argument missing terminating nul [-Wstringop-overflow=]
15 | if (__builtin_strlen (u.z) != 7)
| ~^~
pr87053.c:11:3: note: referenced argument declared here
11 | } u = {{"1234", "567"}};
| ^
pr87053.c:15:26: warning: 'strlen' argument missing terminating nul [-Wstringop-overflow=]
15 | if (__builtin_strlen (u.z) != 7)
| ~^~
pr87053.c:11:3: note: referenced argument declared here
11 | } u = {{"1234", "567"}};
| ^
I think the duplicate warnings will not go away unless
this warning is only diagnosed in the tree-ssa-strlen
pass instead of the folding code, which happens repeatedly.
Bernd.
next prev parent reply other threads:[~2018-09-18 11:44 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-09-16 20:05 Bernd Edlinger
2018-09-17 17:35 ` Jeff Law
2018-09-17 18:32 ` Bernd Edlinger
2018-09-17 18:35 ` Jeff Law
2018-09-17 19:18 ` Bernd Edlinger
2018-09-17 23:01 ` Jeff Law
2018-09-18 5:38 ` Jeff Law
2018-09-18 12:06 ` Bernd Edlinger [this message]
2018-09-18 18:55 ` Jeff Law
2018-09-22 18:47 ` Martin Liška
2018-09-23 8:49 ` Martin Liška
2018-09-23 14:35 ` Jeff Law
2018-09-24 18:17 ` Jeff Law
2018-09-24 18:19 ` Bernd Edlinger
2018-09-25 3:19 ` Jeff Law
2018-09-25 22:43 ` Jeff Law
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=VI1PR0701MB286291A978C4E3291AA389C6E41D0@VI1PR0701MB2862.eurprd07.prod.outlook.com \
--to=bernd.edlinger@hotmail.de \
--cc=gcc-patches@gcc.gnu.org \
--cc=law@redhat.com \
--cc=msebor@gmail.com \
--cc=rguenther@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).