From: Richard Biener <richard.guenther@gmail.com>
To: Jeff Law <law@redhat.com>
Cc: Martin Sebor <msebor@gmail.com>,
Gcc Patch List <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] handle sprintf(d, "%s", ...) in gimple-ssa-sprintf.c
Date: Tue, 02 May 2017 10:27:00 -0000 [thread overview]
Message-ID: <CAFiYyc37fsOQ4t=OXnn9R-UCaiaEEbR4HGoOT1E3YqR8KO6NFg@mail.gmail.com> (raw)
In-Reply-To: <0de09846-2f84-4763-58b2-b31e5490a6bb@redhat.com>
On Fri, Apr 28, 2017 at 6:35 PM, Jeff Law <law@redhat.com> wrote:
> On 04/25/2017 09:55 PM, Martin Sebor wrote:
>>
>> On 04/25/2017 04:05 PM, Jeff Law wrote:
>>>
>>> On 04/21/2017 03:33 PM, Martin Sebor wrote:
>>>>
>>>> Bug 77671 - missing -Wformat-overflow warning on sprintf overflow
>>>> with "%s", is caused by gimple-fold.c transforming s{,n}printf
>>>> calls with a plain "%s" format string into strcpy regardless of
>>>> whether they are valid well before the -Wformtat-overflow warning
>>>> has had a chance to validate them.
>>>>
>>>> The attached patch moves the transformation from gimple-fold.c
>>>> into the gimple-ssa-sprintf.c pass, thus allowing the calls to
>>>> be validated. Only valid calls (i.e., those that do not overflow
>>>> the destination) are transformed.
>>>>
>>>> Martin
>>>>
>>>> gcc-77671.diff
>>>>
>>>>
>>>> commit 2cd98763984ffb93606ee96ad658733b4c95c1e4
>>>> Author: Martin Sebor<msebor@redhat.com>
>>>> Date: Wed Apr 12 18:36:26 2017 -0600
>>>>
>>>> PR middle-end/77671 - missing -Wformat-overflow warning on
>>>> sprintf overflow
>>>> with %s
>>>> gcc/ChangeLog:
>>>> PR middle-end/77671
>>>> * gimple-fold.c (gimple_fold_builtin_sprintf): Make extern.
>>>> (gimple_fold_builtin_snprintf): Same.
>>>> * gimple-fold.h (gimple_fold_builtin_sprintf): Declare.
>>>> (gimple_fold_builtin_snprintf): Same.
>>>> * gimple-ssa-sprintf.c (get_format_string): Correct the
>>>> detection
>>>> of character types.
>>>> (is_call_safe): New function.
>>>> (try_substitute_return_value): Call it.
>>>> (try_simplify_call): New function.
>>>> (pass_sprintf_length::handle_gimple_call): Call it.
>>>> gcc/testsuite/ChangeLog:
>>>> PR middle-end/77671
>>>> * gcc.dg/tree-ssa/builtin-sprintf-7.c: New test.
>>>> * gcc.dg/tree-ssa/builtin-sprintf-8.c: New test.
>>>> * gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Adjust.
>>>> * gcc.dg/tree-ssa/builtin-sprintf-warn-2.c: Adjust.
>>>> * gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Adjust.
>>>
>>> I assume this went through the usual regression testing cycle? I'm a
>>> bit surprised nothing failed due to moving the transformation later in
>>> the pipeline.
>>
>>
>> It did. There was one regression I neglected to mention. A test
>> exercising -fexec-charset (bug 20110) fails because the sprintf
>> pass assumes the execution character set is the same as the host
>> character set. With -fexec-charset set to EBCDIC it gets just
>> as confused as the -Wformat warning does (this is subject of
>> the still unresolved bug 20110). I've opened bug 80523 for
>> the new problem and I'm testing a patch that handles it.
>>
>>
>>>> diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
>>>> index 2474391..07d6897 100644
>>>> --- a/gcc/gimple-ssa-sprintf.c
>>>> +++ b/gcc/gimple-ssa-sprintf.c
>>>> @@ -373,7 +373,16 @@ get_format_string (tree format, location_t *ploc)
>>>> if (TREE_CODE (format) != STRING_CST)
>>>> return NULL;
>>>> - if (TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (format))) !=
>>>> char_type_node)
>>>> + tree type = TREE_TYPE (format);
>>>> + if (TREE_CODE (type) == ARRAY_TYPE)
>>>> + type = TREE_TYPE (type);
>>>> +
>>>> + /* Can't just test that:
>>>> + TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (format))) !=
>>>> char_type_node
>>>> + See bug 79062. */
>>>> + if (TREE_CODE (type) != INTEGER_TYPE
>>>> + || TYPE_MODE (type) != TYPE_MODE (char_type_node)
>>>> + || TYPE_PRECISION (type) != TYPE_PRECISION (char_type_node))
>>>> {
>>>> /* Wide format string. */
>>>> return NULL;
>>>
>>> In the referenced BZ Richi mentions how fold-const.c checks for this
>>> case and also hints that you might want t look at tree-ssa-strlen as
>>> well. That seems wise. It also seems wise to factor that check and use
>>> it from all the identified locations. I don't like that this uses a
>>> different check than what's in fold-const.c.
>>
>>
>> I think what I did comes from tree-ssa-strlen.c (Richi's other
>> suggestion in bug 79062). In either case I don't fully understand
>> why the existing code doesn't work.
>>
>>> It's also not clear if this change is a requirement to address 77671 or
>>> not. If so ISTM that we fix 79062 first, then 77671. If not we can fix
>>> them independently. In both cases the fix for 79062 can be broken out
>>> into its own patch.
>>
>>
>> I suppose that makes sense. The hunk above doesn't fully fix
>> 79062. It just lets the sprintf return value optimization take
>> place with -flto.
>>
>>>> @@ -3278,31 +3287,21 @@ get_destination_size (tree dest)
>>>> return HOST_WIDE_INT_M1U;
>>>> }
>>>> -/* Given a suitable result RES of a call to a formatted output
>>>> function
>>>> - described by INFO, substitute the result for the return value of
>>>> - the call. The result is suitable if the number of bytes it
>>>> represents
>>>> - is known and exact. A result that isn't suitable for substitution
>>>> may
>>>> - have its range set to the range of return values, if that is known.
>>>> - Return true if the call is removed and gsi_next should not be
>>>> performed
>>>> - in the caller. */
>>>> -
>>>> static bool
>>>> -try_substitute_return_value (gimple_stmt_iterator *gsi,
>>>> - const pass_sprintf_length::call_info &info,
>>>> - const format_result &res)
>>>> +is_call_safe (const pass_sprintf_length::call_info &info,
>>>> + const format_result &res, bool under4k,
>>>> + unsigned HOST_WIDE_INT retval[2])
>>>
>>> You need a function comment for is_call_safe.
>>>
>>>
>>>
>>>> @@ -3423,6 +3458,30 @@ try_substitute_return_value
>>>> (gimple_stmt_iterator *gsi,
>>>> return removed;
>>>> }
>>>> +static bool
>>>> +try_simplify_call (gimple_stmt_iterator *gsi,
>>>> + const pass_sprintf_length::call_info &info,
>>>> + const format_result &res)
>>>
>>> Needs a function comment.
>>>
>>>
>>>
>>> So there's the minor nits to add function comments. The big question
>>> is the stuff for 79062 and a confirmation that this went through a full
>>> regression test cycle.
>>
>>
>> I think I may have included the (partial) the fix for 79062 to
>> get some tests to pass but I'm not 100% sure.
>>
>> Let me first submit the fix for the -fexec-charset limitation
>> (bug 80523), see if I can separate out the partial fix for 79062,
>> and then resubmit this patch.
>
> OK. I think 80523 is pretty close. I mentioned one implementation detail
> (initialize translation table once per file rather than once per function)
> and one question about what happens for long translated strings. I suspect
> you'll have that one good-to-go shortly.
>
>>
>> FWIW, my fix for bug 79062 is only partial (it gets the pass
>> to run but the warnings are still not issued). I don't quite
>> understand what prevents the warning flag(s) from getting set
>> when -flto is used. This seems to be a bigger problem than
>> just the sprintf pass not doing something just right.
>
> I've never dug deeply in the LTO stuff, but I believe we stream the compiler
> flags, so it could be something there.
We do.
> Alternately you might be running into a case where in LTO mode we recreate
> base types. Look for a type equality tester that goes beyond just testing
> pointer equality.
>
> ie, in LTO I think we'll create a type based on the streamed data, but I
> also think we'll create various basic types. Thus in LTO mode pointer
> equality may not be sufficient.
We make sure that for most basic types we end up re-using them where possible.
char_type_node is an example where that generally doesn't work because it's
value depends on a command-line flag.
Richard.
> jeff
next prev parent reply other threads:[~2017-05-02 10:25 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-22 0:51 Martin Sebor
2017-04-25 22:27 ` Jeff Law
2017-04-26 7:59 ` Martin Sebor
2017-04-28 6:07 ` Martin Sebor
2017-04-28 17:06 ` Jeff Law
2017-04-28 16:48 ` Jeff Law
2017-05-02 10:27 ` Richard Biener [this message]
2017-05-02 14:44 ` Martin Sebor
2017-05-03 10:21 ` Richard Biener
2017-05-03 14:48 ` Martin Sebor
2017-05-12 18:44 ` Jeff Law
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAFiYyc37fsOQ4t=OXnn9R-UCaiaEEbR4HGoOT1E3YqR8KO6NFg@mail.gmail.com' \
--to=richard.guenther@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=law@redhat.com \
--cc=msebor@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).