Re: [PATCH] handle MEM_REF with void* arguments (PR c++/95768)

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Martin Sebor <msebor@gmail.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: Jason Merrill <jason@redhat.com>, gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] handle MEM_REF with void* arguments (PR c++/95768)
Date: Sun, 28 Jun 2020 17:07:59 -0600	[thread overview]
Message-ID: <5b4805a5-1949-267c-dc40-6f084349a68b@gmail.com> (raw)
In-Reply-To: <CAFiYyc1oKKEiZbvpdQaBPuMivJt_g+D25CX4NokUUT5JxUhDNg@mail.gmail.com>

On 6/23/20 1:12 AM, Richard Biener wrote:
> On Tue, Jun 23, 2020 at 12:22 AM Martin Sebor via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> On 6/22/20 12:55 PM, Jason Merrill wrote:
>>> On 6/22/20 1:25 PM, Martin Sebor wrote:
>>>> The attached fix parallels the one for the equivalent C bug 95580
>>>> where the pretty printers don't correctly handle MEM_REF arguments
>>>> with type void* or other pointers to an incomplete type.
>>>>
>>>> The incorrect handling was exposed by the recent change to
>>>> -Wuninitialized which includes such expressions in diagnostics.
>>>
>>>> +        if (tree size = TYPE_SIZE_UNIT (TREE_TYPE (argtype)))
>>>> +          if (!integer_onep (size))
>>>> +            {
>>>> +              pp_cxx_left_paren (pp);
>>>> +              dump_type (pp, ptr_type_node, flags);
>>>> +              pp_cxx_right_paren (pp);
>>>> +            }
>>>
>>> Don't we want to print the cast if the pointer target type is incomplete?
>>
>> I suppose, yes, although after some more testing I think what should
>> be output is the type of the access.  The target pointer type isn't
>> meaningful (at least not in this case).
>>
>> Here's what the warning looks like in C for the test case in
>> gcc.dg/pr95580.c:
>>
>>     warning: ‘*((void *)(p)+1)’ may be used uninitialized
>>
>> and like this in C++:
>>
>>     warning: ‘*(p +1)’ may be used uninitialized
>>
>> The +1 is a byte offset, which is correct given that incrementing
>> a void* in GCC is the same as adding 1 to the byte address, but
>> dereferencing a void* doesn't correspond to what's going on in
>> the source.
>>
>> Even for a complete type (with size greater than 1), printing
>> the type of the argument plus a byte offset is wrong.  It ends
>> up with this for the C++ test case from 95768:
>>
>>     warning: ‘*((int*)<unknown> +4)’ is used uninitialized
>>
>> when the access is actually ‘*((int*)<unknown> +1)’
>>
>> So it seems to me for MEM_REF, to make the output meaningful,
>> it's the type of the access (i.e., the MEM_REF type) that should
>> be printed here, and the offset should either be in elements of
>> the accessed type, i.e.,
>>
>>     warning: ‘*((int*)<unknown> +1)’ is used uninitialized
>>
>> or, if the access is misaligned, the argument should first be
>> cast to char*, the offset added, and the result then cast to
>> the access type, like this:
>>
>>     warning: ‘*(T*)((char*)<unknown> +1)’ is used uninitialized
>>
>> The attached revised and less than fully tested patch implements
>> this for C++ only for now.  If we agree on this approach I'll see
>> about making the corresponding change in C.
> 
> Note that there is no C/C++ way of fully expressing MEM_REF
> semantics.  __MEM <int> ((T *)p + 1) is not actually
> *(int *)((char *)p + 1) because that does not reflect that the
> effective type of the lvalue when TBAA is concerned is 'T'
> rather than 'int'.

What form would you say is closest to the C/C++ semantics, or
likely the most useful to users, that GCC could print instead?

> Note for MEM_REF the offset is always
> a constant byte offset but it indeed does not have to be a
> multiple of the MEM_REF type size.
> 
> I wonder whether printing the MEM_REF in full provides
> any real diagnostic value in the more "obfuscated" cases.

I'm not sure what obfuscated cases you're thinking of, or what
you mean by printing it in full.  I instrumented the code to
print every MEM_REF in that comes up in warn_uninitialized_vars
and rebuilt GCC.  There are 17,456 distinct instances so I didn't
review them all but those I did look at all look reasonable.
Probably the least useful are those that mention <unknown> by
itself (i.e., <unknown> or *<unknown>).  Those with an offset
are more informative (e.g., *((access**)<unknown> +1).  In
a few the offset is very large, such as *((unsigned int*)sp
+4611686018427387900), but that doesn't seem like a problem.
I'd be happy to share the result.

> 
> I'd also not print <unknown> but <register>.

I also don't find <unknown> helpful, but I don't see <register>
as an improvement.  I think printing the SSA variable would be
more informative here since its name is usually related to
the variable it was derived from in the source.  But making that
change (or any other like it) feels like too much feature creep
for this fix.  I'd be happy to do it in a follow up if we agree
it's a good idea.

Either way, please let me know if the patch is okay as is or,
if not, what type it should mention.

Martin

next prev parent reply	other threads:[~2020-06-28 23:08 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-22 17:25 Martin Sebor
2020-06-22 18:55 ` Jason Merrill
2020-06-22 22:22   ` Martin Sebor
2020-06-23  7:12     ` Richard Biener
2020-06-28 23:07       ` Martin Sebor [this message]
2020-06-29  7:19         ` Richard Biener
2020-07-09 15:50           ` Martin Sebor
2021-01-02 22:22             ` [PATCH v3] " Martin Sebor
2021-01-05 23:17               ` Jeff Law
2021-01-07  8:26               ` Jakub Jelinek
2021-01-07 21:36                 ` Martin Sebor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5b4805a5-1949-267c-dc40-6f084349a68b@gmail.com \
    --to=msebor@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jason@redhat.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).