public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Martin Sebor <msebor@gmail.com>
To: Jason Merrill <jason@redhat.com>, Jeff Law <law@redhat.com>,
	gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: PING [PATCH] avoid -Warray-bounds checks for vtable assignments (PR 98266)
Date: Tue, 23 Feb 2021 16:07:57 -0700	[thread overview]
Message-ID: <a9fdac77-0c02-c8fd-64ad-14fadced7afb@gmail.com> (raw)
In-Reply-To: <3ced50a4-06b9-f83b-da73-9c4842944714@redhat.com>

On 2/23/21 2:52 PM, Jason Merrill wrote:
> On 2/23/21 11:02 AM, Martin Sebor wrote:
>> [CC Jason for any further comments/clarification]
>>
>> On 2/9/21 10:49 AM, Martin Sebor wrote:
>>> On 2/8/21 4:11 PM, Jeff Law wrote:
>>>>
>>>>
>>>> On 2/8/21 3:44 PM, Martin Sebor wrote:
>>>>> On 2/8/21 3:26 PM, Jeff Law wrote:
>>>>>>
>>>>>>
>>>>>> On 2/8/21 2:56 PM, Martin Sebor wrote:
>>>>>>> On 2/8/21 12:59 PM, Jeff Law wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 1/19/21 5:56 PM, Martin Sebor via Gcc-patches wrote:
>>>>>>>>> Similar to the problem reported for -Wstringop-overflow in pr98266
>>>>>>>>> and already fixed, -Warray-bounds is also susceptible to false
>>>>>>>>> positives in assignments and copies involving virtual inheritance.
>>>>>>>>> Because the two warnings don't share code yet (hopefully in GCC 
>>>>>>>>> 12)
>>>>>>>>> the attached patch adds its own workaround for this problem to
>>>>>>>>> gimple-array-bounds.cc, this one slightly more crude because of
>>>>>>>>> the limited insight the array bounds checking has into the checked
>>>>>>>>> expressions.
>>>>>>>>>
>>>>>>>>> Tested on x86_64-linux.
>>>>>>>>>
>>>>>>>>> Martin
>>>>>>>>>
>>>>>>>>> gcc-98266.diff
>>>>>>>>>
>>>>>>>>> PR middle-end/98266 - bogus array subscript is partly outside 
>>>>>>>>> array
>>>>>>>>> bounds on virtual inheritance
>>>>>>>>>
>>>>>>>>> gcc/ChangeLog:
>>>>>>>>>
>>>>>>>>>       PR middle-end/98266
>>>>>>>>>       * gimple-array-bounds.cc
>>>>>>>>> (array_bounds_checker::check_array_bounds):
>>>>>>>>>       Avoid checking references involving artificial members.
>>>>>>>>>
>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>
>>>>>>>>>       PR middle-end/98266
>>>>>>>>>       * g++.dg/warn/Warray-bounds-15.C: New test.
>>>>>>>> It seems to me that we've got the full statement at some point  and
>>>>>>>> thus
>>>>>>>> the full expression so at some point couldn't we detect when
>>>>>>>> TYPE_SIZE_UNIT!= DECL_SIZE_UNIT?  Or should we be using 
>>>>>>>> TYPE_SIZE_UNIT
>>>>>>>> rather than DECL_SIZE_UNIT in gimple-array-bounds.cc
>>>>>>>>
>>>>>>>> Am I missing something?
>>>>>>>
>>>>>>> The expression we're looking at when the false positive is issued
>>>>>>> is the MEM_REF in the LHS of:
>>>>>>>
>>>>>>> MEM[(struct D *)&D.2652 + 24B]._vptr.D = &MEM <int (*) ()> [(void
>>>>>>> *)&_ZTC1E24_1D + 24B];
>>>>>>>
>>>>>>> TREE_TYPE(LHS) is D, DECL_SIZE_UNIT (D.2652) is 24, and
>>>>>>> TYPE_SIZE_UNIT(D) is also 24, so there's no discrepancy between
>>>>>>> DECL_SIZE and TYPE_SIZE.
>>>>>> So that seems like it's a different issue then, unrelated to 97595.
>>>>>> Right?
>>>>>
>>>>> I think the underlying problem is the same.  We're getting a size
>>>>> that doesn't correspond to what's actually being accessed, and it
>>>>> happens because of the virtual inheritance.  In pr97595 Jason
>>>>> suggested to use the decl/type size inequality to identify this
>>>>> case but I think we could have just as well used DECL_ARTIFICIAL
>>>>> instead.  At least the test cases from pr97595 both pass with
>>>>> this change.
>>>> But in the 98266 case the type and decl sizes are the same.  So to be
>>>> true that would mean that the underlying type we're using to access
>>>> memory differs from its actual type.  Is that the case in the IL?  And
>>>> does this have wider implications for diagnostics or optimizations that
>>>> rely on accurate type sizing?
>>>>
>>>> I'm just trying to make sure I understand, not accepting or rejecting
>>>> the patch yet.
>>>
>>> The part of the IL with the MEM_REF is this:
>>>
>>> void g ()
>>> {
>>>    void * D.2789;
>>>    struct E D.2652;
>>>
>>>    <bb 2> [local count: 1073741824]:
>>>    E::E (&D.2652, "");
>>>    f (&D.2652);
>>>
>>>    <bb 3> [local count: 1073741824]:
>>>    MEM[(struct D *)&D.2652 + 24B]._vptr.D = &MEM <int (*) ()> [(void 
>>> *)&_ZTC1E24_1D + 24B];
>>>    ...
>>>
>>> The access here is to the _vptr.D pointer member of D.2652 which is
>>> just past the end of the parent object (as reflected by its SIZE):
>>> it sets sets up the virtual table pointer.
>>>
>>> The access in pr97595 is to the member subobject, which, as Jason
>>> explained (and I accordingly documented under DECL_SIZE in tree.h),
>>> is also laid out separately from the parent object.
>>>
>>> These cases aren't exactly the same (which is also why the test
>>> I added for -Warray-bounds in pr97595 didn't expose this bug) but
>>> they are closely related.  The one here can be distinguished by
>>> DECL_ARTIFICAL.  The other by the DECL_SIZE != TYPE_SIZE member
>>> inequality.
>>>
>>> Might this impact other warnings?  I'd say so if they don't take
>>> these things into account.  I just learned about this in pr97595
>>> which was a -Wstringop-overflow false positive but I also saw
>>> a similar instance of -Warray-bounds with my patch to improve
>>> caching and enhance array bounds checking.  I dealt with that
>>> instance of the warning in that patch but proactively added
>>> a test case to the fix for pr97595.  But the test case is focused
>>> on the subobject access and not on one to the virtual table so
>>> (as I said above) it didn't expose this bug.
>>>
>>> Might this also impact optimizations?  I can imagine someone
>>> unaware of this "gotcha" making the same "naive" assumption
>>> I did, but I'd also expect such an invalid assumption to be
>>> found either in code review or quickly cause problems in
>>> testing.
>>
>> Jeff, does this answer your question?
> 
> I don't see how the issue here depends on the artificiality of the vptr; 
> I'd expect to see the same problem with a data member.  The problem is 
> that a D base subobject is smaller than a complete D object, and in this 
> case the base subobject is allocated such that if it were a full D 
> object, it would overlap the end of E.  And we're checking the MEM_REF 
> as though accessing a full D object, so we get a warning.

Thanks for chiming in!

There is no warning for accesses to data members defined in
the virtual bases because their offset isn't constant.  For example,
given the following modified version of the test case from the patch:

struct A
{
   virtual ~A ();
   int a;
};

struct B: virtual A { int b; };
struct C: virtual A { int c; };
struct D: virtual B, virtual C
{
   int d;
};

D d;

void g ()
{
   D *p = &d;
   p->a = p->b = p->c = p->d = 1;
}

we end up with:

void g ()
{
   ...
   <bb 2> [local count: 1073741824]:
   d.d = 1;                           <<< okay
   _1 = d._vptr.D;                    <<< DECL_ARTIFICIAL
   _2 = MEM[(long int *)_1 + -40B];   <<< not checked
   _3 = (sizetype) _2;                <<< not constant...
   _4 = &d + _3;
   MEM[(struct C *)_4].c = 1;         <<< ...no warning
   _5 = MEM[(long int *)_1 + -24B];
   _6 = (sizetype) _5;
   _7 = &d + _6;
   MEM[(struct B *)_7].b = 1;
   _8 = MEM[(long int *)_1 + -32B];
   _9 = (sizetype) _8;
   _10 = &d + _9;
   MEM[(struct A *)_10].a = 1;
   return;

}

> 
> The general issue about the confusion between complete and as-base types 
> is PR 22488; I have various thoughts about a proper fix, but there have 
> always been higher-priority things to work on.

Thanks for the reference!

> 
> One possible approach to this PR is that we don't need to check an 
> intermediate _REF; in
> 
> MEM[(struct D *)&D.2651 + 24B]._vptr.D
> 
> we aren't loading a whole D, we're only looking at the _vptr, which is 
> within the bounds of E.

This is also what the patch does: it keeps us from reaching
the MEM_REF because _vptr.D is artificial.

-Warray-bounds still of course checks other MEM_REFs in order to
detect bugs like:

struct A { int i, j; };

A* f ()
{
   A *p = (A*)new char[4];
   p->i = 1;
   p->j = 2;
   return p;
}

Martin

> 
> Jason
> 


  reply	other threads:[~2021-02-23 23:07 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-20  0:56 Martin Sebor
2021-01-29 17:22 ` PING " Martin Sebor
2021-02-06 17:12   ` PING 2 " Martin Sebor
2021-02-08 19:59 ` Jeff Law
2021-02-08 21:56   ` Martin Sebor
2021-02-08 22:26     ` Jeff Law
2021-02-08 22:44       ` Martin Sebor
2021-02-08 23:11         ` Jeff Law
2021-02-09 17:49           ` Martin Sebor
2021-02-23 16:02             ` PING " Martin Sebor
2021-02-23 21:52               ` Jason Merrill
2021-02-23 23:07                 ` Martin Sebor [this message]
2021-02-25  0:35                   ` Jason Merrill
2021-03-01 23:11                     ` Martin Sebor
2021-03-02 14:11                       ` Jason Merrill
2021-03-03 23:20                         ` Martin Sebor
2021-03-04  5:33                           ` Jason Merrill
2021-03-04 17:44                             ` Martin Sebor
2021-03-04 20:00                               ` Jason Merrill
2021-02-25 23:47                 ` Jeff Law
2021-02-26  3:56                   ` Jason Merrill
2021-02-25 23:40         ` Jeff Law
2021-03-01 23:16           ` Martin Sebor
2021-02-25 23:15     ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a9fdac77-0c02-c8fd-64ad-14fadced7afb@gmail.com \
    --to=msebor@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jason@redhat.com \
    --cc=law@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).