From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by sourceware.org (Postfix) with ESMTP id 847AD3958C12 for ; Thu, 25 Feb 2021 00:35:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 847AD3958C12 Received: from mail-qt1-f197.google.com (mail-qt1-f197.google.com [209.85.160.197]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-446-umr5P6R2OM-yeZz76Mbaig-1; Wed, 24 Feb 2021 19:35:03 -0500 X-MC-Unique: umr5P6R2OM-yeZz76Mbaig-1 Received: by mail-qt1-f197.google.com with SMTP id a41so2973391qtk.0 for ; Wed, 24 Feb 2021 16:35:03 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=PH3vaSpTRF6P57GXe1n7ymsSbH2bYuyXtxKfZIAQMw4=; b=LQMZ0XdOHPD12cGYu5r21NxSBpRwrPVUb/XfpZ28UFsoD9fzVI/7OKXqa5LIYeqTuz fze12IAqZ0Pajgu9gRLkVebCKQxK0ZUt6qSbecAaYTxSaEtNFFIHgfnv9bQTnbRyogsc QtLEJG6pDgs+uI4DMOh9YESakcpP5b/zyn4/4SdoT+SUWMeZruVsQUZklMBmW1BtgSWM 5dbWnhwRbPltvTtY0s85R38VZ9Xq0HYRHgI9+5RA9LqQPXX45+bOQpufZ0RCt/xaj3fU m7Zsn4Lr2cS33QDsxSWxI0qC16mKpxOqSQrwU1uMIjnUzGVSh550ycC6G4oWS+11J6n4 dc7g== X-Gm-Message-State: AOAM532oP+0cqxZdVNU7IQZyWtP5qj/5jB3irkTNXASxncefOukCMdOo mC6nbt6RttS30Mh9QWJwiojsobxJU4gO56lf+5nDLC8qrwroo6qK+DFCtx0JWZdS/yKNpQXkNJu k/LTpIhD3HlkB8s7oIblJ0BYLtx8yf335YZliRYHanF2W0gz7LR30aRtIeyhSdXLtaQ== X-Received: by 2002:a37:c92:: with SMTP id 140mr593901qkm.177.1614213301904; Wed, 24 Feb 2021 16:35:01 -0800 (PST) X-Google-Smtp-Source: ABdhPJzqne4CNavegxhR4ap+5Pl5prjtrMidSFxW7TMrN3YfCmr9olUiCjjQLzSjgRHXAuMcaPderw== X-Received: by 2002:a37:c92:: with SMTP id 140mr593878qkm.177.1614213301526; Wed, 24 Feb 2021 16:35:01 -0800 (PST) Received: from [192.168.1.148] (209-6-216-142.s141.c3-0.smr-cbr1.sbo-smr.ma.cable.rcncustomer.com. [209.6.216.142]) by smtp.gmail.com with ESMTPSA id k26sm2727762qkj.131.2021.02.24.16.35.00 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 24 Feb 2021 16:35:00 -0800 (PST) Subject: Re: PING [PATCH] avoid -Warray-bounds checks for vtable assignments (PR 98266) To: Martin Sebor , Jeff Law , gcc-patches References: <7fa3c1e0-8a09-3436-8669-6d1a577ca247@gmail.com> <55235bab-1a5a-4e3b-3a57-55d9a84ae00c@redhat.com> <042e2b8f-3031-519d-4a2f-1cb33e2004a7@gmail.com> <354bf06f-ad62-7966-a5d5-e4931ed1d5ed@redhat.com> <355d1e90-bba2-ad97-3544-c8fda473cfa0@gmail.com> <571fbfd1-d0f1-c1ca-1eef-4e804d7f8088@redhat.com> <8067ec57-5bb5-87a3-6a72-7ab43004248c@gmail.com> <3ced50a4-06b9-f83b-da73-9c4842944714@redhat.com> From: Jason Merrill Message-ID: Date: Wed, 24 Feb 2021 19:35:00 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.6.1 MIME-Version: 1.0 In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-8.4 required=5.0 tests=BAYES_00, BODY_8BITS, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, NICE_REPLY_A, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Feb 2021 00:35:07 -0000 On 2/23/21 6:07 PM, Martin Sebor wrote: > On 2/23/21 2:52 PM, Jason Merrill wrote: >> On 2/23/21 11:02 AM, Martin Sebor wrote: >>> [CC Jason for any further comments/clarification] >>> >>> On 2/9/21 10:49 AM, Martin Sebor wrote: >>>> On 2/8/21 4:11 PM, Jeff Law wrote: >>>>> >>>>> >>>>> On 2/8/21 3:44 PM, Martin Sebor wrote: >>>>>> On 2/8/21 3:26 PM, Jeff Law wrote: >>>>>>> >>>>>>> >>>>>>> On 2/8/21 2:56 PM, Martin Sebor wrote: >>>>>>>> On 2/8/21 12:59 PM, Jeff Law wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 1/19/21 5:56 PM, Martin Sebor via Gcc-patches wrote: >>>>>>>>>> Similar to the problem reported for -Wstringop-overflow in >>>>>>>>>> pr98266 >>>>>>>>>> and already fixed, -Warray-bounds is also susceptible to false >>>>>>>>>> positives in assignments and copies involving virtual >>>>>>>>>> inheritance. >>>>>>>>>> Because the two warnings don't share code yet (hopefully in >>>>>>>>>> GCC 12) >>>>>>>>>> the attached patch adds its own workaround for this problem to >>>>>>>>>> gimple-array-bounds.cc, this one slightly more crude because of >>>>>>>>>> the limited insight the array bounds checking has into the >>>>>>>>>> checked >>>>>>>>>> expressions. >>>>>>>>>> >>>>>>>>>> Tested on x86_64-linux. >>>>>>>>>> >>>>>>>>>> Martin >>>>>>>>>> >>>>>>>>>> gcc-98266.diff >>>>>>>>>> >>>>>>>>>> PR middle-end/98266 - bogus array subscript is partly outside >>>>>>>>>> array >>>>>>>>>> bounds on virtual inheritance >>>>>>>>>> >>>>>>>>>> gcc/ChangeLog: >>>>>>>>>> >>>>>>>>>>       PR middle-end/98266 >>>>>>>>>>       * gimple-array-bounds.cc >>>>>>>>>> (array_bounds_checker::check_array_bounds): >>>>>>>>>>       Avoid checking references involving artificial members. >>>>>>>>>> >>>>>>>>>> gcc/testsuite/ChangeLog: >>>>>>>>>> >>>>>>>>>>       PR middle-end/98266 >>>>>>>>>>       * g++.dg/warn/Warray-bounds-15.C: New test. >>>>>>>>> It seems to me that we've got the full statement at some point >>>>>>>>> and >>>>>>>>> thus >>>>>>>>> the full expression so at some point couldn't we detect when >>>>>>>>> TYPE_SIZE_UNIT!= DECL_SIZE_UNIT?  Or should we be using >>>>>>>>> TYPE_SIZE_UNIT >>>>>>>>> rather than DECL_SIZE_UNIT in gimple-array-bounds.cc >>>>>>>>> >>>>>>>>> Am I missing something? >>>>>>>> >>>>>>>> The expression we're looking at when the false positive is issued >>>>>>>> is the MEM_REF in the LHS of: >>>>>>>> >>>>>>>> MEM[(struct D *)&D.2652 + 24B]._vptr.D = &MEM [(void >>>>>>>> *)&_ZTC1E24_1D + 24B]; >>>>>>>> >>>>>>>> TREE_TYPE(LHS) is D, DECL_SIZE_UNIT (D.2652) is 24, and >>>>>>>> TYPE_SIZE_UNIT(D) is also 24, so there's no discrepancy between >>>>>>>> DECL_SIZE and TYPE_SIZE. >>>>>>> So that seems like it's a different issue then, unrelated to 97595. >>>>>>> Right? >>>>>> >>>>>> I think the underlying problem is the same.  We're getting a size >>>>>> that doesn't correspond to what's actually being accessed, and it >>>>>> happens because of the virtual inheritance.  In pr97595 Jason >>>>>> suggested to use the decl/type size inequality to identify this >>>>>> case but I think we could have just as well used DECL_ARTIFICIAL >>>>>> instead.  At least the test cases from pr97595 both pass with >>>>>> this change. >>>>> But in the 98266 case the type and decl sizes are the same.  So to be >>>>> true that would mean that the underlying type we're using to access >>>>> memory differs from its actual type.  Is that the case in the IL?  And >>>>> does this have wider implications for diagnostics or optimizations >>>>> that >>>>> rely on accurate type sizing? >>>>> >>>>> I'm just trying to make sure I understand, not accepting or rejecting >>>>> the patch yet. >>>> >>>> The part of the IL with the MEM_REF is this: >>>> >>>> void g () >>>> { >>>>    void * D.2789; >>>>    struct E D.2652; >>>> >>>>    [local count: 1073741824]: >>>>    E::E (&D.2652, ""); >>>>    f (&D.2652); >>>> >>>>    [local count: 1073741824]: >>>>    MEM[(struct D *)&D.2652 + 24B]._vptr.D = &MEM [(void >>>> *)&_ZTC1E24_1D + 24B]; >>>>    ... >>>> >>>> The access here is to the _vptr.D pointer member of D.2652 which is >>>> just past the end of the parent object (as reflected by its SIZE): >>>> it sets sets up the virtual table pointer. >>>> >>>> The access in pr97595 is to the member subobject, which, as Jason >>>> explained (and I accordingly documented under DECL_SIZE in tree.h), >>>> is also laid out separately from the parent object. >>>> >>>> These cases aren't exactly the same (which is also why the test >>>> I added for -Warray-bounds in pr97595 didn't expose this bug) but >>>> they are closely related.  The one here can be distinguished by >>>> DECL_ARTIFICAL.  The other by the DECL_SIZE != TYPE_SIZE member >>>> inequality. >>>> >>>> Might this impact other warnings?  I'd say so if they don't take >>>> these things into account.  I just learned about this in pr97595 >>>> which was a -Wstringop-overflow false positive but I also saw >>>> a similar instance of -Warray-bounds with my patch to improve >>>> caching and enhance array bounds checking.  I dealt with that >>>> instance of the warning in that patch but proactively added >>>> a test case to the fix for pr97595.  But the test case is focused >>>> on the subobject access and not on one to the virtual table so >>>> (as I said above) it didn't expose this bug. >>>> >>>> Might this also impact optimizations?  I can imagine someone >>>> unaware of this "gotcha" making the same "naive" assumption >>>> I did, but I'd also expect such an invalid assumption to be >>>> found either in code review or quickly cause problems in >>>> testing. >>> >>> Jeff, does this answer your question? >> >> I don't see how the issue here depends on the artificiality of the >> vptr; I'd expect to see the same problem with a data member.  The >> problem is that a D base subobject is smaller than a complete D >> object, and in this case the base subobject is allocated such that if >> it were a full D object, it would overlap the end of E.  And we're >> checking the MEM_REF as though accessing a full D object, so we get a >> warning. > > Thanks for chiming in! > > There is no warning for accesses to data members defined in > the virtual bases because their offset isn't constant.  For example, > given the following modified version of the test case from the patch: > > struct A > { >   virtual ~A (); >   int a; > }; > > struct B: virtual A { int b; }; > struct C: virtual A { int c; }; > struct D: virtual B, virtual C > { >   int d; > }; > > D d; > > void g () > { >   D *p = &d; >   p->a = p->b = p->c = p->d = 1; > } > > we end up with: > > void g () > { >   ... >   [local count: 1073741824]: >   d.d = 1;                           <<< okay >   _1 = d._vptr.D;                    <<< DECL_ARTIFICIAL >   _2 = MEM[(long int *)_1 + -40B];   <<< not checked >   _3 = (sizetype) _2;                <<< not constant... >   _4 = &d + _3; >   MEM[(struct C *)_4].c = 1;         <<< ...no warning Interesting. I do get a warning if I embed the access in the destructor: struct A { virtual ~A (); int a; }; struct B: virtual A { }; struct C: virtual A { int c; ~C() { c = 0; } // warns twice, for _vptr and c accesses }; struct D: virtual B, virtual C { D(); }; void g() { D d; } >   _5 = MEM[(long int *)_1 + -24B]; >   _6 = (sizetype) _5; >   _7 = &d + _6; >   MEM[(struct B *)_7].b = 1; >   _8 = MEM[(long int *)_1 + -32B]; >   _9 = (sizetype) _8; >   _10 = &d + _9; >   MEM[(struct A *)_10].a = 1; >   return; > > } > >> The general issue about the confusion between complete and as-base >> types is PR 22488; I have various thoughts about a proper fix, but >> there have always been higher-priority things to work on. > > Thanks for the reference! > >> One possible approach to this PR is that we don't need to check an >> intermediate _REF; in >> >> MEM[(struct D *)&D.2651 + 24B]._vptr.D >> >> we aren't loading a whole D, we're only looking at the _vptr, which is >> within the bounds of E. > > This is also what the patch does: it keeps us from reaching > the MEM_REF because _vptr.D is artificial. My point applies equally to the access to c in my testcase above: we shouldn't check lvalue refs as though we're loading or storing the whole aggregate object. The full access is (*((D*)&e))._vptr.D. The MEM_REF represents the *, which we shouldn't check by itself, because it doesn't imply a memory access of its own. We should only check the COMPONENT_REF around it that is an actual memory access. > -Warray-bounds still of course checks other MEM_REFs in order to > detect bugs like: > > struct A { int i, j; }; > > A* f () > { >   A *p = (A*)new char[4]; >   p->i = 1; >   p->j = 2; >   return p; > } Of course.