From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 150BA3858D39 for ; Mon, 1 Aug 2022 10:38:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 150BA3858D39 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 46952139F; Mon, 1 Aug 2022 03:38:34 -0700 (PDT) Received: from [10.2.79.48] (unknown [10.2.79.48]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 2D4B03F67D; Mon, 1 Aug 2022 03:38:33 -0700 (PDT) Message-ID: <5df3ffd0-9a15-f4ad-325b-1507107a3504@foss.arm.com> Date: Mon, 1 Aug 2022 11:38:31 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0 Subject: Re: [PATCH v2] cselib: add function to check if SET is redundant [PR106187] Content-Language: en-GB To: Jeff Law , gcc-patches@gcc.gnu.org References: <9e9b7ec5-4c10-115d-8b6d-a7a07bfd72be@gmail.com> From: Richard Earnshaw In-Reply-To: <9e9b7ec5-4c10-115d-8b6d-a7a07bfd72be@gmail.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3489.1 required=5.0 tests=BAYES_00, BODY_8BITS, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, NICE_REPLY_A, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Aug 2022 10:38:36 -0000 On 30/07/2022 20:57, Jeff Law via Gcc-patches wrote: > > > On 7/29/2022 7:52 AM, Richard Earnshaw via Gcc-patches wrote: >> A SET operation that writes memory may have the same value as an >> earlier store but if the alias sets of the new and earlier store do >> not conflict then the set is not truly redundant.  This can happen, >> for example, if objects of different types share a stack slot. >> >> To fix this we define a new function in cselib that first checks for >> equality and if that is successful then finds the earlier store in the >> value history and checks the alias sets. >> >> The routine is used in two places elsewhere in the compiler. Firstly >> in cfgcleanup and secondly in postreload. >> >> gcc/ChangeLog: >>     * alias.h (mems_same_for_tbaa_p): Declare. >>     * alias.cc (mems_same_for_tbaa_p): New function. >>     * dse.cc (record_store): Use it instead of open-coding >>     alias check. >>     * cselib.h (cselib_redundant_set_p): Declare. >>     * cselib.cc: Include alias.h >>     (cselib_redundant_set_p): New function. >>     * cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead >>     of rtx_equal_for_cselib_p. >>     * postreload.c (reload_cse_simplify): Use cselib_redundant_set_p. >>     (reload_cse_noop_set_p): Delete. > Seems quite reasonable.   The only question I would have would be > whether or not you considered including the aliasing info into the > hashing used by cselib.  You'd probably still need the bulk of this > patch as well since we could presumably still get a hash conflict with > two stores of the same value to the same location, but with different > alias sets (it's just much less likely), so perhaps it doesn't really > buy us anything. I thought about this, but if the alias set were included in the hash, then surely you'd get every alias set in a different value. Then you'd miss the cases where the alias sets do conflict even though they are not the same. Anyway, the values /are/ the same so in some circumstances you might want to know that. > > Ideally this would include a testcase.  You might be able to turn that > non-executawble reduced case into something useful by scanning the > post-reload dumps. I considered this as well, but the testcase I have is far too fragile, I think. The existing test only fails on Arm, only fails on 11.2 (not 11.3 or gcc-12 onwards), relies on two objects with the same value being in distinct alias sets but still assigned to the same stack slot and for some copy dance to end up trying to write back the original value to the same slot but with a non-conflicting set. And finally, the scheduler has to then try to move a load past the non-aliasing store. To get anywhere close to this I think we'd need something akin to the gimple reader but for RTL so that we could set up all the conditions for the failure without the risk of an earlier transform blowing the test away. I even considered whether we could start with a gimple dump and bypassing all the tree/gimple transformations, but even that would be still at the mercy of the stack-slot allocation algorithm. > > Jeff R.