From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com [IPv6:2607:f8b0:4864:20::633]) by sourceware.org (Postfix) with ESMTPS id 6085238582AC for ; Tue, 2 Aug 2022 23:36:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6085238582AC Received: by mail-pl1-x633.google.com with SMTP id o3so14849178ple.5 for ; Tue, 02 Aug 2022 16:36:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:references:from:in-reply-to :content-transfer-encoding; bh=rhdjH9uljos9zl3Nz9W+vJrFKZVncbjQqvascgKKb1I=; b=vbb1o8X9DjJjreYokoNSXqeQPi6QJdxBI4FFe41xvN/Ccu5n/uboqG60MYNs+AlAqL bIc6GQ6d+fRYMS65UZp1NrK4dP1EG1msWlFd0KCsPTDpN+jhzYTQN7+7hDG0l7oErhT0 aKTNVANq9eKAq05EbZnCe+U1y886VLB8wee/fslGhSmN1rMRh3vYgIJhOov0FenZzEwz OTsHEF7lKq0J7jnvgJw23xUdb4VN7hEh95TldlGvTkdpuWR8QK1QsQlCMhRVz4qK+xZY 3yaRVjPoyJeKVcFlc1N5bBIR8iBz76UZRR4lIKBcVLb27FVH0QQaGsL4HeTwmD9lxOvQ Dpwg== X-Gm-Message-State: ACgBeo00Nf7III8q1M6yaxF9moLltAt3zWu2x+FlCNEoXSeDlgNQaDpD NT9WefVNPEYukaJlz9JOB74bF3wRiDg= X-Google-Smtp-Source: AA6agR7TUHzz4Kh95DvMKMhcicK4vh/QWaT1/D+VSb3/Z/X4W6TfZGS+qNuagMnLHobKBUqG1/w84w== X-Received: by 2002:a17:902:e946:b0:16d:d16:e010 with SMTP id b6-20020a170902e94600b0016d0d16e010mr23487937pll.86.1659483391246; Tue, 02 Aug 2022 16:36:31 -0700 (PDT) Received: from [172.31.0.204] (c-73-98-188-51.hsd1.ut.comcast.net. [73.98.188.51]) by smtp.gmail.com with ESMTPSA id n13-20020a170903110d00b0016c33dc879esm260542plh.113.2022.08.02.16.36.30 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 02 Aug 2022 16:36:30 -0700 (PDT) Message-ID: <7a7a8864-49d9-b0f0-604b-ec5d98466200@gmail.com> Date: Tue, 2 Aug 2022 17:36:29 -0600 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0 Subject: Re: [PATCH v2] cselib: add function to check if SET is redundant [PR106187] Content-Language: en-US To: Richard Earnshaw , gcc-patches@gcc.gnu.org References: <9e9b7ec5-4c10-115d-8b6d-a7a07bfd72be@gmail.com> <5df3ffd0-9a15-f4ad-325b-1507107a3504@foss.arm.com> From: Jeff Law In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-0.7 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Aug 2022 23:36:34 -0000 On 8/2/2022 10:06 AM, Richard Earnshaw wrote: > > > On 01/08/2022 11:38, Richard Earnshaw via Gcc-patches wrote: >> >> >> On 30/07/2022 20:57, Jeff Law via Gcc-patches wrote: >>> >>> >>> On 7/29/2022 7:52 AM, Richard Earnshaw via Gcc-patches wrote: >>>> A SET operation that writes memory may have the same value as an >>>> earlier store but if the alias sets of the new and earlier store do >>>> not conflict then the set is not truly redundant.  This can happen, >>>> for example, if objects of different types share a stack slot. >>>> >>>> To fix this we define a new function in cselib that first checks for >>>> equality and if that is successful then finds the earlier store in the >>>> value history and checks the alias sets. >>>> >>>> The routine is used in two places elsewhere in the compiler. Firstly >>>> in cfgcleanup and secondly in postreload. >>>> >>>> gcc/ChangeLog: >>>>     * alias.h (mems_same_for_tbaa_p): Declare. >>>>     * alias.cc (mems_same_for_tbaa_p): New function. >>>>     * dse.cc (record_store): Use it instead of open-coding >>>>     alias check. >>>>     * cselib.h (cselib_redundant_set_p): Declare. >>>>     * cselib.cc: Include alias.h >>>>     (cselib_redundant_set_p): New function. >>>>     * cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead >>>>     of rtx_equal_for_cselib_p. >>>>     * postreload.c (reload_cse_simplify): Use cselib_redundant_set_p. >>>>     (reload_cse_noop_set_p): Delete. >>> Seems quite reasonable.   The only question I would have would be >>> whether or not you considered including the aliasing info into the >>> hashing used by cselib.  You'd probably still need the bulk of this >>> patch as well since we could presumably still get a hash conflict >>> with two stores of the same value to the same location, but with >>> different alias sets (it's just much less likely), so perhaps it >>> doesn't really buy us anything. >> >> I thought about this, but if the alias set were included in the hash, >> then surely you'd get every alias set in a different value.  Then >> you'd miss the cases where the alias sets do conflict even though >> they are not the same.  Anyway, the values /are/ the same so in some >> circumstances you might want to know that. >> >>> >>> Ideally this would include a testcase.  You might be able to turn >>> that non-executawble reduced case into something useful by scanning >>> the post-reload dumps. >> >> I considered this as well, but the testcase I have is far too >> fragile, I think.  The existing test only fails on Arm, only fails on >> 11.2 (not 11.3 or gcc-12 onwards), relies on two objects with the >> same value being in distinct alias sets but still assigned to the >> same stack slot and for some copy dance to end up trying to write >> back the original value to the same slot but with a non-conflicting >> set.  And finally, the scheduler has to then try to move a load past >> the non-aliasing store. > > >> >> To get anywhere close to this I think we'd need something akin to the >> gimple reader but for RTL so that we could set up all the conditions >> for the failure without the risk of an earlier transform blowing the >> test away. > > I wasn't aware of the rtl reader already in the compiler.  But it > doesn't really get me any closer as it is lacking in so many regards: > > - It can't handle (const_double:SF ...) - it tries to handle the > argument as an int.  This is a consequence, I think, of the reader > being based on that for reading machine descriptions where FP > const_double is simply never encountered. > > - It doesn't seem to handle anything much more than very basic types, > and in particular appears to have no way of ensuring that alias sets > match up with the type system. > >> >> I even considered whether we could start with a gimple dump and >> bypassing all the tree/gimple transformations, but even that would be >> still at the mercy of the stack-slot allocation algorithm. > > I spent a while trying to get some gimple out of the dumpers in a form > that was usable, but that's pretty much a non-starter.  To make it > work we'd need to add support for gimple clobbers on objects - without > that there's no way to get the stack-slot sharing code to work.  > Furthermore, even feeding fully-optimized gimple directly into expand > is such a long way from the postreload pass, that I can't believe the > testcase would remain stable for long. > > And the other major issue is that the original testcase is heavily > templated C++ and neither of the parsers gimple or rtl is supported in > cc1plus: converting the boilerplate to be C-friendly is probably going > to be hard. > > I can't afford to spend much more time on this, especially given the > low-quality test we're going to get out of the end of the process. Understood.  Let's just go with the patch as-is.  That's normal for cases where we can't produce a reasonable test. jeff