From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) by sourceware.org (Postfix) with ESMTP id E549D385C40C for ; Mon, 6 Sep 2021 14:25:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org E549D385C40C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=kernel.crashing.org Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=kernel.crashing.org Received: from gate.crashing.org (localhost.localdomain [127.0.0.1]) by gate.crashing.org (8.14.1/8.14.1) with ESMTP id 186EOtw3000960; Mon, 6 Sep 2021 09:24:55 -0500 Received: (from segher@localhost) by gate.crashing.org (8.14.1/8.14.1/Submit) id 186EOs4d000959; Mon, 6 Sep 2021 09:24:54 -0500 X-Authentication-Warning: gate.crashing.org: segher set sender to segher@kernel.crashing.org using -f Date: Mon, 6 Sep 2021 09:24:54 -0500 From: Segher Boessenkool To: Roger Sayle Cc: "'GCC Patches'" Subject: Re: [PATCH] Simplify paradoxical subreg extensions of TRUNCATE Message-ID: <20210906142454.GU1583@gate.crashing.org> References: <001401d7a2a5$5bf07db0$13d17910$@nextmovesoftware.com> <20210906101414.GT1583@gate.crashing.org> <001401d7a312$d7ca09c0$875e1d40$@nextmovesoftware.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <001401d7a312$d7ca09c0$875e1d40$@nextmovesoftware.com> User-Agent: Mutt/1.4.2.3i X-Spam-Status: No, score=-5.6 required=5.0 tests=BAYES_00, JMQ_SPF_NEUTRAL, KAM_DMARC_STATUS, TXREP, T_SPF_HELO_PERMERROR, T_SPF_PERMERROR autolearn=no autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 06 Sep 2021 14:25:57 -0000 Hi! On Mon, Sep 06, 2021 at 12:32:13PM +0100, Roger Sayle wrote: > I think the current documentation is sufficient. During compilation, GCC's > combine pass will often substitute a register with an expression defining > it's value, and then attempt to simplify it. As you point out, this often > produces > (temporary intermediate) expressions with poorly defined semantics. Not "poorly defined". The documentation says (in rtl.texi) @findex subreg @item (subreg:@var{m1} @var{reg:m2} @var{bytenum}) @code{subreg} expressions are used to refer to a register in a machine mode other than its natural one, or to refer to one register of a multi-part @code{reg} that actually refers to several registers. It goes on to say there also are subregs of mem currently; it exhaustively lists all things you can take a subreg of, what the different semantics of those are, how the semantics are further "modified" (read: completely different) if some RTL flags are set, etc. The instruction combiner very often creates invalid RTL (only structurally valid, like, no missing operands). Invalid RTL is supposed to never match (because backends will not have patterns that match these). combine even creates invalid RTL on purpose (a clobber of const0_rtx) to ensure its result is deemed invalid, when it wants to abort a combination attempt for some reason. > The > purpose of my patch is to avoid (constructing) SUBREGs that have TRUNCATE > as an argument (that some folks consider undefined) and instead simplify > them to either a single TRUNCATE or a SUBREG of a REG, both of which are > well defined. I'm making/helping the implementation match the > documentation. But this should never make it to simplify-rtx at all. You can only validly do such transformations in combine, because you need to know what insns you started with etc. simplify-rtx is a part of combine, sure, but it is used from other contexts as well nowadays. If you can safely do this from combine, you can do it in combine. > Trying 10 -> 15: > 10: r29:QI=trunc(r32:SI) > REG_DEAD r32:SI > 15: r38:HI=r29:QI#0 > REG_DEAD r29:QI > Failed to match this instruction: > (set (reg:HI 38) > (subreg:HI (truncate:QI (reg:SI 32)) 0)) > > with this patch: > > Trying 10 -> 15: > 10: r29:QI=trunc(r32:SI) > REG_DEAD r32:SI > 15: r38:HI=r29:QI#0 > REG_DEAD r29:QI > Successfully matched this instruction: > (set (reg:HI 38) > (truncate:HI (reg:SI 32))) > allowing combination of insns 10 and 15 > original costs 4 + 12 = 16 > replacement cost 4 > deferring deletion of insn with uid = 10. (It appears costs are a bit off on your target? Or is HImode especially expensive for some reason?) There is no subreg of a non-reg in there, so this is fine. > A few lines earlier in this function, it simplifies SUBREGs of CONST_INT, > which > would also have poorly defined semantics. Perhaps your interpretation of > the > RTL documentation doesn't strictly apply to this part of the RTL optimizers? Perhaps this is a bug. It is if it actually allows subregs of ints as input. Segher