Date: Tue, 2 Aug 2022 15:07:14 -0500
From: Segher Boessenkool
To: Roger Sayle
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Some additional zero-extension related optimizations in simplify-rtx.
Message-ID: <20220802200714.GJ25951@gate.crashing.org>
References: <009501d8a1be$b6199e20$224cda60$@nextmovesoftware.com> <20220727202336.GE25951@gate.crashing.org> <041201d8a318$8850d110$98f27330$@nextmovesoftware.com>
In-Reply-To: <041201d8a318$8850d110$98f27330$@nextmovesoftware.com>

Hi!

On Fri, Jul 29, 2022 at 07:57:51AM +0100, Roger Sayle wrote:
> > On Wed, Jul 27, 2022 at 02:42:25PM +0100, Roger Sayle wrote:
> > > This patch implements some additional zero-extension and
> > > sign-extension related optimizations in simplify-rtx.cc. The original
> > > motivation comes from PR rtl-optimization/71775, where in comment #2
> > > Andrew Pinski sees:
> > >
> > > Failed to match this instruction:
> > > (set (reg:DI 88 [ _1 ])
> > >     (sign_extend:DI (subreg:SI (ctz:DI (reg/v:DI 86 [ x ])) 0)))
> > >
> > > On many platforms the result of DImode CTZ is constrained to be a
> > > small unsigned integer (between 0 and 64), hence the truncation to
> > > 32-bits (using a SUBREG) and the following sign extension back to
> > > 64-bits are effectively a no-op, so the above should ideally (often)
> > > be simplified to "(set (reg:DI 88) (ctz:DI (reg/v:DI 86 [ x ]))".
> >
> > And you can also do that if ctz is undefined for a zero argument!
>
> Forgive my perhaps poor use of terminology. The case of ctz 0 on
> x64_64 isn't "undefined behaviour" (UB) in the C/C++ sense that
> would allow us to do anything, but implementation defined (which
> Intel calls "undefined" in their documentation).

This is about CTZ in RTL, in GCC. CTZ_DEFINED_VALUE_AT_ZERO is 0 here,
which means a zero argument gives an undefined result.

> Hence, we don't
> know which DI value is placed in the result register. In this case,
> truncating to SI mode, then sign extending the result is not a no-op,
> as the top bits will/must now all be the same [though admittedly to an
> unknown undefined signbit].

And any value is valid.

> Hence the above optimization would
> be invalid, as it doesn't guarantee the result would be sign-extended.

It does not have to be! Truncating an undefined DImode value to SImode
gives an undefined SImode value. On most architectures (including x86
afaik) you do not need to do any machine insn for that (the top 32 bits
in the register are just ignored for a SImode value).

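To make the "effectively a no-op" claim for defined results concrete, here
is a small sketch in plain C. It is not GCC code: sext_low32 and
check_ctz_range are names made up for this illustration, and the function
only models the arithmetic of (sign_extend:DI (subreg:SI x 0)) on a 64-bit
value. It says nothing about the undefined-at-zero case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of (sign_extend:DI (subreg:SI x 0)): keep the low
   32 bits of a 64-bit value, then sign-extend them back to 64 bits.  */
uint64_t
sext_low32 (uint64_t x)
{
  uint32_t lo = (uint32_t) x;   /* the upper 32 bits are simply dropped */

  /* Spelled out with masks so no implementation-defined conversion to a
     signed type is involved.  */
  if (lo & 0x80000000u)
    return (uint64_t) lo | 0xffffffff00000000ull;
  return (uint64_t) lo;
}

/* Every value a defined DImode CTZ can produce (0 to 64) has bit 31
   clear, so the round trip leaves it unchanged.  */
void
check_ctz_range (void)
{
  for (uint64_t v = 0; v <= 64; v++)
    assert (sext_low32 (v) == v);
}
```

Calling check_ctz_range () should pass without tripping an assert, since
none of the values 0 to 64 has bit 31 set.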
> > Also, this is not correct for C[LT]Z_DEFINED_VALUE_AT_ZERO non-zero if
> > the value it returns in its second arg does not survive sign extending
> > unmodified (if it is 0xffffffff for an extend from SI to DI for
> > example).
>
> Fortunately, C[LT]Z_DEFINED_VALUE_AT_ZERO being defined to return a negative
> result, such as -1 is already handled (accounted for) in nonzero_bits. The
> relevant code in rtlanal.cc's nonzero_bits1 is:

A negative result, yes. But that was not my example.


Segher
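
The difference between the -1 case and the 0xffffffff example can be
checked with a few lines of plain C. This is only an arithmetic sketch,
not GCC code: sign_extend_si_to_di and survives_si_sign_extension are
hypothetical names, and how a target actually encodes its value at zero
is not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an SI-to-DI sign extension applied to the low
   32 bits of a 64-bit value.  */
uint64_t
sign_extend_si_to_di (uint64_t x)
{
  uint32_t lo = (uint32_t) x;

  if (lo & 0x80000000u)
    return (uint64_t) lo | 0xffffffff00000000ull;
  return (uint64_t) lo;
}

/* Nonzero if the 64-bit value comes back unchanged after being truncated
   to 32 bits and sign-extended again.  */
int
survives_si_sign_extension (uint64_t value_at_zero)
{
  return sign_extend_si_to_di (value_at_zero) == value_at_zero;
}

/* The cases discussed above, as assertions.  */
void
check_value_at_zero_examples (void)
{
  assert (survives_si_sign_extension (64));           /* small positive */
  assert (survives_si_sign_extension (UINT64_MAX));   /* -1 as a DI value */
  assert (!survives_si_sign_extension (0xffffffffull));
}
```

A DImode -1 has all 64 bits set and comes back unchanged, while the
DImode value 0xffffffff has its upper 32 bits clear and comes back as
0xffffffffffffffff.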