From: Segher Boessenkool
To: Michael Meissner, gcc-patches@gcc.gnu.org, "Kewen.Lin", David Edelsohn, Peter Bergner, Will Schmidt, chip.kerchner@ibm.com
Date: Sat, 8 Apr 2023 09:01:18 -0500
Subject: Re: [PATCH, V3] PR target/70243 - Do not generate vmaddfp or vnmsubdp
Message-ID: <20230408140118.GA19790@gate.crashing.org>

Hi!

On Sat, Apr 08, 2023 at 09:34:51AM -0400, Michael Meissner wrote:
> The Altivec instructions vmaddfp and vnmsubfp have different rounding behaviors
> than the VSX xvmaddsp and xvnmsubsp instructions. In particular, generating
> these instructions seems to break Eigen on big endian systems.

What actually breaks Eigen is not the rounding behaviour (it runs with
RN=0b00, like most things do, so round-to-nearest-ties-to-even, which is
the only rounding mode supported by the VMX float insns; no big shocking
surprise there).

What breaks Eigen, and many other unsuspecting programs unknowingly using
VMX, is that on Linux programs are started with VSCR[NJ]=1, "Non-Java
mode", which means all numbers with biased exponent 0 (an all-zero
exponent field) get the mantissa forced to 0 as well, both on input and
output (all denormals are flushed to zero of the same sign).

This is counter to the various ABIs. I'll submit a patch to Linux soon.
But since many people run older kernels, at least for a while more, we
need to fix this in GCC. Like your patch does.

> PR target/70243
> * config/rs6000/rs6000.md (vsx_fmav4sf4): Do not generate vmaddfp.
> (vsx_nfmsv4sf4): Do not generate vnmsubfp.

> -;; Fused vector multiply/add instructions. Support the classical Altivec
> -;; versions of fma, which allows the target to be a separate register from the
> -;; 3 inputs. Under VSX, the target must be either the addend or the first
> -;; multiply.
> -
> +;; Fused vector multiply/add instructions. Do not generate the Altivec versions
> +;; of fma (vmaddfp and vnmsubfp). These instructions allows the target to be a
> +;; separate register from the 3 inputs, but they have different rounding
> +;; behaviors than the VSX instructions.

Please mention the VSCR[NJ] thing here as well? Just something very
short, just mentioning "NJ" or "Non-Java" is enough.

With that: okay for trunk, thank you! Also okay for all backports.


Segher
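
As a rough illustration of the flush-to-zero behaviour described above (this is not code from the patch or from GCC), the portable C sketch below emulates, on a single value, what VSCR[NJ]=1 does to denormals. The helper names float_bits, float_from_bits and flush_denormal are hypothetical, and the sketch assumes that float is a 32-bit IEEE 754 binary32 type.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw bit pattern of a float.
   Assumes float is IEEE 754 binary32. */
static uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Hypothetical helper: build a float from a raw bit pattern. */
static float float_from_bits(uint32_t bits)
{
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Emulate the Non-Java-mode treatment of one value: if the exponent
   field is all zeros (a zero or a denormal), clear the mantissa and
   keep only the sign bit.  Every other value passes through unchanged. */
static float flush_denormal(float x)
{
    uint32_t bits = float_bits(x);

    if ((bits & 0x7f800000u) == 0)  /* exponent field is zero */
        bits &= 0x80000000u;        /* keep the sign, drop the mantissa */

    return float_from_bits(bits);
}
```

Applying flush_denormal to both the inputs and the result of an operation approximates the "both on input and output" behaviour; a program that relies on gradual underflow would see the difference wherever a denormal appears.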