public inbox for gcc-patches@gcc.gnu.org
From: Segher Boessenkool <segher@kernel.crashing.org>
To: Michael Meissner <meissner@linux.ibm.com>,
	gcc-patches@gcc.gnu.org, "Kewen.Lin" <linkw@linux.ibm.com>,
	David Edelsohn <dje.gcc@gmail.com>,
	Peter Bergner <bergner@linux.ibm.com>,
	Will Schmidt <will_schmidt@vnet.ibm.com>,
	chip.kerchner@ibm.com
Subject: Re: PR target/70243: Do not generate fmaddfp and fnmsubfp
Date: Thu, 6 Apr 2023 15:37:59 -0500
Message-ID: <20230406203759.GK25951@gate.crashing.org>
In-Reply-To: <ZC7hS75ohXMo7Qcw@toto.the-meissners.org>

Hi!

On Thu, Apr 06, 2023 at 11:12:11AM -0400, Michael Meissner wrote:
> The Altivec instructions fmaddfp and fnmsubfp have different rounding behaviors

Those are not existing instructions.  You mean "vmaddfp" etc.

> than the VSX xvmaddsp and xvnmsubsp instructions.  In particular, generating
> these instructions seems to break Eigen.

Those instructions use round-to-nearest-ties-to-even, like all other
VMX FP insns.  A proper patch has to deal with all VMX FP insns.  But,
almost all programs expect that rounding mode anyway, so this is not a
problem in practice.  What happened on Eigen is that the Linux kernel
starts every new process with VSCR[NJ]=1, breaking pretty much
everything that wants floating point for non-toy purposes.  (There
currently is a bug on LE that sets the wrong bit, hiding the problem in
that configuration, but it is intended there as well).
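
To make the NJ effect concrete, here is a rough scalar model in C of one
vector lane.  It assumes that NJ=1 replaces subnormal inputs and subnormal
results by a zero of the same sign.  The names nj_flush, madd_ieee, madd_nj
and nj_example are invented for this sketch, and the product is not fused
here, because only the flushing matters for the illustration:

```c
#include <assert.h>
#include <math.h>

/* Assumed model of VSCR[NJ]=1: a subnormal value becomes a zero of the
   same sign.  A subnormal is never zero or NaN, so comparing against
   zero is enough to pick the sign.  */
static float nj_flush (float x)
{
  if (fpclassify (x) == FP_SUBNORMAL)
    return x < 0.0f ? -0.0f : 0.0f;
  return x;
}

/* One lane of a multiply-add with subnormals handled as IEEE requires.  */
static float madd_ieee (float a, float b, float c)
{
  return a * b + c;
}

/* The same lane with the NJ=1 model applied to the inputs and the result.  */
static float madd_nj (float a, float b, float c)
{
  return nj_flush (nj_flush (a) * nj_flush (b) + nj_flush (c));
}

/* Usage: a result that lands in the subnormal range is kept by the IEEE
   version and flushed to zero by the NJ model.  */
static void nj_example (void)
{
  assert (madd_ieee (0x1p-126f, 0.5f, 0.0f) == 0x1p-127f);
  assert (madd_nj (0x1p-126f, 0.5f, 0.0f) == 0.0f);
}
```

Code that relies on gradual underflow sees different answers from the two
functions, even though the rounding mode is the same in both.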

> GCC has generated the Altivec fmaddfp and fnmsubfp instructions on VSX systems
> as an alternative to the xsmadd{a,m}sp and xsnmsub{a,m}sp instructions.  The
> advantage of the Altivec instructions is that they are 4 operand instructions
> (i.e. the target register does not have to overlap with one of the input
> registers).  The advantage is it can eliminate an extra move instruction.  The
> disadvantage is it does not round the same way as the VSX instructions.

And it gets the VSCR[NJ] setting applied.  Yup.

> This patch eliminates the generation of the Altivec fmaddfp and fnmsubfp
> instructions as alternatives in the VSX instruction insn support, and in the
> Altivec insns it adds a test to prevent the insn from being used if VSX is
> available.  I also added a test to the regression test suite.

Please leave the latter out, it does not belong in this patch.  If you
want a patch to do that, it has to deal with *all* VMX FP insns.  There
also are add, sub, mul, etc.  Well, I think those (as well as madd and
nmsub) are the only ones that use the NJ bit or the RN bits, but please
check.

> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -750,12 +750,15 @@ (define_insn "altivec_vsel<mode>4"
>  
>  ;; Fused multiply add.
>  
> +;; If we are using VSX instructions, do not generate the vmaddfp instruction
> +;; since is has different rounding behavior than the xvmaddsp instruction.
> +

No blank lines please.

>  (define_insn "*altivec_fmav4sf4"
>    [(set (match_operand:V4SF 0 "register_operand" "=v")
>  	(fma:V4SF (match_operand:V4SF 1 "register_operand" "v")
>  		  (match_operand:V4SF 2 "register_operand" "v")
>  		  (match_operand:V4SF 3 "register_operand" "v")))]
> -  "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
> +  "VECTOR_UNIT_ALTIVEC_P (V4SFmode) && !TARGET_VSX"

This is very error-prone.  Maybe add a test to the VECTOR_UNIT_ALTIVEC
macro instead?
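
As a sketch of the difference, here is a self-contained toy in C.  Every
name in it (toy_mode, toy_unit, toy_vector_unit, toy_target_vsx and the
TOY_* macros) is a stand-in invented for illustration; none of them is
GCC's real definition, and the real predicate may need a different shape:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are GCC's real definitions.  */
enum toy_mode { TOY_V4SF, TOY_V4SI, TOY_NUM_MODES };
enum toy_unit { TOY_UNIT_NONE, TOY_UNIT_ALTIVEC, TOY_UNIT_VSX };

static enum toy_unit toy_vector_unit[TOY_NUM_MODES];
static bool toy_target_vsx;

/* Style used in the patch: each insn condition repeats the VSX test,
   so forgetting it on one insn silently re-enables that insn.  */
#define TOY_ALTIVEC_P(MODE) (toy_vector_unit[(MODE)] == TOY_UNIT_ALTIVEC)
#define TOY_COND_PER_INSN(MODE) (TOY_ALTIVEC_P (MODE) && !toy_target_vsx)

/* Style suggested above: the VSX test is written once, in the predicate.  */
#define TOY_ALTIVEC_NO_VSX_P(MODE) \
  (toy_vector_unit[(MODE)] == TOY_UNIT_ALTIVEC && !toy_target_vsx)

/* Usage: both styles agree, with and without VSX.  */
static void toy_example (void)
{
  toy_vector_unit[TOY_V4SF] = TOY_UNIT_ALTIVEC;
  toy_target_vsx = false;
  assert (TOY_COND_PER_INSN (TOY_V4SF) && TOY_ALTIVEC_NO_VSX_P (TOY_V4SF));
  toy_target_vsx = true;
  assert (!TOY_COND_PER_INSN (TOY_V4SF) && !TOY_ALTIVEC_NO_VSX_P (TOY_V4SF));
}
```

The two forms compute the same thing; the second just leaves fewer places
where the extra condition can be forgotten.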

> -;; Fused vector multiply/add instructions. Support the classical Altivec
> -;; versions of fma, which allows the target to be a separate register from the
> -;; 3 inputs.  Under VSX, the target must be either the addend or the first
> -;; multiply.
> +;; Fused vector multiply/add instructions. Do not use the classical Altivec

(Two spaces after dot, and AltiVec is spelled with a capital V.  I don't
like it either, VMX is a much nicer and more regular name).

> +;; versions of fma.  Those instructions allows the target to be a separate
> +;; register from the 3 inputs, but they have different rounding behaviors.
>  
>  (define_insn "*vsx_fmav4sf4"
> -  [(set (match_operand:V4SF 0 "vsx_register_operand" "=wa,wa,v")
> +  [(set (match_operand:V4SF 0 "vsx_register_operand" "=wa,wa")
>  	(fma:V4SF
> -	  (match_operand:V4SF 1 "vsx_register_operand" "%wa,wa,v")
> -	  (match_operand:V4SF 2 "vsx_register_operand" "wa,0,v")
> -	  (match_operand:V4SF 3 "vsx_register_operand" "0,wa,v")))]
> +	  (match_operand:V4SF 1 "vsx_register_operand" "%wa,wa")
> +	  (match_operand:V4SF 2 "vsx_register_operand" "wa,0")
> +	  (match_operand:V4SF 3 "vsx_register_operand" "0,wa")))]
>    "VECTOR_UNIT_VSX_P (V4SFmode)"
>    "@
>     xvmaddasp %x0,%x1,%x2
> -   xvmaddmsp %x0,%x1,%x3
> -   vmaddfp %0,%1,%2,%3"
> +   xvmaddmsp %x0,%x1,%x3"
>    [(set_attr "type" "vecfloat")])

So this part looks okay, and it alone is safe for GCC 13 as well.

>  (define_insn "*vsx_nfmsv4sf4"
> -  [(set (match_operand:V4SF 0 "vsx_register_operand" "=wa,wa,v")
> +  [(set (match_operand:V4SF 0 "vsx_register_operand" "=wa,wa")
>  	(neg:V4SF
>  	 (fma:V4SF
> -	   (match_operand:V4SF 1 "vsx_register_operand" "%wa,wa,v")
> -	   (match_operand:V4SF 2 "vsx_register_operand" "wa,0,v")
> +	   (match_operand:V4SF 1 "vsx_register_operand" "%wa,wa")
> +	   (match_operand:V4SF 2 "vsx_register_operand" "wa,0")
>  	   (neg:V4SF
> -	     (match_operand:V4SF 3 "vsx_register_operand" "0,wa,v")))))]
> +	     (match_operand:V4SF 3 "vsx_register_operand" "0,wa")))))]
>    "VECTOR_UNIT_VSX_P (V4SFmode)"
>    "@
>     xvnmsubasp %x0,%x1,%x2
> -   xvnmsubmsp %x0,%x1,%x3
> -   vnmsubfp %0,%1,%2,%3"
> +   xvnmsubmsp %x0,%x1,%x3"
>    [(set_attr "type" "vecfloat")])

Well, together with this of course :-)

Could you please do that?


Segher
