[gcc(refs/users/meissner/heads/work048)] Add IEEE 128-bit fp conditional move on PowerPC.

public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed

* [gcc(refs/users/meissner/heads/work048)] Add IEEE 128-bit fp conditional move on PowerPC.
@ 2021-04-14 19:54 Michael Meissner
  0 siblings, 0 replies; 2+ messages in thread
From: Michael Meissner @ 2021-04-14 19:54 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:2bb8f1e5428c7b38e6f69ce84d7a895621605864

commit 2bb8f1e5428c7b38e6f69ce84d7a895621605864
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Wed Apr 14 15:54:00 2021 -0400

    Add IEEE 128-bit fp conditional move on PowerPC.
    
    This patch adds the support for power10 IEEE 128-bit floating point conditional
    move and for automatically generating min/max.  Unlike the previous patch, I
    decided to keep two separate patterns for fpmask before splitting (one pattern
    for normal compares, and the other pattern for inverted compares).  I can go
    back to a single pattern with a new predicate that allows either comparison.
    
    Compared to the original code, these patterns do simplify the fpmask insns to
    having one alternative instead of two.  In the original code, the first
    alternative tried to use the result as a temporary register.  But that doesn't
    work if you are doing a conditional move with SF/DF types, but the comparison
    is KF/TF.  That is because the SF/DF types can use the traditional FPR
    registers, but IEEE 128-bit floating point can only do arithmetic in the
    traditional Altivec registers.
    
    This code also has to insert a XXPERMDI if you are moving KF/TF values, but
    the comparison is done with SF/DF values.  In this case, the set and compare
    mask for SF/DF clears the bottom 64-bits of the register, and the XXPERMDI is
    needed to fill it.
    
    gcc/
    2021-04-14 Michael Meissner  <meissner@linux.ibm.com>
    
            * config/rs6000/rs6000.c (have_compare_and_set_mask): Add IEEE
            128-bit floating point types.
            * config/rs6000/rs6000.md (FPMASK): New iterator.
            (FPMASK2): New iterator.
            (Fv mode attribute): Add KFmode and TFmode.
            (mov<FPMASK:mode><FPMASK2:mode>cc_fpmask): Replace
            mov<SFDF:mode><SFDF2:mode>cc_p9.  Add IEEE 128-bit fp support.
            (mov<FPMASK:mode><FPMASK2:mode>cc_invert_fpmask): Replace
            mov<SFDF:mode><SFDF2:mode>cc_invert_p9.  Add IEEE 128-bit fp
            support.
            (fpmask<mode>): Add IEEE 128-bit fp support.  Enable generator to
            build te RTL.
            (xxsel<mode>): Add IEEE 128-bit fp support.  Enable generator to
            build te RTL.
    
    gcc/testsuite/
    2021-04-14  Michael Meissner  <meissner@linux.ibm.com>
    
            * gcc.target/powerpc/float128-cmove.c: New test.
            * gcc.target/powerpc/float128-minmax-3.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000.c                         |   8 +-
 gcc/config/rs6000/rs6000.md                        | 190 ++++++++++++++-------
 gcc/testsuite/gcc.target/powerpc/float128-cmove.c  |  93 ++++++++++
 .../gcc.target/powerpc/float128-minmax-3.c         |  15 ++
 4 files changed, 240 insertions(+), 66 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 8d00f99e9fd..fecdcf6e01e 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15706,8 +15706,8 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   return 1;
 }
 
-/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum or
-   minimum with "C" semantics.
+/* Possibly emit the xsmaxc{dp,qp} and xsminc{dp,qp} instructions to emit a
+   maximum or minimum with "C" semantics.
 
    Unless you use -ffast-math, you can't use these instructions to replace
    conditions that implicitly reverse the condition because the comparison
@@ -15843,6 +15843,10 @@ have_compare_and_set_mask (machine_mode mode)
     case E_DFmode:
       return TARGET_P9_MINMAX;
 
+    case E_KFmode:
+    case E_TFmode:
+      return (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode));
+
     default:
       break;
     }
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 17b2fdc1cdd..33f3e7c2159 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -575,6 +575,23 @@
 ; And again, for when we need two FP modes in a pattern.
 (define_mode_iterator SFDF2 [SF DF])
 
+; Floating scalars that supports the set compare mask instruction.
+(define_mode_iterator FPMASK [SF
+			      DF
+			      (KF "(TARGET_POWER10 && TARGET_FLOAT128_HW
+			            && FLOAT128_IEEE_P (KFmode))")
+			      (TF "(TARGET_POWER10 && TARGET_FLOAT128_HW
+			            && FLOAT128_IEEE_P (TFmode))")])
+
+; And again, for patterns that need two (potentially) different floating point
+; scalars that support the set compare mask instruction.
+(define_mode_iterator FPMASK2 [SF
+			       DF
+			       (KF "(TARGET_POWER10 && TARGET_FLOAT128_HW
+			             && FLOAT128_IEEE_P (KFmode))")
+			       (TF "(TARGET_POWER10 && TARGET_FLOAT128_HW
+			             && FLOAT128_IEEE_P (TFmode))")])
+
 ; A generic s/d attribute, for sp/dp for example.
 (define_mode_attr sd [(SF   "s") (DF   "d")
 		      (V4SF "s") (V2DF "d")])
@@ -608,8 +625,13 @@
 ; SF/DF constraint for arithmetic on VSX registers using instructions added in
 ; ISA 2.06 (power7).  This includes instructions that normally target DF mode,
 ; but are used on SFmode, since internally SFmode values are kept in the DFmode
-; format.
-(define_mode_attr Fv		[(SF "wa") (DF "wa") (DI "wa")])
+; format.  Also include IEEE 128-bit instructions which are restricted to the
+; Altivec registers.
+(define_mode_attr Fv		[(SF "wa")
+				 (DF "wa")
+				 (DI "wa")
+				 (KF "v")
+				 (TF "v")])
 
 ; Which isa is needed for those float instructions?
 (define_mode_attr Fisa		[(SF "p8v")  (DF "*") (DI "*")])
@@ -5316,10 +5338,10 @@
 
 ;; Floating point conditional move
 (define_expand "mov<mode>cc"
-   [(set (match_operand:SFDF 0 "gpc_reg_operand")
-	 (if_then_else:SFDF (match_operand 1 "comparison_operator")
-			    (match_operand:SFDF 2 "gpc_reg_operand")
-			    (match_operand:SFDF 3 "gpc_reg_operand")))]
+   [(set (match_operand:FPMASK 0 "gpc_reg_operand")
+	 (if_then_else:FPMASK (match_operand 1 "comparison_operator")
+			      (match_operand:FPMASK 2 "gpc_reg_operand")
+			      (match_operand:FPMASK 3 "gpc_reg_operand")))]
   "TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT"
 {
   if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
@@ -5339,92 +5361,132 @@
   "fsel %0,%1,%2,%3"
   [(set_attr "type" "fp")])
 
-(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_p9"
-  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
-	(if_then_else:SFDF
+(define_insn_and_split "*mov<FPMASK:mode><FPMASK2:mode>cc_fpmask"
+  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=<FPMASK:Fv>")
+	(if_then_else:FPMASK
 	 (match_operator:CCFP 1 "fpmask_comparison_operator"
-		[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
-		 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
-	 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
-	 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
-   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
+	    [(match_operand:FPMASK2 2 "vsx_register_operand" "<FPMASK2:Fv>")
+	     (match_operand:FPMASK2 3 "vsx_register_operand" "<FPMASK2:Fv>")])
+	 (match_operand:FPMASK 4 "vsx_register_operand" "<FPMASK:Fv>")
+	 (match_operand:FPMASK 5 "vsx_register_operand" "<FPMASK:Fv>")))
+   (clobber (match_scratch:V2DI 6 "=&<FPMASK2:Fv>"))]
   "TARGET_P9_MINMAX"
   "#"
   "&& 1"
-  [(set (match_dup 6)
-	(if_then_else:V2DI (match_dup 1)
-			   (match_dup 7)
-			   (match_dup 8)))
-   (set (match_dup 0)
-	(if_then_else:SFDF (ne (match_dup 6)
-			       (match_dup 8))
-			   (match_dup 4)
-			   (match_dup 5)))]
+  [(pc)]
 {
-  if (GET_CODE (operands[6]) == SCRATCH)
-    operands[6] = gen_reg_rtx (V2DImode);
+  rtx dest = operands[0];
+  rtx cmp = operands[1];
+  rtx cmp_op1 = operands[2];
+  rtx cmp_op2 = operands[3];
+  rtx move_t = operands[4];
+  rtx move_f = operands[5];
+  rtx mask_reg = operands[6];
+  rtx mask_m1 = CONSTM1_RTX (V2DImode);
+  rtx mask_0 = CONST0_RTX (V2DImode);
+  machine_mode move_mode = <FPMASK:MODE>mode;
+  machine_mode compare_mode = <FPMASK2:MODE>mode;
+
+  if (GET_CODE (mask_reg) == SCRATCH)
+    mask_reg = gen_reg_rtx (V2DImode);
 
-  operands[7] = CONSTM1_RTX (V2DImode);
-  operands[8] = CONST0_RTX (V2DImode);
+  /* Emit the compare and set mask instruction.  */
+  emit_insn (gen_fpmask<FPMASK2:mode> (mask_reg, cmp, cmp_op1, cmp_op2,
+				       mask_m1, mask_0));
+
+  /* If we have a 64-bit comparison, but an 128-bit move, we need to extend the
+     mask.  Because we are using the splat builtin to extend the V2DImode, we
+     need to use element 1 on little endian systems.  */
+  if (!FLOAT128_IEEE_P (compare_mode) && FLOAT128_IEEE_P (move_mode))
+    {
+      rtx element = WORDS_BIG_ENDIAN ? const0_rtx : const1_rtx;
+      emit_insn (gen_vsx_xxspltd_v2di (mask_reg, mask_reg, element));
+    }
+
+  /* Now emit the XXSEL insn.  */
+  emit_insn (gen_xxsel<FPMASK:mode> (dest, mask_reg, mask_0, move_t, move_f));
+  DONE;
 }
- [(set_attr "length" "8")
+ ;; length is 12 in case we need to add XXPERMDI
+ [(set_attr "length" "12")
   (set_attr "type" "vecperm")])
 
 ;; Handle inverting the fpmask comparisons.
-(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_invert_p9"
-  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
-	(if_then_else:SFDF
+(define_insn_and_split "*mov<FPMASK:mode><FPMASK2:mode>cc_invert_fpmask"
+  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=<FPMASK:Fv>")
+	(if_then_else:FPMASK
 	 (match_operator:CCFP 1 "invert_fpmask_comparison_operator"
-		[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
-		 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
-	 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
-	 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
-   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
+	    [(match_operand:FPMASK2 2 "vsx_register_operand" "<FPMASK2:Fv>")
+	     (match_operand:FPMASK2 3 "vsx_register_operand" "<FPMASK2:Fv>")])
+	 (match_operand:FPMASK 4 "vsx_register_operand" "<FPMASK:Fv>")
+	 (match_operand:FPMASK 5 "vsx_register_operand" "<FPMASK:Fv>")))
+   (clobber (match_scratch:V2DI 6 "=&<FPMASK2:Fv>"))]
   "TARGET_P9_MINMAX"
   "#"
   "&& 1"
-  [(set (match_dup 6)
-	(if_then_else:V2DI (match_dup 9)
-			   (match_dup 7)
-			   (match_dup 8)))
-   (set (match_dup 0)
-	(if_then_else:SFDF (ne (match_dup 6)
-			       (match_dup 8))
-			   (match_dup 5)
-			   (match_dup 4)))]
+  [(pc)]
 {
-  rtx op1 = operands[1];
-  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (op1));
+  rtx dest = operands[0];
+  rtx old_cmp = operands[1];
+  rtx cmp_op1 = operands[2];
+  rtx cmp_op2 = operands[3];
+  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (old_cmp));
+  rtx cmp_rev = gen_rtx_fmt_ee (cond, CCFPmode, cmp_op1, cmp_op2);
+  rtx move_f = operands[4];
+  rtx move_t = operands[5];
+  rtx mask_reg = operands[6];
+  rtx mask_m1 = CONSTM1_RTX (V2DImode);
+  rtx mask_0 = CONST0_RTX (V2DImode);
+  machine_mode move_mode = <FPMASK:MODE>mode;
+  machine_mode compare_mode = <FPMASK2:MODE>mode;
 
-  if (GET_CODE (operands[6]) == SCRATCH)
-    operands[6] = gen_reg_rtx (V2DImode);
+  if (GET_CODE (mask_reg) == SCRATCH)
+    mask_reg = gen_reg_rtx (V2DImode);
 
-  operands[7] = CONSTM1_RTX (V2DImode);
-  operands[8] = CONST0_RTX (V2DImode);
+  /* Emit the compare and set mask instruction.  */
+  emit_insn (gen_fpmask<FPMASK2:mode> (mask_reg, cmp_rev, cmp_op1, cmp_op2,
+				       mask_m1, mask_0));
 
-  operands[9] = gen_rtx_fmt_ee (cond, CCFPmode, operands[2], operands[3]);
+  /* If we have a 64-bit comparison, but an 128-bit move, we need to extend the
+     mask.  Because we are using the splat builtin to extend the V2DImode, we
+     need to use element 1 on little endian systems.  */
+  if (!FLOAT128_IEEE_P (compare_mode) && FLOAT128_IEEE_P (move_mode))
+    {
+      rtx element = WORDS_BIG_ENDIAN ? const0_rtx : const1_rtx;
+      emit_insn (gen_vsx_xxspltd_v2di (mask_reg, mask_reg, element));
+    }
+
+  /* Now emit the XXSEL insn.  */
+  emit_insn (gen_xxsel<FPMASK:mode> (dest, mask_reg, mask_0, move_t, move_f));
+  DONE;
 }
- [(set_attr "length" "8")
+ ;; length is 12 in case we need to add XXPERMDI
+ [(set_attr "length" "12")
   (set_attr "type" "vecperm")])
 
-(define_insn "*fpmask<mode>"
-  [(set (match_operand:V2DI 0 "vsx_register_operand" "=wa")
+(define_insn "fpmask<mode>"
+  [(set (match_operand:V2DI 0 "vsx_register_operand" "=<Fv>")
 	(if_then_else:V2DI
 	 (match_operator:CCFP 1 "fpmask_comparison_operator"
-		[(match_operand:SFDF 2 "vsx_register_operand" "<Fv>")
-		 (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")])
+		[(match_operand:FPMASK 2 "vsx_register_operand" "<Fv>")
+		 (match_operand:FPMASK 3 "vsx_register_operand" "<Fv>")])
 	 (match_operand:V2DI 4 "all_ones_constant" "")
 	 (match_operand:V2DI 5 "zero_constant" "")))]
   "TARGET_P9_MINMAX"
-  "xscmp%V1dp %x0,%x2,%x3"
+{
+  return (FLOAT128_IEEE_P (<MODE>mode)
+	  ? "xscmp%V1qp %0,%2,%3"
+	  : "xscmp%V1dp %x0,%x2,%x3");
+}
   [(set_attr "type" "fpcompare")])
 
-(define_insn "*xxsel<mode>"
-  [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>")
-	(if_then_else:SFDF (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
-			       (match_operand:V2DI 2 "zero_constant" ""))
-			   (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")
-			   (match_operand:SFDF 4 "vsx_register_operand" "<Fv>")))]
+(define_insn "xxsel<mode>"
+  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=wa")
+	(if_then_else:FPMASK
+	 (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
+	     (match_operand:V2DI 2 "zero_constant" ""))
+	 (match_operand:FPMASK 3 "vsx_register_operand" "wa")
+	 (match_operand:FPMASK 4 "vsx_register_operand" "wa")))]
   "TARGET_P9_MINMAX"
   "xxsel %x0,%x4,%x3,%x1"
   [(set_attr "type" "vecmove")])
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmove.c b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
new file mode 100644
index 00000000000..639d5a77087
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
@@ -0,0 +1,93 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+/* { dg-final { scan-assembler     {\mxscmpeq[dq]p\M} } } */
+/* { dg-final { scan-assembler     {\mxxpermdi\M}     } } */
+/* { dg-final { scan-assembler     {\mxxsel\M}        } } */
+/* { dg-final { scan-assembler-not {\mxscmpu[dq]p\M}  } } */
+/* { dg-final { scan-assembler-not {\mfcmp[uo]\M}     } } */
+/* { dg-final { scan-assembler-not {\mfsel\M}         } } */
+
+/* This series of tests tests whether you can do a conditional move where the
+   test is one floating point type, and the result is another floating point
+   type.
+
+   If the comparison type is SF/DFmode, and the move type is IEEE 128-bit
+   floating point, we have to duplicate the mask in the lower 64-bits with
+   XXPERMDI because XSCMPEQDP clears the bottom 64-bits of the mask register.
+
+   Going the other way (IEEE 128-bit comparsion, 64-bit move) is fine as the
+   mask word will be 128-bits.  */
+
+float
+eq_f_d (float a, float b, double x, double y)
+{
+  return (x == y) ? a : b;
+}
+
+double
+eq_d_f (double a, double b, float x, float y)
+{
+  return (x == y) ? a : b;
+}
+
+float
+eq_f_f128 (float a, float b, __float128 x, __float128 y)
+{
+  return (x == y) ? a : b;
+}
+
+double
+eq_d_f128 (double a, double b, __float128 x, __float128 y)
+{
+  return (x == y) ? a : b;
+}
+
+__float128
+eq_f128_f (__float128 a, __float128 b, float x, float y)
+{
+  return (x == y) ? a : b;
+}
+
+__float128
+eq_f128_d (__float128 a, __float128 b, double x, double y)
+{
+  return (x != y) ? a : b;
+}
+
+float
+ne_f_d (float a, float b, double x, double y)
+{
+  return (x != y) ? a : b;
+}
+
+double
+ne_d_f (double a, double b, float x, float y)
+{
+  return (x != y) ? a : b;
+}
+
+float
+ne_f_f128 (float a, float b, __float128 x, __float128 y)
+{
+  return (x != y) ? a : b;
+}
+
+double
+ne_d_f128 (double a, double b, __float128 x, __float128 y)
+{
+  return (x != y) ? a : b;
+}
+
+__float128
+ne_f128_f (__float128 a, __float128 b, float x, float y)
+{
+  return (x != y) ? a : b;
+}
+
+__float128
+ne_f128_d (__float128 a, __float128 b, double x, double y)
+{
+  return (x != y) ? a : b;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
new file mode 100644
index 00000000000..6f7627c0f2a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
@@ -0,0 +1,15 @@
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return (a < b) ? a : b; }
+TYPE f128_max (TYPE a, TYPE b) { return (b > a) ? b : a; }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */


^ permalink raw reply	[flat|nested] 2+ messages in thread

* [gcc(refs/users/meissner/heads/work048)] Add IEEE 128-bit fp conditional move on PowerPC.
@ 2021-04-15  0:34 Michael Meissner
  0 siblings, 0 replies; 2+ messages in thread
From: Michael Meissner @ 2021-04-15  0:34 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:5795c035821808a3ff2e2018e18ebde4444e3fe8

commit 5795c035821808a3ff2e2018e18ebde4444e3fe8
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Wed Apr 14 20:33:57 2021 -0400

    Add IEEE 128-bit fp conditional move on PowerPC.
    
    This patch adds the support for power10 IEEE 128-bit floating point conditional
    move and for automatically generating min/max.
    
    In this patch, I simplified things.  Instead of allowing any four of the modes
    to be used for the conditional move comparison and the move itself could use
    different modes, I restricted the conditional move to just the same mode.
    I.e. you can do:
    
            _Float128 a, b, c, d, e, r;
    
            r = (a == b) ? c : d;
    
    But you can't do:
    
            _Float128 a, b;
            double c, d, r;
    
            r = (a == b) ? c : d;
    
    This eliminates a lot of the complexity of the code, because you don't have to
    worry about the sizes being different, and the IEEE 128-bit types being
    restricted to Altivec registers, while the SF/DF modes can use any VSX
    register.
    
    gcc/
    2021-04-14 Michael Meissner  <meissner@linux.ibm.com>
    
            * config/rs6000/rs6000.c (have_compare_and_set_mask): Add IEEE
            128-bit floating point types.
            * config/rs6000/rs6000.md (mov<mode>cc, IEEE128 iterator): New insn.
            (mov<mode>cc_p10, IEEE128 iterator): New insn.
            (mov<mode>cc_invert_p10, IEEE128 iterator): New insn.
            (fpmask<mode>, IEEE128 iterator): New insn.
            (xxsel<mode>, IEEE128 iterator): New insn.
    
    gcc/testsuite/
    2021-04-14  Michael Meissner  <meissner@linux.ibm.com>
    
            * gcc.target/powerpc/float128-cmove.c: New test.
            * gcc.target/powerpc/float128-minmax-3.c: New test.

Diff:
---
 gcc/config/rs6000/rs6000.c                         |   8 +-
 gcc/config/rs6000/rs6000.md                        | 106 +++++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/float128-cmove.c  |  58 +++++++++++
 .../gcc.target/powerpc/float128-minmax-3.c         |  15 +++
 4 files changed, 185 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 8d00f99e9fd..f979d7320ad 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15706,8 +15706,8 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   return 1;
 }
 
-/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum or
-   minimum with "C" semantics.
+/* Possibly emit the xsmaxc{dp,qp} and xsminc{dp,qp} instructions to emit a
+   maximum or minimum with "C" semantics.
 
    Unless you use -ffast-math, you can't use these instructions to replace
    conditions that implicitly reverse the condition because the comparison
@@ -15843,6 +15843,10 @@ have_compare_and_set_mask (machine_mode mode)
     case E_DFmode:
       return TARGET_P9_MINMAX;
 
+    case E_KFmode:
+    case E_TFmode:
+      return TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode);
+
     default:
       break;
     }
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 17b2fdc1cdd..c74fc8ce6fc 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5429,6 +5429,112 @@
   "xxsel %x0,%x4,%x3,%x1"
   [(set_attr "type" "vecmove")])
 
+;; Support for ISA 3.1 IEEE 128-bit conditional move.  The mode used in the
+;; comparison must be the same as used in the conditional move.
+(define_expand "mov<mode>cc"
+   [(set (match_operand:IEEE128 0 "gpc_reg_operand")
+	 (if_then_else:IEEE128 (match_operand 1 "comparison_operator")
+			       (match_operand:IEEE128 2 "gpc_reg_operand")
+			       (match_operand:IEEE128 3 "gpc_reg_operand")))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+{
+  if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
+    DONE;
+  else
+    FAIL;
+})
+
+(define_insn_and_split "*mov<mode>cc_p10"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=&v,v")
+	(if_then_else:IEEE128
+	 (match_operator:CCFP 1 "fpmask_comparison_operator"
+		[(match_operand:IEEE128 2 "altivec_register_operand" "v,v")
+		 (match_operand:IEEE128 3 "altivec_register_operand" "v,v")])
+	 (match_operand:IEEE128 4 "altivec_register_operand" "v,v")
+	 (match_operand:IEEE128 5 "altivec_register_operand" "v,v")))
+   (clobber (match_scratch:V2DI 6 "=0,&v"))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "#"
+  "&& 1"
+  [(set (match_dup 6)
+	(if_then_else:V2DI (match_dup 1)
+			   (match_dup 7)
+			   (match_dup 8)))
+   (set (match_dup 0)
+	(if_then_else:IEEE128 (ne (match_dup 6)
+				  (match_dup 8))
+			      (match_dup 4)
+			      (match_dup 5)))]
+{
+  if (GET_CODE (operands[6]) == SCRATCH)
+    operands[6] = gen_reg_rtx (V2DImode);
+
+  operands[7] = CONSTM1_RTX (V2DImode);
+  operands[8] = CONST0_RTX (V2DImode);
+}
+ [(set_attr "length" "8")
+  (set_attr "type" "vecperm")])
+
+;; Handle inverting the fpmask comparisons.
+(define_insn_and_split "*mov<mode>cc_invert_p10"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=&v,v")
+	(if_then_else:IEEE128
+	 (match_operator:CCFP 1 "invert_fpmask_comparison_operator"
+		[(match_operand:IEEE128 2 "altivec_register_operand" "v,v")
+		 (match_operand:IEEE128 3 "altivec_register_operand" "v,v")])
+	 (match_operand:IEEE128 4 "altivec_register_operand" "v,v")
+	 (match_operand:IEEE128 5 "altivec_register_operand" "v,v")))
+   (clobber (match_scratch:V2DI 6 "=0,&v"))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "#"
+  "&& 1"
+  [(set (match_dup 6)
+	(if_then_else:V2DI (match_dup 9)
+			   (match_dup 7)
+			   (match_dup 8)))
+   (set (match_dup 0)
+	(if_then_else:IEEE128 (ne (match_dup 6)
+				  (match_dup 8))
+			      (match_dup 5)
+			      (match_dup 4)))]
+{
+  rtx op1 = operands[1];
+  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (op1));
+
+  if (GET_CODE (operands[6]) == SCRATCH)
+    operands[6] = gen_reg_rtx (V2DImode);
+
+  operands[7] = CONSTM1_RTX (V2DImode);
+  operands[8] = CONST0_RTX (V2DImode);
+
+  operands[9] = gen_rtx_fmt_ee (cond, CCFPmode, operands[2], operands[3]);
+}
+ [(set_attr "length" "8")
+  (set_attr "type" "vecperm")])
+
+(define_insn "*fpmask<mode>"
+  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
+	(if_then_else:V2DI
+	 (match_operator:CCFP 1 "fpmask_comparison_operator"
+		[(match_operand:IEEE128 2 "altivec_register_operand" "v")
+		 (match_operand:IEEE128 3 "altivec_register_operand" "v")])
+	 (match_operand:V2DI 4 "all_ones_constant" "")
+	 (match_operand:V2DI 5 "zero_constant" "")))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xscmp%V1qp %0,%2,%3"
+  [(set_attr "type" "fpcompare")])
+
+(define_insn "*xxsel<mode>"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(if_then_else:IEEE128
+	 (ne (match_operand:V2DI 1 "altivec_register_operand" "v")
+	     (match_operand:V2DI 2 "zero_constant" ""))
+	 (match_operand:IEEE128 3 "altivec_register_operand" "v")
+	 (match_operand:IEEE128 4 "altivec_register_operand" "v")))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xxsel %x0,%x4,%x3,%x1"
+  [(set_attr "type" "vecmove")])
+
 \f
 ;; Conversions to and from floating-point.
 
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmove.c b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
new file mode 100644
index 00000000000..2fae8dc23bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#ifndef TYPE
+#ifdef __LONG_DOUBLE_IEEE128__
+#define TYPE long double
+
+#else
+#define TYPE _Float128
+#endif
+#endif
+
+/* Verify that the ISA 3.1 (power10) IEEE 128-bit conditional move instructions
+   are generated.  */
+
+TYPE
+eq (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a == b) ? c : d;
+}
+
+TYPE
+ne (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a != b) ? c : d;
+}
+
+TYPE
+lt (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a < b) ? c : d;
+}
+
+TYPE
+le (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a <= b) ? c : d;
+}
+
+TYPE
+gt (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a > b) ? c : d;
+}
+
+TYPE
+ge (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a >= b) ? c : d;
+}
+
+/* { dg-final { scan-assembler-times {\mxscmpeqqp\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxscmpgeqp\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxscmpgtqp\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxxsel\M}     6 } } */
+/* { dg-final { scan-assembler-not   {\mxscmpuqp\M}    } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
new file mode 100644
index 00000000000..6f7627c0f2a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
@@ -0,0 +1,15 @@
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return (a < b) ? a : b; }
+TYPE f128_max (TYPE a, TYPE b) { return (b > a) ? b : a; }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-04-15  0:34 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-14 19:54 [gcc(refs/users/meissner/heads/work048)] Add IEEE 128-bit fp conditional move on PowerPC Michael Meissner
2021-04-15  0:34 Michael Meissner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).