[PATCH 0/2] Add IEEE 128-bit min/max support on PowerPC

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH 0/2] Add IEEE 128-bit min/max support on PowerPC
@ 2021-04-09 14:38 Michael Meissner
  2021-04-09 14:42 ` [PATCH 1/2] " Michael Meissner
  2021-04-09 14:43 ` [PATCH 2/2] " Michael Meissner
  0 siblings, 2 replies; 22+ messages in thread
From: Michael Meissner @ 2021-04-09 14:38 UTC (permalink / raw)
  To: gcc-patches, Michael Meissner, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt

These patches have been posted quite a few times before.  My memory is I
addressed the concerns posted with the last set of changes in November.

These two patches add support for the ISA 3.1 (power10) instructions xsmaxcqp,
xsmincqp, xscmpeqqp, xscmpgeqp, and xscmpgtqp.

I have tested these patches quite extensively on both little endian and big
endian over the months.

The first patch adds the basic min/max support and generates the xsmaxcqp and
xsmincqp instructions if -Ofast is used.

The second patch adds support for conditional move.  This also enables
generating the min/max instructions in some cases without using -Ofast.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC
  2021-04-09 14:38 [PATCH 0/2] Add IEEE 128-bit min/max support on PowerPC Michael Meissner
@ 2021-04-09 14:42 ` Michael Meissner
  2021-04-09 16:54   ` will schmidt
  2021-04-13 22:19   ` Segher Boessenkool
  2021-04-09 14:43 ` [PATCH 2/2] " Michael Meissner
  1 sibling, 2 replies; 22+ messages in thread
From: Michael Meissner @ 2021-04-09 14:42 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt

Add IEEE 128-bit min/max support on PowerPC.

This patch has been posted various times in the past.  My memory is the last
time I changed the patch, I addressed the concerns posted at that time.  Since
then the patch seems to have gone into a limbo state.

This patch adds the support for the IEEE 128-bit floating point C minimum and
maximum instructions.  The next patch will add the support for using the
compare and set mask instruction to implement conditional moves.

Rather than trying to overload the current SF/DF min/max support, it was
simpler to just provide the new instructions as a separate insn.

I have tested this patch in various little endian and big endian PowerPC builds
since I've posted.  It has no regressions, and it adds the instructions  if
-mcpu=power10 is used.

gcc/
2021-04-09  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
	3.1 IEEE 128-bit floating point xsmaxcqp and xsmincqp instructions.
	* config/rs6000/rs60000.h (FLOAT128_MIN_MAX_FPMASK_P): New macro.
	* config/rs6000/rs6000.md (s<minmax><mode>3): Add support for the
	ISA 3.1 IEEE 128-bit minimum and maximum instructions.

gcc/testsuite/
2021-04-09  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/float128-minmax-2.c: New test.
---
 gcc/config/rs6000/rs6000.c                        |  3 ++-
 gcc/config/rs6000/rs6000.h                        |  5 +++++
 gcc/config/rs6000/rs6000.md                       | 11 +++++++++++
 .../gcc.target/powerpc/float128-minmax-2.c        | 15 +++++++++++++++
 4 files changed, 33 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 35f5c332c41..e87686c1c4d 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -16111,7 +16111,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
   /* VSX/altivec have direct min/max insns.  */
   if ((code == SMAX || code == SMIN)
       && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
-	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
+	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
+	  || FLOAT128_MIN_MAX_FPMASK_P (mode)))
     {
       emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
       return;
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 233a92baf3c..e3fb0798622 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -345,6 +345,11 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
    || ((MODE) == TDmode)						\
    || (!TARGET_FLOAT128_TYPE && FLOAT128_IEEE_P (MODE)))
 
+/* Macro whether the float128 minimum, maximum, and set compare mask
+   instructions are enabled.  */
+#define FLOAT128_MIN_MAX_FPMASK_P(MODE)					\
+  (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (MODE))
+
 /* Return true for floating point that does not use a vector register.  */
 #define SCALAR_FLOAT_MODE_NOT_VECTOR_P(MODE)				\
   (SCALAR_FLOAT_MODE_P (MODE) && !FLOAT128_VECTOR_P (MODE))
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c8cdc42533c..17b2fdc1cdd 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5194,6 +5194,17 @@ (define_insn "*s<minmax><mode>3_vsx"
 }
   [(set_attr "type" "fp")])
 
+;; Min/max for ISA 3.1 IEEE 128-bit floating point
+(define_insn "s<minmax><mode>3"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(fp_minmax:IEEE128
+	 (match_operand:IEEE128 1 "altivec_register_operand" "v")
+	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
+  "TARGET_POWER10"
+  "xs<minmax>cqp %0,%1,%2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
 ;; The conditional move instructions allow us to perform max and min operations
 ;; even when we don't have the appropriate max/min instruction using the FSEL
 ;; instruction.
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
new file mode 100644
index 00000000000..c71ba08c9f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
@@ -0,0 +1,15 @@
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
+TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
-- 
2.22.0


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 2/2] Add IEEE 128-bit min/max support on PowerPC
  2021-04-09 14:38 [PATCH 0/2] Add IEEE 128-bit min/max support on PowerPC Michael Meissner
  2021-04-09 14:42 ` [PATCH 1/2] " Michael Meissner
@ 2021-04-09 14:43 ` Michael Meissner
  2021-04-09 16:54   ` will schmidt
  2021-04-14 19:38   ` Segher Boessenkool
  1 sibling, 2 replies; 22+ messages in thread
From: Michael Meissner @ 2021-04-09 14:43 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt

Add IEEE 128-bit fp conditional move on PowerPC.

This patch has been posted various times in the past.  My memory is the last
time I changed the patch, I addressed the concerns posted at that time.  Since
then the patch seems to have gone into a limbo state.

This patch adds the support for power10 IEEE 128-bit floating point conditional
move and for automatically generating min/max.  Unlike the previous patch, I
decided to keep two separate patterns for fpmask before splitting (one pattern
for normal compares, and the other pattern for inverted compares).  I can go
back to a single pattern with a new predicate that allows either comparison.

Compared to the original code, these patterns do simplify the fpmask insns to
having one alternative instead of two.  In the original code, the first
alternative tried to use the result as a temporary register.  But that doesn't
work if you are doing a conditional move with SF/DF types, but the comparison
is KF/TF.  That is because the SF/DF types can use the traditional FPR
registers, but IEEE 128-bit floating point can only do arithmetic in the
traditional Altivec registers.

This code also has to insert a XXPERMDI if you are moving KF/TF values, but
the comparison is done with SF/DF values.  In this case, the set and compare
mask for SF/DF clears the bottom 64-bits of the register, and the XXPERMDI is
needed to fill it.

I have tested this patch in various little endian and big endian PowerPC builds
since I've posted.  It has no regressions, and it adds the instructions  if
-mcpu=power10 is used.

gcc/
2021-04-09 Michael Meissner  <meissner@linux.ibm.com>

        * config/rs6000/rs6000.c (have_compare_and_set_mask): Add IEEE
        128-bit floating point types.
        * config/rs6000/rs6000.md (FPMASK): New iterator.
        (FPMASK2): New iterator.
        (Fv mode attribute): Add KFmode and TFmode.
        (mov<FPMASK:mode><FPMASK2:mode>cc_fpmask): Replace
        mov<SFDF:mode><SFDF2:mode>cc_p9.  Add IEEE 128-bit fp support.
        (mov<FPMASK:mode><FPMASK2:mode>cc_invert_fpmask): Replace
        mov<SFDF:mode><SFDF2:mode>cc_invert_p9.  Add IEEE 128-bit fp
        support.
        (fpmask<mode>): Add IEEE 128-bit fp support.  Enable generator to
        build te RTL.
        (xxsel<mode>): Add IEEE 128-bit fp support.  Enable generator to
        build te RTL.

gcc/testsuite/
2021-04-09  Michael Meissner  <meissner@linux.ibm.com>

        * gcc.target/powerpc/float128-cmove.c: New test.
        * gcc.target/powerpc/float128-minmax-3.c: New test.
---
 gcc/config/rs6000/rs6000.c                    |   8 +-
 gcc/config/rs6000/rs6000.md                   | 186 ++++++++++++------
 .../gcc.target/powerpc/float128-cmove.c       |  93 +++++++++
 .../gcc.target/powerpc/float128-minmax-3.c    |  15 ++
 4 files changed, 236 insertions(+), 66 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-cmove.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index e87686c1c4d..ad0d83f6d3f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15706,8 +15706,8 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   return 1;
 }
 
-/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum or
-   minimum with "C" semantics.
+/* Possibly emit the xsmaxc{dp,qp} and xsminc{dp,qp} instructions to emit a
+   maximum or minimum with "C" semantics.
 
    Unless you use -ffast-math, you can't use these instructions to replace
    conditions that implicitly reverse the condition because the comparison
@@ -15843,6 +15843,10 @@ have_compare_and_set_mask (machine_mode mode)
     case E_DFmode:
       return TARGET_P9_MINMAX;
 
+    case E_KFmode:
+    case E_TFmode:
+      return FLOAT128_MIN_MAX_FPMASK_P (mode);
+
     default:
       break;
     }
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 17b2fdc1cdd..ca4a4d01f05 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -575,6 +575,19 @@ (define_mode_iterator SFDF [SF DF])
 ; And again, for when we need two FP modes in a pattern.
 (define_mode_iterator SFDF2 [SF DF])
 
+; Floating scalars that supports the set compare mask instruction.
+(define_mode_iterator FPMASK [SF
+			      DF
+			      (KF "FLOAT128_MIN_MAX_FPMASK_P (KFmode)")
+			      (TF "FLOAT128_MIN_MAX_FPMASK_P (TFmode)")])
+
+; And again, for patterns that need two (potentially) different floating point
+; scalars that support the set compare mask instruction.
+(define_mode_iterator FPMASK2 [SF
+			       DF
+			       (KF "FLOAT128_MIN_MAX_FPMASK_P (KFmode)")
+			       (TF "FLOAT128_MIN_MAX_FPMASK_P (TFmode)")])
+
 ; A generic s/d attribute, for sp/dp for example.
 (define_mode_attr sd [(SF   "s") (DF   "d")
 		      (V4SF "s") (V2DF "d")])
@@ -608,8 +621,13 @@ (define_mode_attr Ff		[(SF "f") (DF "d") (DI "d")])
 ; SF/DF constraint for arithmetic on VSX registers using instructions added in
 ; ISA 2.06 (power7).  This includes instructions that normally target DF mode,
 ; but are used on SFmode, since internally SFmode values are kept in the DFmode
-; format.
-(define_mode_attr Fv		[(SF "wa") (DF "wa") (DI "wa")])
+; format.  Also include IEEE 128-bit instructions which are restricted to the
+; Altivec registers.
+(define_mode_attr Fv		[(SF "wa")
+				 (DF "wa")
+				 (DI "wa")
+				 (KF "v")
+				 (TF "v")])
 
 ; Which isa is needed for those float instructions?
 (define_mode_attr Fisa		[(SF "p8v")  (DF "*") (DI "*")])
@@ -5316,10 +5334,10 @@ (define_insn "*setnbcr_<un>signed_<GPR:mode>"
 
 ;; Floating point conditional move
 (define_expand "mov<mode>cc"
-   [(set (match_operand:SFDF 0 "gpc_reg_operand")
-	 (if_then_else:SFDF (match_operand 1 "comparison_operator")
-			    (match_operand:SFDF 2 "gpc_reg_operand")
-			    (match_operand:SFDF 3 "gpc_reg_operand")))]
+   [(set (match_operand:FPMASK 0 "gpc_reg_operand")
+	 (if_then_else:FPMASK (match_operand 1 "comparison_operator")
+			      (match_operand:FPMASK 2 "gpc_reg_operand")
+			      (match_operand:FPMASK 3 "gpc_reg_operand")))]
   "TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT"
 {
   if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
@@ -5339,92 +5357,132 @@ (define_insn "*fsel<SFDF:mode><SFDF2:mode>4"
   "fsel %0,%1,%2,%3"
   [(set_attr "type" "fp")])
 
-(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_p9"
-  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
-	(if_then_else:SFDF
+(define_insn_and_split "*mov<FPMASK:mode><FPMASK2:mode>cc_fpmask"
+  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=<FPMASK:Fv>")
+	(if_then_else:FPMASK
 	 (match_operator:CCFP 1 "fpmask_comparison_operator"
-		[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
-		 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
-	 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
-	 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
-   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
+	    [(match_operand:FPMASK2 2 "vsx_register_operand" "<FPMASK2:Fv>")
+	     (match_operand:FPMASK2 3 "vsx_register_operand" "<FPMASK2:Fv>")])
+	 (match_operand:FPMASK 4 "vsx_register_operand" "<FPMASK:Fv>")
+	 (match_operand:FPMASK 5 "vsx_register_operand" "<FPMASK:Fv>")))
+   (clobber (match_scratch:V2DI 6 "=&<FPMASK2:Fv>"))]
   "TARGET_P9_MINMAX"
   "#"
   "&& 1"
-  [(set (match_dup 6)
-	(if_then_else:V2DI (match_dup 1)
-			   (match_dup 7)
-			   (match_dup 8)))
-   (set (match_dup 0)
-	(if_then_else:SFDF (ne (match_dup 6)
-			       (match_dup 8))
-			   (match_dup 4)
-			   (match_dup 5)))]
+  [(pc)]
 {
-  if (GET_CODE (operands[6]) == SCRATCH)
-    operands[6] = gen_reg_rtx (V2DImode);
+  rtx dest = operands[0];
+  rtx cmp = operands[1];
+  rtx cmp_op1 = operands[2];
+  rtx cmp_op2 = operands[3];
+  rtx move_t = operands[4];
+  rtx move_f = operands[5];
+  rtx mask_reg = operands[6];
+  rtx mask_m1 = CONSTM1_RTX (V2DImode);
+  rtx mask_0 = CONST0_RTX (V2DImode);
+  machine_mode move_mode = <FPMASK:MODE>mode;
+  machine_mode compare_mode = <FPMASK2:MODE>mode;
+
+  if (GET_CODE (mask_reg) == SCRATCH)
+    mask_reg = gen_reg_rtx (V2DImode);
 
-  operands[7] = CONSTM1_RTX (V2DImode);
-  operands[8] = CONST0_RTX (V2DImode);
+  /* Emit the compare and set mask instruction.  */
+  emit_insn (gen_fpmask<FPMASK2:mode> (mask_reg, cmp, cmp_op1, cmp_op2,
+				       mask_m1, mask_0));
+
+  /* If we have a 64-bit comparison, but an 128-bit move, we need to extend the
+     mask.  Because we are using the splat builtin to extend the V2DImode, we
+     need to use element 1 on little endian systems.  */
+  if (!FLOAT128_IEEE_P (compare_mode) && FLOAT128_IEEE_P (move_mode))
+    {
+      rtx element = WORDS_BIG_ENDIAN ? const0_rtx : const1_rtx;
+      emit_insn (gen_vsx_xxspltd_v2di (mask_reg, mask_reg, element));
+    }
+
+  /* Now emit the XXSEL insn.  */
+  emit_insn (gen_xxsel<FPMASK:mode> (dest, mask_reg, mask_0, move_t, move_f));
+  DONE;
 }
- [(set_attr "length" "8")
+ ;; length is 12 in case we need to add XXPERMDI
+ [(set_attr "length" "12")
   (set_attr "type" "vecperm")])
 
 ;; Handle inverting the fpmask comparisons.
-(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_invert_p9"
-  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
-	(if_then_else:SFDF
+(define_insn_and_split "*mov<FPMASK:mode><FPMASK2:mode>cc_invert_fpmask"
+  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=<FPMASK:Fv>")
+	(if_then_else:FPMASK
 	 (match_operator:CCFP 1 "invert_fpmask_comparison_operator"
-		[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
-		 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
-	 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
-	 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
-   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
+	    [(match_operand:FPMASK2 2 "vsx_register_operand" "<FPMASK2:Fv>")
+	     (match_operand:FPMASK2 3 "vsx_register_operand" "<FPMASK2:Fv>")])
+	 (match_operand:FPMASK 4 "vsx_register_operand" "<FPMASK:Fv>")
+	 (match_operand:FPMASK 5 "vsx_register_operand" "<FPMASK:Fv>")))
+   (clobber (match_scratch:V2DI 6 "=&<FPMASK2:Fv>"))]
   "TARGET_P9_MINMAX"
   "#"
   "&& 1"
-  [(set (match_dup 6)
-	(if_then_else:V2DI (match_dup 9)
-			   (match_dup 7)
-			   (match_dup 8)))
-   (set (match_dup 0)
-	(if_then_else:SFDF (ne (match_dup 6)
-			       (match_dup 8))
-			   (match_dup 5)
-			   (match_dup 4)))]
+  [(pc)]
 {
-  rtx op1 = operands[1];
-  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (op1));
+  rtx dest = operands[0];
+  rtx old_cmp = operands[1];
+  rtx cmp_op1 = operands[2];
+  rtx cmp_op2 = operands[3];
+  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (old_cmp));
+  rtx cmp_rev = gen_rtx_fmt_ee (cond, CCFPmode, cmp_op1, cmp_op2);
+  rtx move_f = operands[4];
+  rtx move_t = operands[5];
+  rtx mask_reg = operands[6];
+  rtx mask_m1 = CONSTM1_RTX (V2DImode);
+  rtx mask_0 = CONST0_RTX (V2DImode);
+  machine_mode move_mode = <FPMASK:MODE>mode;
+  machine_mode compare_mode = <FPMASK2:MODE>mode;
 
-  if (GET_CODE (operands[6]) == SCRATCH)
-    operands[6] = gen_reg_rtx (V2DImode);
+  if (GET_CODE (mask_reg) == SCRATCH)
+    mask_reg = gen_reg_rtx (V2DImode);
 
-  operands[7] = CONSTM1_RTX (V2DImode);
-  operands[8] = CONST0_RTX (V2DImode);
+  /* Emit the compare and set mask instruction.  */
+  emit_insn (gen_fpmask<FPMASK2:mode> (mask_reg, cmp_rev, cmp_op1, cmp_op2,
+				       mask_m1, mask_0));
 
-  operands[9] = gen_rtx_fmt_ee (cond, CCFPmode, operands[2], operands[3]);
+  /* If we have a 64-bit comparison, but an 128-bit move, we need to extend the
+     mask.  Because we are using the splat builtin to extend the V2DImode, we
+     need to use element 1 on little endian systems.  */
+  if (!FLOAT128_IEEE_P (compare_mode) && FLOAT128_IEEE_P (move_mode))
+    {
+      rtx element = WORDS_BIG_ENDIAN ? const0_rtx : const1_rtx;
+      emit_insn (gen_vsx_xxspltd_v2di (mask_reg, mask_reg, element));
+    }
+
+  /* Now emit the XXSEL insn.  */
+  emit_insn (gen_xxsel<FPMASK:mode> (dest, mask_reg, mask_0, move_t, move_f));
+  DONE;
 }
- [(set_attr "length" "8")
+ ;; length is 12 in case we need to add XXPERMDI
+ [(set_attr "length" "12")
   (set_attr "type" "vecperm")])
 
-(define_insn "*fpmask<mode>"
-  [(set (match_operand:V2DI 0 "vsx_register_operand" "=wa")
+(define_insn "fpmask<mode>"
+  [(set (match_operand:V2DI 0 "vsx_register_operand" "=<Fv>")
 	(if_then_else:V2DI
 	 (match_operator:CCFP 1 "fpmask_comparison_operator"
-		[(match_operand:SFDF 2 "vsx_register_operand" "<Fv>")
-		 (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")])
+		[(match_operand:FPMASK 2 "vsx_register_operand" "<Fv>")
+		 (match_operand:FPMASK 3 "vsx_register_operand" "<Fv>")])
 	 (match_operand:V2DI 4 "all_ones_constant" "")
 	 (match_operand:V2DI 5 "zero_constant" "")))]
   "TARGET_P9_MINMAX"
-  "xscmp%V1dp %x0,%x2,%x3"
+{
+  return (FLOAT128_IEEE_P (<MODE>mode)
+	  ? "xscmp%V1qp %0,%2,%3"
+	  : "xscmp%V1dp %x0,%x2,%x3");
+}
   [(set_attr "type" "fpcompare")])
 
-(define_insn "*xxsel<mode>"
-  [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>")
-	(if_then_else:SFDF (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
-			       (match_operand:V2DI 2 "zero_constant" ""))
-			   (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")
-			   (match_operand:SFDF 4 "vsx_register_operand" "<Fv>")))]
+(define_insn "xxsel<mode>"
+  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=wa")
+	(if_then_else:FPMASK
+	 (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
+	     (match_operand:V2DI 2 "zero_constant" ""))
+	 (match_operand:FPMASK 3 "vsx_register_operand" "wa")
+	 (match_operand:FPMASK 4 "vsx_register_operand" "wa")))]
   "TARGET_P9_MINMAX"
   "xxsel %x0,%x4,%x3,%x1"
   [(set_attr "type" "vecmove")])
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmove.c b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
new file mode 100644
index 00000000000..639d5a77087
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
@@ -0,0 +1,93 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+/* { dg-final { scan-assembler     {\mxscmpeq[dq]p\M} } } */
+/* { dg-final { scan-assembler     {\mxxpermdi\M}     } } */
+/* { dg-final { scan-assembler     {\mxxsel\M}        } } */
+/* { dg-final { scan-assembler-not {\mxscmpu[dq]p\M}  } } */
+/* { dg-final { scan-assembler-not {\mfcmp[uo]\M}     } } */
+/* { dg-final { scan-assembler-not {\mfsel\M}         } } */
+
+/* This series of tests tests whether you can do a conditional move where the
+   test is one floating point type, and the result is another floating point
+   type.
+
+   If the comparison type is SF/DFmode, and the move type is IEEE 128-bit
+   floating point, we have to duplicate the mask in the lower 64-bits with
+   XXPERMDI because XSCMPEQDP clears the bottom 64-bits of the mask register.
+
+   Going the other way (IEEE 128-bit comparsion, 64-bit move) is fine as the
+   mask word will be 128-bits.  */
+
+float
+eq_f_d (float a, float b, double x, double y)
+{
+  return (x == y) ? a : b;
+}
+
+double
+eq_d_f (double a, double b, float x, float y)
+{
+  return (x == y) ? a : b;
+}
+
+float
+eq_f_f128 (float a, float b, __float128 x, __float128 y)
+{
+  return (x == y) ? a : b;
+}
+
+double
+eq_d_f128 (double a, double b, __float128 x, __float128 y)
+{
+  return (x == y) ? a : b;
+}
+
+__float128
+eq_f128_f (__float128 a, __float128 b, float x, float y)
+{
+  return (x == y) ? a : b;
+}
+
+__float128
+eq_f128_d (__float128 a, __float128 b, double x, double y)
+{
+  return (x != y) ? a : b;
+}
+
+float
+ne_f_d (float a, float b, double x, double y)
+{
+  return (x != y) ? a : b;
+}
+
+double
+ne_d_f (double a, double b, float x, float y)
+{
+  return (x != y) ? a : b;
+}
+
+float
+ne_f_f128 (float a, float b, __float128 x, __float128 y)
+{
+  return (x != y) ? a : b;
+}
+
+double
+ne_d_f128 (double a, double b, __float128 x, __float128 y)
+{
+  return (x != y) ? a : b;
+}
+
+__float128
+ne_f128_f (__float128 a, __float128 b, float x, float y)
+{
+  return (x != y) ? a : b;
+}
+
+__float128
+ne_f128_d (__float128 a, __float128 b, double x, double y)
+{
+  return (x != y) ? a : b;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
new file mode 100644
index 00000000000..6f7627c0f2a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
@@ -0,0 +1,15 @@
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return (a < b) ? a : b; }
+TYPE f128_max (TYPE a, TYPE b) { return (b > a) ? b : a; }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
-- 
2.22.0


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC
  2021-04-09 14:42 ` [PATCH 1/2] " Michael Meissner
@ 2021-04-09 16:54   ` will schmidt
  2021-04-13 17:57     ` Segher Boessenkool
  2021-04-13 22:19   ` Segher Boessenkool
  1 sibling, 1 reply; 22+ messages in thread
From: will schmidt @ 2021-04-09 16:54 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

On Fri, 2021-04-09 at 10:42 -0400, Michael Meissner wrote:
> Add IEEE 128-bit min/max support on PowerPC.
> 
> This patch has been posted various times in the past.  My memory is the last
> time I changed the patch, I addressed the concerns posted at that time.  Since
> then the patch seems to have gone into a limbo state.

Hi,

I'll throw some comments at this below, and see if it will trigger more
follow-up. 

> 
> This patch adds the support for the IEEE 128-bit floating point C minimum and
> maximum instructions.  The next patch will add the support for using the
> compare and set mask instruction to implement conditional moves.
> 
> Rather than trying to overload the current SF/DF min/max support, it was
> simpler to just provide the new instructions as a separate insn.
> 
> I have tested this patch in various little endian and big endian PowerPC builds
> since I've posted.  It has no regressions, and it adds the instructions  if
> -mcpu=power10 is used.
> 
> gcc/
> 2021-04-09  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> 	3.1 IEEE 128-bit floating point xsmaxcqp and xsmincqp instructions.

I don't see any direct reference to xsmaxcqp or xsmincqp with respect
to this change below. 

It looks like this change adds the FLOAT128_MIN_MAX_FPMASK_P (mode)
check
as criteria for emitting some form of a SET instruction. 
       emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));

Ok, I see it now,  the instructions are mildly obfuscated by
"xs<minmax>cqp" as part of the rs6000.md change below.




> 	* config/rs6000/rs60000.h (FLOAT128_MIN_MAX_FPMASK_P): New macro.

which is
#define FLOAT128_MIN_MAX_FPMASK_P(MODE)	\
  (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (MODE))

Are there any non MIN_MAX scenarios that will require the combination
of POWER10,FLOAT128_HW,FLOAT128_IEEE(mode)?  I'd wonder if there is a name
not specific to *_MIN_MAX_* that would be a better naming choice.
But, naming is hard. :-)


> 	* config/rs6000/rs6000.md (s<minmax><mode>3): Add support for the
> 	ISA 3.1 IEEE 128-bit minimum and maximum instructions.

I'd move the "xsmaxcqp,xsmincqp" instruction references from the rs6000.c changelog blurb to this changelog blurb.

I've looked over the rest, no further relevant comments below.
thanks
-Will

> 
> gcc/testsuite/
> 2021-04-09  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* gcc.target/powerpc/float128-minmax-2.c: New test.
> ---
>  gcc/config/rs6000/rs6000.c                        |  3 ++-
>  gcc/config/rs6000/rs6000.h                        |  5 +++++
>  gcc/config/rs6000/rs6000.md                       | 11 +++++++++++
>  .../gcc.target/powerpc/float128-minmax-2.c        | 15 +++++++++++++++
>  4 files changed, 33 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 35f5c332c41..e87686c1c4d 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -16111,7 +16111,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
>    /* VSX/altivec have direct min/max insns.  */
>    if ((code == SMAX || code == SMIN)
>        && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
> -	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
> +	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
> +	  || FLOAT128_MIN_MAX_FPMASK_P (mode)))
>      {
>        emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
>        return;

ok


> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index 233a92baf3c..e3fb0798622 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -345,6 +345,11 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
>     || ((MODE) == TDmode)						\
>     || (!TARGET_FLOAT128_TYPE && FLOAT128_IEEE_P (MODE)))
> 
> +/* Macro whether the float128 minimum, maximum, and set compare mask
> +   instructions are enabled.  */
> +#define FLOAT128_MIN_MAX_FPMASK_P(MODE)					\
> +  (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (MODE))
> +
>  /* Return true for floating point that does not use a vector register.  */
>  #define SCALAR_FLOAT_MODE_NOT_VECTOR_P(MODE)				\
>    (SCALAR_FLOAT_MODE_P (MODE) && !FLOAT128_VECTOR_P (MODE))


ok

> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index c8cdc42533c..17b2fdc1cdd 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -5194,6 +5194,17 @@ (define_insn "*s<minmax><mode>3_vsx"
>  }
>    [(set_attr "type" "fp")])
> 
> +;; Min/max for ISA 3.1 IEEE 128-bit floating point
> +(define_insn "s<minmax><mode>3"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> +	(fp_minmax:IEEE128
> +	 (match_operand:IEEE128 1 "altivec_register_operand" "v")
> +	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
> +  "TARGET_POWER10"
> +  "xs<minmax>cqp %0,%1,%2"
> +  [(set_attr "type" "vecfloat")
> +   (set_attr "size" "128")])
> +
>  ;; The conditional move instructions allow us to perform max and min operations
>  ;; even when we don't have the appropriate max/min instruction using the FSEL
>  ;; instruction.

ok

> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> new file mode 100644
> index 00000000000..c71ba08c9f8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> @@ -0,0 +1,15 @@
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
> +
> +#ifndef TYPE
> +#define TYPE _Float128
> +#endif
> +
> +/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
> +   call.  */
> +TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
> +TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
> +
> +/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
> +/* { dg-final { scan-assembler {\mxsmincqp\M} } } */

ok

> -- 
> 2.22.0
> 
> 



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] Add IEEE 128-bit min/max support on PowerPC
  2021-04-09 14:43 ` [PATCH 2/2] " Michael Meissner
@ 2021-04-09 16:54   ` will schmidt
  2021-04-09 19:20     ` Bernhard Reutner-Fischer
  2021-04-14 19:38   ` Segher Boessenkool
  1 sibling, 1 reply; 22+ messages in thread
From: will schmidt @ 2021-04-09 16:54 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

On Fri, 2021-04-09 at 10:43 -0400, Michael Meissner wrote:
> Add IEEE 128-bit fp conditional move on PowerPC.
> 
> This patch has been posted various times in the past.  My memory is the last
> time I changed the patch, I addressed the concerns posted at that time.  Since
> then the patch seems to have gone into a limbo state.

This is covered in the patch series title page, Don't distract from the
patch itself here.

> 
> This patch adds the support for power10 IEEE 128-bit floating point conditional
> move and for automatically generating min/max.  Unlike the previous patch, I
> decided to keep two separate patterns for fpmask before splitting (one pattern
> for normal compares, and the other pattern for inverted compares).  I can go
> back to a single pattern with a new predicate that allows either comparison.

ok.

> 
> Compared to the original code, these patterns do simplify the fpmask insns to
> having one alternative instead of two.  In the original code, the first
> alternative tried to use the result as a temporary register.  But that doesn't
> work if you are doing a conditional move with SF/DF types, but the comparison
> is KF/TF.  That is because the SF/DF types can use the traditional FPR
> registers, but IEEE 128-bit floating point can only do arithmetic in the
> traditional Altivec registers.

ok.

> 
> This code also has to insert a XXPERMDI if you are moving KF/TF values, but
> the comparison is done with SF/DF values.  In this case, the set and compare
> mask for SF/DF clears the bottom 64-bits of the register, and the XXPERMDI is
> needed to fill it.

ok.

> 
> I have tested this patch in various little endian and big endian PowerPC builds
> since I've posted.  It has no regressions, and it adds the instructions  if
> -mcpu=power10 is used.
> 
> gcc/
> 2021-04-09 Michael Meissner  <meissner@linux.ibm.com>
> 
>         * config/rs6000/rs6000.c (have_compare_and_set_mask): Add IEEE
>         128-bit floating point types.

ok

>         * config/rs6000/rs6000.md (FPMASK): New iterator.
>         (FPMASK2): New iterator.

comment on this below.


>         (Fv mode attribute): Add KFmode and TFmode.

ok


Missing an entry?  I'm not certain I've followed changelog versus code
accurately here.  May need an additional entry, stl
	(mov<mode>cc): Replace SFDF with FPMASK



>         (mov<FPMASK:mode><FPMASK2:mode>cc_fpmask): Replace
>         mov<SFDF:mode><SFDF2:mode>cc_p9.  Add IEEE 128-bit fp support.
>         (mov<FPMASK:mode><FPMASK2:mode>cc_invert_fpmask): Replace
>         mov<SFDF:mode><SFDF2:mode>cc_invert_p9.  Add IEEE 128-bit fp
>         support.
>         (fpmask<mode>): Add IEEE 128-bit fp support.  Enable generator to
>         build te RTL.
>         (xxsel<mode>): Add IEEE 128-bit fp support.  Enable generator to
>         build te RTL.
ok


> 
> gcc/testsuite/
> 2021-04-09  Michael Meissner  <meissner@linux.ibm.com>
> 
>         * gcc.target/powerpc/float128-cmove.c: New test.
>         * gcc.target/powerpc/float128-minmax-3.c: New test.
> ---
>  gcc/config/rs6000/rs6000.c                    |   8 +-
>  gcc/config/rs6000/rs6000.md                   | 186 ++++++++++++------
>  .../gcc.target/powerpc/float128-cmove.c       |  93 +++++++++
>  .../gcc.target/powerpc/float128-minmax-3.c    |  15 ++
>  4 files changed, 236 insertions(+), 66 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-cmove.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index e87686c1c4d..ad0d83f6d3f 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -15706,8 +15706,8 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
>    return 1;
>  }
> 
> -/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum or
> -   minimum with "C" semantics.
> +/* Possibly emit the xsmaxc{dp,qp} and xsminc{dp,qp} instructions to emit a
> +   maximum or minimum with "C" semantics.
> 
>     Unless you use -ffast-math, you can't use these instructions to replace
>     conditions that implicitly reverse the condition because the comparison
> @@ -15843,6 +15843,10 @@ have_compare_and_set_mask (machine_mode mode)
>      case E_DFmode:
>        return TARGET_P9_MINMAX;
> 
> +    case E_KFmode:
> +    case E_TFmode:
> +      return FLOAT128_MIN_MAX_FPMASK_P (mode);
> +
>      default:
>        break;
>      }
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 17b2fdc1cdd..ca4a4d01f05 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -575,6 +575,19 @@ (define_mode_iterator SFDF [SF DF])
>  ; And again, for when we need two FP modes in a pattern.
>  (define_mode_iterator SFDF2 [SF DF])
> 
> +; Floating scalars that supports the set compare mask instruction.
> +(define_mode_iterator FPMASK [SF
> +			      DF
> +			      (KF "FLOAT128_MIN_MAX_FPMASK_P (KFmode)")
> +			      (TF "FLOAT128_MIN_MAX_FPMASK_P (TFmode)")])
> +
> +; And again, for patterns that need two (potentially) different floating point
> +; scalars that support the set compare mask instruction.

All the words are here, i think.  But I'd be explicit that the defines
are identical by design.  


I've read over the rest, nothing else jumps out at me.

Thanks
-WIll



> +(define_mode_iterator FPMASK2 [SF
> +			       DF
> +			       (KF "FLOAT128_MIN_MAX_FPMASK_P (KFmode)")
> +			       (TF "FLOAT128_MIN_MAX_FPMASK_P (TFmode)")])
> +
>  ; A generic s/d attribute, for sp/dp for example.
>  (define_mode_attr sd [(SF   "s") (DF   "d")
>  		      (V4SF "s") (V2DF "d")])
> @@ -608,8 +621,13 @@ (define_mode_attr Ff		[(SF "f") (DF "d") (DI "d")])
>  ; SF/DF constraint for arithmetic on VSX registers using instructions added in
>  ; ISA 2.06 (power7).  This includes instructions that normally target DF mode,
>  ; but are used on SFmode, since internally SFmode values are kept in the DFmode
> -; format.
> -(define_mode_attr Fv		[(SF "wa") (DF "wa") (DI "wa")])
> +; format.  Also include IEEE 128-bit instructions which are restricted to the
> +; Altivec registers.
> +(define_mode_attr Fv		[(SF "wa")
> +				 (DF "wa")
> +				 (DI "wa")
> +				 (KF "v")
> +				 (TF "v")])
> 
>  ; Which isa is needed for those float instructions?
>  (define_mode_attr Fisa		[(SF "p8v")  (DF "*") (DI "*")])
> @@ -5316,10 +5334,10 @@ (define_insn "*setnbcr_<un>signed_<GPR:mode>"
> 
>  ;; Floating point conditional move
>  (define_expand "mov<mode>cc"
> -   [(set (match_operand:SFDF 0 "gpc_reg_operand")
> -	 (if_then_else:SFDF (match_operand 1 "comparison_operator")
> -			    (match_operand:SFDF 2 "gpc_reg_operand")
> -			    (match_operand:SFDF 3 "gpc_reg_operand")))]
> +   [(set (match_operand:FPMASK 0 "gpc_reg_operand")
> +	 (if_then_else:FPMASK (match_operand 1 "comparison_operator")
> +			      (match_operand:FPMASK 2 "gpc_reg_operand")
> +			      (match_operand:FPMASK 3 "gpc_reg_operand")))]
>    "TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT"
>  {
>    if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
> @@ -5339,92 +5357,132 @@ (define_insn "*fsel<SFDF:mode><SFDF2:mode>4"
>    "fsel %0,%1,%2,%3"
>    [(set_attr "type" "fp")])
> 
> -(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_p9"
> -  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
> -	(if_then_else:SFDF
> +(define_insn_and_split "*mov<FPMASK:mode><FPMASK2:mode>cc_fpmask"
> +  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=<FPMASK:Fv>")
> +	(if_then_else:FPMASK
>  	 (match_operator:CCFP 1 "fpmask_comparison_operator"
> -		[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
> -		 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
> -	 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
> -	 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
> -   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
> +	    [(match_operand:FPMASK2 2 "vsx_register_operand" "<FPMASK2:Fv>")
> +	     (match_operand:FPMASK2 3 "vsx_register_operand" "<FPMASK2:Fv>")])
> +	 (match_operand:FPMASK 4 "vsx_register_operand" "<FPMASK:Fv>")
> +	 (match_operand:FPMASK 5 "vsx_register_operand" "<FPMASK:Fv>")))
> +   (clobber (match_scratch:V2DI 6 "=&<FPMASK2:Fv>"))]
>    "TARGET_P9_MINMAX"
>    "#"
>    "&& 1"
> -  [(set (match_dup 6)
> -	(if_then_else:V2DI (match_dup 1)
> -			   (match_dup 7)
> -			   (match_dup 8)))
> -   (set (match_dup 0)
> -	(if_then_else:SFDF (ne (match_dup 6)
> -			       (match_dup 8))
> -			   (match_dup 4)
> -			   (match_dup 5)))]
> +  [(pc)]
>  {
> -  if (GET_CODE (operands[6]) == SCRATCH)
> -    operands[6] = gen_reg_rtx (V2DImode);
> +  rtx dest = operands[0];
> +  rtx cmp = operands[1];
> +  rtx cmp_op1 = operands[2];
> +  rtx cmp_op2 = operands[3];
> +  rtx move_t = operands[4];
> +  rtx move_f = operands[5];
> +  rtx mask_reg = operands[6];
> +  rtx mask_m1 = CONSTM1_RTX (V2DImode);
> +  rtx mask_0 = CONST0_RTX (V2DImode);
> +  machine_mode move_mode = <FPMASK:MODE>mode;
> +  machine_mode compare_mode = <FPMASK2:MODE>mode;
> +
> +  if (GET_CODE (mask_reg) == SCRATCH)
> +    mask_reg = gen_reg_rtx (V2DImode);
> 
> -  operands[7] = CONSTM1_RTX (V2DImode);
> -  operands[8] = CONST0_RTX (V2DImode);
> +  /* Emit the compare and set mask instruction.  */
> +  emit_insn (gen_fpmask<FPMASK2:mode> (mask_reg, cmp, cmp_op1, cmp_op2,
> +				       mask_m1, mask_0));
> +
> +  /* If we have a 64-bit comparison, but an 128-bit move, we need to extend the
> +     mask.  Because we are using the splat builtin to extend the V2DImode, we
> +     need to use element 1 on little endian systems.  */
> +  if (!FLOAT128_IEEE_P (compare_mode) && FLOAT128_IEEE_P (move_mode))
> +    {
> +      rtx element = WORDS_BIG_ENDIAN ? const0_rtx : const1_rtx;
> +      emit_insn (gen_vsx_xxspltd_v2di (mask_reg, mask_reg, element));
> +    }
> +
> +  /* Now emit the XXSEL insn.  */
> +  emit_insn (gen_xxsel<FPMASK:mode> (dest, mask_reg, mask_0, move_t, move_f));
> +  DONE;
>  }
> - [(set_attr "length" "8")
> + ;; length is 12 in case we need to add XXPERMDI
> + [(set_attr "length" "12")
>    (set_attr "type" "vecperm")])
> 
>  ;; Handle inverting the fpmask comparisons.
> -(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_invert_p9"
> -  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
> -	(if_then_else:SFDF
> +(define_insn_and_split "*mov<FPMASK:mode><FPMASK2:mode>cc_invert_fpmask"
> +  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=<FPMASK:Fv>")
> +	(if_then_else:FPMASK
>  	 (match_operator:CCFP 1 "invert_fpmask_comparison_operator"
> -		[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
> -		 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
> -	 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
> -	 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
> -   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
> +	    [(match_operand:FPMASK2 2 "vsx_register_operand" "<FPMASK2:Fv>")
> +	     (match_operand:FPMASK2 3 "vsx_register_operand" "<FPMASK2:Fv>")])
> +	 (match_operand:FPMASK 4 "vsx_register_operand" "<FPMASK:Fv>")
> +	 (match_operand:FPMASK 5 "vsx_register_operand" "<FPMASK:Fv>")))
> +   (clobber (match_scratch:V2DI 6 "=&<FPMASK2:Fv>"))]
>    "TARGET_P9_MINMAX"
>    "#"
>    "&& 1"
> -  [(set (match_dup 6)
> -	(if_then_else:V2DI (match_dup 9)
> -			   (match_dup 7)
> -			   (match_dup 8)))
> -   (set (match_dup 0)
> -	(if_then_else:SFDF (ne (match_dup 6)
> -			       (match_dup 8))
> -			   (match_dup 5)
> -			   (match_dup 4)))]
> +  [(pc)]
>  {
> -  rtx op1 = operands[1];
> -  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (op1));
> +  rtx dest = operands[0];
> +  rtx old_cmp = operands[1];
> +  rtx cmp_op1 = operands[2];
> +  rtx cmp_op2 = operands[3];
> +  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (old_cmp));
> +  rtx cmp_rev = gen_rtx_fmt_ee (cond, CCFPmode, cmp_op1, cmp_op2);
> +  rtx move_f = operands[4];
> +  rtx move_t = operands[5];
> +  rtx mask_reg = operands[6];
> +  rtx mask_m1 = CONSTM1_RTX (V2DImode);
> +  rtx mask_0 = CONST0_RTX (V2DImode);
> +  machine_mode move_mode = <FPMASK:MODE>mode;
> +  machine_mode compare_mode = <FPMASK2:MODE>mode;
> 
> -  if (GET_CODE (operands[6]) == SCRATCH)
> -    operands[6] = gen_reg_rtx (V2DImode);
> +  if (GET_CODE (mask_reg) == SCRATCH)
> +    mask_reg = gen_reg_rtx (V2DImode);
> 
> -  operands[7] = CONSTM1_RTX (V2DImode);
> -  operands[8] = CONST0_RTX (V2DImode);
> +  /* Emit the compare and set mask instruction.  */
> +  emit_insn (gen_fpmask<FPMASK2:mode> (mask_reg, cmp_rev, cmp_op1, cmp_op2,
> +				       mask_m1, mask_0));
> 
> -  operands[9] = gen_rtx_fmt_ee (cond, CCFPmode, operands[2], operands[3]);
> +  /* If we have a 64-bit comparison, but an 128-bit move, we need to extend the
> +     mask.  Because we are using the splat builtin to extend the V2DImode, we
> +     need to use element 1 on little endian systems.  */
> +  if (!FLOAT128_IEEE_P (compare_mode) && FLOAT128_IEEE_P (move_mode))
> +    {
> +      rtx element = WORDS_BIG_ENDIAN ? const0_rtx : const1_rtx;
> +      emit_insn (gen_vsx_xxspltd_v2di (mask_reg, mask_reg, element));
> +    }
> +
> +  /* Now emit the XXSEL insn.  */
> +  emit_insn (gen_xxsel<FPMASK:mode> (dest, mask_reg, mask_0, move_t, move_f));
> +  DONE;
>  }
> - [(set_attr "length" "8")
> + ;; length is 12 in case we need to add XXPERMDI
> + [(set_attr "length" "12")
>    (set_attr "type" "vecperm")])
> 
> -(define_insn "*fpmask<mode>"
> -  [(set (match_operand:V2DI 0 "vsx_register_operand" "=wa")
> +(define_insn "fpmask<mode>"
> +  [(set (match_operand:V2DI 0 "vsx_register_operand" "=<Fv>")
>  	(if_then_else:V2DI
>  	 (match_operator:CCFP 1 "fpmask_comparison_operator"
> -		[(match_operand:SFDF 2 "vsx_register_operand" "<Fv>")
> -		 (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")])
> +		[(match_operand:FPMASK 2 "vsx_register_operand" "<Fv>")
> +		 (match_operand:FPMASK 3 "vsx_register_operand" "<Fv>")])
>  	 (match_operand:V2DI 4 "all_ones_constant" "")
>  	 (match_operand:V2DI 5 "zero_constant" "")))]
>    "TARGET_P9_MINMAX"
> -  "xscmp%V1dp %x0,%x2,%x3"
> +{
> +  return (FLOAT128_IEEE_P (<MODE>mode)
> +	  ? "xscmp%V1qp %0,%2,%3"
> +	  : "xscmp%V1dp %x0,%x2,%x3");
> +}
>    [(set_attr "type" "fpcompare")])
> 
> -(define_insn "*xxsel<mode>"
> -  [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>")
> -	(if_then_else:SFDF (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
> -			       (match_operand:V2DI 2 "zero_constant" ""))
> -			   (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")
> -			   (match_operand:SFDF 4 "vsx_register_operand" "<Fv>")))]
> +(define_insn "xxsel<mode>"
> +  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=wa")
> +	(if_then_else:FPMASK
> +	 (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
> +	     (match_operand:V2DI 2 "zero_constant" ""))
> +	 (match_operand:FPMASK 3 "vsx_register_operand" "wa")
> +	 (match_operand:FPMASK 4 "vsx_register_operand" "wa")))]
>    "TARGET_P9_MINMAX"
>    "xxsel %x0,%x4,%x3,%x1"
>    [(set_attr "type" "vecmove")])
> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmove.c b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
> new file mode 100644
> index 00000000000..639d5a77087
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
> @@ -0,0 +1,93 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> +/* { dg-final { scan-assembler     {\mxscmpeq[dq]p\M} } } */
> +/* { dg-final { scan-assembler     {\mxxpermdi\M}     } } */
> +/* { dg-final { scan-assembler     {\mxxsel\M}        } } */
> +/* { dg-final { scan-assembler-not {\mxscmpu[dq]p\M}  } } */
> +/* { dg-final { scan-assembler-not {\mfcmp[uo]\M}     } } */
> +/* { dg-final { scan-assembler-not {\mfsel\M}         } } */
> +
> +/* This series of tests tests whether you can do a conditional move where the
> +   test is one floating point type, and the result is another floating point
> +   type.
> +
> +   If the comparison type is SF/DFmode, and the move type is IEEE 128-bit
> +   floating point, we have to duplicate the mask in the lower 64-bits with
> +   XXPERMDI because XSCMPEQDP clears the bottom 64-bits of the mask register.
> +
> +   Going the other way (IEEE 128-bit comparsion, 64-bit move) is fine as the
> +   mask word will be 128-bits.  */
> +
> +float
> +eq_f_d (float a, float b, double x, double y)
> +{
> +  return (x == y) ? a : b;
> +}
> +
> +double
> +eq_d_f (double a, double b, float x, float y)
> +{
> +  return (x == y) ? a : b;
> +}
> +
> +float
> +eq_f_f128 (float a, float b, __float128 x, __float128 y)
> +{
> +  return (x == y) ? a : b;
> +}
> +
> +double
> +eq_d_f128 (double a, double b, __float128 x, __float128 y)
> +{
> +  return (x == y) ? a : b;
> +}
> +
> +__float128
> +eq_f128_f (__float128 a, __float128 b, float x, float y)
> +{
> +  return (x == y) ? a : b;
> +}
> +
> +__float128
> +eq_f128_d (__float128 a, __float128 b, double x, double y)
> +{
> +  return (x != y) ? a : b;
> +}
> +
> +float
> +ne_f_d (float a, float b, double x, double y)
> +{
> +  return (x != y) ? a : b;
> +}
> +
> +double
> +ne_d_f (double a, double b, float x, float y)
> +{
> +  return (x != y) ? a : b;
> +}
> +
> +float
> +ne_f_f128 (float a, float b, __float128 x, __float128 y)
> +{
> +  return (x != y) ? a : b;
> +}
> +
> +double
> +ne_d_f128 (double a, double b, __float128 x, __float128 y)
> +{
> +  return (x != y) ? a : b;
> +}
> +
> +__float128
> +ne_f128_f (__float128 a, __float128 b, float x, float y)
> +{
> +  return (x != y) ? a : b;
> +}
> +
> +__float128
> +ne_f128_d (__float128 a, __float128 b, double x, double y)
> +{
> +  return (x != y) ? a : b;
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
> new file mode 100644
> index 00000000000..6f7627c0f2a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
> @@ -0,0 +1,15 @@
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> +
> +#ifndef TYPE
> +#define TYPE _Float128
> +#endif
> +
> +/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
> +   call.  */
> +TYPE f128_min (TYPE a, TYPE b) { return (a < b) ? a : b; }
> +TYPE f128_max (TYPE a, TYPE b) { return (b > a) ? b : a; }
> +
> +/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
> +/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
> -- 
> 2.22.0
> 
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] Add IEEE 128-bit min/max support on PowerPC
  2021-04-09 16:54   ` will schmidt
@ 2021-04-09 19:20     ` Bernhard Reutner-Fischer
  2021-04-14 19:01       ` Segher Boessenkool
  0 siblings, 1 reply; 22+ messages in thread
From: Bernhard Reutner-Fischer @ 2021-04-09 19:20 UTC (permalink / raw)
  To: Michael Meissner
  Cc: rep.dot.nop, will schmidt via Gcc-patches, will schmidt,
	Segher Boessenkool, David Edelsohn, Bill Schmidt, Peter Bergner

On Fri, 09 Apr 2021 11:54:59 -0500
will schmidt via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:

> On Fri, 2021-04-09 at 10:43 -0400, Michael Meissner wrote:

> > gcc/
> > 2021-04-09 Michael Meissner  <meissner@linux.ibm.com>

> >         (mov<FPMASK:mode><FPMASK2:mode>cc_fpmask): Replace
> >         mov<SFDF:mode><SFDF2:mode>cc_p9.  Add IEEE 128-bit fp support.
> >         (mov<FPMASK:mode><FPMASK2:mode>cc_invert_fpmask): Replace
> >         mov<SFDF:mode><SFDF2:mode>cc_invert_p9.  Add IEEE 128-bit fp
> >         support.
> >         (fpmask<mode>): Add IEEE 128-bit fp support.  Enable generator to
> >         build te RTL.
> >         (xxsel<mode>): Add IEEE 128-bit fp support.  Enable generator to
> >         build te RTL.  
> ok

te? the?

> > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> > index 17b2fdc1cdd..ca4a4d01f05 100644
> > --- a/gcc/config/rs6000/rs6000.md
> > +++ b/gcc/config/rs6000/rs6000.md


> >  ;; Handle inverting the fpmask comparisons.
> > -(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_invert_p9"
> > -  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
> > -	(if_then_else:SFDF
> > +(define_insn_and_split "*mov<FPMASK:mode><FPMASK2:mode>cc_invert_fpmask"
> > +  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=<FPMASK:Fv>")
> > +	(if_then_else:FPMASK
> >  	 (match_operator:CCFP 1 "invert_fpmask_comparison_operator"
> > -		[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
> > -		 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
> > -	 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
> > -	 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
> > -   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
> > +	    [(match_operand:FPMASK2 2 "vsx_register_operand" "<FPMASK2:Fv>")
> > +	     (match_operand:FPMASK2 3 "vsx_register_operand" "<FPMASK2:Fv>")])
> > +	 (match_operand:FPMASK 4 "vsx_register_operand" "<FPMASK:Fv>")
> > +	 (match_operand:FPMASK 5 "vsx_register_operand" "<FPMASK:Fv>")))
> > +   (clobber (match_scratch:V2DI 6 "=&<FPMASK2:Fv>"))]
> >    "TARGET_P9_MINMAX"
> >    "#"
> >    "&& 1"
> > -  [(set (match_dup 6)
> > -	(if_then_else:V2DI (match_dup 9)
> > -			   (match_dup 7)
> > -			   (match_dup 8)))
> > -   (set (match_dup 0)
> > -	(if_then_else:SFDF (ne (match_dup 6)
> > -			       (match_dup 8))
> > -			   (match_dup 5)
> > -			   (match_dup 4)))]
> > +  [(pc)]
> >  {
> > -  rtx op1 = operands[1];
> > -  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (op1));
> > +  rtx dest = operands[0];
> > +  rtx old_cmp = operands[1];
> > +  rtx cmp_op1 = operands[2];
> > +  rtx cmp_op2 = operands[3];
> > +  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (old_cmp));

I think you can drop the enum keyword.
Didn't read the pattern in detail.

> > +  rtx cmp_rev = gen_rtx_fmt_ee (cond, CCFPmode, cmp_op1, cmp_op2);
> > +  rtx move_f = operands[4];
> > +  rtx move_t = operands[5];
> > +  rtx mask_reg = operands[6];
> > +  rtx mask_m1 = CONSTM1_RTX (V2DImode);
> > +  rtx mask_0 = CONST0_RTX (V2DImode);
> > +  machine_mode move_mode = <FPMASK:MODE>mode;
> > +  machine_mode compare_mode = <FPMASK2:MODE>mode;
> > 
> > -  if (GET_CODE (operands[6]) == SCRATCH)
> > -    operands[6] = gen_reg_rtx (V2DImode);
> > +  if (GET_CODE (mask_reg) == SCRATCH)
> > +    mask_reg = gen_reg_rtx (V2DImode);
> > 
> > -  operands[7] = CONSTM1_RTX (V2DImode);
> > -  operands[8] = CONST0_RTX (V2DImode);
> > +  /* Emit the compare and set mask instruction.  */
> > +  emit_insn (gen_fpmask<FPMASK2:mode> (mask_reg, cmp_rev, cmp_op1, cmp_op2,
> > +				       mask_m1, mask_0));
> > 
> > -  operands[9] = gen_rtx_fmt_ee (cond, CCFPmode, operands[2], operands[3]);
> > +  /* If we have a 64-bit comparison, but an 128-bit move, we need to extend the
> > +     mask.  Because we are using the splat builtin to extend the V2DImode, we
> > +     need to use element 1 on little endian systems.  */
> > +  if (!FLOAT128_IEEE_P (compare_mode) && FLOAT128_IEEE_P (move_mode))
> > +    {
> > +      rtx element = WORDS_BIG_ENDIAN ? const0_rtx : const1_rtx;
> > +      emit_insn (gen_vsx_xxspltd_v2di (mask_reg, mask_reg, element));
> > +    }
> > +
> > +  /* Now emit the XXSEL insn.  */
> > +  emit_insn (gen_xxsel<FPMASK:mode> (dest, mask_reg, mask_0, move_t, move_f));
> > +  DONE;
> >  }
> > - [(set_attr "length" "8")
> > + ;; length is 12 in case we need to add XXPERMDI
> > + [(set_attr "length" "12")
> >    (set_attr "type" "vecperm")])
> > 
> > -(define_insn "*fpmask<mode>"
> > -  [(set (match_operand:V2DI 0 "vsx_register_operand" "=wa")
> > +(define_insn "fpmask<mode>"
> > +  [(set (match_operand:V2DI 0 "vsx_register_operand" "=<Fv>")
> >  	(if_then_else:V2DI
> >  	 (match_operator:CCFP 1 "fpmask_comparison_operator"
> > -		[(match_operand:SFDF 2 "vsx_register_operand" "<Fv>")
> > -		 (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")])
> > +		[(match_operand:FPMASK 2 "vsx_register_operand" "<Fv>")
> > +		 (match_operand:FPMASK 3 "vsx_register_operand" "<Fv>")])
> >  	 (match_operand:V2DI 4 "all_ones_constant" "")
> >  	 (match_operand:V2DI 5 "zero_constant" "")))]
> >    "TARGET_P9_MINMAX"
> > -  "xscmp%V1dp %x0,%x2,%x3"
> > +{
> > +  return (FLOAT128_IEEE_P (<MODE>mode)
> > +	  ? "xscmp%V1qp %0,%2,%3"
> > +	  : "xscmp%V1dp %x0,%x2,%x3");
> > +}
> >    [(set_attr "type" "fpcompare")])
> > 
> > -(define_insn "*xxsel<mode>"
> > -  [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>")
> > -	(if_then_else:SFDF (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
> > -			       (match_operand:V2DI 2 "zero_constant" ""))
> > -			   (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")
> > -			   (match_operand:SFDF 4 "vsx_register_operand" "<Fv>")))]
> > +(define_insn "xxsel<mode>"
> > +  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=wa")
> > +	(if_then_else:FPMASK
> > +	 (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
> > +	     (match_operand:V2DI 2 "zero_constant" ""))
> > +	 (match_operand:FPMASK 3 "vsx_register_operand" "wa")
> > +	 (match_operand:FPMASK 4 "vsx_register_operand" "wa")))]
> >    "TARGET_P9_MINMAX"
> >    "xxsel %x0,%x4,%x3,%x1"
> >    [(set_attr "type" "vecmove")])
> > diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmove.c b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
> > new file mode 100644
> > index 00000000000..639d5a77087
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
> > @@ -0,0 +1,93 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target ppc_float128_hw } */
> > +/* { dg-require-effective-target power10_ok } */
> > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> > +/* { dg-final { scan-assembler     {\mxscmpeq[dq]p\M} } } */
> > +/* { dg-final { scan-assembler     {\mxxpermdi\M}     } } */
> > +/* { dg-final { scan-assembler     {\mxxsel\M}        } } */
> > +/* { dg-final { scan-assembler-not {\mxscmpu[dq]p\M}  } } */
> > +/* { dg-final { scan-assembler-not {\mfcmp[uo]\M}     } } */
> > +/* { dg-final { scan-assembler-not {\mfsel\M}         } } */

I'd have expected scan-assembler-times fwiw.

> > +
> > +/* This series of tests tests whether you can do a conditional move where the
> > +   test is one floating point type, and the result is another floating point
> > +   type.
> > +
> > +   If the comparison type is SF/DFmode, and the move type is IEEE 128-bit
> > +   floating point, we have to duplicate the mask in the lower 64-bits with
> > +   XXPERMDI because XSCMPEQDP clears the bottom 64-bits of the mask register.
> > +
> > +   Going the other way (IEEE 128-bit comparsion, 64-bit move) is fine as the
> > +   mask word will be 128-bits.  */
> > +
> > +float
> > +eq_f_d (float a, float b, double x, double y)
> > +{
> > +  return (x == y) ? a : b;
> > +}
> > +
> > +double
> > +eq_d_f (double a, double b, float x, float y)
> > +{
> > +  return (x == y) ? a : b;
> > +}
> > +
> > +float
> > +eq_f_f128 (float a, float b, __float128 x, __float128 y)
> > +{
> > +  return (x == y) ? a : b;
> > +}
> > +
> > +double
> > +eq_d_f128 (double a, double b, __float128 x, __float128 y)
> > +{
> > +  return (x == y) ? a : b;
> > +}
> > +
> > +__float128
> > +eq_f128_f (__float128 a, __float128 b, float x, float y)
> > +{
> > +  return (x == y) ? a : b;
> > +}
> > +
> > +__float128
> > +eq_f128_d (__float128 a, __float128 b, double x, double y)
> > +{
> > +  return (x != y) ? a : b;
> > +}

I would think the above should be == since it's named eq_ and
the body would be redundant to ne_f128_d below as is.

thanks,

> > +
> > +float
> > +ne_f_d (float a, float b, double x, double y)
> > +{
> > +  return (x != y) ? a : b;
> > +}
> > +
> > +double
> > +ne_d_f (double a, double b, float x, float y)
> > +{
> > +  return (x != y) ? a : b;
> > +}
> > +
> > +float
> > +ne_f_f128 (float a, float b, __float128 x, __float128 y)
> > +{
> > +  return (x != y) ? a : b;
> > +}
> > +
> > +double
> > +ne_d_f128 (double a, double b, __float128 x, __float128 y)
> > +{
> > +  return (x != y) ? a : b;
> > +}
> > +
> > +__float128
> > +ne_f128_f (__float128 a, __float128 b, float x, float y)
> > +{
> > +  return (x != y) ? a : b;
> > +}
> > +
> > +__float128
> > +ne_f128_d (__float128 a, __float128 b, double x, double y)
> > +{
> > +  return (x != y) ? a : b;
> > +}

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC
  2021-04-09 16:54   ` will schmidt
@ 2021-04-13 17:57     ` Segher Boessenkool
  0 siblings, 0 replies; 22+ messages in thread
From: Segher Boessenkool @ 2021-04-13 17:57 UTC (permalink / raw)
  To: will schmidt
  Cc: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner

Hi!

On Fri, Apr 09, 2021 at 11:54:57AM -0500, will schmidt wrote:
> On Fri, 2021-04-09 at 10:42 -0400, Michael Meissner wrote:
> > 	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> > 	3.1 IEEE 128-bit floating point xsmaxcqp and xsmincqp instructions.
> 
> I don't see any direct reference to xsmaxcqp or xsmincqp with respect
> to this change below. 
> 
> It looks like this change adds the FLOAT128_MIN_MAX_FPMASK_P (mode)
> check
> as criteria for emitting some form of a SET instruction. 
>        emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
> 
> Ok, I see it now,  the instructions are mildly obfuscated by
> "xs<minmax>cqp" as part of the rs6000.md change below.

Yeah, that is the downside of using iterators and the like in the
machine description.  But the upsides of that far outweigh these
downsides :-)

> > 	* config/rs6000/rs60000.h (FLOAT128_MIN_MAX_FPMASK_P): New macro.
> 
> which is
> #define FLOAT128_MIN_MAX_FPMASK_P(MODE)	\
>   (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (MODE))
> 
> Are there any non MIN_MAX scenarios that will require the combination
> of POWER10,FLOAT128_HW,FLOAT128_IEEE(mode)?  I'd wonder if there is a name
> not specific to *_MIN_MAX_* that would be a better naming choice.
> But, naming is hard. :-)

Yes, and for every new macro, the reader will have to understand what it
does, what it is for.  If you cannot come up with a good name for it
(one for which it is immediately obvious to the reader what it *means*),
often a good tradeoff is to not make a macro at all, just write out the
few words where you need them.

> > 	* config/rs6000/rs6000.md (s<minmax><mode>3): Add support for the
> > 	ISA 3.1 IEEE 128-bit minimum and maximum instructions.
> 
> I'd move the "xsmaxcqp,xsmincqp" instruction references from the rs6000.c changelog blurb to this changelog blurb.

It should say "New." or "New define_insns." or "New instructions." or
similar.  The changelog says *what*, not *why*.  And it is important
that you can find stuff in it using grep or similar.

Here it should say "s<minmax><mode>3 for IEEE128".  We actually have
some patterns that just say "<code><mode>3", not too useful in a
changelog if you do not say what code and mode are!  (In this case, it
does not help to say what "minmax" is from, it stand for just "min" and
"max" after all :-) )

Segher

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC
  2021-04-09 14:42 ` [PATCH 1/2] " Michael Meissner
  2021-04-09 16:54   ` will schmidt
@ 2021-04-13 22:19   ` Segher Boessenkool
  2021-04-14 19:09     ` Michael Meissner
  1 sibling, 1 reply; 22+ messages in thread
From: Segher Boessenkool @ 2021-04-13 22:19 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner, Will Schmidt

Hi!

On Fri, Apr 09, 2021 at 10:42:50AM -0400, Michael Meissner wrote:
> Since then the patch seems to have gone into a limbo state.

Patches I cannot immediately handle take time, and if they aren't
pinged, they can fall off the map.  So a) ping your patches, once a week
for example; and b) write patches that are simpler to review (do not
cost many hours each).

> gcc/
> 2021-04-09  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> 	3.1 IEEE 128-bit floating point xsmaxcqp and xsmincqp instructions.
> 	* config/rs6000/rs60000.h (FLOAT128_MIN_MAX_FPMASK_P): New macro.

As said in the other mail, don't do the macro; just write its expansion
in the single place it is used.

> 	* config/rs6000/rs6000.md (s<minmax><mode>3): Add support for the
> 	ISA 3.1 IEEE 128-bit minimum and maximum instructions.

And rephrase this, just "New" something.

So please resend, taking into account all comments.


Segher

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] Add IEEE 128-bit min/max support on PowerPC
  2021-04-09 19:20     ` Bernhard Reutner-Fischer
@ 2021-04-14 19:01       ` Segher Boessenkool
  2021-04-14 20:19         ` Bernhard Reutner-Fischer
  0 siblings, 1 reply; 22+ messages in thread
From: Segher Boessenkool @ 2021-04-14 19:01 UTC (permalink / raw)
  To: Bernhard Reutner-Fischer
  Cc: Michael Meissner, will schmidt via Gcc-patches, will schmidt,
	David Edelsohn, Bill Schmidt, Peter Bergner

On Fri, Apr 09, 2021 at 09:20:32PM +0200, Bernhard Reutner-Fischer wrote:
> On Fri, 09 Apr 2021 11:54:59 -0500
> will schmidt via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> > > +  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (old_cmp));
> 
> I think you can drop the enum keyword.

You can in C++, but not in C.  It is fine to have it in C++ as well.
I think it is nicer in this case to lose the keyword, but it is hardly
harmful :-)

> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
> > > @@ -0,0 +1,93 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-require-effective-target ppc_float128_hw } */
> > > +/* { dg-require-effective-target power10_ok } */
> > > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> > > +/* { dg-final { scan-assembler     {\mxscmpeq[dq]p\M} } } */
> > > +/* { dg-final { scan-assembler     {\mxxpermdi\M}     } } */
> > > +/* { dg-final { scan-assembler     {\mxxsel\M}        } } */
> > > +/* { dg-final { scan-assembler-not {\mxscmpu[dq]p\M}  } } */
> > > +/* { dg-final { scan-assembler-not {\mfcmp[uo]\M}     } } */
> > > +/* { dg-final { scan-assembler-not {\mfsel\M}         } } */
> 
> I'd have expected scan-assembler-times fwiw.

For what?  scan-assembler-not *is* scan-assembler-times, in effect (but
simpler of course, and it does work with capturing parens).

Having too strict checks for generated code means no end to having to
update many testcases when we have very small changes in the compiler.
It's a balancing act.  But maybe some -times would be good here, dunno.

> > > +__float128
> > > +eq_f128_d (__float128 a, __float128 b, double x, double y)
> > > +{
> > > +  return (x != y) ? a : b;
> > > +}
> 
> I would think the above should be == since it's named eq_ and
> the body would be redundant to ne_f128_d below as is.

Good spot :-)


Segher

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC
  2021-04-13 22:19   ` Segher Boessenkool
@ 2021-04-14 19:09     ` Michael Meissner
  2021-04-14 19:15       ` Segher Boessenkool
  0 siblings, 1 reply; 22+ messages in thread
From: Michael Meissner @ 2021-04-14 19:09 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner, Will Schmidt

On Tue, Apr 13, 2021 at 05:19:12PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Apr 09, 2021 at 10:42:50AM -0400, Michael Meissner wrote:
> > Since then the patch seems to have gone into a limbo state.
> 
> Patches I cannot immediately handle take time, and if they aren't
> pinged, they can fall off the map.  So a) ping your patches, once a week
> for example; and b) write patches that are simpler to review (do not
> cost many hours each).
> 
> > gcc/
> > 2021-04-09  Michael Meissner  <meissner@linux.ibm.com>
> > 
> > 	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> > 	3.1 IEEE 128-bit floating point xsmaxcqp and xsmincqp instructions.
> > 	* config/rs6000/rs60000.h (FLOAT128_MIN_MAX_FPMASK_P): New macro.
> 
> As said in the other mail, don't do the macro; just write its expansion
> in the single place it is used.

Note, in the first patch it is only used 1 time, but in the second patch it is
used 5 times (4 times in mode iterators in rs6000.md, 1 other use in rs6000.c).
But I will eliminate it, and replicate it in each of the 6 places it is used.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC
  2021-04-14 19:09     ` Michael Meissner
@ 2021-04-14 19:15       ` Segher Boessenkool
  2021-04-14 20:51         ` Michael Meissner
  0 siblings, 1 reply; 22+ messages in thread
From: Segher Boessenkool @ 2021-04-14 19:15 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner, Will Schmidt

On Wed, Apr 14, 2021 at 03:09:13PM -0400, Michael Meissner wrote:
> On Tue, Apr 13, 2021 at 05:19:12PM -0500, Segher Boessenkool wrote:
> > > 	* config/rs6000/rs60000.h (FLOAT128_MIN_MAX_FPMASK_P): New macro.
> > 
> > As said in the other mail, don't do the macro; just write its expansion
> > in the single place it is used.
> 
> Note, in the first patch it is only used 1 time, but in the second patch it is
> used 5 times (4 times in mode iterators in rs6000.md, 1 other use in rs6000.c).
> But I will eliminate it, and replicate it in each of the 6 places it is used.

The alternative is to come up with a much better name :-/


Segher

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] Add IEEE 128-bit min/max support on PowerPC
  2021-04-09 14:43 ` [PATCH 2/2] " Michael Meissner
  2021-04-09 16:54   ` will schmidt
@ 2021-04-14 19:38   ` Segher Boessenkool
  2021-04-14 20:48     ` Michael Meissner
  1 sibling, 1 reply; 22+ messages in thread
From: Segher Boessenkool @ 2021-04-14 19:38 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner, Will Schmidt

On Fri, Apr 09, 2021 at 10:43:58AM -0400, Michael Meissner wrote:
>         (Fv mode attribute): Add KFmode and TFmode.
>         (mov<FPMASK:mode><FPMASK2:mode>cc_fpmask): Replace
>         mov<SFDF:mode><SFDF2:mode>cc_p9.  Add IEEE 128-bit fp support.
>         (mov<FPMASK:mode><FPMASK2:mode>cc_invert_fpmask): Replace
>         mov<SFDF:mode><SFDF2:mode>cc_invert_p9.  Add IEEE 128-bit fp
>         support.
>         (fpmask<mode>): Add IEEE 128-bit fp support.  Enable generator to
>         build te RTL.
>         (xxsel<mode>): Add IEEE 128-bit fp support.  Enable generator to
>         build te RTL.

> @@ -608,8 +621,13 @@ (define_mode_attr Ff		[(SF "f") (DF "d") (DI "d")])
>  ; SF/DF constraint for arithmetic on VSX registers using instructions added in
>  ; ISA 2.06 (power7).  This includes instructions that normally target DF mode,
>  ; but are used on SFmode, since internally SFmode values are kept in the DFmode
> -; format.
> -(define_mode_attr Fv		[(SF "wa") (DF "wa") (DI "wa")])
> +; format.  Also include IEEE 128-bit instructions which are restricted to the
> +; Altivec registers.
> +(define_mode_attr Fv		[(SF "wa")
> +				 (DF "wa")
> +				 (DI "wa")
> +				 (KF "v")
> +				 (TF "v")])

Eww.  Please just split the patterns.  Fv should just go away, it is
always "wa" currently.  Removing that cascades to more cleanups, which
is why I haven't done it yet, it takes time.

Almost all places that use Fv have no use at all for KF.

>  (define_expand "mov<mode>cc"
> -   [(set (match_operand:SFDF 0 "gpc_reg_operand")
> -	 (if_then_else:SFDF (match_operand 1 "comparison_operator")
> -			    (match_operand:SFDF 2 "gpc_reg_operand")
> -			    (match_operand:SFDF 3 "gpc_reg_operand")))]
> +   [(set (match_operand:FPMASK 0 "gpc_reg_operand")
> +	 (if_then_else:FPMASK (match_operand 1 "comparison_operator")
> +			      (match_operand:FPMASK 2 "gpc_reg_operand")
> +			      (match_operand:FPMASK 3 "gpc_reg_operand")))]

So you really want SFDFQF or such?  That is much more generic than
"FPMASK", which doesn't explain what it means at all, either.

But, you can keep the patterns separate as well.

> - [(set_attr "length" "8")
> + ;; length is 12 in case we need to add XXPERMDI
> + [(set_attr "length" "12")

Which is only for QP.  So really, just keep the patterns split.

> +  return (FLOAT128_IEEE_P (<MODE>mode)
> +	  ? "xscmp%V1qp %0,%2,%3"
> +	  : "xscmp%V1dp %x0,%x2,%x3");

Different output as well.

> -(define_insn "*xxsel<mode>"
> -  [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>")
> -	(if_then_else:SFDF (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
> -			       (match_operand:V2DI 2 "zero_constant" ""))
> -			   (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")
> -			   (match_operand:SFDF 4 "vsx_register_operand" "<Fv>")))]
> +(define_insn "xxsel<mode>"
> +  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=wa")
> +	(if_then_else:FPMASK
> +	 (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
> +	     (match_operand:V2DI 2 "zero_constant" ""))
> +	 (match_operand:FPMASK 3 "vsx_register_operand" "wa")
> +	 (match_operand:FPMASK 4 "vsx_register_operand" "wa")))]
>    "TARGET_P9_MINMAX"
>    "xxsel %x0,%x4,%x3,%x1"
>    [(set_attr "type" "vecmove")])

Please keep that a "*"; it should be generated via "mov<mode>cc".

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
> @@ -0,0 +1,93 @@

> +/* { dg-final { scan-assembler     {\mxxsel\M}        } } */

> +/* { dg-final { scan-assembler-not {\mfsel\M}         } } */

It is somewhat problematic to require xxsel and disallow fsel (for one
thing, the compiler could always generated xxsel instead of any fsel).
But it will probably keep working fine, the routines here are very
short.

> +__float128
> +eq_f128_d (__float128 a, __float128 b, double x, double y)
> +{
> +  return (x != y) ? a : b;
> +}

So "==" here.


Segher

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] Add IEEE 128-bit min/max support on PowerPC
  2021-04-14 19:01       ` Segher Boessenkool
@ 2021-04-14 20:19         ` Bernhard Reutner-Fischer
  0 siblings, 0 replies; 22+ messages in thread
From: Bernhard Reutner-Fischer @ 2021-04-14 20:19 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Meissner, will schmidt via Gcc-patches, will schmidt,
	David Edelsohn, Bill Schmidt, Peter Bergner

On 14 April 2021 21:01:15 CEST, Segher Boessenkool <segher@kernel.crashing.org> wrote:

>> > > --- /dev/null
>> > > +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
>> > > @@ -0,0 +1,93 @@
>> > > +/* { dg-do compile } */
>> > > +/* { dg-require-effective-target ppc_float128_hw } */
>> > > +/* { dg-require-effective-target power10_ok } */
>> > > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
>> > > +/* { dg-final { scan-assembler     {\mxscmpeq[dq]p\M} } } */
>> > > +/* { dg-final { scan-assembler     {\mxxpermdi\M}     } } */
>> > > +/* { dg-final { scan-assembler     {\mxxsel\M}        } } */
>> > > +/* { dg-final { scan-assembler-not {\mxscmpu[dq]p\M}  } } */
>> > > +/* { dg-final { scan-assembler-not {\mfcmp[uo]\M}     } } */
>> > > +/* { dg-final { scan-assembler-not {\mfsel\M}         } } */
>> 
>> I'd have expected scan-assembler-times fwiw.
>
>For what?  scan-assembler-not *is* scan-assembler-times, in effect (but
>simpler of course, and it does work with capturing parens).

I meant -times for the occurrences of scan-assembler, not the -not, in case that wasn't clear.

>Having too strict checks for generated code means no end to having to
>update many testcases when we have very small changes in the compiler.
>It's a balancing act.  But maybe some -times would be good here, dunno.
>
>> > > +__float128
>> > > +eq_f128_d (__float128 a, __float128 b, double x, double y)
>> > > +{
>> > > +  return (x != y) ? a : b;
>> > > +}
>> 
>> I would think the above should be == since it's named eq_ and
>> the body would be redundant to ne_f128_d below as is.
>
>Good spot :-)

Well -times would maybe have caught exactly this I suppose.

I know the exact count can be cumbersome to maintain, but in this very specific case which checks exactly the desired instruction it may be appropriate.

Just saying, prompted by the typo..
thanks,

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] Add IEEE 128-bit min/max support on PowerPC
  2021-04-14 19:38   ` Segher Boessenkool
@ 2021-04-14 20:48     ` Michael Meissner
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Meissner @ 2021-04-14 20:48 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner, Will Schmidt

On Wed, Apr 14, 2021 at 02:38:47PM -0500, Segher Boessenkool wrote:
> On Fri, Apr 09, 2021 at 10:43:58AM -0400, Michael Meissner wrote:
> >         (Fv mode attribute): Add KFmode and TFmode.
> >         (mov<FPMASK:mode><FPMASK2:mode>cc_fpmask): Replace
> >         mov<SFDF:mode><SFDF2:mode>cc_p9.  Add IEEE 128-bit fp support.
> >         (mov<FPMASK:mode><FPMASK2:mode>cc_invert_fpmask): Replace
> >         mov<SFDF:mode><SFDF2:mode>cc_invert_p9.  Add IEEE 128-bit fp
> >         support.
> >         (fpmask<mode>): Add IEEE 128-bit fp support.  Enable generator to
> >         build te RTL.
> >         (xxsel<mode>): Add IEEE 128-bit fp support.  Enable generator to
> >         build te RTL.
> 
> > @@ -608,8 +621,13 @@ (define_mode_attr Ff		[(SF "f") (DF "d") (DI "d")])
> >  ; SF/DF constraint for arithmetic on VSX registers using instructions added in
> >  ; ISA 2.06 (power7).  This includes instructions that normally target DF mode,
> >  ; but are used on SFmode, since internally SFmode values are kept in the DFmode
> > -; format.
> > -(define_mode_attr Fv		[(SF "wa") (DF "wa") (DI "wa")])
> > +; format.  Also include IEEE 128-bit instructions which are restricted to the
> > +; Altivec registers.
> > +(define_mode_attr Fv		[(SF "wa")
> > +				 (DF "wa")
> > +				 (DI "wa")
> > +				 (KF "v")
> > +				 (TF "v")])
> 
> Eww.  Please just split the patterns.  Fv should just go away, it is
> always "wa" currently.  Removing that cascades to more cleanups, which
> is why I haven't done it yet, it takes time.

The problem is you have a combinatorial explosion.  Right now, there are two
patterns, one for the normal move, and one for the inverted move.  Without
doing a cascaded combination, you would need some 32 patterns to cover all of
the possibilities.

Or you give up on having a conditional move that compares one type and moves a
second:

	_Float128 a, b;
	double c, d, r;

	r = (a == b) ? c : d;

As I recall when I put the original logic in, there were a few places that did
this mixed comparison between SF/DF modes was used in real code.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC
  2021-04-14 19:15       ` Segher Boessenkool
@ 2021-04-14 20:51         ` Michael Meissner
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Meissner @ 2021-04-14 20:51 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner, Will Schmidt

On Wed, Apr 14, 2021 at 02:15:47PM -0500, Segher Boessenkool wrote:
> On Wed, Apr 14, 2021 at 03:09:13PM -0400, Michael Meissner wrote:
> > On Tue, Apr 13, 2021 at 05:19:12PM -0500, Segher Boessenkool wrote:
> > > > 	* config/rs6000/rs60000.h (FLOAT128_MIN_MAX_FPMASK_P): New macro.
> > > 
> > > As said in the other mail, don't do the macro; just write its expansion
> > > in the single place it is used.
> > 
> > Note, in the first patch it is only used 1 time, but in the second patch it is
> > used 5 times (4 times in mode iterators in rs6000.md, 1 other use in rs6000.c).
> > But I will eliminate it, and replicate it in each of the 6 places it is used.
> 
> The alternative is to come up with a much better name :-/

I dunno, given the what the macro is used for (i.e. whether we have the IEEE
128-bit minimum, maximum, and floating point compare mask)
FLOAT128_MIN_MAX_FPMASK_P meets the definition.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-06-07 20:25   ` Segher Boessenkool
@ 2021-06-08 23:52     ` Michael Meissner
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Meissner @ 2021-06-08 23:52 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner, Will Schmidt

On Mon, Jun 07, 2021 at 03:25:06PM -0500, Segher Boessenkool wrote:
> On Tue, May 18, 2021 at 04:26:06PM -0400, Michael Meissner wrote:
> > This patch adds the support for the IEEE 128-bit floating point C minimum and
> > maximum instructions.
> 
> > gcc/
> > 2021-05-18  Michael Meissner  <meissner@linux.ibm.com>
> > 
> > 	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> > 	3.1   IEEE   128-bit   floating  point   xsmaxcqp   and   xsmincqp
> > 	instructions.
> 
> 3.1 fits on the previous line (it is better to not split numbers to a
> new line).  What is up with the weird multiple spaces?  We don't align
> the right border in changelogs :-)
> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> > @@ -0,0 +1,15 @@
> > +/* { dg-require-effective-target ppc_float128_hw } */
> > +/* { dg-require-effective-target power10_ok } */
> 
> Is this needed?  And, why is ppc_float128_hw needed?  That combination
> does not seem to make sense.

Basically it is there to make sure that we are actually generating IEEE 128-bit
instructions.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-05-18 20:26 ` [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC Michael Meissner
  2021-05-20 19:25   ` will schmidt
@ 2021-06-07 20:25   ` Segher Boessenkool
  2021-06-08 23:52     ` Michael Meissner
  1 sibling, 1 reply; 22+ messages in thread
From: Segher Boessenkool @ 2021-06-07 20:25 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner, Will Schmidt

On Tue, May 18, 2021 at 04:26:06PM -0400, Michael Meissner wrote:
> This patch adds the support for the IEEE 128-bit floating point C minimum and
> maximum instructions.

> gcc/
> 2021-05-18  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> 	3.1   IEEE   128-bit   floating  point   xsmaxcqp   and   xsmincqp
> 	instructions.

3.1 fits on the previous line (it is better to not split numbers to a
new line).  What is up with the weird multiple spaces?  We don't align
the right border in changelogs :-)

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> @@ -0,0 +1,15 @@
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */

Is this needed?  And, why is ppc_float128_hw needed?  That combination
does not seem to make sense.

> --- a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> @@ -3,6 +3,13 @@
>  /* { dg-require-effective-target float128 } */
>  /* { dg-options "-mpower9-vector -O2 -ffast-math" } */
>  
> +/* If the compiler was configured to automatically generate power10 support with
> +   --with-cpu=power10, turn it off.  Otherwise, it will generate XXMAXCQP and
> +   XXMINCQP instructions.  */
> +#ifdef _ARCH_PWR10
> +#pragma GCC target ("cpu=power9")
> +#endif

Yeah, don't.  Add a dg-skip-if if that is what you want.  That
-mpower9-vector shouldn't be there either.


Segher

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-05-21  1:38     ` Michael Meissner
@ 2021-06-07 20:08       ` Segher Boessenkool
  0 siblings, 0 replies; 22+ messages in thread
From: Segher Boessenkool @ 2021-06-07 20:08 UTC (permalink / raw)
  To: Michael Meissner, will schmidt, gcc-patches, David Edelsohn,
	Bill Schmidt, Peter Bergner

Hi!

On Thu, May 20, 2021 at 09:38:49PM -0400, Michael Meissner wrote:
> Basically for code generation tests, I see the following cases:
> 
> 1) Test code targetting precisley power8 (or power9, power10), etc.  Hopefully
> these are rare.

-mdejagnu-cpu= works perfectly for this.  You may need a *_ok or a *_hw
as well (and/or other selectors).

> 2) Test code targetting at least power8.  But as these tests show, that a lot
> of the code won't generate the appropriate instructions on power10.  This is
> what we have now.  It relies on undocumented switches like -mpower9-vector to
> add the necessary support.

You should simply not run this test on too new systems.  You can use
dg-skip-if or similar.

> 3) Test code targetting at least power8 but go to power9 at the maximum.

But why?  We will keep testing all interesting CPU / OS combos as long
as they are interesting.


Segher

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-05-20 19:25   ` will schmidt
@ 2021-05-21  1:38     ` Michael Meissner
  2021-06-07 20:08       ` Segher Boessenkool
  0 siblings, 1 reply; 22+ messages in thread
From: Michael Meissner @ 2021-05-21  1:38 UTC (permalink / raw)
  To: will schmidt
  Cc: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

On Thu, May 20, 2021 at 02:25:58PM -0500, will schmidt wrote:
> I'd throw the ternary term in there, easier to search for later. 
> s/?: operations/ternary (?:) operations /

Thanks.

> So, presumably the float128-minmax-2.c test adds/replaces the power10
> code gen tests that were removed or disabled from float128-minmax.c. 

Yes.

> Probably fine..  It's good to exercise the pragma target stuff, thoguh
> I wonder if it would be better to just specify -mcpu=power9 in the
> options since we are already specifying (redundant?) -mpower9-vector. 
> 
> I see similar changes in a later patch, probably OK there since those
> tests do not appear to be specifying -mcpu=foo options that are already
> pointed at power9 features...

I think we really want a better solution than #pragma, since some systems (AIX
if memory serves) might not support #pragma to change code generation models,
because they don't have the assembler/linker support for it.

Basically for code generation tests, I see the following cases:

1) Test code targetting precisley power8 (or power9, power10), etc.  Hopefully
these are rare.

2) Test code targetting at least power8.  But as these tests show, that a lot
of the code won't generate the appropriate instructions on power10.  This is
what we have now.  It relies on undocumented switches like -mpower9-vector to
add the necessary support.

3) Test code targetting at least power8 but go to power9 at the maximum.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-05-18 20:26 ` [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC Michael Meissner
@ 2021-05-20 19:25   ` will schmidt
  2021-05-21  1:38     ` Michael Meissner
  2021-06-07 20:25   ` Segher Boessenkool
  1 sibling, 1 reply; 22+ messages in thread
From: will schmidt @ 2021-05-20 19:25 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

On Tue, 2021-05-18 at 16:26 -0400, Michael Meissner wrote:
> [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
> 

Hi,


> This patch adds the support for the IEEE 128-bit floating point C minimum and
> maximum instructions.  The next patch will add the support for using the
> compare and set mask instruction to implement conditional moves.
> 
> This patch does not try to re-use the code used for SF/DF min/max
> support.  It defines a separate insn for the IEEE 128-bit support.  It
> uses the code iterator <minmax> to simplify adding both operations.
> 
> GCC will not convert ?: operations into using min/max instructions provided in

I'd throw the ternary term in there, easier to search for later. 
s/?: operations/ternary (?:) operations /

> this patch unless the user uses -Ofast or similar switches due to issues with
> NaNs.  The next patch that adds conditional move instructions will enable the
> ?: conversion in many cases.
> 
> I have done bootstrap builds with this patch on the following 3 systems:
>     1)	power9 running LE Linux using --with-cpu=power9
>     2)	power8 running BE Linux using --with-cpu=power8, testing both
> 	32/64-bit.
>     3)	power10 prototype running LE Linux using --with-cpu=power10.
> 
> There were no regressions to the tests, and the new test added passed.  Can I
> check these patches into trunk branch for GCC 12?
> 
> I would like to check these patches into GCC 11 after a cooling off period, but
> I can also not do the backport if desired.
> 
> gcc/
> 2021-05-18  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> 	3.1   IEEE   128-bit   floating  point   xsmaxcqp   and   xsmincqp
> 	instructions.
> 	* config/rs6000/rs6000.md (s<minmax><mode>3, IEEE128 iterator):
> 	New insns.

ok

> 
> gcc/testsuite/
> 2021-05-18  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* gcc.target/powerpc/float128-minmax-2.c: New test.
> 	* gcc.target/powerpc/float128-minmax.c: Turn off power10 code
> 	generation.

So, presumably the float128-minmax-2.c test adds/replaces the power10
code gen tests that were removed or disabled from float128-minmax.c. 



> ---
>  gcc/config/rs6000/rs6000.c                        |  3 ++-
>  gcc/config/rs6000/rs6000.md                       | 11 +++++++++++
>  .../gcc.target/powerpc/float128-minmax-2.c        | 15 +++++++++++++++
>  .../gcc.target/powerpc/float128-minmax.c          |  7 +++++++
>  4 files changed, 35 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 0d0595dddd6..fdaf12aeda0 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -16111,7 +16111,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
>    /* VSX/altivec have direct min/max insns.  */
>    if ((code == SMAX || code == SMIN)
>        && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
> -	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
> +	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
> +	  || (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))))
>      {
>        emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
>        return;
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 0bfeb24d9e8..3a1bc1f8547 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -5196,6 +5196,17 @@ (define_insn "*s<minmax><mode>3_vsx"
>  }
>    [(set_attr "type" "fp")])
> 
> +;; Min/max for ISA 3.1 IEEE 128-bit floating point
> +(define_insn "s<minmax><mode>3"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> +	(fp_minmax:IEEE128
> +	 (match_operand:IEEE128 1 "altivec_register_operand" "v")
> +	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
> +  "TARGET_POWER10"
> +  "xs<minmax>cqp %0,%1,%2"
> +  [(set_attr "type" "vecfloat")
> +   (set_attr "size" "128")])
> +
>  ;; The conditional move instructions allow us to perform max and min operations
>  ;; even when we don't have the appropriate max/min instruction using the FSEL
>  ;; instruction.

ok


> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> new file mode 100644
> index 00000000000..c71ba08c9f8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> @@ -0,0 +1,15 @@
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
> +
> +#ifndef TYPE
> +#define TYPE _Float128
> +#endif
> +
> +/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
> +   call.  */
> +TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
> +TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
> +
> +/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
> +/* { dg-final { scan-assembler {\mxsmincqp\M} } } */

ok

> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> index fe397518f2f..c3af759c0b9 100644
> --- a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> @@ -3,6 +3,13 @@
>  /* { dg-require-effective-target float128 } */
>  /* { dg-options "-mpower9-vector -O2 -ffast-math" } */
> 
> +/* If the compiler was configured to automatically generate power10 support with
> +   --with-cpu=power10, turn it off.  Otherwise, it will generate XXMAXCQP and
> +   XXMINCQP instructions.  */
> +#ifdef _ARCH_PWR10
> +#pragma GCC target ("cpu=power9")
> +#endif

Probably fine..  It's good to exercise the pragma target stuff, thoguh
I wonder if it would be better to just specify -mcpu=power9 in the
options since we are already specifying (redundant?) -mpower9-vector. 

I see similar changes in a later patch, probably OK there since those
tests do not appear to be specifying -mcpu=foo options that are already
pointed at power9 features...


> +
>  #ifndef TYPE
>  #define TYPE _Float128
>  #endif
> -- 
> 2.31.1

lgtm, 
thanks
-Will

> 
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-05-18 20:22 [PATCH 0/2] Add power10 IEEE 128-bit min/max/conditional move support Michael Meissner
@ 2021-05-18 20:26 ` Michael Meissner
  2021-05-20 19:25   ` will schmidt
  2021-06-07 20:25   ` Segher Boessenkool
  0 siblings, 2 replies; 22+ messages in thread
From: Michael Meissner @ 2021-05-18 20:26 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt

[PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.

This patch adds the support for the IEEE 128-bit floating point C minimum and
maximum instructions.  The next patch will add the support for using the
compare and set mask instruction to implement conditional moves.

This patch does not try to re-use the code used for SF/DF min/max
support.  It defines a separate insn for the IEEE 128-bit support.  It
uses the code iterator <minmax> to simplify adding both operations.

GCC will not convert ?: operations into using min/max instructions provided in
this patch unless the user uses -Ofast or similar switches due to issues with
NaNs.  The next patch that adds conditional move instructions will enable the
?: conversion in many cases.

I have done bootstrap builds with this patch on the following 3 systems:
    1)	power9 running LE Linux using --with-cpu=power9
    2)	power8 running BE Linux using --with-cpu=power8, testing both
	32/64-bit.
    3)	power10 prototype running LE Linux using --with-cpu=power10.

There were no regressions to the tests, and the new test added passed.  Can I
check these patches into trunk branch for GCC 12?

I would like to check these patches into GCC 11 after a cooling off period, but
I can also not do the backport if desired.

gcc/
2021-05-18  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
	3.1   IEEE   128-bit   floating  point   xsmaxcqp   and   xsmincqp
	instructions.
	* config/rs6000/rs6000.md (s<minmax><mode>3, IEEE128 iterator):
	New insns.

gcc/testsuite/
2021-05-18  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/float128-minmax-2.c: New test.
	* gcc.target/powerpc/float128-minmax.c: Turn off power10 code
	generation.
---
 gcc/config/rs6000/rs6000.c                        |  3 ++-
 gcc/config/rs6000/rs6000.md                       | 11 +++++++++++
 .../gcc.target/powerpc/float128-minmax-2.c        | 15 +++++++++++++++
 .../gcc.target/powerpc/float128-minmax.c          |  7 +++++++
 4 files changed, 35 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 0d0595dddd6..fdaf12aeda0 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -16111,7 +16111,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
   /* VSX/altivec have direct min/max insns.  */
   if ((code == SMAX || code == SMIN)
       && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
-	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
+	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
+	  || (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))))
     {
       emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
       return;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 0bfeb24d9e8..3a1bc1f8547 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5196,6 +5196,17 @@ (define_insn "*s<minmax><mode>3_vsx"
 }
   [(set_attr "type" "fp")])
 
+;; Min/max for ISA 3.1 IEEE 128-bit floating point
+(define_insn "s<minmax><mode>3"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(fp_minmax:IEEE128
+	 (match_operand:IEEE128 1 "altivec_register_operand" "v")
+	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
+  "TARGET_POWER10"
+  "xs<minmax>cqp %0,%1,%2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
 ;; The conditional move instructions allow us to perform max and min operations
 ;; even when we don't have the appropriate max/min instruction using the FSEL
 ;; instruction.
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
new file mode 100644
index 00000000000..c71ba08c9f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
@@ -0,0 +1,15 @@
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
+TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
index fe397518f2f..c3af759c0b9 100644
--- a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
@@ -3,6 +3,13 @@
 /* { dg-require-effective-target float128 } */
 /* { dg-options "-mpower9-vector -O2 -ffast-math" } */
 
+/* If the compiler was configured to automatically generate power10 support with
+   --with-cpu=power10, turn it off.  Otherwise, it will generate XXMAXCQP and
+   XXMINCQP instructions.  */
+#ifdef _ARCH_PWR10
+#pragma GCC target ("cpu=power9")
+#endif
+
 #ifndef TYPE
 #define TYPE _Float128
 #endif
-- 
2.31.1


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-04-15 15:57 [PATCH 0/2] Add IEEE 128-bit min/max/conditional move Michael Meissner
@ 2021-04-15 16:00 ` Michael Meissner
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Meissner @ 2021-04-15 16:00 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt

[PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.

This patch adds the support for the IEEE 128-bit floating point C minimum and
maximum instructions.  The next patch will add the support for using the
compare and set mask instruction to implement conditional moves.

I removed the FLOAT128_MIN_MAX_FPMASK_P macro that was in the last iteration of
the patch.

I have built bootstrap compilers on both a little endian power9 Linux system
and a big endian power8 Linux system.  There were no regressions in either
build in the test suites.  Can I check these changes into the trunk for gcc 11?

gcc/
2021-04-14  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
	3.1 IEEE 128-bit floating point xsmaxcqp and xsmincqp instructions.
	* config/rs6000/rs6000.md (s<minmax><mode>3, IEEE128 iterator):
	New insns.

gcc/testsuite/
2021-04-14  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/float128-minmax-2.c: New test.
---
 gcc/config/rs6000/rs6000.c                        |  3 ++-
 gcc/config/rs6000/rs6000.md                       | 11 +++++++++++
 .../gcc.target/powerpc/float128-minmax-2.c        | 15 +++++++++++++++
 3 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 48b8efd732b..8d00f99e9fd 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -16111,7 +16111,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
   /* VSX/altivec have direct min/max insns.  */
   if ((code == SMAX || code == SMIN)
       && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
-	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
+	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
+	  || (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))))
     {
       emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
       return;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c8cdc42533c..17b2fdc1cdd 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5194,6 +5194,17 @@ (define_insn "*s<minmax><mode>3_vsx"
 }
   [(set_attr "type" "fp")])
 
+;; Min/max for ISA 3.1 IEEE 128-bit floating point
+(define_insn "s<minmax><mode>3"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(fp_minmax:IEEE128
+	 (match_operand:IEEE128 1 "altivec_register_operand" "v")
+	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
+  "TARGET_POWER10"
+  "xs<minmax>cqp %0,%1,%2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
 ;; The conditional move instructions allow us to perform max and min operations
 ;; even when we don't have the appropriate max/min instruction using the FSEL
 ;; instruction.
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
new file mode 100644
index 00000000000..c71ba08c9f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
@@ -0,0 +1,15 @@
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
+TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
-- 
2.22.0


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-06-08 23:52 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-09 14:38 [PATCH 0/2] Add IEEE 128-bit min/max support on PowerPC Michael Meissner
2021-04-09 14:42 ` [PATCH 1/2] " Michael Meissner
2021-04-09 16:54   ` will schmidt
2021-04-13 17:57     ` Segher Boessenkool
2021-04-13 22:19   ` Segher Boessenkool
2021-04-14 19:09     ` Michael Meissner
2021-04-14 19:15       ` Segher Boessenkool
2021-04-14 20:51         ` Michael Meissner
2021-04-09 14:43 ` [PATCH 2/2] " Michael Meissner
2021-04-09 16:54   ` will schmidt
2021-04-09 19:20     ` Bernhard Reutner-Fischer
2021-04-14 19:01       ` Segher Boessenkool
2021-04-14 20:19         ` Bernhard Reutner-Fischer
2021-04-14 19:38   ` Segher Boessenkool
2021-04-14 20:48     ` Michael Meissner
2021-04-15 15:57 [PATCH 0/2] Add IEEE 128-bit min/max/conditional move Michael Meissner
2021-04-15 16:00 ` [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC Michael Meissner
2021-05-18 20:22 [PATCH 0/2] Add power10 IEEE 128-bit min/max/conditional move support Michael Meissner
2021-05-18 20:26 ` [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC Michael Meissner
2021-05-20 19:25   ` will schmidt
2021-05-21  1:38     ` Michael Meissner
2021-06-07 20:08       ` Segher Boessenkool
2021-06-07 20:25   ` Segher Boessenkool
2021-06-08 23:52     ` Michael Meissner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).