[PATCH 0/2] Add power10 IEEE 128-bit min/max/conditional move support

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH 0/2] Add power10 IEEE 128-bit min/max/conditional move support
@ 2021-05-18 20:22 Michael Meissner
  2021-05-18 20:26 ` [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC Michael Meissner
  2021-05-18 20:28 ` [PATCH 2/2] Add IEEE 128-bit fp conditional move " Michael Meissner
  0 siblings, 2 replies; 18+ messages in thread
From: Michael Meissner @ 2021-05-18 20:22 UTC (permalink / raw)
  To: gcc-patches, Michael Meissner, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt

The following two patches are new versions of the patches I've submitted in the
past to add support for the power10 IEEE 128-bit XSMAXCQP, XSMINCQP, XSCMPEQQ,
XSCMPGTQ, and XSCMPGEQ instructions.

This time I'm not trying to share code with the DFmode/SFmode min, max, or
conditional move support.  One reason is going between VSX registers (SF/DF)
and Altivec registers (KF/TF) adds to the complexity.

The first patch adds support for the XSMINCQP and XSMAXCQP instructions.  GCC
will generate these instructions if a built-in function is used or if the use
has turned on fast math.

The second patch adds support for the XSCMPEQ, XSCMPGEQ, and XSCMPGTQ
instructions and conditional moves.  With conditional move support, the
compiler will generate XSMINCQP and XSMAXCQP more often without fast math.

Too keep the complexity down, the conditional move support only supports both
the type for the items being compare and the items being move being the same.
Note, there is code to allow SF/DF inter-mixing for contiional move, and that
is not changed with this patch.

I.e. for conditional move, this will use the XSCMPEQP and XXSEL instructions:

	__float128 cmp1, cmp2;
	__float128 result, move1, move2;
		/* ... */
	result = (cmp1 == cmp2) ? move1 : move2;

I have done bootstrap builds with this patch on the following 3 systems:
    1)	power9 running LE Linux using --with-cpu=power9
    2)	power8 running BE Linux using --with-cpu=power8, testing both
	32/64-bit.
    3)	power10 prototype running LE Linux using --with-cpu=power10.

There were no regressions to the tests, and the new test added passed.  Can I
check these patches into trunk branch for GCC 12?

I would like to check these patches into GCC 11 after a cooling off period, but
I can also not do the backport if desired.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-05-18 20:22 [PATCH 0/2] Add power10 IEEE 128-bit min/max/conditional move support Michael Meissner
@ 2021-05-18 20:26 ` Michael Meissner
  2021-05-20 19:25   ` will schmidt
                     ` (3 more replies)
  2021-05-18 20:28 ` [PATCH 2/2] Add IEEE 128-bit fp conditional move " Michael Meissner
  1 sibling, 4 replies; 18+ messages in thread
From: Michael Meissner @ 2021-05-18 20:26 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt

[PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.

This patch adds the support for the IEEE 128-bit floating point C minimum and
maximum instructions.  The next patch will add the support for using the
compare and set mask instruction to implement conditional moves.

This patch does not try to re-use the code used for SF/DF min/max
support.  It defines a separate insn for the IEEE 128-bit support.  It
uses the code iterator <minmax> to simplify adding both operations.

GCC will not convert ?: operations into using min/max instructions provided in
this patch unless the user uses -Ofast or similar switches due to issues with
NaNs.  The next patch that adds conditional move instructions will enable the
?: conversion in many cases.

I have done bootstrap builds with this patch on the following 3 systems:
    1)	power9 running LE Linux using --with-cpu=power9
    2)	power8 running BE Linux using --with-cpu=power8, testing both
	32/64-bit.
    3)	power10 prototype running LE Linux using --with-cpu=power10.

There were no regressions to the tests, and the new test added passed.  Can I
check these patches into trunk branch for GCC 12?

I would like to check these patches into GCC 11 after a cooling off period, but
I can also not do the backport if desired.

gcc/
2021-05-18  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
	3.1   IEEE   128-bit   floating  point   xsmaxcqp   and   xsmincqp
	instructions.
	* config/rs6000/rs6000.md (s<minmax><mode>3, IEEE128 iterator):
	New insns.

gcc/testsuite/
2021-05-18  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/float128-minmax-2.c: New test.
	* gcc.target/powerpc/float128-minmax.c: Turn off power10 code
	generation.
---
 gcc/config/rs6000/rs6000.c                        |  3 ++-
 gcc/config/rs6000/rs6000.md                       | 11 +++++++++++
 .../gcc.target/powerpc/float128-minmax-2.c        | 15 +++++++++++++++
 .../gcc.target/powerpc/float128-minmax.c          |  7 +++++++
 4 files changed, 35 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 0d0595dddd6..fdaf12aeda0 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -16111,7 +16111,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
   /* VSX/altivec have direct min/max insns.  */
   if ((code == SMAX || code == SMIN)
       && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
-	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
+	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
+	  || (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))))
     {
       emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
       return;
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 0bfeb24d9e8..3a1bc1f8547 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5196,6 +5196,17 @@ (define_insn "*s<minmax><mode>3_vsx"
 }
   [(set_attr "type" "fp")])
 
+;; Min/max for ISA 3.1 IEEE 128-bit floating point
+(define_insn "s<minmax><mode>3"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(fp_minmax:IEEE128
+	 (match_operand:IEEE128 1 "altivec_register_operand" "v")
+	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
+  "TARGET_POWER10"
+  "xs<minmax>cqp %0,%1,%2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
 ;; The conditional move instructions allow us to perform max and min operations
 ;; even when we don't have the appropriate max/min instruction using the FSEL
 ;; instruction.
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
new file mode 100644
index 00000000000..c71ba08c9f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
@@ -0,0 +1,15 @@
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
+TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
index fe397518f2f..c3af759c0b9 100644
--- a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
@@ -3,6 +3,13 @@
 /* { dg-require-effective-target float128 } */
 /* { dg-options "-mpower9-vector -O2 -ffast-math" } */
 
+/* If the compiler was configured to automatically generate power10 support with
+   --with-cpu=power10, turn it off.  Otherwise, it will generate XXMAXCQP and
+   XXMINCQP instructions.  */
+#ifdef _ARCH_PWR10
+#pragma GCC target ("cpu=power9")
+#endif
+
 #ifndef TYPE
 #define TYPE _Float128
 #endif
-- 
2.31.1


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.
  2021-05-18 20:22 [PATCH 0/2] Add power10 IEEE 128-bit min/max/conditional move support Michael Meissner
  2021-05-18 20:26 ` [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC Michael Meissner
@ 2021-05-18 20:28 ` Michael Meissner
  2021-05-20 19:27   ` will schmidt
                     ` (3 more replies)
  1 sibling, 4 replies; 18+ messages in thread
From: Michael Meissner @ 2021-05-18 20:28 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt

[PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.

This patch adds the support for power10 IEEE 128-bit floating point conditional
move and for automatically generating min/max.

In this patch, I simplified things compared to previous patches.  Instead of
allowing any four of the modes to be used for the conditional move comparison
and the move itself could use different modes, I restricted the conditional
move to just the same mode.  I.e. you can do:

        _Float128 a, b, c, d, e, r;

        r = (a == b) ? c : d;

But you can't do:

        _Float128 c, d, r;
        double a, b;

        r = (a == b) ? c : d;

or:

        _Float128 a, b;
        double c, d, r;

        r = (a == b) ? c : d;

This eliminates a lot of the complexity of the code, because you don't have to
worry about the sizes being different, and the IEEE 128-bit types being
restricted to Altivec registers, while the SF/DF modes can use any VSX
register.

I did not modify the existing support that allowed conditional moves where
SFmode operands are compared and DFmode operands are moved (and vice versa).

I modified the test cases that I added to reflect this change.  I have also
fixed the test for not equal to use '!=' instead of '=='.

I have done bootstrap builds with this patch on the following 3 systems:
    1)	power9 running LE Linux using --with-cpu=power9
    2)	power8 running BE Linux using --with-cpu=power8, testing both
	32/64-bit.
    3)	power10 prototype running LE Linux using --with-cpu=power10.

There were no regressions to the tests, and the new test added passed.  Can I
check these patches into trunk branch for GCC 12?

I would like to check these patches into GCC 11 after a cooling off period, but
I can also not do the backport if desired.

gcc/
2021-05-18 Michael Meissner  <meissner@linux.ibm.com>

        * config/rs6000/rs6000.c (rs6000_maybe_emit_fp_cmove): Add IEEE
	128-bit floating point conditional move support.
	(have_compare_and_set_mask): Add IEEE 128-bit floating point
	types.
	* config/rs6000/rs6000.md (mov<mode>cc, IEEE128 iterator): New insn.
	(mov<mode>cc_p10, IEEE128 iterator): New insn.
	(mov<mode>cc_invert_p10, IEEE128 iterator): New insn.
	(fpmask<mode>, IEEE128 iterator): New insn.
	(xxsel<mode>, IEEE128 iterator): New insn.

gcc/testsuite/
2021-05-18  Michael Meissner  <meissner@linux.ibm.com>

        * gcc.target/powerpc/float128-cmove.c: New test.
        * gcc.target/powerpc/float128-minmax-3.c: New test.
---
 gcc/config/rs6000/rs6000.c                    |  38 ++++++-
 gcc/config/rs6000/rs6000.md                   | 106 ++++++++++++++++++
 .../gcc.target/powerpc/float128-cmove.c       |  58 ++++++++++
 .../gcc.target/powerpc/float128-minmax-3.c    |  15 +++
 4 files changed, 215 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-cmove.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index fdaf12aeda0..ef1ebaaee05 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15706,8 +15706,8 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   return 1;
 }
 
-/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum or
-   minimum with "C" semantics.
+/* Possibly emit the xsmaxc{dp,qp} and xsminc{dp,qp} instructions to emit a
+   maximum or minimum with "C" semantics.
 
    Unless you use -ffast-math, you can't use these instructions to replace
    conditions that implicitly reverse the condition because the comparison
@@ -15783,6 +15783,7 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   enum rtx_code code = GET_CODE (op);
   rtx op0 = XEXP (op, 0);
   rtx op1 = XEXP (op, 1);
+  machine_mode compare_mode = GET_MODE (op0);
   machine_mode result_mode = GET_MODE (dest);
   rtx compare_rtx;
   rtx cmove_rtx;
@@ -15791,6 +15792,35 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   if (!can_create_pseudo_p ())
     return 0;
 
+  /* We allow the comparison to be either SFmode/DFmode and the true/false
+     condition to be either SFmode/DFmode.  I.e. we allow:
+
+	float a, b;
+	double c, d, r;
+
+	r = (a == b) ? c : d;
+
+    and:
+
+	double a, b;
+	float c, d, r;
+
+	r = (a == b) ? c : d;
+
+    but we don't allow intermixing the IEEE 128-bit floating point types with
+    the 32/64-bit scalar types.
+
+    It gets too messy where SFmode/DFmode can use any register and TFmode/KFmode
+    can only use Altivec registers.  In addtion, we would need to do a XXPERMDI
+    if we compare SFmode/DFmode and move TFmode/KFmode.  */
+
+  if (compare_mode == result_mode
+      || (compare_mode == SFmode && result_mode == DFmode)
+      || (compare_mode == DFmode && result_mode == SFmode))
+    ;
+  else
+    return false;
+
   switch (code)
     {
     case EQ:
@@ -15843,6 +15873,10 @@ have_compare_and_set_mask (machine_mode mode)
     case E_DFmode:
       return TARGET_P9_MINMAX;
 
+    case E_KFmode:
+    case E_TFmode:
+      return TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode);
+
     default:
       break;
     }
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 3a1bc1f8547..0c76338c734 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5431,6 +5431,112 @@ (define_insn "*xxsel<mode>"
   "xxsel %x0,%x4,%x3,%x1"
   [(set_attr "type" "vecmove")])
 
+;; Support for ISA 3.1 IEEE 128-bit conditional move.  The mode used in the
+;; comparison must be the same as used in the conditional move.
+(define_expand "mov<mode>cc"
+   [(set (match_operand:IEEE128 0 "gpc_reg_operand")
+	 (if_then_else:IEEE128 (match_operand 1 "comparison_operator")
+			       (match_operand:IEEE128 2 "gpc_reg_operand")
+			       (match_operand:IEEE128 3 "gpc_reg_operand")))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+{
+  if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
+    DONE;
+  else
+    FAIL;
+})
+
+(define_insn_and_split "*mov<mode>cc_p10"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=&v,v")
+	(if_then_else:IEEE128
+	 (match_operator:CCFP 1 "fpmask_comparison_operator"
+		[(match_operand:IEEE128 2 "altivec_register_operand" "v,v")
+		 (match_operand:IEEE128 3 "altivec_register_operand" "v,v")])
+	 (match_operand:IEEE128 4 "altivec_register_operand" "v,v")
+	 (match_operand:IEEE128 5 "altivec_register_operand" "v,v")))
+   (clobber (match_scratch:V2DI 6 "=0,&v"))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "#"
+  "&& 1"
+  [(set (match_dup 6)
+	(if_then_else:V2DI (match_dup 1)
+			   (match_dup 7)
+			   (match_dup 8)))
+   (set (match_dup 0)
+	(if_then_else:IEEE128 (ne (match_dup 6)
+				  (match_dup 8))
+			      (match_dup 4)
+			      (match_dup 5)))]
+{
+  if (GET_CODE (operands[6]) == SCRATCH)
+    operands[6] = gen_reg_rtx (V2DImode);
+
+  operands[7] = CONSTM1_RTX (V2DImode);
+  operands[8] = CONST0_RTX (V2DImode);
+}
+ [(set_attr "length" "8")
+  (set_attr "type" "vecperm")])
+
+;; Handle inverting the fpmask comparisons.
+(define_insn_and_split "*mov<mode>cc_invert_p10"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=&v,v")
+	(if_then_else:IEEE128
+	 (match_operator:CCFP 1 "invert_fpmask_comparison_operator"
+		[(match_operand:IEEE128 2 "altivec_register_operand" "v,v")
+		 (match_operand:IEEE128 3 "altivec_register_operand" "v,v")])
+	 (match_operand:IEEE128 4 "altivec_register_operand" "v,v")
+	 (match_operand:IEEE128 5 "altivec_register_operand" "v,v")))
+   (clobber (match_scratch:V2DI 6 "=0,&v"))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "#"
+  "&& 1"
+  [(set (match_dup 6)
+	(if_then_else:V2DI (match_dup 9)
+			   (match_dup 7)
+			   (match_dup 8)))
+   (set (match_dup 0)
+	(if_then_else:IEEE128 (ne (match_dup 6)
+				  (match_dup 8))
+			      (match_dup 5)
+			      (match_dup 4)))]
+{
+  rtx op1 = operands[1];
+  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (op1));
+
+  if (GET_CODE (operands[6]) == SCRATCH)
+    operands[6] = gen_reg_rtx (V2DImode);
+
+  operands[7] = CONSTM1_RTX (V2DImode);
+  operands[8] = CONST0_RTX (V2DImode);
+
+  operands[9] = gen_rtx_fmt_ee (cond, CCFPmode, operands[2], operands[3]);
+}
+ [(set_attr "length" "8")
+  (set_attr "type" "vecperm")])
+
+(define_insn "*fpmask<mode>"
+  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
+	(if_then_else:V2DI
+	 (match_operator:CCFP 1 "fpmask_comparison_operator"
+		[(match_operand:IEEE128 2 "altivec_register_operand" "v")
+		 (match_operand:IEEE128 3 "altivec_register_operand" "v")])
+	 (match_operand:V2DI 4 "all_ones_constant" "")
+	 (match_operand:V2DI 5 "zero_constant" "")))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xscmp%V1qp %0,%2,%3"
+  [(set_attr "type" "fpcompare")])
+
+(define_insn "*xxsel<mode>"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(if_then_else:IEEE128
+	 (ne (match_operand:V2DI 1 "altivec_register_operand" "v")
+	     (match_operand:V2DI 2 "zero_constant" ""))
+	 (match_operand:IEEE128 3 "altivec_register_operand" "v")
+	 (match_operand:IEEE128 4 "altivec_register_operand" "v")))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xxsel %x0,%x4,%x3,%x1"
+  [(set_attr "type" "vecmove")])
+
 \f
 ;; Conversions to and from floating-point.
 
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmove.c b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
new file mode 100644
index 00000000000..2fae8dc23bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#ifndef TYPE
+#ifdef __LONG_DOUBLE_IEEE128__
+#define TYPE long double
+
+#else
+#define TYPE _Float128
+#endif
+#endif
+
+/* Verify that the ISA 3.1 (power10) IEEE 128-bit conditional move instructions
+   are generated.  */
+
+TYPE
+eq (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a == b) ? c : d;
+}
+
+TYPE
+ne (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a != b) ? c : d;
+}
+
+TYPE
+lt (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a < b) ? c : d;
+}
+
+TYPE
+le (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a <= b) ? c : d;
+}
+
+TYPE
+gt (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a > b) ? c : d;
+}
+
+TYPE
+ge (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a >= b) ? c : d;
+}
+
+/* { dg-final { scan-assembler-times {\mxscmpeqqp\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxscmpgeqp\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxscmpgtqp\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxxsel\M}     6 } } */
+/* { dg-final { scan-assembler-not   {\mxscmpuqp\M}    } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
new file mode 100644
index 00000000000..6f7627c0f2a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
@@ -0,0 +1,15 @@
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return (a < b) ? a : b; }
+TYPE f128_max (TYPE a, TYPE b) { return (b > a) ? b : a; }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
-- 
2.31.1

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-05-18 20:26 ` [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC Michael Meissner
@ 2021-05-20 19:25   ` will schmidt
  2021-05-21  1:38     ` Michael Meissner
  2021-05-24 15:47   ` Ping " Michael Meissner
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: will schmidt @ 2021-05-20 19:25 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

On Tue, 2021-05-18 at 16:26 -0400, Michael Meissner wrote:
> [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
> 

Hi,


> This patch adds the support for the IEEE 128-bit floating point C minimum and
> maximum instructions.  The next patch will add the support for using the
> compare and set mask instruction to implement conditional moves.
> 
> This patch does not try to re-use the code used for SF/DF min/max
> support.  It defines a separate insn for the IEEE 128-bit support.  It
> uses the code iterator <minmax> to simplify adding both operations.
> 
> GCC will not convert ?: operations into using min/max instructions provided in

I'd throw the ternary term in there, easier to search for later. 
s/?: operations/ternary (?:) operations /

> this patch unless the user uses -Ofast or similar switches due to issues with
> NaNs.  The next patch that adds conditional move instructions will enable the
> ?: conversion in many cases.
> 
> I have done bootstrap builds with this patch on the following 3 systems:
>     1)	power9 running LE Linux using --with-cpu=power9
>     2)	power8 running BE Linux using --with-cpu=power8, testing both
> 	32/64-bit.
>     3)	power10 prototype running LE Linux using --with-cpu=power10.
> 
> There were no regressions to the tests, and the new test added passed.  Can I
> check these patches into trunk branch for GCC 12?
> 
> I would like to check these patches into GCC 11 after a cooling off period, but
> I can also not do the backport if desired.
> 
> gcc/
> 2021-05-18  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> 	3.1   IEEE   128-bit   floating  point   xsmaxcqp   and   xsmincqp
> 	instructions.
> 	* config/rs6000/rs6000.md (s<minmax><mode>3, IEEE128 iterator):
> 	New insns.

ok

> 
> gcc/testsuite/
> 2021-05-18  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* gcc.target/powerpc/float128-minmax-2.c: New test.
> 	* gcc.target/powerpc/float128-minmax.c: Turn off power10 code
> 	generation.

So, presumably the float128-minmax-2.c test adds/replaces the power10
code gen tests that were removed or disabled from float128-minmax.c. 



> ---
>  gcc/config/rs6000/rs6000.c                        |  3 ++-
>  gcc/config/rs6000/rs6000.md                       | 11 +++++++++++
>  .../gcc.target/powerpc/float128-minmax-2.c        | 15 +++++++++++++++
>  .../gcc.target/powerpc/float128-minmax.c          |  7 +++++++
>  4 files changed, 35 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 0d0595dddd6..fdaf12aeda0 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -16111,7 +16111,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
>    /* VSX/altivec have direct min/max insns.  */
>    if ((code == SMAX || code == SMIN)
>        && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
> -	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
> +	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
> +	  || (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))))
>      {
>        emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
>        return;
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 0bfeb24d9e8..3a1bc1f8547 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -5196,6 +5196,17 @@ (define_insn "*s<minmax><mode>3_vsx"
>  }
>    [(set_attr "type" "fp")])
> 
> +;; Min/max for ISA 3.1 IEEE 128-bit floating point
> +(define_insn "s<minmax><mode>3"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> +	(fp_minmax:IEEE128
> +	 (match_operand:IEEE128 1 "altivec_register_operand" "v")
> +	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
> +  "TARGET_POWER10"
> +  "xs<minmax>cqp %0,%1,%2"
> +  [(set_attr "type" "vecfloat")
> +   (set_attr "size" "128")])
> +
>  ;; The conditional move instructions allow us to perform max and min operations
>  ;; even when we don't have the appropriate max/min instruction using the FSEL
>  ;; instruction.

ok


> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> new file mode 100644
> index 00000000000..c71ba08c9f8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> @@ -0,0 +1,15 @@
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
> +
> +#ifndef TYPE
> +#define TYPE _Float128
> +#endif
> +
> +/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
> +   call.  */
> +TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
> +TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
> +
> +/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
> +/* { dg-final { scan-assembler {\mxsmincqp\M} } } */

ok

> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> index fe397518f2f..c3af759c0b9 100644
> --- a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> @@ -3,6 +3,13 @@
>  /* { dg-require-effective-target float128 } */
>  /* { dg-options "-mpower9-vector -O2 -ffast-math" } */
> 
> +/* If the compiler was configured to automatically generate power10 support with
> +   --with-cpu=power10, turn it off.  Otherwise, it will generate XXMAXCQP and
> +   XXMINCQP instructions.  */
> +#ifdef _ARCH_PWR10
> +#pragma GCC target ("cpu=power9")
> +#endif

Probably fine..  It's good to exercise the pragma target stuff, thoguh
I wonder if it would be better to just specify -mcpu=power9 in the
options since we are already specifying (redundant?) -mpower9-vector. 

I see similar changes in a later patch, probably OK there since those
tests do not appear to be specifying -mcpu=foo options that are already
pointed at power9 features...


> +
>  #ifndef TYPE
>  #define TYPE _Float128
>  #endif
> -- 
> 2.31.1

lgtm, 
thanks
-Will

> 
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.
  2021-05-18 20:28 ` [PATCH 2/2] Add IEEE 128-bit fp conditional move " Michael Meissner
@ 2021-05-20 19:27   ` will schmidt
  2021-05-21  1:45     ` Michael Meissner
  2021-06-07 21:29     ` Segher Boessenkool
  2021-05-24 15:49   ` Ping " Michael Meissner
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 18+ messages in thread
From: will schmidt @ 2021-05-20 19:27 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

On Tue, 2021-05-18 at 16:28 -0400, Michael Meissner wrote:
> [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.
> 

Hi,


> This patch adds the support for power10 IEEE 128-bit floating point conditional
> move and for automatically generating min/max.
> 
> In this patch, I simplified things compared to previous patches.  Instead of
> allowing any four of the modes to be used for the conditional move comparison
> and the move itself could use different modes, I restricted the conditional
> move to just the same mode.  I.e. you can do:

ok.

> 
>         _Float128 a, b, c, d, e, r;
> 
>         r = (a == b) ? c : d;
> 
> But you can't do:
> 
>         _Float128 c, d, r;
>         double a, b;
> 
>         r = (a == b) ? c : d;
> 
> or:
> 
>         _Float128 a, b;
>         double c, d, r;
> 
>         r = (a == b) ? c : d;
> 
> This eliminates a lot of the complexity of the code, because you don't have to
> worry about the sizes being different, and the IEEE 128-bit types being
> restricted to Altivec registers, while the SF/DF modes can use any VSX
> register.
> 
> I did not modify the existing support that allowed conditional moves where
> SFmode operands are compared and DFmode operands are moved (and vice versa).
> 
> I modified the test cases that I added to reflect this change.  I have also
> fixed the test for not equal to use '!=' instead of '=='.
> 
> I have done bootstrap builds with this patch on the following 3 systems:
>     1)	power9 running LE Linux using --with-cpu=power9
>     2)	power8 running BE Linux using --with-cpu=power8, testing both
> 	32/64-bit.
>     3)	power10 prototype running LE Linux using --with-cpu=power10.
> 
> There were no regressions to the tests, and the new test added passed.  Can I
> check these patches into trunk branch for GCC 12?
> 
> I would like to check these patches into GCC 11 after a cooling off period, but
> I can also not do the backport if desired.
> 
> gcc/
> 2021-05-18 Michael Meissner  <meissner@linux.ibm.com>
> 
>         * config/rs6000/rs6000.c (rs6000_maybe_emit_fp_cmove): Add IEEE
> 	128-bit floating point conditional move support.
> 	(have_compare_and_set_mask): Add IEEE 128-bit floating point
> 	types.
> 	* config/rs6000/rs6000.md (mov<mode>cc, IEEE128 iterator): New insn.
> 	(mov<mode>cc_p10, IEEE128 iterator): New insn.
> 	(mov<mode>cc_invert_p10, IEEE128 iterator): New insn.
> 	(fpmask<mode>, IEEE128 iterator): New insn.
> 	(xxsel<mode>, IEEE128 iterator): New insn.
> 
> gcc/testsuite/
> 2021-05-18  Michael Meissner  <meissner@linux.ibm.com>
> 
>         * gcc.target/powerpc/float128-cmove.c: New test.
>         * gcc.target/powerpc/float128-minmax-3.c: New test.

ok


> ---
>  gcc/config/rs6000/rs6000.c                    |  38 ++++++-
>  gcc/config/rs6000/rs6000.md                   | 106 ++++++++++++++++++
>  .../gcc.target/powerpc/float128-cmove.c       |  58 ++++++++++
>  .../gcc.target/powerpc/float128-minmax-3.c    |  15 +++
>  4 files changed, 215 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-cmove.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index fdaf12aeda0..ef1ebaaee05 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -15706,8 +15706,8 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
>    return 1;
>  }
> 
> -/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum or
> -   minimum with "C" semantics.
> +/* Possibly emit the xsmaxc{dp,qp} and xsminc{dp,qp} instructions to emit a
> +   maximum or minimum with "C" semantics.
> 
>     Unless you use -ffast-math, you can't use these instructions to replace
>     conditions that implicitly reverse the condition because the comparison
> @@ -15783,6 +15783,7 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>    enum rtx_code code = GET_CODE (op);
>    rtx op0 = XEXP (op, 0);
>    rtx op1 = XEXP (op, 1);
> +  machine_mode compare_mode = GET_MODE (op0);
>    machine_mode result_mode = GET_MODE (dest);
>    rtx compare_rtx;
>    rtx cmove_rtx;
> @@ -15791,6 +15792,35 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>    if (!can_create_pseudo_p ())
>      return 0;
> 
> +  /* We allow the comparison to be either SFmode/DFmode and the true/false
> +     condition to be either SFmode/DFmode.  I.e. we allow:
> +
> +	float a, b;
> +	double c, d, r;
> +
> +	r = (a == b) ? c : d;
> +
> +    and:
> +
> +	double a, b;
> +	float c, d, r;
> +
> +	r = (a == b) ? c : d;


This new comment does not seem to align with the comments in the
description, which statee "But you can't do ..." 


> +
> +    but we don't allow intermixing the IEEE 128-bit floating point types with
> +    the 32/64-bit scalar types.
> +
> +    It gets too messy where SFmode/DFmode can use any register and TFmode/KFmode
> +    can only use Altivec registers.  In addtion, we would need to do a XXPERMDI
> +    if we compare SFmode/DFmode and move TFmode/KFmode.  */
> +
> +  if (compare_mode == result_mode
> +      || (compare_mode == SFmode && result_mode == DFmode)
> +      || (compare_mode == DFmode && result_mode == SFmode))
> +    ;
> +  else
> +    return false;

Interesting if/else block.  May want to reverse the logic. I defer if
this way is notably simpler than inverting it.


> +
>    switch (code)
>      {
>      case EQ:
> @@ -15843,6 +15873,10 @@ have_compare_and_set_mask (machine_mode mode)
>      case E_DFmode:
>        return TARGET_P9_MINMAX;
> 
> +    case E_KFmode:
> +    case E_TFmode:
> +      return TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode);
> +
>      default:
>        break;
>      }

ok

> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 3a1bc1f8547..0c76338c734 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -5431,6 +5431,112 @@ (define_insn "*xxsel<mode>"
>    "xxsel %x0,%x4,%x3,%x1"
>    [(set_attr "type" "vecmove")])
> 
> +;; Support for ISA 3.1 IEEE 128-bit conditional move.  The mode used in the
> +;; comparison must be the same as used in the conditional move.
> +(define_expand "mov<mode>cc"
> +   [(set (match_operand:IEEE128 0 "gpc_reg_operand")
> +	 (if_then_else:IEEE128 (match_operand 1 "comparison_operator")
> +			       (match_operand:IEEE128 2 "gpc_reg_operand")
> +			       (match_operand:IEEE128 3 "gpc_reg_operand")))]
> +  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> +{
> +  if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
> +    DONE;
> +  else
> +    FAIL;
> +})
> +
> +(define_insn_and_split "*mov<mode>cc_p10"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=&v,v")
> +	(if_then_else:IEEE128
> +	 (match_operator:CCFP 1 "fpmask_comparison_operator"
> +		[(match_operand:IEEE128 2 "altivec_register_operand" "v,v")
> +		 (match_operand:IEEE128 3 "altivec_register_operand" "v,v")])
> +	 (match_operand:IEEE128 4 "altivec_register_operand" "v,v")
> +	 (match_operand:IEEE128 5 "altivec_register_operand" "v,v")))
> +   (clobber (match_scratch:V2DI 6 "=0,&v"))]
> +  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> +  "#"
> +  "&& 1"
> +  [(set (match_dup 6)
> +	(if_then_else:V2DI (match_dup 1)
> +			   (match_dup 7)
> +			   (match_dup 8)))
> +   (set (match_dup 0)
> +	(if_then_else:IEEE128 (ne (match_dup 6)
> +				  (match_dup 8))
> +			      (match_dup 4)
> +			      (match_dup 5)))]
> +{
> +  if (GET_CODE (operands[6]) == SCRATCH)
> +    operands[6] = gen_reg_rtx (V2DImode);
> +
> +  operands[7] = CONSTM1_RTX (V2DImode);
> +  operands[8] = CONST0_RTX (V2DImode);
> +}
> + [(set_attr "length" "8")
> +  (set_attr "type" "vecperm")])
> +
> +;; Handle inverting the fpmask comparisons.
> +(define_insn_and_split "*mov<mode>cc_invert_p10"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=&v,v")
> +	(if_then_else:IEEE128
> +	 (match_operator:CCFP 1 "invert_fpmask_comparison_operator"
> +		[(match_operand:IEEE128 2 "altivec_register_operand" "v,v")
> +		 (match_operand:IEEE128 3 "altivec_register_operand" "v,v")])
> +	 (match_operand:IEEE128 4 "altivec_register_operand" "v,v")
> +	 (match_operand:IEEE128 5 "altivec_register_operand" "v,v")))
> +   (clobber (match_scratch:V2DI 6 "=0,&v"))]
> +  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> +  "#"
> +  "&& 1"
> +  [(set (match_dup 6)
> +	(if_then_else:V2DI (match_dup 9)
> +			   (match_dup 7)
> +			   (match_dup 8)))
> +   (set (match_dup 0)
> +	(if_then_else:IEEE128 (ne (match_dup 6)
> +				  (match_dup 8))
> +			      (match_dup 5)
> +			      (match_dup 4)))]
> +{
> +  rtx op1 = operands[1];
> +  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (op1));
> +
> +  if (GET_CODE (operands[6]) == SCRATCH)
> +    operands[6] = gen_reg_rtx (V2DImode);
> +
> +  operands[7] = CONSTM1_RTX (V2DImode);
> +  operands[8] = CONST0_RTX (V2DImode);
> +
> +  operands[9] = gen_rtx_fmt_ee (cond, CCFPmode, operands[2], operands[3]);
> +}
> + [(set_attr "length" "8")
> +  (set_attr "type" "vecperm")])
> +
> +(define_insn "*fpmask<mode>"
> +  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
> +	(if_then_else:V2DI
> +	 (match_operator:CCFP 1 "fpmask_comparison_operator"
> +		[(match_operand:IEEE128 2 "altivec_register_operand" "v")
> +		 (match_operand:IEEE128 3 "altivec_register_operand" "v")])
> +	 (match_operand:V2DI 4 "all_ones_constant" "")
> +	 (match_operand:V2DI 5 "zero_constant" "")))]
> +  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> +  "xscmp%V1qp %0,%2,%3"
> +  [(set_attr "type" "fpcompare")])
> +
> +(define_insn "*xxsel<mode>"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> +	(if_then_else:IEEE128
> +	 (ne (match_operand:V2DI 1 "altivec_register_operand" "v")
> +	     (match_operand:V2DI 2 "zero_constant" ""))
> +	 (match_operand:IEEE128 3 "altivec_register_operand" "v")
> +	 (match_operand:IEEE128 4 "altivec_register_operand" "v")))]
> +  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> +  "xxsel %x0,%x4,%x3,%x1"
> +  [(set_attr "type" "vecmove")])
> +

skimmed this part, nothing jumped out at me.

>  
>  ;; Conversions to and from floating-point.
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmove.c b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
> new file mode 100644
> index 00000000000..2fae8dc23bc
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
> @@ -0,0 +1,58 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> +
> +#ifndef TYPE
> +#ifdef __LONG_DOUBLE_IEEE128__
> +#define TYPE long double
> +
> +#else
> +#define TYPE _Float128
> +#endif
> +#endif
> +
> +/* Verify that the ISA 3.1 (power10) IEEE 128-bit conditional move instructions
> +   are generated.  */
> +
> +TYPE
> +eq (TYPE a, TYPE b, TYPE c, TYPE d)
> +{
> +  return (a == b) ? c : d;
> +}
> +
> +TYPE
> +ne (TYPE a, TYPE b, TYPE c, TYPE d)
> +{
> +  return (a != b) ? c : d;
> +}
> +
> +TYPE
> +lt (TYPE a, TYPE b, TYPE c, TYPE d)
> +{
> +  return (a < b) ? c : d;
> +}
> +
> +TYPE
> +le (TYPE a, TYPE b, TYPE c, TYPE d)
> +{
> +  return (a <= b) ? c : d;
> +}
> +
> +TYPE
> +gt (TYPE a, TYPE b, TYPE c, TYPE d)
> +{
> +  return (a > b) ? c : d;
> +}
> +
> +TYPE
> +ge (TYPE a, TYPE b, TYPE c, TYPE d)
> +{
> +  return (a >= b) ? c : d;
> +}
> +
> +/* { dg-final { scan-assembler-times {\mxscmpeqqp\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mxscmpgeqp\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mxscmpgtqp\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mxxsel\M}     6 } } */
> +/* { dg-final { scan-assembler-not   {\mxscmpuqp\M}    } } */

ok.

> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
> new file mode 100644
> index 00000000000..6f7627c0f2a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
> @@ -0,0 +1,15 @@
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> +
> +#ifndef TYPE
> +#define TYPE _Float128
> +#endif
> +
> +/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
> +   call.  */

s/"if/then/else"/"minmax"/  ?

> +TYPE f128_min (TYPE a, TYPE b) { return (a < b) ? a : b; }
> +TYPE f128_max (TYPE a, TYPE b) { return (b > a) ? b : a; }
> +
> +/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
> +/* { dg-final { scan-assembler {\mxsmincqp\M} } } */


ok
thanks
-Will

> -- 
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-05-20 19:25   ` will schmidt
@ 2021-05-21  1:38     ` Michael Meissner
  2021-06-07 20:08       ` Segher Boessenkool
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Meissner @ 2021-05-21  1:38 UTC (permalink / raw)
  To: will schmidt
  Cc: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

On Thu, May 20, 2021 at 02:25:58PM -0500, will schmidt wrote:
> I'd throw the ternary term in there, easier to search for later. 
> s/?: operations/ternary (?:) operations /

Thanks.

> So, presumably the float128-minmax-2.c test adds/replaces the power10
> code gen tests that were removed or disabled from float128-minmax.c. 

Yes.

> Probably fine..  It's good to exercise the pragma target stuff, thoguh
> I wonder if it would be better to just specify -mcpu=power9 in the
> options since we are already specifying (redundant?) -mpower9-vector. 
> 
> I see similar changes in a later patch, probably OK there since those
> tests do not appear to be specifying -mcpu=foo options that are already
> pointed at power9 features...

I think we really want a better solution than #pragma, since some systems (AIX
if memory serves) might not support #pragma to change code generation models,
because they don't have the assembler/linker support for it.

Basically for code generation tests, I see the following cases:

1) Test code targetting precisley power8 (or power9, power10), etc.  Hopefully
these are rare.

2) Test code targetting at least power8.  But as these tests show, that a lot
of the code won't generate the appropriate instructions on power10.  This is
what we have now.  It relies on undocumented switches like -mpower9-vector to
add the necessary support.

3) Test code targetting at least power8 but go to power9 at the maximum.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.
  2021-05-20 19:27   ` will schmidt
@ 2021-05-21  1:45     ` Michael Meissner
  2021-06-07 21:29     ` Segher Boessenkool
  1 sibling, 0 replies; 18+ messages in thread
From: Michael Meissner @ 2021-05-21  1:45 UTC (permalink / raw)
  To: will schmidt
  Cc: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

On Thu, May 20, 2021 at 02:27:06PM -0500, will schmidt wrote:
> > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> > index fdaf12aeda0..ef1ebaaee05 100644
> > --- a/gcc/config/rs6000/rs6000.c
> > +++ b/gcc/config/rs6000/rs6000.c
> > @@ -15706,8 +15706,8 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
> >    return 1;
> >  }
> > 
> > -/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum or
> > -   minimum with "C" semantics.
> > +/* Possibly emit the xsmaxc{dp,qp} and xsminc{dp,qp} instructions to emit a
> > +   maximum or minimum with "C" semantics.
> > 
> >     Unless you use -ffast-math, you can't use these instructions to replace
> >     conditions that implicitly reverse the condition because the comparison
> > @@ -15783,6 +15783,7 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
> >    enum rtx_code code = GET_CODE (op);
> >    rtx op0 = XEXP (op, 0);
> >    rtx op1 = XEXP (op, 1);
> > +  machine_mode compare_mode = GET_MODE (op0);
> >    machine_mode result_mode = GET_MODE (dest);
> >    rtx compare_rtx;
> >    rtx cmove_rtx;
> > @@ -15791,6 +15792,35 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
> >    if (!can_create_pseudo_p ())
> >      return 0;
> > 
> > +  /* We allow the comparison to be either SFmode/DFmode and the true/false
> > +     condition to be either SFmode/DFmode.  I.e. we allow:
> > +
> > +	float a, b;
> > +	double c, d, r;
> > +
> > +	r = (a == b) ? c : d;
> > +
> > +    and:
> > +
> > +	double a, b;
> > +	float c, d, r;
> > +
> > +	r = (a == b) ? c : d;
> 
> 
> This new comment does not seem to align with the comments in the
> description, which statee "But you can't do ..." 

Yes, the comment is perhaps a little unclear.

> > +
> > +    but we don't allow intermixing the IEEE 128-bit floating point types with
> > +    the 32/64-bit scalar types.
> > +
> > +    It gets too messy where SFmode/DFmode can use any register and TFmode/KFmode
> > +    can only use Altivec registers.  In addtion, we would need to do a XXPERMDI
> > +    if we compare SFmode/DFmode and move TFmode/KFmode.  */
> > +
> > +  if (compare_mode == result_mode
> > +      || (compare_mode == SFmode && result_mode == DFmode)
> > +      || (compare_mode == DFmode && result_mode == SFmode))
> > +    ;
> > +  else
> > +    return false;
> 
> Interesting if/else block.  May want to reverse the logic. I defer if
> this way is notably simpler than inverting it.

I originally tried inverting it, and it just got messy.

> > +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
> > @@ -0,0 +1,15 @@
> > +/* { dg-require-effective-target ppc_float128_hw } */
> > +/* { dg-require-effective-target power10_ok } */
> > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> > +
> > +#ifndef TYPE
> > +#define TYPE _Float128
> > +#endif
> > +
> > +/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
> > +   call.  */
> 
> s/"if/then/else"/"minmax"/  ?

Thanks.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Ping [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-05-18 20:26 ` [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC Michael Meissner
  2021-05-20 19:25   ` will schmidt
@ 2021-05-24 15:47   ` Michael Meissner
  2021-06-01 23:38   ` Ping #2: " Michael Meissner
  2021-06-07 20:25   ` Segher Boessenkool
  3 siblings, 0 replies; 18+ messages in thread
From: Michael Meissner @ 2021-05-24 15:47 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt

Ping patch:

| Subject: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
| Message-ID: <20210518202606.GA14382@ibm-toto.the-meissners.org>

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Ping [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.
  2021-05-18 20:28 ` [PATCH 2/2] Add IEEE 128-bit fp conditional move " Michael Meissner
  2021-05-20 19:27   ` will schmidt
@ 2021-05-24 15:49   ` Michael Meissner
  2021-06-01 23:39   ` Ping #2: " Michael Meissner
  2021-06-07 22:31   ` Segher Boessenkool
  3 siblings, 0 replies; 18+ messages in thread
From: Michael Meissner @ 2021-05-24 15:49 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt

Ping patch.

| Subject: [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.
| Message-ID: <20210518202827.GB14382@ibm-toto.the-meissners.org>

Note this patch needs the following patch before it can be applied.

| Subject: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
| Message-ID: <20210518202606.GA14382@ibm-toto.the-meissners.org>

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Ping #2: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-05-18 20:26 ` [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC Michael Meissner
  2021-05-20 19:25   ` will schmidt
  2021-05-24 15:47   ` Ping " Michael Meissner
@ 2021-06-01 23:38   ` Michael Meissner
  2021-06-07 20:25   ` Segher Boessenkool
  3 siblings, 0 replies; 18+ messages in thread
From: Michael Meissner @ 2021-06-01 23:38 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt

Ping again.

Original patch (Add IEEE 128-bit min/max support on PowerPC):

| Date: Tue, 18 May 2021 16:26:06 -0400
| Subject: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
| Message-ID: <20210518202606.GA14382@ibm-toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570675.html

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Ping #2: [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.
  2021-05-18 20:28 ` [PATCH 2/2] Add IEEE 128-bit fp conditional move " Michael Meissner
  2021-05-20 19:27   ` will schmidt
  2021-05-24 15:49   ` Ping " Michael Meissner
@ 2021-06-01 23:39   ` Michael Meissner
  2021-06-07 22:31   ` Segher Boessenkool
  3 siblings, 0 replies; 18+ messages in thread
From: Michael Meissner @ 2021-06-01 23:39 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt

Ping patch again.

Original patch (Add IEEE 128-bit fp conditional move on PowerPC):

| Date: Tue, 18 May 2021 16:28:27 -0400
| Subject: [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.
| Message-ID: <20210518202827.GB14382@ibm-toto.the-meissners.org>

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-05-21  1:38     ` Michael Meissner
@ 2021-06-07 20:08       ` Segher Boessenkool
  0 siblings, 0 replies; 18+ messages in thread
From: Segher Boessenkool @ 2021-06-07 20:08 UTC (permalink / raw)
  To: Michael Meissner, will schmidt, gcc-patches, David Edelsohn,
	Bill Schmidt, Peter Bergner

Hi!

On Thu, May 20, 2021 at 09:38:49PM -0400, Michael Meissner wrote:
> Basically for code generation tests, I see the following cases:
> 
> 1) Test code targetting precisley power8 (or power9, power10), etc.  Hopefully
> these are rare.

-mdejagnu-cpu= works perfectly for this.  You may need a *_ok or a *_hw
as well (and/or other selectors).

> 2) Test code targetting at least power8.  But as these tests show, that a lot
> of the code won't generate the appropriate instructions on power10.  This is
> what we have now.  It relies on undocumented switches like -mpower9-vector to
> add the necessary support.

You should simply not run this test on too new systems.  You can use
dg-skip-if or similar.

> 3) Test code targetting at least power8 but go to power9 at the maximum.

But why?  We will keep testing all interesting CPU / OS combos as long
as they are interesting.


Segher

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-05-18 20:26 ` [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC Michael Meissner
                     ` (2 preceding siblings ...)
  2021-06-01 23:38   ` Ping #2: " Michael Meissner
@ 2021-06-07 20:25   ` Segher Boessenkool
  2021-06-08 23:52     ` Michael Meissner
  3 siblings, 1 reply; 18+ messages in thread
From: Segher Boessenkool @ 2021-06-07 20:25 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner, Will Schmidt

On Tue, May 18, 2021 at 04:26:06PM -0400, Michael Meissner wrote:
> This patch adds the support for the IEEE 128-bit floating point C minimum and
> maximum instructions.

> gcc/
> 2021-05-18  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> 	3.1   IEEE   128-bit   floating  point   xsmaxcqp   and   xsmincqp
> 	instructions.

3.1 fits on the previous line (it is better to not split numbers to a
new line).  What is up with the weird multiple spaces?  We don't align
the right border in changelogs :-)

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> @@ -0,0 +1,15 @@
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */

Is this needed?  And, why is ppc_float128_hw needed?  That combination
does not seem to make sense.

> --- a/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax.c
> @@ -3,6 +3,13 @@
>  /* { dg-require-effective-target float128 } */
>  /* { dg-options "-mpower9-vector -O2 -ffast-math" } */
>  
> +/* If the compiler was configured to automatically generate power10 support with
> +   --with-cpu=power10, turn it off.  Otherwise, it will generate XXMAXCQP and
> +   XXMINCQP instructions.  */
> +#ifdef _ARCH_PWR10
> +#pragma GCC target ("cpu=power9")
> +#endif

Yeah, don't.  Add a dg-skip-if if that is what you want.  That
-mpower9-vector shouldn't be there either.


Segher

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.
  2021-05-20 19:27   ` will schmidt
  2021-05-21  1:45     ` Michael Meissner
@ 2021-06-07 21:29     ` Segher Boessenkool
  1 sibling, 0 replies; 18+ messages in thread
From: Segher Boessenkool @ 2021-06-07 21:29 UTC (permalink / raw)
  To: will schmidt
  Cc: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner

On Thu, May 20, 2021 at 02:27:06PM -0500, will schmidt wrote:
> On Tue, 2021-05-18 at 16:28 -0400, Michael Meissner wrote:
> > +  if (compare_mode == result_mode
> > +      || (compare_mode == SFmode && result_mode == DFmode)
> > +      || (compare_mode == DFmode && result_mode == SFmode))
> > +    ;
> > +  else
> > +    return false;
> 
> Interesting if/else block.  May want to reverse the logic. I defer if
> this way is notably simpler than inverting it.

This is not simpler, no.  You want to do something that just returns
*first*, and then not have an "else".  *That* is simpler.

And just write !(...) around the condition, don't try to manually invert
it please.  You want both correct code and readable code, not neither of
these, they are not extremes you need to balance, each helps the other!


Segher

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.
  2021-05-18 20:28 ` [PATCH 2/2] Add IEEE 128-bit fp conditional move " Michael Meissner
                     ` (2 preceding siblings ...)
  2021-06-01 23:39   ` Ping #2: " Michael Meissner
@ 2021-06-07 22:31   ` Segher Boessenkool
  2021-06-09  0:05     ` Michael Meissner
  3 siblings, 1 reply; 18+ messages in thread
From: Segher Boessenkool @ 2021-06-07 22:31 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner, Will Schmidt

On Tue, May 18, 2021 at 04:28:27PM -0400, Michael Meissner wrote:
> In this patch, I simplified things compared to previous patches.  Instead of
> allowing any four of the modes to be used for the conditional move comparison
> and the move itself could use different modes, I restricted the conditional
> move to just the same mode.  I.e. you can do:
> 
>         _Float128 a, b, c, d, e, r;
> 
>         r = (a == b) ? c : d;
> 
> But you can't do:
> 
>         _Float128 c, d, r;
>         double a, b;
> 
>         r = (a == b) ? c : d;
> 
> or:
> 
>         _Float128 a, b;
>         double c, d, r;
> 
>         r = (a == b) ? c : d;
> 
> This eliminates a lot of the complexity of the code, because you don't have to
> worry about the sizes being different, and the IEEE 128-bit types being
> restricted to Altivec registers, while the SF/DF modes can use any VSX
> register.

You do not have to worry about that anyway.  You can just reuse the
existing rs6000_maybe_emit_fp_cmove.  Or why not?  The IF_THEN_ELSE we
generate there should work fine?

> +(define_expand "mov<mode>cc"
> +   [(set (match_operand:IEEE128 0 "gpc_reg_operand")
> +	 (if_then_else:IEEE128 (match_operand 1 "comparison_operator")
> +			       (match_operand:IEEE128 2 "gpc_reg_operand")
> +			       (match_operand:IEEE128 3 "gpc_reg_operand")))]
> +  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> +{
> +  if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
> +    DONE;
> +  else
> +    FAIL;
> +})

Why is this a special pattern anyway?  Why can you not do
  d = cond ? x : y;
with cond any comparison, not even including any floating point
possibly?


Segher

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC.
  2021-06-07 20:25   ` Segher Boessenkool
@ 2021-06-08 23:52     ` Michael Meissner
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Meissner @ 2021-06-08 23:52 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner, Will Schmidt

On Mon, Jun 07, 2021 at 03:25:06PM -0500, Segher Boessenkool wrote:
> On Tue, May 18, 2021 at 04:26:06PM -0400, Michael Meissner wrote:
> > This patch adds the support for the IEEE 128-bit floating point C minimum and
> > maximum instructions.
> 
> > gcc/
> > 2021-05-18  Michael Meissner  <meissner@linux.ibm.com>
> > 
> > 	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> > 	3.1   IEEE   128-bit   floating  point   xsmaxcqp   and   xsmincqp
> > 	instructions.
> 
> 3.1 fits on the previous line (it is better to not split numbers to a
> new line).  What is up with the weird multiple spaces?  We don't align
> the right border in changelogs :-)
> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> > @@ -0,0 +1,15 @@
> > +/* { dg-require-effective-target ppc_float128_hw } */
> > +/* { dg-require-effective-target power10_ok } */
> 
> Is this needed?  And, why is ppc_float128_hw needed?  That combination
> does not seem to make sense.

Basically it is there to make sure that we are actually generating IEEE 128-bit
instructions.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.
  2021-06-07 22:31   ` Segher Boessenkool
@ 2021-06-09  0:05     ` Michael Meissner
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Meissner @ 2021-06-09  0:05 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner, Will Schmidt

On Mon, Jun 07, 2021 at 05:31:50PM -0500, Segher Boessenkool wrote:
> On Tue, May 18, 2021 at 04:28:27PM -0400, Michael Meissner wrote:
> > In this patch, I simplified things compared to previous patches.  Instead of
> > allowing any four of the modes to be used for the conditional move comparison
> > and the move itself could use different modes, I restricted the conditional
> > move to just the same mode.  I.e. you can do:
> > 
> >         _Float128 a, b, c, d, e, r;
> > 
> >         r = (a == b) ? c : d;
> > 
> > But you can't do:
> > 
> >         _Float128 c, d, r;
> >         double a, b;
> > 
> >         r = (a == b) ? c : d;
> > 
> > or:
> > 
> >         _Float128 a, b;
> >         double c, d, r;
> > 
> >         r = (a == b) ? c : d;
> > 
> > This eliminates a lot of the complexity of the code, because you don't have to
> > worry about the sizes being different, and the IEEE 128-bit types being
> > restricted to Altivec registers, while the SF/DF modes can use any VSX
> > register.
> 
> You do not have to worry about that anyway.  You can just reuse the
> existing rs6000_maybe_emit_fp_cmove.  Or why not?  The IF_THEN_ELSE we
> generate there should work fine?
> 
> > +(define_expand "mov<mode>cc"
> > +   [(set (match_operand:IEEE128 0 "gpc_reg_operand")
> > +	 (if_then_else:IEEE128 (match_operand 1 "comparison_operator")
> > +			       (match_operand:IEEE128 2 "gpc_reg_operand")
> > +			       (match_operand:IEEE128 3 "gpc_reg_operand")))]
> > +  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> > +{
> > +  if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
> > +    DONE;
> > +  else
> > +    FAIL;
> > +})
> 
> Why is this a special pattern anyway?  Why can you not do
>   d = cond ? x : y;
> with cond any comparison, not even including any floating point
> possibly?

Well in theory you can certainly do this, we just need to add the necessary
code to implement it.  It quickly becomes an exponential cascading pattern,
where you have one set of modes for the comparison and a different set of modes
for the movement.

I've certainly seen instances where the code has an integer comparison and then
a FP move.  We can do this via a SETBC type instruction, direct move, and then
XXSEL.  But that is beyond the scope of this patch.

If you remember, the original form of this patch allowed the comparison to be
SF, DF, KF, and possibly TF, along with the move.  It becomes complicated when
you have to consider that SF/DF comparisons only fill the upper 64 bits of the
vector register with the comparison, and the IEEE 128-bit types need to be in
Altivec registers.

So I scaled back the patch to just allow 128-bit conditional move.  I left in
the existing 64/34-bit mixture because there was at least one benchmark it was
used in the past.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.
  2021-04-15 15:57 [PATCH 0/2] Add IEEE 128-bit min/max/conditional move Michael Meissner
@ 2021-04-15 16:05 ` Michael Meissner
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Meissner @ 2021-04-15 16:05 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt

[PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC.

This patch adds the support for power10 IEEE 128-bit floating point conditional
move and for automatically generating min/max.

In this patch, I simplified things.  Instead of allowing any four of the modes
to be used for the conditional move comparison and the move itself could use
different modes, I restricted the conditional move to just the same mode.
I.e. you can do:

	_Float128 a, b, c, d, e, r;

	r = (a == b) ? c : d;

But you can't do:

	_Float128 c, d, r;
	double a, b;

	r = (a == b) ? c : d;

or:

	_Float128 a, b;
	double c, d, r;

	r = (a == b) ? c : d;

This eliminates a lot of the complexity of the code, because you don't have to
worry about the sizes being different, and the IEEE 128-bit types being
restricted to Altivec registers, while the SF/DF modes can use any VSX
register.

I did not modify the existing support that allowed conditional moves where
SFmode operands are compared and DFmode operands are moved (and vice versa).

This simplification also eliminates having to insert a XXPERMDI instruction if
you are comparing 64-bit values and doing a conditional move with 128-bit
values.

I modified the test cases that I added to reflect this change.  I have also
fixed the test for not equal to use '!=' instead of '=='.

I have built bootstrap compilers on both a little endian power9 Linux system
and a big endian power8 Linux system.  There were no regressions in either
build in the test suites.  Can I check these changes into the trunk for gcc 11?

gcc/
2021-04-14 Michael Meissner  <meissner@linux.ibm.com>

        * config/rs6000/rs6000.c (have_compare_and_set_mask): Add IEEE
        128-bit floating point types.
	* config/rs6000/rs6000.md (mov<mode>cc, IEEE128 iterator): New insn.
	(mov<mode>cc_p10, IEEE128 iterator): New insn.
	(mov<mode>cc_invert_p10, IEEE128 iterator): New insn.
	(fpmask<mode>, IEEE128 iterator): New insn.
	(xxsel<mode>, IEEE128 iterator): New insn.

gcc/testsuite/
2021-04-14  Michael Meissner  <meissner@linux.ibm.com>

        * gcc.target/powerpc/float128-cmove.c: New test.
        * gcc.target/powerpc/float128-minmax-3.c: New test.
---
 gcc/config/rs6000/rs6000.c                    |   8 +-
 gcc/config/rs6000/rs6000.md                   | 106 ++++++++++++++++++
 .../gcc.target/powerpc/float128-cmove.c       |  58 ++++++++++
 .../gcc.target/powerpc/float128-minmax-3.c    |  15 +++
 4 files changed, 185 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-cmove.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 8d00f99e9fd..f979d7320ad 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15706,8 +15706,8 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   return 1;
 }
 
-/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum or
-   minimum with "C" semantics.
+/* Possibly emit the xsmaxc{dp,qp} and xsminc{dp,qp} instructions to emit a
+   maximum or minimum with "C" semantics.
 
    Unless you use -ffast-math, you can't use these instructions to replace
    conditions that implicitly reverse the condition because the comparison
@@ -15843,6 +15843,10 @@ have_compare_and_set_mask (machine_mode mode)
     case E_DFmode:
       return TARGET_P9_MINMAX;
 
+    case E_KFmode:
+    case E_TFmode:
+      return TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode);
+
     default:
       break;
     }
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 17b2fdc1cdd..c74fc8ce6fc 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5429,6 +5429,112 @@ (define_insn "*xxsel<mode>"
   "xxsel %x0,%x4,%x3,%x1"
   [(set_attr "type" "vecmove")])
 
+;; Support for ISA 3.1 IEEE 128-bit conditional move.  The mode used in the
+;; comparison must be the same as used in the conditional move.
+(define_expand "mov<mode>cc"
+   [(set (match_operand:IEEE128 0 "gpc_reg_operand")
+	 (if_then_else:IEEE128 (match_operand 1 "comparison_operator")
+			       (match_operand:IEEE128 2 "gpc_reg_operand")
+			       (match_operand:IEEE128 3 "gpc_reg_operand")))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+{
+  if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
+    DONE;
+  else
+    FAIL;
+})
+
+(define_insn_and_split "*mov<mode>cc_p10"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=&v,v")
+	(if_then_else:IEEE128
+	 (match_operator:CCFP 1 "fpmask_comparison_operator"
+		[(match_operand:IEEE128 2 "altivec_register_operand" "v,v")
+		 (match_operand:IEEE128 3 "altivec_register_operand" "v,v")])
+	 (match_operand:IEEE128 4 "altivec_register_operand" "v,v")
+	 (match_operand:IEEE128 5 "altivec_register_operand" "v,v")))
+   (clobber (match_scratch:V2DI 6 "=0,&v"))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "#"
+  "&& 1"
+  [(set (match_dup 6)
+	(if_then_else:V2DI (match_dup 1)
+			   (match_dup 7)
+			   (match_dup 8)))
+   (set (match_dup 0)
+	(if_then_else:IEEE128 (ne (match_dup 6)
+				  (match_dup 8))
+			      (match_dup 4)
+			      (match_dup 5)))]
+{
+  if (GET_CODE (operands[6]) == SCRATCH)
+    operands[6] = gen_reg_rtx (V2DImode);
+
+  operands[7] = CONSTM1_RTX (V2DImode);
+  operands[8] = CONST0_RTX (V2DImode);
+}
+ [(set_attr "length" "8")
+  (set_attr "type" "vecperm")])
+
+;; Handle inverting the fpmask comparisons.
+(define_insn_and_split "*mov<mode>cc_invert_p10"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=&v,v")
+	(if_then_else:IEEE128
+	 (match_operator:CCFP 1 "invert_fpmask_comparison_operator"
+		[(match_operand:IEEE128 2 "altivec_register_operand" "v,v")
+		 (match_operand:IEEE128 3 "altivec_register_operand" "v,v")])
+	 (match_operand:IEEE128 4 "altivec_register_operand" "v,v")
+	 (match_operand:IEEE128 5 "altivec_register_operand" "v,v")))
+   (clobber (match_scratch:V2DI 6 "=0,&v"))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "#"
+  "&& 1"
+  [(set (match_dup 6)
+	(if_then_else:V2DI (match_dup 9)
+			   (match_dup 7)
+			   (match_dup 8)))
+   (set (match_dup 0)
+	(if_then_else:IEEE128 (ne (match_dup 6)
+				  (match_dup 8))
+			      (match_dup 5)
+			      (match_dup 4)))]
+{
+  rtx op1 = operands[1];
+  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (op1));
+
+  if (GET_CODE (operands[6]) == SCRATCH)
+    operands[6] = gen_reg_rtx (V2DImode);
+
+  operands[7] = CONSTM1_RTX (V2DImode);
+  operands[8] = CONST0_RTX (V2DImode);
+
+  operands[9] = gen_rtx_fmt_ee (cond, CCFPmode, operands[2], operands[3]);
+}
+ [(set_attr "length" "8")
+  (set_attr "type" "vecperm")])
+
+(define_insn "*fpmask<mode>"
+  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
+	(if_then_else:V2DI
+	 (match_operator:CCFP 1 "fpmask_comparison_operator"
+		[(match_operand:IEEE128 2 "altivec_register_operand" "v")
+		 (match_operand:IEEE128 3 "altivec_register_operand" "v")])
+	 (match_operand:V2DI 4 "all_ones_constant" "")
+	 (match_operand:V2DI 5 "zero_constant" "")))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xscmp%V1qp %0,%2,%3"
+  [(set_attr "type" "fpcompare")])
+
+(define_insn "*xxsel<mode>"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(if_then_else:IEEE128
+	 (ne (match_operand:V2DI 1 "altivec_register_operand" "v")
+	     (match_operand:V2DI 2 "zero_constant" ""))
+	 (match_operand:IEEE128 3 "altivec_register_operand" "v")
+	 (match_operand:IEEE128 4 "altivec_register_operand" "v")))]
+  "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xxsel %x0,%x4,%x3,%x1"
+  [(set_attr "type" "vecmove")])
+
 \f
 ;; Conversions to and from floating-point.
 
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmove.c b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
new file mode 100644
index 00000000000..2fae8dc23bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#ifndef TYPE
+#ifdef __LONG_DOUBLE_IEEE128__
+#define TYPE long double
+
+#else
+#define TYPE _Float128
+#endif
+#endif
+
+/* Verify that the ISA 3.1 (power10) IEEE 128-bit conditional move instructions
+   are generated.  */
+
+TYPE
+eq (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a == b) ? c : d;
+}
+
+TYPE
+ne (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a != b) ? c : d;
+}
+
+TYPE
+lt (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a < b) ? c : d;
+}
+
+TYPE
+le (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a <= b) ? c : d;
+}
+
+TYPE
+gt (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a > b) ? c : d;
+}
+
+TYPE
+ge (TYPE a, TYPE b, TYPE c, TYPE d)
+{
+  return (a >= b) ? c : d;
+}
+
+/* { dg-final { scan-assembler-times {\mxscmpeqqp\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxscmpgeqp\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxscmpgtqp\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxxsel\M}     6 } } */
+/* { dg-final { scan-assembler-not   {\mxscmpuqp\M}    } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
new file mode 100644
index 00000000000..6f7627c0f2a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
@@ -0,0 +1,15 @@
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return (a < b) ? a : b; }
+TYPE f128_max (TYPE a, TYPE b) { return (b > a) ? b : a; }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
-- 
2.22.0


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2021-06-09  0:05 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-18 20:22 [PATCH 0/2] Add power10 IEEE 128-bit min/max/conditional move support Michael Meissner
2021-05-18 20:26 ` [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC Michael Meissner
2021-05-20 19:25   ` will schmidt
2021-05-21  1:38     ` Michael Meissner
2021-06-07 20:08       ` Segher Boessenkool
2021-05-24 15:47   ` Ping " Michael Meissner
2021-06-01 23:38   ` Ping #2: " Michael Meissner
2021-06-07 20:25   ` Segher Boessenkool
2021-06-08 23:52     ` Michael Meissner
2021-05-18 20:28 ` [PATCH 2/2] Add IEEE 128-bit fp conditional move " Michael Meissner
2021-05-20 19:27   ` will schmidt
2021-05-21  1:45     ` Michael Meissner
2021-06-07 21:29     ` Segher Boessenkool
2021-05-24 15:49   ` Ping " Michael Meissner
2021-06-01 23:39   ` Ping #2: " Michael Meissner
2021-06-07 22:31   ` Segher Boessenkool
2021-06-09  0:05     ` Michael Meissner
  -- strict thread matches above, loose matches on Subject: below --
2021-04-15 15:57 [PATCH 0/2] Add IEEE 128-bit min/max/conditional move Michael Meissner
2021-04-15 16:05 ` [PATCH 2/2] Add IEEE 128-bit fp conditional move on PowerPC Michael Meissner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).