[PATCH] Add power10 IEEE 128-bit minimum, maximum, and compare with mask instructions

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH] Add power10 IEEE 128-bit minimum, maximum, and compare with mask instructions
@ 2020-08-27  2:41 Michael Meissner
  2020-08-27  2:43 ` [PATCH 1/4] PowerPC: Change cmove function return to bool Michael Meissner
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Michael Meissner @ 2020-08-27  2:41 UTC (permalink / raw)
  To: gcc-patches, Segher Boessenkool, David Edelsohn,
	Michael Meissner, Bill Schmidt, Peter Bergner

The following patches are a rewrite of the previous set of patches to add
support for the power10 IEEE 128-bit C minimum, C maximum, and compare/set mask
instructions that are similar to the instructions added in power9.

There are 4 patches in this series.

The first patch is a cosmetic patch.  In the previous patches, Segher said the
new functions should return bool instead of int.  In adding the support, I
noticed the two existing functions processing conditional moves
(rs6000_emit_cmove and rs6000_emit_int_cmove) returned int rather than bool.
The first patch just changes the return type and return statements to now
return bool.

The second patch renames the functions that generate the ISA 3.0 C minimum, C
maximum, and conditional move instructions to use a better name than just using
a _p9 suffix.  As Segher suggested, the names should be of the form
"maybe_emit" instead of "generate_", since both functions can fail.

The third patch adds the minimum and maximum support without adding the
conditional move support (the 4th patch will add the conditional move support).
Because of the NaN differences, the built-in functions will only generate these
instructions if -ffast-math is used.

The fourth patch adds the conditional move support.  In adding the conditional
move support, the optimizers will be able to convert things like:

	a = (b > c) ? b : c;

into the instructions.  Unlike the previous set of patches, this patch merges
together the scalar SF/DF conditional move with the scalar KF/TF conditional
move.  It extends the optimization that was previously used for SFmode and
DFmode to allow the comparison to be a different scalar floating point mode
than the move.  I.e.

	__float128 a, b, c;
	float x, y;

	/* ... */

	a = (x == y) ? b : c;

I did have to add an XXPERMDI if the comparison mode was SFmode or DFmode, and
the move mode is KFmode or TFmode (the XSCMP{EQ,GT,GE}DP instructions
explicitly set the bottom 64 bits of the vector register to 0).

I have built compilers on a little endian power9 Linux system with all 4
patches applied.  I did bootstrap builds and ran the testsuite, with no
regressions.  Previous versions of the patch was also tested on a little endian
power8 Linux system.  I would like to check these patches into the master
branch for GCC 11.  At this time, I do not anticipate needing to backport these
changes to GCC 10.3.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/4] PowerPC: Change cmove function return to bool
  2020-08-27  2:41 [PATCH] Add power10 IEEE 128-bit minimum, maximum, and compare with mask instructions Michael Meissner
@ 2020-08-27  2:43 ` Michael Meissner
  2020-08-27 20:47   ` will schmidt
  2020-09-10 19:15   ` Segher Boessenkool
  2020-08-27  2:44 ` [PATCH 2/4] PowerPC: Rename functions for min, max, cmove Michael Meissner
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 20+ messages in thread
From: Michael Meissner @ 2020-08-27  2:43 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

PowerPC: Change cmove function return to bool.

In doing the other work for adding ISA 3.1 128-bit minimum, maximum, and
conditional move support, I noticed the two functions that process conditional
moves return 'int' instead of 'bool'.  This patch changes these functions to
return 'bool'.

I have built compilers on a little endian power9 Linux system with all 4
patches applied.  I did bootstrap builds and ran the testsuite, with no
regressions.  Previous versions of the patch was also tested on a little endian
power8 Linux system.  I would like to check this patch into the master
branch for GCC 11.  At this time, I do not anticipate needing to backport these
changes to GCC 10.3.

gcc/
2020-08-26  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000-protos.h (rs6000_emit_cmove): Change return
	type to bool.
	(rs6000_emit_int_cmove): Change return type to bool.
	* config/rs6000/rs6000.c (rs6000_emit_cmove): Change return type
	to bool.
	(rs6000_emit_int_cmove): Change return type to bool.

---
 gcc/config/rs6000/rs6000-protos.h |  4 ++--
 gcc/config/rs6000/rs6000.c        | 32 +++++++++++++++----------------
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 28e859f4381..02e4d71922f 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -119,8 +119,8 @@ extern char * output_cbranch (rtx, const char *, int, rtx_insn *);
 extern const char * output_probe_stack_range (rtx, rtx, rtx);
 extern void rs6000_emit_dot_insn (rtx dst, rtx src, int dot, rtx ccreg);
 extern bool rs6000_emit_set_const (rtx, rtx);
-extern int rs6000_emit_cmove (rtx, rtx, rtx, rtx);
-extern int rs6000_emit_int_cmove (rtx, rtx, rtx, rtx);
+extern bool rs6000_emit_cmove (rtx, rtx, rtx, rtx);
+extern bool rs6000_emit_int_cmove (rtx, rtx, rtx, rtx);
 extern int rs6000_emit_vector_cond_expr (rtx, rtx, rtx, rtx, rtx, rtx);
 extern void rs6000_emit_minmax (rtx, enum rtx_code, rtx, rtx);
 extern void rs6000_expand_atomic_compare_and_swap (rtx op[]);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 1c1caa90ede..bac50c2bcf6 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15159,7 +15159,7 @@ rs6000_emit_p9_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
    operands of the last comparison is nonzero/true, FALSE_COND if it
    is zero/false.  Return 0 if the hardware has no such operation.  */
 
-int
+bool
 rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 {
   enum rtx_code code = GET_CODE (op);
@@ -15175,11 +15175,11 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
       /* In the isel case however, we can use a compare immediate, so
 	 op1 may be a small constant.  */
       && (!TARGET_ISEL || !short_cint_operand (op1, VOIDmode)))
-    return 0;
+    return false;
   if (GET_MODE (true_cond) != result_mode)
-    return 0;
+    return false;
   if (GET_MODE (false_cond) != result_mode)
-    return 0;
+    return false;
 
   /* See if we can use the ISA 3.0 (power9) min/max/compare functions.  */
   if (TARGET_P9_MINMAX
@@ -15187,16 +15187,16 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
       && (result_mode == SFmode || result_mode == DFmode))
     {
       if (rs6000_emit_p9_fp_minmax (dest, op, true_cond, false_cond))
-	return 1;
+	return true;
 
       if (rs6000_emit_p9_fp_cmove (dest, op, true_cond, false_cond))
-	return 1;
+	return true;
     }
 
   /* Don't allow using floating point comparisons for integer results for
      now.  */
   if (FLOAT_MODE_P (compare_mode) && !FLOAT_MODE_P (result_mode))
-    return 0;
+    return false;
 
   /* First, work out if the hardware can do this at all, or
      if it's too slow....  */
@@ -15204,7 +15204,7 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
     {
       if (TARGET_ISEL)
 	return rs6000_emit_int_cmove (dest, op, true_cond, false_cond);
-      return 0;
+      return false;
     }
 
   is_against_zero = op1 == CONST0_RTX (compare_mode);
@@ -15216,7 +15216,7 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
      generated.  */
   if (SCALAR_FLOAT_MODE_P (compare_mode)
       && flag_trapping_math && ! is_against_zero)
-    return 0;
+    return false;
 
   /* Eliminate half of the comparisons by switching operands, this
      makes the remaining code simpler.  */
@@ -15232,7 +15232,7 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   /* UNEQ and LTGT take four instructions for a comparison with zero,
      it'll probably be faster to use a branch here too.  */
   if (code == UNEQ && HONOR_NANS (compare_mode))
-    return 0;
+    return false;
 
   /* We're going to try to implement comparisons by performing
      a subtract, then comparing against zero.  Unfortunately,
@@ -15247,14 +15247,14 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
       && ((! rtx_equal_p (op0, false_cond) && ! rtx_equal_p (op1, false_cond))
 	  || (! rtx_equal_p (op0, true_cond)
 	      && ! rtx_equal_p (op1, true_cond))))
-    return 0;
+    return false;
 
   /* At this point we know we can use fsel.  */
 
   /* Don't allow compare_mode other than SFmode or DFmode, for others there
      is no fsel instruction.  */
   if (compare_mode != SFmode && compare_mode != DFmode)
-    return 0;
+    return false;
 
   /* Reduce the comparison to a comparison against zero.  */
   if (! is_against_zero)
@@ -15353,12 +15353,12 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 						gen_rtx_GE (VOIDmode,
 							    op0, op1),
 						true_cond, false_cond)));
-  return 1;
+  return true;
 }
 
 /* Same as above, but for ints (isel).  */
 
-int
+bool
 rs6000_emit_int_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 {
   rtx condition_rtx, cr;
@@ -15368,7 +15368,7 @@ rs6000_emit_int_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   bool signedp;
 
   if (mode != SImode && (!TARGET_POWERPC64 || mode != DImode))
-    return 0;
+    return false;
 
   /* We still have to do the compare, because isel doesn't do a
      compare, it just looks at the CRx bits set by a previous compare
@@ -15403,7 +15403,7 @@ rs6000_emit_int_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 
   emit_insn (isel_func (dest, condition_rtx, true_cond, false_cond, cr));
 
-  return 1;
+  return true;
 }
 
 void
-- 
2.22.0


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/4] PowerPC: Rename functions for min, max, cmove
  2020-08-27  2:41 [PATCH] Add power10 IEEE 128-bit minimum, maximum, and compare with mask instructions Michael Meissner
  2020-08-27  2:43 ` [PATCH 1/4] PowerPC: Change cmove function return to bool Michael Meissner
@ 2020-08-27  2:44 ` Michael Meissner
  2020-08-27 20:47   ` will schmidt
  2020-09-10 19:29   ` Segher Boessenkool
  2020-08-27  2:45 ` [PATCH 3/4] PowerPC: Add power10 xsmaxcqp/xsmincqp support Michael Meissner
  2020-08-27  2:46 ` [PATCH 4/4] PowerPC: Add power10 xscmp{eq,gt,ge}qp support Michael Meissner
  3 siblings, 2 replies; 20+ messages in thread
From: Michael Meissner @ 2020-08-27  2:44 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

PowerPC: Rename functions for min, max, cmove.

This patch renames the functions that generate the ISA 3.0 C minimum, C
maximum, and conditional move instructions to use a better name than just using
a _p9 suffix.  Because the functions can fail, the names use "maybe_emit"
instead of "generate_" in the name.

I have built compilers on a little endian power9 Linux system with all 4
patches applied.  I did bootstrap builds and ran the testsuite, with no
regressions.  Previous versions of the patch was also tested on a little endian
power8 Linux system.  I would like to check this patch into the master
branch for GCC 11.  At this time, I do not anticipate needing to backport these
changes to GCC 10.3.

gcc/
2020-08-26  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_emit_p9_fp_minmax): Rename to
	maybe_emit_fp_c_minmax.
	(maybe_emit_fp_c_minmax): Rename rs6000_emit_p9_fp_minmax.  Return
	bool instead of int.
	(rs6000_emit_p9_fp_cmove): Rename to maybe_emit_fp_cmove.
	(maybe_emit_fp_cmove): Rename rs6000_emit_p9_fp_cmove.  Return
	bool instead of int.
	(have_compare_and_set_mask): New helper function.
	(rs6000_emit_cmove): Rework support of ISA 3.0 functions to
	generate "C" minimum, "C" maximum, and conditional move
	instructions for scalar floating point.

---
 gcc/config/rs6000/rs6000.c | 77 ++++++++++++++++++++++++++------------
 1 file changed, 53 insertions(+), 24 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index bac50c2bcf6..6324f930628 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15056,13 +15056,17 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   return 1;
 }
 
-/* ISA 3.0 (power9) minmax subcase to emit a XSMAXCDP or XSMINCDP instruction
-   for SF/DF scalars.  Move TRUE_COND to DEST if OP of the operands of the last
-   comparison is nonzero/true, FALSE_COND if it is zero/false.  Return 0 if the
-   hardware has no such operation.  */
+/* Possibly emit the C variant of the minimum or maximum instruction for
+   floating point scalars (xsmincdp, xsmaxcdp, etc.).
 
-static int
-rs6000_emit_p9_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
+   Move TRUE_COND to DEST if OP of the operands of the last comparison is
+   nonzero/true, FALSE_COND if it is zero/false.
+
+   Return false if we can't generate the appropriate minimum or maximum, and
+   true if we can did the minimum or maximum.  */
+
+static bool
+maybe_emit_fp_c_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 {
   enum rtx_code code = GET_CODE (op);
   rtx op0 = XEXP (op, 0);
@@ -15072,14 +15076,14 @@ rs6000_emit_p9_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   bool max_p = false;
 
   if (result_mode != compare_mode)
-    return 0;
+    return false;
 
   if (code == GE || code == GT)
     max_p = true;
   else if (code == LE || code == LT)
     max_p = false;
   else
-    return 0;
+    return false;
 
   if (rtx_equal_p (op0, true_cond) && rtx_equal_p (op1, false_cond))
     ;
@@ -15092,19 +15096,23 @@ rs6000_emit_p9_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
     max_p = !max_p;
 
   else
-    return 0;
+    return false;
 
   rs6000_emit_minmax (dest, max_p ? SMAX : SMIN, op0, op1);
-  return 1;
+  return true;
 }
 
-/* ISA 3.0 (power9) conditional move subcase to emit XSCMP{EQ,GE,GT,NE}DP and
-   XXSEL instructions for SF/DF scalars.  Move TRUE_COND to DEST if OP of the
-   operands of the last comparison is nonzero/true, FALSE_COND if it is
-   zero/false.  Return 0 if the hardware has no such operation.  */
+/* Possibly emit a floating point conditional move by generating a compare that
+   sets a mask instruction and a XXSEL select instruction.
 
-static int
-rs6000_emit_p9_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
+   Move TRUE_COND to DEST if OP of the operands of the last comparison is
+   nonzero/true, FALSE_COND if it is zero/false.
+
+   Return false if the operation cannot be generated, and true if we could
+   generate the instruction.  */
+
+static bool
+maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 {
   enum rtx_code code = GET_CODE (op);
   rtx op0 = XEXP (op, 0);
@@ -15132,7 +15140,7 @@ rs6000_emit_p9_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
       break;
 
     default:
-      return 0;
+      return false;
     }
 
   /* Generate:	[(parallel [(set (dest)
@@ -15152,7 +15160,28 @@ rs6000_emit_p9_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   emit_insn (gen_rtx_PARALLEL (VOIDmode,
 			       gen_rtvec (2, cmove_rtx, clobber_rtx)));
 
-  return 1;
+  return true;
+}
+
+/* Helper function to return true if the target has instructions to do a
+   compare and set mask instruction that can be used with XXSEL to implement a
+   conditional move.  It is also assumed that such a target also supports the
+   "C" minimum and maximum instructions. */
+
+static bool
+have_compare_and_set_mask (machine_mode mode)
+{
+  switch (mode)
+    {
+    case SFmode:
+    case DFmode:
+      return TARGET_P9_MINMAX;
+
+    default:
+      break;
+    }
+
+  return false;
 }
 
 /* Emit a conditional move: move TRUE_COND to DEST if OP of the
@@ -15181,15 +15210,15 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   if (GET_MODE (false_cond) != result_mode)
     return false;
 
-  /* See if we can use the ISA 3.0 (power9) min/max/compare functions.  */
-  if (TARGET_P9_MINMAX
-      && (compare_mode == SFmode || compare_mode == DFmode)
-      && (result_mode == SFmode || result_mode == DFmode))
+  /* See if we can use the "C" minimum, "C" maximum, and compare and set mask
+     instructions.  */
+  if (have_compare_and_set_mask (compare_mode)
+      && have_compare_and_set_mask (result_mode))
     {
-      if (rs6000_emit_p9_fp_minmax (dest, op, true_cond, false_cond))
+      if (maybe_emit_fp_c_minmax (dest, op, true_cond, false_cond))
 	return true;
 
-      if (rs6000_emit_p9_fp_cmove (dest, op, true_cond, false_cond))
+      if (maybe_emit_fp_cmove (dest, op, true_cond, false_cond))
 	return true;
     }
 
-- 
2.22.0


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 3/4] PowerPC: Add power10 xsmaxcqp/xsmincqp support
  2020-08-27  2:41 [PATCH] Add power10 IEEE 128-bit minimum, maximum, and compare with mask instructions Michael Meissner
  2020-08-27  2:43 ` [PATCH 1/4] PowerPC: Change cmove function return to bool Michael Meissner
  2020-08-27  2:44 ` [PATCH 2/4] PowerPC: Rename functions for min, max, cmove Michael Meissner
@ 2020-08-27  2:45 ` Michael Meissner
  2020-08-27 20:47   ` will schmidt
  2020-09-10 21:01   ` Segher Boessenkool
  2020-08-27  2:46 ` [PATCH 4/4] PowerPC: Add power10 xscmp{eq,gt,ge}qp support Michael Meissner
  3 siblings, 2 replies; 20+ messages in thread
From: Michael Meissner @ 2020-08-27  2:45 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

PowerPC: Add power10 xsmaxcqp/xsmincqp support.

This patch adds support for the ISA 3.1 (power10) IEEE 128-bit "C" minimum and
maximum functions.  Because of the NaN differences, the built-in functions will
only generate these instructions if -ffast-math is used until the conditional
move support is added in the next patch.

I have built compilers on a little endian power9 Linux system with all 4
patches applied.  I did bootstrap builds and ran the testsuite, with no
regressions.  Previous versions of the patch was also tested on a little endian
power8 Linux system.  I would like to check this patch into the master branch
for GCC 11.  At this time, I do not anticipate needing to backport these
changes to GCC 10.3.


gcc/
2020-08-26  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.h (FLOAT128_IEEE_MINMAX_P): New helper
	macro.
	* config/rs6000/rs6000.md (FSCALAR): New mode iterator for floating
	point scalars.
	(Fm): New mode attribute for floating point scalars.
	(s<minmax><mode>): Add support for the ISA 3.1 IEEE 128-bit
	minimum and maximum instructions.
	(s<minmax><mode>3_vsx): Add support for the ISA 3.1 IEEE 128-bit
	minimum and maximum instructions.

gcc/testsuite/
2020-08-26  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/float128-minmax-2.c: New test.

---
 gcc/config/rs6000/rs6000.c                    |  3 +-
 gcc/config/rs6000/rs6000.h                    |  4 +++
 gcc/config/rs6000/rs6000.md                   | 28 +++++++++++++++----
 .../gcc.target/powerpc/float128-minmax-2.c    | 15 ++++++++++
 4 files changed, 43 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6324f930628..05eb141a2cd 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15445,7 +15445,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
   /* VSX/altivec have direct min/max insns.  */
   if ((code == SMAX || code == SMIN)
       && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
-	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
+	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
+	  || FLOAT128_IEEE_MINMAX_P (mode)))
     {
       emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
       return;
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index bbd8060e143..b504aaa0199 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -345,6 +345,10 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
    || ((MODE) == TDmode)						\
    || (!TARGET_FLOAT128_TYPE && FLOAT128_IEEE_P (MODE)))
 
+/* Macro whether the float128 min/max instructions are enabled.  */
+#define FLOAT128_IEEE_MINMAX_P(MODE)					\
+  (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (MODE))
+
 /* Return true for floating point that does not use a vector register.  */
 #define SCALAR_FLOAT_MODE_NOT_VECTOR_P(MODE)				\
   (SCALAR_FLOAT_MODE_P (MODE) && !FLOAT128_VECTOR_P (MODE))
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 43b620ae1c0..006e60f09bc 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -789,6 +789,18 @@ (define_code_attr     minmax	[(smin "min")
 (define_code_attr     SMINMAX	[(smin "SMIN")
 				 (smax "SMAX")])
 
+;; Mode iterator for scalar binary floating point operations
+(define_mode_iterator FSCALAR [SF
+			       DF
+			       (KF "FLOAT128_IEEE_MINMAX_P (KFmode)")
+			       (TF "FLOAT128_IEEE_MINMAX_P (TFmode)")])
+
+;; Constraints to use for scalar FP operations
+(define_mode_attr Fm [(SF "wa")
+		      (DF "wa")
+		      (TF "v")
+		      (KF "v")])
+
 ;; Iterator to optimize the following cases:
 ;;	D-form load to FPR register & move to Altivec register
 ;;	Move Altivec register to FPR register and store
@@ -5142,9 +5154,9 @@ (define_insn "copysign<mode>3_fcpsgn"
 ;; to allow either DF/SF to use only traditional registers.
 
 (define_expand "s<minmax><mode>3"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand")
-	(fp_minmax:SFDF (match_operand:SFDF 1 "gpc_reg_operand")
-			(match_operand:SFDF 2 "gpc_reg_operand")))]
+  [(set (match_operand:FSCALAR 0 "gpc_reg_operand")
+	(fp_minmax:FSCALAR (match_operand:FSCALAR 1 "gpc_reg_operand")
+			   (match_operand:FSCALAR 2 "gpc_reg_operand")))]
   "TARGET_MINMAX"
 {
   rs6000_emit_minmax (operands[0], <SMINMAX>, operands[1], operands[2]);
@@ -5152,11 +5164,15 @@ (define_expand "s<minmax><mode>3"
 })
 
 (define_insn "*s<minmax><mode>3_vsx"
-  [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>")
-	(fp_minmax:SFDF (match_operand:SFDF 1 "vsx_register_operand" "<Fv>")
-			(match_operand:SFDF 2 "vsx_register_operand" "<Fv>")))]
+  [(set (match_operand:FSCALAR 0 "vsx_register_operand" "=<Fm>")
+	(fp_minmax:FSCALAR
+	 (match_operand:FSCALAR 1 "vsx_register_operand" "<Fm>")
+	 (match_operand:FSCALAR 2 "vsx_register_operand" "<Fm>")))]
   "TARGET_VSX && TARGET_HARD_FLOAT"
 {
+  if (FLOAT128_IEEE_P (<MODE>mode))
+    return "xs<minmax>cqp %0,%1,%2";
+
   return (TARGET_P9_MINMAX
 	  ? "xs<minmax>cdp %x0,%x1,%x2"
 	  : "xs<minmax>dp %x0,%x1,%x2");
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
new file mode 100644
index 00000000000..c71ba08c9f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
@@ -0,0 +1,15 @@
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
+TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
-- 
2.22.0


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 4/4] PowerPC: Add power10 xscmp{eq,gt,ge}qp support
  2020-08-27  2:41 [PATCH] Add power10 IEEE 128-bit minimum, maximum, and compare with mask instructions Michael Meissner
                   ` (2 preceding siblings ...)
  2020-08-27  2:45 ` [PATCH 3/4] PowerPC: Add power10 xsmaxcqp/xsmincqp support Michael Meissner
@ 2020-08-27  2:46 ` Michael Meissner
  2020-08-27 20:47   ` will schmidt
  2020-09-15 21:20   ` Segher Boessenkool
  3 siblings, 2 replies; 20+ messages in thread
From: Michael Meissner @ 2020-08-27  2:46 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

PowerPC: Add power10 xscmp{eq,gt,ge}qp support.

This patch adds the conditional move support.  In adding the conditional move
support, the optimizers will be able to convert things like:

	a = (b > c) ? b : c;

into the instructions.  This patch merges together the scalar SF/DF conditional
move with the scalar KF/TF conditional move.  It extends the optimization that
was previously used for SFmode and DFmode to allow the comparison to be a
different scalar floating point mode than the move.  I.e.

	__float128 a, b, c;
	float x, y;

	/* ... */

	a = (x == y) ? b : c;

I did have to add an XXPERMDI if the comparison mode was SFmode or DFmode, and
the move mode is KFmode or TFmode (the XSCMP{EQ,GT,GE}DP instructions
explicitly set the bottom 64 bits of the vector register to 0).

I have built compilers on a little endian power9 Linux system with all 4
patches applied.  I did bootstrap builds and ran the testsuite, with no
regressions.  Previous versions of the patch was also tested on a little endian
power8 Linux system.  I would like to check this patch into the master branch
for GCC 11.  At this time, I do not anticipate needing to backport these
changes to GCC 10.3.

gcc/
2020-08-26  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (fpmask_normal_or_invert_operator):
	New predicate.
	* config/rs6000/rs6000.c (have_compare_and_set_mask): Add IEEE
	128-bit floating point types.
	* config/rs6000/rs6000.md (FSCALAR2): New iterator for floating
	point conditional moves.
	(mov<SFDF:mode><SFDF2:mode>cc_p9): Replace with
	mov<FSCALAR:mode><FSCALAR2:mode>.
	(mov<SFDF:mode><SFDF2:mode>cc_invert_p9): Replace with
	mov<FSCALAR:mode><FSCALAR2:mode>.
	(mov<FSCALAR:mode><FSCALAR2:mode>): Combine both
	mov<SFDF:mode><SFDF2:mode>cc_p9 and
	mov<SFDF:mode><SFDF2:mode>cc_invert_p9 patterns.  Add ISA 3.1
	support for IEEE 128-bit conditional moves.  Always use an
	earlyclobber register for the mask.  Use XXPERMDI to extend the
	mask if we have a 64-bit comparison and 128-bit move.
	register for the mask.
	(fpmask<mode>, xxsel<mode>): Add ISA 3.1 support for IEEE 128-bit
	conditional moves.  Enable the generator functionality so
	mov<FSCALAR:mode><FSCALAR2:mode> can call it.  Update constraints
	for 128-bit operations.

gcc/testsuite/
2020-08-26  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/float128-cmove.c: New test.
	* gcc.target/powerpc/float128-minmax-3.c: New test.

---
 gcc/config/rs6000/predicates.md               |   5 +
 gcc/config/rs6000/rs6000.c                    |   4 +
 gcc/config/rs6000/rs6000.md                   | 154 ++++++++++--------
 .../gcc.target/powerpc/float128-cmove.c       |  93 +++++++++++
 .../gcc.target/powerpc/float128-minmax-3.c    |  15 ++
 5 files changed, 200 insertions(+), 71 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-cmove.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 2709e46f7e5..60b45601e9b 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1225,6 +1225,11 @@ (define_predicate "fpmask_comparison_operator"
 (define_predicate "invert_fpmask_comparison_operator"
   (match_code "ne,unlt,unle"))
 
+;; Return 1 if OP is either a fpmask_comparison_operator or
+;; invert_fpmask_comparsion_operator.
+(define_predicate "fpmask_normal_or_invert_operator"
+  (match_code "eq,gt,ge,ne,unlt,unle"))
+
 ;; Return 1 if OP is a comparison operation suitable for integer vector/scalar
 ;; comparisons that generate a -1/0 mask.
 (define_predicate "vecint_comparison_operator"
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 05eb141a2cd..403897926c5 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15177,6 +15177,10 @@ have_compare_and_set_mask (machine_mode mode)
     case DFmode:
       return TARGET_P9_MINMAX;
 
+    case KFmode:
+    case TFmode:
+      return FLOAT128_IEEE_MINMAX_P (mode);
+
     default:
       break;
     }
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 006e60f09bc..147c635994c 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -795,6 +795,15 @@ (define_mode_iterator FSCALAR [SF
 			       (KF "FLOAT128_IEEE_MINMAX_P (KFmode)")
 			       (TF "FLOAT128_IEEE_MINMAX_P (TFmode)")])
 
+;; Secondary iterator for scalar binary floating point operations.  This is
+;; used for the conditional moves when we have a compare and set mask
+;; instruction.  Using this instruction allows us to do a conditional move
+;; where the comparison type might be different from the values being moved.
+(define_mode_iterator FSCALAR2 [SF
+				DF
+				(KF "FLOAT128_IEEE_MINMAX_P (KFmode)")
+				(TF "FLOAT128_IEEE_MINMAX_P (TFmode)")])
+
 ;; Constraints to use for scalar FP operations
 (define_mode_attr Fm [(SF "wa")
 		      (DF "wa")
@@ -5290,10 +5299,10 @@ (define_insn "*setnbcr_<un>signed_<GPR:mode>"
 
 ;; Floating point conditional move
 (define_expand "mov<mode>cc"
-   [(set (match_operand:SFDF 0 "gpc_reg_operand")
-	 (if_then_else:SFDF (match_operand 1 "comparison_operator")
-			    (match_operand:SFDF 2 "gpc_reg_operand")
-			    (match_operand:SFDF 3 "gpc_reg_operand")))]
+   [(set (match_operand:FSCALAR 0 "gpc_reg_operand")
+	 (if_then_else:FSCALAR (match_operand 1 "comparison_operator")
+			       (match_operand:FSCALAR 2 "gpc_reg_operand")
+			       (match_operand:FSCALAR 3 "gpc_reg_operand")))]
   "TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT"
 {
   if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
@@ -5313,92 +5322,95 @@ (define_insn "*fsel<SFDF:mode><SFDF2:mode>4"
   "fsel %0,%1,%2,%3"
   [(set_attr "type" "fp")])
 
-(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_p9"
-  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
-	(if_then_else:SFDF
-	 (match_operator:CCFP 1 "fpmask_comparison_operator"
-		[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
-		 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
-	 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
-	 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
-   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
+;; Conditional moves using scalar floating point types if we have a compare and
+;; set mask instruction (like xscmpeqdp).
+;;
+;; If we have a 64-bit comparison and a 128-bit mode, such as:
+;;
+;;	double x, y;
+;;	__float128 a, b, c;
+;;	a = (x == y) ? b : c;
+;;
+;; We need to extend the comparison (XSCMPEQDP puts 0's in the bottom 64-bits).
+(define_insn_and_split "*mov<FSCALAR:mode><FSCALAR2:mode>cc"
+  [(set (match_operand:FSCALAR 0 "vsx_register_operand" "=<FSCALAR:Fm>")
+	(if_then_else:FSCALAR
+	 (match_operator:CCFP 1 "fpmask_normal_or_invert_operator"
+		[(match_operand:FSCALAR2 2 "vsx_register_operand" "<FSCALAR2:Fm>")
+		 (match_operand:FSCALAR2 3 "vsx_register_operand" "<FSCALAR2:Fm>")])
+	 (match_operand:FSCALAR 4 "vsx_register_operand" "<FSCALAR:Fm>")
+	 (match_operand:FSCALAR 5 "vsx_register_operand" "<FSCALAR:Fm>")))
+   (clobber (match_scratch:V2DI 6 "=<FSCALAR2:Fm>"))]
   "TARGET_P9_MINMAX"
   "#"
-  ""
-  [(set (match_dup 6)
-	(if_then_else:V2DI (match_dup 1)
-			   (match_dup 7)
-			   (match_dup 8)))
-   (set (match_dup 0)
-	(if_then_else:SFDF (ne (match_dup 6)
-			       (match_dup 8))
-			   (match_dup 4)
-			   (match_dup 5)))]
+  "&& 1"
+  [(pc)]
 {
-  if (GET_CODE (operands[6]) == SCRATCH)
-    operands[6] = gen_reg_rtx (V2DImode);
+  rtx dest = operands[0];
+  rtx cmp = operands[1];
+  rtx cmp_op1 = operands[2];
+  rtx cmp_op2 = operands[3];
+  rtx move_t = operands[4];
+  rtx move_f = operands[5];
+  rtx mask_reg = operands[6];
+  rtx mask_m1 = CONSTM1_RTX (V2DImode);
+  rtx mask_0 = CONST0_RTX (V2DImode);
 
-  operands[7] = CONSTM1_RTX (V2DImode);
-  operands[8] = CONST0_RTX (V2DImode);
-}
- [(set_attr "length" "8")
-  (set_attr "type" "vecperm")])
+  if (GET_CODE (mask_reg) == SCRATCH)
+    mask_reg = gen_reg_rtx (V2DImode);
 
-;; Handle inverting the fpmask comparisons.
-(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_invert_p9"
-  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
-	(if_then_else:SFDF
-	 (match_operator:CCFP 1 "invert_fpmask_comparison_operator"
-		[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
-		 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
-	 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
-	 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
-   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
-  "TARGET_P9_MINMAX"
-  "#"
-  "&& 1"
-  [(set (match_dup 6)
-	(if_then_else:V2DI (match_dup 9)
-			   (match_dup 7)
-			   (match_dup 8)))
-   (set (match_dup 0)
-	(if_then_else:SFDF (ne (match_dup 6)
-			       (match_dup 8))
-			   (match_dup 5)
-			   (match_dup 4)))]
-{
-  rtx op1 = operands[1];
-  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (op1));
+  /* Reverse the operands and test if the comparison operator is inverted.  */
+  if (invert_fpmask_comparison_operator (cmp, CCFPmode))
+    {
+      enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (cmp));
+      cmp = gen_rtx_fmt_ee (cond, CCFPmode, cmp_op1, cmp_op2);
+      std::swap (move_t, move_f);
+    }
 
-  if (GET_CODE (operands[6]) == SCRATCH)
-    operands[6] = gen_reg_rtx (V2DImode);
+  /* Emit the compare and set mask instruction.  */
+  emit_insn (gen_fpmask<FSCALAR2:mode> (mask_reg, cmp, cmp_op1, cmp_op2,
+				        mask_m1, mask_0));
 
-  operands[7] = CONSTM1_RTX (V2DImode);
-  operands[8] = CONST0_RTX (V2DImode);
+  /* If we have a 64-bit comparison, but an 128-bit move, we need to extend the
+     mask.  Because we are using the splat builtin to extend the V2DImode, we
+     need to use element 1 on little endian systems.  */
+  if (!FLOAT128_IEEE_P (<FSCALAR2:MODE>mode)
+      && FLOAT128_IEEE_P (<FSCALAR:MODE>mode))
+    {
+      rtx element = WORDS_BIG_ENDIAN ? const0_rtx : const1_rtx;
+      emit_insn (gen_vsx_xxspltd_v2di (mask_reg, mask_reg, element));
+    }
 
-  operands[9] = gen_rtx_fmt_ee (cond, CCFPmode, operands[2], operands[3]);
+  /* Now emit the XXSEL insn.  */
+  emit_insn (gen_xxsel<FSCALAR:mode> (dest, mask_reg, mask_0, move_t, move_f));
+  DONE;
 }
  [(set_attr "length" "8")
   (set_attr "type" "vecperm")])
 
-(define_insn "*fpmask<mode>"
-  [(set (match_operand:V2DI 0 "vsx_register_operand" "=wa")
+(define_insn "fpmask<mode>"
+  [(set (match_operand:V2DI 0 "vsx_register_operand" "=<Fm>")
 	(if_then_else:V2DI
 	 (match_operator:CCFP 1 "fpmask_comparison_operator"
-		[(match_operand:SFDF 2 "vsx_register_operand" "<Fv>")
-		 (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")])
+		[(match_operand:FSCALAR 2 "vsx_register_operand" "<Fm>")
+		 (match_operand:FSCALAR 3 "vsx_register_operand" "<Fm>")])
 	 (match_operand:V2DI 4 "all_ones_constant" "")
 	 (match_operand:V2DI 5 "zero_constant" "")))]
   "TARGET_P9_MINMAX"
-  "xscmp%V1dp %x0,%x2,%x3"
+{
+  return (FLOAT128_IEEE_P (<MODE>mode)
+	  ? "xscmp%V1qp %0,%2,%3"
+	  : "xscmp%V1dp %x0,%x2,%x3");
+}
   [(set_attr "type" "fpcompare")])
 
-(define_insn "*xxsel<mode>"
-  [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>")
-	(if_then_else:SFDF (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
-			       (match_operand:V2DI 2 "zero_constant" ""))
-			   (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")
-			   (match_operand:SFDF 4 "vsx_register_operand" "<Fv>")))]
+(define_insn "xxsel<mode>"
+  [(set (match_operand:FSCALAR 0 "vsx_register_operand" "=<Fm>")
+	(if_then_else:FSCALAR
+	 (ne (match_operand:V2DI 1 "vsx_register_operand" "<Fm>")
+	     (match_operand:V2DI 2 "zero_constant" ""))
+	 (match_operand:FSCALAR 3 "vsx_register_operand" "<Fm>")
+	 (match_operand:FSCALAR 4 "vsx_register_operand" "<Fm>")))]
   "TARGET_P9_MINMAX"
   "xxsel %x0,%x4,%x3,%x1"
   [(set_attr "type" "vecmove")])
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmove.c b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
new file mode 100644
index 00000000000..639d5a77087
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
@@ -0,0 +1,93 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+/* { dg-final { scan-assembler     {\mxscmpeq[dq]p\M} } } */
+/* { dg-final { scan-assembler     {\mxxpermdi\M}     } } */
+/* { dg-final { scan-assembler     {\mxxsel\M}        } } */
+/* { dg-final { scan-assembler-not {\mxscmpu[dq]p\M}  } } */
+/* { dg-final { scan-assembler-not {\mfcmp[uo]\M}     } } */
+/* { dg-final { scan-assembler-not {\mfsel\M}         } } */
+
+/* This series of tests tests whether you can do a conditional move where the
+   test is one floating point type, and the result is another floating point
+   type.
+
+   If the comparison type is SF/DFmode, and the move type is IEEE 128-bit
+   floating point, we have to duplicate the mask in the lower 64-bits with
+   XXPERMDI because XSCMPEQDP clears the bottom 64-bits of the mask register.
+
+   Going the other way (IEEE 128-bit comparsion, 64-bit move) is fine as the
+   mask word will be 128-bits.  */
+
+float
+eq_f_d (float a, float b, double x, double y)
+{
+  return (x == y) ? a : b;
+}
+
+double
+eq_d_f (double a, double b, float x, float y)
+{
+  return (x == y) ? a : b;
+}
+
+float
+eq_f_f128 (float a, float b, __float128 x, __float128 y)
+{
+  return (x == y) ? a : b;
+}
+
+double
+eq_d_f128 (double a, double b, __float128 x, __float128 y)
+{
+  return (x == y) ? a : b;
+}
+
+__float128
+eq_f128_f (__float128 a, __float128 b, float x, float y)
+{
+  return (x == y) ? a : b;
+}
+
+__float128
+eq_f128_d (__float128 a, __float128 b, double x, double y)
+{
+  return (x != y) ? a : b;
+}
+
+float
+ne_f_d (float a, float b, double x, double y)
+{
+  return (x != y) ? a : b;
+}
+
+double
+ne_d_f (double a, double b, float x, float y)
+{
+  return (x != y) ? a : b;
+}
+
+float
+ne_f_f128 (float a, float b, __float128 x, __float128 y)
+{
+  return (x != y) ? a : b;
+}
+
+double
+ne_d_f128 (double a, double b, __float128 x, __float128 y)
+{
+  return (x != y) ? a : b;
+}
+
+__float128
+ne_f128_f (__float128 a, __float128 b, float x, float y)
+{
+  return (x != y) ? a : b;
+}
+
+__float128
+ne_f128_d (__float128 a, __float128 b, double x, double y)
+{
+  return (x != y) ? a : b;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
new file mode 100644
index 00000000000..6f7627c0f2a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
@@ -0,0 +1,15 @@
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return (a < b) ? a : b; }
+TYPE f128_max (TYPE a, TYPE b) { return (b > a) ? b : a; }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
-- 
2.22.0


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] PowerPC: Change cmove function return to bool
  2020-08-27  2:43 ` [PATCH 1/4] PowerPC: Change cmove function return to bool Michael Meissner
@ 2020-08-27 20:47   ` will schmidt
  2020-09-10 19:15   ` Segher Boessenkool
  1 sibling, 0 replies; 20+ messages in thread
From: will schmidt @ 2020-08-27 20:47 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

On Wed, 2020-08-26 at 22:43 -0400, Michael Meissner via Gcc-patches wrote:
> PowerPC: Change cmove function return to bool.
> 
> In doing the other work for adding ISA 3.1 128-bit minimum, maximum, and
> conditional move support, I noticed the two functions that process conditional
> moves return 'int' instead of 'bool'.  This patch changes these functions to
> return 'bool'.
> 
> I have built compilers on a little endian power9 Linux system with all 4
> patches applied.  I did bootstrap builds and ran the testsuite, with no
> regressions.  Previous versions of the patch was also tested on a little endian
> power8 Linux system.  I would like to check this patch into the master
> branch for GCC 11.  At this time, I do not anticipate needing to backport these
> changes to GCC 10.3.
> 
> gcc/
> 2020-08-26  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000-protos.h (rs6000_emit_cmove): Change return
> 	type to bool.
> 	(rs6000_emit_int_cmove): Change return type to bool.
> 	* config/rs6000/rs6000.c (rs6000_emit_cmove): Change return type
> 	to bool.
> 	(rs6000_emit_int_cmove): Change return type to bool.
> 


lgtm,
thanks
-Will


> ---
>  gcc/config/rs6000/rs6000-protos.h |  4 ++--
>  gcc/config/rs6000/rs6000.c        | 32 +++++++++++++++----------------
>  2 files changed, 18 insertions(+), 18 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
> index 28e859f4381..02e4d71922f 100644
> --- a/gcc/config/rs6000/rs6000-protos.h
> +++ b/gcc/config/rs6000/rs6000-protos.h
> @@ -119,8 +119,8 @@ extern char * output_cbranch (rtx, const char *, int, rtx_insn *);
>  extern const char * output_probe_stack_range (rtx, rtx, rtx);
>  extern void rs6000_emit_dot_insn (rtx dst, rtx src, int dot, rtx ccreg);
>  extern bool rs6000_emit_set_const (rtx, rtx);
> -extern int rs6000_emit_cmove (rtx, rtx, rtx, rtx);
> -extern int rs6000_emit_int_cmove (rtx, rtx, rtx, rtx);
> +extern bool rs6000_emit_cmove (rtx, rtx, rtx, rtx);
> +extern bool rs6000_emit_int_cmove (rtx, rtx, rtx, rtx);
>  extern int rs6000_emit_vector_cond_expr (rtx, rtx, rtx, rtx, rtx, rtx);
>  extern void rs6000_emit_minmax (rtx, enum rtx_code, rtx, rtx);
>  extern void rs6000_expand_atomic_compare_and_swap (rtx op[]);
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 1c1caa90ede..bac50c2bcf6 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -15159,7 +15159,7 @@ rs6000_emit_p9_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>     operands of the last comparison is nonzero/true, FALSE_COND if it
>     is zero/false.  Return 0 if the hardware has no such operation.  */
> 
> -int
> +bool
>  rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>  {
>    enum rtx_code code = GET_CODE (op);
> @@ -15175,11 +15175,11 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>        /* In the isel case however, we can use a compare immediate, so
>  	 op1 may be a small constant.  */
>        && (!TARGET_ISEL || !short_cint_operand (op1, VOIDmode)))
> -    return 0;
> +    return false;
>    if (GET_MODE (true_cond) != result_mode)
> -    return 0;
> +    return false;
>    if (GET_MODE (false_cond) != result_mode)
> -    return 0;
> +    return false;
> 
>    /* See if we can use the ISA 3.0 (power9) min/max/compare functions.  */
>    if (TARGET_P9_MINMAX
> @@ -15187,16 +15187,16 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>        && (result_mode == SFmode || result_mode == DFmode))
>      {
>        if (rs6000_emit_p9_fp_minmax (dest, op, true_cond, false_cond))
> -	return 1;
> +	return true;
> 
>        if (rs6000_emit_p9_fp_cmove (dest, op, true_cond, false_cond))
> -	return 1;
> +	return true;
>      }
> 
>    /* Don't allow using floating point comparisons for integer results for
>       now.  */
>    if (FLOAT_MODE_P (compare_mode) && !FLOAT_MODE_P (result_mode))
> -    return 0;
> +    return false;
> 
>    /* First, work out if the hardware can do this at all, or
>       if it's too slow....  */
> @@ -15204,7 +15204,7 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>      {
>        if (TARGET_ISEL)
>  	return rs6000_emit_int_cmove (dest, op, true_cond, false_cond);
> -      return 0;
> +      return false;
>      }
> 
>    is_against_zero = op1 == CONST0_RTX (compare_mode);
> @@ -15216,7 +15216,7 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>       generated.  */
>    if (SCALAR_FLOAT_MODE_P (compare_mode)
>        && flag_trapping_math && ! is_against_zero)
> -    return 0;
> +    return false;
> 
>    /* Eliminate half of the comparisons by switching operands, this
>       makes the remaining code simpler.  */
> @@ -15232,7 +15232,7 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>    /* UNEQ and LTGT take four instructions for a comparison with zero,
>       it'll probably be faster to use a branch here too.  */
>    if (code == UNEQ && HONOR_NANS (compare_mode))
> -    return 0;
> +    return false;
> 
>    /* We're going to try to implement comparisons by performing
>       a subtract, then comparing against zero.  Unfortunately,
> @@ -15247,14 +15247,14 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>        && ((! rtx_equal_p (op0, false_cond) && ! rtx_equal_p (op1, false_cond))
>  	  || (! rtx_equal_p (op0, true_cond)
>  	      && ! rtx_equal_p (op1, true_cond))))
> -    return 0;
> +    return false;
> 
>    /* At this point we know we can use fsel.  */
> 
>    /* Don't allow compare_mode other than SFmode or DFmode, for others there
>       is no fsel instruction.  */
>    if (compare_mode != SFmode && compare_mode != DFmode)
> -    return 0;
> +    return false;
> 
>    /* Reduce the comparison to a comparison against zero.  */
>    if (! is_against_zero)
> @@ -15353,12 +15353,12 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>  						gen_rtx_GE (VOIDmode,
>  							    op0, op1),
>  						true_cond, false_cond)));
> -  return 1;
> +  return true;
>  }
> 
>  /* Same as above, but for ints (isel).  */
> 
> -int
> +bool
>  rs6000_emit_int_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>  {
>    rtx condition_rtx, cr;
> @@ -15368,7 +15368,7 @@ rs6000_emit_int_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>    bool signedp;
> 
>    if (mode != SImode && (!TARGET_POWERPC64 || mode != DImode))
> -    return 0;
> +    return false;
> 
>    /* We still have to do the compare, because isel doesn't do a
>       compare, it just looks at the CRx bits set by a previous compare
> @@ -15403,7 +15403,7 @@ rs6000_emit_int_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
> 
>    emit_insn (isel_func (dest, condition_rtx, true_cond, false_cond, cr));
> 
> -  return 1;
> +  return true;
>  }
> 
>  void
> -- 
> 2.22.0
> 
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/4] PowerPC: Rename functions for min, max, cmove
  2020-08-27  2:44 ` [PATCH 2/4] PowerPC: Rename functions for min, max, cmove Michael Meissner
@ 2020-08-27 20:47   ` will schmidt
  2020-09-10 19:18     ` Segher Boessenkool
  2020-09-10 19:29   ` Segher Boessenkool
  1 sibling, 1 reply; 20+ messages in thread
From: will schmidt @ 2020-08-27 20:47 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

On Wed, 2020-08-26 at 22:44 -0400, Michael Meissner via Gcc-patches wrote:
> PowerPC: Rename functions for min, max, cmove.
> 
> This patch renames the functions that generate the ISA 3.0 C minimum, C
> maximum, and conditional move instructions to use a better name than just using
> a _p9 suffix.  Because the functions can fail, the names use "maybe_emit"
> instead of "generate_" in the name.
> 
> I have built compilers on a little endian power9 Linux system with all 4
> patches applied.  I did bootstrap builds and ran the testsuite, with no
> regressions.  Previous versions of the patch was also tested on a little endian
> power8 Linux system.  I would like to check this patch into the master
> branch for GCC 11.  At this time, I do not anticipate needing to backport these
> changes to GCC 10.3.
> 
> gcc/
> 2020-08-26  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/rs6000.c (rs6000_emit_p9_fp_minmax): Rename to
> 	maybe_emit_fp_c_minmax.
ok

> 	(maybe_emit_fp_c_minmax): Rename rs6000_emit_p9_fp_minmax.  Return
> 	bool instead of int.

function rename is redundant between the two entries?


> 	(rs6000_emit_p9_fp_cmove): Rename to maybe_emit_fp_cmove.
> 	(maybe_emit_fp_cmove): Rename rs6000_emit_p9_fp_cmove.  Return
> 	bool instead of int.

Function rename comment is redundant ?

> 	(have_compare_and_set_mask): New helper function.
> 	(rs6000_emit_cmove): Rework support of ISA 3.0 functions to
> 	generate "C" minimum, "C" maximum, and conditional move
> 	instructions for scalar floating point.
> 

Nothing else jumped out at me below. 
lgtm,  thanks, 
-Will

> ---
>  gcc/config/rs6000/rs6000.c | 77 ++++++++++++++++++++++++++------------
>  1 file changed, 53 insertions(+), 24 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index bac50c2bcf6..6324f930628 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -15056,13 +15056,17 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
>    return 1;
>  }
> 
> -/* ISA 3.0 (power9) minmax subcase to emit a XSMAXCDP or XSMINCDP instruction
> -   for SF/DF scalars.  Move TRUE_COND to DEST if OP of the operands of the last
> -   comparison is nonzero/true, FALSE_COND if it is zero/false.  Return 0 if the
> -   hardware has no such operation.  */
> +/* Possibly emit the C variant of the minimum or maximum instruction for
> +   floating point scalars (xsmincdp, xsmaxcdp, etc.).
> 
> -static int
> -rs6000_emit_p9_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
> +   Move TRUE_COND to DEST if OP of the operands of the last comparison is
> +   nonzero/true, FALSE_COND if it is zero/false.
> +
> +   Return false if we can't generate the appropriate minimum or maximum, and
> +   true if we can did the minimum or maximum.  */
> +
> +static bool
> +maybe_emit_fp_c_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>  {
>    enum rtx_code code = GET_CODE (op);
>    rtx op0 = XEXP (op, 0);
> @@ -15072,14 +15076,14 @@ rs6000_emit_p9_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>    bool max_p = false;
> 
>    if (result_mode != compare_mode)
> -    return 0;
> +    return false;
> 
>    if (code == GE || code == GT)
>      max_p = true;
>    else if (code == LE || code == LT)
>      max_p = false;
>    else
> -    return 0;
> +    return false;
> 
>    if (rtx_equal_p (op0, true_cond) && rtx_equal_p (op1, false_cond))
>      ;
> @@ -15092,19 +15096,23 @@ rs6000_emit_p9_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>      max_p = !max_p;
> 
>    else
> -    return 0;
> +    return false;
> 
>    rs6000_emit_minmax (dest, max_p ? SMAX : SMIN, op0, op1);
> -  return 1;
> +  return true;
>  }
> 
> -/* ISA 3.0 (power9) conditional move subcase to emit XSCMP{EQ,GE,GT,NE}DP and
> -   XXSEL instructions for SF/DF scalars.  Move TRUE_COND to DEST if OP of the
> -   operands of the last comparison is nonzero/true, FALSE_COND if it is
> -   zero/false.  Return 0 if the hardware has no such operation.  */
> +/* Possibly emit a floating point conditional move by generating a compare that
> +   sets a mask instruction and a XXSEL select instruction.
> 
> -static int
> -rs6000_emit_p9_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
> +   Move TRUE_COND to DEST if OP of the operands of the last comparison is
> +   nonzero/true, FALSE_COND if it is zero/false.
> +
> +   Return false if the operation cannot be generated, and true if we could
> +   generate the instruction.  */
> +
> +static bool
> +maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>  {
>    enum rtx_code code = GET_CODE (op);
>    rtx op0 = XEXP (op, 0);
> @@ -15132,7 +15140,7 @@ rs6000_emit_p9_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>        break;
> 
>      default:
> -      return 0;
> +      return false;
>      }
> 
>    /* Generate:	[(parallel [(set (dest)
> @@ -15152,7 +15160,28 @@ rs6000_emit_p9_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>    emit_insn (gen_rtx_PARALLEL (VOIDmode,
>  			       gen_rtvec (2, cmove_rtx, clobber_rtx)));
> 
> -  return 1;
> +  return true;
> +}
> +
> +/* Helper function to return true if the target has instructions to do a
> +   compare and set mask instruction that can be used with XXSEL to implement a
> +   conditional move.  It is also assumed that such a target also supports the
> +   "C" minimum and maximum instructions. */
> +
> +static bool
> +have_compare_and_set_mask (machine_mode mode)
> +{
> +  switch (mode)
> +    {
> +    case SFmode:
> +    case DFmode:
> +      return TARGET_P9_MINMAX;
> +
> +    default:
> +      break;
> +    }
> +
> +  return false;
>  }
> 
>  /* Emit a conditional move: move TRUE_COND to DEST if OP of the
> @@ -15181,15 +15210,15 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>    if (GET_MODE (false_cond) != result_mode)
>      return false;
> 
> -  /* See if we can use the ISA 3.0 (power9) min/max/compare functions.  */
> -  if (TARGET_P9_MINMAX
> -      && (compare_mode == SFmode || compare_mode == DFmode)
> -      && (result_mode == SFmode || result_mode == DFmode))
> +  /* See if we can use the "C" minimum, "C" maximum, and compare and set mask
> +     instructions.  */
> +  if (have_compare_and_set_mask (compare_mode)
> +      && have_compare_and_set_mask (result_mode))
>      {
> -      if (rs6000_emit_p9_fp_minmax (dest, op, true_cond, false_cond))
> +      if (maybe_emit_fp_c_minmax (dest, op, true_cond, false_cond))
>  	return true;
> 
> -      if (rs6000_emit_p9_fp_cmove (dest, op, true_cond, false_cond))
> +      if (maybe_emit_fp_cmove (dest, op, true_cond, false_cond))
>  	return true;
>      }
> 
> -- 
> 2.22.0
> 
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] PowerPC: Add power10 xsmaxcqp/xsmincqp support
  2020-08-27  2:45 ` [PATCH 3/4] PowerPC: Add power10 xsmaxcqp/xsmincqp support Michael Meissner
@ 2020-08-27 20:47   ` will schmidt
  2020-08-28  4:09     ` Michael Meissner
  2020-09-10 21:01   ` Segher Boessenkool
  1 sibling, 1 reply; 20+ messages in thread
From: will schmidt @ 2020-08-27 20:47 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

On Wed, 2020-08-26 at 22:45 -0400, Michael Meissner via Gcc-patches wrote:
> PowerPC: Add power10 xsmaxcqp/xsmincqp support.
> 
> This patch adds support for the ISA 3.1 (power10) IEEE 128-bit "C" minimum and
> maximum functions.  Because of the NaN differences, the built-in functions will
> only generate these instructions if -ffast-math is used until the conditional
> move support is added in the next patch.
> 
> I have built compilers on a little endian power9 Linux system with all 4
> patches applied.  I did bootstrap builds and ran the testsuite, with no
> regressions.  Previous versions of the patch was also tested on a little endian
> power8 Linux system.  I would like to check this patch into the master branch
> for GCC 11.  At this time, I do not anticipate needing to backport these
> changes to GCC 10.3.
> 
> 
> gcc/
> 2020-08-26  Michael Meissner  <meissner@linux.ibm.com>
> 

	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add 128-bit
clause to if-check.  

> 	* config/rs6000/rs6000.h (FLOAT128_IEEE_MINMAX_P): New helper
> 	macro.
ok

> 	* config/rs6000/rs6000.md (FSCALAR): New mode iterator for floating
> 	point scalars.

> 	(Fm): New mode attribute for floating point scalars.

Mixed feels on mixed case, but I defer. :-) 

> 	(s<minmax><mode>): Add support for the ISA 3.1 IEEE 128-bit
> 	minimum and maximum instructions.
> 	(s<minmax><mode>3_vsx): Add support for the ISA 3.1 IEEE 128-bit
> 	minimum and maximum instructions.
> 
> gcc/testsuite/
> 2020-08-26  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* gcc.target/powerpc/float128-minmax-2.c: New test.
> 

the rest lgtm, 
thanks
-Will

> ---
>  gcc/config/rs6000/rs6000.c                    |  3 +-
>  gcc/config/rs6000/rs6000.h                    |  4 +++
>  gcc/config/rs6000/rs6000.md                   | 28 +++++++++++++++----
>  .../gcc.target/powerpc/float128-minmax-2.c    | 15 ++++++++++
>  4 files changed, 43 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 6324f930628..05eb141a2cd 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -15445,7 +15445,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
>    /* VSX/altivec have direct min/max insns.  */
>    if ((code == SMAX || code == SMIN)
>        && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
> -	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
> +	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
> +	  || FLOAT128_IEEE_MINMAX_P (mode)))
>      {
>        emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
>        return;
> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index bbd8060e143..b504aaa0199 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -345,6 +345,10 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
>     || ((MODE) == TDmode)						\
>     || (!TARGET_FLOAT128_TYPE && FLOAT128_IEEE_P (MODE)))
> 
> +/* Macro whether the float128 min/max instructions are enabled.  */
> +#define FLOAT128_IEEE_MINMAX_P(MODE)					\
> +  (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (MODE))
> +
>  /* Return true for floating point that does not use a vector register.  */
>  #define SCALAR_FLOAT_MODE_NOT_VECTOR_P(MODE)				\
>    (SCALAR_FLOAT_MODE_P (MODE) && !FLOAT128_VECTOR_P (MODE))
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 43b620ae1c0..006e60f09bc 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -789,6 +789,18 @@ (define_code_attr     minmax	[(smin "min")
>  (define_code_attr     SMINMAX	[(smin "SMIN")
>  				 (smax "SMAX")])
> 
> +;; Mode iterator for scalar binary floating point operations
> +(define_mode_iterator FSCALAR [SF
> +			       DF
> +			       (KF "FLOAT128_IEEE_MINMAX_P (KFmode)")
> +			       (TF "FLOAT128_IEEE_MINMAX_P (TFmode)")])
> +
> +;; Constraints to use for scalar FP operations
> +(define_mode_attr Fm [(SF "wa")
> +		      (DF "wa")
> +		      (TF "v")
> +		      (KF "v")])
> +
>  ;; Iterator to optimize the following cases:
>  ;;	D-form load to FPR register & move to Altivec register
>  ;;	Move Altivec register to FPR register and store
> @@ -5142,9 +5154,9 @@ (define_insn "copysign<mode>3_fcpsgn"
>  ;; to allow either DF/SF to use only traditional registers.
> 
>  (define_expand "s<minmax><mode>3"
> -  [(set (match_operand:SFDF 0 "gpc_reg_operand")
> -	(fp_minmax:SFDF (match_operand:SFDF 1 "gpc_reg_operand")
> -			(match_operand:SFDF 2 "gpc_reg_operand")))]
> +  [(set (match_operand:FSCALAR 0 "gpc_reg_operand")
> +	(fp_minmax:FSCALAR (match_operand:FSCALAR 1 "gpc_reg_operand")
> +			   (match_operand:FSCALAR 2 "gpc_reg_operand")))]
>    "TARGET_MINMAX"
>  {
>    rs6000_emit_minmax (operands[0], <SMINMAX>, operands[1], operands[2]);
> @@ -5152,11 +5164,15 @@ (define_expand "s<minmax><mode>3"
>  })
> 
>  (define_insn "*s<minmax><mode>3_vsx"
> -  [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>")
> -	(fp_minmax:SFDF (match_operand:SFDF 1 "vsx_register_operand" "<Fv>")
> -			(match_operand:SFDF 2 "vsx_register_operand" "<Fv>")))]
> +  [(set (match_operand:FSCALAR 0 "vsx_register_operand" "=<Fm>")
> +	(fp_minmax:FSCALAR
> +	 (match_operand:FSCALAR 1 "vsx_register_operand" "<Fm>")
> +	 (match_operand:FSCALAR 2 "vsx_register_operand" "<Fm>")))]
>    "TARGET_VSX && TARGET_HARD_FLOAT"
>  {
> +  if (FLOAT128_IEEE_P (<MODE>mode))
> +    return "xs<minmax>cqp %0,%1,%2";
> +
>    return (TARGET_P9_MINMAX
>  	  ? "xs<minmax>cdp %x0,%x1,%x2"
>  	  : "xs<minmax>dp %x0,%x1,%x2");
> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> new file mode 100644
> index 00000000000..c71ba08c9f8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
> @@ -0,0 +1,15 @@
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
> +
> +#ifndef TYPE
> +#define TYPE _Float128
> +#endif
> +
> +/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
> +   call.  */
> +TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
> +TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
> +
> +/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
> +/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
> -- 
> 2.22.0
> 
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] PowerPC: Add power10 xscmp{eq,gt,ge}qp support
  2020-08-27  2:46 ` [PATCH 4/4] PowerPC: Add power10 xscmp{eq,gt,ge}qp support Michael Meissner
@ 2020-08-27 20:47   ` will schmidt
  2020-09-15 21:20   ` Segher Boessenkool
  1 sibling, 0 replies; 20+ messages in thread
From: will schmidt @ 2020-08-27 20:47 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

On Wed, 2020-08-26 at 22:46 -0400, Michael Meissner via Gcc-patches wrote:
> PowerPC: Add power10 xscmp{eq,gt,ge}qp support.
> 
> This patch adds the conditional move support.  In adding the conditional move
> support, the optimizers will be able to convert things like:
> 
> 	a = (b > c) ? b : c;
> 
> into the instructions.  This patch merges together the scalar SF/DF conditional
> move with the scalar KF/TF conditional move.  It extends the optimization that
> was previously used for SFmode and DFmode to allow the comparison to be a
> different scalar floating point mode than the move.  I.e.
> 
> 	__float128 a, b, c;
> 	float x, y;
> 
> 	/* ... */
> 
> 	a = (x == y) ? b : c;
> 
> I did have to add an XXPERMDI if the comparison mode was SFmode or DFmode, and
> the move mode is KFmode or TFmode (the XSCMP{EQ,GT,GE}DP instructions
> explicitly set the bottom 64 bits of the vector register to 0).
> 
> I have built compilers on a little endian power9 Linux system with all 4
> patches applied.  I did bootstrap builds and ran the testsuite, with no
> regressions.  Previous versions of the patch was also tested on a little endian
> power8 Linux system.  I would like to check this patch into the master branch
> for GCC 11.  At this time, I do not anticipate needing to backport these
> changes to GCC 10.3.
> 
> gcc/
> 2020-08-26  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* config/rs6000/predicates.md (fpmask_normal_or_invert_operator):
> 	New predicate.
> 	* config/rs6000/rs6000.c (have_compare_and_set_mask): Add IEEE
> 	128-bit floating point types.

> 	* config/rs6000/rs6000.md (FSCALAR2): New iterator for floating
> 	point conditional moves.


> 	(mov<SFDF:mode><SFDF2:mode>cc_p9): Replace with
> 	mov<FSCALAR:mode><FSCALAR2:mode>.

(mov<mode>cc): Update to use FSCALAR iterator

> 	(mov<SFDF:mode><SFDF2:mode>cc_invert_p9): Replace with
> 	mov<FSCALAR:mode><FSCALAR2:mode>.

> 	(mov<FSCALAR:mode><FSCALAR2:mode>): Combine both

If i followed correctly, that should that be
*mov<FSCALAR:mode><FSCALAR2:mode>cc " ..?

> 	mov<SFDF:mode><SFDF2:mode>cc_p9 and
> 	mov<SFDF:mode><SFDF2:mode>cc_invert_p9 patterns.  Add ISA 3.1
> 	support for IEEE 128-bit conditional moves.  Always use an
> 	earlyclobber register for the mask.  Use XXPERMDI to extend the
> 	mask if we have a 64-bit comparison and 128-bit move.
> 	register for the mask.

ok

> 	(fpmask<mode>, xxsel<mode>): Add ISA 3.1 support for IEEE 128-bit
> 	conditional moves.  Enable the generator functionality so
> 	mov<FSCALAR:mode><FSCALAR2:mode> can call it.  Update constraints
> 	for 128-bit operations.

ok

reviewed the rest, nothing else jumped out at me.   lgtm, thanks
-Will

> 
> gcc/testsuite/
> 2020-08-26  Michael Meissner  <meissner@linux.ibm.com>
> 
> 	* gcc.target/powerpc/float128-cmove.c: New test.
> 	* gcc.target/powerpc/float128-minmax-3.c: New test.
> 
> ---
>  gcc/config/rs6000/predicates.md               |   5 +
>  gcc/config/rs6000/rs6000.c                    |   4 +
>  gcc/config/rs6000/rs6000.md                   | 154 ++++++++++--------
>  .../gcc.target/powerpc/float128-cmove.c       |  93 +++++++++++
>  .../gcc.target/powerpc/float128-minmax-3.c    |  15 ++
>  5 files changed, 200 insertions(+), 71 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-cmove.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
> 
> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index 2709e46f7e5..60b45601e9b 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -1225,6 +1225,11 @@ (define_predicate "fpmask_comparison_operator"
>  (define_predicate "invert_fpmask_comparison_operator"
>    (match_code "ne,unlt,unle"))
> 
> +;; Return 1 if OP is either a fpmask_comparison_operator or
> +;; invert_fpmask_comparsion_operator.
> +(define_predicate "fpmask_normal_or_invert_operator"
> +  (match_code "eq,gt,ge,ne,unlt,unle"))
> +
>  ;; Return 1 if OP is a comparison operation suitable for integer vector/scalar
>  ;; comparisons that generate a -1/0 mask.
>  (define_predicate "vecint_comparison_operator"
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 05eb141a2cd..403897926c5 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -15177,6 +15177,10 @@ have_compare_and_set_mask (machine_mode mode)
>      case DFmode:
>        return TARGET_P9_MINMAX;
> 
> +    case KFmode:
> +    case TFmode:
> +      return FLOAT128_IEEE_MINMAX_P (mode);
> +
>      default:
>        break;
>      }
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 006e60f09bc..147c635994c 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -795,6 +795,15 @@ (define_mode_iterator FSCALAR [SF
>  			       (KF "FLOAT128_IEEE_MINMAX_P (KFmode)")
>  			       (TF "FLOAT128_IEEE_MINMAX_P (TFmode)")])
> 
> +;; Secondary iterator for scalar binary floating point operations.  This is
> +;; used for the conditional moves when we have a compare and set mask
> +;; instruction.  Using this instruction allows us to do a conditional move
> +;; where the comparison type might be different from the values being moved.
> +(define_mode_iterator FSCALAR2 [SF
> +				DF
> +				(KF "FLOAT128_IEEE_MINMAX_P (KFmode)")
> +				(TF "FLOAT128_IEEE_MINMAX_P (TFmode)")])
> +
>  ;; Constraints to use for scalar FP operations
>  (define_mode_attr Fm [(SF "wa")
>  		      (DF "wa")
> @@ -5290,10 +5299,10 @@ (define_insn "*setnbcr_<un>signed_<GPR:mode>"
> 
>  ;; Floating point conditional move
>  (define_expand "mov<mode>cc"
> -   [(set (match_operand:SFDF 0 "gpc_reg_operand")
> -	 (if_then_else:SFDF (match_operand 1 "comparison_operator")
> -			    (match_operand:SFDF 2 "gpc_reg_operand")
> -			    (match_operand:SFDF 3 "gpc_reg_operand")))]
> +   [(set (match_operand:FSCALAR 0 "gpc_reg_operand")
> +	 (if_then_else:FSCALAR (match_operand 1 "comparison_operator")
> +			       (match_operand:FSCALAR 2 "gpc_reg_operand")
> +			       (match_operand:FSCALAR 3 "gpc_reg_operand")))]
>    "TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT"
>  {
>    if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
> @@ -5313,92 +5322,95 @@ (define_insn "*fsel<SFDF:mode><SFDF2:mode>4"
>    "fsel %0,%1,%2,%3"
>    [(set_attr "type" "fp")])
> 
> -(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_p9"
> -  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
> -	(if_then_else:SFDF
> -	 (match_operator:CCFP 1 "fpmask_comparison_operator"
> -		[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
> -		 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
> -	 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
> -	 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
> -   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
> +;; Conditional moves using scalar floating point types if we have a compare and
> +;; set mask instruction (like xscmpeqdp).
> +;;
> +;; If we have a 64-bit comparison and a 128-bit mode, such as:
> +;;
> +;;	double x, y;
> +;;	__float128 a, b, c;
> +;;	a = (x == y) ? b : c;
> +;;
> +;; We need to extend the comparison (XSCMPEQDP puts 0's in the bottom 64-bits).
> +(define_insn_and_split "*mov<FSCALAR:mode><FSCALAR2:mode>cc"
> +  [(set (match_operand:FSCALAR 0 "vsx_register_operand" "=<FSCALAR:Fm>")
> +	(if_then_else:FSCALAR
> +	 (match_operator:CCFP 1 "fpmask_normal_or_invert_operator"
> +		[(match_operand:FSCALAR2 2 "vsx_register_operand" "<FSCALAR2:Fm>")
> +		 (match_operand:FSCALAR2 3 "vsx_register_operand" "<FSCALAR2:Fm>")])
> +	 (match_operand:FSCALAR 4 "vsx_register_operand" "<FSCALAR:Fm>")
> +	 (match_operand:FSCALAR 5 "vsx_register_operand" "<FSCALAR:Fm>")))
> +   (clobber (match_scratch:V2DI 6 "=<FSCALAR2:Fm>"))]
>    "TARGET_P9_MINMAX"
>    "#"
> -  ""
> -  [(set (match_dup 6)
> -	(if_then_else:V2DI (match_dup 1)
> -			   (match_dup 7)
> -			   (match_dup 8)))
> -   (set (match_dup 0)
> -	(if_then_else:SFDF (ne (match_dup 6)
> -			       (match_dup 8))
> -			   (match_dup 4)
> -			   (match_dup 5)))]
> +  "&& 1"
> +  [(pc)]
>  {
> -  if (GET_CODE (operands[6]) == SCRATCH)
> -    operands[6] = gen_reg_rtx (V2DImode);
> +  rtx dest = operands[0];
> +  rtx cmp = operands[1];
> +  rtx cmp_op1 = operands[2];
> +  rtx cmp_op2 = operands[3];
> +  rtx move_t = operands[4];
> +  rtx move_f = operands[5];
> +  rtx mask_reg = operands[6];
> +  rtx mask_m1 = CONSTM1_RTX (V2DImode);
> +  rtx mask_0 = CONST0_RTX (V2DImode);
> 
> -  operands[7] = CONSTM1_RTX (V2DImode);
> -  operands[8] = CONST0_RTX (V2DImode);
> -}
> - [(set_attr "length" "8")
> -  (set_attr "type" "vecperm")])
> +  if (GET_CODE (mask_reg) == SCRATCH)
> +    mask_reg = gen_reg_rtx (V2DImode);
> 
> -;; Handle inverting the fpmask comparisons.
> -(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_invert_p9"
> -  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
> -	(if_then_else:SFDF
> -	 (match_operator:CCFP 1 "invert_fpmask_comparison_operator"
> -		[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
> -		 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
> -	 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
> -	 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
> -   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
> -  "TARGET_P9_MINMAX"
> -  "#"
> -  "&& 1"
> -  [(set (match_dup 6)
> -	(if_then_else:V2DI (match_dup 9)
> -			   (match_dup 7)
> -			   (match_dup 8)))
> -   (set (match_dup 0)
> -	(if_then_else:SFDF (ne (match_dup 6)
> -			       (match_dup 8))
> -			   (match_dup 5)
> -			   (match_dup 4)))]
> -{
> -  rtx op1 = operands[1];
> -  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (op1));
> +  /* Reverse the operands and test if the comparison operator is inverted.  */
> +  if (invert_fpmask_comparison_operator (cmp, CCFPmode))
> +    {
> +      enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (cmp));
> +      cmp = gen_rtx_fmt_ee (cond, CCFPmode, cmp_op1, cmp_op2);
> +      std::swap (move_t, move_f);
> +    }
> 
> -  if (GET_CODE (operands[6]) == SCRATCH)
> -    operands[6] = gen_reg_rtx (V2DImode);
> +  /* Emit the compare and set mask instruction.  */
> +  emit_insn (gen_fpmask<FSCALAR2:mode> (mask_reg, cmp, cmp_op1, cmp_op2,
> +				        mask_m1, mask_0));
> 
> -  operands[7] = CONSTM1_RTX (V2DImode);
> -  operands[8] = CONST0_RTX (V2DImode);
> +  /* If we have a 64-bit comparison, but an 128-bit move, we need to extend the
> +     mask.  Because we are using the splat builtin to extend the V2DImode, we
> +     need to use element 1 on little endian systems.  */
> +  if (!FLOAT128_IEEE_P (<FSCALAR2:MODE>mode)
> +      && FLOAT128_IEEE_P (<FSCALAR:MODE>mode))
> +    {
> +      rtx element = WORDS_BIG_ENDIAN ? const0_rtx : const1_rtx;
> +      emit_insn (gen_vsx_xxspltd_v2di (mask_reg, mask_reg, element));
> +    }
> 
> -  operands[9] = gen_rtx_fmt_ee (cond, CCFPmode, operands[2], operands[3]);
> +  /* Now emit the XXSEL insn.  */
> +  emit_insn (gen_xxsel<FSCALAR:mode> (dest, mask_reg, mask_0, move_t, move_f));
> +  DONE;
>  }
>   [(set_attr "length" "8")
>    (set_attr "type" "vecperm")])
> 
> -(define_insn "*fpmask<mode>"
> -  [(set (match_operand:V2DI 0 "vsx_register_operand" "=wa")
> +(define_insn "fpmask<mode>"
> +  [(set (match_operand:V2DI 0 "vsx_register_operand" "=<Fm>")
>  	(if_then_else:V2DI
>  	 (match_operator:CCFP 1 "fpmask_comparison_operator"
> -		[(match_operand:SFDF 2 "vsx_register_operand" "<Fv>")
> -		 (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")])
> +		[(match_operand:FSCALAR 2 "vsx_register_operand" "<Fm>")
> +		 (match_operand:FSCALAR 3 "vsx_register_operand" "<Fm>")])
>  	 (match_operand:V2DI 4 "all_ones_constant" "")
>  	 (match_operand:V2DI 5 "zero_constant" "")))]
>    "TARGET_P9_MINMAX"
> -  "xscmp%V1dp %x0,%x2,%x3"
> +{
> +  return (FLOAT128_IEEE_P (<MODE>mode)
> +	  ? "xscmp%V1qp %0,%2,%3"
> +	  : "xscmp%V1dp %x0,%x2,%x3");
> +}
>    [(set_attr "type" "fpcompare")])
> 
> -(define_insn "*xxsel<mode>"
> -  [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>")
> -	(if_then_else:SFDF (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
> -			       (match_operand:V2DI 2 "zero_constant" ""))
> -			   (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")
> -			   (match_operand:SFDF 4 "vsx_register_operand" "<Fv>")))]
> +(define_insn "xxsel<mode>"
> +  [(set (match_operand:FSCALAR 0 "vsx_register_operand" "=<Fm>")
> +	(if_then_else:FSCALAR
> +	 (ne (match_operand:V2DI 1 "vsx_register_operand" "<Fm>")
> +	     (match_operand:V2DI 2 "zero_constant" ""))
> +	 (match_operand:FSCALAR 3 "vsx_register_operand" "<Fm>")
> +	 (match_operand:FSCALAR 4 "vsx_register_operand" "<Fm>")))]
>    "TARGET_P9_MINMAX"
>    "xxsel %x0,%x4,%x3,%x1"
>    [(set_attr "type" "vecmove")])
> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmove.c b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
> new file mode 100644
> index 00000000000..639d5a77087
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
> @@ -0,0 +1,93 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> +/* { dg-final { scan-assembler     {\mxscmpeq[dq]p\M} } } */
> +/* { dg-final { scan-assembler     {\mxxpermdi\M}     } } */
> +/* { dg-final { scan-assembler     {\mxxsel\M}        } } */
> +/* { dg-final { scan-assembler-not {\mxscmpu[dq]p\M}  } } */
> +/* { dg-final { scan-assembler-not {\mfcmp[uo]\M}     } } */
> +/* { dg-final { scan-assembler-not {\mfsel\M}         } } */
> +
> +/* This series of tests tests whether you can do a conditional move where the
> +   test is one floating point type, and the result is another floating point
> +   type.
> +
> +   If the comparison type is SF/DFmode, and the move type is IEEE 128-bit
> +   floating point, we have to duplicate the mask in the lower 64-bits with
> +   XXPERMDI because XSCMPEQDP clears the bottom 64-bits of the mask register.
> +
> +   Going the other way (IEEE 128-bit comparsion, 64-bit move) is fine as the
> +   mask word will be 128-bits.  */
> +
> +float
> +eq_f_d (float a, float b, double x, double y)
> +{
> +  return (x == y) ? a : b;
> +}
> +
> +double
> +eq_d_f (double a, double b, float x, float y)
> +{
> +  return (x == y) ? a : b;
> +}
> +
> +float
> +eq_f_f128 (float a, float b, __float128 x, __float128 y)
> +{
> +  return (x == y) ? a : b;
> +}
> +
> +double
> +eq_d_f128 (double a, double b, __float128 x, __float128 y)
> +{
> +  return (x == y) ? a : b;
> +}
> +
> +__float128
> +eq_f128_f (__float128 a, __float128 b, float x, float y)
> +{
> +  return (x == y) ? a : b;
> +}
> +
> +__float128
> +eq_f128_d (__float128 a, __float128 b, double x, double y)
> +{
> +  return (x != y) ? a : b;
> +}
> +
> +float
> +ne_f_d (float a, float b, double x, double y)
> +{
> +  return (x != y) ? a : b;
> +}
> +
> +double
> +ne_d_f (double a, double b, float x, float y)
> +{
> +  return (x != y) ? a : b;
> +}
> +
> +float
> +ne_f_f128 (float a, float b, __float128 x, __float128 y)
> +{
> +  return (x != y) ? a : b;
> +}
> +
> +double
> +ne_d_f128 (double a, double b, __float128 x, __float128 y)
> +{
> +  return (x != y) ? a : b;
> +}
> +
> +__float128
> +ne_f128_f (__float128 a, __float128 b, float x, float y)
> +{
> +  return (x != y) ? a : b;
> +}
> +
> +__float128
> +ne_f128_d (__float128 a, __float128 b, double x, double y)
> +{
> +  return (x != y) ? a : b;
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
> new file mode 100644
> index 00000000000..6f7627c0f2a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
> @@ -0,0 +1,15 @@
> +/* { dg-require-effective-target ppc_float128_hw } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> +
> +#ifndef TYPE
> +#define TYPE _Float128
> +#endif
> +
> +/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
> +   call.  */
> +TYPE f128_min (TYPE a, TYPE b) { return (a < b) ? a : b; }
> +TYPE f128_max (TYPE a, TYPE b) { return (b > a) ? b : a; }
> +
> +/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
> +/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
> -- 
> 2.22.0
> 
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] PowerPC: Add power10 xsmaxcqp/xsmincqp support
  2020-08-27 20:47   ` will schmidt
@ 2020-08-28  4:09     ` Michael Meissner
  2020-09-10 20:18       ` Segher Boessenkool
  0 siblings, 1 reply; 20+ messages in thread
From: Michael Meissner @ 2020-08-28  4:09 UTC (permalink / raw)
  To: will schmidt
  Cc: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

On Thu, Aug 27, 2020 at 03:47:19PM -0500, will schmidt wrote:
> > 	(Fm): New mode attribute for floating point scalars.
> 
> Mixed feels on mixed case, but I defer. :-) 

It is similar to other mode attributes (Ff, Fv) used for setting constraints
based on the mode.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/4] PowerPC: Change cmove function return to bool
  2020-08-27  2:43 ` [PATCH 1/4] PowerPC: Change cmove function return to bool Michael Meissner
  2020-08-27 20:47   ` will schmidt
@ 2020-09-10 19:15   ` Segher Boessenkool
  1 sibling, 0 replies; 20+ messages in thread
From: Segher Boessenkool @ 2020-09-10 19:15 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner

On Wed, Aug 26, 2020 at 10:43:23PM -0400, Michael Meissner wrote:
> 	* config/rs6000/rs6000-protos.h (rs6000_emit_cmove): Change return
> 	type to bool.
> 	(rs6000_emit_int_cmove): Change return type to bool.
> 	* config/rs6000/rs6000.c (rs6000_emit_cmove): Change return type
> 	to bool.
> 	(rs6000_emit_int_cmove): Change return type to bool.

Okay for trunk.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/4] PowerPC: Rename functions for min, max, cmove
  2020-08-27 20:47   ` will schmidt
@ 2020-09-10 19:18     ` Segher Boessenkool
  0 siblings, 0 replies; 20+ messages in thread
From: Segher Boessenkool @ 2020-09-10 19:18 UTC (permalink / raw)
  To: will schmidt
  Cc: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner

On Thu, Aug 27, 2020 at 03:47:15PM -0500, will schmidt wrote:
> On Wed, 2020-08-26 at 22:44 -0400, Michael Meissner via Gcc-patches wrote:
> > 	* config/rs6000/rs6000.c (rs6000_emit_p9_fp_minmax): Rename to
> > 	maybe_emit_fp_c_minmax.
> ok
> 
> > 	(maybe_emit_fp_c_minmax): Rename rs6000_emit_p9_fp_minmax.  Return
> > 	bool instead of int.
> 
> function rename is redundant between the two entries?

Yes, there are cleaner ways to write this, while still keeping both the
old and new names on the lhs of the colon (where they should be!)
Without naming each thing twice, even.


Segher

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/4] PowerPC: Rename functions for min, max, cmove
  2020-08-27  2:44 ` [PATCH 2/4] PowerPC: Rename functions for min, max, cmove Michael Meissner
  2020-08-27 20:47   ` will schmidt
@ 2020-09-10 19:29   ` Segher Boessenkool
  2020-09-11 22:15     ` [PATCH 2/4, revised patch applied] " Michael Meissner
  1 sibling, 1 reply; 20+ messages in thread
From: Segher Boessenkool @ 2020-09-10 19:29 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner

On Wed, Aug 26, 2020 at 10:44:22PM -0400, Michael Meissner wrote:
> PowerPC: Rename functions for min, max, cmove.
> 
> This patch renames the functions that generate the ISA 3.0 C minimum, C
> maximum, and conditional move instructions to use a better name than just using
> a _p9 suffix.  Because the functions can fail, the names use "maybe_emit"
> instead of "generate_" in the name.

Please keep the rs6000_ prefix though.

If we want to drop that, we should do it *everywhere*.

> -/* ISA 3.0 (power9) minmax subcase to emit a XSMAXCDP or XSMINCDP instruction
> -   for SF/DF scalars.  Move TRUE_COND to DEST if OP of the operands of the last
> -   comparison is nonzero/true, FALSE_COND if it is zero/false.  Return 0 if the
> -   hardware has no such operation.  */
> +/* Possibly emit the C variant of the minimum or maximum instruction for
> +   floating point scalars (xsmincdp, xsmaxcdp, etc.).

Please explain that this is asymmetric.

This isn't a proper min/max, except with fastmath :-(

> +  /* See if we can use the "C" minimum, "C" maximum, and compare and set mask
> +     instructions.  */

Just write the actual mnemonics please.

Okay for trunk with those things fixed.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] PowerPC: Add power10 xsmaxcqp/xsmincqp support
  2020-08-28  4:09     ` Michael Meissner
@ 2020-09-10 20:18       ` Segher Boessenkool
  0 siblings, 0 replies; 20+ messages in thread
From: Segher Boessenkool @ 2020-09-10 20:18 UTC (permalink / raw)
  To: Michael Meissner, will schmidt, gcc-patches, David Edelsohn,
	Bill Schmidt, Peter Bergner

On Fri, Aug 28, 2020 at 12:09:48AM -0400, Michael Meissner wrote:
> On Thu, Aug 27, 2020 at 03:47:19PM -0500, will schmidt wrote:
> > > 	(Fm): New mode attribute for floating point scalars.
> > 
> > Mixed feels on mixed case, but I defer. :-) 
> 
> It is similar to other mode attributes (Ff, Fv) used for setting constraints
> based on the mode.

Yes, and those are unnecessarily cryptic as well!  Not to mention this
Studly Case thing ;-)


Segher

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/4] PowerPC: Add power10 xsmaxcqp/xsmincqp support
  2020-08-27  2:45 ` [PATCH 3/4] PowerPC: Add power10 xsmaxcqp/xsmincqp support Michael Meissner
  2020-08-27 20:47   ` will schmidt
@ 2020-09-10 21:01   ` Segher Boessenkool
  1 sibling, 0 replies; 20+ messages in thread
From: Segher Boessenkool @ 2020-09-10 21:01 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner

Hi!

On Wed, Aug 26, 2020 at 10:45:26PM -0400, Michael Meissner wrote:
> 	* config/rs6000/rs6000.md (FSCALAR): New mode iterator for floating
> 	point scalars.

We have the long-established SFDF for a very similar thing.  So maybe
just call this SFDFTF or SDTF or something?

Alternatively, we have FP for all the floating point modes -- maybe
add a BFP that will then we just the (IEEE) binary floating point modes?

> 	(Fm): New mode attribute for floating point scalars.

Not happy about this at all.  Just make *one* attribute that is the
constraint to use for any mode?  And if there are multiple choices for
what constraint to use with some mode, well, hiding that behind a mode
attribute is obfuscation then.

The <Fv> attribute always is "wa" nowadays (I should clean that up some
day, heh).  And <Ff> is either "f" or "d", but those two constraints are
identical these days.

I have no idea what "F" means?  (Or "m"!)

> 	(s<minmax><mode>): Add support for the ISA 3.1 IEEE 128-bit
> 	minimum and maximum instructions.
> 	(s<minmax><mode>3_vsx): Add support for the ISA 3.1 IEEE 128-bit
> 	minimum and maximum instructions.

> +/* Macro whether the float128 min/max instructions are enabled.  */
> +#define FLOAT128_IEEE_MINMAX_P(MODE)					\
> +  (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (MODE))

This is a horrible name.  It is about the minmax*c* insns.  And the "_P"
is wrong (this macro is not checking if the mode is a minmax, whatever
that would mean!)

> +;; Constraints to use for scalar FP operations
> +(define_mode_attr Fm [(SF "wa")
> +		      (DF "wa")
> +		      (TF "v")
> +		      (KF "v")])

You miss a lot of context here...  This of course is *not* for all
scalar FP ops.  Ah, so the "m" is for "minmax" I guess.

>  (define_insn "*s<minmax><mode>3_vsx"
> -  [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>")
> -	(fp_minmax:SFDF (match_operand:SFDF 1 "vsx_register_operand" "<Fv>")
> -			(match_operand:SFDF 2 "vsx_register_operand" "<Fv>")))]
> +  [(set (match_operand:FSCALAR 0 "vsx_register_operand" "=<Fm>")
> +	(fp_minmax:FSCALAR
> +	 (match_operand:FSCALAR 1 "vsx_register_operand" "<Fm>")
> +	 (match_operand:FSCALAR 2 "vsx_register_operand" "<Fm>")))]
>    "TARGET_VSX && TARGET_HARD_FLOAT"
>  {
> +  if (FLOAT128_IEEE_P (<MODE>mode))
> +    return "xs<minmax>cqp %0,%1,%2";

Nope.  Since you need to handle this with completely separate code
anyway, just make it a separate instruction pattern please.  And there
is no need for a define_expand for it, so you can just leave that one
intact as-is.

Segher

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/4, revised patch applied] PowerPC: Rename functions for min, max, cmove
  2020-09-10 19:29   ` Segher Boessenkool
@ 2020-09-11 22:15     ` Michael Meissner
  2020-09-15 18:38       ` Alexandre Oliva
  0 siblings, 1 reply; 20+ messages in thread
From: Michael Meissner @ 2020-09-11 22:15 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner

Here is the patch that I applied:

From 1a2e0742e3e3c45f75d0ce31c45a7778c8d1f45e Mon Sep 17 00:00:00 2001
From: Michael Meissner <meissner@linux.ibm.com>
Date: Fri, 11 Sep 2020 16:57:13 -0400
Subject: [PATCH] PowerPC: rename some functions.

gcc/
2020-09-11  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_maybe_emit_maxc_minc): Rename
	from rs6000_emit_p9_fp_minmax.  Change return type to bool.  Add
	comments to document NaN/signed zero behavior.
	(rs6000_maybe_emit_fp_cmove): Rename from rs6000_emit_p9_fp_cmove.
	(have_compare_and_set_mask): New helper function.
	(rs6000_emit_cmove): Update calls to new names and the new helper
	function.
---
 gcc/config/rs6000/rs6000.c | 93 ++++++++++++++++++++++++++++----------
 1 file changed, 69 insertions(+), 24 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 9908830b07a..20a4ba382bc 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15057,13 +15057,33 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   return 1;
 }
 
-/* ISA 3.0 (power9) minmax subcase to emit a XSMAXCDP or XSMINCDP instruction
-   for SF/DF scalars.  Move TRUE_COND to DEST if OP of the operands of the last
-   comparison is nonzero/true, FALSE_COND if it is zero/false.  Return 0 if the
-   hardware has no such operation.  */
+/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum or
+   minimum with "C" semantics.
 
-static int
-rs6000_emit_p9_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
+   Unless you use -ffast-math, you can't use these instructions to replace
+   conditions that implicitly reverse the condition because the comparison
+   might generate a NaN or signed zer0.
+
+   I.e. the following can be replaced all of the time
+	ret = (op1 >  op2) ? op1 : op2	; generate xsmaxcdp
+	ret = (op1 >= op2) ? op1 : op2	; generate xsmaxcdp
+	ret = (op1 <  op2) ? op1 : op2;	; generate xsmincdp
+	ret = (op1 <= op2) ? op1 : op2;	; generate xsmincdp
+
+   The following can be replaced only if -ffast-math is used:
+	ret = (op1 <  op2) ? op2 : op1	; generate xsmaxcdp
+	ret = (op1 <= op2) ? op2 : op1	; generate xsmaxcdp
+	ret = (op1 >  op2) ? op2 : op1;	; generate xsmincdp
+	ret = (op1 >= op2) ? op2 : op1;	; generate xsmincdp
+
+   Move TRUE_COND to DEST if OP of the operands of the last comparison is
+   nonzero/true, FALSE_COND if it is zero/false.
+
+   Return false if we can't generate the appropriate minimum or maximum, and
+   true if we can did the minimum or maximum.  */
+
+static bool
+rs6000_maybe_emit_maxc_minc (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 {
   enum rtx_code code = GET_CODE (op);
   rtx op0 = XEXP (op, 0);
@@ -15073,14 +15093,14 @@ rs6000_emit_p9_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   bool max_p = false;
 
   if (result_mode != compare_mode)
-    return 0;
+    return false;
 
   if (code == GE || code == GT)
     max_p = true;
   else if (code == LE || code == LT)
     max_p = false;
   else
-    return 0;
+    return false;
 
   if (rtx_equal_p (op0, true_cond) && rtx_equal_p (op1, false_cond))
     ;
@@ -15093,19 +15113,23 @@ rs6000_emit_p9_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
     max_p = !max_p;
 
   else
-    return 0;
+    return false;
 
   rs6000_emit_minmax (dest, max_p ? SMAX : SMIN, op0, op1);
-  return 1;
+  return true;
 }
 
-/* ISA 3.0 (power9) conditional move subcase to emit XSCMP{EQ,GE,GT,NE}DP and
-   XXSEL instructions for SF/DF scalars.  Move TRUE_COND to DEST if OP of the
-   operands of the last comparison is nonzero/true, FALSE_COND if it is
-   zero/false.  Return 0 if the hardware has no such operation.  */
+/* Possibly emit a floating point conditional move by generating a compare that
+   sets a mask instruction and a XXSEL select instruction.
 
-static int
-rs6000_emit_p9_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
+   Move TRUE_COND to DEST if OP of the operands of the last comparison is
+   nonzero/true, FALSE_COND if it is zero/false.
+
+   Return false if the operation cannot be generated, and true if we could
+   generate the instruction.  */
+
+static bool
+rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 {
   enum rtx_code code = GET_CODE (op);
   rtx op0 = XEXP (op, 0);
@@ -15133,7 +15157,7 @@ rs6000_emit_p9_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
       break;
 
     default:
-      return 0;
+      return false;
     }
 
   /* Generate:	[(parallel [(set (dest)
@@ -15153,7 +15177,28 @@ rs6000_emit_p9_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   emit_insn (gen_rtx_PARALLEL (VOIDmode,
 			       gen_rtvec (2, cmove_rtx, clobber_rtx)));
 
-  return 1;
+  return true;
+}
+
+/* Helper function to return true if the target has instructions to do a
+   compare and set mask instruction that can be used with XXSEL to implement a
+   conditional move.  It is also assumed that such a target also supports the
+   "C" minimum and maximum instructions. */
+
+static bool
+have_compare_and_set_mask (machine_mode mode)
+{
+  switch (mode)
+    {
+    case SFmode:
+    case DFmode:
+      return TARGET_P9_MINMAX;
+
+    default:
+      break;
+    }
+
+  return false;
 }
 
 /* Emit a conditional move: move TRUE_COND to DEST if OP of the
@@ -15182,15 +15227,15 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   if (GET_MODE (false_cond) != result_mode)
     return false;
 
-  /* See if we can use the ISA 3.0 (power9) min/max/compare functions.  */
-  if (TARGET_P9_MINMAX
-      && (compare_mode == SFmode || compare_mode == DFmode)
-      && (result_mode == SFmode || result_mode == DFmode))
+  /* See if we can use the "C" minimum, "C" maximum, and compare and set mask
+     instructions.  */
+  if (have_compare_and_set_mask (compare_mode)
+      && have_compare_and_set_mask (result_mode))
     {
-      if (rs6000_emit_p9_fp_minmax (dest, op, true_cond, false_cond))
+      if (rs6000_maybe_emit_maxc_minc (dest, op, true_cond, false_cond))
 	return true;
 
-      if (rs6000_emit_p9_fp_cmove (dest, op, true_cond, false_cond))
+      if (rs6000_maybe_emit_fp_cmove (dest, op, true_cond, false_cond))
 	return true;
     }
 
-- 
2.22.0


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/4, revised patch applied] PowerPC: Rename functions for min, max, cmove
  2020-09-11 22:15     ` [PATCH 2/4, revised patch applied] " Michael Meissner
@ 2020-09-15 18:38       ` Alexandre Oliva
  2020-09-15 19:51         ` Peter Bergner
  2020-09-15 19:58         ` Segher Boessenkool
  0 siblings, 2 replies; 20+ messages in thread
From: Alexandre Oliva @ 2020-09-15 18:38 UTC (permalink / raw)
  To: Michael Meissner
  Cc: Segher Boessenkool, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner

Hello, Mike,

On Sep 11, 2020, Michael Meissner via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:

> +    case SFmode:
> +    case DFmode:

gcc110 (ppc64) in the build farm didn't like this.  The bootstrap
compiler barfs on these expressions, because of some constexpr issue I
haven't really looked into.

I'm testing this patch.  I'll check it in when I'm done.


use E_*mode instead of just *mode

From: Alexandre Oliva <oliva@adacore.com>

g++ 4.8.5 rejected cases with SFmode and DFmode, presumably due to
some bug in the constexpr implementation.

for  gcc/ChangeLog

	* config/rs6000/rs6000.c (have_compare_and_set_mask): Use
	E_*mode in cases.
---
 gcc/config/rs6000/rs6000.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6d0c550..b32fe91 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15190,8 +15190,8 @@ have_compare_and_set_mask (machine_mode mode)
 {
   switch (mode)
     {
-    case SFmode:
-    case DFmode:
+    case E_SFmode:
+    case E_DFmode:
       return TARGET_P9_MINMAX;
 
     default:



-- 
Alexandre Oliva, happy hacker
https://FSFLA.org/blogs/lxo/
Free Software Activist
GNU Toolchain Engineer

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/4, revised patch applied] PowerPC: Rename functions for min, max, cmove
  2020-09-15 18:38       ` Alexandre Oliva
@ 2020-09-15 19:51         ` Peter Bergner
  2020-09-15 19:58         ` Segher Boessenkool
  1 sibling, 0 replies; 20+ messages in thread
From: Peter Bergner @ 2020-09-15 19:51 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Michael Meissner, Segher Boessenkool, gcc-patches,
	David Edelsohn, Bill Schmidt, Bill Seurer

On 9/15/20 1:38 PM, Alexandre Oliva wrote:
>> +    case SFmode:
>> +    case DFmode:
> 
> gcc110 (ppc64) in the build farm didn't like this.  The bootstrap
> compiler barfs on these expressions, because of some constexpr issue I
> haven't really looked into.
> 
> I'm testing this patch.  I'll check it in when I'm done.
> 
> 
> use E_*mode instead of just *mode

Bill's nightly testing on one of our old systems just hit this too.
Thanks for fixing and testing the fix!

Peter



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/4, revised patch applied] PowerPC: Rename functions for min, max, cmove
  2020-09-15 18:38       ` Alexandre Oliva
  2020-09-15 19:51         ` Peter Bergner
@ 2020-09-15 19:58         ` Segher Boessenkool
  1 sibling, 0 replies; 20+ messages in thread
From: Segher Boessenkool @ 2020-09-15 19:58 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Michael Meissner, Bill Schmidt, gcc-patches, David Edelsohn

On Tue, Sep 15, 2020 at 03:38:05PM -0300, Alexandre Oliva wrote:
> On Sep 11, 2020, Michael Meissner via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> > +    case SFmode:
> > +    case DFmode:
> 
> gcc110 (ppc64) in the build farm didn't like this.  The bootstrap
> compiler barfs on these expressions, because of some constexpr issue I
> haven't really looked into.

Yeah, the system compiler is 4.8.5 (this is centos7).

> I'm testing this patch.  I'll check it in when I'm done.

It is pre-approved, just check it in already please!


Segher


> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -15190,8 +15190,8 @@ have_compare_and_set_mask (machine_mode mode)
>  {
>    switch (mode)
>      {
> -    case SFmode:
> -    case DFmode:
> +    case E_SFmode:
> +    case E_DFmode:
>        return TARGET_P9_MINMAX;
>  
>      default:

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/4] PowerPC: Add power10 xscmp{eq,gt,ge}qp support
  2020-08-27  2:46 ` [PATCH 4/4] PowerPC: Add power10 xscmp{eq,gt,ge}qp support Michael Meissner
  2020-08-27 20:47   ` will schmidt
@ 2020-09-15 21:20   ` Segher Boessenkool
  1 sibling, 0 replies; 20+ messages in thread
From: Segher Boessenkool @ 2020-09-15 21:20 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, David Edelsohn, Bill Schmidt,
	Peter Bergner

On Wed, Aug 26, 2020 at 10:46:37PM -0400, Michael Meissner wrote:
> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index 2709e46f7e5..60b45601e9b 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -1225,6 +1225,11 @@ (define_predicate "fpmask_comparison_operator"
>  (define_predicate "invert_fpmask_comparison_operator"
>    (match_code "ne,unlt,unle"))
>  
> +;; Return 1 if OP is either a fpmask_comparison_operator or
> +;; invert_fpmask_comparsion_operator.
> +(define_predicate "fpmask_normal_or_invert_operator"
> +  (match_code "eq,gt,ge,ne,unlt,unle"))

Keep "comparison" in the name?  Maybe "any_fpmask_comparison_operator",
we have other things named in that scheme already.

> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -15177,6 +15177,10 @@ have_compare_and_set_mask (machine_mode mode)
>      case DFmode:
>        return TARGET_P9_MINMAX;
>  
> +    case KFmode:
> +    case TFmode:
> +      return FLOAT128_IEEE_MINMAX_P (mode);

That needs the E_ stuff as well.

> +;; Secondary iterator for scalar binary floating point operations.  This is
> +;; used for the conditional moves when we have a compare and set mask
> +;; instruction.  Using this instruction allows us to do a conditional move
> +;; where the comparison type might be different from the values being moved.
> +(define_mode_iterator FSCALAR2 [SF
> +				DF
> +				(KF "FLOAT128_IEEE_MINMAX_P (KFmode)")
> +				(TF "FLOAT128_IEEE_MINMAX_P (TFmode)")])

Needs a name change just like FSCALAR.  Maybe BFP?  Or better, just FP,
and rename the current FP to something else (it is only used for cstore
and cbranch, it should use a much less generic name there).

Please cut down the comment.  See GPR/GPR2 for example:

; This mode iterator allows :GPR to be used to indicate the allowable size
; of whole values in GPRs.
(define_mode_iterator GPR [SI (DI "TARGET_POWERPC64")])

; And again, for patterns that need two (potentially) different integer modes.
(define_mode_iterator GPR2 [SI (DI "TARGET_POWERPC64")])

It should not talk about an example where it is used: it can much easier
say something much more generic!

(And then send a patch first doing FP just as SFDF and replacing it
where we want it; and then a later patch adding KF.  That way, your
patch might be readable!)

Thanks,

Segher

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2020-09-15 21:21 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-27  2:41 [PATCH] Add power10 IEEE 128-bit minimum, maximum, and compare with mask instructions Michael Meissner
2020-08-27  2:43 ` [PATCH 1/4] PowerPC: Change cmove function return to bool Michael Meissner
2020-08-27 20:47   ` will schmidt
2020-09-10 19:15   ` Segher Boessenkool
2020-08-27  2:44 ` [PATCH 2/4] PowerPC: Rename functions for min, max, cmove Michael Meissner
2020-08-27 20:47   ` will schmidt
2020-09-10 19:18     ` Segher Boessenkool
2020-09-10 19:29   ` Segher Boessenkool
2020-09-11 22:15     ` [PATCH 2/4, revised patch applied] " Michael Meissner
2020-09-15 18:38       ` Alexandre Oliva
2020-09-15 19:51         ` Peter Bergner
2020-09-15 19:58         ` Segher Boessenkool
2020-08-27  2:45 ` [PATCH 3/4] PowerPC: Add power10 xsmaxcqp/xsmincqp support Michael Meissner
2020-08-27 20:47   ` will schmidt
2020-08-28  4:09     ` Michael Meissner
2020-09-10 20:18       ` Segher Boessenkool
2020-09-10 21:01   ` Segher Boessenkool
2020-08-27  2:46 ` [PATCH 4/4] PowerPC: Add power10 xscmp{eq,gt,ge}qp support Michael Meissner
2020-08-27 20:47   ` will schmidt
2020-09-15 21:20   ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).