[PATCH 0/2] Power10 IEEE 128-bit min, max, cmove

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH 0/2] Power10 IEEE 128-bit min, max, cmove
@ 2020-11-16  4:45 Michael Meissner
  2020-11-16  4:50 ` [PATCH 1/2] Power10: Add IEEE 128-bit xsmaxcqp and xsmincqp support Michael Meissner
  2020-11-16  4:53 ` [PATCH 2/2] Power10: Add IEEE 128-bit fp conditional move Michael Meissner
  0 siblings, 2 replies; 7+ messages in thread
From: Michael Meissner @ 2020-11-16  4:45 UTC (permalink / raw)
  To: gcc-patches, Michael Meissner, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

These two patches add support for the XSMAXCQP, XSMINCQP, XSCMP{EQ,GT,GE}QP
instructions in the Power10.  These instructions allow the compiler to generate
minimum, maxmimum, and conditional move support for IEEE 128-bit floating point
type.

I have posted versions of these patches before, going back to at least the June
time frame.  Because I wanted to focus on getting support in the library to
allow long double to be configured as IEEE 128-bit on little endian PowerPC
server systems, I haven't reposted these patches in awhile.  While the other
patches are more important, it would be nice to get these patches into GCC 11.

In these patches, I attempted to address various comments from the previous
times I posted the patches.

There are 2 patches in this set.  The first adds the min/max support if
-ffast-math is used.  The second adds conditional move support, and it enables
the min/max to be generated more often.

I have done bootstrap compilers on both little endian power9 systems and big
endian power8 system many time.  I would like to commit these patches to the
master branch.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] Power10: Add IEEE 128-bit xsmaxcqp and xsmincqp support.
  2020-11-16  4:45 [PATCH 0/2] Power10 IEEE 128-bit min, max, cmove Michael Meissner
@ 2020-11-16  4:50 ` Michael Meissner
  2020-12-04  4:35   ` Ping: " Michael Meissner
  2020-12-10 15:38   ` Ping x2: " Michael Meissner
  2020-11-16  4:53 ` [PATCH 2/2] Power10: Add IEEE 128-bit fp conditional move Michael Meissner
  1 sibling, 2 replies; 7+ messages in thread
From: Michael Meissner @ 2020-11-16  4:50 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

Power10: Add IEEE 128-bit xsmaxcqp and xsmincqp support.

This patch adds the support for the IEEE 128-bit floating point C minimum and
maximum instructions.  The next patch will add the support for using the
compare and set mask instruction to implement conditional moves.

Originally, I tried to add the min/max instructions via a super defination that
covers all of the types.  In this patch, based on patch feedback, I rewrote the
patch to be simpler and just provide the new instructions as a separate insn.

I have built little endian power9 bootstrap compilers with these patches in it,
and there were no regressions.  I have also build big endian power8 bootstrap
compilers, and there were no regressions.  Can I check this into the master
branch?

gcc/
2020-11-15  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
	3.1 IEEE 128-bit floating point xsmaxcqp and xsmincqp instructions.
	* config/rs6000/rs60000.h (FLOAT128_MIN_MAX_FPMASK_P): New macro.
	* config/rs6000/rs6000.md (s<minmax><mode>3): Add support for the
	ISA 3.1 IEEE 128-bit minimum and maximum instructions.

gcc/testsuite/
2020-11-15  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/float128-minmax-2.c: New test.
---
 gcc/config/rs6000/rs6000.c                        |  3 ++-
 gcc/config/rs6000/rs6000.h                        |  5 +++++
 gcc/config/rs6000/rs6000.md                       | 11 +++++++++++
 .../gcc.target/powerpc/float128-minmax-2.c        | 15 +++++++++++++++
 4 files changed, 33 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index d7dcd93f088..f6a1f63e842 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15741,7 +15741,8 @@ rs6000_emit_minmax (rtx dest, enum rtx_code code, rtx op0, rtx op1)
   /* VSX/altivec have direct min/max insns.  */
   if ((code == SMAX || code == SMIN)
       && (VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode)
-	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))))
+	  || (mode == SFmode && VECTOR_UNIT_VSX_P (DFmode))
+	  || FLOAT128_MIN_MAX_FPMASK_P (mode)))
     {
       emit_insn (gen_rtx_SET (dest, gen_rtx_fmt_ee (code, mode, op0, op1)));
       return;
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 5a47aa14722..886559dbfdf 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -345,6 +345,11 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
    || ((MODE) == TDmode)						\
    || (!TARGET_FLOAT128_TYPE && FLOAT128_IEEE_P (MODE)))
 
+/* Macro whether the float128 minimum, maximum, and set compare mask
+   instructions are enabled.  */
+#define FLOAT128_MIN_MAX_FPMASK_P(MODE)					\
+  (TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (MODE))
+
 /* Return true for floating point that does not use a vector register.  */
 #define SCALAR_FLOAT_MODE_NOT_VECTOR_P(MODE)				\
   (SCALAR_FLOAT_MODE_P (MODE) && !FLOAT128_VECTOR_P (MODE))
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 5e5ad9f7c3d..d8fbac124fb 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5163,6 +5163,17 @@ (define_insn "*s<minmax><mode>3_vsx"
 }
   [(set_attr "type" "fp")])
 
+;; Min/max for ISA 3.1 IEEE 128-bit floating point
+(define_insn "s<minmax><mode>3"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(fp_minmax:IEEE128
+	 (match_operand:IEEE128 1 "altivec_register_operand" "v")
+	 (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
+  "TARGET_POWER10"
+  "xs<minmax>cqp %0,%1,%2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
 ;; The conditional move instructions allow us to perform max and min operations
 ;; even when we don't have the appropriate max/min instruction using the FSEL
 ;; instruction.
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
new file mode 100644
index 00000000000..c71ba08c9f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c
@@ -0,0 +1,15 @@
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return __builtin_fminf128 (a, b); }
+TYPE f128_max (TYPE a, TYPE b) { return __builtin_fmaxf128 (a, b); }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
-- 
2.22.0


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 2/2] Power10: Add IEEE 128-bit fp conditional move
  2020-11-16  4:45 [PATCH 0/2] Power10 IEEE 128-bit min, max, cmove Michael Meissner
  2020-11-16  4:50 ` [PATCH 1/2] Power10: Add IEEE 128-bit xsmaxcqp and xsmincqp support Michael Meissner
@ 2020-11-16  4:53 ` Michael Meissner
  2020-12-04  4:37   ` Ping: " Michael Meissner
  2020-12-10 15:40   ` Ping x2: " Michael Meissner
  1 sibling, 2 replies; 7+ messages in thread
From: Michael Meissner @ 2020-11-16  4:53 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

Power10: Add IEEE 128-bit fp conditional move.

This patch adds the support for power10 IEEE 128-bit floating point conditional
move and for automatically generating min/max.  Unlike the previous patch, I
decided to keep two separate patterns for fpmask before splitting (one pattern
for normal compares, and the other pattern for inverted compares).  I can go
back to a single pattern with a new predicate that allows either comparison.

Compared to the original code, these patterns do simplify the fpmask insns to
having one alternative instead of two.  In the original code, the first
alternative tried to use the result as a temporary register.  But that doesn't
work if you are doing a conditional move with SF/DF types, but the comparison
is KF/TF.  That is because the SF/DF types can use the traditional FPR
registers, but IEEE 128-bit floating point can only do arithmetic in the
traditional Altivec registers.

This code also has to insert a XXPERMDI if you are moving KF/TF values, but
the comparison is done with SF/DF values.  In this case, the set and compare
mask for SF/DF clears the bottom 64-bits of the register, and the XXPERMDI is
needed to fill it.

I have tested these patches with a bootstrap compiler on a little endian power9
system and also on a big endian power8 system.  There were no regressions.  Can
I check this patch into the master branch?

gcc/
2020-11-15 Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (have_compare_and_set_mask): Add IEEE
	128-bit floating point types.
	* config/rs6000/rs6000.md (FPMASK): New iterator.
	(FPMASK2): New iterator.
	(Fv mode attribute): Add KFmode and TFmode.
	(mov<FPMASK:mode><FPMASK2:mode>cc_fpmask): Replace
	mov<SFDF:mode><SFDF2:mode>cc_p9.  Add IEEE 128-bit fp support.
	(mov<FPMASK:mode><FPMASK2:mode>cc_invert_fpmask): Replace
	mov<SFDF:mode><SFDF2:mode>cc_invert_p9.  Add IEEE 128-bit fp
	support.
	(fpmask<mode>): Add IEEE 128-bit fp support.  Enable generator to
	build te RTL.
	(xxsel<mode>): Add IEEE 128-bit fp support.  Enable generator to
	build te RTL.

gcc/testsuite/
2020-11-15  Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/float128-cmove.c: New test.
	* gcc.target/powerpc/float128-minmax-3.c: New test.
---
 gcc/config/rs6000/rs6000.c                    |   8 +-
 gcc/config/rs6000/rs6000.md                   | 188 ++++++++++++------
 .../gcc.target/powerpc/float128-cmove.c       |  93 +++++++++
 .../gcc.target/powerpc/float128-minmax-3.c    |  15 ++
 4 files changed, 237 insertions(+), 67 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-cmove.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index f6a1f63e842..b6fd21a5d6f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -15336,8 +15336,8 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false,
   return 1;
 }
 
-/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum or
-   minimum with "C" semantics.
+/* Possibly emit the xsmaxc{dp,qp} and xsminc{dp,qp} instructions to emit a
+   maximum or minimum with "C" semantics.
 
    Unless you use -ffast-math, you can't use these instructions to replace
    conditions that implicitly reverse the condition because the comparison
@@ -15473,6 +15473,10 @@ have_compare_and_set_mask (machine_mode mode)
     case E_DFmode:
       return TARGET_P9_MINMAX;
 
+    case E_KFmode:
+    case E_TFmode:
+      return FLOAT128_MIN_MAX_FPMASK_P (mode);
+
     default:
       break;
     }
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index d8fbac124fb..0a40f3acf78 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -564,6 +564,19 @@ (define_mode_iterator SFDF [SF DF])
 ; And again, for when we need two FP modes in a pattern.
 (define_mode_iterator SFDF2 [SF DF])
 
+; Floating scalars that supports the set compare mask instruction.
+(define_mode_iterator FPMASK [SF
+			      DF
+			      (KF "FLOAT128_MIN_MAX_FPMASK_P (KFmode)")
+			      (TF "FLOAT128_MIN_MAX_FPMASK_P (TFmode)")])
+
+; And again, for patterns that need two (potentially) different floating point
+; scalars that support the set compare mask instruction.
+(define_mode_iterator FPMASK2 [SF
+			       DF
+			       (KF "FLOAT128_MIN_MAX_FPMASK_P (KFmode)")
+			       (TF "FLOAT128_MIN_MAX_FPMASK_P (TFmode)")])
+
 ; A generic s/d attribute, for sp/dp for example.
 (define_mode_attr sd [(SF   "s") (DF   "d")
 		      (V4SF "s") (V2DF "d")])
@@ -597,8 +610,13 @@ (define_mode_attr Ff		[(SF "f") (DF "d") (DI "d")])
 ; SF/DF constraint for arithmetic on VSX registers using instructions added in
 ; ISA 2.06 (power7).  This includes instructions that normally target DF mode,
 ; but are used on SFmode, since internally SFmode values are kept in the DFmode
-; format.
-(define_mode_attr Fv		[(SF "wa") (DF "wa") (DI "wa")])
+; format.  Also include IEEE 128-bit instructions which are restricted to the
+; Altivec registers.
+(define_mode_attr Fv		[(SF "wa")
+				 (DF "wa")
+				 (DI "wa")
+				 (KF "v")
+				 (TF "v")])
 
 ; Which isa is needed for those float instructions?
 (define_mode_attr Fisa		[(SF "p8v")  (DF "*") (DI "*")])
@@ -5285,10 +5303,10 @@ (define_insn "*setnbcr_<un>signed_<GPR:mode>"
 
 ;; Floating point conditional move
 (define_expand "mov<mode>cc"
-   [(set (match_operand:SFDF 0 "gpc_reg_operand")
-	 (if_then_else:SFDF (match_operand 1 "comparison_operator")
-			    (match_operand:SFDF 2 "gpc_reg_operand")
-			    (match_operand:SFDF 3 "gpc_reg_operand")))]
+   [(set (match_operand:FPMASK 0 "gpc_reg_operand")
+	 (if_then_else:FPMASK (match_operand 1 "comparison_operator")
+			      (match_operand:FPMASK 2 "gpc_reg_operand")
+			      (match_operand:FPMASK 3 "gpc_reg_operand")))]
   "TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT"
 {
   if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3]))
@@ -5308,92 +5326,132 @@ (define_insn "*fsel<SFDF:mode><SFDF2:mode>4"
   "fsel %0,%1,%2,%3"
   [(set_attr "type" "fp")])
 
-(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_p9"
-  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
-	(if_then_else:SFDF
+(define_insn_and_split "*mov<FPMASK:mode><FPMASK2:mode>cc_fpmask"
+  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=<FPMASK:Fv>")
+	(if_then_else:FPMASK
 	 (match_operator:CCFP 1 "fpmask_comparison_operator"
-		[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
-		 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
-	 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
-	 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
-   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
+	    [(match_operand:FPMASK2 2 "vsx_register_operand" "<FPMASK2:Fv>")
+	     (match_operand:FPMASK2 3 "vsx_register_operand" "<FPMASK2:Fv>")])
+	 (match_operand:FPMASK 4 "vsx_register_operand" "<FPMASK:Fv>")
+	 (match_operand:FPMASK 5 "vsx_register_operand" "<FPMASK:Fv>")))
+   (clobber (match_scratch:V2DI 6 "=&<FPMASK2:Fv>"))]
   "TARGET_P9_MINMAX"
   "#"
-  ""
-  [(set (match_dup 6)
-	(if_then_else:V2DI (match_dup 1)
-			   (match_dup 7)
-			   (match_dup 8)))
-   (set (match_dup 0)
-	(if_then_else:SFDF (ne (match_dup 6)
-			       (match_dup 8))
-			   (match_dup 4)
-			   (match_dup 5)))]
+  "&& 1"
+  [(pc)]
 {
-  if (GET_CODE (operands[6]) == SCRATCH)
-    operands[6] = gen_reg_rtx (V2DImode);
+  rtx dest = operands[0];
+  rtx cmp = operands[1];
+  rtx cmp_op1 = operands[2];
+  rtx cmp_op2 = operands[3];
+  rtx move_t = operands[4];
+  rtx move_f = operands[5];
+  rtx mask_reg = operands[6];
+  rtx mask_m1 = CONSTM1_RTX (V2DImode);
+  rtx mask_0 = CONST0_RTX (V2DImode);
+  machine_mode move_mode = <FPMASK:MODE>mode;
+  machine_mode compare_mode = <FPMASK2:MODE>mode;
 
-  operands[7] = CONSTM1_RTX (V2DImode);
-  operands[8] = CONST0_RTX (V2DImode);
+  if (GET_CODE (mask_reg) == SCRATCH)
+    mask_reg = gen_reg_rtx (V2DImode);
+
+  /* Emit the compare and set mask instruction.  */
+  emit_insn (gen_fpmask<FPMASK2:mode> (mask_reg, cmp, cmp_op1, cmp_op2,
+				       mask_m1, mask_0));
+
+  /* If we have a 64-bit comparison, but an 128-bit move, we need to extend the
+     mask.  Because we are using the splat builtin to extend the V2DImode, we
+     need to use element 1 on little endian systems.  */
+  if (!FLOAT128_IEEE_P (compare_mode) && FLOAT128_IEEE_P (move_mode))
+    {
+      rtx element = WORDS_BIG_ENDIAN ? const0_rtx : const1_rtx;
+      emit_insn (gen_vsx_xxspltd_v2di (mask_reg, mask_reg, element));
+    }
+
+  /* Now emit the XXSEL insn.  */
+  emit_insn (gen_xxsel<FPMASK:mode> (dest, mask_reg, mask_0, move_t, move_f));
+  DONE;
 }
- [(set_attr "length" "8")
+ ;; length is 12 in case we need to add XXPERMDI
+ [(set_attr "length" "12")
   (set_attr "type" "vecperm")])
 
 ;; Handle inverting the fpmask comparisons.
-(define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_invert_p9"
-  [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>")
-	(if_then_else:SFDF
+(define_insn_and_split "*mov<FPMASK:mode><FPMASK2:mode>cc_invert_fpmask"
+  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=<FPMASK:Fv>")
+	(if_then_else:FPMASK
 	 (match_operator:CCFP 1 "invert_fpmask_comparison_operator"
-		[(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")
-		 (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")])
-	 (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")
-	 (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>")))
-   (clobber (match_scratch:V2DI 6 "=0,&wa"))]
+	    [(match_operand:FPMASK2 2 "vsx_register_operand" "<FPMASK2:Fv>")
+	     (match_operand:FPMASK2 3 "vsx_register_operand" "<FPMASK2:Fv>")])
+	 (match_operand:FPMASK 4 "vsx_register_operand" "<FPMASK:Fv>")
+	 (match_operand:FPMASK 5 "vsx_register_operand" "<FPMASK:Fv>")))
+   (clobber (match_scratch:V2DI 6 "=&<FPMASK2:Fv>"))]
   "TARGET_P9_MINMAX"
   "#"
   "&& 1"
-  [(set (match_dup 6)
-	(if_then_else:V2DI (match_dup 9)
-			   (match_dup 7)
-			   (match_dup 8)))
-   (set (match_dup 0)
-	(if_then_else:SFDF (ne (match_dup 6)
-			       (match_dup 8))
-			   (match_dup 5)
-			   (match_dup 4)))]
+  [(pc)]
 {
-  rtx op1 = operands[1];
-  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (op1));
+  rtx dest = operands[0];
+  rtx old_cmp = operands[1];
+  rtx cmp_op1 = operands[2];
+  rtx cmp_op2 = operands[3];
+  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (old_cmp));
+  rtx cmp_rev = gen_rtx_fmt_ee (cond, CCFPmode, cmp_op1, cmp_op2);
+  rtx move_f = operands[4];
+  rtx move_t = operands[5];
+  rtx mask_reg = operands[6];
+  rtx mask_m1 = CONSTM1_RTX (V2DImode);
+  rtx mask_0 = CONST0_RTX (V2DImode);
+  machine_mode move_mode = <FPMASK:MODE>mode;
+  machine_mode compare_mode = <FPMASK2:MODE>mode;
+
+  if (GET_CODE (mask_reg) == SCRATCH)
+    mask_reg = gen_reg_rtx (V2DImode);
 
-  if (GET_CODE (operands[6]) == SCRATCH)
-    operands[6] = gen_reg_rtx (V2DImode);
+  /* Emit the compare and set mask instruction.  */
+  emit_insn (gen_fpmask<FPMASK2:mode> (mask_reg, cmp_rev, cmp_op1, cmp_op2,
+				       mask_m1, mask_0));
 
-  operands[7] = CONSTM1_RTX (V2DImode);
-  operands[8] = CONST0_RTX (V2DImode);
+  /* If we have a 64-bit comparison, but an 128-bit move, we need to extend the
+     mask.  Because we are using the splat builtin to extend the V2DImode, we
+     need to use element 1 on little endian systems.  */
+  if (!FLOAT128_IEEE_P (compare_mode) && FLOAT128_IEEE_P (move_mode))
+    {
+      rtx element = WORDS_BIG_ENDIAN ? const0_rtx : const1_rtx;
+      emit_insn (gen_vsx_xxspltd_v2di (mask_reg, mask_reg, element));
+    }
 
-  operands[9] = gen_rtx_fmt_ee (cond, CCFPmode, operands[2], operands[3]);
+  /* Now emit the XXSEL insn.  */
+  emit_insn (gen_xxsel<FPMASK:mode> (dest, mask_reg, mask_0, move_t, move_f));
+  DONE;
 }
- [(set_attr "length" "8")
+ ;; length is 12 in case we need to add XXPERMDI
+ [(set_attr "length" "12")
   (set_attr "type" "vecperm")])
 
-(define_insn "*fpmask<mode>"
-  [(set (match_operand:V2DI 0 "vsx_register_operand" "=wa")
+(define_insn "fpmask<mode>"
+  [(set (match_operand:V2DI 0 "vsx_register_operand" "=<Fv>")
 	(if_then_else:V2DI
 	 (match_operator:CCFP 1 "fpmask_comparison_operator"
-		[(match_operand:SFDF 2 "vsx_register_operand" "<Fv>")
-		 (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")])
+		[(match_operand:FPMASK 2 "vsx_register_operand" "<Fv>")
+		 (match_operand:FPMASK 3 "vsx_register_operand" "<Fv>")])
 	 (match_operand:V2DI 4 "all_ones_constant" "")
 	 (match_operand:V2DI 5 "zero_constant" "")))]
   "TARGET_P9_MINMAX"
-  "xscmp%V1dp %x0,%x2,%x3"
+{
+  return (FLOAT128_IEEE_P (<MODE>mode)
+	  ? "xscmp%V1qp %0,%2,%3"
+	  : "xscmp%V1dp %x0,%x2,%x3");
+}
   [(set_attr "type" "fpcompare")])
 
-(define_insn "*xxsel<mode>"
-  [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>")
-	(if_then_else:SFDF (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
-			       (match_operand:V2DI 2 "zero_constant" ""))
-			   (match_operand:SFDF 3 "vsx_register_operand" "<Fv>")
-			   (match_operand:SFDF 4 "vsx_register_operand" "<Fv>")))]
+(define_insn "xxsel<mode>"
+  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=wa")
+	(if_then_else:FPMASK
+	 (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
+	     (match_operand:V2DI 2 "zero_constant" ""))
+	 (match_operand:FPMASK 3 "vsx_register_operand" "wa")
+	 (match_operand:FPMASK 4 "vsx_register_operand" "wa")))]
   "TARGET_P9_MINMAX"
   "xxsel %x0,%x4,%x3,%x1"
   [(set_attr "type" "vecmove")])
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmove.c b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
new file mode 100644
index 00000000000..639d5a77087
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
@@ -0,0 +1,93 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+/* { dg-final { scan-assembler     {\mxscmpeq[dq]p\M} } } */
+/* { dg-final { scan-assembler     {\mxxpermdi\M}     } } */
+/* { dg-final { scan-assembler     {\mxxsel\M}        } } */
+/* { dg-final { scan-assembler-not {\mxscmpu[dq]p\M}  } } */
+/* { dg-final { scan-assembler-not {\mfcmp[uo]\M}     } } */
+/* { dg-final { scan-assembler-not {\mfsel\M}         } } */
+
+/* This series of tests tests whether you can do a conditional move where the
+   test is one floating point type, and the result is another floating point
+   type.
+
+   If the comparison type is SF/DFmode, and the move type is IEEE 128-bit
+   floating point, we have to duplicate the mask in the lower 64-bits with
+   XXPERMDI because XSCMPEQDP clears the bottom 64-bits of the mask register.
+
+   Going the other way (IEEE 128-bit comparsion, 64-bit move) is fine as the
+   mask word will be 128-bits.  */
+
+float
+eq_f_d (float a, float b, double x, double y)
+{
+  return (x == y) ? a : b;
+}
+
+double
+eq_d_f (double a, double b, float x, float y)
+{
+  return (x == y) ? a : b;
+}
+
+float
+eq_f_f128 (float a, float b, __float128 x, __float128 y)
+{
+  return (x == y) ? a : b;
+}
+
+double
+eq_d_f128 (double a, double b, __float128 x, __float128 y)
+{
+  return (x == y) ? a : b;
+}
+
+__float128
+eq_f128_f (__float128 a, __float128 b, float x, float y)
+{
+  return (x == y) ? a : b;
+}
+
+__float128
+eq_f128_d (__float128 a, __float128 b, double x, double y)
+{
+  return (x != y) ? a : b;
+}
+
+float
+ne_f_d (float a, float b, double x, double y)
+{
+  return (x != y) ? a : b;
+}
+
+double
+ne_d_f (double a, double b, float x, float y)
+{
+  return (x != y) ? a : b;
+}
+
+float
+ne_f_f128 (float a, float b, __float128 x, __float128 y)
+{
+  return (x != y) ? a : b;
+}
+
+double
+ne_d_f128 (double a, double b, __float128 x, __float128 y)
+{
+  return (x != y) ? a : b;
+}
+
+__float128
+ne_f128_f (__float128 a, __float128 b, float x, float y)
+{
+  return (x != y) ? a : b;
+}
+
+__float128
+ne_f128_d (__float128 a, __float128 b, double x, double y)
+{
+  return (x != y) ? a : b;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
new file mode 100644
index 00000000000..6f7627c0f2a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c
@@ -0,0 +1,15 @@
+/* { dg-require-effective-target ppc_float128_hw } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+#ifndef TYPE
+#define TYPE _Float128
+#endif
+
+/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a
+   call.  */
+TYPE f128_min (TYPE a, TYPE b) { return (a < b) ? a : b; }
+TYPE f128_max (TYPE a, TYPE b) { return (b > a) ? b : a; }
+
+/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
+/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
-- 
2.22.0


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Ping: [PATCH 1/2] Power10: Add IEEE 128-bit xsmaxcqp and xsmincqp support.
  2020-11-16  4:50 ` [PATCH 1/2] Power10: Add IEEE 128-bit xsmaxcqp and xsmincqp support Michael Meissner
@ 2020-12-04  4:35   ` Michael Meissner
  2020-12-10 15:38   ` Ping x2: " Michael Meissner
  1 sibling, 0 replies; 7+ messages in thread
From: Michael Meissner @ 2020-12-04  4:35 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

I haven't received a reply for this patch:

| Date: Sun, 15 Nov 2020 23:50:51 -0500
| Subject: [PATCH 1/2] Power10: Add IEEE 128-bit xsmaxcqp and xsmincqp support.
| Message-ID: <20201116045051.GA3952@ibm-toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559166.html

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Ping: [PATCH 2/2] Power10: Add IEEE 128-bit fp conditional move
  2020-11-16  4:53 ` [PATCH 2/2] Power10: Add IEEE 128-bit fp conditional move Michael Meissner
@ 2020-12-04  4:37   ` Michael Meissner
  2020-12-10 15:40   ` Ping x2: " Michael Meissner
  1 sibling, 0 replies; 7+ messages in thread
From: Michael Meissner @ 2020-12-04  4:37 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

I haven't received a reply for this patch:

| Date: Sun, 15 Nov 2020 23:53:20 -0500
| Subject: [PATCH 2/2] Power10: Add IEEE 128-bit fp conditional move
| Message-ID: <20201116045320.GB3952@ibm-toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559167.html

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Ping x2: [PATCH 1/2] Power10: Add IEEE 128-bit xsmaxcqp and xsmincqp support.
  2020-11-16  4:50 ` [PATCH 1/2] Power10: Add IEEE 128-bit xsmaxcqp and xsmincqp support Michael Meissner
  2020-12-04  4:35   ` Ping: " Michael Meissner
@ 2020-12-10 15:38   ` Michael Meissner
  1 sibling, 0 replies; 7+ messages in thread
From: Michael Meissner @ 2020-12-10 15:38 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

This patch has been around for quite some time.  It isn't critical for enabling
IEEE 128-bit long double, but it improves code generation for float128 on
power10.

I haven't received a reply for this patch:

| Date: Sun, 15 Nov 2020 23:50:51 -0500
| Subject: [PATCH 1/2] Power10: Add IEEE 128-bit xsmaxcqp and xsmincqp support.
| Message-ID: <20201116045051.GA3952@ibm-toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559166.html

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Ping x2: [PATCH 2/2] Power10: Add IEEE 128-bit fp conditional move
  2020-11-16  4:53 ` [PATCH 2/2] Power10: Add IEEE 128-bit fp conditional move Michael Meissner
  2020-12-04  4:37   ` Ping: " Michael Meissner
@ 2020-12-10 15:40   ` Michael Meissner
  1 sibling, 0 replies; 7+ messages in thread
From: Michael Meissner @ 2020-12-10 15:40 UTC (permalink / raw)
  To: Michael Meissner, gcc-patches, Segher Boessenkool,
	David Edelsohn, Bill Schmidt, Peter Bergner

This needs the first patch in the series to be applied first.  This patch is
not critical for enabling IEEE 128-bit long double, but it does improve
float128 code generation on power10.

I haven't received a reply for this patch:

| Date: Sun, 15 Nov 2020 23:53:20 -0500
| Subject: [PATCH 2/2] Power10: Add IEEE 128-bit fp conditional move
| Message-ID: <20201116045320.GB3952@ibm-toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559167.html

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-12-10 15:40 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-16  4:45 [PATCH 0/2] Power10 IEEE 128-bit min, max, cmove Michael Meissner
2020-11-16  4:50 ` [PATCH 1/2] Power10: Add IEEE 128-bit xsmaxcqp and xsmincqp support Michael Meissner
2020-12-04  4:35   ` Ping: " Michael Meissner
2020-12-10 15:38   ` Ping x2: " Michael Meissner
2020-11-16  4:53 ` [PATCH 2/2] Power10: Add IEEE 128-bit fp conditional move Michael Meissner
2020-12-04  4:37   ` Ping: " Michael Meissner
2020-12-10 15:40   ` Ping x2: " Michael Meissner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).