public inbox for gcc-patches@gcc.gnu.org
* [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973]
@ 2022-01-14 22:56 Jakub Jelinek
  2022-01-15  8:29 ` Uros Bizjak
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Jelinek @ 2022-01-14 22:56 UTC
  To: Richard Biener, Uros Bizjak; +Cc: gcc-patches

Hi!

C++20:
#include <compare>
auto cmp4way(double a, double b)
{
  return a <=> b;
}
expands to:
        ucomisd %xmm1, %xmm0
        jp      .L8
        movl    $0, %eax
        jne     .L8
.L2:
        ret
        .p2align 4,,10
        .p2align 3
.L8:
        comisd  %xmm0, %xmm1
        movl    $-1, %eax
        ja      .L2
        ucomisd %xmm1, %xmm0
        setbe   %al
        addl    $1, %eax
        ret
That is 3 comparisons of the same operands.
The following patch improves it to just one comparison:
        comisd  %xmm1, %xmm0
        jp      .L4
        seta    %al
        movl    $0, %edx
        leal    -1(%rax,%rax), %eax
        cmove   %edx, %eax
        ret
.L4:
        movl    $2, %eax
        ret
a <=> b expands to a == b ? 0 : a < b ? -1 : a > b ? 1 : 2.
The first comparison is an equality test, which shouldn't raise
exceptions on qNaN operands.  But if the operands aren't equal (which
includes the unordered cases), a < or > comparison follows immediately,
and that raises exceptions even on qNaNs.  So we can just perform a
single comparison that raises exceptions on qNaN.
As the 4 different cases are encoded as
ZF CF PF
1  1  1  a unordered b
0  0  0  a > b
0  1  0  a < b
1  0  0  a == b
we can emit an optimal sequence of conditional jumps, first jp
for the unordered case, then je for the == case and finally jb
for the < case.
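
As an illustration only (this is not part of the patch), the decode order
described above can be modelled in plain C.  The struct fp_flags and the
functions model_comi, decode_spaceship and check_flag_decode are
hypothetical names made up for this sketch; the flag values are taken
from the table above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the three flags from the table above.  */
struct fp_flags { bool zf, cf, pf; };

/* Model the flags a single comisd-style comparison of A with B would
   produce, following the ZF/CF/PF table.  */
static struct fp_flags
model_comi (double a, double b)
{
  struct fp_flags f = { false, false, false };
  if (a != a || b != b)		/* unordered: ZF = CF = PF = 1 */
    f.zf = f.cf = f.pf = true;
  else if (a == b)		/* equal: only ZF set */
    f.zf = true;
  else if (a < b)		/* less: only CF set */
    f.cf = true;
  return f;			/* greater: all three clear */
}

/* Decode in the order jp, je, jb; what falls through is a > b.  */
static int
decode_spaceship (struct fp_flags f)
{
  if (f.pf)
    return 2;
  if (f.zf)
    return 0;
  if (f.cf)
    return -1;
  return 1;
}

/* Small self-check covering all four rows of the table.  */
static void
check_flag_decode (void)
{
  volatile double zero = 0.0;
  double nan = zero / zero;
  assert (decode_spaceship (model_comi (nan, 5.0)) == 2);
  assert (decode_spaceship (model_comi (5.0, -5.0)) == 1);
  assert (decode_spaceship (model_comi (-5.0, 5.0)) == -1);
  assert (decode_spaceship (model_comi (5.0, 5.0)) == 0);
}
```

PF has to be tested first because the unordered row also sets ZF and CF,
so a je or jb placed before the jp would misclassify NaN operands.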

The patch pattern recognizes spaceship-like comparisons during
widening_mul if the spaceship optab is implemented, and replaces
those comparisons with comparisons of .SPACESHIP ifn which returns
-1/0/1/2 based on the comparison.  This seems to work well both for the
case of just returning the -1/0/1/2 (when we have just a common
successor with a PHI) or when the different cases are handled with
various other basic blocks.  The testcases cover both of those cases,
the latter with different function calls in those.
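
A rough sketch, in plain C rather than GIMPLE, of the source shapes the
testcases exercise and of the -1/0/1/2 value they all reduce to.  The
function names (spaceship_ref, spaceship_swapped, spaceship_ne,
check_variants) are made up for this illustration and do not appear in
the patch; the bodies mirror foo, baz and qux from pr103973-1.c.

```c
#include <assert.h>

/* Reference value, as in the a == b ? 0 : ... expansion above.  */
static int
spaceship_ref (double a, double b)
{
  return a == b ? 0 : a < b ? -1 : a > b ? 1 : 2;
}

/* Second comparison written with swapped operands (like baz).  */
static int
spaceship_swapped (double a, double b)
{
  if (a == b)
    return 0;
  if (b < a)
    return 1;
  if (a < b)
    return -1;
  return 2;
}

/* First comparison is !=, followed by <= and >= (like qux).  */
static int
spaceship_ne (double a, double b)
{
  if (a != b)
    {
      if (a <= b)
	return -1;
      if (a >= b)
	return 1;
      return 2;
    }
  return 0;
}

/* All three shapes agree, including on NaN operands.  */
static void
check_variants (void)
{
  volatile double zero = 0.0;
  double nan = zero / zero;
  const double v[] = { -5.0, 0.0, 5.0, nan };
  for (int i = 0; i < 4; i++)
    for (int j = 0; j < 4; j++)
      {
	int r = spaceship_ref (v[i], v[j]);
	assert (spaceship_swapped (v[i], v[j]) == r);
	assert (spaceship_ne (v[i], v[j]) == r);
      }
}
```

In the != form, <= and >= behave like < and > because equality has
already been excluded, and all four ordering comparisons are false on
NaN, so the final fallthrough is always the unordered case.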

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-01-14  Jakub Jelinek  <jakub@redhat.com>

	PR target/103973
	* tree-cfg.h (cond_only_block_p): Declare.
	* tree-ssa-phiopt.c (cond_only_block_p): Move function to ...
	* tree-cfg.c (cond_only_block_p): ... here.  No longer static.
	* optabs.def (spaceship_optab): New optab.
	* internal-fn.def (SPACESHIP): New internal function.
	* internal-fn.h (expand_SPACESHIP): Declare.
	* internal-fn.c (expand_PHI): Formatting fix.
	(expand_SPACESHIP): New function.
	* tree-ssa-math-opts.c (optimize_spaceship): New function.
	(math_opts_dom_walker::after_dom_children): Use it.
	* config/i386/i386.md (spaceship<mode>3): New define_expand.
	* config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare.
	* config/i386/i386-expand.c (ix86_expand_fp_spaceship): New function.
	* doc/md.texi (spaceship@var{m}3): Document.

	* gcc.target/i386/pr103973-1.c: New test.
	* gcc.target/i386/pr103973-2.c: New test.
	* gcc.target/i386/pr103973-3.c: New test.
	* gcc.target/i386/pr103973-4.c: New test.
	* g++.target/i386/pr103973-1.C: New test.
	* g++.target/i386/pr103973-2.C: New test.
	* g++.target/i386/pr103973-3.C: New test.
	* g++.target/i386/pr103973-4.C: New test.

--- gcc/tree-cfg.h.jj	2022-01-13 22:29:07.414943450 +0100
+++ gcc/tree-cfg.h	2022-01-14 12:59:42.147866622 +0100
@@ -111,6 +111,7 @@ extern basic_block gimple_switch_label_b
 extern basic_block gimple_switch_default_bb (function *, gswitch *);
 extern edge gimple_switch_edge (function *, gswitch *, unsigned);
 extern edge gimple_switch_default_edge (function *, gswitch *);
+extern bool cond_only_block_p (basic_block);
 
 /* Return true if the LHS of a call should be removed.  */
 
--- gcc/tree-ssa-phiopt.c.jj	2022-01-13 22:29:07.514942041 +0100
+++ gcc/tree-ssa-phiopt.c	2022-01-14 12:59:42.146866637 +0100
@@ -1958,31 +1958,6 @@ minmax_replacement (basic_block cond_bb,
   return true;
 }
 
-/* Return true if the only executable statement in BB is a GIMPLE_COND.  */
-
-static bool
-cond_only_block_p (basic_block bb)
-{
-  /* BB must have no executable statements.  */
-  gimple_stmt_iterator gsi = gsi_after_labels (bb);
-  if (phi_nodes (bb))
-    return false;
-  while (!gsi_end_p (gsi))
-    {
-      gimple *stmt = gsi_stmt (gsi);
-      if (is_gimple_debug (stmt))
-	;
-      else if (gimple_code (stmt) == GIMPLE_NOP
-	       || gimple_code (stmt) == GIMPLE_PREDICT
-	       || gimple_code (stmt) == GIMPLE_COND)
-	;
-      else
-	return false;
-      gsi_next (&gsi);
-    }
-  return true;
-}
-
 /* Attempt to optimize (x <=> y) cmp 0 and similar comparisons.
    For strong ordering <=> try to match something like:
     <bb 2> :  // cond3_bb (== cond2_bb)
--- gcc/tree-cfg.c.jj	2022-01-13 22:29:07.399943661 +0100
+++ gcc/tree-cfg.c	2022-01-14 12:59:42.148866608 +0100
@@ -9410,6 +9410,31 @@ gimple_switch_default_edge (function *if
   return gimple_switch_edge (ifun, gs, 0);
 }
 
+/* Return true if the only executable statement in BB is a GIMPLE_COND.  */
+
+bool
+cond_only_block_p (basic_block bb)
+{
+  /* BB must have no executable statements.  */
+  gimple_stmt_iterator gsi = gsi_after_labels (bb);
+  if (phi_nodes (bb))
+    return false;
+  while (!gsi_end_p (gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      if (is_gimple_debug (stmt))
+	;
+      else if (gimple_code (stmt) == GIMPLE_NOP
+	       || gimple_code (stmt) == GIMPLE_PREDICT
+	       || gimple_code (stmt) == GIMPLE_COND)
+	;
+      else
+	return false;
+      gsi_next (&gsi);
+    }
+  return true;
+}
+
 
 /* Emit return warnings.  */
 
--- gcc/optabs.def.jj	2022-01-13 22:29:07.371944055 +0100
+++ gcc/optabs.def	2022-01-14 17:10:45.681029356 +0100
@@ -259,6 +259,7 @@ OPTAB_D (usubv4_optab, "usubv$I$a4")
 OPTAB_D (umulv4_optab, "umulv$I$a4")
 OPTAB_D (negv3_optab, "negv$I$a3")
 OPTAB_D (addptr3_optab, "addptr$a3")
+OPTAB_D (spaceship_optab, "spaceship$a3")
 
 OPTAB_D (smul_highpart_optab, "smul$a3_highpart")
 OPTAB_D (umul_highpart_optab, "umul$a3_highpart")
--- gcc/internal-fn.def.jj	2022-01-13 22:29:07.367944112 +0100
+++ gcc/internal-fn.def	2022-01-14 12:59:42.146866637 +0100
@@ -430,6 +430,9 @@ DEF_INTERNAL_FN (NOP, ECF_CONST | ECF_LE
 /* Temporary vehicle for __builtin_shufflevector.  */
 DEF_INTERNAL_FN (SHUFFLEVECTOR, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 
+/* <=> optimization.  */
+DEF_INTERNAL_FN (SPACESHIP, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
+
 #undef DEF_INTERNAL_INT_FN
 #undef DEF_INTERNAL_FLT_FN
 #undef DEF_INTERNAL_FLT_FLOATN_FN
--- gcc/internal-fn.h.jj	2022-01-13 22:29:15.344831763 +0100
+++ gcc/internal-fn.h	2022-01-14 13:31:47.892386409 +0100
@@ -241,6 +241,7 @@ extern void expand_internal_call (gcall
 extern void expand_internal_call (internal_fn, gcall *);
 extern void expand_PHI (internal_fn, gcall *);
 extern void expand_SHUFFLEVECTOR (internal_fn, gcall *);
+extern void expand_SPACESHIP (internal_fn, gcall *);
 
 extern bool vectorized_internal_fn_supported_p (internal_fn, tree);
 
--- gcc/internal-fn.c.jj	2022-01-13 22:29:15.344831763 +0100
+++ gcc/internal-fn.c	2022-01-14 17:23:04.335568234 +0100
@@ -4425,5 +4425,27 @@ expand_SHUFFLEVECTOR (internal_fn, gcall
 void
 expand_PHI (internal_fn, gcall *)
 {
-    gcc_unreachable ();
+  gcc_unreachable ();
+}
+
+void
+expand_SPACESHIP (internal_fn, gcall *stmt)
+{
+  tree lhs = gimple_call_lhs (stmt);
+  tree rhs1 = gimple_call_arg (stmt, 0);
+  tree rhs2 = gimple_call_arg (stmt, 1);
+  tree type = TREE_TYPE (rhs1);
+
+  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+  rtx op1 = expand_normal (rhs1);
+  rtx op2 = expand_normal (rhs2);
+
+  class expand_operand ops[3];
+  create_output_operand (&ops[0], target, TYPE_MODE (TREE_TYPE (lhs)));
+  create_input_operand (&ops[1], op1, TYPE_MODE (type));
+  create_input_operand (&ops[2], op2, TYPE_MODE (type));
+  insn_code icode = optab_handler (spaceship_optab, TYPE_MODE (type));
+  expand_insn (icode, 3, ops);
+  if (!rtx_equal_p (target, ops[0].value))
+    emit_move_insn (target, ops[0].value);
 }
--- gcc/tree-ssa-math-opts.c.jj	2022-01-13 22:29:07.491942365 +0100
+++ gcc/tree-ssa-math-opts.c	2022-01-14 20:41:45.000712458 +0100
@@ -4637,6 +4637,195 @@ convert_mult_to_highpart (gassign *stmt,
   return true;
 }
 
+/* If target has spaceship<MODE>3 expander, pattern recognize
+   <bb 2> [local count: 1073741824]:
+   if (a_2(D) == b_3(D))
+     goto <bb 6>; [34.00%]
+   else
+     goto <bb 3>; [66.00%]
+
+   <bb 3> [local count: 708669601]:
+   if (a_2(D) < b_3(D))
+     goto <bb 6>; [1.04%]
+   else
+     goto <bb 4>; [98.96%]
+
+   <bb 4> [local count: 701299439]:
+   if (a_2(D) > b_3(D))
+     goto <bb 5>; [48.89%]
+   else
+     goto <bb 6>; [51.11%]
+
+   <bb 5> [local count: 342865295]:
+
+   <bb 6> [local count: 1073741824]:
+   and turn it into:
+   <bb 2> [local count: 1073741824]:
+   _1 = .SPACESHIP (a_2(D), b_3(D));
+   if (_1 == 0)
+     goto <bb 6>; [34.00%]
+   else
+     goto <bb 3>; [66.00%]
+
+   <bb 3> [local count: 708669601]:
+   if (_1 == -1)
+     goto <bb 6>; [1.04%]
+   else
+     goto <bb 4>; [98.96%]
+
+   <bb 4> [local count: 701299439]:
+   if (_1 == 1)
+     goto <bb 5>; [48.89%]
+   else
+     goto <bb 6>; [51.11%]
+
+   <bb 5> [local count: 342865295]:
+
+   <bb 6> [local count: 1073741824]:
+   so that the backend can emit optimal comparison and
+   conditional jump sequence.  */
+
+static void
+optimize_spaceship (gimple *stmt)
+{
+  enum tree_code code = gimple_cond_code (stmt);
+  if (code != EQ_EXPR && code != NE_EXPR)
+    return;
+  tree arg1 = gimple_cond_lhs (stmt);
+  tree arg2 = gimple_cond_rhs (stmt);
+  if (!SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg1))
+      || !HONOR_NANS (TREE_TYPE (arg1))
+      || optab_handler (spaceship_optab,
+			TYPE_MODE (TREE_TYPE (arg1))) == CODE_FOR_nothing
+      || operand_equal_p (arg1, arg2, 0))
+    return;
+
+  basic_block bb0 = gimple_bb (stmt), bb1, bb2;
+  edge em1 = NULL, e1 = NULL, e2 = NULL;
+  bb1 = EDGE_SUCC (bb0, 1)->dest;
+  if (((EDGE_SUCC (bb0, 0)->flags & EDGE_TRUE_VALUE) != 0) ^ (code == EQ_EXPR))
+    bb1 = EDGE_SUCC (bb0, 0)->dest;
+
+  gimple *g = last_stmt (bb1);
+  if (g == NULL
+      || gimple_code (g) != GIMPLE_COND
+      || !single_pred_p (bb1)
+      || (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
+	  ? !operand_equal_p (gimple_cond_rhs (g), arg2, 0)
+	  : (!operand_equal_p (gimple_cond_lhs (g), arg2, 0)
+	     || !operand_equal_p (gimple_cond_rhs (g), arg1, 0)))
+      || !cond_only_block_p (bb1))
+    return;
+
+  enum tree_code ccode = (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
+			  ? LT_EXPR : GT_EXPR);
+  switch (gimple_cond_code (g))
+    {
+    case LT_EXPR:
+    case LE_EXPR:
+      break;
+    case GT_EXPR:
+    case GE_EXPR:
+      ccode = ccode == LT_EXPR ? GT_EXPR : LT_EXPR;
+      break;
+    default:
+      return;
+    }
+
+  /* With NaNs, </<=/>/>= are false, so we need to look for the
+     third comparison on the false edge from whatever non-equality
+     comparison the second comparison is.  */
+  int i = (EDGE_SUCC (bb1, 0)->flags & EDGE_TRUE_VALUE) != 0;
+  bb2 = EDGE_SUCC (bb1, i)->dest;
+  g = last_stmt (bb2);
+  if (g == NULL
+      || gimple_code (g) != GIMPLE_COND
+      || !single_pred_p (bb2)
+      || (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
+	  ? !operand_equal_p (gimple_cond_rhs (g), arg2, 0)
+	  : (!operand_equal_p (gimple_cond_lhs (g), arg2, 0)
+	     || !operand_equal_p (gimple_cond_rhs (g), arg1, 0)))
+      || !cond_only_block_p (bb2)
+      || EDGE_SUCC (bb2, 0)->dest == EDGE_SUCC (bb2, 1)->dest)
+    return;
+
+  enum tree_code ccode2
+    = (operand_equal_p (gimple_cond_lhs (g), arg1, 0) ? LT_EXPR : GT_EXPR);
+  switch (gimple_cond_code (g))
+    {
+    case LT_EXPR:
+    case LE_EXPR:
+      break;
+    case GT_EXPR:
+    case GE_EXPR:
+      ccode2 = ccode2 == LT_EXPR ? GT_EXPR : LT_EXPR;
+      break;
+    default:
+      return;
+    }
+  if (ccode == ccode2)
+    return;
+
+  if (ccode == LT_EXPR)
+    {
+      em1 = EDGE_SUCC (bb1, 1 - i);
+      e1 = EDGE_SUCC (bb2, 0);
+      e2 = EDGE_SUCC (bb2, 1);
+      if ((e1->flags & EDGE_TRUE_VALUE) == 0)
+	std::swap (e1, e2);
+    }
+  else
+    {
+      e1 = EDGE_SUCC (bb1, 1 - i);
+      em1 = EDGE_SUCC (bb2, 0);
+      e2 = EDGE_SUCC (bb2, 1);
+      if ((em1->flags & EDGE_TRUE_VALUE) == 0)
+	std::swap (em1, e2);
+    }
+
+  g = gimple_build_call_internal (IFN_SPACESHIP, 2, arg1, arg2);
+  tree lhs = make_ssa_name (integer_type_node);
+  gimple_call_set_lhs (g, lhs);
+  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
+
+  gcond *cond = as_a <gcond *> (stmt);
+  gimple_cond_set_lhs (cond, lhs);
+  gimple_cond_set_rhs (cond, integer_zero_node);
+  update_stmt (stmt);
+
+  g = last_stmt (bb1);
+  cond = as_a <gcond *> (g);
+  gimple_cond_set_code (cond, EQ_EXPR);
+  gimple_cond_set_lhs (cond, lhs);
+  if (em1->src == bb1)
+    gimple_cond_set_rhs (cond, integer_minus_one_node);
+  else
+    {
+      gcc_assert (e1->src == bb1);
+      gimple_cond_set_rhs (cond, integer_one_node);
+    }
+  update_stmt (g);
+
+  g = last_stmt (bb2);
+  cond = as_a <gcond *> (g);
+  gimple_cond_set_lhs (cond, lhs);
+  if (em1->src == bb2)
+    gimple_cond_set_rhs (cond, integer_minus_one_node);
+  else
+    {
+      gcc_assert (e1->src == bb2);
+      gimple_cond_set_rhs (cond, integer_one_node);
+    }
+  gimple_cond_set_code (cond,
+			(e2->flags & EDGE_TRUE_VALUE) ? NE_EXPR : EQ_EXPR);
+  update_stmt (g);
+
+  wide_int wm1 = wi::minus_one (TYPE_PRECISION (integer_type_node));
+  wide_int w2 = wi::two (TYPE_PRECISION (integer_type_node));
+  set_range_info (lhs, VR_RANGE, wm1, w2);
+}
+
 
 /* Find integer multiplications where the operands are extended from
    smaller types, and replace the MULT_EXPR with a WIDEN_MULT_EXPR
@@ -4798,6 +4987,8 @@ math_opts_dom_walker::after_dom_children
 	      break;
 	    }
 	}
+      else if (gimple_code (stmt) == GIMPLE_COND)
+	optimize_spaceship (stmt);
       gsi_next (&gsi);
     }
   if (fma_state.m_deferring_p
--- gcc/config/i386/i386.md.jj	2022-01-14 11:51:34.432384170 +0100
+++ gcc/config/i386/i386.md	2022-01-14 18:22:41.140906449 +0100
@@ -23886,6 +23886,18 @@ (define_insn "hreset"
   [(set_attr "type" "other")
    (set_attr "length" "4")])
 
+;; Spaceship optimization
+(define_expand "spaceship<mode>3"
+  [(match_operand:SI 0 "register_operand")
+   (match_operand:MODEF 1 "cmp_fp_expander_operand")
+   (match_operand:MODEF 2 "cmp_fp_expander_operand")]
+  "(TARGET_80387 || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH))
+   && TARGET_CMOVE && TARGET_IEEE_FP"
+{
+  ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]);
+  DONE;
+})
+
 (include "mmx.md")
 (include "sse.md")
 (include "sync.md")
--- gcc/config/i386/i386-protos.h.jj	2022-01-11 23:11:21.798298473 +0100
+++ gcc/config/i386/i386-protos.h	2022-01-14 17:57:53.570021792 +0100
@@ -150,6 +150,7 @@ extern bool ix86_expand_int_vec_cmp (rtx
 extern bool ix86_expand_fp_vec_cmp (rtx[]);
 extern void ix86_expand_sse_movcc (rtx, rtx, rtx, rtx);
 extern void ix86_expand_sse_unpack (rtx, rtx, bool, bool);
+extern void ix86_expand_fp_spaceship (rtx, rtx, rtx);
 extern bool ix86_expand_int_addcc (rtx[]);
 extern rtx_insn *ix86_expand_call (rtx, rtx, rtx, rtx, rtx, bool);
 extern bool ix86_call_use_plt_p (rtx);
--- gcc/config/i386/i386-expand.c.jj	2022-01-14 11:51:34.429384213 +0100
+++ gcc/config/i386/i386-expand.c	2022-01-14 21:02:09.446437810 +0100
@@ -2879,6 +2879,46 @@ ix86_expand_setcc (rtx dest, enum rtx_co
   emit_insn (gen_rtx_SET (dest, ret));
 }
 
+/* Expand floating point op0 <=> op1 if NaNs are honored.  */
+
+void
+ix86_expand_fp_spaceship (rtx dest, rtx op0, rtx op1)
+{
+  gcc_checking_assert (ix86_fp_comparison_strategy (GT) == IX86_FPCMP_COMI);
+  rtx gt = ix86_expand_fp_compare (GT, op0, op1);
+  rtx l0 = gen_label_rtx ();
+  rtx l1 = gen_label_rtx ();
+  rtx l2 = gen_label_rtx ();
+  rtx lend = gen_label_rtx ();
+  rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode,
+			   gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
+  rtx tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un,
+				  gen_rtx_LABEL_REF (VOIDmode, l2), pc_rtx);
+  rtx_insn *jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
+  add_reg_br_prob_note (jmp, profile_probability::very_unlikely ());
+  rtx eq = gen_rtx_fmt_ee (UNEQ, VOIDmode,
+			   gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
+  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, eq,
+			      gen_rtx_LABEL_REF (VOIDmode, l0), pc_rtx);
+  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
+  add_reg_br_prob_note (jmp, profile_probability::unlikely ());
+  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, gt,
+			      gen_rtx_LABEL_REF (VOIDmode, l1), pc_rtx);
+  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
+  add_reg_br_prob_note (jmp, profile_probability::even ());
+  emit_move_insn (dest, constm1_rtx);
+  emit_jump (lend);
+  emit_label (l0);
+  emit_move_insn (dest, const0_rtx);
+  emit_jump (lend);
+  emit_label (l1);
+  emit_move_insn (dest, const1_rtx);
+  emit_jump (lend);
+  emit_label (l2);
+  emit_move_insn (dest, const2_rtx);
+  emit_label (lend);
+}
+
 /* Expand comparison setting or clearing carry flag.  Return true when
    successful and set pop for the operation.  */
 static bool
--- gcc/doc/md.texi.jj	2022-01-13 22:29:15.329831975 +0100
+++ gcc/doc/md.texi	2022-01-14 21:07:11.533180235 +0100
@@ -8055,6 +8055,15 @@ inclusive and operand 1 exclusive.
 If this pattern is not defined, a call to the library function
 @code{__clear_cache} is used.
 
+@cindex @code{spaceship@var{m}3} instruction pattern
+@item @samp{spaceship@var{m}3}
+Initialize output operand 0 with mode of integer type to -1, 0, 1 or 2
+if operand 1 with mode @var{m} compares less than operand 2, equal to
+operand 2, greater than operand 2 or is unordered with operand 2.
+@var{m} should be a scalar floating point mode.
+
+This pattern is not allowed to @code{FAIL}.
+
 @end table
 
 @end ifset
--- gcc/testsuite/gcc.target/i386/pr103973-1.c.jj	2022-01-14 19:21:42.080770182 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-1.c	2022-01-14 20:52:22.764713273 +0100
@@ -0,0 +1,98 @@
+/* PR target/103973 */
+/* { dg-do run } */
+/* { dg-options "-O2 -save-temps" } */
+/* { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "\tcomisd" 4 { target { ! ia32 } } } } */
+
+__attribute__((noipa)) int m1 (void) { return -1; }
+__attribute__((noipa)) int p0 (void) { return 0; }
+__attribute__((noipa)) int p1 (void) { return 1; }
+__attribute__((noipa)) int p2 (void) { return 2; }
+
+__attribute__((noipa)) int
+foo (double a, double b)
+{
+  if (a == b)
+    return 0;
+  if (a < b)
+    return -1;
+  if (a > b)
+    return 1;
+  return 2;
+}
+
+__attribute__((noipa)) int
+bar (double a, double b)
+{
+  if (a == b)
+    return p0 ();
+  if (a < b)
+    return m1 ();
+  if (a > b)
+    return p1 ();
+  return p2 ();
+}
+
+__attribute__((noipa)) int
+baz (double a, double b)
+{
+  if (a == b)
+    return p0 ();
+  if (b < a)
+    return p1 ();
+  if (a < b)
+    return m1 ();
+  return p2 ();
+}
+
+__attribute__((noipa)) int
+qux (double a)
+{
+  if (a != 0.0f)
+    {
+      if (a <= 0.0f)
+	return -1;
+      if (a >= 0.0f)
+	return 1;
+      return 2;
+    }
+  return 0;
+}
+
+int
+main ()
+{
+  double m5 = -5.0f;
+  double p5 = 5.0f;
+  volatile double p0 = 0.0f;
+  double nan = p0 / p0;
+  if (foo (p5, p5) != 0 || foo (m5, m5) != 0)
+    __builtin_abort ();
+  if (foo (m5, p5) != -1 || foo (p5, m5) != 1)
+    __builtin_abort ();
+  if (foo (m5, nan) != 2 || foo (nan, p5) != 2)
+    __builtin_abort ();
+  if (foo (nan, nan) != 2)
+    __builtin_abort ();
+  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
+    __builtin_abort ();
+  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
+    __builtin_abort ();
+  if (bar (m5, nan) != 2 || bar (nan, p5) != 2)
+    __builtin_abort ();
+  if (bar (nan, nan) != 2)
+    __builtin_abort ();
+  if (baz (p5, p5) != 0 || baz (m5, m5) != 0)
+    __builtin_abort ();
+  if (baz (m5, p5) != -1 || baz (p5, m5) != 1)
+    __builtin_abort ();
+  if (baz (m5, nan) != 2 || baz (nan, p5) != 2)
+    __builtin_abort ();
+  if (baz (nan, nan) != 2)
+    __builtin_abort ();
+  if (qux (p0) != 0 || qux (nan) != 2)
+    __builtin_abort ();
+  if (qux (m5) != -1 || qux (p5) != 1)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.target/i386/pr103973-2.c.jj	2022-01-14 19:21:50.766647425 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-2.c	2022-01-14 20:43:40.230085717 +0100
@@ -0,0 +1,7 @@
+/* PR target/103973 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -march=i686" } */
+/* { dg-final { scan-assembler-not "'\tfucom" } } */
+/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
+
+#include "pr103973-1.c"
--- gcc/testsuite/gcc.target/i386/pr103973-3.c.jj	2022-01-14 19:23:15.645447824 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-3.c	2022-01-14 20:43:46.415998435 +0100
@@ -0,0 +1,8 @@
+/* PR target/103973 */
+/* { dg-do run } */
+/* { dg-options "-O2 -save-temps" } */
+/* { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "\tcomiss" 4 { target { ! ia32 } } } } */
+
+#define double float
+#include "pr103973-1.c"
--- gcc/testsuite/gcc.target/i386/pr103973-4.c.jj	2022-01-14 19:25:15.622752173 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-4.c	2022-01-14 20:43:53.518898224 +0100
@@ -0,0 +1,8 @@
+/* PR target/103973 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -march=i686" } */
+/* { dg-final { scan-assembler-not "'\tfucom" } } */
+/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
+
+#define double float
+#include "pr103973-1.c"
--- gcc/testsuite/g++.target/i386/pr103973-1.C.jj	2022-01-14 20:44:35.225309786 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-1.C	2022-01-14 20:55:04.611429778 +0100
@@ -0,0 +1,71 @@
+// PR target/103973
+// { dg-do run }
+// { dg-options "-O2 -std=c++20 -save-temps" }
+// { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } }
+// { dg-final { scan-assembler-times "\tcomisd" 2 { target { ! ia32 } } } }
+
+#include <compare>
+
+#ifndef double_type
+#define double_type double
+#endif
+
+__attribute__((noipa)) auto
+foo (double_type a, double_type b)
+{
+  return a <=> b;
+}
+
+__attribute__((noipa)) int
+bar (double_type a, double_type b)
+{
+  auto c = foo (a, b);
+  if (c == std::partial_ordering::less)
+    return -1;
+  if (c == std::partial_ordering::equivalent)
+    return 0;
+  if (c == std::partial_ordering::greater)
+    return 1;
+  return 2;
+}
+
+__attribute__((noipa)) auto
+baz (double_type a)
+{
+  return a <=> 0.0f;
+}
+
+__attribute__((noipa)) int
+qux (double_type a)
+{
+  auto c = baz (a);
+  if (c == std::partial_ordering::less)
+    return -1;
+  if (c == std::partial_ordering::equivalent)
+    return 0;
+  if (c == std::partial_ordering::greater)
+    return 1;
+  return 2;
+}
+
+int
+main ()
+{
+  double_type m5 = -5.0;
+  double_type p5 = 5.0;
+  volatile double_type p0 = 0.0;
+  double_type nan = p0 / p0;
+  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
+    __builtin_abort ();
+  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
+    __builtin_abort ();
+  if (bar (m5, nan) != 2 || bar (nan, p5) != 2)
+    __builtin_abort ();
+  if (bar (nan, nan) != 2)
+    __builtin_abort ();
+  if (qux (p0) != 0 || qux (nan) != 2)
+    __builtin_abort ();
+  if (qux (m5) != -1 || qux (p5) != 1)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/g++.target/i386/pr103973-2.C.jj	2022-01-14 20:52:50.309324647 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-2.C	2022-01-14 20:55:41.967902715 +0100
@@ -0,0 +1,7 @@
+// PR target/103973
+// { dg-do compile { target ia32 } }
+// { dg-options "-O2 -march=i686 -std=c++20" }
+// { dg-final { scan-assembler-not "'\tfucom" } }
+// { dg-final { scan-assembler-times "\tfcom" 2 } }
+
+#include "pr103973-1.C"
--- gcc/testsuite/g++.target/i386/pr103973-3.C.jj	2022-01-14 20:52:53.212283686 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-3.C	2022-01-14 20:55:49.052802755 +0100
@@ -0,0 +1,8 @@
+// PR target/103973
+// { dg-do run }
+// { dg-options "-O2 -save-temps -std=c++20" }
+// { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } }
+// { dg-final { scan-assembler-times "\tcomiss" 2 { target { ! ia32 } } } }
+
+#define double_type float
+#include "pr103973-1.C"
--- gcc/testsuite/g++.target/i386/pr103973-4.C.jj	2022-01-14 20:52:55.951245044 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-4.C	2022-01-14 20:55:55.723708635 +0100
@@ -0,0 +1,8 @@
+// PR target/103973
+// { dg-do compile { target ia32 } }
+// { dg-options "-O2 -march=i686 -std=c++20" }
+// { dg-final { scan-assembler-not "'\tfucom" } }
+// { dg-final { scan-assembler-times "\tfcom" 2 } }
+
+#define double_type float
+#include "pr103973-1.C"

	Jakub



* Re: [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973]
  2022-01-14 22:56 [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973] Jakub Jelinek
@ 2022-01-15  8:29 ` Uros Bizjak
  2022-01-15  9:56   ` Jakub Jelinek
  0 siblings, 1 reply; 8+ messages in thread
From: Uros Bizjak @ 2022-01-15  8:29 UTC
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches

On Fri, Jan 14, 2022 at 11:56 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> C++20:
> #include <compare>
> auto cmp4way(double a, double b)
> {
>   return a <=> b;
> }
> expands to:
>         ucomisd %xmm1, %xmm0
>         jp      .L8
>         movl    $0, %eax
>         jne     .L8
> .L2:
>         ret
>         .p2align 4,,10
>         .p2align 3
> .L8:
>         comisd  %xmm0, %xmm1
>         movl    $-1, %eax
>         ja      .L2
>         ucomisd %xmm1, %xmm0
>         setbe   %al
>         addl    $1, %eax
>         ret
> That is 3 comparisons of the same operands.
> The following patch improves it to just one comparison:
>         comisd  %xmm1, %xmm0
>         jp      .L4
>         seta    %al
>         movl    $0, %edx
>         leal    -1(%rax,%rax), %eax
>         cmove   %edx, %eax
>         ret
> .L4:
>         movl    $2, %eax
>         ret
> While a <=> b expands to a == b ? 0 : a < b ? -1 : a > b ? 1 : 2
> where the first comparison is equality and this shouldn't raise
> exceptions on qNaN operands, if the operands aren't equal (which
> includes unordered cases), then it immediately performs < or >
> comparison and that raises exceptions even on qNaNs, so we can just
> perform a single comparison that raises exceptions on qNaN.
> As the 4 different cases are encoded as
> ZF CF PF
> 1  1  1  a unordered b
> 0  0  0  a > b
> 0  1  0  a < b
> 1  0  0  a == b
> we can emit an optimal sequence of conditional jumps, first jp
> for the unordered case, then je for the == case and finally jb
> for the < case.
>
> The patch pattern recognizes spaceship-like comparisons during
> widening_mul if the spaceship optab is implemented, and replaces
> those comparisons with comparisons of .SPACESHIP ifn which returns
> -1/0/1/2 based on the comparison.  This seems to work well both for the
> case of just returning the -1/0/1/2 (when we have just a common
> successor with a PHI) or when the different cases are handled with
> various other basic blocks.  The testcases cover both of those cases,
> the latter with different function calls in those.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2022-01-14  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/103973
>         * tree-cfg.h (cond_only_block_p): Declare.
>         * tree-ssa-phiopt.c (cond_only_block_p): Move function to ...
>         * tree-cfg.c (cond_only_block_p): ... here.  No longer static.
>         * optabs.def (spaceship_optab): New optab.
>         * internal-fn.def (SPACESHIP): New internal function.
>         * internal-fn.h (expand_SPACESHIP): Declare.
>         * internal-fn.c (expand_PHI): Formatting fix.
>         (expand_SPACESHIP): New function.
>         * tree-ssa-math-opts.c (optimize_spaceship): New function.
>         (math_opts_dom_walker::after_dom_children): Use it.
>         * config/i386/i386.md (spaceship<mode>3): New define_expand.
>         * config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare.
>         * config/i386/i386-expand.c (ix86_expand_fp_spaceship): New function.
>         * doc/md.texi (spaceship@var{m}3): Document.
>
>         * gcc.target/i386/pr103973-1.c: New test.
>         * gcc.target/i386/pr103973-2.c: New test.
>         * gcc.target/i386/pr103973-3.c: New test.
>         * gcc.target/i386/pr103973-4.c: New test.
>         * g++.target/i386/pr103973-1.C: New test.
>         * g++.target/i386/pr103973-2.C: New test.
>         * g++.target/i386/pr103973-3.C: New test.
>         * g++.target/i386/pr103973-4.C: New test.

> --- gcc/config/i386/i386.md.jj  2022-01-14 11:51:34.432384170 +0100
> +++ gcc/config/i386/i386.md     2022-01-14 18:22:41.140906449 +0100
> @@ -23886,6 +23886,18 @@ (define_insn "hreset"
>    [(set_attr "type" "other")
>     (set_attr "length" "4")])
>
> +;; Spaceship optimization
> +(define_expand "spaceship<mode>3"
> +  [(match_operand:SI 0 "register_operand")
> +   (match_operand:MODEF 1 "cmp_fp_expander_operand")
> +   (match_operand:MODEF 2 "cmp_fp_expander_operand")]
> +  "(TARGET_80387 || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH))
> +   && TARGET_CMOVE && TARGET_IEEE_FP"

Is there a reason that this pattern is limited to TARGET_IEEE_FP?
During the expansion in ix86_expand_fp_spaceship, we can just skip
jump on unordered, while ix86_expand_fp_compare will emit the correct
comparison mode depending on TARGET_IEEE_FP.

> +{
> +  ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]);
> +  DONE;
> +})
> +
>  (include "mmx.md")
>  (include "sse.md")
>  (include "sync.md")
> --- gcc/config/i386/i386-expand.c.jj    2022-01-14 11:51:34.429384213 +0100
> +++ gcc/config/i386/i386-expand.c       2022-01-14 21:02:09.446437810 +0100
> @@ -2879,6 +2879,46 @@ ix86_expand_setcc (rtx dest, enum rtx_co
>    emit_insn (gen_rtx_SET (dest, ret));
>  }
>
> +/* Expand floating point op0 <=> op1 if NaNs are honored.  */
> +
> +void
> +ix86_expand_fp_spaceship (rtx dest, rtx op0, rtx op1)
> +{
> +  gcc_checking_assert (ix86_fp_comparison_strategy (GT) == IX86_FPCMP_COMI);
> +  rtx gt = ix86_expand_fp_compare (GT, op0, op1);
> +  rtx l0 = gen_label_rtx ();
> +  rtx l1 = gen_label_rtx ();
> +  rtx l2 = gen_label_rtx ();
> +  rtx lend = gen_label_rtx ();
> +  rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode,
> +                          gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> +  rtx tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un,
> +                                 gen_rtx_LABEL_REF (VOIDmode, l2), pc_rtx);
> +  rtx_insn *jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> +  add_reg_br_prob_note (jmp, profile_probability::very_unlikely ());

Please also add JUMP_LABEL to the insn.

> +  rtx eq = gen_rtx_fmt_ee (UNEQ, VOIDmode,
> +                          gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, eq,
> +                             gen_rtx_LABEL_REF (VOIDmode, l0), pc_rtx);
> +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> +  add_reg_br_prob_note (jmp, profile_probability::unlikely ());
> +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, gt,
> +                             gen_rtx_LABEL_REF (VOIDmode, l1), pc_rtx);
> +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> +  add_reg_br_prob_note (jmp, profile_probability::even ());
> +  emit_move_insn (dest, constm1_rtx);
> +  emit_jump (lend);
> +  emit_label (l0);

and LABEL_NUSES label.

> +  emit_move_insn (dest, const0_rtx);
> +  emit_jump (lend);
> +  emit_label (l1);
> +  emit_move_insn (dest, const1_rtx);
> +  emit_jump (lend);
> +  emit_label (l2);
> +  emit_move_insn (dest, const2_rtx);
> +  emit_label (lend);
> +}
> +
>  /* Expand comparison setting or clearing carry flag.  Return true when
>     successful and set pop for the operation.  */
>  static bool

Uros.


* Re: [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973]
  2022-01-15  8:29 ` Uros Bizjak
@ 2022-01-15  9:56   ` Jakub Jelinek
  2022-01-15 10:42     ` Uros Bizjak
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Jelinek @ 2022-01-15  9:56 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: Richard Biener, gcc-patches

On Sat, Jan 15, 2022 at 09:29:05AM +0100, Uros Bizjak wrote:
> > --- gcc/config/i386/i386.md.jj  2022-01-14 11:51:34.432384170 +0100
> > +++ gcc/config/i386/i386.md     2022-01-14 18:22:41.140906449 +0100
> > @@ -23886,6 +23886,18 @@ (define_insn "hreset"
> >    [(set_attr "type" "other")
> >     (set_attr "length" "4")])
> >
> > +;; Spaceship optimization
> > +(define_expand "spaceship<mode>3"
> > +  [(match_operand:SI 0 "register_operand")
> > +   (match_operand:MODEF 1 "cmp_fp_expander_operand")
> > +   (match_operand:MODEF 2 "cmp_fp_expander_operand")]
> > +  "(TARGET_80387 || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH))
> > +   && TARGET_CMOVE && TARGET_IEEE_FP"
> 
> Is there a reason that this pattern is limited to TARGET_IEEE_FP?
> During the expansion in ix86_expand_fp_spaceship, we can just skip
> jump on unordered, while ix86_expand_fp_compare will emit the correct
> comparison mode depending on TARGET_IEEE_FP.

For -ffast-math I thought <=> expands to just x == y ? 0 : x < y ? -1 : 1
but apparently not, it is still x == y ? 0 : x < y ? -1 : x > y ? 1 : 2
but it is still optimized much better than the non-fast-math case
without the patch:
        comisd  %xmm1, %xmm0
        je      .L12
        jb      .L13
        movl    $1, %edx
        movl    $2, %eax
        cmova   %edx, %eax
        ret
        .p2align 4,,10
        .p2align 3
.L12:
        xorl    %eax, %eax
        ret
        .p2align 4,,10
        .p2align 3
.L13:
        movl    $-1, %eax
        ret
so just one comparison but admittedly the
        movl    $1, %edx
        movl    $2, %eax
        cmova   %edx, %eax
part is unnecessary.
So below is an incremental patch that handles even the !HONOR_NANS
case at the gimple pattern matching (while for HONOR_NANS we must
obey that for NaN operand(s) >/>=/</<= is false and so need to make sure
we look for the third comparison on the false edge of the second one,
for !HONOR_NANS that is not the case).  With the incremental patch we get:
        comisd  %xmm1, %xmm0
        je      .L2
        seta    %al
        leal    -1(%rax,%rax), %eax
        ret
        .p2align 4,,10
        .p2align 3
.L2:
        xorl    %eax, %eax
        ret
for -O2 -ffast-math.
Also, I've added || (TARGET_SAHF && TARGET_USE_SAHF), because apparently
we can handle that case nicely too; it is just the IX86_FPCMP_ARITH
case where fp_compare already emits very specific code (and we can't
call ix86_expand_fp_compare 3 times, because for IEEE_FP that would
emit different comparisons which couldn't be CSEd).
I'll also add -ffast-math testsuite coverage and retest.

Also, I wonder if I shouldn't handle XFmode the same, thoughts on that?

> > +  gcc_checking_assert (ix86_fp_comparison_strategy (GT) == IX86_FPCMP_COMI);
> > +  rtx gt = ix86_expand_fp_compare (GT, op0, op1);
> > +  rtx l0 = gen_label_rtx ();
> > +  rtx l1 = gen_label_rtx ();
> > +  rtx l2 = gen_label_rtx ();
> > +  rtx lend = gen_label_rtx ();
> > +  rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode,
> > +                          gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> > +  rtx tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un,
> > +                                 gen_rtx_LABEL_REF (VOIDmode, l2), pc_rtx);
> > +  rtx_insn *jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> > +  add_reg_br_prob_note (jmp, profile_probability:: very_unlikely ());
> 
> Please also add JUMP_LABEL to the insn.
> 
> > +  rtx eq = gen_rtx_fmt_ee (UNEQ, VOIDmode,
> > +                          gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> > +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, eq,
> > +                             gen_rtx_LABEL_REF (VOIDmode, l0), pc_rtx);
> > +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> > +  add_reg_br_prob_note (jmp, profile_probability::unlikely ());
> > +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, gt,
> > +                             gen_rtx_LABEL_REF (VOIDmode, l1), pc_rtx);
> > +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> > +  add_reg_br_prob_note (jmp, profile_probability::even ());
> > +  emit_move_insn (dest, constm1_rtx);
> > +  emit_jump (lend);
> > +  emit_label (l0);
> 
> and LABEL_NUSES label.

Why?  That seems to be a waste of time to me, unless something uses them
already during expansion.  Because pass_expand::execute
runs:
  /* We need JUMP_LABEL be set in order to redirect jumps, and hence
     split edges which edge insertions might do.  */
  rebuild_jump_labels (get_insns ());
which resets all LABEL_NUSES to 0 (well, to:
      if (LABEL_P (insn))
        LABEL_NUSES (insn) = (LABEL_PRESERVE_P (insn) != 0);
) and then recomputes them and adds JUMP_LABEL if needed:
              JUMP_LABEL (insn) = label;
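
To make the point concrete, here is a toy model of that reset-and-recount
step (purely illustrative; ToyLabel, ToyJump and toy_rebuild_jump_labels
are invented for this sketch and are not GCC's types or functions).
Whatever use counts or jump labels were set by hand beforehand are
overwritten:

```cpp
#include <vector>

// Toy stand-ins for RTL labels and jumps; these are not GCC types.
struct ToyLabel
{
  int nuses;      // models LABEL_NUSES
  bool preserve;  // models LABEL_PRESERVE_P
};

struct ToyJump
{
  ToyLabel *target;      // label referenced by the jump pattern
  ToyLabel *jump_label;  // models JUMP_LABEL, possibly unset (nullptr)
};

// Models the two steps described above: reset every use count, then
// recompute the counts and fill in the jump label from the jump's target.
inline void
toy_rebuild_jump_labels (std::vector<ToyLabel *> &labels,
                         std::vector<ToyJump> &jumps)
{
  for (ToyLabel *label : labels)
    label->nuses = label->preserve ? 1 : 0;
  for (ToyJump &jump : jumps)
    {
      jump.jump_label = jump.target;
      ++jump.target->nuses;
    }
}
```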

--- gcc/config/i386/i386.md.jj	2022-01-15 09:51:25.404468342 +0100
+++ gcc/config/i386/i386.md	2022-01-15 09:56:31.602109421 +0100
@@ -23892,7 +23892,7 @@ (define_expand "spaceship<mode>3"
    (match_operand:MODEF 1 "cmp_fp_expander_operand")
    (match_operand:MODEF 2 "cmp_fp_expander_operand")]
   "(TARGET_80387 || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH))
-   && TARGET_CMOVE && TARGET_IEEE_FP"
+   && (TARGET_CMOVE || (TARGET_SAHF && TARGET_USE_SAHF))"
 {
   ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]);
   DONE;
--- gcc/config/i386/i386-expand.c.jj	2022-01-15 09:51:25.411468242 +0100
+++ gcc/config/i386/i386-expand.c	2022-01-15 10:38:26.924333651 +0100
@@ -2884,18 +2884,23 @@ ix86_expand_setcc (rtx dest, enum rtx_co
 void
 ix86_expand_fp_spaceship (rtx dest, rtx op0, rtx op1)
 {
-  gcc_checking_assert (ix86_fp_comparison_strategy (GT) == IX86_FPCMP_COMI);
+  gcc_checking_assert (ix86_fp_comparison_strategy (GT) != IX86_FPCMP_ARITH);
   rtx gt = ix86_expand_fp_compare (GT, op0, op1);
   rtx l0 = gen_label_rtx ();
   rtx l1 = gen_label_rtx ();
-  rtx l2 = gen_label_rtx ();
+  rtx l2 = TARGET_IEEE_FP ? gen_label_rtx () : NULL_RTX;
   rtx lend = gen_label_rtx ();
-  rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode,
-			   gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
-  rtx tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un,
+  rtx tmp;
+  rtx_insn *jmp;
+  if (l2)
+    {
+      rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode,
+			       gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
+      tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un,
 				  gen_rtx_LABEL_REF (VOIDmode, l2), pc_rtx);
-  rtx_insn *jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
-  add_reg_br_prob_note (jmp, profile_probability:: very_unlikely ());
+      jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
+      add_reg_br_prob_note (jmp, profile_probability:: very_unlikely ());
+    }
   rtx eq = gen_rtx_fmt_ee (UNEQ, VOIDmode,
 			   gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
   tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, eq,
@@ -2914,8 +2919,11 @@ ix86_expand_fp_spaceship (rtx dest, rtx
   emit_label (l1);
   emit_move_insn (dest, const1_rtx);
   emit_jump (lend);
-  emit_label (l2);
-  emit_move_insn (dest, const2_rtx);
+  if (l2)
+    {
+      emit_label (l2);
+      emit_move_insn (dest, const2_rtx);
+    }
   emit_label (lend);
 }
 
--- gcc/tree-ssa-math-opts.c.jj	2022-01-15 09:51:25.402468370 +0100
+++ gcc/tree-ssa-math-opts.c	2022-01-15 10:35:52.366533951 +0100
@@ -4694,7 +4694,6 @@ optimize_spaceship (gimple *stmt)
   tree arg1 = gimple_cond_lhs (stmt);
   tree arg2 = gimple_cond_rhs (stmt);
   if (!SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg1))
-      || !HONOR_NANS (TREE_TYPE (arg1))
       || optab_handler (spaceship_optab,
 			TYPE_MODE (TREE_TYPE (arg1))) == CODE_FOR_nothing
       || operand_equal_p (arg1, arg2, 0))
@@ -4732,56 +4731,67 @@ optimize_spaceship (gimple *stmt)
       return;
     }
 
-  /* With NaNs, </<=/>/>= are false, so we need to look for the
-     third comparison on the false edge from whatever non-equality
-     comparison the second comparison is.  */
-  int i = (EDGE_SUCC (bb1, 0)->flags & EDGE_TRUE_VALUE) != 0;
-  bb2 = EDGE_SUCC (bb1, i)->dest;
-  g = last_stmt (bb2);
-  if (g == NULL
-      || gimple_code (g) != GIMPLE_COND
-      || !single_pred_p (bb2)
-      || (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
-	  ? !operand_equal_p (gimple_cond_rhs (g), arg2, 0)
-	  : (!operand_equal_p (gimple_cond_lhs (g), arg2, 0)
-	     || !operand_equal_p (gimple_cond_rhs (g), arg1, 0)))
-      || !cond_only_block_p (bb2)
-      || EDGE_SUCC (bb2, 0)->dest == EDGE_SUCC (bb2, 1)->dest)
-    return;
-
-  enum tree_code ccode2
-    = (operand_equal_p (gimple_cond_lhs (g), arg1, 0) ? LT_EXPR : GT_EXPR);
-  switch (gimple_cond_code (g))
+  for (int i = 0; i < 2; ++i)
     {
-    case LT_EXPR:
-    case LE_EXPR:
+      /* With NaNs, </<=/>/>= are false, so we need to look for the
+	 third comparison on the false edge from whatever non-equality
+	 comparison the second comparison is.  */
+      if (HONOR_NANS (TREE_TYPE (arg1))
+	  && (EDGE_SUCC (bb1, i)->flags & EDGE_TRUE_VALUE) != 0)
+	continue;
+
+      bb2 = EDGE_SUCC (bb1, i)->dest;
+      g = last_stmt (bb2);
+      if (g == NULL
+	  || gimple_code (g) != GIMPLE_COND
+	  || !single_pred_p (bb2)
+	  || (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
+	      ? !operand_equal_p (gimple_cond_rhs (g), arg2, 0)
+	      : (!operand_equal_p (gimple_cond_lhs (g), arg2, 0)
+		 || !operand_equal_p (gimple_cond_rhs (g), arg1, 0)))
+	  || !cond_only_block_p (bb2)
+	  || EDGE_SUCC (bb2, 0)->dest == EDGE_SUCC (bb2, 1)->dest)
+	continue;
+
+      enum tree_code ccode2
+	= (operand_equal_p (gimple_cond_lhs (g), arg1, 0) ? LT_EXPR : GT_EXPR);
+      switch (gimple_cond_code (g))
+	{
+	case LT_EXPR:
+	case LE_EXPR:
+	  break;
+	case GT_EXPR:
+	case GE_EXPR:
+	  ccode2 = ccode2 == LT_EXPR ? GT_EXPR : LT_EXPR;
+	  break;
+	default:
+	  continue;
+	}
+      if (HONOR_NANS (TREE_TYPE (arg1)) && ccode == ccode2)
+	return;
+
+      if ((ccode == LT_EXPR)
+	  ^ ((EDGE_SUCC (bb1, i)->flags & EDGE_TRUE_VALUE) != 0))
+	{
+	  em1 = EDGE_SUCC (bb1, 1 - i);
+	  e1 = EDGE_SUCC (bb2, 0);
+	  e2 = EDGE_SUCC (bb2, 1);
+	  if ((ccode2 == LT_EXPR) ^ ((e1->flags & EDGE_TRUE_VALUE) == 0))
+	    std::swap (e1, e2);
+	}
+      else
+	{
+	  e1 = EDGE_SUCC (bb1, 1 - i);
+	  em1 = EDGE_SUCC (bb2, 0);
+	  e2 = EDGE_SUCC (bb2, 1);
+	  if ((ccode2 != LT_EXPR) ^ ((em1->flags & EDGE_TRUE_VALUE) == 0))
+	    std::swap (em1, e2);
+	}
       break;
-    case GT_EXPR:
-    case GE_EXPR:
-      ccode2 = ccode2 == LT_EXPR ? GT_EXPR : LT_EXPR;
-      break;
-    default:
-      return;
     }
-  if (ccode == ccode2)
-    return;
 
-  if (ccode == LT_EXPR)
-    {
-      em1 = EDGE_SUCC (bb1, 1 - i);
-      e1 = EDGE_SUCC (bb2, 0);
-      e2 = EDGE_SUCC (bb2, 1);
-      if ((e1->flags & EDGE_TRUE_VALUE) == 0)
-	std::swap (e1, e2);
-    }
-  else
-    {
-      e1 = EDGE_SUCC (bb1, 1 - i);
-      em1 = EDGE_SUCC (bb2, 0);
-      e2 = EDGE_SUCC (bb2, 1);
-      if ((em1->flags & EDGE_TRUE_VALUE) == 0)
-	std::swap (em1, e2);
-    }
+  if (em1 == NULL)
+    return;
 
   g = gimple_build_call_internal (IFN_SPACESHIP, 2, arg1, arg2);
   tree lhs = make_ssa_name (integer_type_node);
@@ -4796,14 +4806,19 @@ optimize_spaceship (gimple *stmt)
 
   g = last_stmt (bb1);
   cond = as_a <gcond *> (g);
-  gimple_cond_set_code (cond, EQ_EXPR);
   gimple_cond_set_lhs (cond, lhs);
   if (em1->src == bb1)
-    gimple_cond_set_rhs (cond, integer_minus_one_node);
+    {
+      gimple_cond_set_rhs (cond, integer_minus_one_node);
+      gimple_cond_set_code (cond, (em1->flags & EDGE_TRUE_VALUE)
+				  ? EQ_EXPR : NE_EXPR);
+    }
   else
     {
       gcc_assert (e1->src == bb1);
       gimple_cond_set_rhs (cond, integer_one_node);
+      gimple_cond_set_code (cond, (e1->flags & EDGE_TRUE_VALUE)
+				  ? EQ_EXPR : NE_EXPR);
     }
   update_stmt (g);
 


	Jakub



* Re: [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973]
  2022-01-15  9:56   ` Jakub Jelinek
@ 2022-01-15 10:42     ` Uros Bizjak
  2022-01-15 11:22       ` [PATCH] widening_mul, i386, v2: " Jakub Jelinek
  0 siblings, 1 reply; 8+ messages in thread
From: Uros Bizjak @ 2022-01-15 10:42 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches

On Sat, Jan 15, 2022 at 10:56 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Sat, Jan 15, 2022 at 09:29:05AM +0100, Uros Bizjak wrote:
> > > --- gcc/config/i386/i386.md.jj  2022-01-14 11:51:34.432384170 +0100
> > > +++ gcc/config/i386/i386.md     2022-01-14 18:22:41.140906449 +0100
> > > @@ -23886,6 +23886,18 @@ (define_insn "hreset"
> > >    [(set_attr "type" "other")
> > >     (set_attr "length" "4")])
> > >
> > > +;; Spaceship optimization
> > > +(define_expand "spaceship<mode>3"
> > > +  [(match_operand:SI 0 "register_operand")
> > > +   (match_operand:MODEF 1 "cmp_fp_expander_operand")
> > > +   (match_operand:MODEF 2 "cmp_fp_expander_operand")]
> > > +  "(TARGET_80387 || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH))
> > > +   && TARGET_CMOVE && TARGET_IEEE_FP"
> >
> > Is there a reason that this pattern is limited to TARGET_IEEE_FP?
> > During the expansion in ix86_expand_fp_spaceship, we can just skip
> > jump on unordered, while ix86_expand_fp_compare will emit the correct
> > comparison mode depending on TARGET_IEEE_FP.
>
> For -ffast-math I thought <=> expands to just x == y ? 0 : x < y ? -1 : 1
> but apparently not, it is still x == y ? 0 : x < y ? -1 : x > y ? 1 : 2
> but it is still optimized much better than the non-fast-math case
> without the patch:
>         comisd  %xmm1, %xmm0
>         je      .L12
>         jb      .L13
>         movl    $1, %edx
>         movl    $2, %eax
>         cmova   %edx, %eax
>         ret
>         .p2align 4,,10
>         .p2align 3
> .L12:
>         xorl    %eax, %eax
>         ret
>         .p2align 4,,10
>         .p2align 3
> .L13:
>         movl    $-1, %eax
>         ret
> so just one comparison but admittedly the
>         movl    $1, %edx
>         movl    $2, %eax
>         cmova   %edx, %eax
> part is unnecessary.
> So below is an incremental patch that handles even the !HONOR_NANS
> case at the gimple pattern matching (while for HONOR_NANS we must
> obey that for NaN operand(s) >/>=/</<= is false and so need to make sure
> we look for the third comparison on the false edge of the second one,
> for !HONOR_NANS that is not the case).  With the incremental patch we get:
>         comisd  %xmm1, %xmm0
>         je      .L2
>         seta    %al
>         leal    -1(%rax,%rax), %eax
>         ret
>         .p2align 4,,10
>         .p2align 3
> .L2:
>         xorl    %eax, %eax
>         ret
> for -O2 -ffast-math.
> Also, I've added || (TARGET_SAHF && TARGET_USE_SAHF), because apparently
> we can handle that case nicely too; it is just the IX86_FPCMP_ARITH
> case where fp_compare already emits very specific code (and we can't
> call ix86_expand_fp_compare 3 times, because for IEEE_FP that would
> emit different comparisons which couldn't be CSEd).
> I'll also add -ffast-math testsuite coverage and retest.
>
> Also, I wonder if I shouldn't handle XFmode the same, thoughts on that?

Yes, that would be nice. XFmode is used for long double, and is not obsolete.

>
> > > +  gcc_checking_assert (ix86_fp_comparison_strategy (GT) == IX86_FPCMP_COMI);
> > > +  rtx gt = ix86_expand_fp_compare (GT, op0, op1);
> > > +  rtx l0 = gen_label_rtx ();
> > > +  rtx l1 = gen_label_rtx ();
> > > +  rtx l2 = gen_label_rtx ();
> > > +  rtx lend = gen_label_rtx ();
> > > +  rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode,
> > > +                          gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> > > +  rtx tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un,
> > > +                                 gen_rtx_LABEL_REF (VOIDmode, l2), pc_rtx);
> > > +  rtx_insn *jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> > > +  add_reg_br_prob_note (jmp, profile_probability:: very_unlikely ());
> >
> > Please also add JUMP_LABEL to the insn.
> >
> > > +  rtx eq = gen_rtx_fmt_ee (UNEQ, VOIDmode,
> > > +                          gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> > > +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, eq,
> > > +                             gen_rtx_LABEL_REF (VOIDmode, l0), pc_rtx);
> > > +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> > > +  add_reg_br_prob_note (jmp, profile_probability::unlikely ());
> > > +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, gt,
> > > +                             gen_rtx_LABEL_REF (VOIDmode, l1), pc_rtx);
> > > +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> > > +  add_reg_br_prob_note (jmp, profile_probability::even ());
> > > +  emit_move_insn (dest, constm1_rtx);
> > > +  emit_jump (lend);
> > > +  emit_label (l0);
> >
> > and LABEL_NUSES label.
>
> Why?  That seems to be a waste of time to me, unless something uses them
> already during expansion.  Because pass_expand::execute
> runs:
>   /* We need JUMP_LABEL be set in order to redirect jumps, and hence
>      split edges which edge insertions might do.  */
>   rebuild_jump_labels (get_insns ());
> which resets all LABEL_NUSES to 0 (well, to:
>       if (LABEL_P (insn))
>         LABEL_NUSES (insn) = (LABEL_PRESERVE_P (insn) != 0);
> and then recomputes them and adds JUMP_LABEL if needed:
>               JUMP_LABEL (insn) = label;

I was not aware of that detail. Thanks for sharing (and I wonder if
all other cases should be removed from the source).

Uros.

> --- gcc/config/i386/i386.md.jj  2022-01-15 09:51:25.404468342 +0100
> +++ gcc/config/i386/i386.md     2022-01-15 09:56:31.602109421 +0100
> @@ -23892,7 +23892,7 @@ (define_expand "spaceship<mode>3"
>     (match_operand:MODEF 1 "cmp_fp_expander_operand")
>     (match_operand:MODEF 2 "cmp_fp_expander_operand")]
>    "(TARGET_80387 || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH))
> -   && TARGET_CMOVE && TARGET_IEEE_FP"
> +   && (TARGET_CMOVE || (TARGET_SAHF && TARGET_USE_SAHF))"
>  {
>    ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]);
>    DONE;
> --- gcc/config/i386/i386-expand.c.jj    2022-01-15 09:51:25.411468242 +0100
> +++ gcc/config/i386/i386-expand.c       2022-01-15 10:38:26.924333651 +0100
> @@ -2884,18 +2884,23 @@ ix86_expand_setcc (rtx dest, enum rtx_co
>  void
>  ix86_expand_fp_spaceship (rtx dest, rtx op0, rtx op1)
>  {
> -  gcc_checking_assert (ix86_fp_comparison_strategy (GT) == IX86_FPCMP_COMI);
> +  gcc_checking_assert (ix86_fp_comparison_strategy (GT) != IX86_FPCMP_ARITH);
>    rtx gt = ix86_expand_fp_compare (GT, op0, op1);
>    rtx l0 = gen_label_rtx ();
>    rtx l1 = gen_label_rtx ();
> -  rtx l2 = gen_label_rtx ();
> +  rtx l2 = TARGET_IEEE_FP ? gen_label_rtx () : NULL_RTX;
>    rtx lend = gen_label_rtx ();
> -  rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode,
> -                          gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> -  rtx tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un,
> +  rtx tmp;
> +  rtx_insn *jmp;
> +  if (l2)
> +    {
> +      rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode,
> +                              gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> +      tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un,
>                                   gen_rtx_LABEL_REF (VOIDmode, l2), pc_rtx);
> -  rtx_insn *jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> -  add_reg_br_prob_note (jmp, profile_probability:: very_unlikely ());
> +      jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> +      add_reg_br_prob_note (jmp, profile_probability:: very_unlikely ());
> +    }
>    rtx eq = gen_rtx_fmt_ee (UNEQ, VOIDmode,
>                            gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
>    tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, eq,
> @@ -2914,8 +2919,11 @@ ix86_expand_fp_spaceship (rtx dest, rtx
>    emit_label (l1);
>    emit_move_insn (dest, const1_rtx);
>    emit_jump (lend);
> -  emit_label (l2);
> -  emit_move_insn (dest, const2_rtx);
> +  if (l2)
> +    {
> +      emit_label (l2);
> +      emit_move_insn (dest, const2_rtx);
> +    }
>    emit_label (lend);
>  }
>
> --- gcc/tree-ssa-math-opts.c.jj 2022-01-15 09:51:25.402468370 +0100
> +++ gcc/tree-ssa-math-opts.c    2022-01-15 10:35:52.366533951 +0100
> @@ -4694,7 +4694,6 @@ optimize_spaceship (gimple *stmt)
>    tree arg1 = gimple_cond_lhs (stmt);
>    tree arg2 = gimple_cond_rhs (stmt);
>    if (!SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg1))
> -      || !HONOR_NANS (TREE_TYPE (arg1))
>        || optab_handler (spaceship_optab,
>                         TYPE_MODE (TREE_TYPE (arg1))) == CODE_FOR_nothing
>        || operand_equal_p (arg1, arg2, 0))
> @@ -4732,56 +4731,67 @@ optimize_spaceship (gimple *stmt)
>        return;
>      }
>
> -  /* With NaNs, </<=/>/>= are false, so we need to look for the
> -     third comparison on the false edge from whatever non-equality
> -     comparison the second comparison is.  */
> -  int i = (EDGE_SUCC (bb1, 0)->flags & EDGE_TRUE_VALUE) != 0;
> -  bb2 = EDGE_SUCC (bb1, i)->dest;
> -  g = last_stmt (bb2);
> -  if (g == NULL
> -      || gimple_code (g) != GIMPLE_COND
> -      || !single_pred_p (bb2)
> -      || (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
> -         ? !operand_equal_p (gimple_cond_rhs (g), arg2, 0)
> -         : (!operand_equal_p (gimple_cond_lhs (g), arg2, 0)
> -            || !operand_equal_p (gimple_cond_rhs (g), arg1, 0)))
> -      || !cond_only_block_p (bb2)
> -      || EDGE_SUCC (bb2, 0)->dest == EDGE_SUCC (bb2, 1)->dest)
> -    return;
> -
> -  enum tree_code ccode2
> -    = (operand_equal_p (gimple_cond_lhs (g), arg1, 0) ? LT_EXPR : GT_EXPR);
> -  switch (gimple_cond_code (g))
> +  for (int i = 0; i < 2; ++i)
>      {
> -    case LT_EXPR:
> -    case LE_EXPR:
> +      /* With NaNs, </<=/>/>= are false, so we need to look for the
> +        third comparison on the false edge from whatever non-equality
> +        comparison the second comparison is.  */
> +      if (HONOR_NANS (TREE_TYPE (arg1))
> +         && (EDGE_SUCC (bb1, i)->flags & EDGE_TRUE_VALUE) != 0)
> +       continue;
> +
> +      bb2 = EDGE_SUCC (bb1, i)->dest;
> +      g = last_stmt (bb2);
> +      if (g == NULL
> +         || gimple_code (g) != GIMPLE_COND
> +         || !single_pred_p (bb2)
> +         || (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
> +             ? !operand_equal_p (gimple_cond_rhs (g), arg2, 0)
> +             : (!operand_equal_p (gimple_cond_lhs (g), arg2, 0)
> +                || !operand_equal_p (gimple_cond_rhs (g), arg1, 0)))
> +         || !cond_only_block_p (bb2)
> +         || EDGE_SUCC (bb2, 0)->dest == EDGE_SUCC (bb2, 1)->dest)
> +       continue;
> +
> +      enum tree_code ccode2
> +       = (operand_equal_p (gimple_cond_lhs (g), arg1, 0) ? LT_EXPR : GT_EXPR);
> +      switch (gimple_cond_code (g))
> +       {
> +       case LT_EXPR:
> +       case LE_EXPR:
> +         break;
> +       case GT_EXPR:
> +       case GE_EXPR:
> +         ccode2 = ccode2 == LT_EXPR ? GT_EXPR : LT_EXPR;
> +         break;
> +       default:
> +         continue;
> +       }
> +      if (HONOR_NANS (TREE_TYPE (arg1)) && ccode == ccode2)
> +       return;
> +
> +      if ((ccode == LT_EXPR)
> +         ^ ((EDGE_SUCC (bb1, i)->flags & EDGE_TRUE_VALUE) != 0))
> +       {
> +         em1 = EDGE_SUCC (bb1, 1 - i);
> +         e1 = EDGE_SUCC (bb2, 0);
> +         e2 = EDGE_SUCC (bb2, 1);
> +         if ((ccode2 == LT_EXPR) ^ ((e1->flags & EDGE_TRUE_VALUE) == 0))
> +           std::swap (e1, e2);
> +       }
> +      else
> +       {
> +         e1 = EDGE_SUCC (bb1, 1 - i);
> +         em1 = EDGE_SUCC (bb2, 0);
> +         e2 = EDGE_SUCC (bb2, 1);
> +         if ((ccode2 != LT_EXPR) ^ ((em1->flags & EDGE_TRUE_VALUE) == 0))
> +           std::swap (em1, e2);
> +       }
>        break;
> -    case GT_EXPR:
> -    case GE_EXPR:
> -      ccode2 = ccode2 == LT_EXPR ? GT_EXPR : LT_EXPR;
> -      break;
> -    default:
> -      return;
>      }
> -  if (ccode == ccode2)
> -    return;
>
> -  if (ccode == LT_EXPR)
> -    {
> -      em1 = EDGE_SUCC (bb1, 1 - i);
> -      e1 = EDGE_SUCC (bb2, 0);
> -      e2 = EDGE_SUCC (bb2, 1);
> -      if ((e1->flags & EDGE_TRUE_VALUE) == 0)
> -       std::swap (e1, e2);
> -    }
> -  else
> -    {
> -      e1 = EDGE_SUCC (bb1, 1 - i);
> -      em1 = EDGE_SUCC (bb2, 0);
> -      e2 = EDGE_SUCC (bb2, 1);
> -      if ((em1->flags & EDGE_TRUE_VALUE) == 0)
> -       std::swap (em1, e2);
> -    }
> +  if (em1 == NULL)
> +    return;
>
>    g = gimple_build_call_internal (IFN_SPACESHIP, 2, arg1, arg2);
>    tree lhs = make_ssa_name (integer_type_node);
> @@ -4796,14 +4806,19 @@ optimize_spaceship (gimple *stmt)
>
>    g = last_stmt (bb1);
>    cond = as_a <gcond *> (g);
> -  gimple_cond_set_code (cond, EQ_EXPR);
>    gimple_cond_set_lhs (cond, lhs);
>    if (em1->src == bb1)
> -    gimple_cond_set_rhs (cond, integer_minus_one_node);
> +    {
> +      gimple_cond_set_rhs (cond, integer_minus_one_node);
> +      gimple_cond_set_code (cond, (em1->flags & EDGE_TRUE_VALUE)
> +                                 ? EQ_EXPR : NE_EXPR);
> +    }
>    else
>      {
>        gcc_assert (e1->src == bb1);
>        gimple_cond_set_rhs (cond, integer_one_node);
> +      gimple_cond_set_code (cond, (e1->flags & EDGE_TRUE_VALUE)
> +                                 ? EQ_EXPR : NE_EXPR);
>      }
>    update_stmt (g);
>
>
>
>         Jakub
>


* [PATCH] widening_mul, i386, v2: Improve spaceship expansion on x86 [PR103973]
  2022-01-15 10:42     ` Uros Bizjak
@ 2022-01-15 11:22       ` Jakub Jelinek
  2022-01-15 16:40         ` Uros Bizjak
  2022-01-17 12:04         ` Richard Biener
  0 siblings, 2 replies; 8+ messages in thread
From: Jakub Jelinek @ 2022-01-15 11:22 UTC (permalink / raw)
  To: Richard Biener, Uros Bizjak; +Cc: gcc-patches

On Sat, Jan 15, 2022 at 11:42:55AM +0100, Uros Bizjak wrote:
> Yes, that would be nice. XFmode is used for long double, and is not obsolete.

Ok, that seems to work.  Compared to the incremental patch I've posted, I
also had to add handling of the case where we have just
x == y ? 0 : x < y ? -1 : 1 (both with and without -ffast-math).
Apparently even that is worth optimizing.
Tested so far on the new testcases; I will run a full bootstrap/regtest tonight.

> > Why?  That seems to be a waste of time to me, unless something uses them
> > already during expansion.  Because pass_expand::execute
> > runs:
> >   /* We need JUMP_LABEL be set in order to redirect jumps, and hence
> >      split edges which edge insertions might do.  */
> >   rebuild_jump_labels (get_insns ());
> > which resets all LABEL_NUSES to 0 (well, to:
> >       if (LABEL_P (insn))
> >         LABEL_NUSES (insn) = (LABEL_PRESERVE_P (insn) != 0);
> > and then recomputes them and adds JUMP_LABEL if needed:
> >               JUMP_LABEL (insn) = label;
> 
> I was not aware of that detail. Thanks for sharing (and I wonder if
> all other cases should be removed from the source).

I guess it depends: for code that can only be called during the expand pass,
dropping it should be just fine; for code that can also (or only) be called
later, I think setting JUMP_LABEL and a correct LABEL_NUSES is needed, because
nothing will fix it up afterwards.

2022-01-15  Jakub Jelinek  <jakub@redhat.com>

	PR target/103973
	* tree-cfg.h (cond_only_block_p): Declare.
	* tree-ssa-phiopt.c (cond_only_block_p): Move function to ...
	* tree-cfg.c (cond_only_block_p): ... here.  No longer static.
	* optabs.def (spaceship_optab): New optab.
	* internal-fn.def (SPACESHIP): New internal function.
	* internal-fn.h (expand_SPACESHIP): Declare.
	* internal-fn.c (expand_PHI): Formatting fix.
	(expand_SPACESHIP): New function.
	* tree-ssa-math-opts.c (optimize_spaceship): New function.
	(math_opts_dom_walker::after_dom_children): Use it.
	* config/i386/i386.md (spaceship<mode>3): New define_expand.
	* config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare.
	* config/i386/i386-expand.c (ix86_expand_fp_spaceship): New function.
	* doc/md.texi (spaceship@var{m}3): Document.

	* gcc.target/i386/pr103973-1.c: New test.
	* gcc.target/i386/pr103973-2.c: New test.
	* gcc.target/i386/pr103973-3.c: New test.
	* gcc.target/i386/pr103973-4.c: New test.
	* gcc.target/i386/pr103973-5.c: New test.
	* gcc.target/i386/pr103973-6.c: New test.
	* gcc.target/i386/pr103973-7.c: New test.
	* gcc.target/i386/pr103973-8.c: New test.
	* gcc.target/i386/pr103973-9.c: New test.
	* gcc.target/i386/pr103973-10.c: New test.
	* gcc.target/i386/pr103973-11.c: New test.
	* gcc.target/i386/pr103973-12.c: New test.
	* gcc.target/i386/pr103973-13.c: New test.
	* gcc.target/i386/pr103973-14.c: New test.
	* gcc.target/i386/pr103973-15.c: New test.
	* gcc.target/i386/pr103973-16.c: New test.
	* gcc.target/i386/pr103973-17.c: New test.
	* gcc.target/i386/pr103973-18.c: New test.
	* gcc.target/i386/pr103973-19.c: New test.
	* gcc.target/i386/pr103973-20.c: New test.
	* g++.target/i386/pr103973-1.C: New test.
	* g++.target/i386/pr103973-2.C: New test.
	* g++.target/i386/pr103973-3.C: New test.
	* g++.target/i386/pr103973-4.C: New test.
	* g++.target/i386/pr103973-5.C: New test.
	* g++.target/i386/pr103973-6.C: New test.
	* g++.target/i386/pr103973-7.C: New test.
	* g++.target/i386/pr103973-8.C: New test.
	* g++.target/i386/pr103973-9.C: New test.
	* g++.target/i386/pr103973-10.C: New test.
	* g++.target/i386/pr103973-11.C: New test.
	* g++.target/i386/pr103973-12.C: New test.
	* g++.target/i386/pr103973-13.C: New test.
	* g++.target/i386/pr103973-14.C: New test.
	* g++.target/i386/pr103973-15.C: New test.
	* g++.target/i386/pr103973-16.C: New test.
	* g++.target/i386/pr103973-17.C: New test.
	* g++.target/i386/pr103973-18.C: New test.
	* g++.target/i386/pr103973-19.C: New test.
	* g++.target/i386/pr103973-20.C: New test.

--- gcc/tree-cfg.h.jj	2022-01-14 23:57:44.491718086 +0100
+++ gcc/tree-cfg.h	2022-01-15 09:51:25.359468982 +0100
@@ -111,6 +111,7 @@ extern basic_block gimple_switch_label_b
 extern basic_block gimple_switch_default_bb (function *, gswitch *);
 extern edge gimple_switch_edge (function *, gswitch *, unsigned);
 extern edge gimple_switch_default_edge (function *, gswitch *);
+extern bool cond_only_block_p (basic_block);
 
 /* Return true if the LHS of a call should be removed.  */
 
--- gcc/tree-ssa-phiopt.c.jj	2022-01-14 23:57:44.536717549 +0100
+++ gcc/tree-ssa-phiopt.c	2022-01-15 09:51:25.361468954 +0100
@@ -1958,31 +1958,6 @@ minmax_replacement (basic_block cond_bb,
   return true;
 }
 
-/* Return true if the only executable statement in BB is a GIMPLE_COND.  */
-
-static bool
-cond_only_block_p (basic_block bb)
-{
-  /* BB must have no executable statements.  */
-  gimple_stmt_iterator gsi = gsi_after_labels (bb);
-  if (phi_nodes (bb))
-    return false;
-  while (!gsi_end_p (gsi))
-    {
-      gimple *stmt = gsi_stmt (gsi);
-      if (is_gimple_debug (stmt))
-	;
-      else if (gimple_code (stmt) == GIMPLE_NOP
-	       || gimple_code (stmt) == GIMPLE_PREDICT
-	       || gimple_code (stmt) == GIMPLE_COND)
-	;
-      else
-	return false;
-      gsi_next (&gsi);
-    }
-  return true;
-}
-
 /* Attempt to optimize (x <=> y) cmp 0 and similar comparisons.
    For strong ordering <=> try to match something like:
     <bb 2> :  // cond3_bb (== cond2_bb)
--- gcc/tree-cfg.c.jj	2022-01-14 23:57:44.477718253 +0100
+++ gcc/tree-cfg.c	2022-01-15 09:51:25.363468925 +0100
@@ -9410,6 +9410,31 @@ gimple_switch_default_edge (function *if
   return gimple_switch_edge (ifun, gs, 0);
 }
 
+/* Return true if the only executable statement in BB is a GIMPLE_COND.  */
+
+bool
+cond_only_block_p (basic_block bb)
+{
+  /* BB must have no executable statements other than a GIMPLE_COND.  */
+  gimple_stmt_iterator gsi = gsi_after_labels (bb);
+  if (phi_nodes (bb))
+    return false;
+  while (!gsi_end_p (gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      if (is_gimple_debug (stmt))
+	;
+      else if (gimple_code (stmt) == GIMPLE_NOP
+	       || gimple_code (stmt) == GIMPLE_PREDICT
+	       || gimple_code (stmt) == GIMPLE_COND)
+	;
+      else
+	return false;
+      gsi_next (&gsi);
+    }
+  return true;
+}
+
 
 /* Emit return warnings.  */
 
--- gcc/optabs.def.jj	2022-01-14 23:57:44.445718634 +0100
+++ gcc/optabs.def	2022-01-15 09:51:25.383468640 +0100
@@ -259,6 +259,7 @@ OPTAB_D (usubv4_optab, "usubv$I$a4")
 OPTAB_D (umulv4_optab, "umulv$I$a4")
 OPTAB_D (negv3_optab, "negv$I$a3")
 OPTAB_D (addptr3_optab, "addptr$a3")
+OPTAB_D (spaceship_optab, "spaceship$a3")
 
 OPTAB_D (smul_highpart_optab, "smul$a3_highpart")
 OPTAB_D (umul_highpart_optab, "umul$a3_highpart")
--- gcc/internal-fn.def.jj	2022-01-14 23:57:44.433718778 +0100
+++ gcc/internal-fn.def	2022-01-15 09:51:25.399468413 +0100
@@ -430,6 +430,9 @@ DEF_INTERNAL_FN (NOP, ECF_CONST | ECF_LE
 /* Temporary vehicle for __builtin_shufflevector.  */
 DEF_INTERNAL_FN (SHUFFLEVECTOR, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 
+/* <=> optimization.  */
+DEF_INTERNAL_FN (SPACESHIP, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
+
 #undef DEF_INTERNAL_INT_FN
 #undef DEF_INTERNAL_FLT_FN
 #undef DEF_INTERNAL_FLT_FLOATN_FN
--- gcc/internal-fn.h.jj	2022-01-14 23:57:44.445718634 +0100
+++ gcc/internal-fn.h	2022-01-15 09:51:25.399468413 +0100
@@ -241,6 +241,7 @@ extern void expand_internal_call (gcall
 extern void expand_internal_call (internal_fn, gcall *);
 extern void expand_PHI (internal_fn, gcall *);
 extern void expand_SHUFFLEVECTOR (internal_fn, gcall *);
+extern void expand_SPACESHIP (internal_fn, gcall *);
 
 extern bool vectorized_internal_fn_supported_p (internal_fn, tree);
 
--- gcc/internal-fn.c.jj	2022-01-14 23:57:44.433718778 +0100
+++ gcc/internal-fn.c	2022-01-15 09:51:25.400468399 +0100
@@ -4425,5 +4425,27 @@ expand_SHUFFLEVECTOR (internal_fn, gcall
 void
 expand_PHI (internal_fn, gcall *)
 {
-    gcc_unreachable ();
+  gcc_unreachable ();
+}
+
+void
+expand_SPACESHIP (internal_fn, gcall *stmt)
+{
+  tree lhs = gimple_call_lhs (stmt);
+  tree rhs1 = gimple_call_arg (stmt, 0);
+  tree rhs2 = gimple_call_arg (stmt, 1);
+  tree type = TREE_TYPE (rhs1);
+
+  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+  rtx op1 = expand_normal (rhs1);
+  rtx op2 = expand_normal (rhs2);
+
+  class expand_operand ops[3];
+  create_output_operand (&ops[0], target, TYPE_MODE (TREE_TYPE (lhs)));
+  create_input_operand (&ops[1], op1, TYPE_MODE (type));
+  create_input_operand (&ops[2], op2, TYPE_MODE (type));
+  insn_code icode = optab_handler (spaceship_optab, TYPE_MODE (type));
+  expand_insn (icode, 3, ops);
+  if (!rtx_equal_p (target, ops[0].value))
+    emit_move_insn (target, ops[0].value);
 }
--- gcc/tree-ssa-math-opts.c.jj	2022-01-14 23:57:44.492718074 +0100
+++ gcc/tree-ssa-math-opts.c	2022-01-15 11:37:13.131069782 +0100
@@ -4637,6 +4637,227 @@ convert_mult_to_highpart (gassign *stmt,
   return true;
 }
 
+/* If target has spaceship<MODE>3 expander, pattern recognize
+   <bb 2> [local count: 1073741824]:
+   if (a_2(D) == b_3(D))
+     goto <bb 6>; [34.00%]
+   else
+     goto <bb 3>; [66.00%]
+
+   <bb 3> [local count: 708669601]:
+   if (a_2(D) < b_3(D))
+     goto <bb 6>; [1.04%]
+   else
+     goto <bb 4>; [98.96%]
+
+   <bb 4> [local count: 701299439]:
+   if (a_2(D) > b_3(D))
+     goto <bb 5>; [48.89%]
+   else
+     goto <bb 6>; [51.11%]
+
+   <bb 5> [local count: 342865295]:
+
+   <bb 6> [local count: 1073741824]:
+   and turn it into:
+   <bb 2> [local count: 1073741824]:
+   _1 = .SPACESHIP (a_2(D), b_3(D));
+   if (_1 == 0)
+     goto <bb 6>; [34.00%]
+   else
+     goto <bb 3>; [66.00%]
+
+   <bb 3> [local count: 708669601]:
+   if (_1 == -1)
+     goto <bb 6>; [1.04%]
+   else
+     goto <bb 4>; [98.96%]
+
+   <bb 4> [local count: 701299439]:
+   if (_1 == 1)
+     goto <bb 5>; [48.89%]
+   else
+     goto <bb 6>; [51.11%]
+
+   <bb 5> [local count: 342865295]:
+
+   <bb 6> [local count: 1073741824]:
+   so that the backend can emit optimal comparison and
+   conditional jump sequence.  */
+
+static void
+optimize_spaceship (gimple *stmt)
+{
+  enum tree_code code = gimple_cond_code (stmt);
+  if (code != EQ_EXPR && code != NE_EXPR)
+    return;
+  tree arg1 = gimple_cond_lhs (stmt);
+  tree arg2 = gimple_cond_rhs (stmt);
+  if (!SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg1))
+      || optab_handler (spaceship_optab,
+			TYPE_MODE (TREE_TYPE (arg1))) == CODE_FOR_nothing
+      || operand_equal_p (arg1, arg2, 0))
+    return;
+
+  basic_block bb0 = gimple_bb (stmt), bb1, bb2 = NULL;
+  edge em1 = NULL, e1 = NULL, e2 = NULL;
+  bb1 = EDGE_SUCC (bb0, 1)->dest;
+  if (((EDGE_SUCC (bb0, 0)->flags & EDGE_TRUE_VALUE) != 0) ^ (code == EQ_EXPR))
+    bb1 = EDGE_SUCC (bb0, 0)->dest;
+
+  gimple *g = last_stmt (bb1);
+  if (g == NULL
+      || gimple_code (g) != GIMPLE_COND
+      || !single_pred_p (bb1)
+      || (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
+	  ? !operand_equal_p (gimple_cond_rhs (g), arg2, 0)
+	  : (!operand_equal_p (gimple_cond_lhs (g), arg2, 0)
+	     || !operand_equal_p (gimple_cond_rhs (g), arg1, 0)))
+      || !cond_only_block_p (bb1))
+    return;
+
+  enum tree_code ccode = (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
+			  ? LT_EXPR : GT_EXPR);
+  switch (gimple_cond_code (g))
+    {
+    case LT_EXPR:
+    case LE_EXPR:
+      break;
+    case GT_EXPR:
+    case GE_EXPR:
+      ccode = ccode == LT_EXPR ? GT_EXPR : LT_EXPR;
+      break;
+    default:
+      return;
+    }
+
+  for (int i = 0; i < 2; ++i)
+    {
+      /* With NaNs, </<=/>/>= are false, so we need to look for the
+	 third comparison on the false edge from whatever non-equality
+	 comparison the second comparison is.  */
+      if (HONOR_NANS (TREE_TYPE (arg1))
+	  && (EDGE_SUCC (bb1, i)->flags & EDGE_TRUE_VALUE) != 0)
+	continue;
+
+      bb2 = EDGE_SUCC (bb1, i)->dest;
+      g = last_stmt (bb2);
+      if (g == NULL
+	  || gimple_code (g) != GIMPLE_COND
+	  || !single_pred_p (bb2)
+	  || (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
+	      ? !operand_equal_p (gimple_cond_rhs (g), arg2, 0)
+	      : (!operand_equal_p (gimple_cond_lhs (g), arg2, 0)
+		 || !operand_equal_p (gimple_cond_rhs (g), arg1, 0)))
+	  || !cond_only_block_p (bb2)
+	  || EDGE_SUCC (bb2, 0)->dest == EDGE_SUCC (bb2, 1)->dest)
+	continue;
+
+      enum tree_code ccode2
+	= (operand_equal_p (gimple_cond_lhs (g), arg1, 0) ? LT_EXPR : GT_EXPR);
+      switch (gimple_cond_code (g))
+	{
+	case LT_EXPR:
+	case LE_EXPR:
+	  break;
+	case GT_EXPR:
+	case GE_EXPR:
+	  ccode2 = ccode2 == LT_EXPR ? GT_EXPR : LT_EXPR;
+	  break;
+	default:
+	  continue;
+	}
+      if (HONOR_NANS (TREE_TYPE (arg1)) && ccode == ccode2)
+	continue;
+
+      if ((ccode == LT_EXPR)
+	  ^ ((EDGE_SUCC (bb1, i)->flags & EDGE_TRUE_VALUE) != 0))
+	{
+	  em1 = EDGE_SUCC (bb1, 1 - i);
+	  e1 = EDGE_SUCC (bb2, 0);
+	  e2 = EDGE_SUCC (bb2, 1);
+	  if ((ccode2 == LT_EXPR) ^ ((e1->flags & EDGE_TRUE_VALUE) == 0))
+	    std::swap (e1, e2);
+	}
+      else
+	{
+	  e1 = EDGE_SUCC (bb1, 1 - i);
+	  em1 = EDGE_SUCC (bb2, 0);
+	  e2 = EDGE_SUCC (bb2, 1);
+	  if ((ccode2 != LT_EXPR) ^ ((em1->flags & EDGE_TRUE_VALUE) == 0))
+	    std::swap (em1, e2);
+	}
+      break;
+    }
+
+  if (em1 == NULL)
+    {
+      if ((ccode == LT_EXPR)
+	  ^ ((EDGE_SUCC (bb1, 0)->flags & EDGE_TRUE_VALUE) != 0))
+	{
+	  em1 = EDGE_SUCC (bb1, 1);
+	  e1 = EDGE_SUCC (bb1, 0);
+	  e2 = (e1->flags & EDGE_TRUE_VALUE) ? em1 : e1;
+	}
+      else
+	{
+	  em1 = EDGE_SUCC (bb1, 0);
+	  e1 = EDGE_SUCC (bb1, 1);
+	  e2 = (e1->flags & EDGE_TRUE_VALUE) ? em1 : e1;
+	}
+    }
+
+  g = gimple_build_call_internal (IFN_SPACESHIP, 2, arg1, arg2);
+  tree lhs = make_ssa_name (integer_type_node);
+  gimple_call_set_lhs (g, lhs);
+  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
+
+  gcond *cond = as_a <gcond *> (stmt);
+  gimple_cond_set_lhs (cond, lhs);
+  gimple_cond_set_rhs (cond, integer_zero_node);
+  update_stmt (stmt);
+
+  g = last_stmt (bb1);
+  cond = as_a <gcond *> (g);
+  gimple_cond_set_lhs (cond, lhs);
+  if (em1->src == bb1 && e2 != em1)
+    {
+      gimple_cond_set_rhs (cond, integer_minus_one_node);
+      gimple_cond_set_code (cond, (em1->flags & EDGE_TRUE_VALUE)
+				  ? EQ_EXPR : NE_EXPR);
+    }
+  else
+    {
+      gcc_assert (e1->src == bb1 && e2 != e1);
+      gimple_cond_set_rhs (cond, integer_one_node);
+      gimple_cond_set_code (cond, (e1->flags & EDGE_TRUE_VALUE)
+				  ? EQ_EXPR : NE_EXPR);
+    }
+  update_stmt (g);
+
+  if (e2 != e1 && e2 != em1)
+    {
+      g = last_stmt (bb2);
+      cond = as_a <gcond *> (g);
+      gimple_cond_set_lhs (cond, lhs);
+      if (em1->src == bb2)
+	gimple_cond_set_rhs (cond, integer_minus_one_node);
+      else
+	{
+	  gcc_assert (e1->src == bb2);
+	  gimple_cond_set_rhs (cond, integer_one_node);
+	}
+      gimple_cond_set_code (cond,
+			    (e2->flags & EDGE_TRUE_VALUE) ? NE_EXPR : EQ_EXPR);
+      update_stmt (g);
+    }
+
+  wide_int wm1 = wi::minus_one (TYPE_PRECISION (integer_type_node));
+  wide_int w2 = wi::two (TYPE_PRECISION (integer_type_node));
+  set_range_info (lhs, VR_RANGE, wm1, w2);
+}
+
 
 /* Find integer multiplications where the operands are extended from
    smaller types, and replace the MULT_EXPR with a WIDEN_MULT_EXPR
@@ -4798,6 +5019,8 @@ math_opts_dom_walker::after_dom_children
 	      break;
 	    }
 	}
+      else if (gimple_code (stmt) == GIMPLE_COND)
+	optimize_spaceship (stmt);
       gsi_next (&gsi);
     }
   if (fma_state.m_deferring_p
--- gcc/config/i386/i386.md.jj	2022-01-14 23:57:59.047544505 +0100
+++ gcc/config/i386/i386.md	2022-01-15 12:13:28.116073760 +0100
@@ -23886,6 +23886,28 @@ (define_insn "hreset"
   [(set_attr "type" "other")
    (set_attr "length" "4")])
 
+;; Spaceship optimization
+(define_expand "spaceship<mode>3"
+  [(match_operand:SI 0 "register_operand")
+   (match_operand:MODEF 1 "cmp_fp_expander_operand")
+   (match_operand:MODEF 2 "cmp_fp_expander_operand")]
+  "(TARGET_80387 || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH))
+   && (TARGET_CMOVE || (TARGET_SAHF && TARGET_USE_SAHF))"
+{
+  ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]);
+  DONE;
+})
+
+(define_expand "spaceshipxf3"
+  [(match_operand:SI 0 "register_operand")
+   (match_operand:XF 1 "nonmemory_operand")
+   (match_operand:XF 2 "nonmemory_operand")]
+  "TARGET_80387 && (TARGET_CMOVE || (TARGET_SAHF && TARGET_USE_SAHF))"
+{
+  ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]);
+  DONE;
+})
+
 (include "mmx.md")
 (include "sse.md")
 (include "sync.md")
--- gcc/config/i386/i386-protos.h.jj	2022-01-14 23:57:44.398719195 +0100
+++ gcc/config/i386/i386-protos.h	2022-01-15 09:51:25.410468256 +0100
@@ -150,6 +150,7 @@ extern bool ix86_expand_int_vec_cmp (rtx
 extern bool ix86_expand_fp_vec_cmp (rtx[]);
 extern void ix86_expand_sse_movcc (rtx, rtx, rtx, rtx);
 extern void ix86_expand_sse_unpack (rtx, rtx, bool, bool);
+extern void ix86_expand_fp_spaceship (rtx, rtx, rtx);
 extern bool ix86_expand_int_addcc (rtx[]);
 extern rtx_insn *ix86_expand_call (rtx, rtx, rtx, rtx, rtx, bool);
 extern bool ix86_call_use_plt_p (rtx);
--- gcc/config/i386/i386-expand.c.jj	2022-01-14 23:57:44.379719421 +0100
+++ gcc/config/i386/i386-expand.c	2022-01-15 10:38:26.924333651 +0100
@@ -2879,6 +2879,54 @@ ix86_expand_setcc (rtx dest, enum rtx_co
   emit_insn (gen_rtx_SET (dest, ret));
 }
 
+/* Expand floating point DEST = OP0 <=> OP1: -1, 0, 1 or 2 if unordered.  */
+
+void
+ix86_expand_fp_spaceship (rtx dest, rtx op0, rtx op1)
+{
+  gcc_checking_assert (ix86_fp_comparison_strategy (GT) != IX86_FPCMP_ARITH);
+  rtx gt = ix86_expand_fp_compare (GT, op0, op1);
+  rtx l0 = gen_label_rtx ();
+  rtx l1 = gen_label_rtx ();
+  rtx l2 = TARGET_IEEE_FP ? gen_label_rtx () : NULL_RTX;
+  rtx lend = gen_label_rtx ();
+  rtx tmp;
+  rtx_insn *jmp;
+  if (l2)
+    {
+      rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode,
+			       gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
+      tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un,
+				  gen_rtx_LABEL_REF (VOIDmode, l2), pc_rtx);
+      jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
+      add_reg_br_prob_note (jmp, profile_probability::very_unlikely ());
+    }
+  rtx eq = gen_rtx_fmt_ee (UNEQ, VOIDmode,
+			   gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
+  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, eq,
+			      gen_rtx_LABEL_REF (VOIDmode, l0), pc_rtx);
+  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
+  add_reg_br_prob_note (jmp, profile_probability::unlikely ());
+  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, gt,
+			      gen_rtx_LABEL_REF (VOIDmode, l1), pc_rtx);
+  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
+  add_reg_br_prob_note (jmp, profile_probability::even ());
+  emit_move_insn (dest, constm1_rtx);
+  emit_jump (lend);
+  emit_label (l0);
+  emit_move_insn (dest, const0_rtx);
+  emit_jump (lend);
+  emit_label (l1);
+  emit_move_insn (dest, const1_rtx);
+  emit_jump (lend);
+  if (l2)
+    {
+      emit_label (l2);
+      emit_move_insn (dest, const2_rtx);
+    }
+  emit_label (lend);
+}
+
 /* Expand comparison setting or clearing carry flag.  Return true when
    successful and set pop for the operation.  */
 static bool
--- gcc/doc/md.texi.jj	2022-01-14 23:57:44.419718944 +0100
+++ gcc/doc/md.texi	2022-01-15 09:51:25.429467985 +0100
@@ -8055,6 +8055,15 @@ inclusive and operand 1 exclusive.
 If this pattern is not defined, a call to the library function
 @code{__clear_cache} is used.
 
+@cindex @code{spaceship@var{m}3} instruction pattern
+@item @samp{spaceship@var{m}3}
+Initialize output operand 0, which has an integer mode, to -1, 0, 1 or 2
+if operand 1 with mode @var{m} compares less than, equal to, greater
+than or unordered with operand 2, respectively.
+@var{m} should be a scalar floating point mode.
+
+This pattern is not allowed to @code{FAIL}.
+
 @end table
 
 @end ifset
--- gcc/testsuite/gcc.target/i386/pr103973-1.c.jj	2022-01-15 09:51:25.430467971 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-1.c	2022-01-15 09:51:25.430467971 +0100
@@ -0,0 +1,98 @@
+/* PR target/103973 */
+/* { dg-do run } */
+/* { dg-options "-O2 -save-temps" } */
+/* { dg-final { scan-assembler-not "\tucomisd" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "\tcomisd" 4 { target { ! ia32 } } } } */
+
+__attribute__((noipa)) int m1 (void) { return -1; }
+__attribute__((noipa)) int p0 (void) { return 0; }
+__attribute__((noipa)) int p1 (void) { return 1; }
+__attribute__((noipa)) int p2 (void) { return 2; }
+
+__attribute__((noipa)) int
+foo (double a, double b)
+{
+  if (a == b)
+    return 0;
+  if (a < b)
+    return -1;
+  if (a > b)
+    return 1;
+  return 2;
+}
+
+__attribute__((noipa)) int
+bar (double a, double b)
+{
+  if (a == b)
+    return p0 ();
+  if (a < b)
+    return m1 ();
+  if (a > b)
+    return p1 ();
+  return p2 ();
+}
+
+__attribute__((noipa)) int
+baz (double a, double b)
+{
+  if (a == b)
+    return p0 ();
+  if (b < a)
+    return p1 ();
+  if (a < b)
+    return m1 ();
+  return p2 ();
+}
+
+__attribute__((noipa)) int
+qux (double a)
+{
+  if (a != 0.0f)
+    {
+      if (a <= 0.0f)
+	return -1;
+      if (a >= 0.0f)
+	return 1;
+      return 2;
+    }
+  return 0;
+}
+
+int
+main ()
+{
+  double m5 = -5.0f;
+  double p5 = 5.0f;
+  volatile double p0 = 0.0f;
+  double nan = p0 / p0;
+  if (foo (p5, p5) != 0 || foo (m5, m5) != 0)
+    __builtin_abort ();
+  if (foo (m5, p5) != -1 || foo (p5, m5) != 1)
+    __builtin_abort ();
+  if (foo (m5, nan) != 2 || foo (nan, p5) != 2)
+    __builtin_abort ();
+  if (foo (nan, nan) != 2)
+    __builtin_abort ();
+  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
+    __builtin_abort ();
+  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
+    __builtin_abort ();
+  if (bar (m5, nan) != 2 || bar (nan, p5) != 2)
+    __builtin_abort ();
+  if (bar (nan, nan) != 2)
+    __builtin_abort ();
+  if (baz (p5, p5) != 0 || baz (m5, m5) != 0)
+    __builtin_abort ();
+  if (baz (m5, p5) != -1 || baz (p5, m5) != 1)
+    __builtin_abort ();
+  if (baz (m5, nan) != 2 || baz (nan, p5) != 2)
+    __builtin_abort ();
+  if (baz (nan, nan) != 2)
+    __builtin_abort ();
+  if (qux (p0) != 0 || qux (nan) != 2)
+    __builtin_abort ();
+  if (qux (m5) != -1 || qux (p5) != 1)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.target/i386/pr103973-2.c.jj	2022-01-15 09:51:25.430467971 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-2.c	2022-01-15 12:00:15.864355970 +0100
@@ -0,0 +1,7 @@
+/* PR target/103973 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -march=i686 -mfpmath=387" } */
+/* { dg-final { scan-assembler-not "\tfucom" } } */
+/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
+
+#include "pr103973-1.c"
--- gcc/testsuite/gcc.target/i386/pr103973-3.c.jj	2022-01-15 09:51:25.430467971 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-3.c	2022-01-15 09:51:25.430467971 +0100
@@ -0,0 +1,8 @@
+/* PR target/103973 */
+/* { dg-do run } */
+/* { dg-options "-O2 -save-temps" } */
+/* { dg-final { scan-assembler-not "\tucomiss" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "\tcomiss" 4 { target { ! ia32 } } } } */
+
+#define double float
+#include "pr103973-1.c"
--- gcc/testsuite/gcc.target/i386/pr103973-4.c.jj	2022-01-15 09:51:25.430467971 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-4.c	2022-01-15 12:00:15.864355970 +0100
@@ -0,0 +1,8 @@
+/* PR target/103973 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -march=i686 -mfpmath=387" } */
+/* { dg-final { scan-assembler-not "\tfucom" } } */
+/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
+
+#define double float
+#include "pr103973-1.c"
--- gcc/testsuite/gcc.target/i386/pr103973-5.c.jj	2022-01-15 11:04:02.427452420 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-5.c	2022-01-15 11:06:39.594216502 +0100
@@ -0,0 +1,85 @@
+/* PR target/103973 */
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math -save-temps" } */
+/* { dg-final { scan-assembler-not "\tucomisd" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "\tcomisd" 4 { target { ! ia32 } } } } */
+
+__attribute__((noipa)) int m1 (void) { return -1; }
+__attribute__((noipa)) int p0 (void) { return 0; }
+__attribute__((noipa)) int p1 (void) { return 1; }
+__attribute__((noipa)) int p2 (void) { return 2; }
+
+__attribute__((noipa)) int
+foo (double a, double b)
+{
+  if (a == b)
+    return 0;
+  if (a < b)
+    return -1;
+  if (a > b)
+    return 1;
+  return 2;
+}
+
+__attribute__((noipa)) int
+bar (double a, double b)
+{
+  if (a == b)
+    return p0 ();
+  if (a < b)
+    return m1 ();
+  if (a > b)
+    return p1 ();
+  return p2 ();
+}
+
+__attribute__((noipa)) int
+baz (double a, double b)
+{
+  if (a == b)
+    return p0 ();
+  if (b < a)
+    return p1 ();
+  if (a < b)
+    return m1 ();
+  return p2 ();
+}
+
+__attribute__((noipa)) int
+qux (double a)
+{
+  if (a != 0.0f)
+    {
+      if (a <= 0.0f)
+	return -1;
+      if (a >= 0.0f)
+	return 1;
+      return 2;
+    }
+  return 0;
+}
+
+int
+main ()
+{
+  double m5 = -5.0f;
+  double p5 = 5.0f;
+  double p0 = 0.0f;
+  if (foo (p5, p5) != 0 || foo (m5, m5) != 0)
+    __builtin_abort ();
+  if (foo (m5, p5) != -1 || foo (p5, m5) != 1)
+    __builtin_abort ();
+  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
+    __builtin_abort ();
+  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
+    __builtin_abort ();
+  if (baz (p5, p5) != 0 || baz (m5, m5) != 0)
+    __builtin_abort ();
+  if (baz (m5, p5) != -1 || baz (p5, m5) != 1)
+    __builtin_abort ();
+  if (qux (p0) != 0)
+    __builtin_abort ();
+  if (qux (m5) != -1 || qux (p5) != 1)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.target/i386/pr103973-6.c.jj	2022-01-15 11:05:24.377286081 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-6.c	2022-01-15 12:00:15.864355970 +0100
@@ -0,0 +1,7 @@
+/* PR target/103973 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387" } */
+/* { dg-final { scan-assembler-not "\tfucom" } } */
+/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
+
+#include "pr103973-5.c"
--- gcc/testsuite/gcc.target/i386/pr103973-7.c.jj	2022-01-15 11:05:28.620225748 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-7.c	2022-01-15 11:06:03.899724076 +0100
@@ -0,0 +1,8 @@
+/* PR target/103973 */
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math -save-temps" } */
+/* { dg-final { scan-assembler-not "\tucomiss" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "\tcomiss" 4 { target { ! ia32 } } } } */
+
+#define double float
+#include "pr103973-5.c"
--- gcc/testsuite/gcc.target/i386/pr103973-8.c.jj	2022-01-15 11:05:32.273173801 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-8.c	2022-01-15 12:00:15.865355956 +0100
@@ -0,0 +1,8 @@
+/* PR target/103973 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387" } */
+/* { dg-final { scan-assembler-not "\tfucom" } } */
+/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
+
+#define double float
+#include "pr103973-5.c"
--- gcc/testsuite/gcc.target/i386/pr103973-9.c.jj	2022-01-15 11:41:11.895661977 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-9.c	2022-01-15 11:43:31.718668421 +0100
@@ -0,0 +1,89 @@
+/* PR target/103973 */
+/* { dg-do run } */
+/* { dg-options "-O2 -save-temps" } */
+/* { dg-final { scan-assembler-not "\tucomisd" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "\tcomisd" 4 { target { ! ia32 } } } } */
+
+__attribute__((noipa)) int m1 (void) { return -1; }
+__attribute__((noipa)) int p0 (void) { return 0; }
+__attribute__((noipa)) int p1 (void) { return 1; }
+
+__attribute__((noipa)) int
+foo (double a, double b)
+{
+  if (a == b)
+    return 0;
+  if (a < b)
+    return -1;
+  return 1;
+}
+
+__attribute__((noipa)) int
+bar (double a, double b)
+{
+  if (a == b)
+    return p0 ();
+  if (a < b)
+    return m1 ();
+  return p1 ();
+}
+
+__attribute__((noipa)) int
+baz (double a, double b)
+{
+  if (a == b)
+    return p0 ();
+  if (b < a)
+    return p1 ();
+  return m1 ();
+}
+
+__attribute__((noipa)) int
+qux (double a)
+{
+  if (a != 0.0f)
+    {
+      if (a <= 0.0f)
+	return -1;
+      return 1;
+    }
+  return 0;
+}
+
+int
+main ()
+{
+  double m5 = -5.0f;
+  double p5 = 5.0f;
+  volatile double p0 = 0.0f;
+  double nan = p0 / p0;
+  if (foo (p5, p5) != 0 || foo (m5, m5) != 0)
+    __builtin_abort ();
+  if (foo (m5, p5) != -1 || foo (p5, m5) != 1)
+    __builtin_abort ();
+  if (foo (m5, nan) != 1 || foo (nan, p5) != 1)
+    __builtin_abort ();
+  if (foo (nan, nan) != 1)
+    __builtin_abort ();
+  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
+    __builtin_abort ();
+  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
+    __builtin_abort ();
+  if (bar (m5, nan) != 1 || bar (nan, p5) != 1)
+    __builtin_abort ();
+  if (bar (nan, nan) != 1)
+    __builtin_abort ();
+  if (baz (p5, p5) != 0 || baz (m5, m5) != 0)
+    __builtin_abort ();
+  if (baz (m5, p5) != -1 || baz (p5, m5) != 1)
+    __builtin_abort ();
+  if (baz (m5, nan) != -1 || baz (nan, p5) != -1)
+    __builtin_abort ();
+  if (baz (nan, nan) != -1)
+    __builtin_abort ();
+  if (qux (p0) != 0 || qux (nan) != 1)
+    __builtin_abort ();
+  if (qux (m5) != -1 || qux (p5) != 1)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.target/i386/pr103973-10.c.jj	2022-01-15 11:44:56.503459584 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-10.c	2022-01-15 12:00:15.865355956 +0100
@@ -0,0 +1,7 @@
+/* PR target/103973 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -march=i686 -mfpmath=387" } */
+/* { dg-final { scan-assembler-not "\tfucom" } } */
+/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
+
+#include "pr103973-9.c"
--- gcc/testsuite/gcc.target/i386/pr103973-11.c.jj	2022-01-15 11:44:56.504459570 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-11.c	2022-01-15 11:45:08.783284502 +0100
@@ -0,0 +1,8 @@
+/* PR target/103973 */
+/* { dg-do run } */
+/* { dg-options "-O2 -save-temps" } */
+/* { dg-final { scan-assembler-not "\tucomiss" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "\tcomiss" 4 { target { ! ia32 } } } } */
+
+#define double float
+#include "pr103973-9.c"
--- gcc/testsuite/gcc.target/i386/pr103973-12.c.jj	2022-01-15 11:44:56.506459542 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-12.c	2022-01-15 12:00:15.865355956 +0100
@@ -0,0 +1,8 @@
+/* PR target/103973 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -march=i686 -mfpmath=387" } */
+/* { dg-final { scan-assembler-not "\tfucom" } } */
+/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
+
+#define double float
+#include "pr103973-9.c"
--- gcc/testsuite/gcc.target/i386/pr103973-13.c.jj	2022-01-15 11:44:56.507459527 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-13.c	2022-01-15 11:44:19.254990661 +0100
@@ -0,0 +1,76 @@
+/* PR target/103973 */
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math -save-temps" } */
+/* { dg-final { scan-assembler-not "\tucomisd" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "\tcomisd" 4 { target { ! ia32 } } } } */
+
+__attribute__((noipa)) int m1 (void) { return -1; }
+__attribute__((noipa)) int p0 (void) { return 0; }
+__attribute__((noipa)) int p1 (void) { return 1; }
+
+__attribute__((noipa)) int
+foo (double a, double b)
+{
+  if (a == b)
+    return 0;
+  if (a < b)
+    return -1;
+  return 1;
+}
+
+__attribute__((noipa)) int
+bar (double a, double b)
+{
+  if (a == b)
+    return p0 ();
+  if (a < b)
+    return m1 ();
+  return p1 ();
+}
+
+__attribute__((noipa)) int
+baz (double a, double b)
+{
+  if (a == b)
+    return p0 ();
+  if (b < a)
+    return p1 ();
+  return m1 ();
+}
+
+__attribute__((noipa)) int
+qux (double a)
+{
+  if (a != 0.0f)
+    {
+      if (a <= 0.0f)
+	return -1;
+      return 1;
+    }
+  return 0;
+}
+
+int
+main ()
+{
+  double m5 = -5.0f;
+  double p5 = 5.0f;
+  double p0 = 0.0f;
+  if (foo (p5, p5) != 0 || foo (m5, m5) != 0)
+    __builtin_abort ();
+  if (foo (m5, p5) != -1 || foo (p5, m5) != 1)
+    __builtin_abort ();
+  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
+    __builtin_abort ();
+  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
+    __builtin_abort ();
+  if (baz (p5, p5) != 0 || baz (m5, m5) != 0)
+    __builtin_abort ();
+  if (baz (m5, p5) != -1 || baz (p5, m5) != 1)
+    __builtin_abort ();
+  if (qux (p0) != 0)
+    __builtin_abort ();
+  if (qux (m5) != -1 || qux (p5) != 1)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.target/i386/pr103973-14.c.jj	2022-01-15 11:44:56.508459513 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-14.c	2022-01-15 12:00:15.865355956 +0100
@@ -0,0 +1,7 @@
+/* PR target/103973 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387" } */
+/* { dg-final { scan-assembler-not "\tfucom" } } */
+/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
+
+#include "pr103973-13.c"
--- gcc/testsuite/gcc.target/i386/pr103973-15.c.jj	2022-01-15 11:44:56.509459499 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-15.c	2022-01-15 11:45:27.532017186 +0100
@@ -0,0 +1,8 @@
+/* PR target/103973 */
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math -save-temps" } */
+/* { dg-final { scan-assembler-not "\tucomiss" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "\tcomiss" 4 { target { ! ia32 } } } } */
+
+#define double float
+#include "pr103973-13.c"
--- gcc/testsuite/gcc.target/i386/pr103973-16.c.jj	2022-01-15 11:44:56.510459485 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-16.c	2022-01-15 12:00:15.865355956 +0100
@@ -0,0 +1,8 @@
+/* PR target/103973 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387" } */
+/* { dg-final { scan-assembler-not "\tfucom" } } */
+/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
+
+#define double float
+#include "pr103973-13.c"
--- gcc/testsuite/gcc.target/i386/pr103973-17.c.jj	2022-01-15 12:01:30.713290043 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-17.c	2022-01-15 12:08:07.244642996 +0100
@@ -0,0 +1,8 @@
+/* PR target/103973 */
+/* { dg-do run { target large_long_double } } */
+/* { dg-options "-O2 -save-temps" } */
+/* { dg-final { scan-assembler-not "\tfucom" } } */
+/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
+
+#define double long double
+#include "pr103973-1.c"
--- gcc/testsuite/gcc.target/i386/pr103973-18.c.jj	2022-01-15 12:04:28.332760546 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-18.c	2022-01-15 12:08:13.633552013 +0100
@@ -0,0 +1,8 @@
+/* PR target/103973 */
+/* { dg-do run { target large_long_double } } */
+/* { dg-options "-O2 -ffast-math -save-temps" } */
+/* { dg-final { scan-assembler-not "\tfucom" } } */
+/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
+
+#define double long double
+#include "pr103973-5.c"
--- gcc/testsuite/gcc.target/i386/pr103973-19.c.jj	2022-01-15 12:04:31.235719206 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-19.c	2022-01-15 12:08:18.792478544 +0100
@@ -0,0 +1,8 @@
+/* PR target/103973 */
+/* { dg-do run { target large_long_double } } */
+/* { dg-options "-O2 -save-temps" } */
+/* { dg-final { scan-assembler-not "\tfucom" } } */
+/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
+
+#define double long double
+#include "pr103973-9.c"
--- gcc/testsuite/gcc.target/i386/pr103973-20.c.jj	2022-01-15 12:04:34.648670603 +0100
+++ gcc/testsuite/gcc.target/i386/pr103973-20.c	2022-01-15 12:08:26.220372764 +0100
@@ -0,0 +1,8 @@
+/* PR target/103973 */
+/* { dg-do run { target large_long_double } } */
+/* { dg-options "-O2 -ffast-math -save-temps" } */
+/* { dg-final { scan-assembler-not "\tfucom" } } */
+/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
+
+#define double long double
+#include "pr103973-13.c"
--- gcc/testsuite/g++.target/i386/pr103973-1.C.jj	2022-01-15 09:51:25.443467786 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-1.C	2022-01-15 09:51:25.443467786 +0100
@@ -0,0 +1,71 @@
+// PR target/103973
+// { dg-do run }
+// { dg-options "-O2 -std=c++20 -save-temps" }
+// { dg-final { scan-assembler-not "\tucomisd" { target { ! ia32 } } } }
+// { dg-final { scan-assembler-times "\tcomisd" 2 { target { ! ia32 } } } }
+
+#include <compare>
+
+#ifndef double_type
+#define double_type double
+#endif
+
+__attribute__((noipa)) auto
+foo (double_type a, double_type b)
+{
+  return a <=> b;
+}
+
+__attribute__((noipa)) int
+bar (double_type a, double_type b)
+{
+  auto c = foo (a, b);
+  if (c == std::partial_ordering::less)
+    return -1;
+  if (c == std::partial_ordering::equivalent)
+    return 0;
+  if (c == std::partial_ordering::greater)
+    return 1;
+  return 2;
+}
+
+__attribute__((noipa)) auto
+baz (double_type a)
+{
+  return a <=> 0.0f;
+}
+
+__attribute__((noipa)) int
+qux (double_type a)
+{
+  auto c = baz (a);
+  if (c == std::partial_ordering::less)
+    return -1;
+  if (c == std::partial_ordering::equivalent)
+    return 0;
+  if (c == std::partial_ordering::greater)
+    return 1;
+  return 2;
+}
+
+int
+main ()
+{
+  double_type m5 = -5.0;
+  double_type p5 = 5.0;
+  volatile double_type p0 = 0.0;
+  double_type nan = p0 / p0;
+  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
+    __builtin_abort ();
+  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
+    __builtin_abort ();
+  if (bar (m5, nan) != 2 || bar (nan, p5) != 2)
+    __builtin_abort ();
+  if (bar (nan, nan) != 2)
+    __builtin_abort ();
+  if (qux (p0) != 0 || qux (nan) != 2)
+    __builtin_abort ();
+  if (qux (m5) != -1 || qux (p5) != 1)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/g++.target/i386/pr103973-2.C.jj	2022-01-15 09:51:25.443467786 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-2.C	2022-01-15 12:00:42.392978175 +0100
@@ -0,0 +1,7 @@
+// PR target/103973
+// { dg-do compile { target ia32 } }
+// { dg-options "-O2 -march=i686 -mfpmath=387 -std=c++20" }
+// { dg-final { scan-assembler-not "'\tfucom" } }
+// { dg-final { scan-assembler-times "\tfcom" 2 } }
+
+#include "pr103973-1.C"
--- gcc/testsuite/g++.target/i386/pr103973-3.C.jj	2022-01-15 09:51:25.443467786 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-3.C	2022-01-15 09:51:25.443467786 +0100
@@ -0,0 +1,8 @@
+// PR target/103973
+// { dg-do run }
+// { dg-options "-O2 -save-temps -std=c++20" }
+// { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } }
+// { dg-final { scan-assembler-times "\tcomiss" 2 { target { ! ia32 } } } }
+
+#define double_type float
+#include "pr103973-1.C"
--- gcc/testsuite/g++.target/i386/pr103973-4.C.jj	2022-01-15 09:51:25.443467786 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-4.C	2022-01-15 12:00:42.392978175 +0100
@@ -0,0 +1,8 @@
+// PR target/103973
+// { dg-do compile { target ia32 } }
+// { dg-options "-O2 -march=i686 -mfpmath=387 -std=c++20" }
+// { dg-final { scan-assembler-not "'\tfucom" } }
+// { dg-final { scan-assembler-times "\tfcom" 2 } }
+
+#define double_type float
+#include "pr103973-1.C"
--- gcc/testsuite/g++.target/i386/pr103973-5.C.jj	2022-01-15 11:07:17.398678932 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-5.C	2022-01-15 11:07:48.314239313 +0100
@@ -0,0 +1,66 @@
+// PR target/103973
+// { dg-do run }
+// { dg-options "-O2 -ffast-math -std=c++20 -save-temps" }
+// { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } }
+// { dg-final { scan-assembler-times "\tcomisd" 2 { target { ! ia32 } } } }
+
+#include <compare>
+
+#ifndef double_type
+#define double_type double
+#endif
+
+__attribute__((noipa)) auto
+foo (double_type a, double_type b)
+{
+  return a <=> b;
+}
+
+__attribute__((noipa)) int
+bar (double_type a, double_type b)
+{
+  auto c = foo (a, b);
+  if (c == std::partial_ordering::less)
+    return -1;
+  if (c == std::partial_ordering::equivalent)
+    return 0;
+  if (c == std::partial_ordering::greater)
+    return 1;
+  return 2;
+}
+
+__attribute__((noipa)) auto
+baz (double_type a)
+{
+  return a <=> 0.0f;
+}
+
+__attribute__((noipa)) int
+qux (double_type a)
+{
+  auto c = baz (a);
+  if (c == std::partial_ordering::less)
+    return -1;
+  if (c == std::partial_ordering::equivalent)
+    return 0;
+  if (c == std::partial_ordering::greater)
+    return 1;
+  return 2;
+}
+
+int
+main ()
+{
+  double_type m5 = -5.0;
+  double_type p5 = 5.0;
+  double_type p0 = 0.0;
+  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
+    __builtin_abort ();
+  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
+    __builtin_abort ();
+  if (qux (p0) != 0)
+    __builtin_abort ();
+  if (qux (m5) != -1 || qux (p5) != 1)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/g++.target/i386/pr103973-6.C.jj	2022-01-15 11:08:07.181971016 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-6.C	2022-01-15 12:00:42.392978175 +0100
@@ -0,0 +1,7 @@
+// PR target/103973
+// { dg-do compile { target ia32 } }
+// { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387 -std=c++20" }
+// { dg-final { scan-assembler-not "'\tfucom" } }
+// { dg-final { scan-assembler-times "\tfcom" 2 } }
+
+#include "pr103973-5.C"
--- gcc/testsuite/g++.target/i386/pr103973-7.C.jj	2022-01-15 11:08:10.054930163 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-7.C	2022-01-15 11:08:39.354513526 +0100
@@ -0,0 +1,8 @@
+// PR target/103973
+// { dg-do run }
+// { dg-options "-O2 -ffast-math -save-temps -std=c++20" }
+// { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } }
+// { dg-final { scan-assembler-times "\tcomiss" 2 { target { ! ia32 } } } }
+
+#define double_type float
+#include "pr103973-5.C"
--- gcc/testsuite/g++.target/i386/pr103973-8.C.jj	2022-01-15 11:08:13.064887361 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-8.C	2022-01-15 12:00:42.392978175 +0100
@@ -0,0 +1,8 @@
+// PR target/103973
+// { dg-do compile { target ia32 } }
+// { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387 -std=c++20" }
+// { dg-final { scan-assembler-not "'\tfucom" } }
+// { dg-final { scan-assembler-times "\tfcom" 2 } }
+
+#define double_type float
+#include "pr103973-5.C"
--- gcc/testsuite/g++.target/i386/pr103973-9.C.jj	2022-01-15 11:46:15.455333909 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-9.C	2022-01-15 11:47:00.152696626 +0100
@@ -0,0 +1,67 @@
+// PR target/103973
+// { dg-do run }
+// { dg-options "-O2 -std=c++20 -save-temps" }
+// { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } }
+// { dg-final { scan-assembler-times "\tcomisd" 2 { target { ! ia32 } } } }
+
+#include <compare>
+
+#ifndef double_type
+#define double_type double
+#endif
+
+__attribute__((noipa)) auto
+foo (double_type a, double_type b)
+{
+  return a <=> b;
+}
+
+__attribute__((noipa)) int
+bar (double_type a, double_type b)
+{
+  auto c = foo (a, b);
+  if (c == std::partial_ordering::less)
+    return -1;
+  if (c == std::partial_ordering::equivalent)
+    return 0;
+  return 1;
+}
+
+__attribute__((noipa)) auto
+baz (double_type a)
+{
+  return a <=> 0.0f;
+}
+
+__attribute__((noipa)) int
+qux (double_type a)
+{
+  auto c = baz (a);
+  if (c == std::partial_ordering::less)
+    return -1;
+  if (c == std::partial_ordering::equivalent)
+    return 0;
+  return 1;
+}
+
+int
+main ()
+{
+  double_type m5 = -5.0;
+  double_type p5 = 5.0;
+  volatile double_type p0 = 0.0;
+  double_type nan = p0 / p0;
+  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
+    __builtin_abort ();
+  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
+    __builtin_abort ();
+  if (bar (m5, nan) != 1 || bar (nan, p5) != 1)
+    __builtin_abort ();
+  if (bar (nan, nan) != 1)
+    __builtin_abort ();
+  if (qux (p0) != 0 || qux (nan) != 1)
+    __builtin_abort ();
+  if (qux (m5) != -1 || qux (p5) != 1)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/g++.target/i386/pr103973-10.C.jj	2022-01-15 11:48:31.928388111 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-10.C	2022-01-15 12:00:42.393978161 +0100
@@ -0,0 +1,7 @@
+// PR target/103973
+// { dg-do compile { target ia32 } }
+// { dg-options "-O2 -march=i686 -mfpmath=387 -std=c++20" }
+// { dg-final { scan-assembler-not "'\tfucom" } }
+// { dg-final { scan-assembler-times "\tfcom" 2 } }
+
+#include "pr103973-9.C"
--- gcc/testsuite/g++.target/i386/pr103973-11.C.jj	2022-01-15 11:48:31.929388096 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-11.C	2022-01-15 11:48:46.756176703 +0100
@@ -0,0 +1,8 @@
+// PR target/103973
+// { dg-do run }
+// { dg-options "-O2 -save-temps -std=c++20" }
+// { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } }
+// { dg-final { scan-assembler-times "\tcomiss" 2 { target { ! ia32 } } } }
+
+#define double_type float
+#include "pr103973-9.C"
--- gcc/testsuite/g++.target/i386/pr103973-12.C.jj	2022-01-15 11:48:31.931388068 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-12.C	2022-01-15 12:00:42.393978161 +0100
@@ -0,0 +1,8 @@
+// PR target/103973
+// { dg-do compile { target ia32 } }
+// { dg-options "-O2 -march=i686 -mfpmath=387 -std=c++20" }
+// { dg-final { scan-assembler-not "'\tfucom" } }
+// { dg-final { scan-assembler-times "\tfcom" 2 } }
+
+#define double_type float
+#include "pr103973-9.C"
--- gcc/testsuite/g++.target/i386/pr103973-13.C.jj	2022-01-15 11:48:31.932388054 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-13.C	2022-01-15 11:48:13.484651079 +0100
@@ -0,0 +1,62 @@
+// PR target/103973
+// { dg-do run }
+// { dg-options "-O2 -ffast-math -std=c++20 -save-temps" }
+// { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } }
+// { dg-final { scan-assembler-times "\tcomisd" 2 { target { ! ia32 } } } }
+
+#include <compare>
+
+#ifndef double_type
+#define double_type double
+#endif
+
+__attribute__((noipa)) auto
+foo (double_type a, double_type b)
+{
+  return a <=> b;
+}
+
+__attribute__((noipa)) int
+bar (double_type a, double_type b)
+{
+  auto c = foo (a, b);
+  if (c == std::partial_ordering::less)
+    return -1;
+  if (c == std::partial_ordering::equivalent)
+    return 0;
+  return 1;
+}
+
+__attribute__((noipa)) auto
+baz (double_type a)
+{
+  return a <=> 0.0f;
+}
+
+__attribute__((noipa)) int
+qux (double_type a)
+{
+  auto c = baz (a);
+  if (c == std::partial_ordering::greater)
+    return 1;
+  if (c == std::partial_ordering::equivalent)
+    return 0;
+  return -1;
+}
+
+int
+main ()
+{
+  double_type m5 = -5.0;
+  double_type p5 = 5.0;
+  double_type p0 = 0.0;
+  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
+    __builtin_abort ();
+  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
+    __builtin_abort ();
+  if (qux (p0) != 0)
+    __builtin_abort ();
+  if (qux (m5) != -1 || qux (p5) != 1)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/g++.target/i386/pr103973-14.C.jj	2022-01-15 11:48:31.933388039 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-14.C	2022-01-15 12:00:42.393978161 +0100
@@ -0,0 +1,7 @@
+// PR target/103973
+// { dg-do compile { target ia32 } }
+// { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387 -std=c++20" }
+// { dg-final { scan-assembler-not "'\tfucom" } }
+// { dg-final { scan-assembler-times "\tfcom" 2 } }
+
+#include "pr103973-13.C"
--- gcc/testsuite/g++.target/i386/pr103973-15.C.jj	2022-01-15 11:48:31.934388025 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-15.C	2022-01-15 11:49:07.262884325 +0100
@@ -0,0 +1,8 @@
+// PR target/103973
+// { dg-do run }
+// { dg-options "-O2 -ffast-math -save-temps -std=c++20" }
+// { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } }
+// { dg-final { scan-assembler-times "\tcomiss" 2 { target { ! ia32 } } } }
+
+#define double_type float
+#include "pr103973-13.C"
--- gcc/testsuite/g++.target/i386/pr103973-16.C.jj	2022-01-15 11:48:31.935388011 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-16.C	2022-01-15 12:00:42.393978161 +0100
@@ -0,0 +1,8 @@
+// PR target/103973
+// { dg-do compile { target ia32 } }
+// { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387 -std=c++20" }
+// { dg-final { scan-assembler-not "'\tfucom" } }
+// { dg-final { scan-assembler-times "\tfcom" 2 } }
+
+#define double_type float
+#include "pr103973-13.C"
--- gcc/testsuite/g++.target/i386/pr103973-17.C.jj	2022-01-15 12:09:38.499343432 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-17.C	2022-01-15 12:08:54.276973207 +0100
@@ -0,0 +1,8 @@
+// PR target/103973
+// { dg-do run { target large_long_double } }
+// { dg-options "-O2 -std=c++20 -save-temps" }
+// { dg-final { scan-assembler-not "'\tfucom" } }
+// { dg-final { scan-assembler-times "\tfcom" 2 } }
+
+#define double_type long double
+#include "pr103973-1.C"
--- gcc/testsuite/g++.target/i386/pr103973-18.C.jj	2022-01-15 12:09:41.472301093 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-18.C	2022-01-15 12:09:15.681668382 +0100
@@ -0,0 +1,8 @@
+// PR target/103973
+// { dg-do run { target large_long_double } }
+// { dg-options "-O2 -ffast-math -std=c++20 -save-temps" }
+// { dg-final { scan-assembler-not "'\tfucom" } }
+// { dg-final { scan-assembler-times "\tfcom" 2 } }
+
+#define double_type long double
+#include "pr103973-5.C"
--- gcc/testsuite/g++.target/i386/pr103973-19.C.jj	2022-01-15 12:09:43.544271589 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-19.C	2022-01-15 12:09:22.726568054 +0100
@@ -0,0 +1,8 @@
+// PR target/103973
+// { dg-do run { target large_long_double } }
+// { dg-options "-O2 -std=c++20 -save-temps" }
+// { dg-final { scan-assembler-not "'\tfucom" } }
+// { dg-final { scan-assembler-times "\tfcom" 2 } }
+
+#define double_type long double
+#include "pr103973-9.C"
--- gcc/testsuite/g++.target/i386/pr103973-20.C.jj	2022-01-15 12:09:46.301232323 +0100
+++ gcc/testsuite/g++.target/i386/pr103973-20.C	2022-01-15 12:09:33.491414751 +0100
@@ -0,0 +1,8 @@
+// PR target/103973
+// { dg-do run { target large_long_double } }
+// { dg-options "-O2 -ffast-math -std=c++20 -save-temps" }
+// { dg-final { scan-assembler-not "'\tfucom" } }
+// { dg-final { scan-assembler-times "\tfcom" 2 } }
+
+#define double_type long double
+#include "pr103973-13.C"


	Jakub



* Re: [PATCH] widening_mul, i386, v2: Improve spaceship expansion on x86 [PR103973]
  2022-01-15 11:22       ` [PATCH] widening_mul, i386, v2: " Jakub Jelinek
@ 2022-01-15 16:40         ` Uros Bizjak
  2022-01-17 12:04         ` Richard Biener
  1 sibling, 0 replies; 8+ messages in thread
From: Uros Bizjak @ 2022-01-15 16:40 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches

On Sat, Jan 15, 2022 at 12:23 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Sat, Jan 15, 2022 at 11:42:55AM +0100, Uros Bizjak wrote:
> > Yes, that would be nice. XFmode is used for long double, and not obsolete.
>
> Ok, that seems to work.  Compared to the incremental patch I've posted, I
> also had to add handling of the case where we have just
> x == y ? 0 : x < y ? -1 : 1 (both for -ffast-math and non-ffast-math).
> Apparently even that is worth optimizing.
> Tested so far on the new testcases, will run full bootstrap/regtest tonight.
>
> > > Why?  That seems to be a waste of time to me, unless something uses them
> > > already during expansion.  Because pass_expand::execute
> > > runs:
> > >   /* We need JUMP_LABEL be set in order to redirect jumps, and hence
> > >      split edges which edge insertions might do.  */
> > >   rebuild_jump_labels (get_insns ());
> > > which resets all LABEL_NUSES to 0 (well, to:
> > >       if (LABEL_P (insn))
> > >         LABEL_NUSES (insn) = (LABEL_PRESERVE_P (insn) != 0);
> > > and then recomputes them and adds JUMP_LABEL if needed:
> > >               JUMP_LABEL (insn) = label;
> >
> > I was not aware of that detail. Thanks for sharing (and I wonder if
> > all other cases should be removed from the source).
>
> I guess it depends: for code that can only be called during the expand pass,
> dropping it should be just fine; for code that can also (or only) be called
> later, I think adding JUMP_LABEL and a correct LABEL_NUSES is needed, because
> nothing will fix them up afterwards.
>
> 2022-01-15  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/103973
>         * tree-cfg.h (cond_only_block_p): Declare.
>         * tree-ssa-phiopt.c (cond_only_block_p): Move function to ...
>         * tree-cfg.c (cond_only_block_p): ... here.  No longer static.
>         * optabs.def (spaceship_optab): New optab.
>         * internal-fn.def (SPACESHIP): New internal function.
>         * internal-fn.h (expand_SPACESHIP): Declare.
>         * internal-fn.c (expand_PHI): Formatting fix.
>         (expand_SPACESHIP): New function.
>         * tree-ssa-math-opts.c (optimize_spaceship): New function.
>         (math_opts_dom_walker::after_dom_children): Use it.
>         * config/i386/i386.md (spaceship<mode>3): New define_expand.
>         * config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare.
>         * config/i386/i386-expand.c (ix86_expand_fp_spaceship): New function.
>         * doc/md.texi (spaceship@var{m}3): Document.
>
>         * gcc.target/i386/pr103973-1.c: New test.
>         * gcc.target/i386/pr103973-2.c: New test.
>         * gcc.target/i386/pr103973-3.c: New test.
>         * gcc.target/i386/pr103973-4.c: New test.
>         * gcc.target/i386/pr103973-5.c: New test.
>         * gcc.target/i386/pr103973-6.c: New test.
>         * gcc.target/i386/pr103973-7.c: New test.
>         * gcc.target/i386/pr103973-8.c: New test.
>         * gcc.target/i386/pr103973-9.c: New test.
>         * gcc.target/i386/pr103973-10.c: New test.
>         * gcc.target/i386/pr103973-11.c: New test.
>         * gcc.target/i386/pr103973-12.c: New test.
>         * gcc.target/i386/pr103973-13.c: New test.
>         * gcc.target/i386/pr103973-14.c: New test.
>         * gcc.target/i386/pr103973-15.c: New test.
>         * gcc.target/i386/pr103973-16.c: New test.
>         * gcc.target/i386/pr103973-17.c: New test.
>         * gcc.target/i386/pr103973-18.c: New test.
>         * gcc.target/i386/pr103973-19.c: New test.
>         * gcc.target/i386/pr103973-20.c: New test.
>         * g++.target/i386/pr103973-1.C: New test.
>         * g++.target/i386/pr103973-2.C: New test.
>         * g++.target/i386/pr103973-3.C: New test.
>         * g++.target/i386/pr103973-4.C: New test.
>         * g++.target/i386/pr103973-5.C: New test.
>         * g++.target/i386/pr103973-6.C: New test.
>         * g++.target/i386/pr103973-7.C: New test.
>         * g++.target/i386/pr103973-8.C: New test.
>         * g++.target/i386/pr103973-9.C: New test.
>         * g++.target/i386/pr103973-10.C: New test.
>         * g++.target/i386/pr103973-11.C: New test.
>         * g++.target/i386/pr103973-12.C: New test.
>         * g++.target/i386/pr103973-13.C: New test.
>         * g++.target/i386/pr103973-14.C: New test.
>         * g++.target/i386/pr103973-15.C: New test.
>         * g++.target/i386/pr103973-16.C: New test.
>         * g++.target/i386/pr103973-17.C: New test.
>         * g++.target/i386/pr103973-18.C: New test.
>         * g++.target/i386/pr103973-19.C: New test.
>         * g++.target/i386/pr103973-20.C: New test.

OK (with a comment fix below) for the x86 part.

Thanks,
Uros.

> --- gcc/tree-cfg.h.jj   2022-01-14 23:57:44.491718086 +0100
> +++ gcc/tree-cfg.h      2022-01-15 09:51:25.359468982 +0100
> @@ -111,6 +111,7 @@ extern basic_block gimple_switch_label_b
>  extern basic_block gimple_switch_default_bb (function *, gswitch *);
>  extern edge gimple_switch_edge (function *, gswitch *, unsigned);
>  extern edge gimple_switch_default_edge (function *, gswitch *);
> +extern bool cond_only_block_p (basic_block);
>
>  /* Return true if the LHS of a call should be removed.  */
>
> --- gcc/tree-ssa-phiopt.c.jj    2022-01-14 23:57:44.536717549 +0100
> +++ gcc/tree-ssa-phiopt.c       2022-01-15 09:51:25.361468954 +0100
> @@ -1958,31 +1958,6 @@ minmax_replacement (basic_block cond_bb,
>    return true;
>  }
>
> -/* Return true if the only executable statement in BB is a GIMPLE_COND.  */
> -
> -static bool
> -cond_only_block_p (basic_block bb)
> -{
> -  /* BB must have no executable statements.  */
> -  gimple_stmt_iterator gsi = gsi_after_labels (bb);
> -  if (phi_nodes (bb))
> -    return false;
> -  while (!gsi_end_p (gsi))
> -    {
> -      gimple *stmt = gsi_stmt (gsi);
> -      if (is_gimple_debug (stmt))
> -       ;
> -      else if (gimple_code (stmt) == GIMPLE_NOP
> -              || gimple_code (stmt) == GIMPLE_PREDICT
> -              || gimple_code (stmt) == GIMPLE_COND)
> -       ;
> -      else
> -       return false;
> -      gsi_next (&gsi);
> -    }
> -  return true;
> -}
> -
>  /* Attempt to optimize (x <=> y) cmp 0 and similar comparisons.
>     For strong ordering <=> try to match something like:
>      <bb 2> :  // cond3_bb (== cond2_bb)
> --- gcc/tree-cfg.c.jj   2022-01-14 23:57:44.477718253 +0100
> +++ gcc/tree-cfg.c      2022-01-15 09:51:25.363468925 +0100
> @@ -9410,6 +9410,31 @@ gimple_switch_default_edge (function *if
>    return gimple_switch_edge (ifun, gs, 0);
>  }
>
> +/* Return true if the only executable statement in BB is a GIMPLE_COND.  */
> +
> +bool
> +cond_only_block_p (basic_block bb)
> +{
> +  /* BB must have no executable statements.  */
> +  gimple_stmt_iterator gsi = gsi_after_labels (bb);
> +  if (phi_nodes (bb))
> +    return false;
> +  while (!gsi_end_p (gsi))
> +    {
> +      gimple *stmt = gsi_stmt (gsi);
> +      if (is_gimple_debug (stmt))
> +       ;
> +      else if (gimple_code (stmt) == GIMPLE_NOP
> +              || gimple_code (stmt) == GIMPLE_PREDICT
> +              || gimple_code (stmt) == GIMPLE_COND)
> +       ;
> +      else
> +       return false;
> +      gsi_next (&gsi);
> +    }
> +  return true;
> +}
> +
>
>  /* Emit return warnings.  */
>
> --- gcc/optabs.def.jj   2022-01-14 23:57:44.445718634 +0100
> +++ gcc/optabs.def      2022-01-15 09:51:25.383468640 +0100
> @@ -259,6 +259,7 @@ OPTAB_D (usubv4_optab, "usubv$I$a4")
>  OPTAB_D (umulv4_optab, "umulv$I$a4")
>  OPTAB_D (negv3_optab, "negv$I$a3")
>  OPTAB_D (addptr3_optab, "addptr$a3")
> +OPTAB_D (spaceship_optab, "spaceship$a3")
>
>  OPTAB_D (smul_highpart_optab, "smul$a3_highpart")
>  OPTAB_D (umul_highpart_optab, "umul$a3_highpart")
> --- gcc/internal-fn.def.jj      2022-01-14 23:57:44.433718778 +0100
> +++ gcc/internal-fn.def 2022-01-15 09:51:25.399468413 +0100
> @@ -430,6 +430,9 @@ DEF_INTERNAL_FN (NOP, ECF_CONST | ECF_LE
>  /* Temporary vehicle for __builtin_shufflevector.  */
>  DEF_INTERNAL_FN (SHUFFLEVECTOR, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
>
> +/* <=> optimization.  */
> +DEF_INTERNAL_FN (SPACESHIP, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
> +
>  #undef DEF_INTERNAL_INT_FN
>  #undef DEF_INTERNAL_FLT_FN
>  #undef DEF_INTERNAL_FLT_FLOATN_FN
> --- gcc/internal-fn.h.jj        2022-01-14 23:57:44.445718634 +0100
> +++ gcc/internal-fn.h   2022-01-15 09:51:25.399468413 +0100
> @@ -241,6 +241,7 @@ extern void expand_internal_call (gcall
>  extern void expand_internal_call (internal_fn, gcall *);
>  extern void expand_PHI (internal_fn, gcall *);
>  extern void expand_SHUFFLEVECTOR (internal_fn, gcall *);
> +extern void expand_SPACESHIP (internal_fn, gcall *);
>
>  extern bool vectorized_internal_fn_supported_p (internal_fn, tree);
>
> --- gcc/internal-fn.c.jj        2022-01-14 23:57:44.433718778 +0100
> +++ gcc/internal-fn.c   2022-01-15 09:51:25.400468399 +0100
> @@ -4425,5 +4425,27 @@ expand_SHUFFLEVECTOR (internal_fn, gcall
>  void
>  expand_PHI (internal_fn, gcall *)
>  {
> -    gcc_unreachable ();
> +  gcc_unreachable ();
> +}
> +
> +void
> +expand_SPACESHIP (internal_fn, gcall *stmt)
> +{
> +  tree lhs = gimple_call_lhs (stmt);
> +  tree rhs1 = gimple_call_arg (stmt, 0);
> +  tree rhs2 = gimple_call_arg (stmt, 1);
> +  tree type = TREE_TYPE (rhs1);
> +
> +  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
> +  rtx op1 = expand_normal (rhs1);
> +  rtx op2 = expand_normal (rhs2);
> +
> +  class expand_operand ops[3];
> +  create_output_operand (&ops[0], target, TYPE_MODE (TREE_TYPE (lhs)));
> +  create_input_operand (&ops[1], op1, TYPE_MODE (type));
> +  create_input_operand (&ops[2], op2, TYPE_MODE (type));
> +  insn_code icode = optab_handler (spaceship_optab, TYPE_MODE (type));
> +  expand_insn (icode, 3, ops);
> +  if (!rtx_equal_p (target, ops[0].value))
> +    emit_move_insn (target, ops[0].value);
>  }
> --- gcc/tree-ssa-math-opts.c.jj 2022-01-14 23:57:44.492718074 +0100
> +++ gcc/tree-ssa-math-opts.c    2022-01-15 11:37:13.131069782 +0100
> @@ -4637,6 +4637,227 @@ convert_mult_to_highpart (gassign *stmt,
>    return true;
>  }
>
> +/* If target has spaceship<MODE>3 expander, pattern recognize
> +   <bb 2> [local count: 1073741824]:
> +   if (a_2(D) == b_3(D))
> +     goto <bb 6>; [34.00%]
> +   else
> +     goto <bb 3>; [66.00%]
> +
> +   <bb 3> [local count: 708669601]:
> +   if (a_2(D) < b_3(D))
> +     goto <bb 6>; [1.04%]
> +   else
> +     goto <bb 4>; [98.96%]
> +
> +   <bb 4> [local count: 701299439]:
> +   if (a_2(D) > b_3(D))
> +     goto <bb 5>; [48.89%]
> +   else
> +     goto <bb 6>; [51.11%]
> +
> +   <bb 5> [local count: 342865295]:
> +
> +   <bb 6> [local count: 1073741824]:
> +   and turn it into:
> +   <bb 2> [local count: 1073741824]:
> +   _1 = .SPACESHIP (a_2(D), b_3(D));
> +   if (_1 == 0)
> +     goto <bb 6>; [34.00%]
> +   else
> +     goto <bb 3>; [66.00%]
> +
> +   <bb 3> [local count: 708669601]:
> +   if (_1 == -1)
> +     goto <bb 6>; [1.04%]
> +   else
> +     goto <bb 4>; [98.96%]
> +
> +   <bb 4> [local count: 701299439]:
> +   if (_1 == 1)
> +     goto <bb 5>; [48.89%]
> +   else
> +     goto <bb 6>; [51.11%]
> +
> +   <bb 5> [local count: 342865295]:
> +
> +   <bb 6> [local count: 1073741824]:
> +   so that the backend can emit optimal comparison and
> +   conditional jump sequence.  */
> +
> +static void
> +optimize_spaceship (gimple *stmt)
> +{
> +  enum tree_code code = gimple_cond_code (stmt);
> +  if (code != EQ_EXPR && code != NE_EXPR)
> +    return;
> +  tree arg1 = gimple_cond_lhs (stmt);
> +  tree arg2 = gimple_cond_rhs (stmt);
> +  if (!SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg1))
> +      || optab_handler (spaceship_optab,
> +                       TYPE_MODE (TREE_TYPE (arg1))) == CODE_FOR_nothing
> +      || operand_equal_p (arg1, arg2, 0))
> +    return;
> +
> +  basic_block bb0 = gimple_bb (stmt), bb1, bb2 = NULL;
> +  edge em1 = NULL, e1 = NULL, e2 = NULL;
> +  bb1 = EDGE_SUCC (bb0, 1)->dest;
> +  if (((EDGE_SUCC (bb0, 0)->flags & EDGE_TRUE_VALUE) != 0) ^ (code == EQ_EXPR))
> +    bb1 = EDGE_SUCC (bb0, 0)->dest;
> +
> +  gimple *g = last_stmt (bb1);
> +  if (g == NULL
> +      || gimple_code (g) != GIMPLE_COND
> +      || !single_pred_p (bb1)
> +      || (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
> +         ? !operand_equal_p (gimple_cond_rhs (g), arg2, 0)
> +         : (!operand_equal_p (gimple_cond_lhs (g), arg2, 0)
> +            || !operand_equal_p (gimple_cond_rhs (g), arg1, 0)))
> +      || !cond_only_block_p (bb1))
> +    return;
> +
> +  enum tree_code ccode = (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
> +                         ? LT_EXPR : GT_EXPR);
> +  switch (gimple_cond_code (g))
> +    {
> +    case LT_EXPR:
> +    case LE_EXPR:
> +      break;
> +    case GT_EXPR:
> +    case GE_EXPR:
> +      ccode = ccode == LT_EXPR ? GT_EXPR : LT_EXPR;
> +      break;
> +    default:
> +      return;
> +    }
> +
> +  for (int i = 0; i < 2; ++i)
> +    {
> +      /* With NaNs, </<=/>/>= are false, so we need to look for the
> +        third comparison on the false edge from whatever non-equality
> +        comparison the second comparison is.  */
> +      if (HONOR_NANS (TREE_TYPE (arg1))
> +         && (EDGE_SUCC (bb1, i)->flags & EDGE_TRUE_VALUE) != 0)
> +       continue;
> +
> +      bb2 = EDGE_SUCC (bb1, i)->dest;
> +      g = last_stmt (bb2);
> +      if (g == NULL
> +         || gimple_code (g) != GIMPLE_COND
> +         || !single_pred_p (bb2)
> +         || (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
> +             ? !operand_equal_p (gimple_cond_rhs (g), arg2, 0)
> +             : (!operand_equal_p (gimple_cond_lhs (g), arg2, 0)
> +                || !operand_equal_p (gimple_cond_rhs (g), arg1, 0)))
> +         || !cond_only_block_p (bb2)
> +         || EDGE_SUCC (bb2, 0)->dest == EDGE_SUCC (bb2, 1)->dest)
> +       continue;
> +
> +      enum tree_code ccode2
> +       = (operand_equal_p (gimple_cond_lhs (g), arg1, 0) ? LT_EXPR : GT_EXPR);
> +      switch (gimple_cond_code (g))
> +       {
> +       case LT_EXPR:
> +       case LE_EXPR:
> +         break;
> +       case GT_EXPR:
> +       case GE_EXPR:
> +         ccode2 = ccode2 == LT_EXPR ? GT_EXPR : LT_EXPR;
> +         break;
> +       default:
> +         continue;
> +       }
> +      if (HONOR_NANS (TREE_TYPE (arg1)) && ccode == ccode2)
> +       continue;
> +
> +      if ((ccode == LT_EXPR)
> +         ^ ((EDGE_SUCC (bb1, i)->flags & EDGE_TRUE_VALUE) != 0))
> +       {
> +         em1 = EDGE_SUCC (bb1, 1 - i);
> +         e1 = EDGE_SUCC (bb2, 0);
> +         e2 = EDGE_SUCC (bb2, 1);
> +         if ((ccode2 == LT_EXPR) ^ ((e1->flags & EDGE_TRUE_VALUE) == 0))
> +           std::swap (e1, e2);
> +       }
> +      else
> +       {
> +         e1 = EDGE_SUCC (bb1, 1 - i);
> +         em1 = EDGE_SUCC (bb2, 0);
> +         e2 = EDGE_SUCC (bb2, 1);
> +         if ((ccode2 != LT_EXPR) ^ ((em1->flags & EDGE_TRUE_VALUE) == 0))
> +           std::swap (em1, e2);
> +       }
> +      break;
> +    }
> +
> +  if (em1 == NULL)
> +    {
> +      if ((ccode == LT_EXPR)
> +         ^ ((EDGE_SUCC (bb1, 0)->flags & EDGE_TRUE_VALUE) != 0))
> +       {
> +         em1 = EDGE_SUCC (bb1, 1);
> +         e1 = EDGE_SUCC (bb1, 0);
> +         e2 = (e1->flags & EDGE_TRUE_VALUE) ? em1 : e1;
> +       }
> +      else
> +       {
> +         em1 = EDGE_SUCC (bb1, 0);
> +         e1 = EDGE_SUCC (bb1, 1);
> +         e2 = (e1->flags & EDGE_TRUE_VALUE) ? em1 : e1;
> +       }
> +    }
> +
> +  g = gimple_build_call_internal (IFN_SPACESHIP, 2, arg1, arg2);
> +  tree lhs = make_ssa_name (integer_type_node);
> +  gimple_call_set_lhs (g, lhs);
> +  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
> +  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> +
> +  gcond *cond = as_a <gcond *> (stmt);
> +  gimple_cond_set_lhs (cond, lhs);
> +  gimple_cond_set_rhs (cond, integer_zero_node);
> +  update_stmt (stmt);
> +
> +  g = last_stmt (bb1);
> +  cond = as_a <gcond *> (g);
> +  gimple_cond_set_lhs (cond, lhs);
> +  if (em1->src == bb1 && e2 != em1)
> +    {
> +      gimple_cond_set_rhs (cond, integer_minus_one_node);
> +      gimple_cond_set_code (cond, (em1->flags & EDGE_TRUE_VALUE)
> +                                 ? EQ_EXPR : NE_EXPR);
> +    }
> +  else
> +    {
> +      gcc_assert (e1->src == bb1 && e2 != e1);
> +      gimple_cond_set_rhs (cond, integer_one_node);
> +      gimple_cond_set_code (cond, (e1->flags & EDGE_TRUE_VALUE)
> +                                 ? EQ_EXPR : NE_EXPR);
> +    }
> +  update_stmt (g);
> +
> +  if (e2 != e1 && e2 != em1)
> +    {
> +      g = last_stmt (bb2);
> +      cond = as_a <gcond *> (g);
> +      gimple_cond_set_lhs (cond, lhs);
> +      if (em1->src == bb2)
> +       gimple_cond_set_rhs (cond, integer_minus_one_node);
> +      else
> +       {
> +         gcc_assert (e1->src == bb2);
> +         gimple_cond_set_rhs (cond, integer_one_node);
> +       }
> +      gimple_cond_set_code (cond,
> +                           (e2->flags & EDGE_TRUE_VALUE) ? NE_EXPR : EQ_EXPR);
> +      update_stmt (g);
> +    }
> +
> +  wide_int wm1 = wi::minus_one (TYPE_PRECISION (integer_type_node));
> +  wide_int w2 = wi::two (TYPE_PRECISION (integer_type_node));
> +  set_range_info (lhs, VR_RANGE, wm1, w2);
> +}
> +
>
>  /* Find integer multiplications where the operands are extended from
>     smaller types, and replace the MULT_EXPR with a WIDEN_MULT_EXPR
> @@ -4798,6 +5019,8 @@ math_opts_dom_walker::after_dom_children
>               break;
>             }
>         }
> +      else if (gimple_code (stmt) == GIMPLE_COND)
> +       optimize_spaceship (stmt);
>        gsi_next (&gsi);
>      }
>    if (fma_state.m_deferring_p
> --- gcc/config/i386/i386.md.jj  2022-01-14 23:57:59.047544505 +0100
> +++ gcc/config/i386/i386.md     2022-01-15 12:13:28.116073760 +0100
> @@ -23886,6 +23886,28 @@ (define_insn "hreset"
>    [(set_attr "type" "other")
>     (set_attr "length" "4")])
>
> +;; Spaceship optimization
> +(define_expand "spaceship<mode>3"
> +  [(match_operand:SI 0 "register_operand")
> +   (match_operand:MODEF 1 "cmp_fp_expander_operand")
> +   (match_operand:MODEF 2 "cmp_fp_expander_operand")]
> +  "(TARGET_80387 || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH))
> +   && (TARGET_CMOVE || (TARGET_SAHF && TARGET_USE_SAHF))"
> +{
> +  ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]);
> +  DONE;
> +})
> +
> +(define_expand "spaceshipxf3"
> +  [(match_operand:SI 0 "register_operand")
> +   (match_operand:XF 1 "nonmemory_operand")
> +   (match_operand:XF 2 "nonmemory_operand")]
> +  "TARGET_80387 && (TARGET_CMOVE || (TARGET_SAHF && TARGET_USE_SAHF))"
> +{
> +  ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]);
> +  DONE;
> +})
> +
>  (include "mmx.md")
>  (include "sse.md")
>  (include "sync.md")
> --- gcc/config/i386/i386-protos.h.jj    2022-01-14 23:57:44.398719195 +0100
> +++ gcc/config/i386/i386-protos.h       2022-01-15 09:51:25.410468256 +0100
> @@ -150,6 +150,7 @@ extern bool ix86_expand_int_vec_cmp (rtx
>  extern bool ix86_expand_fp_vec_cmp (rtx[]);
>  extern void ix86_expand_sse_movcc (rtx, rtx, rtx, rtx);
>  extern void ix86_expand_sse_unpack (rtx, rtx, bool, bool);
> +extern void ix86_expand_fp_spaceship (rtx, rtx, rtx);
>  extern bool ix86_expand_int_addcc (rtx[]);
>  extern rtx_insn *ix86_expand_call (rtx, rtx, rtx, rtx, rtx, bool);
>  extern bool ix86_call_use_plt_p (rtx);
> --- gcc/config/i386/i386-expand.c.jj    2022-01-14 23:57:44.379719421 +0100
> +++ gcc/config/i386/i386-expand.c       2022-01-15 10:38:26.924333651 +0100
> @@ -2879,6 +2879,54 @@ ix86_expand_setcc (rtx dest, enum rtx_co
>    emit_insn (gen_rtx_SET (dest, ret));
>  }
>
> +/* Expand floating point op0 <=> op1 if NaNs are honored.  */

The comment is probably too narrow: the expansion does not depend on NaNs
being honored, it also works with -ffast-math.

> +
> +void
> +ix86_expand_fp_spaceship (rtx dest, rtx op0, rtx op1)
> +{
> +  gcc_checking_assert (ix86_fp_comparison_strategy (GT) != IX86_FPCMP_ARITH);
> +  rtx gt = ix86_expand_fp_compare (GT, op0, op1);
> +  rtx l0 = gen_label_rtx ();
> +  rtx l1 = gen_label_rtx ();
> +  rtx l2 = TARGET_IEEE_FP ? gen_label_rtx () : NULL_RTX;
> +  rtx lend = gen_label_rtx ();
> +  rtx tmp;
> +  rtx_insn *jmp;
> +  if (l2)
> +    {
> +      rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode,
> +                              gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> +      tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un,
> +                                 gen_rtx_LABEL_REF (VOIDmode, l2), pc_rtx);
> +      jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> +      add_reg_br_prob_note (jmp, profile_probability::very_unlikely ());
> +    }
> +  rtx eq = gen_rtx_fmt_ee (UNEQ, VOIDmode,
> +                          gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, eq,
> +                             gen_rtx_LABEL_REF (VOIDmode, l0), pc_rtx);
> +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> +  add_reg_br_prob_note (jmp, profile_probability::unlikely ());
> +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, gt,
> +                             gen_rtx_LABEL_REF (VOIDmode, l1), pc_rtx);
> +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> +  add_reg_br_prob_note (jmp, profile_probability::even ());
> +  emit_move_insn (dest, constm1_rtx);
> +  emit_jump (lend);
> +  emit_label (l0);
> +  emit_move_insn (dest, const0_rtx);
> +  emit_jump (lend);
> +  emit_label (l1);
> +  emit_move_insn (dest, const1_rtx);
> +  emit_jump (lend);
> +  if (l2)
> +    {
> +      emit_label (l2);
> +      emit_move_insn (dest, const2_rtx);
> +    }
> +  emit_label (lend);
> +}
> +
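
For anyone following the jump order above, here is a minimal C sketch of what
the emitted sequence computes. It is not part of the patch and not GCC code:
struct fp_flags, model_comi, model_fp_spaceship and model_self_check are
hypothetical names, and the flag values are taken from the ZF/CF/PF table in
the patch description. The ieee_fp parameter stands in for TARGET_IEEE_FP;
when it is false the unordered jump is simply not emitted.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical model of the flags left by comisd/fcomi (not GCC code).  */
struct fp_flags { bool zf, cf, pf; };

struct fp_flags
model_comi (double a, double b)
{
  struct fp_flags f = { false, false, false };  /* a > b: all clear.  */
  if (isnan (a) || isnan (b))
    f.zf = f.cf = f.pf = true;                  /* Unordered.  */
  else if (a == b)
    f.zf = true;
  else if (a < b)
    f.cf = true;
  return f;
}

/* Follow the jump order of the expander: UNORDERED (only when ieee_fp),
   then UNEQ, then GT, falling through to -1.  */
int
model_fp_spaceship (double a, double b, bool ieee_fp)
{
  struct fp_flags f = model_comi (a, b);
  if (ieee_fp && f.pf)
    return 2;           /* jp l2 */
  if (f.zf)
    return 0;           /* je l0 */
  if (!f.cf)
    return 1;           /* ja l1; ZF is already known to be clear.  */
  return -1;            /* Fall through.  */
}

/* Small self-check of the model on ordered and unordered inputs.  */
void
model_self_check (void)
{
  assert (model_fp_spaceship (-5.0, 5.0, true) == -1);
  assert (model_fp_spaceship (5.0, -5.0, true) == 1);
  assert (model_fp_spaceship (5.0, 5.0, true) == 0);
  assert (model_fp_spaceship (NAN, 5.0, true) == 2);
  assert (model_fp_spaceship (-5.0, 5.0, false) == -1);
}
```

With ieee_fp false and ordered inputs the model gives the same -1/0/1 results,
which is the point of the remark on the function comment above.
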
>  /* Expand comparison setting or clearing carry flag.  Return true when
>     successful and set pop for the operation.  */
>  static bool
> --- gcc/doc/md.texi.jj  2022-01-14 23:57:44.419718944 +0100
> +++ gcc/doc/md.texi     2022-01-15 09:51:25.429467985 +0100
> @@ -8055,6 +8055,15 @@ inclusive and operand 1 exclusive.
>  If this pattern is not defined, a call to the library function
>  @code{__clear_cache} is used.
>
> +@cindex @code{spaceship@var{m}3} instruction pattern
> +@item @samp{spaceship@var{m}3}
> +Initialize output operand 0 with mode of integer type to -1, 0, 1 or 2
> +if operand 1 with mode @var{m} compares less than operand 2, equal to
> +operand 2, greater than operand 2 or is unordered with operand 2.
> +@var{m} should be a scalar floating point mode.
> +
> +This pattern is not allowed to @code{FAIL}.
> +
>  @end table
>
>  @end ifset
> --- gcc/testsuite/gcc.target/i386/pr103973-1.c.jj       2022-01-15 09:51:25.430467971 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-1.c  2022-01-15 09:51:25.430467971 +0100
> @@ -0,0 +1,98 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomisd" 4 { target { ! ia32 } } } } */
> +
> +__attribute__((noipa)) int m1 (void) { return -1; }
> +__attribute__((noipa)) int p0 (void) { return 0; }
> +__attribute__((noipa)) int p1 (void) { return 1; }
> +__attribute__((noipa)) int p2 (void) { return 2; }
> +
> +__attribute__((noipa)) int
> +foo (double a, double b)
> +{
> +  if (a == b)
> +    return 0;
> +  if (a < b)
> +    return -1;
> +  if (a > b)
> +    return 1;
> +  return 2;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (a < b)
> +    return m1 ();
> +  if (a > b)
> +    return p1 ();
> +  return p2 ();
> +}
> +
> +__attribute__((noipa)) int
> +baz (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (b < a)
> +    return p1 ();
> +  if (a < b)
> +    return m1 ();
> +  return p2 ();
> +}
> +
> +__attribute__((noipa)) int
> +qux (double a)
> +{
> +  if (a != 0.0f)
> +    {
> +      if (a <= 0.0f)
> +       return -1;
> +      if (a >= 0.0f)
> +       return 1;
> +      return 2;
> +    }
> +  return 0;
> +}
> +
> +int
> +main ()
> +{
> +  double m5 = -5.0f;
> +  double p5 = 5.0f;
> +  volatile double p0 = 0.0f;
> +  double nan = p0 / p0;
> +  if (foo (p5, p5) != 0 || foo (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (foo (m5, p5) != -1 || foo (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (foo (m5, nan) != 2 || foo (nan, p5) != 2)
> +    __builtin_abort ();
> +  if (foo (nan, nan) != 2)
> +    __builtin_abort ();
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (bar (m5, nan) != 2 || bar (nan, p5) != 2)
> +    __builtin_abort ();
> +  if (bar (nan, nan) != 2)
> +    __builtin_abort ();
> +  if (baz (p5, p5) != 0 || baz (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (baz (m5, p5) != -1 || baz (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (baz (m5, nan) != 2 || baz (nan, p5) != 2)
> +    __builtin_abort ();
> +  if (baz (nan, nan) != 2)
> +    __builtin_abort ();
> +  if (qux (p0) != 0 || qux (nan) != 2)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/gcc.target/i386/pr103973-2.c.jj       2022-01-15 09:51:25.430467971 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-2.c  2022-01-15 12:00:15.864355970 +0100
> @@ -0,0 +1,7 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#include "pr103973-1.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-3.c.jj       2022-01-15 09:51:25.430467971 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-3.c  2022-01-15 09:51:25.430467971 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomiss" 4 { target { ! ia32 } } } } */
> +
> +#define double float
> +#include "pr103973-1.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-4.c.jj       2022-01-15 09:51:25.430467971 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-4.c  2022-01-15 12:00:15.864355970 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double float
> +#include "pr103973-1.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-5.c.jj       2022-01-15 11:04:02.427452420 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-5.c  2022-01-15 11:06:39.594216502 +0100
> @@ -0,0 +1,85 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -ffast-math -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomisd" 4 { target { ! ia32 } } } } */
> +
> +__attribute__((noipa)) int m1 (void) { return -1; }
> +__attribute__((noipa)) int p0 (void) { return 0; }
> +__attribute__((noipa)) int p1 (void) { return 1; }
> +__attribute__((noipa)) int p2 (void) { return 2; }
> +
> +__attribute__((noipa)) int
> +foo (double a, double b)
> +{
> +  if (a == b)
> +    return 0;
> +  if (a < b)
> +    return -1;
> +  if (a > b)
> +    return 1;
> +  return 2;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (a < b)
> +    return m1 ();
> +  if (a > b)
> +    return p1 ();
> +  return p2 ();
> +}
> +
> +__attribute__((noipa)) int
> +baz (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (b < a)
> +    return p1 ();
> +  if (a < b)
> +    return m1 ();
> +  return p2 ();
> +}
> +
> +__attribute__((noipa)) int
> +qux (double a)
> +{
> +  if (a != 0.0f)
> +    {
> +      if (a <= 0.0f)
> +       return -1;
> +      if (a >= 0.0f)
> +       return 1;
> +      return 2;
> +    }
> +  return 0;
> +}
> +
> +int
> +main ()
> +{
> +  double m5 = -5.0f;
> +  double p5 = 5.0f;
> +  double p0 = 0.0f;
> +  if (foo (p5, p5) != 0 || foo (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (foo (m5, p5) != -1 || foo (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (baz (p5, p5) != 0 || baz (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (baz (m5, p5) != -1 || baz (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (qux (p0) != 0)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/gcc.target/i386/pr103973-6.c.jj       2022-01-15 11:05:24.377286081 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-6.c  2022-01-15 12:00:15.864355970 +0100
> @@ -0,0 +1,7 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#include "pr103973-5.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-7.c.jj       2022-01-15 11:05:28.620225748 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-7.c  2022-01-15 11:06:03.899724076 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -ffast-math -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomiss" 4 { target { ! ia32 } } } } */
> +
> +#define double float
> +#include "pr103973-5.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-8.c.jj       2022-01-15 11:05:32.273173801 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-8.c  2022-01-15 12:00:15.865355956 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double float
> +#include "pr103973-5.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-9.c.jj       2022-01-15 11:41:11.895661977 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-9.c  2022-01-15 11:43:31.718668421 +0100
> @@ -0,0 +1,89 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomisd" 4 { target { ! ia32 } } } } */
> +
> +__attribute__((noipa)) int m1 (void) { return -1; }
> +__attribute__((noipa)) int p0 (void) { return 0; }
> +__attribute__((noipa)) int p1 (void) { return 1; }
> +
> +__attribute__((noipa)) int
> +foo (double a, double b)
> +{
> +  if (a == b)
> +    return 0;
> +  if (a < b)
> +    return -1;
> +  return 1;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (a < b)
> +    return m1 ();
> +  return p1 ();
> +}
> +
> +__attribute__((noipa)) int
> +baz (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (b < a)
> +    return p1 ();
> +  return m1 ();
> +}
> +
> +__attribute__((noipa)) int
> +qux (double a)
> +{
> +  if (a != 0.0f)
> +    {
> +      if (a <= 0.0f)
> +       return -1;
> +      return 1;
> +    }
> +  return 0;
> +}
> +
> +int
> +main ()
> +{
> +  double m5 = -5.0f;
> +  double p5 = 5.0f;
> +  volatile double p0 = 0.0f;
> +  double nan = p0 / p0;
> +  if (foo (p5, p5) != 0 || foo (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (foo (m5, p5) != -1 || foo (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (foo (m5, nan) != 1 || foo (nan, p5) != 1)
> +    __builtin_abort ();
> +  if (foo (nan, nan) != 1)
> +    __builtin_abort ();
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (bar (m5, nan) != 1 || bar (nan, p5) != 1)
> +    __builtin_abort ();
> +  if (bar (nan, nan) != 1)
> +    __builtin_abort ();
> +  if (baz (p5, p5) != 0 || baz (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (baz (m5, p5) != -1 || baz (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (baz (m5, nan) != -1 || baz (nan, p5) != -1)
> +    __builtin_abort ();
> +  if (baz (nan, nan) != -1)
> +    __builtin_abort ();
> +  if (qux (p0) != 0 || qux (nan) != 1)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
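
A note on the NaN expectations in pr103973-9.c: foo/bar expect 1 and baz
expects -1 because the unordered case falls into whichever arm comes last.
The following C sketch illustrates that, assuming the -1/0/1/2 encoding from
the md.texi text; last_arm_gt, last_arm_lt, four_way and fold_unordered are
hypothetical names used only for illustration, not functions from the patch.

```c
#include <assert.h>
#include <math.h>

/* Same shape as foo in pr103973-9.c: the final arm returns 1.  */
int
last_arm_gt (double a, double b)
{
  if (a == b)
    return 0;
  if (a < b)
    return -1;
  return 1;             /* a > b or unordered.  */
}

/* Same shape as baz in pr103973-9.c: the final arm returns -1.  */
int
last_arm_lt (double a, double b)
{
  if (a == b)
    return 0;
  if (b < a)
    return 1;
  return -1;            /* a < b or unordered.  */
}

/* Four-way result using the -1/0/1/2 encoding.  */
int
four_way (double a, double b)
{
  if (a == b)
    return 0;
  if (a < b)
    return -1;
  if (a > b)
    return 1;
  return 2;             /* Unordered.  */
}

/* Fold the unordered value 2 into the value of the last arm.  */
int
fold_unordered (int r, int last_arm_value)
{
  return r == 2 ? last_arm_value : r;
}

/* Check that both three-way shapes agree with the folded four-way result.  */
void
three_way_self_check (void)
{
  const double v[] = { -5.0, 0.0, 5.0, NAN };
  for (int i = 0; i < 4; i++)
    for (int j = 0; j < 4; j++)
      {
        int r = four_way (v[i], v[j]);
        assert (last_arm_gt (v[i], v[j]) == fold_unordered (r, 1));
        assert (last_arm_lt (v[i], v[j]) == fold_unordered (r, -1));
      }
}
```
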
> --- gcc/testsuite/gcc.target/i386/pr103973-10.c.jj      2022-01-15 11:44:56.503459584 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-10.c 2022-01-15 12:00:15.865355956 +0100
> @@ -0,0 +1,7 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#include "pr103973-9.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-11.c.jj      2022-01-15 11:44:56.504459570 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-11.c 2022-01-15 11:45:08.783284502 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomiss" 4 { target { ! ia32 } } } } */
> +
> +#define double float
> +#include "pr103973-9.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-12.c.jj      2022-01-15 11:44:56.506459542 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-12.c 2022-01-15 12:00:15.865355956 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double float
> +#include "pr103973-9.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-13.c.jj      2022-01-15 11:44:56.507459527 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-13.c 2022-01-15 11:44:19.254990661 +0100
> @@ -0,0 +1,76 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -ffast-math -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomisd" 4 { target { ! ia32 } } } } */
> +
> +__attribute__((noipa)) int m1 (void) { return -1; }
> +__attribute__((noipa)) int p0 (void) { return 0; }
> +__attribute__((noipa)) int p1 (void) { return 1; }
> +
> +__attribute__((noipa)) int
> +foo (double a, double b)
> +{
> +  if (a == b)
> +    return 0;
> +  if (a < b)
> +    return -1;
> +  return 1;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (a < b)
> +    return m1 ();
> +  return p1 ();
> +}
> +
> +__attribute__((noipa)) int
> +baz (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (b < a)
> +    return p1 ();
> +  return m1 ();
> +}
> +
> +__attribute__((noipa)) int
> +qux (double a)
> +{
> +  if (a != 0.0f)
> +    {
> +      if (a <= 0.0f)
> +       return -1;
> +      return 1;
> +    }
> +  return 0;
> +}
> +
> +int
> +main ()
> +{
> +  double m5 = -5.0f;
> +  double p5 = 5.0f;
> +  double p0 = 0.0f;
> +  if (foo (p5, p5) != 0 || foo (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (foo (m5, p5) != -1 || foo (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (baz (p5, p5) != 0 || baz (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (baz (m5, p5) != -1 || baz (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (qux (p0) != 0)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/gcc.target/i386/pr103973-14.c.jj      2022-01-15 11:44:56.508459513 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-14.c 2022-01-15 12:00:15.865355956 +0100
> @@ -0,0 +1,7 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#include "pr103973-13.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-15.c.jj      2022-01-15 11:44:56.509459499 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-15.c 2022-01-15 11:45:27.532017186 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -ffast-math -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomiss" 4 { target { ! ia32 } } } } */
> +
> +#define double float
> +#include "pr103973-13.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-16.c.jj      2022-01-15 11:44:56.510459485 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-16.c 2022-01-15 12:00:15.865355956 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double float
> +#include "pr103973-13.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-17.c.jj      2022-01-15 12:01:30.713290043 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-17.c 2022-01-15 12:08:07.244642996 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run { target large_long_double } } */
> +/* { dg-options "-O2 -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double long double
> +#include "pr103973-1.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-18.c.jj      2022-01-15 12:04:28.332760546 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-18.c 2022-01-15 12:08:13.633552013 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run { target large_long_double } } */
> +/* { dg-options "-O2 -ffast-math -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double long double
> +#include "pr103973-5.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-19.c.jj      2022-01-15 12:04:31.235719206 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-19.c 2022-01-15 12:08:18.792478544 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run { target large_long_double } } */
> +/* { dg-options "-O2 -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double long double
> +#include "pr103973-9.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-20.c.jj      2022-01-15 12:04:34.648670603 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-20.c 2022-01-15 12:08:26.220372764 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run { target large_long_double } } */
> +/* { dg-options "-O2 -ffast-math -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double long double
> +#include "pr103973-13.c"
> --- gcc/testsuite/g++.target/i386/pr103973-1.C.jj       2022-01-15 09:51:25.443467786 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-1.C  2022-01-15 09:51:25.443467786 +0100
> @@ -0,0 +1,71 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomisd" 2 { target { ! ia32 } } } }
> +
> +#include <compare>
> +
> +#ifndef double_type
> +#define double_type double
> +#endif
> +
> +__attribute__((noipa)) auto
> +foo (double_type a, double_type b)
> +{
> +  return a <=> b;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double_type a, double_type b)
> +{
> +  auto c = foo (a, b);
> +  if (c == std::partial_ordering::less)
> +    return -1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  if (c == std::partial_ordering::greater)
> +    return 1;
> +  return 2;
> +}
> +
> +__attribute__((noipa)) auto
> +baz (double_type a)
> +{
> +  return a <=> 0.0f;
> +}
> +
> +__attribute__((noipa)) int
> +qux (double_type a)
> +{
> +  auto c = baz (a);
> +  if (c == std::partial_ordering::less)
> +    return -1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  if (c == std::partial_ordering::greater)
> +    return 1;
> +  return 2;
> +}
> +
> +int
> +main ()
> +{
> +  double_type m5 = -5.0;
> +  double_type p5 = 5.0;
> +  volatile double_type p0 = 0.0;
> +  double_type nan = p0 / p0;
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (bar (m5, nan) != 2 || bar (nan, p5) != 2)
> +    __builtin_abort ();
> +  if (bar (nan, nan) != 2)
> +    __builtin_abort ();
> +  if (qux (p0) != 0 || qux (nan) != 2)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/g++.target/i386/pr103973-2.C.jj       2022-01-15 09:51:25.443467786 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-2.C  2022-01-15 12:00:42.392978175 +0100
> @@ -0,0 +1,7 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#include "pr103973-1.C"
> --- gcc/testsuite/g++.target/i386/pr103973-3.C.jj       2022-01-15 09:51:25.443467786 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-3.C  2022-01-15 09:51:25.443467786 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -save-temps -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomiss" 2 { target { ! ia32 } } } }
> +
> +#define double_type float
> +#include "pr103973-1.C"
> --- gcc/testsuite/g++.target/i386/pr103973-4.C.jj       2022-01-15 09:51:25.443467786 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-4.C  2022-01-15 12:00:42.392978175 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type float
> +#include "pr103973-1.C"
> --- gcc/testsuite/g++.target/i386/pr103973-5.C.jj       2022-01-15 11:07:17.398678932 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-5.C  2022-01-15 11:07:48.314239313 +0100
> @@ -0,0 +1,66 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -ffast-math -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomisd" 2 { target { ! ia32 } } } }
> +
> +#include <compare>
> +
> +#ifndef double_type
> +#define double_type double
> +#endif
> +
> +__attribute__((noipa)) auto
> +foo (double_type a, double_type b)
> +{
> +  return a <=> b;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double_type a, double_type b)
> +{
> +  auto c = foo (a, b);
> +  if (c == std::partial_ordering::less)
> +    return -1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  if (c == std::partial_ordering::greater)
> +    return 1;
> +  return 2;
> +}
> +
> +__attribute__((noipa)) auto
> +baz (double_type a)
> +{
> +  return a <=> 0.0f;
> +}
> +
> +__attribute__((noipa)) int
> +qux (double_type a)
> +{
> +  auto c = baz (a);
> +  if (c == std::partial_ordering::less)
> +    return -1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  if (c == std::partial_ordering::greater)
> +    return 1;
> +  return 2;
> +}
> +
> +int
> +main ()
> +{
> +  double_type m5 = -5.0;
> +  double_type p5 = 5.0;
> +  double_type p0 = 0.0;
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (qux (p0) != 0)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/g++.target/i386/pr103973-6.C.jj       2022-01-15 11:08:07.181971016 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-6.C  2022-01-15 12:00:42.392978175 +0100
> @@ -0,0 +1,7 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#include "pr103973-5.C"
> --- gcc/testsuite/g++.target/i386/pr103973-7.C.jj       2022-01-15 11:08:10.054930163 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-7.C  2022-01-15 11:08:39.354513526 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -ffast-math -save-temps -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomiss" 2 { target { ! ia32 } } } }
> +
> +#define double_type float
> +#include "pr103973-5.C"
> --- gcc/testsuite/g++.target/i386/pr103973-8.C.jj       2022-01-15 11:08:13.064887361 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-8.C  2022-01-15 12:00:42.392978175 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type float
> +#include "pr103973-5.C"
> --- gcc/testsuite/g++.target/i386/pr103973-9.C.jj       2022-01-15 11:46:15.455333909 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-9.C  2022-01-15 11:47:00.152696626 +0100
> @@ -0,0 +1,67 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomisd" 2 { target { ! ia32 } } } }
> +
> +#include <compare>
> +
> +#ifndef double_type
> +#define double_type double
> +#endif
> +
> +__attribute__((noipa)) auto
> +foo (double_type a, double_type b)
> +{
> +  return a <=> b;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double_type a, double_type b)
> +{
> +  auto c = foo (a, b);
> +  if (c == std::partial_ordering::less)
> +    return -1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  return 1;
> +}
> +
> +__attribute__((noipa)) auto
> +baz (double_type a)
> +{
> +  return a <=> 0.0f;
> +}
> +
> +__attribute__((noipa)) int
> +qux (double_type a)
> +{
> +  auto c = baz (a);
> +  if (c == std::partial_ordering::less)
> +    return -1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  return 1;
> +}
> +
> +int
> +main ()
> +{
> +  double_type m5 = -5.0;
> +  double_type p5 = 5.0;
> +  volatile double_type p0 = 0.0;
> +  double_type nan = p0 / p0;
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (bar (m5, nan) != 1 || bar (nan, p5) != 1)
> +    __builtin_abort ();
> +  if (bar (nan, nan) != 1)
> +    __builtin_abort ();
> +  if (qux (p0) != 0 || qux (nan) != 1)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/g++.target/i386/pr103973-10.C.jj      2022-01-15 11:48:31.928388111 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-10.C 2022-01-15 12:00:42.393978161 +0100
> @@ -0,0 +1,7 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#include "pr103973-9.C"
> --- gcc/testsuite/g++.target/i386/pr103973-11.C.jj      2022-01-15 11:48:31.929388096 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-11.C 2022-01-15 11:48:46.756176703 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -save-temps -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomiss" 2 { target { ! ia32 } } } }
> +
> +#define double_type float
> +#include "pr103973-9.C"
> --- gcc/testsuite/g++.target/i386/pr103973-12.C.jj      2022-01-15 11:48:31.931388068 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-12.C 2022-01-15 12:00:42.393978161 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type float
> +#include "pr103973-9.C"
> --- gcc/testsuite/g++.target/i386/pr103973-13.C.jj      2022-01-15 11:48:31.932388054 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-13.C 2022-01-15 11:48:13.484651079 +0100
> @@ -0,0 +1,62 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -ffast-math -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomisd" 2 { target { ! ia32 } } } }
> +
> +#include <compare>
> +
> +#ifndef double_type
> +#define double_type double
> +#endif
> +
> +__attribute__((noipa)) auto
> +foo (double_type a, double_type b)
> +{
> +  return a <=> b;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double_type a, double_type b)
> +{
> +  auto c = foo (a, b);
> +  if (c == std::partial_ordering::less)
> +    return -1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  return 1;
> +}
> +
> +__attribute__((noipa)) auto
> +baz (double_type a)
> +{
> +  return a <=> 0.0f;
> +}
> +
> +__attribute__((noipa)) int
> +qux (double_type a)
> +{
> +  auto c = baz (a);
> +  if (c == std::partial_ordering::greater)
> +    return 1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  return -1;
> +}
> +
> +int
> +main ()
> +{
> +  double_type m5 = -5.0;
> +  double_type p5 = 5.0;
> +  double_type p0 = 0.0;
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (qux (p0) != 0)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/g++.target/i386/pr103973-14.C.jj      2022-01-15 11:48:31.933388039 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-14.C 2022-01-15 12:00:42.393978161 +0100
> @@ -0,0 +1,7 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#include "pr103973-13.C"
> --- gcc/testsuite/g++.target/i386/pr103973-15.C.jj      2022-01-15 11:48:31.934388025 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-15.C 2022-01-15 11:49:07.262884325 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -ffast-math -save-temps -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomiss" 2 { target { ! ia32 } } } }
> +
> +#define double_type float
> +#include "pr103973-13.C"
> --- gcc/testsuite/g++.target/i386/pr103973-16.C.jj      2022-01-15 11:48:31.935388011 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-16.C 2022-01-15 12:00:42.393978161 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type float
> +#include "pr103973-13.C"
> --- gcc/testsuite/g++.target/i386/pr103973-17.C.jj      2022-01-15 12:09:38.499343432 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-17.C 2022-01-15 12:08:54.276973207 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run { target large_long_double } }
> +// { dg-options "-O2 -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type long double
> +#include "pr103973-1.C"
> --- gcc/testsuite/g++.target/i386/pr103973-18.C.jj      2022-01-15 12:09:41.472301093 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-18.C 2022-01-15 12:09:15.681668382 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run { target large_long_double } }
> +// { dg-options "-O2 -ffast-math -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type long double
> +#include "pr103973-5.C"
> --- gcc/testsuite/g++.target/i386/pr103973-19.C.jj      2022-01-15 12:09:43.544271589 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-19.C 2022-01-15 12:09:22.726568054 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run { target large_long_double } }
> +// { dg-options "-O2 -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type long double
> +#include "pr103973-9.C"
> --- gcc/testsuite/g++.target/i386/pr103973-20.C.jj      2022-01-15 12:09:46.301232323 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-20.C 2022-01-15 12:09:33.491414751 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run { target large_long_double } }
> +// { dg-options "-O2 -ffast-math -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type long double
> +#include "pr103973-13.C"
>
>
>         Jakub
>


* Re: [PATCH] widening_mul, i386, v2: Improve spaceship expansion on x86 [PR103973]
  2022-01-15 11:22       ` [PATCH] widening_mul, i386, v2: " Jakub Jelinek
  2022-01-15 16:40         ` Uros Bizjak
@ 2022-01-17 12:04         ` Richard Biener
  2022-01-17 12:36           ` Jakub Jelinek
  1 sibling, 1 reply; 8+ messages in thread
From: Richard Biener @ 2022-01-17 12:04 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Uros Bizjak, gcc-patches

On Sat, 15 Jan 2022, Jakub Jelinek wrote:

> On Sat, Jan 15, 2022 at 11:42:55AM +0100, Uros Bizjak wrote:
> > Yes, that would be nice. XFmode is used for long double, and not obsolete.
> 
> Ok, that seems to work.  Compared to the incremental patch I've posted, I
> also had to add handling of the case where we have just
> x == y ? 0 : x < y ? -1 : 1 (both for -ffast-math and non-ffast-math).
> Apparently even that is worth optimizing.
> Tested so far on the new testcases, will run full bootstrap/regtest tonight.
> 
> > > Why?  That seems to be a waste of time to me, unless something uses them
> > > already during expansion.  Because pass_expand::execute
> > > runs:
> > >   /* We need JUMP_LABEL be set in order to redirect jumps, and hence
> > >      split edges which edge insertions might do.  */
> > >   rebuild_jump_labels (get_insns ());
> > > which resets all LABEL_NUSES to 0 (well, to:
> > >       if (LABEL_P (insn))
> > >         LABEL_NUSES (insn) = (LABEL_PRESERVE_P (insn) != 0);
> > > and then recomputes them and adds JUMP_LABEL if needed:
> > >               JUMP_LABEL (insn) = label;
> > 
> > I was not aware of that detail. Thanks for sharing (and I wonder if
> > all other cases should be removed from the source).
> 
> I guess it depends.  For code that can only be called during the expand
> pass, dropping it should be just fine; for code that can also (or only) be
> called later, I think setting JUMP_LABEL and a correct LABEL_NUSES is needed
> because nothing will fix it up afterwards.

I'm noting that

+  /* BB must have no executable statements.  */
+  gimple_stmt_iterator gsi = gsi_after_labels (bb);
+  if (phi_nodes (bb))
+    return false;

rejects blocks that contain just a virtual PHI, even though such a PHI
is not an "executable" statement.  I'm not sure whether anything would
break if we relaxed that.

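To make the point concrete, here is a toy model of the check.  It is not
GCC code: StmtKind, Phi and Block are made-up stand-ins for the GIMPLE
structures, and the relaxed variant is only an assumption about what such
a fix could look like, not a tested change.

```cpp
#include <vector>

// Hypothetical stand-ins for GIMPLE statement kinds; not GCC's real types.
enum class StmtKind { Debug, Nop, Predict, Cond, Assign, Call };

// is_virtual models a PHI for the virtual (memory) operand.
struct Phi { bool is_virtual; };

struct Block
{
  std::vector<Phi> phis;
  std::vector<StmtKind> stmts;  // statements after the labels
};

// True if every statement is a debug stmt, nop, predict or cond.
bool
only_cond_like_stmts (const Block &bb)
{
  for (StmtKind k : bb.stmts)
    if (k != StmtKind::Debug && k != StmtKind::Nop
        && k != StmtKind::Predict && k != StmtKind::Cond)
      return false;
  return true;
}

// Models the check as posted: any PHI at all rejects the block.
bool
cond_only_block_strict (const Block &bb)
{
  return bb.phis.empty () && only_cond_like_stmts (bb);
}

// Assumed relaxation: only non-virtual PHIs reject the block.
bool
cond_only_block_relaxed (const Block &bb)
{
  for (const Phi &p : bb.phis)
    if (!p.is_virtual)
      return false;
  return only_cond_like_stmts (bb);
}
```

The two predicates differ only for a block whose PHIs are all virtual,
which is exactly the case in question.
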
For code generation we rely on the RTL optimizers to merge the
compare/scc with the subsequent branches on -1/0/1/[2], correct?  I
wonder whether that works on other targets as well, or whether an
asm-goto with "optab" UNSPEC text would be more forward-looking.
The restriction to scalar floats is presumably because scalar
integers are already handled well and vectors would need very
different tricks, right?
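
For reference, the contract those branches rely on can be modelled in
plain C++.  This is only a sketch of the value encoding documented in the
md.texi hunk (-1 less, 0 equal, 1 greater, 2 unordered); the names
spaceship_model, three_way and kNaN are made up for illustration and are
not part of the patch.

```cpp
#include <limits>

// Example unordered operand for experiments.
const double kNaN = std::numeric_limits<double>::quiet_NaN ();

// Scalar model of the value the spaceship pattern is documented to yield.
int
spaceship_model (double a, double b)
{
  if (a == b)
    return 0;
  if (a < b)
    return -1;
  if (a > b)
    return 1;
  return 2;  // unordered: at least one operand is a NaN
}

// Consumer shaped like the transformed code: one evaluation of the
// comparison, followed by integer tests on its result.  This mirrors the
// x == y ? 0 : x < y ? -1 : 1 form, so unordered falls through to 1.
int
three_way (double a, double b)
{
  int r = spaceship_model (a, b);
  if (r == 0)
    return 0;
  if (r == -1)
    return -1;
  return 1;
}
```

The point of the encoding is that every later test is an integer
equality against a constant, which is what the RTL optimizers are
expected to fold back into branches on the flags of the one comparison.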

The middle-end changes look OK; I don't see anything that
couldn't be changed later if other targets run into problems
getting similarly optimized code.

Thanks,
Richard.

> 2022-01-15  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR target/103973
> 	* tree-cfg.h (cond_only_block_p): Declare.
> 	* tree-ssa-phiopt.c (cond_only_block_p): Move function to ...
> 	* tree-cfg.c (cond_only_block_p): ... here.  No longer static.
> 	* optabs.def (spaceship_optab): New optab.
> 	* internal-fn.def (SPACESHIP): New internal function.
> 	* internal-fn.h (expand_SPACESHIP): Declare.
> 	* internal-fn.c (expand_PHI): Formatting fix.
> 	(expand_SPACESHIP): New function.
> 	* tree-ssa-math-opts.c (optimize_spaceship): New function.
> 	(math_opts_dom_walker::after_dom_children): Use it.
> 	* config/i386/i386.md (spaceship<mode>3): New define_expand.
> 	* config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare.
> 	* config/i386/i386-expand.c (ix86_expand_fp_spaceship): New function.
> 	* doc/md.texi (spaceship@var{m}3): Document.
> 
> 	* gcc.target/i386/pr103973-1.c: New test.
> 	* gcc.target/i386/pr103973-2.c: New test.
> 	* gcc.target/i386/pr103973-3.c: New test.
> 	* gcc.target/i386/pr103973-4.c: New test.
> 	* gcc.target/i386/pr103973-5.c: New test.
> 	* gcc.target/i386/pr103973-6.c: New test.
> 	* gcc.target/i386/pr103973-7.c: New test.
> 	* gcc.target/i386/pr103973-8.c: New test.
> 	* gcc.target/i386/pr103973-9.c: New test.
> 	* gcc.target/i386/pr103973-10.c: New test.
> 	* gcc.target/i386/pr103973-11.c: New test.
> 	* gcc.target/i386/pr103973-12.c: New test.
> 	* gcc.target/i386/pr103973-13.c: New test.
> 	* gcc.target/i386/pr103973-14.c: New test.
> 	* gcc.target/i386/pr103973-15.c: New test.
> 	* gcc.target/i386/pr103973-16.c: New test.
> 	* gcc.target/i386/pr103973-17.c: New test.
> 	* gcc.target/i386/pr103973-18.c: New test.
> 	* gcc.target/i386/pr103973-19.c: New test.
> 	* gcc.target/i386/pr103973-20.c: New test.
> 	* g++.target/i386/pr103973-1.C: New test.
> 	* g++.target/i386/pr103973-2.C: New test.
> 	* g++.target/i386/pr103973-3.C: New test.
> 	* g++.target/i386/pr103973-4.C: New test.
> 	* g++.target/i386/pr103973-5.C: New test.
> 	* g++.target/i386/pr103973-6.C: New test.
> 	* g++.target/i386/pr103973-7.C: New test.
> 	* g++.target/i386/pr103973-8.C: New test.
> 	* g++.target/i386/pr103973-9.C: New test.
> 	* g++.target/i386/pr103973-10.C: New test.
> 	* g++.target/i386/pr103973-11.C: New test.
> 	* g++.target/i386/pr103973-12.C: New test.
> 	* g++.target/i386/pr103973-13.C: New test.
> 	* g++.target/i386/pr103973-14.C: New test.
> 	* g++.target/i386/pr103973-15.C: New test.
> 	* g++.target/i386/pr103973-16.C: New test.
> 	* g++.target/i386/pr103973-17.C: New test.
> 	* g++.target/i386/pr103973-18.C: New test.
> 	* g++.target/i386/pr103973-19.C: New test.
> 	* g++.target/i386/pr103973-20.C: New test.
> 
> --- gcc/tree-cfg.h.jj	2022-01-14 23:57:44.491718086 +0100
> +++ gcc/tree-cfg.h	2022-01-15 09:51:25.359468982 +0100
> @@ -111,6 +111,7 @@ extern basic_block gimple_switch_label_b
>  extern basic_block gimple_switch_default_bb (function *, gswitch *);
>  extern edge gimple_switch_edge (function *, gswitch *, unsigned);
>  extern edge gimple_switch_default_edge (function *, gswitch *);
> +extern bool cond_only_block_p (basic_block);
>  
>  /* Return true if the LHS of a call should be removed.  */
>  
> --- gcc/tree-ssa-phiopt.c.jj	2022-01-14 23:57:44.536717549 +0100
> +++ gcc/tree-ssa-phiopt.c	2022-01-15 09:51:25.361468954 +0100
> @@ -1958,31 +1958,6 @@ minmax_replacement (basic_block cond_bb,
>    return true;
>  }
>  
> -/* Return true if the only executable statement in BB is a GIMPLE_COND.  */
> -
> -static bool
> -cond_only_block_p (basic_block bb)
> -{
> -  /* BB must have no executable statements.  */
> -  gimple_stmt_iterator gsi = gsi_after_labels (bb);
> -  if (phi_nodes (bb))
> -    return false;
> -  while (!gsi_end_p (gsi))
> -    {
> -      gimple *stmt = gsi_stmt (gsi);
> -      if (is_gimple_debug (stmt))
> -	;
> -      else if (gimple_code (stmt) == GIMPLE_NOP
> -	       || gimple_code (stmt) == GIMPLE_PREDICT
> -	       || gimple_code (stmt) == GIMPLE_COND)
> -	;
> -      else
> -	return false;
> -      gsi_next (&gsi);
> -    }
> -  return true;
> -}
> -
>  /* Attempt to optimize (x <=> y) cmp 0 and similar comparisons.
>     For strong ordering <=> try to match something like:
>      <bb 2> :  // cond3_bb (== cond2_bb)
> --- gcc/tree-cfg.c.jj	2022-01-14 23:57:44.477718253 +0100
> +++ gcc/tree-cfg.c	2022-01-15 09:51:25.363468925 +0100
> @@ -9410,6 +9410,31 @@ gimple_switch_default_edge (function *if
>    return gimple_switch_edge (ifun, gs, 0);
>  }
>  
> +/* Return true if the only executable statement in BB is a GIMPLE_COND.  */
> +
> +bool
> +cond_only_block_p (basic_block bb)
> +{
> +  /* BB must have no executable statements.  */
> +  gimple_stmt_iterator gsi = gsi_after_labels (bb);
> +  if (phi_nodes (bb))
> +    return false;
> +  while (!gsi_end_p (gsi))
> +    {
> +      gimple *stmt = gsi_stmt (gsi);
> +      if (is_gimple_debug (stmt))
> +	;
> +      else if (gimple_code (stmt) == GIMPLE_NOP
> +	       || gimple_code (stmt) == GIMPLE_PREDICT
> +	       || gimple_code (stmt) == GIMPLE_COND)
> +	;
> +      else
> +	return false;
> +      gsi_next (&gsi);
> +    }
> +  return true;
> +}
> +
>  
>  /* Emit return warnings.  */
>  
> --- gcc/optabs.def.jj	2022-01-14 23:57:44.445718634 +0100
> +++ gcc/optabs.def	2022-01-15 09:51:25.383468640 +0100
> @@ -259,6 +259,7 @@ OPTAB_D (usubv4_optab, "usubv$I$a4")
>  OPTAB_D (umulv4_optab, "umulv$I$a4")
>  OPTAB_D (negv3_optab, "negv$I$a3")
>  OPTAB_D (addptr3_optab, "addptr$a3")
> +OPTAB_D (spaceship_optab, "spaceship$a3")
>  
>  OPTAB_D (smul_highpart_optab, "smul$a3_highpart")
>  OPTAB_D (umul_highpart_optab, "umul$a3_highpart")
> --- gcc/internal-fn.def.jj	2022-01-14 23:57:44.433718778 +0100
> +++ gcc/internal-fn.def	2022-01-15 09:51:25.399468413 +0100
> @@ -430,6 +430,9 @@ DEF_INTERNAL_FN (NOP, ECF_CONST | ECF_LE
>  /* Temporary vehicle for __builtin_shufflevector.  */
>  DEF_INTERNAL_FN (SHUFFLEVECTOR, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
>  
> +/* <=> optimization.  */
> +DEF_INTERNAL_FN (SPACESHIP, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
> +
>  #undef DEF_INTERNAL_INT_FN
>  #undef DEF_INTERNAL_FLT_FN
>  #undef DEF_INTERNAL_FLT_FLOATN_FN
> --- gcc/internal-fn.h.jj	2022-01-14 23:57:44.445718634 +0100
> +++ gcc/internal-fn.h	2022-01-15 09:51:25.399468413 +0100
> @@ -241,6 +241,7 @@ extern void expand_internal_call (gcall
>  extern void expand_internal_call (internal_fn, gcall *);
>  extern void expand_PHI (internal_fn, gcall *);
>  extern void expand_SHUFFLEVECTOR (internal_fn, gcall *);
> +extern void expand_SPACESHIP (internal_fn, gcall *);
>  
>  extern bool vectorized_internal_fn_supported_p (internal_fn, tree);
>  
> --- gcc/internal-fn.c.jj	2022-01-14 23:57:44.433718778 +0100
> +++ gcc/internal-fn.c	2022-01-15 09:51:25.400468399 +0100
> @@ -4425,5 +4425,27 @@ expand_SHUFFLEVECTOR (internal_fn, gcall
>  void
>  expand_PHI (internal_fn, gcall *)
>  {
> -    gcc_unreachable ();
> +  gcc_unreachable ();
> +}
> +
> +void
> +expand_SPACESHIP (internal_fn, gcall *stmt)
> +{
> +  tree lhs = gimple_call_lhs (stmt);
> +  tree rhs1 = gimple_call_arg (stmt, 0);
> +  tree rhs2 = gimple_call_arg (stmt, 1);
> +  tree type = TREE_TYPE (rhs1);
> +
> +  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
> +  rtx op1 = expand_normal (rhs1);
> +  rtx op2 = expand_normal (rhs2);
> +
> +  class expand_operand ops[3];
> +  create_output_operand (&ops[0], target, TYPE_MODE (TREE_TYPE (lhs)));
> +  create_input_operand (&ops[1], op1, TYPE_MODE (type));
> +  create_input_operand (&ops[2], op2, TYPE_MODE (type));
> +  insn_code icode = optab_handler (spaceship_optab, TYPE_MODE (type));
> +  expand_insn (icode, 3, ops);
> +  if (!rtx_equal_p (target, ops[0].value))
> +    emit_move_insn (target, ops[0].value);
>  }
> --- gcc/tree-ssa-math-opts.c.jj	2022-01-14 23:57:44.492718074 +0100
> +++ gcc/tree-ssa-math-opts.c	2022-01-15 11:37:13.131069782 +0100
> @@ -4637,6 +4637,227 @@ convert_mult_to_highpart (gassign *stmt,
>    return true;
>  }
>  
> +/* If target has spaceship<MODE>3 expander, pattern recognize
> +   <bb 2> [local count: 1073741824]:
> +   if (a_2(D) == b_3(D))
> +     goto <bb 6>; [34.00%]
> +   else
> +     goto <bb 3>; [66.00%]
> +
> +   <bb 3> [local count: 708669601]:
> +   if (a_2(D) < b_3(D))
> +     goto <bb 6>; [1.04%]
> +   else
> +     goto <bb 4>; [98.96%]
> +
> +   <bb 4> [local count: 701299439]:
> +   if (a_2(D) > b_3(D))
> +     goto <bb 5>; [48.89%]
> +   else
> +     goto <bb 6>; [51.11%]
> +
> +   <bb 5> [local count: 342865295]:
> +
> +   <bb 6> [local count: 1073741824]:
> +   and turn it into:
> +   <bb 2> [local count: 1073741824]:
> +   _1 = .SPACESHIP (a_2(D), b_3(D));
> +   if (_1 == 0)
> +     goto <bb 6>; [34.00%]
> +   else
> +     goto <bb 3>; [66.00%]
> +
> +   <bb 3> [local count: 708669601]:
> +   if (_1 == -1)
> +     goto <bb 6>; [1.04%]
> +   else
> +     goto <bb 4>; [98.96%]
> +
> +   <bb 4> [local count: 701299439]:
> +   if (_1 == 1)
> +     goto <bb 5>; [48.89%]
> +   else
> +     goto <bb 6>; [51.11%]
> +
> +   <bb 5> [local count: 342865295]:
> +
> +   <bb 6> [local count: 1073741824]:
> +   so that the backend can emit optimal comparison and
> +   conditional jump sequence.  */
> +
> +static void
> +optimize_spaceship (gimple *stmt)
> +{
> +  enum tree_code code = gimple_cond_code (stmt);
> +  if (code != EQ_EXPR && code != NE_EXPR)
> +    return;
> +  tree arg1 = gimple_cond_lhs (stmt);
> +  tree arg2 = gimple_cond_rhs (stmt);
> +  if (!SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg1))
> +      || optab_handler (spaceship_optab,
> +			TYPE_MODE (TREE_TYPE (arg1))) == CODE_FOR_nothing
> +      || operand_equal_p (arg1, arg2, 0))
> +    return;
> +
> +  basic_block bb0 = gimple_bb (stmt), bb1, bb2 = NULL;
> +  edge em1 = NULL, e1 = NULL, e2 = NULL;
> +  bb1 = EDGE_SUCC (bb0, 1)->dest;
> +  if (((EDGE_SUCC (bb0, 0)->flags & EDGE_TRUE_VALUE) != 0) ^ (code == EQ_EXPR))
> +    bb1 = EDGE_SUCC (bb0, 0)->dest;
> +
> +  gimple *g = last_stmt (bb1);
> +  if (g == NULL
> +      || gimple_code (g) != GIMPLE_COND
> +      || !single_pred_p (bb1)
> +      || (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
> +	  ? !operand_equal_p (gimple_cond_rhs (g), arg2, 0)
> +	  : (!operand_equal_p (gimple_cond_lhs (g), arg2, 0)
> +	     || !operand_equal_p (gimple_cond_rhs (g), arg1, 0)))
> +      || !cond_only_block_p (bb1))
> +    return;
> +
> +  enum tree_code ccode = (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
> +			  ? LT_EXPR : GT_EXPR);
> +  switch (gimple_cond_code (g))
> +    {
> +    case LT_EXPR:
> +    case LE_EXPR:
> +      break;
> +    case GT_EXPR:
> +    case GE_EXPR:
> +      ccode = ccode == LT_EXPR ? GT_EXPR : LT_EXPR;
> +      break;
> +    default:
> +      return;
> +    }
> +
> +  for (int i = 0; i < 2; ++i)
> +    {
> +      /* With NaNs, </<=/>/>= are false, so we need to look for the
> +	 third comparison on the false edge from whatever non-equality
> +	 comparison the second comparison is.  */
> +      if (HONOR_NANS (TREE_TYPE (arg1))
> +	  && (EDGE_SUCC (bb1, i)->flags & EDGE_TRUE_VALUE) != 0)
> +	continue;
> +
> +      bb2 = EDGE_SUCC (bb1, i)->dest;
> +      g = last_stmt (bb2);
> +      if (g == NULL
> +	  || gimple_code (g) != GIMPLE_COND
> +	  || !single_pred_p (bb2)
> +	  || (operand_equal_p (gimple_cond_lhs (g), arg1, 0)
> +	      ? !operand_equal_p (gimple_cond_rhs (g), arg2, 0)
> +	      : (!operand_equal_p (gimple_cond_lhs (g), arg2, 0)
> +		 || !operand_equal_p (gimple_cond_rhs (g), arg1, 0)))
> +	  || !cond_only_block_p (bb2)
> +	  || EDGE_SUCC (bb2, 0)->dest == EDGE_SUCC (bb2, 1)->dest)
> +	continue;
> +
> +      enum tree_code ccode2
> +	= (operand_equal_p (gimple_cond_lhs (g), arg1, 0) ? LT_EXPR : GT_EXPR);
> +      switch (gimple_cond_code (g))
> +	{
> +	case LT_EXPR:
> +	case LE_EXPR:
> +	  break;
> +	case GT_EXPR:
> +	case GE_EXPR:
> +	  ccode2 = ccode2 == LT_EXPR ? GT_EXPR : LT_EXPR;
> +	  break;
> +	default:
> +	  continue;
> +	}
> +      if (HONOR_NANS (TREE_TYPE (arg1)) && ccode == ccode2)
> +	continue;
> +
> +      if ((ccode == LT_EXPR)
> +	  ^ ((EDGE_SUCC (bb1, i)->flags & EDGE_TRUE_VALUE) != 0))
> +	{
> +	  em1 = EDGE_SUCC (bb1, 1 - i);
> +	  e1 = EDGE_SUCC (bb2, 0);
> +	  e2 = EDGE_SUCC (bb2, 1);
> +	  if ((ccode2 == LT_EXPR) ^ ((e1->flags & EDGE_TRUE_VALUE) == 0))
> +	    std::swap (e1, e2);
> +	}
> +      else
> +	{
> +	  e1 = EDGE_SUCC (bb1, 1 - i);
> +	  em1 = EDGE_SUCC (bb2, 0);
> +	  e2 = EDGE_SUCC (bb2, 1);
> +	  if ((ccode2 != LT_EXPR) ^ ((em1->flags & EDGE_TRUE_VALUE) == 0))
> +	    std::swap (em1, e2);
> +	}
> +      break;
> +    }
> +
> +  if (em1 == NULL)
> +    {
> +      if ((ccode == LT_EXPR)
> +	  ^ ((EDGE_SUCC (bb1, 0)->flags & EDGE_TRUE_VALUE) != 0))
> +	{
> +	  em1 = EDGE_SUCC (bb1, 1);
> +	  e1 = EDGE_SUCC (bb1, 0);
> +	  e2 = (e1->flags & EDGE_TRUE_VALUE) ? em1 : e1;
> +	}
> +      else
> +	{
> +	  em1 = EDGE_SUCC (bb1, 0);
> +	  e1 = EDGE_SUCC (bb1, 1);
> +	  e2 = (e1->flags & EDGE_TRUE_VALUE) ? em1 : e1;
> +	}
> +    }
> +
> +  g = gimple_build_call_internal (IFN_SPACESHIP, 2, arg1, arg2);
> +  tree lhs = make_ssa_name (integer_type_node);
> +  gimple_call_set_lhs (g, lhs);
> +  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
> +  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> +
> +  gcond *cond = as_a <gcond *> (stmt);
> +  gimple_cond_set_lhs (cond, lhs);
> +  gimple_cond_set_rhs (cond, integer_zero_node);
> +  update_stmt (stmt);
> +
> +  g = last_stmt (bb1);
> +  cond = as_a <gcond *> (g);
> +  gimple_cond_set_lhs (cond, lhs);
> +  if (em1->src == bb1 && e2 != em1)
> +    {
> +      gimple_cond_set_rhs (cond, integer_minus_one_node);
> +      gimple_cond_set_code (cond, (em1->flags & EDGE_TRUE_VALUE)
> +				  ? EQ_EXPR : NE_EXPR);
> +    }
> +  else
> +    {
> +      gcc_assert (e1->src == bb1 && e2 != e1);
> +      gimple_cond_set_rhs (cond, integer_one_node);
> +      gimple_cond_set_code (cond, (e1->flags & EDGE_TRUE_VALUE)
> +				  ? EQ_EXPR : NE_EXPR);
> +    }
> +  update_stmt (g);
> +
> +  if (e2 != e1 && e2 != em1)
> +    {
> +      g = last_stmt (bb2);
> +      cond = as_a <gcond *> (g);
> +      gimple_cond_set_lhs (cond, lhs);
> +      if (em1->src == bb2)
> +	gimple_cond_set_rhs (cond, integer_minus_one_node);
> +      else
> +	{
> +	  gcc_assert (e1->src == bb2);
> +	  gimple_cond_set_rhs (cond, integer_one_node);
> +	}
> +      gimple_cond_set_code (cond,
> +			    (e2->flags & EDGE_TRUE_VALUE) ? NE_EXPR : EQ_EXPR);
> +      update_stmt (g);
> +    }
> +
> +  wide_int wm1 = wi::minus_one (TYPE_PRECISION (integer_type_node));
> +  wide_int w2 = wi::two (TYPE_PRECISION (integer_type_node));
> +  set_range_info (lhs, VR_RANGE, wm1, w2);
> +}
> +
>  
>  /* Find integer multiplications where the operands are extended from
>     smaller types, and replace the MULT_EXPR with a WIDEN_MULT_EXPR
> @@ -4798,6 +5019,8 @@ math_opts_dom_walker::after_dom_children
>  	      break;
>  	    }
>  	}
> +      else if (gimple_code (stmt) == GIMPLE_COND)
> +	optimize_spaceship (stmt);
>        gsi_next (&gsi);
>      }
>    if (fma_state.m_deferring_p
> --- gcc/config/i386/i386.md.jj	2022-01-14 23:57:59.047544505 +0100
> +++ gcc/config/i386/i386.md	2022-01-15 12:13:28.116073760 +0100
> @@ -23886,6 +23886,28 @@ (define_insn "hreset"
>    [(set_attr "type" "other")
>     (set_attr "length" "4")])
>  
> +;; Spaceship optimization
> +(define_expand "spaceship<mode>3"
> +  [(match_operand:SI 0 "register_operand")
> +   (match_operand:MODEF 1 "cmp_fp_expander_operand")
> +   (match_operand:MODEF 2 "cmp_fp_expander_operand")]
> +  "(TARGET_80387 || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH))
> +   && (TARGET_CMOVE || (TARGET_SAHF && TARGET_USE_SAHF))"
> +{
> +  ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]);
> +  DONE;
> +})
> +
> +(define_expand "spaceshipxf3"
> +  [(match_operand:SI 0 "register_operand")
> +   (match_operand:XF 1 "nonmemory_operand")
> +   (match_operand:XF 2 "nonmemory_operand")]
> +  "TARGET_80387 && (TARGET_CMOVE || (TARGET_SAHF && TARGET_USE_SAHF))"
> +{
> +  ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]);
> +  DONE;
> +})
> +
>  (include "mmx.md")
>  (include "sse.md")
>  (include "sync.md")
> --- gcc/config/i386/i386-protos.h.jj	2022-01-14 23:57:44.398719195 +0100
> +++ gcc/config/i386/i386-protos.h	2022-01-15 09:51:25.410468256 +0100
> @@ -150,6 +150,7 @@ extern bool ix86_expand_int_vec_cmp (rtx
>  extern bool ix86_expand_fp_vec_cmp (rtx[]);
>  extern void ix86_expand_sse_movcc (rtx, rtx, rtx, rtx);
>  extern void ix86_expand_sse_unpack (rtx, rtx, bool, bool);
> +extern void ix86_expand_fp_spaceship (rtx, rtx, rtx);
>  extern bool ix86_expand_int_addcc (rtx[]);
>  extern rtx_insn *ix86_expand_call (rtx, rtx, rtx, rtx, rtx, bool);
>  extern bool ix86_call_use_plt_p (rtx);
> --- gcc/config/i386/i386-expand.c.jj	2022-01-14 23:57:44.379719421 +0100
> +++ gcc/config/i386/i386-expand.c	2022-01-15 10:38:26.924333651 +0100
> @@ -2879,6 +2879,54 @@ ix86_expand_setcc (rtx dest, enum rtx_co
>    emit_insn (gen_rtx_SET (dest, ret));
>  }
>  
> +/* Expand floating point op0 <=> op1 if NaNs are honored.  */
> +
> +void
> +ix86_expand_fp_spaceship (rtx dest, rtx op0, rtx op1)
> +{
> +  gcc_checking_assert (ix86_fp_comparison_strategy (GT) != IX86_FPCMP_ARITH);
> +  rtx gt = ix86_expand_fp_compare (GT, op0, op1);
> +  rtx l0 = gen_label_rtx ();
> +  rtx l1 = gen_label_rtx ();
> +  rtx l2 = TARGET_IEEE_FP ? gen_label_rtx () : NULL_RTX;
> +  rtx lend = gen_label_rtx ();
> +  rtx tmp;
> +  rtx_insn *jmp;
> +  if (l2)
> +    {
> +      rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode,
> +			       gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> +      tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un,
> +				  gen_rtx_LABEL_REF (VOIDmode, l2), pc_rtx);
> +      jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> +      add_reg_br_prob_note (jmp, profile_probability:: very_unlikely ());
> +    }
> +  rtx eq = gen_rtx_fmt_ee (UNEQ, VOIDmode,
> +			   gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, eq,
> +			      gen_rtx_LABEL_REF (VOIDmode, l0), pc_rtx);
> +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> +  add_reg_br_prob_note (jmp, profile_probability::unlikely ());
> +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, gt,
> +			      gen_rtx_LABEL_REF (VOIDmode, l1), pc_rtx);
> +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> +  add_reg_br_prob_note (jmp, profile_probability::even ());
> +  emit_move_insn (dest, constm1_rtx);
> +  emit_jump (lend);
> +  emit_label (l0);
> +  emit_move_insn (dest, const0_rtx);
> +  emit_jump (lend);
> +  emit_label (l1);
> +  emit_move_insn (dest, const1_rtx);
> +  emit_jump (lend);
> +  if (l2)
> +    {
> +      emit_label (l2);
> +      emit_move_insn (dest, const2_rtx);
> +    }
> +  emit_label (lend);
> +}
> +
>  /* Expand comparison setting or clearing carry flag.  Return true when
>     successful and set pop for the operation.  */
>  static bool
> --- gcc/doc/md.texi.jj	2022-01-14 23:57:44.419718944 +0100
> +++ gcc/doc/md.texi	2022-01-15 09:51:25.429467985 +0100
> @@ -8055,6 +8055,15 @@ inclusive and operand 1 exclusive.
>  If this pattern is not defined, a call to the library function
>  @code{__clear_cache} is used.
>  
> +@cindex @code{spaceship@var{m}3} instruction pattern
> +@item @samp{spaceship@var{m}3}
> +Initialize output operand 0 with mode of integer type to -1, 0, 1 or 2
> +if operand 1 with mode @var{m} compares less than operand 2, equal to
> +operand 2, greater than operand 2 or is unordered with operand 2.
> +@var{m} should be a scalar floating point mode.
> +
> +This pattern is not allowed to @code{FAIL}.
> +
>  @end table
>  
>  @end ifset
> --- gcc/testsuite/gcc.target/i386/pr103973-1.c.jj	2022-01-15 09:51:25.430467971 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-1.c	2022-01-15 09:51:25.430467971 +0100
> @@ -0,0 +1,98 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomisd" 4 { target { ! ia32 } } } } */
> +
> +__attribute__((noipa)) int m1 (void) { return -1; }
> +__attribute__((noipa)) int p0 (void) { return 0; }
> +__attribute__((noipa)) int p1 (void) { return 1; }
> +__attribute__((noipa)) int p2 (void) { return 2; }
> +
> +__attribute__((noipa)) int
> +foo (double a, double b)
> +{
> +  if (a == b)
> +    return 0;
> +  if (a < b)
> +    return -1;
> +  if (a > b)
> +    return 1;
> +  return 2;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (a < b)
> +    return m1 ();
> +  if (a > b)
> +    return p1 ();
> +  return p2 ();
> +}
> +
> +__attribute__((noipa)) int
> +baz (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (b < a)
> +    return p1 ();
> +  if (a < b)
> +    return m1 ();
> +  return p2 ();
> +}
> +
> +__attribute__((noipa)) int
> +qux (double a)
> +{
> +  if (a != 0.0f)
> +    {
> +      if (a <= 0.0f)
> +	return -1;
> +      if (a >= 0.0f)
> +	return 1;
> +      return 2;
> +    }
> +  return 0;
> +}
> +
> +int
> +main ()
> +{
> +  double m5 = -5.0f;
> +  double p5 = 5.0f;
> +  volatile double p0 = 0.0f;
> +  double nan = p0 / p0;
> +  if (foo (p5, p5) != 0 || foo (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (foo (m5, p5) != -1 || foo (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (foo (m5, nan) != 2 || foo (nan, p5) != 2)
> +    __builtin_abort ();
> +  if (foo (nan, nan) != 2)
> +    __builtin_abort ();
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (bar (m5, nan) != 2 || bar (nan, p5) != 2)
> +    __builtin_abort ();
> +  if (bar (nan, nan) != 2)
> +    __builtin_abort ();
> +  if (baz (p5, p5) != 0 || baz (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (baz (m5, p5) != -1 || baz (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (baz (m5, nan) != 2 || baz (nan, p5) != 2)
> +    __builtin_abort ();
> +  if (baz (nan, nan) != 2)
> +    __builtin_abort ();
> +  if (qux (p0) != 0 || qux (nan) != 2)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/gcc.target/i386/pr103973-2.c.jj	2022-01-15 09:51:25.430467971 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-2.c	2022-01-15 12:00:15.864355970 +0100
> @@ -0,0 +1,7 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#include "pr103973-1.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-3.c.jj	2022-01-15 09:51:25.430467971 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-3.c	2022-01-15 09:51:25.430467971 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomiss" 4 { target { ! ia32 } } } } */
> +
> +#define double float
> +#include "pr103973-1.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-4.c.jj	2022-01-15 09:51:25.430467971 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-4.c	2022-01-15 12:00:15.864355970 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double float
> +#include "pr103973-1.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-5.c.jj	2022-01-15 11:04:02.427452420 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-5.c	2022-01-15 11:06:39.594216502 +0100
> @@ -0,0 +1,85 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -ffast-math -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomisd" 4 { target { ! ia32 } } } } */
> +
> +__attribute__((noipa)) int m1 (void) { return -1; }
> +__attribute__((noipa)) int p0 (void) { return 0; }
> +__attribute__((noipa)) int p1 (void) { return 1; }
> +__attribute__((noipa)) int p2 (void) { return 2; }
> +
> +__attribute__((noipa)) int
> +foo (double a, double b)
> +{
> +  if (a == b)
> +    return 0;
> +  if (a < b)
> +    return -1;
> +  if (a > b)
> +    return 1;
> +  return 2;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (a < b)
> +    return m1 ();
> +  if (a > b)
> +    return p1 ();
> +  return p2 ();
> +}
> +
> +__attribute__((noipa)) int
> +baz (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (b < a)
> +    return p1 ();
> +  if (a < b)
> +    return m1 ();
> +  return p2 ();
> +}
> +
> +__attribute__((noipa)) int
> +qux (double a)
> +{
> +  if (a != 0.0f)
> +    {
> +      if (a <= 0.0f)
> +	return -1;
> +      if (a >= 0.0f)
> +	return 1;
> +      return 2;
> +    }
> +  return 0;
> +}
> +
> +int
> +main ()
> +{
> +  double m5 = -5.0f;
> +  double p5 = 5.0f;
> +  double p0 = 0.0f;
> +  if (foo (p5, p5) != 0 || foo (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (foo (m5, p5) != -1 || foo (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (baz (p5, p5) != 0 || baz (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (baz (m5, p5) != -1 || baz (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (qux (p0) != 0)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/gcc.target/i386/pr103973-6.c.jj	2022-01-15 11:05:24.377286081 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-6.c	2022-01-15 12:00:15.864355970 +0100
> @@ -0,0 +1,7 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#include "pr103973-5.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-7.c.jj	2022-01-15 11:05:28.620225748 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-7.c	2022-01-15 11:06:03.899724076 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -ffast-math -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomiss" 4 { target { ! ia32 } } } } */
> +
> +#define double float
> +#include "pr103973-5.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-8.c.jj	2022-01-15 11:05:32.273173801 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-8.c	2022-01-15 12:00:15.865355956 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double float
> +#include "pr103973-5.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-9.c.jj	2022-01-15 11:41:11.895661977 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-9.c	2022-01-15 11:43:31.718668421 +0100
> @@ -0,0 +1,89 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomisd" 4 { target { ! ia32 } } } } */
> +
> +__attribute__((noipa)) int m1 (void) { return -1; }
> +__attribute__((noipa)) int p0 (void) { return 0; }
> +__attribute__((noipa)) int p1 (void) { return 1; }
> +
> +__attribute__((noipa)) int
> +foo (double a, double b)
> +{
> +  if (a == b)
> +    return 0;
> +  if (a < b)
> +    return -1;
> +  return 1;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (a < b)
> +    return m1 ();
> +  return p1 ();
> +}
> +
> +__attribute__((noipa)) int
> +baz (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (b < a)
> +    return p1 ();
> +  return m1 ();
> +}
> +
> +__attribute__((noipa)) int
> +qux (double a)
> +{
> +  if (a != 0.0f)
> +    {
> +      if (a <= 0.0f)
> +	return -1;
> +      return 1;
> +    }
> +  return 0;
> +}
> +
> +int
> +main ()
> +{
> +  double m5 = -5.0f;
> +  double p5 = 5.0f;
> +  volatile double p0 = 0.0f;
> +  double nan = p0 / p0;
> +  if (foo (p5, p5) != 0 || foo (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (foo (m5, p5) != -1 || foo (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (foo (m5, nan) != 1 || foo (nan, p5) != 1)
> +    __builtin_abort ();
> +  if (foo (nan, nan) != 1)
> +    __builtin_abort ();
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (bar (m5, nan) != 1 || bar (nan, p5) != 1)
> +    __builtin_abort ();
> +  if (bar (nan, nan) != 1)
> +    __builtin_abort ();
> +  if (baz (p5, p5) != 0 || baz (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (baz (m5, p5) != -1 || baz (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (baz (m5, nan) != -1 || baz (nan, p5) != -1)
> +    __builtin_abort ();
> +  if (baz (nan, nan) != -1)
> +    __builtin_abort ();
> +  if (qux (p0) != 0 || qux (nan) != 1)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/gcc.target/i386/pr103973-10.c.jj	2022-01-15 11:44:56.503459584 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-10.c	2022-01-15 12:00:15.865355956 +0100
> @@ -0,0 +1,7 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#include "pr103973-9.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-11.c.jj	2022-01-15 11:44:56.504459570 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-11.c	2022-01-15 11:45:08.783284502 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomiss" 4 { target { ! ia32 } } } } */
> +
> +#define double float
> +#include "pr103973-9.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-12.c.jj	2022-01-15 11:44:56.506459542 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-12.c	2022-01-15 12:00:15.865355956 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double float
> +#include "pr103973-9.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-13.c.jj	2022-01-15 11:44:56.507459527 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-13.c	2022-01-15 11:44:19.254990661 +0100
> @@ -0,0 +1,76 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -ffast-math -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomisd" 4 { target { ! ia32 } } } } */
> +
> +__attribute__((noipa)) int m1 (void) { return -1; }
> +__attribute__((noipa)) int p0 (void) { return 0; }
> +__attribute__((noipa)) int p1 (void) { return 1; }
> +
> +__attribute__((noipa)) int
> +foo (double a, double b)
> +{
> +  if (a == b)
> +    return 0;
> +  if (a < b)
> +    return -1;
> +  return 1;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (a < b)
> +    return m1 ();
> +  return p1 ();
> +}
> +
> +__attribute__((noipa)) int
> +baz (double a, double b)
> +{
> +  if (a == b)
> +    return p0 ();
> +  if (b < a)
> +    return p1 ();
> +  return m1 ();
> +}
> +
> +__attribute__((noipa)) int
> +qux (double a)
> +{
> +  if (a != 0.0f)
> +    {
> +      if (a <= 0.0f)
> +	return -1;
> +      return 1;
> +    }
> +  return 0;
> +}
> +
> +int
> +main ()
> +{
> +  double m5 = -5.0f;
> +  double p5 = 5.0f;
> +  double p0 = 0.0f;
> +  if (foo (p5, p5) != 0 || foo (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (foo (m5, p5) != -1 || foo (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (baz (p5, p5) != 0 || baz (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (baz (m5, p5) != -1 || baz (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (qux (p0) != 0)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/gcc.target/i386/pr103973-14.c.jj	2022-01-15 11:44:56.508459513 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-14.c	2022-01-15 12:00:15.865355956 +0100
> @@ -0,0 +1,7 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#include "pr103973-13.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-15.c.jj	2022-01-15 11:44:56.509459499 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-15.c	2022-01-15 11:45:27.532017186 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -ffast-math -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "\tcomiss" 4 { target { ! ia32 } } } } */
> +
> +#define double float
> +#include "pr103973-13.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-16.c.jj	2022-01-15 11:44:56.510459485 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-16.c	2022-01-15 12:00:15.865355956 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double float
> +#include "pr103973-13.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-17.c.jj	2022-01-15 12:01:30.713290043 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-17.c	2022-01-15 12:08:07.244642996 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run { target large_long_double } } */
> +/* { dg-options "-O2 -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double long double
> +#include "pr103973-1.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-18.c.jj	2022-01-15 12:04:28.332760546 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-18.c	2022-01-15 12:08:13.633552013 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run { target large_long_double } } */
> +/* { dg-options "-O2 -ffast-math -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double long double
> +#include "pr103973-5.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-19.c.jj	2022-01-15 12:04:31.235719206 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-19.c	2022-01-15 12:08:18.792478544 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run { target large_long_double } } */
> +/* { dg-options "-O2 -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double long double
> +#include "pr103973-9.c"
> --- gcc/testsuite/gcc.target/i386/pr103973-20.c.jj	2022-01-15 12:04:34.648670603 +0100
> +++ gcc/testsuite/gcc.target/i386/pr103973-20.c	2022-01-15 12:08:26.220372764 +0100
> @@ -0,0 +1,8 @@
> +/* PR target/103973 */
> +/* { dg-do run { target large_long_double } } */
> +/* { dg-options "-O2 -ffast-math -save-temps" } */
> +/* { dg-final { scan-assembler-not "'\tfucom" } } */
> +/* { dg-final { scan-assembler-times "\tfcom" 4 } } */
> +
> +#define double long double
> +#include "pr103973-13.c"
> --- gcc/testsuite/g++.target/i386/pr103973-1.C.jj	2022-01-15 09:51:25.443467786 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-1.C	2022-01-15 09:51:25.443467786 +0100
> @@ -0,0 +1,71 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomisd" 2 { target { ! ia32 } } } }
> +
> +#include <compare>
> +
> +#ifndef double_type
> +#define double_type double
> +#endif
> +
> +__attribute__((noipa)) auto
> +foo (double_type a, double_type b)
> +{
> +  return a <=> b;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double_type a, double_type b)
> +{
> +  auto c = foo (a, b);
> +  if (c == std::partial_ordering::less)
> +    return -1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  if (c == std::partial_ordering::greater)
> +    return 1;
> +  return 2;
> +}
> +
> +__attribute__((noipa)) auto
> +baz (double_type a)
> +{
> +  return a <=> 0.0f;
> +}
> +
> +__attribute__((noipa)) int
> +qux (double_type a)
> +{
> +  auto c = baz (a);
> +  if (c == std::partial_ordering::less)
> +    return -1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  if (c == std::partial_ordering::greater)
> +    return 1;
> +  return 2;
> +}
> +
> +int
> +main ()
> +{
> +  double_type m5 = -5.0;
> +  double_type p5 = 5.0;
> +  volatile double_type p0 = 0.0;
> +  double_type nan = p0 / p0;
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (bar (m5, nan) != 2 || bar (nan, p5) != 2)
> +    __builtin_abort ();
> +  if (bar (nan, nan) != 2)
> +    __builtin_abort ();
> +  if (qux (p0) != 0 || qux (nan) != 2)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/g++.target/i386/pr103973-2.C.jj	2022-01-15 09:51:25.443467786 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-2.C	2022-01-15 12:00:42.392978175 +0100
> @@ -0,0 +1,7 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#include "pr103973-1.C"
> --- gcc/testsuite/g++.target/i386/pr103973-3.C.jj	2022-01-15 09:51:25.443467786 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-3.C	2022-01-15 09:51:25.443467786 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -save-temps -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomiss" 2 { target { ! ia32 } } } }
> +
> +#define double_type float
> +#include "pr103973-1.C"
> --- gcc/testsuite/g++.target/i386/pr103973-4.C.jj	2022-01-15 09:51:25.443467786 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-4.C	2022-01-15 12:00:42.392978175 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type float
> +#include "pr103973-1.C"
> --- gcc/testsuite/g++.target/i386/pr103973-5.C.jj	2022-01-15 11:07:17.398678932 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-5.C	2022-01-15 11:07:48.314239313 +0100
> @@ -0,0 +1,66 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -ffast-math -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomisd" 2 { target { ! ia32 } } } }
> +
> +#include <compare>
> +
> +#ifndef double_type
> +#define double_type double
> +#endif
> +
> +__attribute__((noipa)) auto
> +foo (double_type a, double_type b)
> +{
> +  return a <=> b;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double_type a, double_type b)
> +{
> +  auto c = foo (a, b);
> +  if (c == std::partial_ordering::less)
> +    return -1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  if (c == std::partial_ordering::greater)
> +    return 1;
> +  return 2;
> +}
> +
> +__attribute__((noipa)) auto
> +baz (double_type a)
> +{
> +  return a <=> 0.0f;
> +}
> +
> +__attribute__((noipa)) int
> +qux (double_type a)
> +{
> +  auto c = baz (a);
> +  if (c == std::partial_ordering::less)
> +    return -1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  if (c == std::partial_ordering::greater)
> +    return 1;
> +  return 2;
> +}
> +
> +int
> +main ()
> +{
> +  double_type m5 = -5.0;
> +  double_type p5 = 5.0;
> +  double_type p0 = 0.0;
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (qux (p0) != 0)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/g++.target/i386/pr103973-6.C.jj	2022-01-15 11:08:07.181971016 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-6.C	2022-01-15 12:00:42.392978175 +0100
> @@ -0,0 +1,7 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#include "pr103973-5.C"
> --- gcc/testsuite/g++.target/i386/pr103973-7.C.jj	2022-01-15 11:08:10.054930163 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-7.C	2022-01-15 11:08:39.354513526 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -ffast-math -save-temps -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomiss" 2 { target { ! ia32 } } } }
> +
> +#define double_type float
> +#include "pr103973-5.C"
> --- gcc/testsuite/g++.target/i386/pr103973-8.C.jj	2022-01-15 11:08:13.064887361 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-8.C	2022-01-15 12:00:42.392978175 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type float
> +#include "pr103973-5.C"
> --- gcc/testsuite/g++.target/i386/pr103973-9.C.jj	2022-01-15 11:46:15.455333909 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-9.C	2022-01-15 11:47:00.152696626 +0100
> @@ -0,0 +1,67 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomisd" 2 { target { ! ia32 } } } }
> +
> +#include <compare>
> +
> +#ifndef double_type
> +#define double_type double
> +#endif
> +
> +__attribute__((noipa)) auto
> +foo (double_type a, double_type b)
> +{
> +  return a <=> b;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double_type a, double_type b)
> +{
> +  auto c = foo (a, b);
> +  if (c == std::partial_ordering::less)
> +    return -1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  return 1;
> +}
> +
> +__attribute__((noipa)) auto
> +baz (double_type a)
> +{
> +  return a <=> 0.0f;
> +}
> +
> +__attribute__((noipa)) int
> +qux (double_type a)
> +{
> +  auto c = baz (a);
> +  if (c == std::partial_ordering::less)
> +    return -1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  return 1;
> +}
> +
> +int
> +main ()
> +{
> +  double_type m5 = -5.0;
> +  double_type p5 = 5.0;
> +  volatile double_type p0 = 0.0;
> +  double_type nan = p0 / p0;
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (bar (m5, nan) != 1 || bar (nan, p5) != 1)
> +    __builtin_abort ();
> +  if (bar (nan, nan) != 1)
> +    __builtin_abort ();
> +  if (qux (p0) != 0 || qux (nan) != 1)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/g++.target/i386/pr103973-10.C.jj	2022-01-15 11:48:31.928388111 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-10.C	2022-01-15 12:00:42.393978161 +0100
> @@ -0,0 +1,7 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#include "pr103973-9.C"
> --- gcc/testsuite/g++.target/i386/pr103973-11.C.jj	2022-01-15 11:48:31.929388096 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-11.C	2022-01-15 11:48:46.756176703 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -save-temps -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomiss" 2 { target { ! ia32 } } } }
> +
> +#define double_type float
> +#include "pr103973-9.C"
> --- gcc/testsuite/g++.target/i386/pr103973-12.C.jj	2022-01-15 11:48:31.931388068 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-12.C	2022-01-15 12:00:42.393978161 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type float
> +#include "pr103973-9.C"
> --- gcc/testsuite/g++.target/i386/pr103973-13.C.jj	2022-01-15 11:48:31.932388054 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-13.C	2022-01-15 11:48:13.484651079 +0100
> @@ -0,0 +1,62 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -ffast-math -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tucomisd" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomisd" 2 { target { ! ia32 } } } }
> +
> +#include <compare>
> +
> +#ifndef double_type
> +#define double_type double
> +#endif
> +
> +__attribute__((noipa)) auto
> +foo (double_type a, double_type b)
> +{
> +  return a <=> b;
> +}
> +
> +__attribute__((noipa)) int
> +bar (double_type a, double_type b)
> +{
> +  auto c = foo (a, b);
> +  if (c == std::partial_ordering::less)
> +    return -1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  return 1;
> +}
> +
> +__attribute__((noipa)) auto
> +baz (double_type a)
> +{
> +  return a <=> 0.0f;
> +}
> +
> +__attribute__((noipa)) int
> +qux (double_type a)
> +{
> +  auto c = baz (a);
> +  if (c == std::partial_ordering::greater)
> +    return 1;
> +  if (c == std::partial_ordering::equivalent)
> +    return 0;
> +  return -1;
> +}
> +
> +int
> +main ()
> +{
> +  double_type m5 = -5.0;
> +  double_type p5 = 5.0;
> +  double_type p0 = 0.0;
> +  if (bar (p5, p5) != 0 || bar (m5, m5) != 0)
> +    __builtin_abort ();
> +  if (bar (m5, p5) != -1 || bar (p5, m5) != 1)
> +    __builtin_abort ();
> +  if (qux (p0) != 0)
> +    __builtin_abort ();
> +  if (qux (m5) != -1 || qux (p5) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/g++.target/i386/pr103973-14.C.jj	2022-01-15 11:48:31.933388039 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-14.C	2022-01-15 12:00:42.393978161 +0100
> @@ -0,0 +1,7 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#include "pr103973-13.C"
> --- gcc/testsuite/g++.target/i386/pr103973-15.C.jj	2022-01-15 11:48:31.934388025 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-15.C	2022-01-15 11:49:07.262884325 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run }
> +// { dg-options "-O2 -ffast-math -save-temps -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tucomiss" { target { ! ia32 } } } }
> +// { dg-final { scan-assembler-times "\tcomiss" 2 { target { ! ia32 } } } }
> +
> +#define double_type float
> +#include "pr103973-13.C"
> --- gcc/testsuite/g++.target/i386/pr103973-16.C.jj	2022-01-15 11:48:31.935388011 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-16.C	2022-01-15 12:00:42.393978161 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do compile { target ia32 } }
> +// { dg-options "-O2 -ffast-math -march=i686 -mfpmath=387 -std=c++20" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type float
> +#include "pr103973-13.C"
> --- gcc/testsuite/g++.target/i386/pr103973-17.C.jj	2022-01-15 12:09:38.499343432 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-17.C	2022-01-15 12:08:54.276973207 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run { target large_long_double } }
> +// { dg-options "-O2 -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type long double
> +#include "pr103973-1.C"
> --- gcc/testsuite/g++.target/i386/pr103973-18.C.jj	2022-01-15 12:09:41.472301093 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-18.C	2022-01-15 12:09:15.681668382 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run { target large_long_double } }
> +// { dg-options "-O2 -ffast-math -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type long double
> +#include "pr103973-5.C"
> --- gcc/testsuite/g++.target/i386/pr103973-19.C.jj	2022-01-15 12:09:43.544271589 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-19.C	2022-01-15 12:09:22.726568054 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run { target large_long_double } }
> +// { dg-options "-O2 -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type long double
> +#include "pr103973-9.C"
> --- gcc/testsuite/g++.target/i386/pr103973-20.C.jj	2022-01-15 12:09:46.301232323 +0100
> +++ gcc/testsuite/g++.target/i386/pr103973-20.C	2022-01-15 12:09:33.491414751 +0100
> @@ -0,0 +1,8 @@
> +// PR target/103973
> +// { dg-do run { target large_long_double } }
> +// { dg-options "-O2 -ffast-math -std=c++20 -save-temps" }
> +// { dg-final { scan-assembler-not "'\tfucom" } }
> +// { dg-final { scan-assembler-times "\tfcom" 2 } }
> +
> +#define double_type long double
> +#include "pr103973-13.C"
> 
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


* Re: [PATCH] widening_mul, i386, v2: Improve spaceship expansion on x86 [PR103973]
  2022-01-17 12:04         ` Richard Biener
@ 2022-01-17 12:36           ` Jakub Jelinek
  0 siblings, 0 replies; 8+ messages in thread
From: Jakub Jelinek @ 2022-01-17 12:36 UTC (permalink / raw)
  To: Richard Biener; +Cc: Uros Bizjak, gcc-patches

On Mon, Jan 17, 2022 at 01:04:40PM +0100, Richard Biener wrote:
> > I guess it depends, for code that can only be called during the expand pass
> > dropping it should be just fine, for code that can be called also (or only)
> > later I think adding JUMP_LABEL and correct LABEL_NUSES is needed because
> > nothing will fix it up afterwards.
> 
> I'm noting that
> 
> +  /* BB must have no executable statements.  */
> +  gimple_stmt_iterator gsi = gsi_after_labels (bb);
> +  if (phi_nodes (bb))
> +    return false;
> 
> disallows blocks with just a virtual PHI which wouldn't be
> "executable".  Not sure if anything will break when we fix that.

Note I'm only moving the existing function from phiopt to tree-cfg.c
so that I can use it from tree-ssa-math-opts.c.  But all the
cond_only_block_p uses in phiopt and now in tree-ssa-math-opts.c too
only call it on single_pred_p (bb) basic blocks, so I don't see
what the virtual PHI on those would be good for.

> For code generation we rely on RTL opts to merge compare/scc
> and the subsequent branches on -1/0/1/[-2], correct?  I wonder
> whether that works on other targets as well or whether a
> asm-goto with "optab" UNSPEC text would be more forward looking?

Yes, we rely on some RTL opts, like we rely on it for e.g. the overflow
builtins or various other cases and they seem to be doing their job
well in my testing.  Initially I thought the optab would have 6 arguments,
2 comparison args and 4 labels and I'd emit a switch in the
tree-ssa-math-opts.c (I even wrote such code).  But it didn't work really
well, the switch in some cases wasn't really optimized, and optimization
passes after the widening_mul liked e.g. to propagate the .SPACESHIP
lhs into some but not all the PHI args if there were any etc.
Emitting a function that returns -1/0/1/2 worked better, especially if
the target attempts to emit it as a series of conditional jumps
with small bbs that just set those values.  RTL opts later on will
merge the jumps with further jumps that test the .SPACESHIP result,
or will turn some of the conditional jumps into scc etc.

> The restriction to scalar floats is probably because with
> scalar integers we're doing fine and with vectors we'd need some
> very much different tricks, right?

Sure, for vectors we couldn't use branches etc.
I'm not really sure how one would write a vector version of the
spaceship actually.  The primary use case is C++ with <=>, but <=>
returns std::*_ordering which is an aggregate and one that isn't
very easy to turn into an integer even, switch doesn't work,
only if (... == std::partial_ordering::equivalent) ... else if (...
(unless I'm missing something).
But even in C, maybe:
typedef float V __attribute__((vector_size (16)));
typedef int W __attribute__((vector_size (16)));

W
foo (V x, V y)
{
  return (x != y) & (((x < y) & (W) { -1, -1, -1, -1 })
                     | ((x > y) & (W) { 1, 1, 1, 1 })
                     | ((W) { 2, 2, 2, 2 } & ~(x < y) & ~(x > y)));
}

but it isn't clear how I'd optimize it at the assembly level, where
we currently emit:
        vbroadcastss    .LC2(%rip), %xmm2
        vcmpltps        %xmm0, %xmm1, %xmm3
        vbroadcastss    .LC3(%rip), %xmm4
        vpand   %xmm3, %xmm2, %xmm2
        vcmpltps        %xmm1, %xmm0, %xmm3
        vpor    %xmm3, %xmm2, %xmm2
        vcmpneq_oqps    %xmm1, %xmm0, %xmm3
        vcmpneqps       %xmm1, %xmm0, %xmm0
        vpandn  %xmm4, %xmm3, %xmm3
        vpor    %xmm3, %xmm2, %xmm2
        vpand   %xmm0, %xmm2, %xmm0
        ret
.LC2:
        .long   1
        .align 4
.LC3:
        .long   2

> The middle-end changes look OK, I don't see anything that
> couldn't be changed if other targets run into problems with
> getting similar optimized code.

Thanks.

	Jakub



end of thread, other threads:[~2022-01-17 12:36 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-14 22:56 [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973] Jakub Jelinek
2022-01-15  8:29 ` Uros Bizjak
2022-01-15  9:56   ` Jakub Jelinek
2022-01-15 10:42     ` Uros Bizjak
2022-01-15 11:22       ` [PATCH] widening_mul, i386, v2: " Jakub Jelinek
2022-01-15 16:40         ` Uros Bizjak
2022-01-17 12:04         ` Richard Biener
2022-01-17 12:36           ` Jakub Jelinek
