[PATCH v7] Match: Support more form for scalar unsigned SAT

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD
@ 2024-06-06 13:37 pan2.li
  2024-06-06 14:04 ` Richard Biener
  0 siblings, 1 reply; 9+ messages in thread
From: pan2.li @ 2024-06-06 13:37 UTC (permalink / raw)
  To: gcc-patches
  Cc: juzhe.zhong, kito.cheng, tamar.christina, richard.guenther, Pan Li

From: Pan Li <pan2.li@intel.com>

After we support one gassign form of the unsigned .SAT_ADD,  we
would like to support more forms including both the branch and
branchless.  There are 5 other forms of .SAT_ADD,  list as below:

Form 1:
  #define SAT_ADD_U_1(T) \
  T sat_add_u_1_##T(T x, T y) \
  { \
    return (T)(x + y) >= x ? (x + y) : -1; \
  }

Form 2:
  #define SAT_ADD_U_2(T) \
  T sat_add_u_2_##T(T x, T y) \
  { \
    T ret; \
    T overflow = __builtin_add_overflow (x, y, &ret); \
    return (T)(-overflow) | ret; \
  }

Form 3:
  #define SAT_ADD_U_3(T) \
  T sat_add_u_3_##T (T x, T y) \
  { \
    T ret; \
    return __builtin_add_overflow (x, y, &ret) ? -1 : ret; \
  }

Form 4:
  #define SAT_ADD_U_4(T) \
  T sat_add_u_4_##T (T x, T y) \
  { \
    T ret; \
    return __builtin_add_overflow (x, y, &ret) == 0 ? ret : -1; \
  }

Form 5:
  #define SAT_ADD_U_5(T) \
  T sat_add_u_5_##T(T x, T y) \
  { \
    return (T)(x + y) < x ? -1 : (x + y); \
  }

Take the forms 3 of above as example:

uint64_t
sat_add (uint64_t x, uint64_t y)
{
  uint64_t ret;
  return __builtin_add_overflow (x, y, &ret) ? -1 : ret;
}

Before this patch:
uint64_t sat_add (uint64_t x, uint64_t y)
{
  long unsigned int _1;
  long unsigned int _2;
  uint64_t _3;
  __complex__ long unsigned int _6;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _6 = .ADD_OVERFLOW (x_4(D), y_5(D));
  _2 = IMAGPART_EXPR <_6>;
  if (_2 != 0)
    goto <bb 4>; [35.00%]
  else
    goto <bb 3>; [65.00%]
;;    succ:       4
;;                3

;;   basic block 3, loop depth 0
;;    pred:       2
  _1 = REALPART_EXPR <_6>;
;;    succ:       4

;;   basic block 4, loop depth 0
;;    pred:       3
;;                2
  # _3 = PHI <_1(3), 18446744073709551615(2)>
  return _3;
;;    succ:       EXIT
}

After this patch:
uint64_t sat_add (uint64_t x, uint64_t y)
{
  long unsigned int _12;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _12 = .SAT_ADD (x_4(D), y_5(D)); [tail call]
  return _12;
;;    succ:       EXIT
}

The flag '^' acts on cond_expr will generate matching code similar as below:

else if (gphi *_a1 = dyn_cast <gphi *> (_d1))
  {
    basic_block _b1 = gimple_bb (_a1);
    if (gimple_phi_num_args (_a1) == 2)
      {
        basic_block _pb_0_1 = EDGE_PRED (_b1, 0)->src;
        basic_block _pb_1_1 = EDGE_PRED (_b1, 1)->src;
        basic_block _db_1 = safe_dyn_cast <gcond *> (*gsi_last_bb (_pb_0_1))
                            ? _pb_0_1 : _pb_1_1;
        basic_block _other_db_1 = safe_dyn_cast <gcond *> (*gsi_last_bb (_pb_0_1))
                                  ? _pb_1_1 : _pb_0_1;
        gcond *_ct_1 = safe_dyn_cast <gcond *> (*gsi_last_bb (_db_1));
        if (_ct_1 && EDGE_COUNT (_other_db_1->preds) == 1
          && EDGE_COUNT (_other_db_1->succs) == 1
          && EDGE_PRED (_other_db_1, 0)->src == _db_1)
          {
            tree _cond_lhs_1 = gimple_cond_lhs (_ct_1);
            tree _cond_rhs_1 = gimple_cond_rhs (_ct_1);
            tree _p0 = build2 (gimple_cond_code (_ct_1), boolean_type_node,
                               _cond_lhs_1, _cond_rhs_1);
            bool _arg_0_is_true_1 = gimple_phi_arg_edge (_a1, 0)->flags & EDGE_TRUE_VALUE;
            tree _p1 = gimple_phi_arg_def (_a1, _arg_0_is_true_1 ? 0 : 1);
            tree _p2 = gimple_phi_arg_def (_a1, _arg_0_is_true_1 ? 1 : 0);
            ....

The below test suites are passed for this patch.
* The x86 bootstrap test.
* The x86 fully regression test.
* The riscv fully regression test.

gcc/ChangeLog:

	* doc/match-and-simplify.texi: Add doc for the matching flag '^'.
	* genmatch.cc (cmp_operand): Add match_phi comparation.
	(dt_node::gen_kids_1): Add cond_expr bool flag for phi match.
	(dt_operand::gen_phi_on_cond): Add new func to gen phi matching
	on cond_expr.
	(parser::parse_expr): Add handling for the expr flag '^'.
	* match.pd: Add more form for unsigned .SAT_ADD.
	* tree-ssa-math-opts.cc (build_saturation_binary_arith_call): Add
	new func impl to build call for phi gimple.
	(match_unsigned_saturation_add): Add new func impl to match the
	.SAT_ADD for phi gimple.
	(math_opts_dom_walker::after_dom_children): Add phi matching
	try for all gimple phi stmt.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/doc/match-and-simplify.texi |  16 ++++
 gcc/genmatch.cc                 | 126 +++++++++++++++++++++++++++++++-
 gcc/match.pd                    |  43 ++++++++++-
 gcc/tree-ssa-math-opts.cc       |  56 +++++++++++++-
 4 files changed, 236 insertions(+), 5 deletions(-)

diff --git a/gcc/doc/match-and-simplify.texi b/gcc/doc/match-and-simplify.texi
index 01f19e2f62c..63d5af159f5 100644
--- a/gcc/doc/match-and-simplify.texi
+++ b/gcc/doc/match-and-simplify.texi
@@ -361,6 +361,22 @@ Usually the types of the generated result expressions are
 determined from the context, but sometimes like in the above case
 it is required that you specify them explicitly.
 
+Another modifier for generated expressions is @code{^} which
+tells the machinery to try more matches for some special cases.
+For example, normally the @code{cond} only allows the gimple
+assign when matching.  It will also try to match the gimple @code{PHI}
+besides gimple assign if appending the @code{^} to the @code{cond}.
+Aka @code{cond^}.  Consider below example
+
+@smallexample
+(match (unsigned_sat_add @@0 @@1)
+ (cond^ (ge (plus:c@@2 @@0 @@1) @@0) @@2 integer_minus_onep))
+@end smallexample
+
+The above matching will generate the predicate function named
+@code{gimple_unsigned_sat_add} that accepts both the gimple
+assign and gimple @code{PHI}.
+
 Another modifier for generated expressions is @code{!} which
 tells the machinery to only consider the simplification in case
 the marked expression simplified to a simple operand.  Consider
diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc
index f1e0e7abe0c..a56bd90cb2c 100644
--- a/gcc/genmatch.cc
+++ b/gcc/genmatch.cc
@@ -848,12 +848,13 @@ public:
     : operand (OP_EXPR, loc), operation (operation_),
       ops (vNULL), expr_type (NULL), is_commutative (is_commutative_),
       is_generic (false), force_single_use (false), force_leaf (false),
-      opt_grp (0) {}
+      match_phi (false), opt_grp (0) {}
   expr (expr *e)
     : operand (OP_EXPR, e->location), operation (e->operation),
       ops (vNULL), expr_type (e->expr_type), is_commutative (e->is_commutative),
       is_generic (e->is_generic), force_single_use (e->force_single_use),
-      force_leaf (e->force_leaf), opt_grp (e->opt_grp) {}
+      force_leaf (e->force_leaf), match_phi (e->match_phi),
+      opt_grp (e->opt_grp) {}
   void append_op (operand *op) { ops.safe_push (op); }
   /* The operator and its operands.  */
   id_base *operation;
@@ -871,6 +872,8 @@ public:
   /* Whether in the result expression this should be a leaf node
      with any children simplified down to simple operands.  */
   bool force_leaf;
+  /* Whether the COND_EXPR is matching PHI gimple,  default false.  */
+  bool match_phi;
   /* If non-zero, the group for optional handling.  */
   unsigned char opt_grp;
   void gen_transform (FILE *f, int, const char *, bool, int,
@@ -1819,6 +1822,7 @@ public:
 
   char *get_name (char *);
   void gen_opname (char *, unsigned);
+  void gen_phi_on_cond (FILE *, int, int);
 };
 
 /* Leaf node of the decision tree, used for DT_SIMPLIFY.  */
@@ -1898,7 +1902,8 @@ cmp_operand (operand *o1, operand *o2)
       expr *e1 = static_cast<expr *>(o1);
       expr *e2 = static_cast<expr *>(o2);
       if (e1->operation != e2->operation
-	  || e1->is_generic != e2->is_generic)
+	  || e1->is_generic != e2->is_generic
+	  || e1->match_phi != e2->match_phi)
 	return false;
       if (e1->operation->kind == id_base::FN
 	  /* For function calls also compare number of arguments.  */
@@ -3251,6 +3256,7 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,
 			  depth);
 	  indent += 4;
 	  fprintf_indent (f, indent, "{\n");
+	  bool cond_expr_p = false;
 	  for (unsigned i = 0; i < exprs_len; ++i)
 	    {
 	      expr *e = as_a <expr *> (gimple_exprs[i]->op);
@@ -3262,6 +3268,7 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,
 	      else
 		{
 		  id_base *op = e->operation;
+		  cond_expr_p |= (*op == COND_EXPR && e->match_phi);
 		  if (*op == CONVERT_EXPR || *op == NOP_EXPR)
 		    fprintf_indent (f, indent, "CASE_CONVERT:\n");
 		  else
@@ -3275,6 +3282,27 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,
 	  fprintf_indent (f, indent, "default:;\n");
 	  fprintf_indent (f, indent, "}\n");
 	  indent -= 4;
+
+	  if (cond_expr_p)
+	    {
+	      fprintf_indent (f, indent,
+		"else if (gphi *_a%d = dyn_cast <gphi *> (_d%d))\n",
+		depth, depth);
+	      indent += 2;
+	      fprintf_indent (f, indent, "{\n");
+	      indent += 2;
+
+	      for (unsigned i = 0; i < exprs_len; i++)
+		{
+		  expr *e = as_a <expr *> (gimple_exprs[i]->op);
+		  if (*e->operation == COND_EXPR && e->match_phi)
+		    gimple_exprs[i]->gen_phi_on_cond (f, indent, depth);
+		}
+
+	      indent -= 2;
+	      fprintf_indent (f, indent, "}\n");
+	      indent -= 2;
+	    }
 	}
 
       if (fns_len)
@@ -3483,6 +3511,86 @@ dt_operand::gen (FILE *f, int indent, bool gimple, int depth)
     }
 }
 
+/* Generate matching code for the phi when meet COND_EXPR.  */
+
+void
+dt_operand::gen_phi_on_cond (FILE *f, int indent, int depth)
+{
+  fprintf_indent (f, indent,
+    "basic_block _b%d = gimple_bb (_a%d);\n", depth, depth);
+
+  fprintf_indent (f, indent, "if (gimple_phi_num_args (_a%d) == 2)\n", depth);
+
+  indent += 2;
+  fprintf_indent (f, indent, "{\n");
+  indent += 2;
+
+  fprintf_indent (f, indent,
+    "basic_block _pb_0_%d = EDGE_PRED (_b%d, 0)->src;\n", depth, depth);
+  fprintf_indent (f, indent,
+    "basic_block _pb_1_%d = EDGE_PRED (_b%d, 1)->src;\n", depth, depth);
+  fprintf_indent (f, indent,
+    "basic_block _db_%d = safe_dyn_cast <gcond *> (*gsi_last_bb (_pb_0_%d)) ? "
+    "_pb_0_%d : _pb_1_%d;\n", depth, depth, depth, depth);
+  fprintf_indent (f, indent,
+    "basic_block _other_db_%d = safe_dyn_cast <gcond *> "
+    "(*gsi_last_bb (_pb_0_%d)) ? _pb_1_%d : _pb_0_%d;\n",
+    depth, depth, depth, depth);
+
+  fprintf_indent (f, indent,
+    "gcond *_ct_%d = safe_dyn_cast <gcond *> (*gsi_last_bb (_db_%d));\n",
+    depth, depth);
+  fprintf_indent (f, indent, "if (_ct_%d"
+    " && EDGE_COUNT (_other_db_%d->preds) == 1\n", depth, depth);
+  fprintf_indent (f, indent,
+    "  && EDGE_COUNT (_other_db_%d->succs) == 1\n", depth);
+  fprintf_indent (f, indent,
+    "  && EDGE_PRED (_other_db_%d, 0)->src == _db_%d)\n", depth, depth);
+
+  indent += 2;
+  fprintf_indent (f, indent, "{\n");
+  indent += 2;
+
+  fprintf_indent (f, indent,
+    "tree _cond_lhs_%d = gimple_cond_lhs (_ct_%d);\n", depth, depth);
+  fprintf_indent (f, indent,
+    "tree _cond_rhs_%d = gimple_cond_rhs (_ct_%d);\n", depth, depth);
+
+  char opname_0[20];
+  char opname_1[20];
+  char opname_2[20];
+  gen_opname (opname_0, 0);
+
+  fprintf_indent (f, indent,
+    "tree %s = build2 (gimple_cond_code (_ct_%d), "
+    "boolean_type_node, _cond_lhs_%d, _cond_rhs_%d);\n",
+    opname_0, depth, depth, depth);
+
+  fprintf_indent (f, indent,
+    "bool _arg_0_is_true_%d = gimple_phi_arg_edge (_a%d, 0)->flags"
+    " & EDGE_TRUE_VALUE;\n", depth, depth);
+
+  gen_opname (opname_1, 1);
+  fprintf_indent (f, indent,
+    "tree %s = gimple_phi_arg_def (_a%d, _arg_0_is_true_%d ? 0 : 1);\n",
+    opname_1, depth, depth);
+
+  gen_opname (opname_2, 2);
+  fprintf_indent (f, indent,
+    "tree %s = gimple_phi_arg_def (_a%d, _arg_0_is_true_%d ? 1 : 0);\n",
+    opname_2, depth, depth);
+
+  gen_kids (f, indent, true, depth);
+
+  indent -= 2;
+  fprintf_indent (f, indent, "}\n");
+  indent -= 2;
+
+  indent -= 2;
+  fprintf_indent (f, indent, "}\n");
+  indent -= 2;
+}
+
 /* Emit a logging call to the debug file to the file F, with the INDENT from
    either the RESULT location or the S's match location if RESULT is null. */
 static void
@@ -4600,6 +4708,18 @@ parser::parse_expr ()
       e->force_leaf = true;
     }
 
+  if (token->type == CPP_XOR && !(token->flags & PREV_WHITE))
+    {
+      if (!parsing_match_operand)
+	fatal_at (token, "modifier '^' is only valid in a match expression");
+
+      if (!(*e->operation == COND_EXPR))
+	fatal_at (token, "modifier '^' can only act on operation COND_EXPR");
+
+      eat_token (CPP_XOR);
+      e->match_phi = true;
+    }
+
   if (token->type == CPP_COLON
       && !(token->flags & PREV_WHITE))
     {
diff --git a/gcc/match.pd b/gcc/match.pd
index 7c1ad428a3c..5f36c1fac82 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3056,31 +3056,48 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
       && wi::eq_p (wi::to_wide (int_cst), wi::max_value (itype))))))
 
 /* Unsigned Saturation Add */
+/* SAT_ADD = usadd_left_part_1 | usadd_right_part_1, aka:
+   SAT_ADD = (X + Y) | -((X + Y) < X)  */
 (match (usadd_left_part_1 @0 @1)
  (plus:c @0 @1)
  (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
       && types_match (type, @0, @1))))
 
+/* SAT_ADD = usadd_left_part_2 | usadd_right_part_2, aka:
+   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | (IMAGPART_EXPR <.ADD_OVERFLOW> != 0) */
 (match (usadd_left_part_2 @0 @1)
  (realpart (IFN_ADD_OVERFLOW:c @0 @1))
  (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
       && types_match (type, @0, @1))))
 
+/* SAT_ADD = usadd_left_part_1 | usadd_right_part_1, aka:
+   SAT_ADD = (X + Y) | -((type)(X + Y) < X)  */
 (match (usadd_right_part_1 @0 @1)
  (negate (convert (lt (plus:c @0 @1) @0)))
  (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
       && types_match (type, @0, @1))))
 
+/* SAT_ADD = usadd_left_part_1 | usadd_right_part_1, aka:
+   SAT_ADD = (X + Y) | -(X > (X + Y))  */
 (match (usadd_right_part_1 @0 @1)
  (negate (convert (gt @0 (plus:c @0 @1))))
  (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
       && types_match (type, @0, @1))))
 
+/* SAT_ADD = usadd_left_part_2 | usadd_right_part_2, aka:
+   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | (IMAGPART_EXPR <.ADD_OVERFLOW> != 0) */
 (match (usadd_right_part_2 @0 @1)
  (negate (convert (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)))
  (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
       && types_match (type, @0, @1))))
 
+/* SAT_ADD = usadd_left_part_2 | usadd_right_part_2, aka:
+   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | -IMAGPART_EXPR <.ADD_OVERFLOW> */
+(match (usadd_right_part_2 @0 @1)
+ (negate (imagpart (IFN_ADD_OVERFLOW:c @0 @1)))
+ (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
+      && types_match (type, @0, @1))))
+
 /* We cannot merge or overload usadd_left_part_1 and usadd_left_part_2
    because the sub part of left_part_2 cannot work with right_part_1.
    For example, left_part_2 pattern focus one .ADD_OVERFLOW but the
@@ -3092,10 +3109,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (match (unsigned_integer_sat_add @0 @1)
  (bit_ior:c (usadd_left_part_1 @0 @1) (usadd_right_part_1 @0 @1)))
 
-/* Unsigned saturation add, case 2 (branchless with .ADD_OVERFLOW).  */
+/* Unsigned saturation add, case 2 (branchless with .ADD_OVERFLOW):
+   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | -IMAGPART_EXPR <.ADD_OVERFLOW> or
+   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | (IMAGPART_EXPR <.ADD_OVERFLOW> != 0) */
 (match (unsigned_integer_sat_add @0 @1)
  (bit_ior:c (usadd_left_part_2 @0 @1) (usadd_right_part_2 @0 @1)))
 
+/* Unsigned saturation add, case 3 (branch with ge):
+   SAT_U_ADD = (X + Y) >= x ? (X + Y) : -1.  */
+(match (unsigned_integer_sat_add @0 @1)
+ (cond^ (ge (usadd_left_part_1@2 @0 @1) @0) @2 integer_minus_onep))
+
+/* Unsigned saturation add, case 4 (branch with lt):
+   SAT_U_ADD = (X + Y) < x ? -1 : (X + Y).  */
+(match (unsigned_integer_sat_add @0 @1)
+ (cond^ (lt (usadd_left_part_1@2 @0 @1) @0) integer_minus_onep @2))
+
+/* Unsigned saturation add, case 5 (branch with eq .ADD_OVERFLOW):
+   SAT_U_ADD = REALPART_EXPR <.ADD_OVERFLOW> == 0 ? .ADD_OVERFLOW : -1.  */
+(match (unsigned_integer_sat_add @0 @1)
+ (cond^ (eq (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)
+  (usadd_left_part_2 @0 @1) integer_minus_onep))
+
+/* Unsigned saturation add, case 6 (branch with ne .ADD_OVERFLOW):
+   SAT_U_ADD = REALPART_EXPR <.ADD_OVERFLOW> != 0 ? -1 : .ADD_OVERFLOW.  */
+(match (unsigned_integer_sat_add @0 @1)
+ (cond^ (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)
+  integer_minus_onep (usadd_left_part_2 @0 @1)))
+
 /* Unsigned saturation sub, case 1 (branch with gt):
    SAT_U_SUB = X > Y ? X - Y : 0  */
 (match (unsigned_integer_sat_sub @0 @1)
diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
index 469e34d828d..173b0366f5e 100644
--- a/gcc/tree-ssa-math-opts.cc
+++ b/gcc/tree-ssa-math-opts.cc
@@ -4101,8 +4101,24 @@ build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, internal_fn fn,
     }
 }
 
+static void
+build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, gphi *phi,
+				    internal_fn fn, tree lhs, tree op_0,
+				    tree op_1)
+{
+  if (direct_internal_fn_supported_p (fn, TREE_TYPE (lhs), OPTIMIZE_FOR_BOTH))
+    {
+      gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
+      gimple_call_set_lhs (call, lhs);
+      gsi_insert_before (gsi, call, GSI_SAME_STMT);
+
+      gimple_stmt_iterator psi = gsi_for_stmt (phi);
+      remove_phi_node (&psi, /* release_lhs_p */ false);
+    }
+}
+
 /*
- * Try to match saturation unsigned add.
+ * Try to match saturation unsigned add with assign.
  *   _7 = _4 + _6;
  *   _8 = _4 > _7;
  *   _9 = (long unsigned int) _8;
@@ -4121,6 +4137,37 @@ match_unsigned_saturation_add (gimple_stmt_iterator *gsi, gassign *stmt)
     build_saturation_binary_arith_call (gsi, IFN_SAT_ADD, lhs, ops[0], ops[1]);
 }
 
+/*
+ * Try to match saturation unsigned add with PHI.
+ *   <bb 2> :
+ *   _1 = x_3(D) + y_4(D);
+ *   if (_1 >= x_3(D))
+ *     goto <bb 3>; [INV]
+ *   else
+ *     goto <bb 4>; [INV]
+ *
+ *   <bb 3> :
+ *
+ *   <bb 4> :
+ *   # _2 = PHI <255(2), _1(3)>
+ *   =>
+ *   <bb 4> [local count: 1073741824]:
+ *   _2 = .SAT_ADD (x_4(D), y_5(D));  */
+
+static void
+match_unsigned_saturation_add (gimple_stmt_iterator *gsi, gphi *phi)
+{
+  if (gimple_phi_num_args (phi) != 2)
+    return;
+
+  tree ops[2];
+  tree phi_result = gimple_phi_result (phi);
+
+  if (gimple_unsigned_integer_sat_add (phi_result, ops, NULL))
+    build_saturation_binary_arith_call (gsi, phi, IFN_SAT_ADD, phi_result,
+					ops[0], ops[1]);
+}
+
 /*
  * Try to match saturation unsigned sub.
  *   _1 = _4 >= _5;
@@ -6052,6 +6099,13 @@ math_opts_dom_walker::after_dom_children (basic_block bb)
 
   fma_deferring_state fma_state (param_avoid_fma_max_bits > 0);
 
+  for (gphi_iterator psi = gsi_start_phis (bb); !gsi_end_p (psi);
+    gsi_next (&psi))
+    {
+      gimple_stmt_iterator gsi = gsi_last_bb (bb);
+      match_unsigned_saturation_add (&gsi, psi.phi ());
+    }
+
   for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi);)
     {
       gimple *stmt = gsi_stmt (gsi);
-- 
2.34.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD
  2024-06-06 13:37 [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD pan2.li
@ 2024-06-06 14:04 ` Richard Biener
  2024-06-06 22:59   ` Li, Pan2
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Biener @ 2024-06-06 14:04 UTC (permalink / raw)
  To: pan2.li; +Cc: gcc-patches, juzhe.zhong, kito.cheng, tamar.christina

On Thu, Jun 6, 2024 at 3:37 PM <pan2.li@intel.com> wrote:
>
> From: Pan Li <pan2.li@intel.com>
>
> After we support one gassign form of the unsigned .SAT_ADD,  we
> would like to support more forms including both the branch and
> branchless.  There are 5 other forms of .SAT_ADD,  list as below:
>
> Form 1:
>   #define SAT_ADD_U_1(T) \
>   T sat_add_u_1_##T(T x, T y) \
>   { \
>     return (T)(x + y) >= x ? (x + y) : -1; \
>   }
>
> Form 2:
>   #define SAT_ADD_U_2(T) \
>   T sat_add_u_2_##T(T x, T y) \
>   { \
>     T ret; \
>     T overflow = __builtin_add_overflow (x, y, &ret); \
>     return (T)(-overflow) | ret; \
>   }
>
> Form 3:
>   #define SAT_ADD_U_3(T) \
>   T sat_add_u_3_##T (T x, T y) \
>   { \
>     T ret; \
>     return __builtin_add_overflow (x, y, &ret) ? -1 : ret; \
>   }
>
> Form 4:
>   #define SAT_ADD_U_4(T) \
>   T sat_add_u_4_##T (T x, T y) \
>   { \
>     T ret; \
>     return __builtin_add_overflow (x, y, &ret) == 0 ? ret : -1; \
>   }
>
> Form 5:
>   #define SAT_ADD_U_5(T) \
>   T sat_add_u_5_##T(T x, T y) \
>   { \
>     return (T)(x + y) < x ? -1 : (x + y); \
>   }
>
> Take the forms 3 of above as example:
>
> uint64_t
> sat_add (uint64_t x, uint64_t y)
> {
>   uint64_t ret;
>   return __builtin_add_overflow (x, y, &ret) ? -1 : ret;
> }
>
> Before this patch:
> uint64_t sat_add (uint64_t x, uint64_t y)
> {
>   long unsigned int _1;
>   long unsigned int _2;
>   uint64_t _3;
>   __complex__ long unsigned int _6;
>
> ;;   basic block 2, loop depth 0
> ;;    pred:       ENTRY
>   _6 = .ADD_OVERFLOW (x_4(D), y_5(D));
>   _2 = IMAGPART_EXPR <_6>;
>   if (_2 != 0)
>     goto <bb 4>; [35.00%]
>   else
>     goto <bb 3>; [65.00%]
> ;;    succ:       4
> ;;                3
>
> ;;   basic block 3, loop depth 0
> ;;    pred:       2
>   _1 = REALPART_EXPR <_6>;
> ;;    succ:       4
>
> ;;   basic block 4, loop depth 0
> ;;    pred:       3
> ;;                2
>   # _3 = PHI <_1(3), 18446744073709551615(2)>
>   return _3;
> ;;    succ:       EXIT
> }
>
> After this patch:
> uint64_t sat_add (uint64_t x, uint64_t y)
> {
>   long unsigned int _12;
>
> ;;   basic block 2, loop depth 0
> ;;    pred:       ENTRY
>   _12 = .SAT_ADD (x_4(D), y_5(D)); [tail call]
>   return _12;
> ;;    succ:       EXIT
> }
>
> The flag '^' acts on cond_expr will generate matching code similar as below:
>
> else if (gphi *_a1 = dyn_cast <gphi *> (_d1))
>   {
>     basic_block _b1 = gimple_bb (_a1);
>     if (gimple_phi_num_args (_a1) == 2)
>       {
>         basic_block _pb_0_1 = EDGE_PRED (_b1, 0)->src;
>         basic_block _pb_1_1 = EDGE_PRED (_b1, 1)->src;
>         basic_block _db_1 = safe_dyn_cast <gcond *> (*gsi_last_bb (_pb_0_1))
>                             ? _pb_0_1 : _pb_1_1;
>         basic_block _other_db_1 = safe_dyn_cast <gcond *> (*gsi_last_bb (_pb_0_1))
>                                   ? _pb_1_1 : _pb_0_1;
>         gcond *_ct_1 = safe_dyn_cast <gcond *> (*gsi_last_bb (_db_1));
>         if (_ct_1 && EDGE_COUNT (_other_db_1->preds) == 1
>           && EDGE_COUNT (_other_db_1->succs) == 1
>           && EDGE_PRED (_other_db_1, 0)->src == _db_1)
>           {
>             tree _cond_lhs_1 = gimple_cond_lhs (_ct_1);
>             tree _cond_rhs_1 = gimple_cond_rhs (_ct_1);
>             tree _p0 = build2 (gimple_cond_code (_ct_1), boolean_type_node,
>                                _cond_lhs_1, _cond_rhs_1);
>             bool _arg_0_is_true_1 = gimple_phi_arg_edge (_a1, 0)->flags & EDGE_TRUE_VALUE;
>             tree _p1 = gimple_phi_arg_def (_a1, _arg_0_is_true_1 ? 0 : 1);
>             tree _p2 = gimple_phi_arg_def (_a1, _arg_0_is_true_1 ? 1 : 0);
>             ....
>
> The below test suites are passed for this patch.
> * The x86 bootstrap test.
> * The x86 fully regression test.
> * The riscv fully regression test.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
>
>         * doc/match-and-simplify.texi: Add doc for the matching flag '^'.
>         * genmatch.cc (cmp_operand): Add match_phi comparation.
>         (dt_node::gen_kids_1): Add cond_expr bool flag for phi match.
>         (dt_operand::gen_phi_on_cond): Add new func to gen phi matching
>         on cond_expr.
>         (parser::parse_expr): Add handling for the expr flag '^'.
>         * match.pd: Add more form for unsigned .SAT_ADD.
>         * tree-ssa-math-opts.cc (build_saturation_binary_arith_call): Add
>         new func impl to build call for phi gimple.
>         (match_unsigned_saturation_add): Add new func impl to match the
>         .SAT_ADD for phi gimple.
>         (math_opts_dom_walker::after_dom_children): Add phi matching
>         try for all gimple phi stmt.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
> ---
>  gcc/doc/match-and-simplify.texi |  16 ++++
>  gcc/genmatch.cc                 | 126 +++++++++++++++++++++++++++++++-
>  gcc/match.pd                    |  43 ++++++++++-
>  gcc/tree-ssa-math-opts.cc       |  56 +++++++++++++-
>  4 files changed, 236 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/doc/match-and-simplify.texi b/gcc/doc/match-and-simplify.texi
> index 01f19e2f62c..63d5af159f5 100644
> --- a/gcc/doc/match-and-simplify.texi
> +++ b/gcc/doc/match-and-simplify.texi
> @@ -361,6 +361,22 @@ Usually the types of the generated result expressions are
>  determined from the context, but sometimes like in the above case
>  it is required that you specify them explicitly.
>
> +Another modifier for generated expressions is @code{^} which
> +tells the machinery to try more matches for some special cases.
> +For example, normally the @code{cond} only allows the gimple
> +assign when matching.  It will also try to match the gimple @code{PHI}
> +besides gimple assign if appending the @code{^} to the @code{cond}.
> +Aka @code{cond^}.  Consider below example
> +
> +@smallexample
> +(match (unsigned_sat_add @@0 @@1)
> + (cond^ (ge (plus:c@@2 @@0 @@1) @@0) @@2 integer_minus_onep))
> +@end smallexample
> +
> +The above matching will generate the predicate function named
> +@code{gimple_unsigned_sat_add} that accepts both the gimple
> +assign and gimple @code{PHI}.
> +
>  Another modifier for generated expressions is @code{!} which
>  tells the machinery to only consider the simplification in case
>  the marked expression simplified to a simple operand.  Consider
> diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc
> index f1e0e7abe0c..a56bd90cb2c 100644
> --- a/gcc/genmatch.cc
> +++ b/gcc/genmatch.cc
> @@ -848,12 +848,13 @@ public:
>      : operand (OP_EXPR, loc), operation (operation_),
>        ops (vNULL), expr_type (NULL), is_commutative (is_commutative_),
>        is_generic (false), force_single_use (false), force_leaf (false),
> -      opt_grp (0) {}
> +      match_phi (false), opt_grp (0) {}
>    expr (expr *e)
>      : operand (OP_EXPR, e->location), operation (e->operation),
>        ops (vNULL), expr_type (e->expr_type), is_commutative (e->is_commutative),
>        is_generic (e->is_generic), force_single_use (e->force_single_use),
> -      force_leaf (e->force_leaf), opt_grp (e->opt_grp) {}
> +      force_leaf (e->force_leaf), match_phi (e->match_phi),
> +      opt_grp (e->opt_grp) {}
>    void append_op (operand *op) { ops.safe_push (op); }
>    /* The operator and its operands.  */
>    id_base *operation;
> @@ -871,6 +872,8 @@ public:
>    /* Whether in the result expression this should be a leaf node
>       with any children simplified down to simple operands.  */
>    bool force_leaf;
> +  /* Whether the COND_EXPR is matching PHI gimple,  default false.  */
> +  bool match_phi;
>    /* If non-zero, the group for optional handling.  */
>    unsigned char opt_grp;
>    void gen_transform (FILE *f, int, const char *, bool, int,
> @@ -1819,6 +1822,7 @@ public:
>
>    char *get_name (char *);
>    void gen_opname (char *, unsigned);
> +  void gen_phi_on_cond (FILE *, int, int);
>  };
>
>  /* Leaf node of the decision tree, used for DT_SIMPLIFY.  */
> @@ -1898,7 +1902,8 @@ cmp_operand (operand *o1, operand *o2)
>        expr *e1 = static_cast<expr *>(o1);
>        expr *e2 = static_cast<expr *>(o2);
>        if (e1->operation != e2->operation
> -         || e1->is_generic != e2->is_generic)
> +         || e1->is_generic != e2->is_generic
> +         || e1->match_phi != e2->match_phi)
>         return false;
>        if (e1->operation->kind == id_base::FN
>           /* For function calls also compare number of arguments.  */
> @@ -3251,6 +3256,7 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,
>                           depth);
>           indent += 4;
>           fprintf_indent (f, indent, "{\n");
> +         bool cond_expr_p = false;
>           for (unsigned i = 0; i < exprs_len; ++i)
>             {
>               expr *e = as_a <expr *> (gimple_exprs[i]->op);
> @@ -3262,6 +3268,7 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,
>               else
>                 {
>                   id_base *op = e->operation;
> +                 cond_expr_p |= (*op == COND_EXPR && e->match_phi);
>                   if (*op == CONVERT_EXPR || *op == NOP_EXPR)
>                     fprintf_indent (f, indent, "CASE_CONVERT:\n");
>                   else
> @@ -3275,6 +3282,27 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,
>           fprintf_indent (f, indent, "default:;\n");
>           fprintf_indent (f, indent, "}\n");
>           indent -= 4;
> +
> +         if (cond_expr_p)
> +           {
> +             fprintf_indent (f, indent,
> +               "else if (gphi *_a%d = dyn_cast <gphi *> (_d%d))\n",
> +               depth, depth);
> +             indent += 2;
> +             fprintf_indent (f, indent, "{\n");
> +             indent += 2;
> +
> +             for (unsigned i = 0; i < exprs_len; i++)
> +               {
> +                 expr *e = as_a <expr *> (gimple_exprs[i]->op);
> +                 if (*e->operation == COND_EXPR && e->match_phi)
> +                   gimple_exprs[i]->gen_phi_on_cond (f, indent, depth);
> +               }
> +
> +             indent -= 2;
> +             fprintf_indent (f, indent, "}\n");
> +             indent -= 2;
> +           }
>         }
>
>        if (fns_len)
> @@ -3483,6 +3511,86 @@ dt_operand::gen (FILE *f, int indent, bool gimple, int depth)
>      }
>  }
>
> +/* Generate matching code for the phi when meet COND_EXPR.  */
> +
> +void
> +dt_operand::gen_phi_on_cond (FILE *f, int indent, int depth)
> +{
> +  fprintf_indent (f, indent,
> +    "basic_block _b%d = gimple_bb (_a%d);\n", depth, depth);
> +
> +  fprintf_indent (f, indent, "if (gimple_phi_num_args (_a%d) == 2)\n", depth);
> +
> +  indent += 2;
> +  fprintf_indent (f, indent, "{\n");
> +  indent += 2;
> +
> +  fprintf_indent (f, indent,
> +    "basic_block _pb_0_%d = EDGE_PRED (_b%d, 0)->src;\n", depth, depth);
> +  fprintf_indent (f, indent,
> +    "basic_block _pb_1_%d = EDGE_PRED (_b%d, 1)->src;\n", depth, depth);
> +  fprintf_indent (f, indent,
> +    "basic_block _db_%d = safe_dyn_cast <gcond *> (*gsi_last_bb (_pb_0_%d)) ? "
> +    "_pb_0_%d : _pb_1_%d;\n", depth, depth, depth, depth);
> +  fprintf_indent (f, indent,
> +    "basic_block _other_db_%d = safe_dyn_cast <gcond *> "
> +    "(*gsi_last_bb (_pb_0_%d)) ? _pb_1_%d : _pb_0_%d;\n",
> +    depth, depth, depth, depth);
> +
> +  fprintf_indent (f, indent,
> +    "gcond *_ct_%d = safe_dyn_cast <gcond *> (*gsi_last_bb (_db_%d));\n",
> +    depth, depth);
> +  fprintf_indent (f, indent, "if (_ct_%d"
> +    " && EDGE_COUNT (_other_db_%d->preds) == 1\n", depth, depth);
> +  fprintf_indent (f, indent,
> +    "  && EDGE_COUNT (_other_db_%d->succs) == 1\n", depth);
> +  fprintf_indent (f, indent,
> +    "  && EDGE_PRED (_other_db_%d, 0)->src == _db_%d)\n", depth, depth);
> +
> +  indent += 2;
> +  fprintf_indent (f, indent, "{\n");
> +  indent += 2;
> +
> +  fprintf_indent (f, indent,
> +    "tree _cond_lhs_%d = gimple_cond_lhs (_ct_%d);\n", depth, depth);
> +  fprintf_indent (f, indent,
> +    "tree _cond_rhs_%d = gimple_cond_rhs (_ct_%d);\n", depth, depth);
> +
> +  char opname_0[20];
> +  char opname_1[20];
> +  char opname_2[20];
> +  gen_opname (opname_0, 0);
> +
> +  fprintf_indent (f, indent,
> +    "tree %s = build2 (gimple_cond_code (_ct_%d), "
> +    "boolean_type_node, _cond_lhs_%d, _cond_rhs_%d);\n",
> +    opname_0, depth, depth, depth);
> +
> +  fprintf_indent (f, indent,
> +    "bool _arg_0_is_true_%d = gimple_phi_arg_edge (_a%d, 0)->flags"
> +    " & EDGE_TRUE_VALUE;\n", depth, depth);
> +
> +  gen_opname (opname_1, 1);
> +  fprintf_indent (f, indent,
> +    "tree %s = gimple_phi_arg_def (_a%d, _arg_0_is_true_%d ? 0 : 1);\n",
> +    opname_1, depth, depth);
> +
> +  gen_opname (opname_2, 2);
> +  fprintf_indent (f, indent,
> +    "tree %s = gimple_phi_arg_def (_a%d, _arg_0_is_true_%d ? 1 : 0);\n",
> +    opname_2, depth, depth);
> +
> +  gen_kids (f, indent, true, depth);
> +
> +  indent -= 2;
> +  fprintf_indent (f, indent, "}\n");
> +  indent -= 2;
> +
> +  indent -= 2;
> +  fprintf_indent (f, indent, "}\n");
> +  indent -= 2;
> +}
> +
>  /* Emit a logging call to the debug file to the file F, with the INDENT from
>     either the RESULT location or the S's match location if RESULT is null. */
>  static void
> @@ -4600,6 +4708,18 @@ parser::parse_expr ()
>        e->force_leaf = true;
>      }
>
> +  if (token->type == CPP_XOR && !(token->flags & PREV_WHITE))
> +    {
> +      if (!parsing_match_operand)
> +       fatal_at (token, "modifier '^' is only valid in a match expression");
> +
> +      if (!(*e->operation == COND_EXPR))
> +       fatal_at (token, "modifier '^' can only act on operation COND_EXPR");
> +
> +      eat_token (CPP_XOR);
> +      e->match_phi = true;
> +    }
> +
>    if (token->type == CPP_COLON
>        && !(token->flags & PREV_WHITE))
>      {
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 7c1ad428a3c..5f36c1fac82 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3056,31 +3056,48 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>        && wi::eq_p (wi::to_wide (int_cst), wi::max_value (itype))))))
>
>  /* Unsigned Saturation Add */
> +/* SAT_ADD = usadd_left_part_1 | usadd_right_part_1, aka:
> +   SAT_ADD = (X + Y) | -((X + Y) < X)  */
>  (match (usadd_left_part_1 @0 @1)
>   (plus:c @0 @1)
>   (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
>        && types_match (type, @0, @1))))
>
> +/* SAT_ADD = usadd_left_part_2 | usadd_right_part_2, aka:
> +   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | (IMAGPART_EXPR <.ADD_OVERFLOW> != 0) */
>  (match (usadd_left_part_2 @0 @1)
>   (realpart (IFN_ADD_OVERFLOW:c @0 @1))
>   (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
>        && types_match (type, @0, @1))))
>
> +/* SAT_ADD = usadd_left_part_1 | usadd_right_part_1, aka:
> +   SAT_ADD = (X + Y) | -((type)(X + Y) < X)  */
>  (match (usadd_right_part_1 @0 @1)
>   (negate (convert (lt (plus:c @0 @1) @0)))
>   (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
>        && types_match (type, @0, @1))))
>
> +/* SAT_ADD = usadd_left_part_1 | usadd_right_part_1, aka:
> +   SAT_ADD = (X + Y) | -(X > (X + Y))  */
>  (match (usadd_right_part_1 @0 @1)
>   (negate (convert (gt @0 (plus:c @0 @1))))
>   (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
>        && types_match (type, @0, @1))))
>
> +/* SAT_ADD = usadd_left_part_2 | usadd_right_part_2, aka:
> +   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | (IMAGPART_EXPR <.ADD_OVERFLOW> != 0) */
>  (match (usadd_right_part_2 @0 @1)
>   (negate (convert (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)))
>   (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
>        && types_match (type, @0, @1))))
>
> +/* SAT_ADD = usadd_left_part_2 | usadd_right_part_2, aka:
> +   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | -IMAGPART_EXPR <.ADD_OVERFLOW> */
> +(match (usadd_right_part_2 @0 @1)
> + (negate (imagpart (IFN_ADD_OVERFLOW:c @0 @1)))
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +      && types_match (type, @0, @1))))
> +
>  /* We cannot merge or overload usadd_left_part_1 and usadd_left_part_2
>     because the sub part of left_part_2 cannot work with right_part_1.
>     For example, left_part_2 pattern focus one .ADD_OVERFLOW but the
> @@ -3092,10 +3109,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (match (unsigned_integer_sat_add @0 @1)
>   (bit_ior:c (usadd_left_part_1 @0 @1) (usadd_right_part_1 @0 @1)))
>
> -/* Unsigned saturation add, case 2 (branchless with .ADD_OVERFLOW).  */
> +/* Unsigned saturation add, case 2 (branchless with .ADD_OVERFLOW):
> +   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | -IMAGPART_EXPR <.ADD_OVERFLOW> or
> +   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | (IMAGPART_EXPR <.ADD_OVERFLOW> != 0) */
>  (match (unsigned_integer_sat_add @0 @1)
>   (bit_ior:c (usadd_left_part_2 @0 @1) (usadd_right_part_2 @0 @1)))
>
> +/* Unsigned saturation add, case 3 (branch with ge):
> +   SAT_U_ADD = (X + Y) >= x ? (X + Y) : -1.  */
> +(match (unsigned_integer_sat_add @0 @1)
> + (cond^ (ge (usadd_left_part_1@2 @0 @1) @0) @2 integer_minus_onep))
> +
> +/* Unsigned saturation add, case 4 (branch with lt):
> +   SAT_U_ADD = (X + Y) < x ? -1 : (X + Y).  */
> +(match (unsigned_integer_sat_add @0 @1)
> + (cond^ (lt (usadd_left_part_1@2 @0 @1) @0) integer_minus_onep @2))
> +
> +/* Unsigned saturation add, case 5 (branch with eq .ADD_OVERFLOW):
> +   SAT_U_ADD = REALPART_EXPR <.ADD_OVERFLOW> == 0 ? .ADD_OVERFLOW : -1.  */
> +(match (unsigned_integer_sat_add @0 @1)
> + (cond^ (eq (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)
> +  (usadd_left_part_2 @0 @1) integer_minus_onep))
> +
> +/* Unsigned saturation add, case 6 (branch with ne .ADD_OVERFLOW):
> +   SAT_U_ADD = REALPART_EXPR <.ADD_OVERFLOW> != 0 ? -1 : .ADD_OVERFLOW.  */
> +(match (unsigned_integer_sat_add @0 @1)
> + (cond^ (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)
> +  integer_minus_onep (usadd_left_part_2 @0 @1)))
> +
>  /* Unsigned saturation sub, case 1 (branch with gt):
>     SAT_U_SUB = X > Y ? X - Y : 0  */
>  (match (unsigned_integer_sat_sub @0 @1)
> diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> index 469e34d828d..173b0366f5e 100644
> --- a/gcc/tree-ssa-math-opts.cc
> +++ b/gcc/tree-ssa-math-opts.cc
> @@ -4101,8 +4101,24 @@ build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, internal_fn fn,
>      }
>  }
>
> +static void
> +build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, gphi *phi,
> +                                   internal_fn fn, tree lhs, tree op_0,
> +                                   tree op_1)
> +{
> +  if (direct_internal_fn_supported_p (fn, TREE_TYPE (lhs), OPTIMIZE_FOR_BOTH))
> +    {
> +      gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
> +      gimple_call_set_lhs (call, lhs);
> +      gsi_insert_before (gsi, call, GSI_SAME_STMT);
> +
> +      gimple_stmt_iterator psi = gsi_for_stmt (phi);
> +      remove_phi_node (&psi, /* release_lhs_p */ false);
> +    }
> +}
> +
>  /*
> - * Try to match saturation unsigned add.
> + * Try to match saturation unsigned add with assign.
>   *   _7 = _4 + _6;
>   *   _8 = _4 > _7;
>   *   _9 = (long unsigned int) _8;
> @@ -4121,6 +4137,37 @@ match_unsigned_saturation_add (gimple_stmt_iterator *gsi, gassign *stmt)
>      build_saturation_binary_arith_call (gsi, IFN_SAT_ADD, lhs, ops[0], ops[1]);
>  }
>
> +/*
> + * Try to match saturation unsigned add with PHI.
> + *   <bb 2> :
> + *   _1 = x_3(D) + y_4(D);
> + *   if (_1 >= x_3(D))
> + *     goto <bb 3>; [INV]
> + *   else
> + *     goto <bb 4>; [INV]
> + *
> + *   <bb 3> :
> + *
> + *   <bb 4> :
> + *   # _2 = PHI <255(2), _1(3)>
> + *   =>
> + *   <bb 4> [local count: 1073741824]:
> + *   _2 = .SAT_ADD (x_4(D), y_5(D));  */
> +
> +static void
> +match_unsigned_saturation_add (gimple_stmt_iterator *gsi, gphi *phi)
> +{
> +  if (gimple_phi_num_args (phi) != 2)
> +    return;
> +
> +  tree ops[2];
> +  tree phi_result = gimple_phi_result (phi);
> +
> +  if (gimple_unsigned_integer_sat_add (phi_result, ops, NULL))
> +    build_saturation_binary_arith_call (gsi, phi, IFN_SAT_ADD, phi_result,
> +                                       ops[0], ops[1]);
> +}
> +
>  /*
>   * Try to match saturation unsigned sub.
>   *   _1 = _4 >= _5;
> @@ -6052,6 +6099,13 @@ math_opts_dom_walker::after_dom_children (basic_block bb)
>
>    fma_deferring_state fma_state (param_avoid_fma_max_bits > 0);
>
> +  for (gphi_iterator psi = gsi_start_phis (bb); !gsi_end_p (psi);
> +    gsi_next (&psi))
> +    {
> +      gimple_stmt_iterator gsi = gsi_last_bb (bb);
> +      match_unsigned_saturation_add (&gsi, psi.phi ());
> +    }
> +
>    for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi);)
>      {
>        gimple *stmt = gsi_stmt (gsi);
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD
  2024-06-06 14:04 ` Richard Biener
@ 2024-06-06 22:59   ` Li, Pan2
  2024-06-13 12:00     ` Maciej W. Rozycki
  0 siblings, 1 reply; 9+ messages in thread
From: Li, Pan2 @ 2024-06-06 22:59 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, juzhe.zhong, kito.cheng, tamar.christina

Committed, thanks Richard.

Pan

-----Original Message-----
From: Richard Biener <richard.guenther@gmail.com> 
Sent: Thursday, June 6, 2024 10:04 PM
To: Li, Pan2 <pan2.li@intel.com>
Cc: gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; kito.cheng@gmail.com; tamar.christina@arm.com
Subject: Re: [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD

On Thu, Jun 6, 2024 at 3:37 PM <pan2.li@intel.com> wrote:
>
> From: Pan Li <pan2.li@intel.com>
>
> After we support one gassign form of the unsigned .SAT_ADD,  we
> would like to support more forms including both the branch and
> branchless.  There are 5 other forms of .SAT_ADD,  list as below:
>
> Form 1:
>   #define SAT_ADD_U_1(T) \
>   T sat_add_u_1_##T(T x, T y) \
>   { \
>     return (T)(x + y) >= x ? (x + y) : -1; \
>   }
>
> Form 2:
>   #define SAT_ADD_U_2(T) \
>   T sat_add_u_2_##T(T x, T y) \
>   { \
>     T ret; \
>     T overflow = __builtin_add_overflow (x, y, &ret); \
>     return (T)(-overflow) | ret; \
>   }
>
> Form 3:
>   #define SAT_ADD_U_3(T) \
>   T sat_add_u_3_##T (T x, T y) \
>   { \
>     T ret; \
>     return __builtin_add_overflow (x, y, &ret) ? -1 : ret; \
>   }
>
> Form 4:
>   #define SAT_ADD_U_4(T) \
>   T sat_add_u_4_##T (T x, T y) \
>   { \
>     T ret; \
>     return __builtin_add_overflow (x, y, &ret) == 0 ? ret : -1; \
>   }
>
> Form 5:
>   #define SAT_ADD_U_5(T) \
>   T sat_add_u_5_##T(T x, T y) \
>   { \
>     return (T)(x + y) < x ? -1 : (x + y); \
>   }
>
> Take the forms 3 of above as example:
>
> uint64_t
> sat_add (uint64_t x, uint64_t y)
> {
>   uint64_t ret;
>   return __builtin_add_overflow (x, y, &ret) ? -1 : ret;
> }
>
> Before this patch:
> uint64_t sat_add (uint64_t x, uint64_t y)
> {
>   long unsigned int _1;
>   long unsigned int _2;
>   uint64_t _3;
>   __complex__ long unsigned int _6;
>
> ;;   basic block 2, loop depth 0
> ;;    pred:       ENTRY
>   _6 = .ADD_OVERFLOW (x_4(D), y_5(D));
>   _2 = IMAGPART_EXPR <_6>;
>   if (_2 != 0)
>     goto <bb 4>; [35.00%]
>   else
>     goto <bb 3>; [65.00%]
> ;;    succ:       4
> ;;                3
>
> ;;   basic block 3, loop depth 0
> ;;    pred:       2
>   _1 = REALPART_EXPR <_6>;
> ;;    succ:       4
>
> ;;   basic block 4, loop depth 0
> ;;    pred:       3
> ;;                2
>   # _3 = PHI <_1(3), 18446744073709551615(2)>
>   return _3;
> ;;    succ:       EXIT
> }
>
> After this patch:
> uint64_t sat_add (uint64_t x, uint64_t y)
> {
>   long unsigned int _12;
>
> ;;   basic block 2, loop depth 0
> ;;    pred:       ENTRY
>   _12 = .SAT_ADD (x_4(D), y_5(D)); [tail call]
>   return _12;
> ;;    succ:       EXIT
> }
>
> The flag '^' acts on cond_expr will generate matching code similar as below:
>
> else if (gphi *_a1 = dyn_cast <gphi *> (_d1))
>   {
>     basic_block _b1 = gimple_bb (_a1);
>     if (gimple_phi_num_args (_a1) == 2)
>       {
>         basic_block _pb_0_1 = EDGE_PRED (_b1, 0)->src;
>         basic_block _pb_1_1 = EDGE_PRED (_b1, 1)->src;
>         basic_block _db_1 = safe_dyn_cast <gcond *> (*gsi_last_bb (_pb_0_1))
>                             ? _pb_0_1 : _pb_1_1;
>         basic_block _other_db_1 = safe_dyn_cast <gcond *> (*gsi_last_bb (_pb_0_1))
>                                   ? _pb_1_1 : _pb_0_1;
>         gcond *_ct_1 = safe_dyn_cast <gcond *> (*gsi_last_bb (_db_1));
>         if (_ct_1 && EDGE_COUNT (_other_db_1->preds) == 1
>           && EDGE_COUNT (_other_db_1->succs) == 1
>           && EDGE_PRED (_other_db_1, 0)->src == _db_1)
>           {
>             tree _cond_lhs_1 = gimple_cond_lhs (_ct_1);
>             tree _cond_rhs_1 = gimple_cond_rhs (_ct_1);
>             tree _p0 = build2 (gimple_cond_code (_ct_1), boolean_type_node,
>                                _cond_lhs_1, _cond_rhs_1);
>             bool _arg_0_is_true_1 = gimple_phi_arg_edge (_a1, 0)->flags & EDGE_TRUE_VALUE;
>             tree _p1 = gimple_phi_arg_def (_a1, _arg_0_is_true_1 ? 0 : 1);
>             tree _p2 = gimple_phi_arg_def (_a1, _arg_0_is_true_1 ? 1 : 0);
>             ....
>
> The below test suites are passed for this patch.
> * The x86 bootstrap test.
> * The x86 fully regression test.
> * The riscv fully regression test.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
>
>         * doc/match-and-simplify.texi: Add doc for the matching flag '^'.
>         * genmatch.cc (cmp_operand): Add match_phi comparation.
>         (dt_node::gen_kids_1): Add cond_expr bool flag for phi match.
>         (dt_operand::gen_phi_on_cond): Add new func to gen phi matching
>         on cond_expr.
>         (parser::parse_expr): Add handling for the expr flag '^'.
>         * match.pd: Add more form for unsigned .SAT_ADD.
>         * tree-ssa-math-opts.cc (build_saturation_binary_arith_call): Add
>         new func impl to build call for phi gimple.
>         (match_unsigned_saturation_add): Add new func impl to match the
>         .SAT_ADD for phi gimple.
>         (math_opts_dom_walker::after_dom_children): Add phi matching
>         try for all gimple phi stmt.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
> ---
>  gcc/doc/match-and-simplify.texi |  16 ++++
>  gcc/genmatch.cc                 | 126 +++++++++++++++++++++++++++++++-
>  gcc/match.pd                    |  43 ++++++++++-
>  gcc/tree-ssa-math-opts.cc       |  56 +++++++++++++-
>  4 files changed, 236 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/doc/match-and-simplify.texi b/gcc/doc/match-and-simplify.texi
> index 01f19e2f62c..63d5af159f5 100644
> --- a/gcc/doc/match-and-simplify.texi
> +++ b/gcc/doc/match-and-simplify.texi
> @@ -361,6 +361,22 @@ Usually the types of the generated result expressions are
>  determined from the context, but sometimes like in the above case
>  it is required that you specify them explicitly.
>
> +Another modifier for generated expressions is @code{^} which
> +tells the machinery to try more matches for some special cases.
> +For example, normally the @code{cond} only allows the gimple
> +assign when matching.  It will also try to match the gimple @code{PHI}
> +besides gimple assign if appending the @code{^} to the @code{cond}.
> +Aka @code{cond^}.  Consider below example
> +
> +@smallexample
> +(match (unsigned_sat_add @@0 @@1)
> + (cond^ (ge (plus:c@@2 @@0 @@1) @@0) @@2 integer_minus_onep))
> +@end smallexample
> +
> +The above matching will generate the predicate function named
> +@code{gimple_unsigned_sat_add} that accepts both the gimple
> +assign and gimple @code{PHI}.
> +
>  Another modifier for generated expressions is @code{!} which
>  tells the machinery to only consider the simplification in case
>  the marked expression simplified to a simple operand.  Consider
> diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc
> index f1e0e7abe0c..a56bd90cb2c 100644
> --- a/gcc/genmatch.cc
> +++ b/gcc/genmatch.cc
> @@ -848,12 +848,13 @@ public:
>      : operand (OP_EXPR, loc), operation (operation_),
>        ops (vNULL), expr_type (NULL), is_commutative (is_commutative_),
>        is_generic (false), force_single_use (false), force_leaf (false),
> -      opt_grp (0) {}
> +      match_phi (false), opt_grp (0) {}
>    expr (expr *e)
>      : operand (OP_EXPR, e->location), operation (e->operation),
>        ops (vNULL), expr_type (e->expr_type), is_commutative (e->is_commutative),
>        is_generic (e->is_generic), force_single_use (e->force_single_use),
> -      force_leaf (e->force_leaf), opt_grp (e->opt_grp) {}
> +      force_leaf (e->force_leaf), match_phi (e->match_phi),
> +      opt_grp (e->opt_grp) {}
>    void append_op (operand *op) { ops.safe_push (op); }
>    /* The operator and its operands.  */
>    id_base *operation;
> @@ -871,6 +872,8 @@ public:
>    /* Whether in the result expression this should be a leaf node
>       with any children simplified down to simple operands.  */
>    bool force_leaf;
> +  /* Whether the COND_EXPR is matching PHI gimple,  default false.  */
> +  bool match_phi;
>    /* If non-zero, the group for optional handling.  */
>    unsigned char opt_grp;
>    void gen_transform (FILE *f, int, const char *, bool, int,
> @@ -1819,6 +1822,7 @@ public:
>
>    char *get_name (char *);
>    void gen_opname (char *, unsigned);
> +  void gen_phi_on_cond (FILE *, int, int);
>  };
>
>  /* Leaf node of the decision tree, used for DT_SIMPLIFY.  */
> @@ -1898,7 +1902,8 @@ cmp_operand (operand *o1, operand *o2)
>        expr *e1 = static_cast<expr *>(o1);
>        expr *e2 = static_cast<expr *>(o2);
>        if (e1->operation != e2->operation
> -         || e1->is_generic != e2->is_generic)
> +         || e1->is_generic != e2->is_generic
> +         || e1->match_phi != e2->match_phi)
>         return false;
>        if (e1->operation->kind == id_base::FN
>           /* For function calls also compare number of arguments.  */
> @@ -3251,6 +3256,7 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,
>                           depth);
>           indent += 4;
>           fprintf_indent (f, indent, "{\n");
> +         bool cond_expr_p = false;
>           for (unsigned i = 0; i < exprs_len; ++i)
>             {
>               expr *e = as_a <expr *> (gimple_exprs[i]->op);
> @@ -3262,6 +3268,7 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,
>               else
>                 {
>                   id_base *op = e->operation;
> +                 cond_expr_p |= (*op == COND_EXPR && e->match_phi);
>                   if (*op == CONVERT_EXPR || *op == NOP_EXPR)
>                     fprintf_indent (f, indent, "CASE_CONVERT:\n");
>                   else
> @@ -3275,6 +3282,27 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,
>           fprintf_indent (f, indent, "default:;\n");
>           fprintf_indent (f, indent, "}\n");
>           indent -= 4;
> +
> +         if (cond_expr_p)
> +           {
> +             fprintf_indent (f, indent,
> +               "else if (gphi *_a%d = dyn_cast <gphi *> (_d%d))\n",
> +               depth, depth);
> +             indent += 2;
> +             fprintf_indent (f, indent, "{\n");
> +             indent += 2;
> +
> +             for (unsigned i = 0; i < exprs_len; i++)
> +               {
> +                 expr *e = as_a <expr *> (gimple_exprs[i]->op);
> +                 if (*e->operation == COND_EXPR && e->match_phi)
> +                   gimple_exprs[i]->gen_phi_on_cond (f, indent, depth);
> +               }
> +
> +             indent -= 2;
> +             fprintf_indent (f, indent, "}\n");
> +             indent -= 2;
> +           }
>         }
>
>        if (fns_len)
> @@ -3483,6 +3511,86 @@ dt_operand::gen (FILE *f, int indent, bool gimple, int depth)
>      }
>  }
>
> +/* Generate matching code for the phi when meet COND_EXPR.  */
> +
> +void
> +dt_operand::gen_phi_on_cond (FILE *f, int indent, int depth)
> +{
> +  fprintf_indent (f, indent,
> +    "basic_block _b%d = gimple_bb (_a%d);\n", depth, depth);
> +
> +  fprintf_indent (f, indent, "if (gimple_phi_num_args (_a%d) == 2)\n", depth);
> +
> +  indent += 2;
> +  fprintf_indent (f, indent, "{\n");
> +  indent += 2;
> +
> +  fprintf_indent (f, indent,
> +    "basic_block _pb_0_%d = EDGE_PRED (_b%d, 0)->src;\n", depth, depth);
> +  fprintf_indent (f, indent,
> +    "basic_block _pb_1_%d = EDGE_PRED (_b%d, 1)->src;\n", depth, depth);
> +  fprintf_indent (f, indent,
> +    "basic_block _db_%d = safe_dyn_cast <gcond *> (*gsi_last_bb (_pb_0_%d)) ? "
> +    "_pb_0_%d : _pb_1_%d;\n", depth, depth, depth, depth);
> +  fprintf_indent (f, indent,
> +    "basic_block _other_db_%d = safe_dyn_cast <gcond *> "
> +    "(*gsi_last_bb (_pb_0_%d)) ? _pb_1_%d : _pb_0_%d;\n",
> +    depth, depth, depth, depth);
> +
> +  fprintf_indent (f, indent,
> +    "gcond *_ct_%d = safe_dyn_cast <gcond *> (*gsi_last_bb (_db_%d));\n",
> +    depth, depth);
> +  fprintf_indent (f, indent, "if (_ct_%d"
> +    " && EDGE_COUNT (_other_db_%d->preds) == 1\n", depth, depth);
> +  fprintf_indent (f, indent,
> +    "  && EDGE_COUNT (_other_db_%d->succs) == 1\n", depth);
> +  fprintf_indent (f, indent,
> +    "  && EDGE_PRED (_other_db_%d, 0)->src == _db_%d)\n", depth, depth);
> +
> +  indent += 2;
> +  fprintf_indent (f, indent, "{\n");
> +  indent += 2;
> +
> +  fprintf_indent (f, indent,
> +    "tree _cond_lhs_%d = gimple_cond_lhs (_ct_%d);\n", depth, depth);
> +  fprintf_indent (f, indent,
> +    "tree _cond_rhs_%d = gimple_cond_rhs (_ct_%d);\n", depth, depth);
> +
> +  char opname_0[20];
> +  char opname_1[20];
> +  char opname_2[20];
> +  gen_opname (opname_0, 0);
> +
> +  fprintf_indent (f, indent,
> +    "tree %s = build2 (gimple_cond_code (_ct_%d), "
> +    "boolean_type_node, _cond_lhs_%d, _cond_rhs_%d);\n",
> +    opname_0, depth, depth, depth);
> +
> +  fprintf_indent (f, indent,
> +    "bool _arg_0_is_true_%d = gimple_phi_arg_edge (_a%d, 0)->flags"
> +    " & EDGE_TRUE_VALUE;\n", depth, depth);
> +
> +  gen_opname (opname_1, 1);
> +  fprintf_indent (f, indent,
> +    "tree %s = gimple_phi_arg_def (_a%d, _arg_0_is_true_%d ? 0 : 1);\n",
> +    opname_1, depth, depth);
> +
> +  gen_opname (opname_2, 2);
> +  fprintf_indent (f, indent,
> +    "tree %s = gimple_phi_arg_def (_a%d, _arg_0_is_true_%d ? 1 : 0);\n",
> +    opname_2, depth, depth);
> +
> +  gen_kids (f, indent, true, depth);
> +
> +  indent -= 2;
> +  fprintf_indent (f, indent, "}\n");
> +  indent -= 2;
> +
> +  indent -= 2;
> +  fprintf_indent (f, indent, "}\n");
> +  indent -= 2;
> +}
> +
>  /* Emit a logging call to the debug file to the file F, with the INDENT from
>     either the RESULT location or the S's match location if RESULT is null. */
>  static void
> @@ -4600,6 +4708,18 @@ parser::parse_expr ()
>        e->force_leaf = true;
>      }
>
> +  if (token->type == CPP_XOR && !(token->flags & PREV_WHITE))
> +    {
> +      if (!parsing_match_operand)
> +       fatal_at (token, "modifier '^' is only valid in a match expression");
> +
> +      if (!(*e->operation == COND_EXPR))
> +       fatal_at (token, "modifier '^' can only act on operation COND_EXPR");
> +
> +      eat_token (CPP_XOR);
> +      e->match_phi = true;
> +    }
> +
>    if (token->type == CPP_COLON
>        && !(token->flags & PREV_WHITE))
>      {
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 7c1ad428a3c..5f36c1fac82 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3056,31 +3056,48 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>        && wi::eq_p (wi::to_wide (int_cst), wi::max_value (itype))))))
>
>  /* Unsigned Saturation Add */
> +/* SAT_ADD = usadd_left_part_1 | usadd_right_part_1, aka:
> +   SAT_ADD = (X + Y) | -((X + Y) < X)  */
>  (match (usadd_left_part_1 @0 @1)
>   (plus:c @0 @1)
>   (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
>        && types_match (type, @0, @1))))
>
> +/* SAT_ADD = usadd_left_part_2 | usadd_right_part_2, aka:
> +   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | (IMAGPART_EXPR <.ADD_OVERFLOW> != 0) */
>  (match (usadd_left_part_2 @0 @1)
>   (realpart (IFN_ADD_OVERFLOW:c @0 @1))
>   (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
>        && types_match (type, @0, @1))))
>
> +/* SAT_ADD = usadd_left_part_1 | usadd_right_part_1, aka:
> +   SAT_ADD = (X + Y) | -((type)(X + Y) < X)  */
>  (match (usadd_right_part_1 @0 @1)
>   (negate (convert (lt (plus:c @0 @1) @0)))
>   (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
>        && types_match (type, @0, @1))))
>
> +/* SAT_ADD = usadd_left_part_1 | usadd_right_part_1, aka:
> +   SAT_ADD = (X + Y) | -(X > (X + Y))  */
>  (match (usadd_right_part_1 @0 @1)
>   (negate (convert (gt @0 (plus:c @0 @1))))
>   (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
>        && types_match (type, @0, @1))))
>
> +/* SAT_ADD = usadd_left_part_2 | usadd_right_part_2, aka:
> +   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | (IMAGPART_EXPR <.ADD_OVERFLOW> != 0) */
>  (match (usadd_right_part_2 @0 @1)
>   (negate (convert (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)))
>   (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
>        && types_match (type, @0, @1))))
>
> +/* SAT_ADD = usadd_left_part_2 | usadd_right_part_2, aka:
> +   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | -IMAGPART_EXPR <.ADD_OVERFLOW> */
> +(match (usadd_right_part_2 @0 @1)
> + (negate (imagpart (IFN_ADD_OVERFLOW:c @0 @1)))
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +      && types_match (type, @0, @1))))
> +
>  /* We cannot merge or overload usadd_left_part_1 and usadd_left_part_2
>     because the sub part of left_part_2 cannot work with right_part_1.
>     For example, left_part_2 pattern focus one .ADD_OVERFLOW but the
> @@ -3092,10 +3109,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (match (unsigned_integer_sat_add @0 @1)
>   (bit_ior:c (usadd_left_part_1 @0 @1) (usadd_right_part_1 @0 @1)))
>
> -/* Unsigned saturation add, case 2 (branchless with .ADD_OVERFLOW).  */
> +/* Unsigned saturation add, case 2 (branchless with .ADD_OVERFLOW):
> +   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | -IMAGPART_EXPR <.ADD_OVERFLOW> or
> +   SAT_ADD = REALPART_EXPR <.ADD_OVERFLOW> | (IMAGPART_EXPR <.ADD_OVERFLOW> != 0) */
>  (match (unsigned_integer_sat_add @0 @1)
>   (bit_ior:c (usadd_left_part_2 @0 @1) (usadd_right_part_2 @0 @1)))
>
> +/* Unsigned saturation add, case 3 (branch with ge):
> +   SAT_U_ADD = (X + Y) >= x ? (X + Y) : -1.  */
> +(match (unsigned_integer_sat_add @0 @1)
> + (cond^ (ge (usadd_left_part_1@2 @0 @1) @0) @2 integer_minus_onep))
> +
> +/* Unsigned saturation add, case 4 (branch with lt):
> +   SAT_U_ADD = (X + Y) < x ? -1 : (X + Y).  */
> +(match (unsigned_integer_sat_add @0 @1)
> + (cond^ (lt (usadd_left_part_1@2 @0 @1) @0) integer_minus_onep @2))
> +
> +/* Unsigned saturation add, case 5 (branch with eq .ADD_OVERFLOW):
> +   SAT_U_ADD = REALPART_EXPR <.ADD_OVERFLOW> == 0 ? .ADD_OVERFLOW : -1.  */
> +(match (unsigned_integer_sat_add @0 @1)
> + (cond^ (eq (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)
> +  (usadd_left_part_2 @0 @1) integer_minus_onep))
> +
> +/* Unsigned saturation add, case 6 (branch with ne .ADD_OVERFLOW):
> +   SAT_U_ADD = REALPART_EXPR <.ADD_OVERFLOW> != 0 ? -1 : .ADD_OVERFLOW.  */
> +(match (unsigned_integer_sat_add @0 @1)
> + (cond^ (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)
> +  integer_minus_onep (usadd_left_part_2 @0 @1)))
> +
>  /* Unsigned saturation sub, case 1 (branch with gt):
>     SAT_U_SUB = X > Y ? X - Y : 0  */
>  (match (unsigned_integer_sat_sub @0 @1)
> diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> index 469e34d828d..173b0366f5e 100644
> --- a/gcc/tree-ssa-math-opts.cc
> +++ b/gcc/tree-ssa-math-opts.cc
> @@ -4101,8 +4101,24 @@ build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, internal_fn fn,
>      }
>  }
>
> +static void
> +build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, gphi *phi,
> +                                   internal_fn fn, tree lhs, tree op_0,
> +                                   tree op_1)
> +{
> +  if (direct_internal_fn_supported_p (fn, TREE_TYPE (lhs), OPTIMIZE_FOR_BOTH))
> +    {
> +      gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1);
> +      gimple_call_set_lhs (call, lhs);
> +      gsi_insert_before (gsi, call, GSI_SAME_STMT);
> +
> +      gimple_stmt_iterator psi = gsi_for_stmt (phi);
> +      remove_phi_node (&psi, /* release_lhs_p */ false);
> +    }
> +}
> +
>  /*
> - * Try to match saturation unsigned add.
> + * Try to match saturation unsigned add with assign.
>   *   _7 = _4 + _6;
>   *   _8 = _4 > _7;
>   *   _9 = (long unsigned int) _8;
> @@ -4121,6 +4137,37 @@ match_unsigned_saturation_add (gimple_stmt_iterator *gsi, gassign *stmt)
>      build_saturation_binary_arith_call (gsi, IFN_SAT_ADD, lhs, ops[0], ops[1]);
>  }
>
> +/*
> + * Try to match saturation unsigned add with PHI.
> + *   <bb 2> :
> + *   _1 = x_3(D) + y_4(D);
> + *   if (_1 >= x_3(D))
> + *     goto <bb 3>; [INV]
> + *   else
> + *     goto <bb 4>; [INV]
> + *
> + *   <bb 3> :
> + *
> + *   <bb 4> :
> + *   # _2 = PHI <255(2), _1(3)>
> + *   =>
> + *   <bb 4> [local count: 1073741824]:
> + *   _2 = .SAT_ADD (x_4(D), y_5(D));  */
> +
> +static void
> +match_unsigned_saturation_add (gimple_stmt_iterator *gsi, gphi *phi)
> +{
> +  if (gimple_phi_num_args (phi) != 2)
> +    return;
> +
> +  tree ops[2];
> +  tree phi_result = gimple_phi_result (phi);
> +
> +  if (gimple_unsigned_integer_sat_add (phi_result, ops, NULL))
> +    build_saturation_binary_arith_call (gsi, phi, IFN_SAT_ADD, phi_result,
> +                                       ops[0], ops[1]);
> +}
> +
>  /*
>   * Try to match saturation unsigned sub.
>   *   _1 = _4 >= _5;
> @@ -6052,6 +6099,13 @@ math_opts_dom_walker::after_dom_children (basic_block bb)
>
>    fma_deferring_state fma_state (param_avoid_fma_max_bits > 0);
>
> +  for (gphi_iterator psi = gsi_start_phis (bb); !gsi_end_p (psi);
> +    gsi_next (&psi))
> +    {
> +      gimple_stmt_iterator gsi = gsi_last_bb (bb);
> +      match_unsigned_saturation_add (&gsi, psi.phi ());
> +    }
> +
>    for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi);)
>      {
>        gimple *stmt = gsi_stmt (gsi);
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD
  2024-06-06 22:59   ` Li, Pan2
@ 2024-06-13 12:00     ` Maciej W. Rozycki
  2024-06-13 13:00       ` Li, Pan2
  0 siblings, 1 reply; 9+ messages in thread
From: Maciej W. Rozycki @ 2024-06-13 12:00 UTC (permalink / raw)
  To: Li, Pan2
  Cc: Richard Biener, gcc-patches, juzhe.zhong, kito.cheng, tamar.christina

[-- Attachment #1: Type: text/plain, Size: 1134 bytes --]

On Thu, 6 Jun 2024, Li, Pan2 wrote:

> Committed, thanks Richard.

 This has broken glibc building for the `riscv64-linux-gnu' target:

iovsprintf.c: In function '__vsprintf_internal':
iovsprintf.c:34:1: error: definition in block 5 follows the use
   34 | __vsprintf_internal (char *string, size_t maxlen,
      | ^~~~~~~~~~~~~~~~~~~
for SSA_NAME: _9 in statement:
prephitmp_36 = (char *) _9;
during GIMPLE pass: widening_mul
iovsprintf.c:34:1: internal compiler error: verify_ssa failed
0x7fffb1474bf7 generic_start_main
        ../csu/libc-start.c:308
0x7fffb1474de3 __libc_start_main
        ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:116
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

Use just `riscv64-linux-gnu -O2' on the attached preprocessed source to 
reproduce.  This builds just fine up to e14afbe2d1c6^ and then breaks with 
the message quoted.

 Shall I investigate it further or will you be able to figure it out with 
the information supplied what's wrong?

  Maciej

[-- Attachment #2: Type: application/octet-stream, Size: 43444 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD
  2024-06-13 12:00     ` Maciej W. Rozycki
@ 2024-06-13 13:00       ` Li, Pan2
  2024-06-13 16:14         ` Maciej W. Rozycki
  0 siblings, 1 reply; 9+ messages in thread
From: Li, Pan2 @ 2024-06-13 13:00 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Richard Biener, gcc-patches, juzhe.zhong, kito.cheng, tamar.christina

Could you please help to update the upstream for another try ?

Should be fixed by this commit https://github.com/gcc-mirror/gcc/commit/d03ff3fd3e2da1352a404e3c53fe61314569345c.

Feel free to ping me if any questions or concerns.

Pan

-----Original Message-----
From: Maciej W. Rozycki <macro@orcam.me.uk> 
Sent: Thursday, June 13, 2024 8:01 PM
To: Li, Pan2 <pan2.li@intel.com>
Cc: Richard Biener <richard.guenther@gmail.com>; gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; kito.cheng@gmail.com; tamar.christina@arm.com
Subject: RE: [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD

On Thu, 6 Jun 2024, Li, Pan2 wrote:

> Committed, thanks Richard.

 This has broken glibc building for the `riscv64-linux-gnu' target:

iovsprintf.c: In function '__vsprintf_internal':
iovsprintf.c:34:1: error: definition in block 5 follows the use
   34 | __vsprintf_internal (char *string, size_t maxlen,
      | ^~~~~~~~~~~~~~~~~~~
for SSA_NAME: _9 in statement:
prephitmp_36 = (char *) _9;
during GIMPLE pass: widening_mul
iovsprintf.c:34:1: internal compiler error: verify_ssa failed
0x7fffb1474bf7 generic_start_main
        ../csu/libc-start.c:308
0x7fffb1474de3 __libc_start_main
        ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:116
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

Use just `riscv64-linux-gnu -O2' on the attached preprocessed source to 
reproduce.  This builds just fine up to e14afbe2d1c6^ and then breaks with 
the message quoted.

 Shall I investigate it further or will you be able to figure it out with 
the information supplied what's wrong?

  Maciej

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD
  2024-06-13 13:00       ` Li, Pan2
@ 2024-06-13 16:14         ` Maciej W. Rozycki
  2024-06-14  1:21           ` Li, Pan2
  0 siblings, 1 reply; 9+ messages in thread
From: Maciej W. Rozycki @ 2024-06-13 16:14 UTC (permalink / raw)
  To: Li, Pan2
  Cc: Richard Biener, gcc-patches, juzhe.zhong, kito.cheng, tamar.christina

On Thu, 13 Jun 2024, Li, Pan2 wrote:

> Could you please help to update the upstream for another try ?
> 
> Should be fixed by this commit https://github.com/gcc-mirror/gcc/commit/d03ff3fd3e2da1352a404e3c53fe61314569345c.
> 
> Feel free to ping me if any questions or concerns.

 Upstream master (as at 609764a42f0c) doesn't build:

In file included from .../gcc/gcc/coretypes.h:487,
                 from .../gcc/gcc/tree-vect-stmts.cc:24:
In member function 'bool poly_int<N, T>::is_constant() const [with unsigned int N = 2; C = long unsigned int]',
    inlined from 'C poly_int<N, T>::to_constant() const [with unsigned int N = 2; C = long unsigned int]' at .../gcc/gcc/poly-int.h:588:3,
    inlined from 'bool get_group_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)' at .../gcc/gcc/tree-vect-stmts.cc:2155:39,
    inlined from 'bool get_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, unsigned int, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)' at .../gcc/gcc/tree-vect-stmts.cc:2387:38:
.../gcc/gcc/poly-int.h:557:7: error: 'remain.poly_int<2, long unsigned int>::coeffs[1]' may be used uninitialized [-Werror=maybe-uninitialized]
  557 |       if (this->coeffs[i] != 0)
      |       ^~
.../gcc/gcc/tree-vect-stmts.cc: In function 'bool get_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, unsigned int, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)':
.../gcc/gcc/tree-vect-stmts.cc:2115:23: note: 'remain.poly_int<2, long unsigned int>::coeffs[1]' was declared here
 2115 |           poly_uint64 remain;
      |                       ^~~~~~
In file included from .../gcc/gcc/system.h:1250,
                 from .../gcc/gcc/tree-vect-stmts.cc:23:
In function 'int ceil_log2(long unsigned int)',
    inlined from 'bool get_group_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)' at .../gcc/gcc/tree-vect-stmts.cc:2156:43,
    inlined from 'bool get_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, unsigned int, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)' at .../gcc/gcc/tree-vect-stmts.cc:2387:38:
.../gcc/gcc/hwint.h:266:17: error: 'remain.poly_int<2, long unsigned int>::coeffs[0]' may be used uninitialized [-Werror=maybe-uninitialized]
  266 |   return x == 0 ? 0 : floor_log2 (x - 1) + 1;
      |          ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
.../gcc/gcc/tree-vect-stmts.cc: In function 'bool get_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, unsigned int, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)':
.../gcc/gcc/tree-vect-stmts.cc:2115:23: note: 'remain.poly_int<2, long unsigned int>::coeffs[0]' was declared here
 2115 |           poly_uint64 remain;
      |                       ^~~~~~
cc1plus: all warnings being treated as errors
make[2]: *** [Makefile:1198: tree-vect-stmts.o] Error 1

and actually e14afbe2d1c6^ doesn't build either (I guess it's just slipped 
through bisection as the file didn't have to be rebuild or something):

In file included from .../gcc/gcc/rtl.h:3973,
                 from .../gcc/gcc/config/riscv/riscv.cc:31:
In function 'rtx_def* init_rtx_fmt_ee(rtx, machine_mode, rtx, rtx)',
    inlined from 'rtx_def* gen_rtx_fmt_ee_stat(rtx_code, machine_mode, rtx, rtx)' at ./genrtl.h:50:26,
    inlined from 'void riscv_move_integer(rtx, rtx, long int, machine_mode)' at .../gcc/gcc/config/riscv/riscv.cc:2786:10:
./genrtl.h:37:16: error: 'x' may be used uninitialized [-Werror=maybe-uninitialized]
   37 |   XEXP (rt, 0) = arg0;
.../gcc/gcc/config/riscv/riscv.cc: In function 'void riscv_move_integer(rtx, rtx, long int, machine_mode)':
.../gcc/gcc/config/riscv/riscv.cc:2723:7: note: 'x' was declared here
 2723 |   rtx x;
      |       ^
cc1plus: all warnings being treated as errors
make[2]: *** [Makefile:2563: riscv.o] Error 1

I hope you'll find this all useful.  As it happens I don't need to verify 
my needs with a RISC-V target anymore, so I'm leaving it all up to you now 
as I need to switch back to Alpha, which has been my actual objective, and 
these rebuilds have taken a lot of my attention already.

 Thank you for your input.

  Maciej

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD
  2024-06-13 16:14         ` Maciej W. Rozycki
@ 2024-06-14  1:21           ` Li, Pan2
  2024-06-14  6:04             ` Richard Biener
  0 siblings, 1 reply; 9+ messages in thread
From: Li, Pan2 @ 2024-06-14  1:21 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Richard Biener, gcc-patches, juzhe.zhong, kito.cheng, tamar.christina

Thanks for another try.

Looks the build failure list below has nothing to do with this patch. I think the correlated owner will take care of this Werror build issue soon.

Pan

-----Original Message-----
From: Maciej W. Rozycki <macro@orcam.me.uk> 
Sent: Friday, June 14, 2024 12:15 AM
To: Li, Pan2 <pan2.li@intel.com>
Cc: Richard Biener <richard.guenther@gmail.com>; gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; kito.cheng@gmail.com; tamar.christina@arm.com
Subject: RE: [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD

On Thu, 13 Jun 2024, Li, Pan2 wrote:

> Could you please help to update the upstream for another try ?
> 
> Should be fixed by this commit https://github.com/gcc-mirror/gcc/commit/d03ff3fd3e2da1352a404e3c53fe61314569345c.
> 
> Feel free to ping me if any questions or concerns.

 Upstream master (as at 609764a42f0c) doesn't build:

In file included from .../gcc/gcc/coretypes.h:487,
                 from .../gcc/gcc/tree-vect-stmts.cc:24:
In member function 'bool poly_int<N, T>::is_constant() const [with unsigned int N = 2; C = long unsigned int]',
    inlined from 'C poly_int<N, T>::to_constant() const [with unsigned int N = 2; C = long unsigned int]' at .../gcc/gcc/poly-int.h:588:3,
    inlined from 'bool get_group_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)' at .../gcc/gcc/tree-vect-stmts.cc:2155:39,
    inlined from 'bool get_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, unsigned int, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)' at .../gcc/gcc/tree-vect-stmts.cc:2387:38:
.../gcc/gcc/poly-int.h:557:7: error: 'remain.poly_int<2, long unsigned int>::coeffs[1]' may be used uninitialized [-Werror=maybe-uninitialized]
  557 |       if (this->coeffs[i] != 0)
      |       ^~
.../gcc/gcc/tree-vect-stmts.cc: In function 'bool get_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, unsigned int, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)':
.../gcc/gcc/tree-vect-stmts.cc:2115:23: note: 'remain.poly_int<2, long unsigned int>::coeffs[1]' was declared here
 2115 |           poly_uint64 remain;
      |                       ^~~~~~
In file included from .../gcc/gcc/system.h:1250,
                 from .../gcc/gcc/tree-vect-stmts.cc:23:
In function 'int ceil_log2(long unsigned int)',
    inlined from 'bool get_group_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)' at .../gcc/gcc/tree-vect-stmts.cc:2156:43,
    inlined from 'bool get_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, unsigned int, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)' at .../gcc/gcc/tree-vect-stmts.cc:2387:38:
.../gcc/gcc/hwint.h:266:17: error: 'remain.poly_int<2, long unsigned int>::coeffs[0]' may be used uninitialized [-Werror=maybe-uninitialized]
  266 |   return x == 0 ? 0 : floor_log2 (x - 1) + 1;
      |          ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
.../gcc/gcc/tree-vect-stmts.cc: In function 'bool get_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, unsigned int, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)':
.../gcc/gcc/tree-vect-stmts.cc:2115:23: note: 'remain.poly_int<2, long unsigned int>::coeffs[0]' was declared here
 2115 |           poly_uint64 remain;
      |                       ^~~~~~
cc1plus: all warnings being treated as errors
make[2]: *** [Makefile:1198: tree-vect-stmts.o] Error 1

and actually e14afbe2d1c6^ doesn't build either (I guess it's just slipped 
through bisection as the file didn't have to be rebuild or something):

In file included from .../gcc/gcc/rtl.h:3973,
                 from .../gcc/gcc/config/riscv/riscv.cc:31:
In function 'rtx_def* init_rtx_fmt_ee(rtx, machine_mode, rtx, rtx)',
    inlined from 'rtx_def* gen_rtx_fmt_ee_stat(rtx_code, machine_mode, rtx, rtx)' at ./genrtl.h:50:26,
    inlined from 'void riscv_move_integer(rtx, rtx, long int, machine_mode)' at .../gcc/gcc/config/riscv/riscv.cc:2786:10:
./genrtl.h:37:16: error: 'x' may be used uninitialized [-Werror=maybe-uninitialized]
   37 |   XEXP (rt, 0) = arg0;
.../gcc/gcc/config/riscv/riscv.cc: In function 'void riscv_move_integer(rtx, rtx, long int, machine_mode)':
.../gcc/gcc/config/riscv/riscv.cc:2723:7: note: 'x' was declared here
 2723 |   rtx x;
      |       ^
cc1plus: all warnings being treated as errors
make[2]: *** [Makefile:2563: riscv.o] Error 1

I hope you'll find this all useful.  As it happens I don't need to verify 
my needs with a RISC-V target anymore, so I'm leaving it all up to you now 
as I need to switch back to Alpha, which has been my actual objective, and 
these rebuilds have taken a lot of my attention already.

 Thank you for your input.

  Maciej

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD
  2024-06-14  1:21           ` Li, Pan2
@ 2024-06-14  6:04             ` Richard Biener
  2024-06-14 22:27               ` Maciej W. Rozycki
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Biener @ 2024-06-14  6:04 UTC (permalink / raw)
  To: Li, Pan2
  Cc: Maciej W. Rozycki, gcc-patches, juzhe.zhong, kito.cheng, tamar.christina

On Fri, Jun 14, 2024 at 3:21 AM Li, Pan2 <pan2.li@intel.com> wrote:
>
> Thanks for another try.
>
> Looks the build failure list below has nothing to do with this patch. I think the correlated owner will take care of this Werror build issue soon.
>
> Pan
>
> -----Original Message-----
> From: Maciej W. Rozycki <macro@orcam.me.uk>
> Sent: Friday, June 14, 2024 12:15 AM
> To: Li, Pan2 <pan2.li@intel.com>
> Cc: Richard Biener <richard.guenther@gmail.com>; gcc-patches@gcc.gnu.org; juzhe.zhong@rivai.ai; kito.cheng@gmail.com; tamar.christina@arm.com
> Subject: RE: [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD
>
> On Thu, 13 Jun 2024, Li, Pan2 wrote:
>
> > Could you please help to update the upstream for another try ?
> >
> > Should be fixed by this commit https://github.com/gcc-mirror/gcc/commit/d03ff3fd3e2da1352a404e3c53fe61314569345c.
> >
> > Feel free to ping me if any questions or concerns.
>
>  Upstream master (as at 609764a42f0c) doesn't build:
>
> In file included from .../gcc/gcc/coretypes.h:487,
>                  from .../gcc/gcc/tree-vect-stmts.cc:24:
> In member function 'bool poly_int<N, T>::is_constant() const [with unsigned int N = 2; C = long unsigned int]',
>     inlined from 'C poly_int<N, T>::to_constant() const [with unsigned int N = 2; C = long unsigned int]' at .../gcc/gcc/poly-int.h:588:3,
>     inlined from 'bool get_group_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)' at .../gcc/gcc/tree-vect-stmts.cc:2155:39,
>     inlined from 'bool get_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, unsigned int, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)' at .../gcc/gcc/tree-vect-stmts.cc:2387:38:
> .../gcc/gcc/poly-int.h:557:7: error: 'remain.poly_int<2, long unsigned int>::coeffs[1]' may be used uninitialized [-Werror=maybe-uninitialized]
>   557 |       if (this->coeffs[i] != 0)
>       |       ^~
> .../gcc/gcc/tree-vect-stmts.cc: In function 'bool get_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, unsigned int, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)':
> .../gcc/gcc/tree-vect-stmts.cc:2115:23: note: 'remain.poly_int<2, long unsigned int>::coeffs[1]' was declared here
>  2115 |           poly_uint64 remain;
>       |                       ^~~~~~
> In file included from .../gcc/gcc/system.h:1250,
>                  from .../gcc/gcc/tree-vect-stmts.cc:23:
> In function 'int ceil_log2(long unsigned int)',
>     inlined from 'bool get_group_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)' at .../gcc/gcc/tree-vect-stmts.cc:2156:43,
>     inlined from 'bool get_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, unsigned int, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)' at .../gcc/gcc/tree-vect-stmts.cc:2387:38:
> .../gcc/gcc/hwint.h:266:17: error: 'remain.poly_int<2, long unsigned int>::coeffs[0]' may be used uninitialized [-Werror=maybe-uninitialized]
>   266 |   return x == 0 ? 0 : floor_log2 (x - 1) + 1;
>       |          ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> .../gcc/gcc/tree-vect-stmts.cc: In function 'bool get_load_store_type(vec_info*, stmt_vec_info, tree, slp_tree, bool, vec_load_store_type, unsigned int, vect_memory_access_type*, poly_int64*, dr_alignment_support*, int*, gather_scatter_info*, internal_fn*)':
> .../gcc/gcc/tree-vect-stmts.cc:2115:23: note: 'remain.poly_int<2, long unsigned int>::coeffs[0]' was declared here
>  2115 |           poly_uint64 remain;
>       |                       ^~~~~~
> cc1plus: all warnings being treated as errors
> make[2]: *** [Makefile:1198: tree-vect-stmts.o] Error 1
>
> and actually e14afbe2d1c6^ doesn't build either (I guess it's just slipped
> through bisection as the file didn't have to be rebuild or something):
>
> In file included from .../gcc/gcc/rtl.h:3973,
>                  from .../gcc/gcc/config/riscv/riscv.cc:31:
> In function 'rtx_def* init_rtx_fmt_ee(rtx, machine_mode, rtx, rtx)',
>     inlined from 'rtx_def* gen_rtx_fmt_ee_stat(rtx_code, machine_mode, rtx, rtx)' at ./genrtl.h:50:26,
>     inlined from 'void riscv_move_integer(rtx, rtx, long int, machine_mode)' at .../gcc/gcc/config/riscv/riscv.cc:2786:10:
> ./genrtl.h:37:16: error: 'x' may be used uninitialized [-Werror=maybe-uninitialized]
>    37 |   XEXP (rt, 0) = arg0;
> .../gcc/gcc/config/riscv/riscv.cc: In function 'void riscv_move_integer(rtx, rtx, long int, machine_mode)':
> .../gcc/gcc/config/riscv/riscv.cc:2723:7: note: 'x' was declared here
>  2723 |   rtx x;
>       |       ^
> cc1plus: all warnings being treated as errors
> make[2]: *** [Makefile:2563: riscv.o] Error 1
>
> I hope you'll find this all useful.  As it happens I don't need to verify
> my needs with a RISC-V target anymore, so I'm leaving it all up to you now
> as I need to switch back to Alpha, which has been my actual objective, and
> these rebuilds have taken a lot of my attention already.

I'm including

@@ -2148,15 +2148,17 @@ get_group_load_store_type (vec_info *vinfo,
stmt_vec_info stmt_info,
...
              if (!nunits.is_constant (&cnunits)
                  || !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant (&cvf)
-                 || ((cremain = remain.to_constant (), true)
+                 || (((cremain = group_size * cvf - gap % cnunits), true)
                      && ((cpart_size = (1 << ceil_log2 (cremain))) != cnunits)

in a patch I'm testing that should eventually fix the above bogus diagnostic.

>  Thank you for your input.
>
>   Maciej

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD
  2024-06-14  6:04             ` Richard Biener
@ 2024-06-14 22:27               ` Maciej W. Rozycki
  0 siblings, 0 replies; 9+ messages in thread
From: Maciej W. Rozycki @ 2024-06-14 22:27 UTC (permalink / raw)
  To: Richard Biener
  Cc: Li, Pan2, gcc-patches, juzhe.zhong, kito.cheng, tamar.christina

On Fri, 14 Jun 2024, Richard Biener wrote:

> > I hope you'll find this all useful.  As it happens I don't need to verify
> > my needs with a RISC-V target anymore, so I'm leaving it all up to you now
> > as I need to switch back to Alpha, which has been my actual objective, and
> > these rebuilds have taken a lot of my attention already.
> 
> I'm including
> 
> @@ -2148,15 +2148,17 @@ get_group_load_store_type (vec_info *vinfo,
> stmt_vec_info stmt_info,
> ...
>               if (!nunits.is_constant (&cnunits)
>                   || !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant (&cvf)
> -                 || ((cremain = remain.to_constant (), true)
> +                 || (((cremain = group_size * cvf - gap % cnunits), true)
>                       && ((cpart_size = (1 << ceil_log2 (cremain))) != cnunits)
> 
> in a patch I'm testing that should eventually fix the above bogus diagnostic.

 Thank you.  I can see your fix went in with e575b5c56137 and I can 
confirm that with:

<https://patchwork.sourceware.org/project/gcc/patch/20240612232026.41780-1-patrick@rivosinc.com/>

applied (and its patched-autoconf artefacts removed) `riscv64-linux-gnu' 
target now succesfully builds for me as at 1bb2535c7cb2.

  Maciej

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-06-14 22:27 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-06 13:37 [PATCH v7] Match: Support more form for scalar unsigned SAT_ADD pan2.li
2024-06-06 14:04 ` Richard Biener
2024-06-06 22:59   ` Li, Pan2
2024-06-13 12:00     ` Maciej W. Rozycki
2024-06-13 13:00       ` Li, Pan2
2024-06-13 16:14         ` Maciej W. Rozycki
2024-06-14  1:21           ` Li, Pan2
2024-06-14  6:04             ` Richard Biener
2024-06-14 22:27               ` Maciej W. Rozycki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).