public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/7] Some more phiopt cleanups and double minmax to match
@ 2023-04-24 21:30 Andrew Pinski
  2023-04-24 21:30 ` [PATCH 1/7] PHIOPT: Split out store elimination from phiopt Andrew Pinski
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Andrew Pinski @ 2023-04-24 21:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Pinski

The first 3 patches of this series are about some more phiopt cleanups,
folding the worker functions into their corresponding ::execute functions.

The last 4 allow the diamond-based min/max optimization to be done
in match instead of manually in phiopt. Note I have not removed
minmax_replacement yet, as there are a few patterns still missing from
match. Those will be implemented in the next couple of weeks.
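
As a rough sketch (not taken from the testsuite), the kind of source the
diamond min/max patterns target looks like:

  int
  min3 (int a, int b, int c)
  {
    int r;
    if (a < b)
      r = a < c ? a : c;   /* MIN (a, c) */
    else
      r = b < c ? b : c;   /* MIN (b, c) */
    return r;              /* can become MIN (MIN (a, b), c) */
  }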

Andrew Pinski (7):
  PHIOPT: Split out store elimination from phiopt
  PHIOPT: Rename tree_ssa_phiopt_worker to pass_phiopt::execute
  PHIOPT: Move store_elim_worker into pass_cselim::execute
  MIN/MAX should be treated similar as comparisons for trapping
  PHIOPT: Allow MIN/MAX to have up to 2 MIN/MAX expressions for early
    phiopt
  MATCH: Factor out code that for min max detection with constants
  MATCH: Add patterns from phiopt's minmax_replacement

 gcc/fold-const.cc                            |  43 ++
 gcc/fold-const.h                             |   3 +
 gcc/match.pd                                 |  45 +-
 gcc/rtlanal.cc                               |   3 +
 gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c    |  10 +-
 gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c |   3 +-
 gcc/tree-eh.cc                               |   3 +
 gcc/tree-ssa-phiopt.cc                       | 579 ++++++++++---------
 8 files changed, 394 insertions(+), 295 deletions(-)

-- 
2.39.1



* [PATCH 1/7] PHIOPT: Split out store elimination from phiopt
  2023-04-24 21:30 [PATCH 0/7] Some more phiopt cleanups and double minmax to match Andrew Pinski
@ 2023-04-24 21:30 ` Andrew Pinski
  2023-04-27 10:50   ` Richard Biener
  2023-04-24 21:30 ` [PATCH 2/7] PHIOPT: Rename tree_ssa_phiopt_worker to pass_phiopt::execute Andrew Pinski
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Andrew Pinski @ 2023-04-24 21:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Pinski

The last cleanups made it easier to see
that we should split the store elimination
worker out of the tree_ssa_phiopt_worker function.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

	* tree-ssa-phiopt.cc (tree_ssa_phiopt_worker): Remove
	do_store_elim argument and split that part out to ...
	(store_elim_worker): This new function.
	(pass_cselim::execute): Call store_elim_worker.
	(pass_phiopt::execute): Update call to tree_ssa_phiopt_worker.
---
 gcc/tree-ssa-phiopt.cc | 180 ++++++++++++++++++++++++++++-------------
 1 file changed, 126 insertions(+), 54 deletions(-)

diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index 4a3ab8efb71..7f47b32576b 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -104,27 +104,19 @@ single_non_singleton_phi_for_edges (gimple_seq seq, edge e0, edge e1)
   return phi;
 }
 
-/* The core routine of conditional store replacement and normal
-   phi optimizations.  Both share much of the infrastructure in how
-   to match applicable basic block patterns.  DO_STORE_ELIM is true
-   when we want to do conditional store replacement, false otherwise.
+/* The core routine of phi optimizations.
    DO_HOIST_LOADS is true when we want to hoist adjacent loads out
    of diamond control flow patterns, false otherwise.  */
 static unsigned int
-tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
+tree_ssa_phiopt_worker (bool do_hoist_loads, bool early_p)
 {
   basic_block bb;
   basic_block *bb_order;
   unsigned n, i;
   bool cfgchanged = false;
-  hash_set<tree> *nontrap = 0;
 
   calculate_dominance_info (CDI_DOMINATORS);
 
-  if (do_store_elim)
-    /* Calculate the set of non-trapping memory accesses.  */
-    nontrap = get_non_trapping ();
-
   /* Search every basic block for COND_EXPR we may be able to optimize.
 
      We walk the blocks in order that guarantees that a block with
@@ -148,7 +140,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
       /* Check to see if the last statement is a GIMPLE_COND.  */
       gcond *cond_stmt = safe_dyn_cast <gcond *> (*gsi_last_bb (bb));
       if (!cond_stmt)
-        continue;
+	continue;
 
       e1 = EDGE_SUCC (bb, 0);
       bb1 = e1->dest;
@@ -158,12 +150,12 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
       /* We cannot do the optimization on abnormal edges.  */
       if ((e1->flags & EDGE_ABNORMAL) != 0
           || (e2->flags & EDGE_ABNORMAL) != 0)
-       continue;
+	continue;
 
       /* If either bb1's succ or bb2 or bb2's succ is non NULL.  */
       if (EDGE_COUNT (bb1->succs) == 0
 	  || EDGE_COUNT (bb2->succs) == 0)
-        continue;
+	continue;
 
       /* Find the bb which is the fall through to the other.  */
       if (EDGE_SUCC (bb1, 0)->dest == bb2)
@@ -192,39 +184,6 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
 	  || (e1->flags & EDGE_FALLTHRU) == 0)
 	continue;
 
-      if (do_store_elim)
-	{
-	  if (diamond_p)
-	    {
-	      basic_block bb3 = e1->dest;
-
-	      /* Only handle sinking of store from 2 bbs only,
-		 The middle bbs don't need to come from the
-		 if always since we are sinking rather than
-		 hoisting. */
-	      if (EDGE_COUNT (bb3->preds) != 2)
-		continue;
-	      if (cond_if_else_store_replacement (bb1, bb2, bb3))
-		cfgchanged = true;
-	      continue;
-	    }
-
-	  /* Also make sure that bb1 only have one predecessor and that it
-	     is bb.  */
-	  if (!single_pred_p (bb1)
-	      || single_pred (bb1) != bb)
-	    continue;
-
-	  /* bb1 is the middle block, bb2 the join block, bb the split block,
-	     e1 the fallthrough edge from bb1 to bb2.  We can't do the
-	     optimization if the join block has more than two predecessors.  */
-	  if (EDGE_COUNT (bb2->preds) > 2)
-	    continue;
-	  if (cond_store_replacement (bb1, bb2, e1, e2, nontrap))
-	    cfgchanged = true;
-	  continue;
-	}
-
       if (diamond_p)
 	{
 	  basic_block bb3 = e1->dest;
@@ -322,18 +281,132 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
 
   free (bb_order);
 
-  if (do_store_elim)
-    delete nontrap;
+  if (cfgchanged)
+    return TODO_cleanup_cfg;
+  return 0;
+}
+
+/* The core routine of conditional store replacement.  */
+static unsigned int
+store_elim_worker (void)
+{
+  basic_block bb;
+  basic_block *bb_order;
+  unsigned n, i;
+  bool cfgchanged = false;
+  hash_set<tree> *nontrap = 0;
+
+  calculate_dominance_info (CDI_DOMINATORS);
+
+  /* Calculate the set of non-trapping memory accesses.  */
+  nontrap = get_non_trapping ();
+
+  /* Search every basic block for COND_EXPR we may be able to optimize.
+
+     We walk the blocks in order that guarantees that a block with
+     a single predecessor is processed before the predecessor.
+     This ensures that we collapse inner ifs before visiting the
+     outer ones, and also that we do not try to visit a removed
+     block.  */
+  bb_order = single_pred_before_succ_order ();
+  n = n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS;
+
+  for (i = 0; i < n; i++)
+    {
+      basic_block bb1, bb2;
+      edge e1, e2;
+      bool diamond_p = false;
+
+      bb = bb_order[i];
+
+      /* Check to see if the last statement is a GIMPLE_COND.  */
+      gcond *cond_stmt = safe_dyn_cast <gcond *> (*gsi_last_bb (bb));
+      if (!cond_stmt)
+	continue;
+
+      e1 = EDGE_SUCC (bb, 0);
+      bb1 = e1->dest;
+      e2 = EDGE_SUCC (bb, 1);
+      bb2 = e2->dest;
+
+      /* We cannot do the optimization on abnormal edges.  */
+      if ((e1->flags & EDGE_ABNORMAL) != 0
+	  || (e2->flags & EDGE_ABNORMAL) != 0)
+	continue;
+
+      /* If either bb1's succ or bb2 or bb2's succ is non NULL.  */
+      if (EDGE_COUNT (bb1->succs) == 0
+	  || EDGE_COUNT (bb2->succs) == 0)
+	continue;
+
+      /* Find the bb which is the fall through to the other.  */
+      if (EDGE_SUCC (bb1, 0)->dest == bb2)
+	;
+      else if (EDGE_SUCC (bb2, 0)->dest == bb1)
+	{
+	  std::swap (bb1, bb2);
+	  std::swap (e1, e2);
+	}
+      else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
+	       && single_succ_p (bb2))
+	{
+	  diamond_p = true;
+	  e2 = EDGE_SUCC (bb2, 0);
+	  /* Make sure bb2 is just a fall through. */
+	  if ((e2->flags & EDGE_FALLTHRU) == 0)
+	    continue;
+	}
+      else
+	continue;
+
+      e1 = EDGE_SUCC (bb1, 0);
+
+      /* Make sure that bb1 is just a fall through.  */
+      if (!single_succ_p (bb1)
+	  || (e1->flags & EDGE_FALLTHRU) == 0)
+	continue;
+
+      if (diamond_p)
+	{
+	  basic_block bb3 = e1->dest;
+
+	  /* Only handle sinking of store from 2 bbs only,
+	     The middle bbs don't need to come from the
+	     if always since we are sinking rather than
+	     hoisting. */
+	  if (EDGE_COUNT (bb3->preds) != 2)
+	    continue;
+	  if (cond_if_else_store_replacement (bb1, bb2, bb3))
+	    cfgchanged = true;
+	  continue;
+	}
+
+      /* Also make sure that bb1 only have one predecessor and that it
+	 is bb.  */
+      if (!single_pred_p (bb1)
+	  || single_pred (bb1) != bb)
+	continue;
+
+      /* bb1 is the middle block, bb2 the join block, bb the split block,
+	 e1 the fallthrough edge from bb1 to bb2.  We can't do the
+	 optimization if the join block has more than two predecessors.  */
+      if (EDGE_COUNT (bb2->preds) > 2)
+	continue;
+      if (cond_store_replacement (bb1, bb2, e1, e2, nontrap))
+	cfgchanged = true;
+    }
+
+  free (bb_order);
+
+  delete nontrap;
   /* If the CFG has changed, we should cleanup the CFG.  */
-  if (cfgchanged && do_store_elim)
+  if (cfgchanged)
     {
       /* In cond-store replacement we have added some loads on edges
          and new VOPS (as we moved the store, and created a load).  */
       gsi_commit_edge_inserts ();
       return TODO_cleanup_cfg | TODO_update_ssa_only_virtuals;
     }
-  else if (cfgchanged)
-    return TODO_cleanup_cfg;
   return 0;
 }
 
@@ -4257,8 +4330,7 @@ public:
   bool gate (function *) final override { return flag_ssa_phiopt; }
   unsigned int execute (function *) final override
     {
-      return tree_ssa_phiopt_worker (false,
-				     !early_p ? gate_hoist_loads () : false,
+      return tree_ssa_phiopt_worker (!early_p ? gate_hoist_loads () : false,
 				     early_p);
     }
 
@@ -4360,7 +4432,7 @@ pass_cselim::execute (function *)
      An interfacing issue of find_data_references_in_bb.  */
   loop_optimizer_init (LOOPS_NORMAL);
   scev_initialize ();
-  todo = tree_ssa_phiopt_worker (true, false, false);
+  todo = store_elim_worker ();
   scev_finalize ();
   loop_optimizer_finalize ();
   return todo;
-- 
2.39.1



* [PATCH 2/7] PHIOPT: Rename tree_ssa_phiopt_worker to pass_phiopt::execute
  2023-04-24 21:30 [PATCH 0/7] Some more phiopt cleanups and double minmax to match Andrew Pinski
  2023-04-24 21:30 ` [PATCH 1/7] PHIOPT: Split out store elimination from phiopt Andrew Pinski
@ 2023-04-24 21:30 ` Andrew Pinski
  2023-04-27 10:58   ` Richard Biener
  2023-04-24 21:30 ` [PATCH 3/7] PHIOPT: Move store_elim_worker into pass_cselim::execute Andrew Pinski
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Andrew Pinski @ 2023-04-24 21:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Pinski

Now that store elimination and phiopt no longer
share the outer code, we can move tree_ssa_phiopt_worker
directly into pass_phiopt::execute and remove
many declarations (prototypes) from the file.

gcc/ChangeLog:

	* tree-ssa-phiopt.cc (two_value_replacement): Remove
	prototype.
	(match_simplify_replacement): Likewise.
	(factor_out_conditional_conversion): Likewise.
	(value_replacement): Likewise.
	(minmax_replacement): Likewise.
	(spaceship_replacement): Likewise.
	(cond_removal_in_builtin_zero_pattern): Likewise.
	(hoist_adjacent_loads): Likewise.
	(tree_ssa_phiopt_worker): Move into ...
	(pass_phiopt::execute): this.
---
 gcc/tree-ssa-phiopt.cc | 385 +++++++++++++++++++----------------------
 1 file changed, 181 insertions(+), 204 deletions(-)

diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index 7f47b32576b..d232fd9b551 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -55,27 +55,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-propagate.h"
 #include "tree-ssa-dce.h"
 
-static bool two_value_replacement (basic_block, basic_block, edge, gphi *,
-				   tree, tree);
-static bool match_simplify_replacement (basic_block, basic_block, basic_block,
-					edge, edge, gphi *, tree, tree, bool, bool);
-static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
-						gimple *);
-static int value_replacement (basic_block, basic_block,
-			      edge, edge, gphi *, tree, tree);
-static bool minmax_replacement (basic_block, basic_block, basic_block,
-				edge, edge, gphi *, tree, tree, bool);
-static bool spaceship_replacement (basic_block, basic_block,
-				   edge, edge, gphi *, tree, tree);
-static bool cond_removal_in_builtin_zero_pattern (basic_block, basic_block,
-						  edge, edge, gphi *,
-						  tree, tree);
 static bool cond_store_replacement (basic_block, basic_block, edge, edge,
 				    hash_set<tree> *);
 static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
 static hash_set<tree> * get_non_trapping ();
-static void hoist_adjacent_loads (basic_block, basic_block,
-				  basic_block, basic_block);
 
 /* Return the singleton PHI in the SEQ of PHIs for edges E0 and E1. */
 
@@ -104,188 +87,6 @@ single_non_singleton_phi_for_edges (gimple_seq seq, edge e0, edge e1)
   return phi;
 }
 
-/* The core routine of phi optimizations.
-   DO_HOIST_LOADS is true when we want to hoist adjacent loads out
-   of diamond control flow patterns, false otherwise.  */
-static unsigned int
-tree_ssa_phiopt_worker (bool do_hoist_loads, bool early_p)
-{
-  basic_block bb;
-  basic_block *bb_order;
-  unsigned n, i;
-  bool cfgchanged = false;
-
-  calculate_dominance_info (CDI_DOMINATORS);
-
-  /* Search every basic block for COND_EXPR we may be able to optimize.
-
-     We walk the blocks in order that guarantees that a block with
-     a single predecessor is processed before the predecessor.
-     This ensures that we collapse inner ifs before visiting the
-     outer ones, and also that we do not try to visit a removed
-     block.  */
-  bb_order = single_pred_before_succ_order ();
-  n = n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS;
-
-  for (i = 0; i < n; i++)
-    {
-      gphi *phi;
-      basic_block bb1, bb2;
-      edge e1, e2;
-      tree arg0, arg1;
-      bool diamond_p = false;
-
-      bb = bb_order[i];
-
-      /* Check to see if the last statement is a GIMPLE_COND.  */
-      gcond *cond_stmt = safe_dyn_cast <gcond *> (*gsi_last_bb (bb));
-      if (!cond_stmt)
-	continue;
-
-      e1 = EDGE_SUCC (bb, 0);
-      bb1 = e1->dest;
-      e2 = EDGE_SUCC (bb, 1);
-      bb2 = e2->dest;
-
-      /* We cannot do the optimization on abnormal edges.  */
-      if ((e1->flags & EDGE_ABNORMAL) != 0
-          || (e2->flags & EDGE_ABNORMAL) != 0)
-	continue;
-
-      /* If either bb1's succ or bb2 or bb2's succ is non NULL.  */
-      if (EDGE_COUNT (bb1->succs) == 0
-	  || EDGE_COUNT (bb2->succs) == 0)
-	continue;
-
-      /* Find the bb which is the fall through to the other.  */
-      if (EDGE_SUCC (bb1, 0)->dest == bb2)
-        ;
-      else if (EDGE_SUCC (bb2, 0)->dest == bb1)
-        {
-	  std::swap (bb1, bb2);
-	  std::swap (e1, e2);
-	}
-      else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
-	       && single_succ_p (bb2))
-	{
-	  diamond_p = true;
-	  e2 = EDGE_SUCC (bb2, 0);
-	  /* Make sure bb2 is just a fall through. */
-	  if ((e2->flags & EDGE_FALLTHRU) == 0)
-	    continue;
-	}
-      else
-	continue;
-
-      e1 = EDGE_SUCC (bb1, 0);
-
-      /* Make sure that bb1 is just a fall through.  */
-      if (!single_succ_p (bb1)
-	  || (e1->flags & EDGE_FALLTHRU) == 0)
-	continue;
-
-      if (diamond_p)
-	{
-	  basic_block bb3 = e1->dest;
-
-	  if (!single_pred_p (bb1)
-	      || !single_pred_p (bb2))
-	    continue;
-
-	  if (do_hoist_loads
-	      && !FLOAT_TYPE_P (TREE_TYPE (gimple_cond_lhs (cond_stmt)))
-	      && EDGE_COUNT (bb->succs) == 2
-	      && EDGE_COUNT (bb3->preds) == 2
-	      /* If one edge or the other is dominant, a conditional move
-		 is likely to perform worse than the well-predicted branch.  */
-	      && !predictable_edge_p (EDGE_SUCC (bb, 0))
-	      && !predictable_edge_p (EDGE_SUCC (bb, 1)))
-	    hoist_adjacent_loads (bb, bb1, bb2, bb3);
-	}
-
-      gimple_stmt_iterator gsi;
-      bool candorest = true;
-
-      /* Check that we're looking for nested phis.  */
-      basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
-      gimple_seq phis = phi_nodes (merge);
-
-      /* Value replacement can work with more than one PHI
-	 so try that first. */
-      if (!early_p && !diamond_p)
-	for (gsi = gsi_start (phis); !gsi_end_p (gsi); gsi_next (&gsi))
-	  {
-	    phi = as_a <gphi *> (gsi_stmt (gsi));
-	    arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
-	    arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
-	    if (value_replacement (bb, bb1, e1, e2, phi, arg0, arg1) == 2)
-	      {
-		candorest = false;
-		cfgchanged = true;
-		break;
-	      }
-	  }
-
-      if (!candorest)
-	continue;
-
-      phi = single_non_singleton_phi_for_edges (phis, e1, e2);
-      if (!phi)
-	continue;
-
-      arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
-      arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
-
-      /* Something is wrong if we cannot find the arguments in the PHI
-	  node.  */
-      gcc_assert (arg0 != NULL_TREE && arg1 != NULL_TREE);
-
-      gphi *newphi;
-      if (single_pred_p (bb1)
-	  && !diamond_p
-	  && (newphi = factor_out_conditional_conversion (e1, e2, phi,
-							  arg0, arg1,
-							  cond_stmt)))
-	{
-	  phi = newphi;
-	  /* factor_out_conditional_conversion may create a new PHI in
-	     BB2 and eliminate an existing PHI in BB2.  Recompute values
-	     that may be affected by that change.  */
-	  arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
-	  arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
-	  gcc_assert (arg0 != NULL_TREE && arg1 != NULL_TREE);
-	}
-
-      /* Do the replacement of conditional if it can be done.  */
-      if (!early_p
-	  && !diamond_p
-	  && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
-	cfgchanged = true;
-      else if (match_simplify_replacement (bb, bb1, bb2, e1, e2, phi,
-					   arg0, arg1, early_p, diamond_p))
-	cfgchanged = true;
-      else if (!early_p
-	       && !diamond_p
-	       && single_pred_p (bb1)
-	       && cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
-							phi, arg0, arg1))
-	cfgchanged = true;
-      else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
-				   diamond_p))
-	cfgchanged = true;
-      else if (single_pred_p (bb1)
-	       && !diamond_p
-	       && spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
-	cfgchanged = true;
-    }
-
-  free (bb_order);
-
-  if (cfgchanged)
-    return TODO_cleanup_cfg;
-  return 0;
-}
-
 /* The core routine of conditional store replacement.  */
 static unsigned int
 store_elim_worker (void)
@@ -4328,11 +4129,7 @@ public:
       early_p = param;
     }
   bool gate (function *) final override { return flag_ssa_phiopt; }
-  unsigned int execute (function *) final override
-    {
-      return tree_ssa_phiopt_worker (!early_p ? gate_hoist_loads () : false,
-				     early_p);
-    }
+  unsigned int execute (function *) final override;
 
 private:
   bool early_p;
@@ -4346,6 +4143,186 @@ make_pass_phiopt (gcc::context *ctxt)
   return new pass_phiopt (ctxt);
 }
 
+unsigned int
+pass_phiopt::execute (function *)
+{
+  bool do_hoist_loads = !early_p ? gate_hoist_loads () : false;
+  basic_block bb;
+  basic_block *bb_order;
+  unsigned n, i;
+  bool cfgchanged = false;
+
+  calculate_dominance_info (CDI_DOMINATORS);
+
+  /* Search every basic block for COND_EXPR we may be able to optimize.
+
+     We walk the blocks in order that guarantees that a block with
+     a single predecessor is processed before the predecessor.
+     This ensures that we collapse inner ifs before visiting the
+     outer ones, and also that we do not try to visit a removed
+     block.  */
+  bb_order = single_pred_before_succ_order ();
+  n = n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS;
+
+  for (i = 0; i < n; i++)
+    {
+      gphi *phi;
+      basic_block bb1, bb2;
+      edge e1, e2;
+      tree arg0, arg1;
+      bool diamond_p = false;
+
+      bb = bb_order[i];
+
+      /* Check to see if the last statement is a GIMPLE_COND.  */
+      gcond *cond_stmt = safe_dyn_cast <gcond *> (*gsi_last_bb (bb));
+      if (!cond_stmt)
+	continue;
+
+      e1 = EDGE_SUCC (bb, 0);
+      bb1 = e1->dest;
+      e2 = EDGE_SUCC (bb, 1);
+      bb2 = e2->dest;
+
+      /* We cannot do the optimization on abnormal edges.  */
+      if ((e1->flags & EDGE_ABNORMAL) != 0
+	  || (e2->flags & EDGE_ABNORMAL) != 0)
+	continue;
+
+      /* If either bb1's succ or bb2 or bb2's succ is non NULL.  */
+      if (EDGE_COUNT (bb1->succs) == 0
+	  || EDGE_COUNT (bb2->succs) == 0)
+	continue;
+
+      /* Find the bb which is the fall through to the other.  */
+      if (EDGE_SUCC (bb1, 0)->dest == bb2)
+	;
+      else if (EDGE_SUCC (bb2, 0)->dest == bb1)
+	{
+	  std::swap (bb1, bb2);
+	  std::swap (e1, e2);
+	}
+      else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
+	       && single_succ_p (bb2))
+	{
+	  diamond_p = true;
+	  e2 = EDGE_SUCC (bb2, 0);
+	  /* Make sure bb2 is just a fall through. */
+	  if ((e2->flags & EDGE_FALLTHRU) == 0)
+	    continue;
+	}
+      else
+	continue;
+
+      e1 = EDGE_SUCC (bb1, 0);
+
+      /* Make sure that bb1 is just a fall through.  */
+      if (!single_succ_p (bb1)
+	  || (e1->flags & EDGE_FALLTHRU) == 0)
+	continue;
+
+      if (diamond_p)
+	{
+	  basic_block bb3 = e1->dest;
+
+	  if (!single_pred_p (bb1)
+	      || !single_pred_p (bb2))
+	    continue;
+
+	  if (do_hoist_loads
+	      && !FLOAT_TYPE_P (TREE_TYPE (gimple_cond_lhs (cond_stmt)))
+	      && EDGE_COUNT (bb->succs) == 2
+	      && EDGE_COUNT (bb3->preds) == 2
+	      /* If one edge or the other is dominant, a conditional move
+		 is likely to perform worse than the well-predicted branch.  */
+	      && !predictable_edge_p (EDGE_SUCC (bb, 0))
+	      && !predictable_edge_p (EDGE_SUCC (bb, 1)))
+	    hoist_adjacent_loads (bb, bb1, bb2, bb3);
+	}
+
+      gimple_stmt_iterator gsi;
+      bool candorest = true;
+
+      /* Check that we're looking for nested phis.  */
+      basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
+      gimple_seq phis = phi_nodes (merge);
+
+      /* Value replacement can work with more than one PHI
+	 so try that first. */
+      if (!early_p && !diamond_p)
+	for (gsi = gsi_start (phis); !gsi_end_p (gsi); gsi_next (&gsi))
+	  {
+	    phi = as_a <gphi *> (gsi_stmt (gsi));
+	    arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
+	    arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
+	    if (value_replacement (bb, bb1, e1, e2, phi, arg0, arg1) == 2)
+	      {
+		candorest = false;
+		cfgchanged = true;
+		break;
+	      }
+	  }
+
+      if (!candorest)
+	continue;
+
+      phi = single_non_singleton_phi_for_edges (phis, e1, e2);
+      if (!phi)
+	continue;
+
+      arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
+      arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
+
+      /* Something is wrong if we cannot find the arguments in the PHI
+	  node.  */
+      gcc_assert (arg0 != NULL_TREE && arg1 != NULL_TREE);
+
+      gphi *newphi;
+      if (single_pred_p (bb1)
+	  && !diamond_p
+	  && (newphi = factor_out_conditional_conversion (e1, e2, phi,
+							  arg0, arg1,
+							  cond_stmt)))
+	{
+	  phi = newphi;
+	  /* factor_out_conditional_conversion may create a new PHI in
+	     BB2 and eliminate an existing PHI in BB2.  Recompute values
+	     that may be affected by that change.  */
+	  arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
+	  arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
+	  gcc_assert (arg0 != NULL_TREE && arg1 != NULL_TREE);
+	}
+
+      /* Do the replacement of conditional if it can be done.  */
+      if (!early_p
+	  && !diamond_p
+	  && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
+	cfgchanged = true;
+      else if (match_simplify_replacement (bb, bb1, bb2, e1, e2, phi,
+					   arg0, arg1, early_p, diamond_p))
+	cfgchanged = true;
+      else if (!early_p
+	       && !diamond_p
+	       && single_pred_p (bb1)
+	       && cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
+							phi, arg0, arg1))
+	cfgchanged = true;
+      else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
+				   diamond_p))
+	cfgchanged = true;
+      else if (single_pred_p (bb1)
+	       && !diamond_p
+	       && spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
+	cfgchanged = true;
+    }
+
+  free (bb_order);
+
+  if (cfgchanged)
+    return TODO_cleanup_cfg;
+  return 0;
+}
+
 /* This pass tries to transform conditional stores into unconditional
    ones, enabling further simplifications with the simpler then and else
    blocks.  In particular it replaces this:
-- 
2.39.1



* [PATCH 3/7] PHIOPT: Move store_elim_worker into pass_cselim::execute
  2023-04-24 21:30 [PATCH 0/7] Some more phiopt cleanups and double minmax to match Andrew Pinski
  2023-04-24 21:30 ` [PATCH 1/7] PHIOPT: Split out store elimination from phiopt Andrew Pinski
  2023-04-24 21:30 ` [PATCH 2/7] PHIOPT: Rename tree_ssa_phiopt_worker to pass_phiopt::execute Andrew Pinski
@ 2023-04-24 21:30 ` Andrew Pinski
  2023-04-27 10:50   ` Richard Biener
  2023-04-24 21:30 ` [PATCH 4/7] MIN/MAX should be treated similar as comparisons for trapping Andrew Pinski
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Andrew Pinski @ 2023-04-24 21:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Pinski

This simple patch moves the body of store_elim_worker
directly into pass_cselim::execute.

It also removes the now-unneeded prototypes.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

	* tree-ssa-phiopt.cc (cond_store_replacement): Remove
	prototype.
	(cond_if_else_store_replacement): Likewise.
	(get_non_trapping): Likewise.
	(store_elim_worker): Move into ...
	(pass_cselim::execute): This.
---
 gcc/tree-ssa-phiopt.cc | 250 ++++++++++++++++++++---------------------
 1 file changed, 119 insertions(+), 131 deletions(-)

diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index d232fd9b551..fb2d2c9fc1a 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -55,11 +55,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-propagate.h"
 #include "tree-ssa-dce.h"
 
-static bool cond_store_replacement (basic_block, basic_block, edge, edge,
-				    hash_set<tree> *);
-static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
-static hash_set<tree> * get_non_trapping ();
-
 /* Return the singleton PHI in the SEQ of PHIs for edges E0 and E1. */
 
 static gphi *
@@ -87,130 +82,6 @@ single_non_singleton_phi_for_edges (gimple_seq seq, edge e0, edge e1)
   return phi;
 }
 
-/* The core routine of conditional store replacement.  */
-static unsigned int
-store_elim_worker (void)
-{
-  basic_block bb;
-  basic_block *bb_order;
-  unsigned n, i;
-  bool cfgchanged = false;
-  hash_set<tree> *nontrap = 0;
-
-  calculate_dominance_info (CDI_DOMINATORS);
-
-  /* Calculate the set of non-trapping memory accesses.  */
-  nontrap = get_non_trapping ();
-
-  /* Search every basic block for COND_EXPR we may be able to optimize.
-
-     We walk the blocks in order that guarantees that a block with
-     a single predecessor is processed before the predecessor.
-     This ensures that we collapse inner ifs before visiting the
-     outer ones, and also that we do not try to visit a removed
-     block.  */
-  bb_order = single_pred_before_succ_order ();
-  n = n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS;
-
-  for (i = 0; i < n; i++)
-    {
-      basic_block bb1, bb2;
-      edge e1, e2;
-      bool diamond_p = false;
-
-      bb = bb_order[i];
-
-      /* Check to see if the last statement is a GIMPLE_COND.  */
-      gcond *cond_stmt = safe_dyn_cast <gcond *> (*gsi_last_bb (bb));
-      if (!cond_stmt)
-	continue;
-
-      e1 = EDGE_SUCC (bb, 0);
-      bb1 = e1->dest;
-      e2 = EDGE_SUCC (bb, 1);
-      bb2 = e2->dest;
-
-      /* We cannot do the optimization on abnormal edges.  */
-      if ((e1->flags & EDGE_ABNORMAL) != 0
-	  || (e2->flags & EDGE_ABNORMAL) != 0)
-	continue;
-
-      /* If either bb1's succ or bb2 or bb2's succ is non NULL.  */
-      if (EDGE_COUNT (bb1->succs) == 0
-	  || EDGE_COUNT (bb2->succs) == 0)
-	continue;
-
-      /* Find the bb which is the fall through to the other.  */
-      if (EDGE_SUCC (bb1, 0)->dest == bb2)
-	;
-      else if (EDGE_SUCC (bb2, 0)->dest == bb1)
-	{
-	  std::swap (bb1, bb2);
-	  std::swap (e1, e2);
-	}
-      else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
-	       && single_succ_p (bb2))
-	{
-	  diamond_p = true;
-	  e2 = EDGE_SUCC (bb2, 0);
-	  /* Make sure bb2 is just a fall through. */
-	  if ((e2->flags & EDGE_FALLTHRU) == 0)
-	    continue;
-	}
-      else
-	continue;
-
-      e1 = EDGE_SUCC (bb1, 0);
-
-      /* Make sure that bb1 is just a fall through.  */
-      if (!single_succ_p (bb1)
-	  || (e1->flags & EDGE_FALLTHRU) == 0)
-	continue;
-
-      if (diamond_p)
-	{
-	  basic_block bb3 = e1->dest;
-
-	  /* Only handle sinking of store from 2 bbs only,
-	     The middle bbs don't need to come from the
-	     if always since we are sinking rather than
-	     hoisting. */
-	  if (EDGE_COUNT (bb3->preds) != 2)
-	    continue;
-	  if (cond_if_else_store_replacement (bb1, bb2, bb3))
-	    cfgchanged = true;
-	  continue;
-	}
-
-      /* Also make sure that bb1 only have one predecessor and that it
-	 is bb.  */
-      if (!single_pred_p (bb1)
-	  || single_pred (bb1) != bb)
-	continue;
-
-      /* bb1 is the middle block, bb2 the join block, bb the split block,
-	 e1 the fallthrough edge from bb1 to bb2.  We can't do the
-	 optimization if the join block has more than two predecessors.  */
-      if (EDGE_COUNT (bb2->preds) > 2)
-	continue;
-      if (cond_store_replacement (bb1, bb2, e1, e2, nontrap))
-	cfgchanged = true;
-    }
-
-  free (bb_order);
-
-  delete nontrap;
-  /* If the CFG has changed, we should cleanup the CFG.  */
-  if (cfgchanged)
-    {
-      /* In cond-store replacement we have added some loads on edges
-         and new VOPS (as we moved the store, and created a load).  */
-      gsi_commit_edge_inserts ();
-      return TODO_cleanup_cfg | TODO_update_ssa_only_virtuals;
-    }
-  return 0;
-}
-
 /* Replace PHI node element whose edge is E in block BB with variable NEW.
    Remove the edge from COND_BLOCK which does not lead to BB (COND_BLOCK
    is known to have two edges, one of which must reach BB).  */
@@ -4403,13 +4274,130 @@ make_pass_cselim (gcc::context *ctxt)
 unsigned int
 pass_cselim::execute (function *)
 {
-  unsigned todo;
+  basic_block bb;
+  basic_block *bb_order;
+  unsigned n, i;
+  bool cfgchanged = false;
+  hash_set<tree> *nontrap = 0;
+  unsigned todo = 0;
+
   /* ???  We are not interested in loop related info, but the following
      will create it, ICEing as we didn't init loops with pre-headers.
      An interfacing issue of find_data_references_in_bb.  */
   loop_optimizer_init (LOOPS_NORMAL);
   scev_initialize ();
-  todo = store_elim_worker ();
+
+  calculate_dominance_info (CDI_DOMINATORS);
+
+  /* Calculate the set of non-trapping memory accesses.  */
+  nontrap = get_non_trapping ();
+
+  /* Search every basic block for COND_EXPR we may be able to optimize.
+
+     We walk the blocks in order that guarantees that a block with
+     a single predecessor is processed before the predecessor.
+     This ensures that we collapse inner ifs before visiting the
+     outer ones, and also that we do not try to visit a removed
+     block.  */
+  bb_order = single_pred_before_succ_order ();
+  n = n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS;
+
+  for (i = 0; i < n; i++)
+    {
+      basic_block bb1, bb2;
+      edge e1, e2;
+      bool diamond_p = false;
+
+      bb = bb_order[i];
+
+      /* Check to see if the last statement is a GIMPLE_COND.  */
+      gcond *cond_stmt = safe_dyn_cast <gcond *> (*gsi_last_bb (bb));
+      if (!cond_stmt)
+	continue;
+
+      e1 = EDGE_SUCC (bb, 0);
+      bb1 = e1->dest;
+      e2 = EDGE_SUCC (bb, 1);
+      bb2 = e2->dest;
+
+      /* We cannot do the optimization on abnormal edges.  */
+      if ((e1->flags & EDGE_ABNORMAL) != 0
+	  || (e2->flags & EDGE_ABNORMAL) != 0)
+	continue;
+
+      /* If either bb1's succ or bb2 or bb2's succ is non NULL.  */
+      if (EDGE_COUNT (bb1->succs) == 0
+	  || EDGE_COUNT (bb2->succs) == 0)
+	continue;
+
+      /* Find the bb which is the fall through to the other.  */
+      if (EDGE_SUCC (bb1, 0)->dest == bb2)
+	;
+      else if (EDGE_SUCC (bb2, 0)->dest == bb1)
+	{
+	  std::swap (bb1, bb2);
+	  std::swap (e1, e2);
+	}
+      else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
+	       && single_succ_p (bb2))
+	{
+	  diamond_p = true;
+	  e2 = EDGE_SUCC (bb2, 0);
+	  /* Make sure bb2 is just a fall through. */
+	  if ((e2->flags & EDGE_FALLTHRU) == 0)
+	    continue;
+	}
+      else
+	continue;
+
+      e1 = EDGE_SUCC (bb1, 0);
+
+      /* Make sure that bb1 is just a fall through.  */
+      if (!single_succ_p (bb1)
+	  || (e1->flags & EDGE_FALLTHRU) == 0)
+	continue;
+
+      if (diamond_p)
+	{
+	  basic_block bb3 = e1->dest;
+
+	  /* Only handle sinking of store from 2 bbs only,
+	     The middle bbs don't need to come from the
+	     if always since we are sinking rather than
+	     hoisting. */
+	  if (EDGE_COUNT (bb3->preds) != 2)
+	    continue;
+	  if (cond_if_else_store_replacement (bb1, bb2, bb3))
+	    cfgchanged = true;
+	  continue;
+	}
+
+      /* Also make sure that bb1 only have one predecessor and that it
+	 is bb.  */
+      if (!single_pred_p (bb1)
+	  || single_pred (bb1) != bb)
+	continue;
+
+      /* bb1 is the middle block, bb2 the join block, bb the split block,
+	 e1 the fallthrough edge from bb1 to bb2.  We can't do the
+	 optimization if the join block has more than two predecessors.  */
+      if (EDGE_COUNT (bb2->preds) > 2)
+	continue;
+      if (cond_store_replacement (bb1, bb2, e1, e2, nontrap))
+	cfgchanged = true;
+    }
+
+  free (bb_order);
+
+  delete nontrap;
+  /* If the CFG has changed, we should cleanup the CFG.  */
+  if (cfgchanged)
+    {
+      /* In cond-store replacement we have added some loads on edges
+	  and new VOPS (as we moved the store, and created a load).  */
+      gsi_commit_edge_inserts ();
+      todo = TODO_cleanup_cfg | TODO_update_ssa_only_virtuals;
+    }
   scev_finalize ();
   loop_optimizer_finalize ();
   return todo;
-- 
2.39.1



* [PATCH 4/7] MIN/MAX should be treated similar as comparisons for trapping
  2023-04-24 21:30 [PATCH 0/7] Some more phiopt cleanups and double minmax to match Andrew Pinski
                   ` (2 preceding siblings ...)
  2023-04-24 21:30 ` [PATCH 3/7] PHIOPT: Move store_elim_worker into pass_cselim::execute Andrew Pinski
@ 2023-04-24 21:30 ` Andrew Pinski
  2023-04-27 10:49   ` Richard Biener
  2023-04-24 21:30 ` [PATCH 5/7] PHIOPT: Allow MIN/MAX to have up to 2 MIN/MAX expressions for early phiopt Andrew Pinski
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Andrew Pinski @ 2023-04-24 21:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Pinski

While looking into moving optimizations from minmax_replacement
in phiopt to match.pd, I noticed that min/max were considered
trapping even if -ffinite-math-only was being used. This changes
those expressions to be treated like comparisons so that they are
not considered trapping if -ffinite-math-only is on.
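
For example (a minimal sketch, assuming -ffinite-math-only; the function
name is just for illustration), the select below can become a MIN_EXPR,
and that MIN_EXPR should be no more trapping than the comparison it was
created from:

  double
  dmin (double a, double b)
  {
    /* With -ffinite-math-only (no NaNs) neither the comparison nor
       the resulting MIN_EXPR can trap.  */
    return a < b ? a : b;
  }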

OK? Bootstrapped and tested with no regressions on x86_64-linux-gnu.

gcc/ChangeLog:

	* rtlanal.cc (may_trap_p_1): Treat SMIN/SMAX the same as
	COMPARISON.
	* tree-eh.cc (operation_could_trap_helper_p): Treat
	MIN_EXPR/MAX_EXPR the same as other comparisons.
---
 gcc/rtlanal.cc | 3 +++
 gcc/tree-eh.cc | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
index c96a88cebf1..b7948ecfad1 100644
--- a/gcc/rtlanal.cc
+++ b/gcc/rtlanal.cc
@@ -3204,6 +3204,9 @@ may_trap_p_1 (const_rtx x, unsigned flags)
     case LT:
     case LTGT:
     case COMPARE:
+    /* Treat min/max similar as comparisons.  */
+    case SMIN:
+    case SMAX:
       /* Some floating point comparisons may trap.  */
       if (!flag_trapping_math)
 	break;
diff --git a/gcc/tree-eh.cc b/gcc/tree-eh.cc
index 41cf57d2b30..dbaa27d95c5 100644
--- a/gcc/tree-eh.cc
+++ b/gcc/tree-eh.cc
@@ -2490,6 +2490,9 @@ operation_could_trap_helper_p (enum tree_code op,
     case GT_EXPR:
     case GE_EXPR:
     case LTGT_EXPR:
+    /* MIN/MAX similar as LT/LE/GT/GE. */
+    case MIN_EXPR:
+    case MAX_EXPR:
       /* Some floating point comparisons may trap.  */
       return honor_nans;
 
-- 
2.39.1



* [PATCH 5/7] PHIOPT: Allow MIN/MAX to have up to 2 MIN/MAX expressions for early phiopt
  2023-04-24 21:30 [PATCH 0/7] Some more phiopt cleanups and double minmax to match Andrew Pinski
                   ` (3 preceding siblings ...)
  2023-04-24 21:30 ` [PATCH 4/7] MIN/MAX should be treated similar as comparisons for trapping Andrew Pinski
@ 2023-04-24 21:30 ` Andrew Pinski
  2023-04-27 10:51   ` Richard Biener
  2023-04-24 21:30 ` [PATCH 6/7] MATCH: Factor out code that for min max detection with constants Andrew Pinski
  2023-04-24 21:30 ` [PATCH 7/7] MATCH: Add patterns from phiopt's minmax_replacement Andrew Pinski
  6 siblings, 1 reply; 15+ messages in thread
From: Andrew Pinski @ 2023-04-24 21:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Pinski

In the early PHIOPT mode, the original minmax_replacement would
replace a PHI node with up to 2 min/max expressions in some cases;
this allows the match-and-simplify path to do that too.
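
For illustration (a hypothetical example, not from the testsuite),
simplifying the PHI for the function below yields MIN (MIN (a, b), c),
i.e. the resulting sequence contains a second MIN_EXPR in addition to
the top-level one, which the early phiopt check previously rejected:

  int
  f (int a, int b, int c)
  {
    int x = a < c ? a : c;   /* MIN (a, c) */
    int y = b < c ? b : c;   /* MIN (b, c) */
    return a < b ? x : y;    /* simplifies to MIN (MIN (a, b), c) */
  }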

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

	* tree-ssa-phiopt.cc (phiopt_early_allow): Allow for
	up to 2 min/max expressions in the sequence/match code.
---
 gcc/tree-ssa-phiopt.cc | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index fb2d2c9fc1a..de1896aa91a 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -542,9 +542,23 @@ phiopt_early_allow (gimple_seq &seq, gimple_match_op &op)
     return false;
   tree_code code = (tree_code)op.code;
 
-  /* For non-empty sequence, only allow one statement.  */
+  /* For a non-empty sequence, only allow one statement,
+     except for MIN/MAX where up to 2 statements are allowed,
+     each of them a MIN/MAX.  */
   if (!gimple_seq_empty_p (seq))
     {
+      if (code == MIN_EXPR || code == MAX_EXPR)
+	{
+	  if (!gimple_seq_singleton_p (seq))
+	    return false;
+
+	  gimple *stmt = gimple_seq_first_stmt (seq);
+	  /* Only allow assignments.  */
+	  if (!is_gimple_assign (stmt))
+	    return false;
+	  code = gimple_assign_rhs_code (stmt);
+	  return code == MIN_EXPR || code == MAX_EXPR;
+	}
       /* Check to make sure op was already a SSA_NAME.  */
       if (code != SSA_NAME)
 	return false;
-- 
2.39.1



* [PATCH 6/7] MATCH: Factor out code that for min max detection with constants
  2023-04-24 21:30 [PATCH 0/7] Some more phiopt cleanups and double minmax to match Andrew Pinski
                   ` (4 preceding siblings ...)
  2023-04-24 21:30 ` [PATCH 5/7] PHIOPT: Allow MIN/MAX to have up to 2 MIN/MAX expressions for early phiopt Andrew Pinski
@ 2023-04-24 21:30 ` Andrew Pinski
  2023-04-25 10:45   ` Mikael Morin
  2023-04-24 21:30 ` [PATCH 7/7] MATCH: Add patterns from phiopt's minmax_replacement Andrew Pinski
  6 siblings, 1 reply; 15+ messages in thread
From: Andrew Pinski @ 2023-04-24 21:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Pinski

This factors out some of the min/max detection code from match.pd
into a function so it can be reused in other places. It is mainly
used to detect the conversion of >= to > (and similar), where the
integer constants differ by one.
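
A small sketch of the off-by-one case the helper handles (a hypothetical
example): x <= 4 is the same as x < 5, so the select below is really
MIN (x, 5):

  int
  f (int x)
  {
    /* x <= 5 - 1 is equivalent to x < 5, so this is MIN (x, 5).  */
    return x <= 4 ? x : 5;
  }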

OK? Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

	* match.pd: Factor out the deciding the min/max from
	the "(cond (cmp (convert1? x) c1) (convert2? x) c2)"
	pattern to ...
	* fold-const.cc (minmax_from_comparison): this new function.
	* fold-const.h (minmax_from_comparison): New prototype.
---
 gcc/fold-const.cc | 43 +++++++++++++++++++++++++++++++++++++++++++
 gcc/fold-const.h  |  3 +++
 gcc/match.pd      | 29 +----------------------------
 3 files changed, 47 insertions(+), 28 deletions(-)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 3b397ae2941..b8add24f874 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -150,6 +150,49 @@ static tree fold_convert_const (enum tree_code, tree, tree);
 static tree fold_view_convert_expr (tree, tree);
 static tree fold_negate_expr (location_t, tree);
 
+/* This is a helper function to detect min/max for some operands of COND_EXPR.
+   The form is "(EXP0 CMP EXP1) ? EXP2 : EXP3". */
+tree_code
+minmax_from_comparison (tree_code cmp, tree exp0, tree exp1, tree exp2, tree exp3)
+{
+  enum tree_code code = ERROR_MARK;
+
+  if (HONOR_NANS (exp0) || HONOR_SIGNED_ZEROS (exp0))
+    return ERROR_MARK;
+
+  if (!operand_equal_p (exp0, exp2))
+    return ERROR_MARK;
+
+  if (TREE_CODE (exp1) == INTEGER_CST && TREE_CODE (exp1) == INTEGER_CST
+      && wi::to_widest (exp1) == (wi::to_widest (exp3) - 1))
+    {
+      /* X <= Y - 1 equals to X < Y.  */
+      if (cmp == LE_EXPR)
+	code = LT_EXPR;
+      /* X > Y - 1 equals to X >= Y.  */
+      if (cmp == GT_EXPR)
+	code = GE_EXPR;
+    }
+  if (TREE_CODE (exp3) == INTEGER_CST && TREE_CODE (exp1) == INTEGER_CST
+      && wi::to_widest (exp1) == (wi::to_widest (exp3) + 1))
+    {
+     /* X < Y + 1 equals to X <= Y.  */
+     if (cmp == LT_EXPR)
+	code = LE_EXPR;
+    /* X >= Y + 1 equals to X > Y.  */
+    if (cmp == GE_EXPR)
+	code = GT_EXPR;
+    }
+  if (code != ERROR_MARK
+      || operand_equal_p (exp1, exp3))
+    {
+      if (cmp == LT_EXPR || cmp == LE_EXPR)
+	code = MIN_EXPR;
+      if (cmp == GT_EXPR || cmp == GE_EXPR)
+	code = MAX_EXPR;
+    }
+  return code;
+}
 
 /* Return EXPR_LOCATION of T if it is not UNKNOWN_LOCATION.
    Otherwise, return LOC.  */
diff --git a/gcc/fold-const.h b/gcc/fold-const.h
index 56ecaa87fc6..b828badc42f 100644
--- a/gcc/fold-const.h
+++ b/gcc/fold-const.h
@@ -246,6 +246,9 @@ extern tree fold_build_pointer_plus_hwi_loc (location_t loc, tree ptr, HOST_WIDE
 #define fold_build_pointer_plus_hwi(p,o) \
 	fold_build_pointer_plus_hwi_loc (UNKNOWN_LOCATION, p, o)
 
+extern tree_code minmax_from_comparison (tree_code, tree, tree,
+					 tree, tree);
+
 /* In gimple-fold.cc.  */
 extern void clear_type_padding_in_mask (tree, unsigned char *);
 extern bool clear_padding_type_may_have_padding_p (tree);
diff --git a/gcc/match.pd b/gcc/match.pd
index e89ba57e30b..6d3aaf45a93 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4677,34 +4677,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 		     || TYPE_SIGN (c2_type) == TYPE_SIGN (from_type)))))
        {
 	 if (cmp != EQ_EXPR)
-	   {
-	     if (wi::to_widest (@3) == (wi::to_widest (@2) - 1))
-	       {
-		 /* X <= Y - 1 equals to X < Y.  */
-		 if (cmp == LE_EXPR)
-		   code = LT_EXPR;
-		 /* X > Y - 1 equals to X >= Y.  */
-		 if (cmp == GT_EXPR)
-		   code = GE_EXPR;
-	       }
-	     if (wi::to_widest (@3) == (wi::to_widest (@2) + 1))
-	       {
-		 /* X < Y + 1 equals to X <= Y.  */
-		 if (cmp == LT_EXPR)
-		   code = LE_EXPR;
-		 /* X >= Y + 1 equals to X > Y.  */
-		 if (cmp == GE_EXPR)
-		   code = GT_EXPR;
-	       }
-	     if (code != ERROR_MARK
-		 || wi::to_widest (@2) == wi::to_widest (@3))
-	       {
-		 if (cmp == LT_EXPR || cmp == LE_EXPR)
-		   code = MIN_EXPR;
-		 if (cmp == GT_EXPR || cmp == GE_EXPR)
-		   code = MAX_EXPR;
-	       }
-	   }
+	   code = minmax_from_comparison (cmp, @1, @3, @1, @2);
 	 /* Can do A == C1 ? A : C2  ->  A == C1 ? C1 : C2?  */
 	 else if (int_fits_type_p (@3, from_type))
 	   code = EQ_EXPR;
-- 
2.39.1



* [PATCH 7/7] MATCH: Add patterns from phiopt's minmax_replacement
  2023-04-24 21:30 [PATCH 0/7] Some more phiopt cleanups and double minmax to match Andrew Pinski
                   ` (5 preceding siblings ...)
  2023-04-24 21:30 ` [PATCH 6/7] MATCH: Factor out code that for min max detection with constants Andrew Pinski
@ 2023-04-24 21:30 ` Andrew Pinski
  2023-04-27 10:58   ` Richard Biener
  6 siblings, 1 reply; 15+ messages in thread
From: Andrew Pinski @ 2023-04-24 21:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Pinski

This adds a few patterns from phiopt's minmax_replacement
for (A CMP B) ? MIN/MAX<A, C> : MIN/MAX<B, C>.
It is progress toward removing minmax_replacement from phiopt.
There are still some more cases dealing with constants on the
edges (0/INT_MAX) left to handle in match.
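
As a sketch of the source-level shape the new patterns cover (a
hypothetical example, here the MAX flavour):

  unsigned
  g (unsigned a, unsigned b, unsigned c)
  {
    unsigned x = a > c ? a : c;   /* MAX (a, c) */
    unsigned y = b > c ? b : c;   /* MAX (b, c) */
    return a > b ? x : y;         /* becomes MAX (MAX (a, b), c) */
  }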

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

	* match.pd: Add patterns for
	"(A CMP B) ? MIN/MAX<A, C> : MIN/MAX <B, C>".

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/minmax-16.c: Update testcase slightly.
	* gcc.dg/tree-ssa/split-path-1.c: Also disable tree-loop-if-convert
	as that now does the combining.
---
 gcc/match.pd                                 | 16 ++++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c    | 10 ++++++++--
 gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c |  3 ++-
 3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index 6d3aaf45a93..5d5aae24509 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4843,6 +4843,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
        (convert @c0))))))))
 #endif
 
+/* These was part of minmax phiopt.  */
+/* Optimize (a CMP b) ? minmax<a, c> : minmax<b, c>
+   to minmax<min/max<a, b>, c> */
+(for minmax (min max)
+ (for cmp (lt le gt ge)
+  (simplify
+   (cond (cmp @1 @3) (minmax:c @1 @4) (minmax:c @2 @4))
+   (with
+    {
+      tree_code code = minmax_from_comparison (cmp, @1, @2, @1, @3);
+    }
+    (if (code == MIN_EXPR)
+     (minmax (min @1 @2) @4)
+     (if (code == MAX_EXPR)
+      (minmax (max @1 @2) @4)))))))
+
 /* X != C1 ? -X : C2 simplifies to -X when -C1 == C2.  */
 (simplify
  (cond (ne @0 INTEGER_CST@1) (negate@3 @0) INTEGER_CST@2)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
index 4febd092d83..623b12b3f74 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O -fdump-tree-phiopt -g" } */
+/* { dg-options "-O -fdump-tree-phiopt -fdump-tree-optimized -g" } */
 
 #include <stdint.h>
 
@@ -25,5 +25,11 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
+/* After phiopt1 there should really be only 3 MIN_EXPRs in the IR (including
+   debug statements).  But since phiopt does not always clean up the CFG, the
+   PHI might still reference the moved statement from the alternative bb.
+   In the end, DCE removes that statement and the other debug statements,
+   leaving only 2 MIN_EXPRs, so check that too.  */
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 4 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
index 902dde44a50..b670dee8d10 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
@@ -1,5 +1,6 @@
 /* { dg-do run } */
-/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
+/* Note both PHI-OPT and the loop if-conversion pass convert the inner if to be branchless using min/max. */
+/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt -fno-tree-loop-if-convert" } */
 
 #include <stdio.h>
 #include <stdlib.h>
-- 
2.39.1



* Re: [PATCH 6/7] MATCH: Factor out code that for min max detection with constants
  2023-04-24 21:30 ` [PATCH 6/7] MATCH: Factor out code that for min max detection with constants Andrew Pinski
@ 2023-04-25 10:45   ` Mikael Morin
  0 siblings, 0 replies; 15+ messages in thread
From: Mikael Morin @ 2023-04-25 10:45 UTC (permalink / raw)
  To: Andrew Pinski, gcc-patches

Hello,

> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 3b397ae2941..b8add24f874 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -150,6 +150,49 @@ static tree fold_convert_const (enum tree_code, tree, tree);
>   static tree fold_view_convert_expr (tree, tree);
>   static tree fold_negate_expr (location_t, tree);
>   
> +/* This is a helper function to detect min/max for some operands of COND_EXPR.
> +   The form is "(EXP0 CMP EXP1) ? EXP2 : EXP3". */
> +tree_code
> +minmax_from_comparison (tree_code cmp, tree exp0, tree exp1, tree exp2, tree exp3)
> +{
> +  enum tree_code code = ERROR_MARK;
> +
> +  if (HONOR_NANS (exp0) || HONOR_SIGNED_ZEROS (exp0))
> +    return ERROR_MARK;
> +
> +  if (!operand_equal_p (exp0, exp2))
> +    return ERROR_MARK;
> +
> +  if (TREE_CODE (exp1) == INTEGER_CST && TREE_CODE (exp1) == INTEGER_CST

one of these conditions should probably be TREE_CODE (exp3) == INTEGER_CST?

> +      && wi::to_widest (exp1) == (wi::to_widest (exp3) - 1))
> +    {
> +      /* X <= Y - 1 equals to X < Y.  */
> +      if (cmp == LE_EXPR)
> +	code = LT_EXPR;
> +      /* X > Y - 1 equals to X >= Y.  */
> +      if (cmp == GT_EXPR)
> +	code = GE_EXPR;
> +    }
> +  if (TREE_CODE (exp3) == INTEGER_CST && TREE_CODE (exp1) == INTEGER_CST
> +      && wi::to_widest (exp1) == (wi::to_widest (exp3) + 1))
> +    {
> +     /* X < Y + 1 equals to X <= Y.  */
> +     if (cmp == LT_EXPR)
> +	code = LE_EXPR;
> +    /* X >= Y + 1 equals to X > Y.  */
> +    if (cmp == GE_EXPR)
> +	code = GT_EXPR;
> +    }


Mikael



* Re: [PATCH 4/7] MIN/MAX should be treated similar as comparisons for trapping
  2023-04-24 21:30 ` [PATCH 4/7] MIN/MAX should be treated similar as comparisons for trapping Andrew Pinski
@ 2023-04-27 10:49   ` Richard Biener
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Biener @ 2023-04-27 10:49 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-patches

On Mon, Apr 24, 2023 at 11:31 PM Andrew Pinski via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> While looking into moving optimizations from minmax_replacement
> in phiopt to match.pd, I noticed that min/max were considered
> trapping even if -ffinite-math-only was being used. This changes
> those expressions to be treated like comparisons so that they are
> not considered trapping if -ffinite-math-only is on.
>
> OK? Bootstrapped and tested with no regressions on x86_64-linux-gnu.

OK.

> gcc/ChangeLog:
>
>         * rtlanal.cc (may_trap_p_1): Treat SMIN/SMAX the same as
>         COMPARISON.
>         * tree-eh.cc (operation_could_trap_helper_p): Treat
>         MIN_EXPR/MAX_EXPR the same as other comparisons.
> ---
>  gcc/rtlanal.cc | 3 +++
>  gcc/tree-eh.cc | 3 +++
>  2 files changed, 6 insertions(+)
>
> diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
> index c96a88cebf1..b7948ecfad1 100644
> --- a/gcc/rtlanal.cc
> +++ b/gcc/rtlanal.cc
> @@ -3204,6 +3204,9 @@ may_trap_p_1 (const_rtx x, unsigned flags)
>      case LT:
>      case LTGT:
>      case COMPARE:
> +    /* Treat min/max similar as comparisons.  */
> +    case SMIN:
> +    case SMAX:
>        /* Some floating point comparisons may trap.  */
>        if (!flag_trapping_math)
>         break;
> diff --git a/gcc/tree-eh.cc b/gcc/tree-eh.cc
> index 41cf57d2b30..dbaa27d95c5 100644
> --- a/gcc/tree-eh.cc
> +++ b/gcc/tree-eh.cc
> @@ -2490,6 +2490,9 @@ operation_could_trap_helper_p (enum tree_code op,
>      case GT_EXPR:
>      case GE_EXPR:
>      case LTGT_EXPR:
> +    /* MIN/MAX similar as LT/LE/GT/GE. */
> +    case MIN_EXPR:
> +    case MAX_EXPR:
>        /* Some floating point comparisons may trap.  */
>        return honor_nans;
>
> --
> 2.39.1
>


* Re: [PATCH 3/7] PHIOPT: Move store_elim_worker into pass_cselim::execute
  2023-04-24 21:30 ` [PATCH 3/7] PHIOPT: Move store_elim_worker into pass_cselim::execute Andrew Pinski
@ 2023-04-27 10:50   ` Richard Biener
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Biener @ 2023-04-27 10:50 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-patches

On Mon, Apr 24, 2023 at 11:31 PM Andrew Pinski via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> This simple patch moves the body of store_elim_worker
> directly into pass_cselim::execute.
>
> It also removes the now-unneeded prototypes.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

OK.

> gcc/ChangeLog:
>
>         * tree-ssa-phiopt.cc (cond_store_replacement): Remove
>         prototype.
>         (cond_if_else_store_replacement): Likewise.
>         (get_non_trapping): Likewise.
>         (store_elim_worker): Move into ...
>         (pass_cselim::execute): This.
> ---
>  gcc/tree-ssa-phiopt.cc | 250 ++++++++++++++++++++---------------------
>  1 file changed, 119 insertions(+), 131 deletions(-)
>
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index d232fd9b551..fb2d2c9fc1a 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -55,11 +55,6 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-ssa-propagate.h"
>  #include "tree-ssa-dce.h"
>
> -static bool cond_store_replacement (basic_block, basic_block, edge, edge,
> -                                   hash_set<tree> *);
> -static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
> -static hash_set<tree> * get_non_trapping ();
> -
>  /* Return the singleton PHI in the SEQ of PHIs for edges E0 and E1. */
>
>  static gphi *
> @@ -87,130 +82,6 @@ single_non_singleton_phi_for_edges (gimple_seq seq, edge e0, edge e1)
>    return phi;
>  }
>
> -/* The core routine of conditional store replacement.  */
> -static unsigned int
> -store_elim_worker (void)
> -{
> -  basic_block bb;
> -  basic_block *bb_order;
> -  unsigned n, i;
> -  bool cfgchanged = false;
> -  hash_set<tree> *nontrap = 0;
> -
> -  calculate_dominance_info (CDI_DOMINATORS);
> -
> -  /* Calculate the set of non-trapping memory accesses.  */
> -  nontrap = get_non_trapping ();
> -
> -  /* Search every basic block for COND_EXPR we may be able to optimize.
> -
> -     We walk the blocks in order that guarantees that a block with
> -     a single predecessor is processed before the predecessor.
> -     This ensures that we collapse inner ifs before visiting the
> -     outer ones, and also that we do not try to visit a removed
> -     block.  */
> -  bb_order = single_pred_before_succ_order ();
> -  n = n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS;
> -
> -  for (i = 0; i < n; i++)
> -    {
> -      basic_block bb1, bb2;
> -      edge e1, e2;
> -      bool diamond_p = false;
> -
> -      bb = bb_order[i];
> -
> -      /* Check to see if the last statement is a GIMPLE_COND.  */
> -      gcond *cond_stmt = safe_dyn_cast <gcond *> (*gsi_last_bb (bb));
> -      if (!cond_stmt)
> -       continue;
> -
> -      e1 = EDGE_SUCC (bb, 0);
> -      bb1 = e1->dest;
> -      e2 = EDGE_SUCC (bb, 1);
> -      bb2 = e2->dest;
> -
> -      /* We cannot do the optimization on abnormal edges.  */
> -      if ((e1->flags & EDGE_ABNORMAL) != 0
> -         || (e2->flags & EDGE_ABNORMAL) != 0)
> -       continue;
> -
> -      /* If either bb1's succ or bb2 or bb2's succ is non NULL.  */
> -      if (EDGE_COUNT (bb1->succs) == 0
> -         || EDGE_COUNT (bb2->succs) == 0)
> -       continue;
> -
> -      /* Find the bb which is the fall through to the other.  */
> -      if (EDGE_SUCC (bb1, 0)->dest == bb2)
> -       ;
> -      else if (EDGE_SUCC (bb2, 0)->dest == bb1)
> -       {
> -         std::swap (bb1, bb2);
> -         std::swap (e1, e2);
> -       }
> -      else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> -              && single_succ_p (bb2))
> -       {
> -         diamond_p = true;
> -         e2 = EDGE_SUCC (bb2, 0);
> -         /* Make sure bb2 is just a fall through. */
> -         if ((e2->flags & EDGE_FALLTHRU) == 0)
> -           continue;
> -       }
> -      else
> -       continue;
> -
> -      e1 = EDGE_SUCC (bb1, 0);
> -
> -      /* Make sure that bb1 is just a fall through.  */
> -      if (!single_succ_p (bb1)
> -         || (e1->flags & EDGE_FALLTHRU) == 0)
> -       continue;
> -
> -      if (diamond_p)
> -       {
> -         basic_block bb3 = e1->dest;
> -
> -         /* Only handle sinking of store from 2 bbs only,
> -            The middle bbs don't need to come from the
> -            if always since we are sinking rather than
> -            hoisting. */
> -         if (EDGE_COUNT (bb3->preds) != 2)
> -           continue;
> -         if (cond_if_else_store_replacement (bb1, bb2, bb3))
> -           cfgchanged = true;
> -         continue;
> -       }
> -
> -      /* Also make sure that bb1 only have one predecessor and that it
> -        is bb.  */
> -      if (!single_pred_p (bb1)
> -         || single_pred (bb1) != bb)
> -       continue;
> -
> -      /* bb1 is the middle block, bb2 the join block, bb the split block,
> -        e1 the fallthrough edge from bb1 to bb2.  We can't do the
> -        optimization if the join block has more than two predecessors.  */
> -      if (EDGE_COUNT (bb2->preds) > 2)
> -       continue;
> -      if (cond_store_replacement (bb1, bb2, e1, e2, nontrap))
> -       cfgchanged = true;
> -    }
> -
> -  free (bb_order);
> -
> -  delete nontrap;
> -  /* If the CFG has changed, we should cleanup the CFG.  */
> -  if (cfgchanged)
> -    {
> -      /* In cond-store replacement we have added some loads on edges
> -         and new VOPS (as we moved the store, and created a load).  */
> -      gsi_commit_edge_inserts ();
> -      return TODO_cleanup_cfg | TODO_update_ssa_only_virtuals;
> -    }
> -  return 0;
> -}
> -
>  /* Replace PHI node element whose edge is E in block BB with variable NEW.
>     Remove the edge from COND_BLOCK which does not lead to BB (COND_BLOCK
>     is known to have two edges, one of which must reach BB).  */
> @@ -4403,13 +4274,130 @@ make_pass_cselim (gcc::context *ctxt)
>  unsigned int
>  pass_cselim::execute (function *)
>  {
> -  unsigned todo;
> +  basic_block bb;
> +  basic_block *bb_order;
> +  unsigned n, i;
> +  bool cfgchanged = false;
> +  hash_set<tree> *nontrap = 0;
> +  unsigned todo = 0;
> +
>    /* ???  We are not interested in loop related info, but the following
>       will create it, ICEing as we didn't init loops with pre-headers.
>       An interfacing issue of find_data_references_in_bb.  */
>    loop_optimizer_init (LOOPS_NORMAL);
>    scev_initialize ();
> -  todo = store_elim_worker ();
> +
> +  calculate_dominance_info (CDI_DOMINATORS);
> +
> +  /* Calculate the set of non-trapping memory accesses.  */
> +  nontrap = get_non_trapping ();
> +
> +  /* Search every basic block for COND_EXPR we may be able to optimize.
> +
> +     We walk the blocks in order that guarantees that a block with
> +     a single predecessor is processed before the predecessor.
> +     This ensures that we collapse inner ifs before visiting the
> +     outer ones, and also that we do not try to visit a removed
> +     block.  */
> +  bb_order = single_pred_before_succ_order ();
> +  n = n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS;
> +
> +  for (i = 0; i < n; i++)
> +    {
> +      basic_block bb1, bb2;
> +      edge e1, e2;
> +      bool diamond_p = false;
> +
> +      bb = bb_order[i];
> +
> +      /* Check to see if the last statement is a GIMPLE_COND.  */
> +      gcond *cond_stmt = safe_dyn_cast <gcond *> (*gsi_last_bb (bb));
> +      if (!cond_stmt)
> +       continue;
> +
> +      e1 = EDGE_SUCC (bb, 0);
> +      bb1 = e1->dest;
> +      e2 = EDGE_SUCC (bb, 1);
> +      bb2 = e2->dest;
> +
> +      /* We cannot do the optimization on abnormal edges.  */
> +      if ((e1->flags & EDGE_ABNORMAL) != 0
> +         || (e2->flags & EDGE_ABNORMAL) != 0)
> +       continue;
> +
> +      /* If either bb1's succ or bb2 or bb2's succ is non NULL.  */
> +      if (EDGE_COUNT (bb1->succs) == 0
> +         || EDGE_COUNT (bb2->succs) == 0)
> +       continue;
> +
> +      /* Find the bb which is the fall through to the other.  */
> +      if (EDGE_SUCC (bb1, 0)->dest == bb2)
> +       ;
> +      else if (EDGE_SUCC (bb2, 0)->dest == bb1)
> +       {
> +         std::swap (bb1, bb2);
> +         std::swap (e1, e2);
> +       }
> +      else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> +              && single_succ_p (bb2))
> +       {
> +         diamond_p = true;
> +         e2 = EDGE_SUCC (bb2, 0);
> +         /* Make sure bb2 is just a fall through. */
> +         if ((e2->flags & EDGE_FALLTHRU) == 0)
> +           continue;
> +       }
> +      else
> +       continue;
> +
> +      e1 = EDGE_SUCC (bb1, 0);
> +
> +      /* Make sure that bb1 is just a fall through.  */
> +      if (!single_succ_p (bb1)
> +         || (e1->flags & EDGE_FALLTHRU) == 0)
> +       continue;
> +
> +      if (diamond_p)
> +       {
> +         basic_block bb3 = e1->dest;
> +
> +         /* Only handle sinking of store from 2 bbs only,
> +            The middle bbs don't need to come from the
> +            if always since we are sinking rather than
> +            hoisting. */
> +         if (EDGE_COUNT (bb3->preds) != 2)
> +           continue;
> +         if (cond_if_else_store_replacement (bb1, bb2, bb3))
> +           cfgchanged = true;
> +         continue;
> +       }
> +
> +      /* Also make sure that bb1 only have one predecessor and that it
> +        is bb.  */
> +      if (!single_pred_p (bb1)
> +         || single_pred (bb1) != bb)
> +       continue;
> +
> +      /* bb1 is the middle block, bb2 the join block, bb the split block,
> +        e1 the fallthrough edge from bb1 to bb2.  We can't do the
> +        optimization if the join block has more than two predecessors.  */
> +      if (EDGE_COUNT (bb2->preds) > 2)
> +       continue;
> +      if (cond_store_replacement (bb1, bb2, e1, e2, nontrap))
> +       cfgchanged = true;
> +    }
> +
> +  free (bb_order);
> +
> +  delete nontrap;
> +  /* If the CFG has changed, we should cleanup the CFG.  */
> +  if (cfgchanged)
> +    {
> +      /* In cond-store replacement we have added some loads on edges
> +         and new VOPS (as we moved the store, and created a load).  */
> +      gsi_commit_edge_inserts ();
> +      todo = TODO_cleanup_cfg | TODO_update_ssa_only_virtuals;
> +    }
>    scev_finalize ();
>    loop_optimizer_finalize ();
>    return todo;
> --
> 2.39.1
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/7] PHIOPT: Split out store elimination from phiopt
  2023-04-24 21:30 ` [PATCH 1/7] PHIOPT: Split out store elimination from phiopt Andrew Pinski
@ 2023-04-27 10:50   ` Richard Biener
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Biener @ 2023-04-27 10:50 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-patches

On Mon, Apr 24, 2023 at 11:31 PM Andrew Pinski via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Since the last cleanups, it has become easier to see
> that we should split the store elimination
> worker out of the tree_ssa_phiopt_worker function.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

OK.

> gcc/ChangeLog:
>
>         * tree-ssa-phiopt.cc (tree_ssa_phiopt_worker): Remove
>         do_store_elim argument and split that part out to ...
>         (store_elim_worker): This new function.
>         (pass_cselim::execute): Call store_elim_worker.
>         (pass_phiopt::execute): Update call to tree_ssa_phiopt_worker.
> ---
>  gcc/tree-ssa-phiopt.cc | 180 ++++++++++++++++++++++++++++-------------
>  1 file changed, 126 insertions(+), 54 deletions(-)
>
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index 4a3ab8efb71..7f47b32576b 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -104,27 +104,19 @@ single_non_singleton_phi_for_edges (gimple_seq seq, edge e0, edge e1)
>    return phi;
>  }
>
> -/* The core routine of conditional store replacement and normal
> -   phi optimizations.  Both share much of the infrastructure in how
> -   to match applicable basic block patterns.  DO_STORE_ELIM is true
> -   when we want to do conditional store replacement, false otherwise.
> +/* The core routine of phi optimizations.
>     DO_HOIST_LOADS is true when we want to hoist adjacent loads out
>     of diamond control flow patterns, false otherwise.  */
>  static unsigned int
> -tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
> +tree_ssa_phiopt_worker (bool do_hoist_loads, bool early_p)
>  {
>    basic_block bb;
>    basic_block *bb_order;
>    unsigned n, i;
>    bool cfgchanged = false;
> -  hash_set<tree> *nontrap = 0;
>
>    calculate_dominance_info (CDI_DOMINATORS);
>
> -  if (do_store_elim)
> -    /* Calculate the set of non-trapping memory accesses.  */
> -    nontrap = get_non_trapping ();
> -
>    /* Search every basic block for COND_EXPR we may be able to optimize.
>
>       We walk the blocks in order that guarantees that a block with
> @@ -148,7 +140,7 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
>        /* Check to see if the last statement is a GIMPLE_COND.  */
>        gcond *cond_stmt = safe_dyn_cast <gcond *> (*gsi_last_bb (bb));
>        if (!cond_stmt)
> -        continue;
> +       continue;
>
>        e1 = EDGE_SUCC (bb, 0);
>        bb1 = e1->dest;
> @@ -158,12 +150,12 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
>        /* We cannot do the optimization on abnormal edges.  */
>        if ((e1->flags & EDGE_ABNORMAL) != 0
>            || (e2->flags & EDGE_ABNORMAL) != 0)
> -       continue;
> +       continue;
>
>        /* If either bb1's succ or bb2 or bb2's succ is non NULL.  */
>        if (EDGE_COUNT (bb1->succs) == 0
>           || EDGE_COUNT (bb2->succs) == 0)
> -        continue;
> +       continue;
>
>        /* Find the bb which is the fall through to the other.  */
>        if (EDGE_SUCC (bb1, 0)->dest == bb2)
> @@ -192,39 +184,6 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
>           || (e1->flags & EDGE_FALLTHRU) == 0)
>         continue;
>
> -      if (do_store_elim)
> -       {
> -         if (diamond_p)
> -           {
> -             basic_block bb3 = e1->dest;
> -
> -             /* Only handle sinking of store from 2 bbs only,
> -                The middle bbs don't need to come from the
> -                if always since we are sinking rather than
> -                hoisting. */
> -             if (EDGE_COUNT (bb3->preds) != 2)
> -               continue;
> -             if (cond_if_else_store_replacement (bb1, bb2, bb3))
> -               cfgchanged = true;
> -             continue;
> -           }
> -
> -         /* Also make sure that bb1 only have one predecessor and that it
> -            is bb.  */
> -         if (!single_pred_p (bb1)
> -             || single_pred (bb1) != bb)
> -           continue;
> -
> -         /* bb1 is the middle block, bb2 the join block, bb the split block,
> -            e1 the fallthrough edge from bb1 to bb2.  We can't do the
> -            optimization if the join block has more than two predecessors.  */
> -         if (EDGE_COUNT (bb2->preds) > 2)
> -           continue;
> -         if (cond_store_replacement (bb1, bb2, e1, e2, nontrap))
> -           cfgchanged = true;
> -         continue;
> -       }
> -
>        if (diamond_p)
>         {
>           basic_block bb3 = e1->dest;
> @@ -322,18 +281,132 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool do_hoist_loads, bool early_p)
>
>    free (bb_order);
>
> -  if (do_store_elim)
> -    delete nontrap;
> +  if (cfgchanged)
> +    return TODO_cleanup_cfg;
> +  return 0;
> +}
> +
> +/* The core routine of conditional store replacement.  */
> +static unsigned int
> +store_elim_worker (void)
> +{
> +  basic_block bb;
> +  basic_block *bb_order;
> +  unsigned n, i;
> +  bool cfgchanged = false;
> +  hash_set<tree> *nontrap = 0;
> +
> +  calculate_dominance_info (CDI_DOMINATORS);
> +
> +  /* Calculate the set of non-trapping memory accesses.  */
> +  nontrap = get_non_trapping ();
> +
> +  /* Search every basic block for COND_EXPR we may be able to optimize.
> +
> +     We walk the blocks in order that guarantees that a block with
> +     a single predecessor is processed before the predecessor.
> +     This ensures that we collapse inner ifs before visiting the
> +     outer ones, and also that we do not try to visit a removed
> +     block.  */
> +  bb_order = single_pred_before_succ_order ();
> +  n = n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS;
> +
> +  for (i = 0; i < n; i++)
> +    {
> +      basic_block bb1, bb2;
> +      edge e1, e2;
> +      bool diamond_p = false;
> +
> +      bb = bb_order[i];
> +
> +      /* Check to see if the last statement is a GIMPLE_COND.  */
> +      gcond *cond_stmt = safe_dyn_cast <gcond *> (*gsi_last_bb (bb));
> +      if (!cond_stmt)
> +       continue;
> +
> +      e1 = EDGE_SUCC (bb, 0);
> +      bb1 = e1->dest;
> +      e2 = EDGE_SUCC (bb, 1);
> +      bb2 = e2->dest;
> +
> +      /* We cannot do the optimization on abnormal edges.  */
> +      if ((e1->flags & EDGE_ABNORMAL) != 0
> +         || (e2->flags & EDGE_ABNORMAL) != 0)
> +       continue;
> +
> +      /* If either bb1's succ or bb2 or bb2's succ is non NULL.  */
> +      if (EDGE_COUNT (bb1->succs) == 0
> +         || EDGE_COUNT (bb2->succs) == 0)
> +       continue;
> +
> +      /* Find the bb which is the fall through to the other.  */
> +      if (EDGE_SUCC (bb1, 0)->dest == bb2)
> +       ;
> +      else if (EDGE_SUCC (bb2, 0)->dest == bb1)
> +       {
> +         std::swap (bb1, bb2);
> +         std::swap (e1, e2);
> +       }
> +      else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> +              && single_succ_p (bb2))
> +       {
> +         diamond_p = true;
> +         e2 = EDGE_SUCC (bb2, 0);
> +         /* Make sure bb2 is just a fall through. */
> +         if ((e2->flags & EDGE_FALLTHRU) == 0)
> +           continue;
> +       }
> +      else
> +       continue;
> +
> +      e1 = EDGE_SUCC (bb1, 0);
> +
> +      /* Make sure that bb1 is just a fall through.  */
> +      if (!single_succ_p (bb1)
> +         || (e1->flags & EDGE_FALLTHRU) == 0)
> +       continue;
> +
> +      if (diamond_p)
> +       {
> +         basic_block bb3 = e1->dest;
> +
> +         /* Only handle sinking of store from 2 bbs only,
> +            The middle bbs don't need to come from the
> +            if always since we are sinking rather than
> +            hoisting. */
> +         if (EDGE_COUNT (bb3->preds) != 2)
> +           continue;
> +         if (cond_if_else_store_replacement (bb1, bb2, bb3))
> +           cfgchanged = true;
> +         continue;
> +       }
> +
> +      /* Also make sure that bb1 only have one predecessor and that it
> +        is bb.  */
> +      if (!single_pred_p (bb1)
> +         || single_pred (bb1) != bb)
> +       continue;
> +
> +      /* bb1 is the middle block, bb2 the join block, bb the split block,
> +        e1 the fallthrough edge from bb1 to bb2.  We can't do the
> +        optimization if the join block has more than two predecessors.  */
> +      if (EDGE_COUNT (bb2->preds) > 2)
> +       continue;
> +      if (cond_store_replacement (bb1, bb2, e1, e2, nontrap))
> +       cfgchanged = true;
> +    }
> +
> +  free (bb_order);
> +
> +  delete nontrap;
>    /* If the CFG has changed, we should cleanup the CFG.  */
> -  if (cfgchanged && do_store_elim)
> +  if (cfgchanged)
>      {
>        /* In cond-store replacement we have added some loads on edges
>           and new VOPS (as we moved the store, and created a load).  */
>        gsi_commit_edge_inserts ();
>        return TODO_cleanup_cfg | TODO_update_ssa_only_virtuals;
>      }
> -  else if (cfgchanged)
> -    return TODO_cleanup_cfg;
>    return 0;
>  }
>
> @@ -4257,8 +4330,7 @@ public:
>    bool gate (function *) final override { return flag_ssa_phiopt; }
>    unsigned int execute (function *) final override
>      {
> -      return tree_ssa_phiopt_worker (false,
> -                                    !early_p ? gate_hoist_loads () : false,
> +      return tree_ssa_phiopt_worker (!early_p ? gate_hoist_loads () : false,
>                                      early_p);
>      }
>
> @@ -4360,7 +4432,7 @@ pass_cselim::execute (function *)
>       An interfacing issue of find_data_references_in_bb.  */
>    loop_optimizer_init (LOOPS_NORMAL);
>    scev_initialize ();
> -  todo = tree_ssa_phiopt_worker (true, false, false);
> +  todo = store_elim_worker ();
>    scev_finalize ();
>    loop_optimizer_finalize ();
>    return todo;
> --
> 2.39.1
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/7] PHIOPT: Allow MIN/MAX to have up to 2 MIN/MAX expressions for early phiopt
  2023-04-24 21:30 ` [PATCH 5/7] PHIOPT: Allow MIN/MAX to have up to 2 MIN/MAX expressions for early phiopt Andrew Pinski
@ 2023-04-27 10:51   ` Richard Biener
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Biener @ 2023-04-27 10:51 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-patches

On Mon, Apr 24, 2023 at 11:33 PM Andrew Pinski via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> In the early PHIOPT mode, the original minmax_replacement would
> replace a PHI node with up to 2 min/max expressions in some cases;
> this patch allows for that too.
>
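> As a hypothetical illustration (not taken from the testsuite), a
> diamond like the following is the kind of case where the simplified
> form needs two MIN/MAX statements rather than one:
>
>   int
>   two_maxes (int a, int b, int c)
>   {
>     int x;
>     if (a > b)
>       x = a > c ? a : c;
>     else
>       x = b > c ? b : c;
>     /* The PHI for x can be simplified to MAX (MAX (a, b), c): one
>        MAX statement in the sequence plus the final MAX, which
>        phiopt_early_allow now accepts.  */
>     return x;
>   }
>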
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

OK.

> gcc/ChangeLog:
>
>         * tree-ssa-phiopt.cc (phiopt_early_allow): Allow for
>         up to 2 min/max expressions in the sequence/match code.
> ---
>  gcc/tree-ssa-phiopt.cc | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index fb2d2c9fc1a..de1896aa91a 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -542,9 +542,23 @@ phiopt_early_allow (gimple_seq &seq, gimple_match_op &op)
>      return false;
>    tree_code code = (tree_code)op.code;
>
> -  /* For non-empty sequence, only allow one statement.  */
> +  /* For a non-empty sequence, only allow one statement,
> +     except for MIN/MAX, where up to 2 statements are allowed,
> +     each of which must be a MIN/MAX.  */
>    if (!gimple_seq_empty_p (seq))
>      {
> +      if (code == MIN_EXPR || code == MAX_EXPR)
> +       {
> +         if (!gimple_seq_singleton_p (seq))
> +           return false;
> +
> +         gimple *stmt = gimple_seq_first_stmt (seq);
> +         /* Only allow assignments.  */
> +         if (!is_gimple_assign (stmt))
> +           return false;
> +         code = gimple_assign_rhs_code (stmt);
> +         return code == MIN_EXPR || code == MAX_EXPR;
> +       }
>        /* Check to make sure op was already a SSA_NAME.  */
>        if (code != SSA_NAME)
>         return false;
> --
> 2.39.1
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 7/7] MATCH: Add patterns from phiopt's minmax_replacement
  2023-04-24 21:30 ` [PATCH 7/7] MATCH: Add patterns from phiopt's minmax_replacement Andrew Pinski
@ 2023-04-27 10:58   ` Richard Biener
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Biener @ 2023-04-27 10:58 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-patches

On Mon, Apr 24, 2023 at 11:33 PM Andrew Pinski via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> This adds a few patterns from phiopt's minmax_replacement
> for (A CMP B) ? MIN/MAX<A, C> : MIN/MAX<B, C>.
> It is progress toward removing minmax_replacement from phiopt.
> There are still some more cases dealing with constants on the
> edges (0/INT_MAX) to handle in match.
>
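> For illustration (a hypothetical example, not one of the testcases),
> the new pattern covers source of this shape:
>
>   unsigned
>   min_of_mins (unsigned a, unsigned b, unsigned c)
>   {
>     unsigned t1 = a < c ? a : c;   /* MIN (a, c) */
>     unsigned t2 = b < c ? b : c;   /* MIN (b, c) */
>     /* (a < b) ? MIN (a, c) : MIN (b, c) simplifies to
>        MIN (MIN (a, b), c).  */
>     return a < b ? t1 : t2;
>   }
>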
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

OK.

> gcc/ChangeLog:
>
>         * match.pd: Add patterns for
>         "(A CMP B) ? MIN/MAX<A, C> : MIN/MAX <B, C>".
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/tree-ssa/minmax-16.c: Update testcase slightly.
>         * gcc.dg/tree-ssa/split-path-1.c: Also disable tree-loop-if-convert
>         as that now does the combining.
> ---
>  gcc/match.pd                                 | 16 ++++++++++++++++
>  gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c    | 10 ++++++++--
>  gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c |  3 ++-
>  3 files changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 6d3aaf45a93..5d5aae24509 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -4843,6 +4843,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>         (convert @c0))))))))
>  #endif
>
> +/* These were part of minmax phiopt.  */
> +/* Optimize (a CMP b) ? minmax<a, c> : minmax<b, c>
> +   to minmax<min/max<a, b>, c> */
> +(for minmax (min max)
> + (for cmp (lt le gt ge)
> +  (simplify
> +   (cond (cmp @1 @3) (minmax:c @1 @4) (minmax:c @2 @4))
> +   (with
> +    {
> +      tree_code code = minmax_from_comparison (cmp, @1, @2, @1, @3);
> +    }
> +    (if (code == MIN_EXPR)
> +     (minmax (min @1 @2) @4)
> +     (if (code == MAX_EXPR)
> +      (minmax (max @1 @2) @4)))))))
> +
>  /* X != C1 ? -X : C2 simplifies to -X when -C1 == C2.  */
>  (simplify
>   (cond (ne @0 INTEGER_CST@1) (negate@3 @0) INTEGER_CST@2)
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
> index 4febd092d83..623b12b3f74 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-16.c
> @@ -1,5 +1,5 @@
>  /* { dg-do run } */
> -/* { dg-options "-O -fdump-tree-phiopt -g" } */
> +/* { dg-options "-O -fdump-tree-phiopt -fdump-tree-optimized -g" } */
>
>  #include <stdint.h>
>
> @@ -25,5 +25,11 @@ main (void)
>    return 0;
>  }
>
> -/* { dg-final { scan-tree-dump-times "MIN_EXPR" 3 "phiopt1" } } */
> +/* After phiopt1, there really should be only 3 MIN_EXPR in the IR (including debug statements).
> +   But because phiopt does not always clean up the CFG, the PHI might still reference the
> +   moved statement from the alternative bb.
> +   Note that in the end we DCE that statement and the other debug statements, ending up with only 2 MIN_EXPR.
> +   So check that too. */
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 4 "phiopt1" } } */
> +/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "optimized" } } */
>  /* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> index 902dde44a50..b670dee8d10 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/split-path-1.c
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
> -/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt" } */
> +/* Note both PHI-OPT and the loop if conversion pass convert the inner if to be branchless using min/max. */
> +/* { dg-options "-O2 -fsplit-paths -fdump-tree-split-paths-details --param max-jump-thread-duplication-stmts=20 -fno-ssa-phiopt -fno-tree-loop-if-convert" } */
>
>  #include <stdio.h>
>  #include <stdlib.h>
> --
> 2.39.1
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/7] PHIOPT: Rename tree_ssa_phiopt_worker to pass_phiopt::execute
  2023-04-24 21:30 ` [PATCH 2/7] PHIOPT: Rename tree_ssa_phiopt_worker to pass_phiopt::execute Andrew Pinski
@ 2023-04-27 10:58   ` Richard Biener
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Biener @ 2023-04-27 10:58 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-patches

On Mon, Apr 24, 2023 at 11:34 PM Andrew Pinski via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Now that store elimination and phiopt do not
> share outer code, we can move tree_ssa_phiopt_worker
> directly into pass_phiopt::execute and remove
> many declarations (prototypes) from the file.

OK.

> gcc/ChangeLog:
>
>         * tree-ssa-phiopt.cc (two_value_replacement): Remove
>         prototype.
>         (match_simplify_replacement): Likewise.
>         (factor_out_conditional_conversion): Likewise.
>         (value_replacement): Likewise.
>         (minmax_replacement): Likewise.
>         (spaceship_replacement): Likewise.
>         (cond_removal_in_builtin_zero_pattern): Likewise.
>         (hoist_adjacent_loads): Likewise.
>         (tree_ssa_phiopt_worker): Move into ...
>         (pass_phiopt::execute): this.
> ---
>  gcc/tree-ssa-phiopt.cc | 385 +++++++++++++++++++----------------------
>  1 file changed, 181 insertions(+), 204 deletions(-)
>
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index 7f47b32576b..d232fd9b551 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -55,27 +55,10 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-ssa-propagate.h"
>  #include "tree-ssa-dce.h"
>
> -static bool two_value_replacement (basic_block, basic_block, edge, gphi *,
> -                                  tree, tree);
> -static bool match_simplify_replacement (basic_block, basic_block, basic_block,
> -                                       edge, edge, gphi *, tree, tree, bool, bool);
> -static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, tree,
> -                                               gimple *);
> -static int value_replacement (basic_block, basic_block,
> -                             edge, edge, gphi *, tree, tree);
> -static bool minmax_replacement (basic_block, basic_block, basic_block,
> -                               edge, edge, gphi *, tree, tree, bool);
> -static bool spaceship_replacement (basic_block, basic_block,
> -                                  edge, edge, gphi *, tree, tree);
> -static bool cond_removal_in_builtin_zero_pattern (basic_block, basic_block,
> -                                                 edge, edge, gphi *,
> -                                                 tree, tree);
>  static bool cond_store_replacement (basic_block, basic_block, edge, edge,
>                                     hash_set<tree> *);
>  static bool cond_if_else_store_replacement (basic_block, basic_block, basic_block);
>  static hash_set<tree> * get_non_trapping ();
> -static void hoist_adjacent_loads (basic_block, basic_block,
> -                                 basic_block, basic_block);
>
>  /* Return the singleton PHI in the SEQ of PHIs for edges E0 and E1. */
>
> @@ -104,188 +87,6 @@ single_non_singleton_phi_for_edges (gimple_seq seq, edge e0, edge e1)
>    return phi;
>  }
>
> -/* The core routine of phi optimizations.
> -   DO_HOIST_LOADS is true when we want to hoist adjacent loads out
> -   of diamond control flow patterns, false otherwise.  */
> -static unsigned int
> -tree_ssa_phiopt_worker (bool do_hoist_loads, bool early_p)
> -{
> -  basic_block bb;
> -  basic_block *bb_order;
> -  unsigned n, i;
> -  bool cfgchanged = false;
> -
> -  calculate_dominance_info (CDI_DOMINATORS);
> -
> -  /* Search every basic block for COND_EXPR we may be able to optimize.
> -
> -     We walk the blocks in order that guarantees that a block with
> -     a single predecessor is processed before the predecessor.
> -     This ensures that we collapse inner ifs before visiting the
> -     outer ones, and also that we do not try to visit a removed
> -     block.  */
> -  bb_order = single_pred_before_succ_order ();
> -  n = n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS;
> -
> -  for (i = 0; i < n; i++)
> -    {
> -      gphi *phi;
> -      basic_block bb1, bb2;
> -      edge e1, e2;
> -      tree arg0, arg1;
> -      bool diamond_p = false;
> -
> -      bb = bb_order[i];
> -
> -      /* Check to see if the last statement is a GIMPLE_COND.  */
> -      gcond *cond_stmt = safe_dyn_cast <gcond *> (*gsi_last_bb (bb));
> -      if (!cond_stmt)
> -       continue;
> -
> -      e1 = EDGE_SUCC (bb, 0);
> -      bb1 = e1->dest;
> -      e2 = EDGE_SUCC (bb, 1);
> -      bb2 = e2->dest;
> -
> -      /* We cannot do the optimization on abnormal edges.  */
> -      if ((e1->flags & EDGE_ABNORMAL) != 0
> -          || (e2->flags & EDGE_ABNORMAL) != 0)
> -       continue;
> -
> -      /* If either bb1's succ or bb2 or bb2's succ is non NULL.  */
> -      if (EDGE_COUNT (bb1->succs) == 0
> -         || EDGE_COUNT (bb2->succs) == 0)
> -       continue;
> -
> -      /* Find the bb which is the fall through to the other.  */
> -      if (EDGE_SUCC (bb1, 0)->dest == bb2)
> -        ;
> -      else if (EDGE_SUCC (bb2, 0)->dest == bb1)
> -        {
> -         std::swap (bb1, bb2);
> -         std::swap (e1, e2);
> -       }
> -      else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> -              && single_succ_p (bb2))
> -       {
> -         diamond_p = true;
> -         e2 = EDGE_SUCC (bb2, 0);
> -         /* Make sure bb2 is just a fall through. */
> -         if ((e2->flags & EDGE_FALLTHRU) == 0)
> -           continue;
> -       }
> -      else
> -       continue;
> -
> -      e1 = EDGE_SUCC (bb1, 0);
> -
> -      /* Make sure that bb1 is just a fall through.  */
> -      if (!single_succ_p (bb1)
> -         || (e1->flags & EDGE_FALLTHRU) == 0)
> -       continue;
> -
> -      if (diamond_p)
> -       {
> -         basic_block bb3 = e1->dest;
> -
> -         if (!single_pred_p (bb1)
> -             || !single_pred_p (bb2))
> -           continue;
> -
> -         if (do_hoist_loads
> -             && !FLOAT_TYPE_P (TREE_TYPE (gimple_cond_lhs (cond_stmt)))
> -             && EDGE_COUNT (bb->succs) == 2
> -             && EDGE_COUNT (bb3->preds) == 2
> -             /* If one edge or the other is dominant, a conditional move
> -                is likely to perform worse than the well-predicted branch.  */
> -             && !predictable_edge_p (EDGE_SUCC (bb, 0))
> -             && !predictable_edge_p (EDGE_SUCC (bb, 1)))
> -           hoist_adjacent_loads (bb, bb1, bb2, bb3);
> -       }
> -
> -      gimple_stmt_iterator gsi;
> -      bool candorest = true;
> -
> -      /* Check that we're looking for nested phis.  */
> -      basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> -      gimple_seq phis = phi_nodes (merge);
> -
> -      /* Value replacement can work with more than one PHI
> -        so try that first. */
> -      if (!early_p && !diamond_p)
> -       for (gsi = gsi_start (phis); !gsi_end_p (gsi); gsi_next (&gsi))
> -         {
> -           phi = as_a <gphi *> (gsi_stmt (gsi));
> -           arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
> -           arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
> -           if (value_replacement (bb, bb1, e1, e2, phi, arg0, arg1) == 2)
> -             {
> -               candorest = false;
> -               cfgchanged = true;
> -               break;
> -             }
> -         }
> -
> -      if (!candorest)
> -       continue;
> -
> -      phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> -      if (!phi)
> -       continue;
> -
> -      arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
> -      arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
> -
> -      /* Something is wrong if we cannot find the arguments in the PHI
> -         node.  */
> -      gcc_assert (arg0 != NULL_TREE && arg1 != NULL_TREE);
> -
> -      gphi *newphi;
> -      if (single_pred_p (bb1)
> -         && !diamond_p
> -         && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> -                                                         arg0, arg1,
> -                                                         cond_stmt)))
> -       {
> -         phi = newphi;
> -         /* factor_out_conditional_conversion may create a new PHI in
> -            BB2 and eliminate an existing PHI in BB2.  Recompute values
> -            that may be affected by that change.  */
> -         arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
> -         arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
> -         gcc_assert (arg0 != NULL_TREE && arg1 != NULL_TREE);
> -       }
> -
> -      /* Do the replacement of conditional if it can be done.  */
> -      if (!early_p
> -         && !diamond_p
> -         && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> -       cfgchanged = true;
> -      else if (match_simplify_replacement (bb, bb1, bb2, e1, e2, phi,
> -                                          arg0, arg1, early_p, diamond_p))
> -       cfgchanged = true;
> -      else if (!early_p
> -              && !diamond_p
> -              && single_pred_p (bb1)
> -              && cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
> -                                                       phi, arg0, arg1))
> -       cfgchanged = true;
> -      else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> -                                  diamond_p))
> -       cfgchanged = true;
> -      else if (single_pred_p (bb1)
> -              && !diamond_p
> -              && spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> -       cfgchanged = true;
> -    }
> -
> -  free (bb_order);
> -
> -  if (cfgchanged)
> -    return TODO_cleanup_cfg;
> -  return 0;
> -}
> -
>  /* The core routine of conditional store replacement.  */
>  static unsigned int
>  store_elim_worker (void)
> @@ -4328,11 +4129,7 @@ public:
>        early_p = param;
>      }
>    bool gate (function *) final override { return flag_ssa_phiopt; }
> -  unsigned int execute (function *) final override
> -    {
> -      return tree_ssa_phiopt_worker (!early_p ? gate_hoist_loads () : false,
> -                                    early_p);
> -    }
> +  unsigned int execute (function *) final override;
>
>  private:
>    bool early_p;
> @@ -4346,6 +4143,186 @@ make_pass_phiopt (gcc::context *ctxt)
>    return new pass_phiopt (ctxt);
>  }
>
> +unsigned int
> +pass_phiopt::execute (function *)
> +{
> +  bool do_hoist_loads = !early_p ? gate_hoist_loads () : false;
> +  basic_block bb;
> +  basic_block *bb_order;
> +  unsigned n, i;
> +  bool cfgchanged = false;
> +
> +  calculate_dominance_info (CDI_DOMINATORS);
> +
> +  /* Search every basic block for COND_EXPR we may be able to optimize.
> +
> +     We walk the blocks in order that guarantees that a block with
> +     a single predecessor is processed before the predecessor.
> +     This ensures that we collapse inner ifs before visiting the
> +     outer ones, and also that we do not try to visit a removed
> +     block.  */
> +  bb_order = single_pred_before_succ_order ();
> +  n = n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS;
> +
> +  for (i = 0; i < n; i++)
> +    {
> +      gphi *phi;
> +      basic_block bb1, bb2;
> +      edge e1, e2;
> +      tree arg0, arg1;
> +      bool diamond_p = false;
> +
> +      bb = bb_order[i];
> +
> +      /* Check to see if the last statement is a GIMPLE_COND.  */
> +      gcond *cond_stmt = safe_dyn_cast <gcond *> (*gsi_last_bb (bb));
> +      if (!cond_stmt)
> +       continue;
> +
> +      e1 = EDGE_SUCC (bb, 0);
> +      bb1 = e1->dest;
> +      e2 = EDGE_SUCC (bb, 1);
> +      bb2 = e2->dest;
> +
> +      /* We cannot do the optimization on abnormal edges.  */
> +      if ((e1->flags & EDGE_ABNORMAL) != 0
> +         || (e2->flags & EDGE_ABNORMAL) != 0)
> +       continue;
> +
> +      /* If either bb1's succ or bb2 or bb2's succ is non NULL.  */
> +      if (EDGE_COUNT (bb1->succs) == 0
> +         || EDGE_COUNT (bb2->succs) == 0)
> +       continue;
> +
> +      /* Find the bb which is the fall through to the other.  */
> +      if (EDGE_SUCC (bb1, 0)->dest == bb2)
> +       ;
> +      else if (EDGE_SUCC (bb2, 0)->dest == bb1)
> +       {
> +         std::swap (bb1, bb2);
> +         std::swap (e1, e2);
> +       }
> +      else if (EDGE_SUCC (bb1, 0)->dest == EDGE_SUCC (bb2, 0)->dest
> +              && single_succ_p (bb2))
> +       {
> +         diamond_p = true;
> +         e2 = EDGE_SUCC (bb2, 0);
> +         /* Make sure bb2 is just a fall through. */
> +         if ((e2->flags & EDGE_FALLTHRU) == 0)
> +           continue;
> +       }
> +      else
> +       continue;
> +
> +      e1 = EDGE_SUCC (bb1, 0);
> +
> +      /* Make sure that bb1 is just a fall through.  */
> +      if (!single_succ_p (bb1)
> +         || (e1->flags & EDGE_FALLTHRU) == 0)
> +       continue;
> +
> +      if (diamond_p)
> +       {
> +         basic_block bb3 = e1->dest;
> +
> +         if (!single_pred_p (bb1)
> +             || !single_pred_p (bb2))
> +           continue;
> +
> +         if (do_hoist_loads
> +             && !FLOAT_TYPE_P (TREE_TYPE (gimple_cond_lhs (cond_stmt)))
> +             && EDGE_COUNT (bb->succs) == 2
> +             && EDGE_COUNT (bb3->preds) == 2
> +             /* If one edge or the other is dominant, a conditional move
> +                is likely to perform worse than the well-predicted branch.  */
> +             && !predictable_edge_p (EDGE_SUCC (bb, 0))
> +             && !predictable_edge_p (EDGE_SUCC (bb, 1)))
> +           hoist_adjacent_loads (bb, bb1, bb2, bb3);
> +       }
> +
> +      gimple_stmt_iterator gsi;
> +      bool candorest = true;
> +
> +      /* Check that we're looking for nested phis.  */
> +      basic_block merge = diamond_p ? EDGE_SUCC (bb2, 0)->dest : bb2;
> +      gimple_seq phis = phi_nodes (merge);
> +
> +      /* Value replacement can work with more than one PHI
> +        so try that first. */
> +      if (!early_p && !diamond_p)
> +       for (gsi = gsi_start (phis); !gsi_end_p (gsi); gsi_next (&gsi))
> +         {
> +           phi = as_a <gphi *> (gsi_stmt (gsi));
> +           arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
> +           arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
> +           if (value_replacement (bb, bb1, e1, e2, phi, arg0, arg1) == 2)
> +             {
> +               candorest = false;
> +               cfgchanged = true;
> +               break;
> +             }
> +         }
> +
> +      if (!candorest)
> +       continue;
> +
> +      phi = single_non_singleton_phi_for_edges (phis, e1, e2);
> +      if (!phi)
> +       continue;
> +
> +      arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
> +      arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
> +
> +      /* Something is wrong if we cannot find the arguments in the PHI
> +         node.  */
> +      gcc_assert (arg0 != NULL_TREE && arg1 != NULL_TREE);
> +
> +      gphi *newphi;
> +      if (single_pred_p (bb1)
> +         && !diamond_p
> +         && (newphi = factor_out_conditional_conversion (e1, e2, phi,
> +                                                         arg0, arg1,
> +                                                         cond_stmt)))
> +       {
> +         phi = newphi;
> +         /* factor_out_conditional_conversion may create a new PHI in
> +            BB2 and eliminate an existing PHI in BB2.  Recompute values
> +            that may be affected by that change.  */
> +         arg0 = gimple_phi_arg_def (phi, e1->dest_idx);
> +         arg1 = gimple_phi_arg_def (phi, e2->dest_idx);
> +         gcc_assert (arg0 != NULL_TREE && arg1 != NULL_TREE);
> +       }
> +
> +      /* Do the replacement of conditional if it can be done.  */
> +      if (!early_p
> +         && !diamond_p
> +         && two_value_replacement (bb, bb1, e2, phi, arg0, arg1))
> +       cfgchanged = true;
> +      else if (match_simplify_replacement (bb, bb1, bb2, e1, e2, phi,
> +                                          arg0, arg1, early_p, diamond_p))
> +       cfgchanged = true;
> +      else if (!early_p
> +              && !diamond_p
> +              && single_pred_p (bb1)
> +              && cond_removal_in_builtin_zero_pattern (bb, bb1, e1, e2,
> +                                                       phi, arg0, arg1))
> +       cfgchanged = true;
> +      else if (minmax_replacement (bb, bb1, bb2, e1, e2, phi, arg0, arg1,
> +                                  diamond_p))
> +       cfgchanged = true;
> +      else if (single_pred_p (bb1)
> +              && !diamond_p
> +              && spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> +       cfgchanged = true;
> +    }
> +
> +  free (bb_order);
> +
> +  if (cfgchanged)
> +    return TODO_cleanup_cfg;
> +  return 0;
> +}
> +
>  /* This pass tries to transform conditional stores into unconditional
>     ones, enabling further simplifications with the simpler then and else
>     blocks.  In particular it replaces this:
> --
> 2.39.1
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-04-27 11:00 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-24 21:30 [PATCH 0/7] Some more phiopt cleanups and double minmax to match Andrew Pinski
2023-04-24 21:30 ` [PATCH 1/7] PHIOPT: Split out store elimination from phiopt Andrew Pinski
2023-04-27 10:50   ` Richard Biener
2023-04-24 21:30 ` [PATCH 2/7] PHIOPT: Rename tree_ssa_phiopt_worker to pass_phiopt::execute Andrew Pinski
2023-04-27 10:58   ` Richard Biener
2023-04-24 21:30 ` [PATCH 3/7] PHIOPT: Move store_elim_worker into pass_cselim::execute Andrew Pinski
2023-04-27 10:50   ` Richard Biener
2023-04-24 21:30 ` [PATCH 4/7] MIN/MAX should be treated similar as comparisons for trapping Andrew Pinski
2023-04-27 10:49   ` Richard Biener
2023-04-24 21:30 ` [PATCH 5/7] PHIOPT: Allow MIN/MAX to have up to 2 MIN/MAX expressions for early phiopt Andrew Pinski
2023-04-27 10:51   ` Richard Biener
2023-04-24 21:30 ` [PATCH 6/7] MATCH: Factor out code that for min max detection with constants Andrew Pinski
2023-04-25 10:45   ` Mikael Morin
2023-04-24 21:30 ` [PATCH 7/7] MATCH: Add patterns from phiopt's minmax_replacement Andrew Pinski
2023-04-27 10:58   ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).