public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Compile-time improvement for if conversion.
@ 2016-10-05 13:22 Yuri Rumyantsev
  2016-10-10  9:52 ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Yuri Rumyantsev @ 2016-10-05 13:22 UTC (permalink / raw)
  To: Richard Biener, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2239 bytes --]

Hi All,

Here is implementation of Richard proposal:

< For general infrastructure it would be nice to expose a (post-)dominator
< compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
< what makes if-conversion expensive is the post-dom compute which happens
< for each loop for the whole function.  It shouldn't be very difficult
< to write this,
< sharing as much as possible code with the current DOM code might need
< quite some refactoring though.

I implemented this proposal by adding calculation of dominance info
for SESE regions and incorporate this change to if conversion pass.
SESE region is built by adding loop pre-header and possibly fake
post-header blocks to loop body. Fake post-header is deleted after
predication completion.

Bootstrapping and regression testing did not show any new failures.

Is it OK for trunk?

ChangeLog:
2016-10-05  Yuri Rumyantsev  <ysrumyan@gmail.com>

* dominance.c : Include cfgloop.h for loop recognition.
(dom_info): Add new functions and add boolean argument to recognize
computation for loop region.
(dom_info::dom_info): New function.
(dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
handle unvisited blocks.
(dom_info::calc_idoms): Likewise.
(compute_dom_fast_query_in_region): New function.
(calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
false argument.
(calculate_dominance_info_for_region): New function.
(free_dominance_info_for_region): Likewise.
(verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
argument.
* dominance.h: Add prototype for introduced functions
calculate_dominance_info_for_region and
free_dominance_info_for_region.
tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
(build_sese_region): New function.
(if_convertible_loop_p_1): Invoke local version of post-dominators
calculation, free it after basic block predication and delete created
fake post-header block if any.
(tree_if_conversion): Delete call of free_dominance_info for
post-dominators, free ifc_sese_bbs which represents SESE region.
(pass_if_conversion::execute): Delete detection of infinite loops
and fake edges to exit block since post-dominator calculation is
performed per if-converted loop only.

[-- Attachment #2: patch --]
[-- Type: application/octet-stream, Size: 13830 bytes --]

diff --git a/gcc/dominance.c b/gcc/dominance.c
index e3308cc..1caed7c 100644
--- a/gcc/dominance.c
+++ b/gcc/dominance.c
@@ -41,6 +41,7 @@
 #include "cfganal.h"
 #include "et-forest.h"
 #include "graphds.h"
+#include "cfgloop.h"
 
 /* We name our nodes with integers, beginning with 1.  Zero is reserved for
    'undefined' or 'end of list'.  The name of each node is given by the dfs
@@ -60,9 +61,10 @@ class dom_info
 {
 public:
   dom_info (function *, cdi_direction);
+  dom_info (struct loop *, cdi_direction);
   ~dom_info ();
-  void calc_dfs_tree ();
-  void calc_idoms ();
+  void calc_dfs_tree (bool);
+  void calc_idoms (bool);
 
   inline basic_block get_idom (basic_block);
 private:
@@ -204,6 +206,61 @@ dom_info::dom_info (function *fn, cdi_direction dir)
     }
 }
 
+/* Constructor for innermost loop which is SESE region.  */
+
+dom_info::dom_info (struct loop *loop, cdi_direction dir)
+{
+  /* We need memory for n_basic_blocks nodes.  */
+  size_t num = m_n_basic_blocks = loop->num_nodes + 2;
+  m_dfs_parent = new_zero_array <TBB> (num);
+  m_dom = new_zero_array <TBB> (num);
+
+  m_path_min = new TBB[num];
+  m_key = new TBB[num];
+  m_set_size = new unsigned int[num];
+  for (size_t i = 0; i < num; i++)
+    {
+      m_path_min[i] = m_key[i] = i;
+      m_set_size[i] = 1;
+    }
+
+  m_bucket = new_zero_array <TBB> (num);
+  m_next_bucket = new_zero_array <TBB> (num);
+
+  m_set_chain = new_zero_array <TBB> (num);
+  m_set_child = new_zero_array <TBB> (num);
+
+  /* Determine max basic block index in region.  */
+  int max_index = ((basic_block*)loop->aux)[0]->index;
+  for (size_t i = 1; i < num; i++)
+    if (((basic_block*)loop->aux)[i]->index > max_index)
+      max_index = ((basic_block*)loop->aux)[i]->index;
+  max_index += 1;
+  m_dfs_order = new_zero_array <TBB> (max_index + 1);
+  m_dfs_last = &m_dfs_order[max_index];
+  m_dfs_to_bb = new_zero_array <basic_block> (num);
+
+  m_dfsnum = 1;
+  m_nodes = 0;
+  m_fake_exit_edge = NULL; /* Assume that region is SCC.  */
+
+  switch (dir)
+    {
+      case CDI_DOMINATORS:
+	m_reverse = false;
+	m_start_block = ((basic_block*)loop->aux)[0];
+	m_end_block = ((basic_block*)loop->aux)[num - 1];
+	break;
+      case CDI_POST_DOMINATORS:
+	m_reverse = true;
+	m_start_block = ((basic_block*)loop->aux)[num - 1];
+	m_end_block = ((basic_block*)loop->aux)[0];
+	break;
+      default:
+	gcc_unreachable ();
+    }
+}
+
 inline basic_block
 dom_info::get_idom (basic_block bb)
 {
@@ -336,10 +393,11 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
 /* The main entry for calculating the DFS tree or forest.  m_reverse is true,
    if we are interested in the reverse flow graph.  In that case the result is
    not necessarily a tree but a forest, because there may be nodes from which
-   the EXIT_BLOCK is unreachable.  */
+   the EXIT_BLOCK is unreachable.  IN_REGION is true if dfs is performed
+   for innermost loop considering as SESE region.  */
 
 void
-dom_info::calc_dfs_tree ()
+dom_info::calc_dfs_tree (bool in_region)
 {
   *m_dfs_last = m_dfsnum;
   m_dfs_to_bb[m_dfsnum] = m_start_block;
@@ -347,7 +405,7 @@ dom_info::calc_dfs_tree ()
 
   calc_dfs_tree_nonrec (m_start_block);
 
-  if (m_reverse)
+  if (m_reverse && !in_region)
     {
       /* In the post-dom case we may have nodes without a path to EXIT_BLOCK.
          They are reverse-unreachable.  In the dom-case we disallow such
@@ -493,10 +551,11 @@ dom_info::link_roots (TBB v, TBB w)
 
 /* This calculates the immediate dominators (or post-dominators). THIS is our
    working structure and should hold the DFS forest.
-   On return the immediate dominator to node V is in m_dom[V].  */
+   On return the immediate dominator to node V is in m_dom[V].
+   IN_REGIOS is true when dominator tree is built innermost loop.  */
 
 void
-dom_info::calc_idoms ()
+dom_info::calc_idoms (bool in_region)
 {
   /* Go backwards in DFS order, to first look at the leafs.  */
   for (TBB v = m_nodes; v > 1; v--)
@@ -511,7 +570,7 @@ dom_info::calc_idoms ()
 				   : ei_start (bb->preds);
       edge_iterator einext;
 
-      if (m_reverse)
+      if (m_reverse && !in_region)
 	{
 	  /* If this block has a fake edge to exit, process that first.  */
 	  if (bitmap_bit_p (m_fake_exit_edge, bb->index))
@@ -622,6 +681,31 @@ compute_dom_fast_query (enum cdi_direction dir)
   dom_computed[dir_index] = DOM_OK;
 }
 
+/* Analogous to the previous function but compute the data for loop body.  */
+
+static void
+compute_dom_fast_query_in_region (struct loop *loop, enum cdi_direction dir)
+{
+  int num = 0;
+  basic_block bb;
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+
+  gcc_checking_assert (dom_info_available_p (dir));
+
+  if (dom_computed[dir_index] == DOM_OK)
+    return;
+
+  /* Assign dfs numbers for loop nodes only.  */
+  for (unsigned int i = 1; i < loop->num_nodes + 1; i++)
+    {
+      bb = ((basic_block*)loop->aux)[i];
+      if (!bb->dom[dir_index]->father)
+	assign_dfs_numbers (bb->dom[dir_index], &num);
+    }
+
+  dom_computed[dir_index] = DOM_OK;
+}
+
 /* The main entry point into this module.  DIR is set depending on whether
    we want to compute dominators or postdominators.  */
 
@@ -649,8 +733,8 @@ calculate_dominance_info (cdi_direction dir)
       n_bbs_in_dom_tree[dir_index] = n_basic_blocks_for_fn (cfun);
 
       dom_info di (cfun, dir);
-      di.calc_dfs_tree ();
-      di.calc_idoms ();
+      di.calc_dfs_tree (false);
+      di.calc_idoms (false);
 
       FOR_EACH_BB_FN (b, cfun)
 	{
@@ -668,6 +752,45 @@ calculate_dominance_info (cdi_direction dir)
   timevar_pop (TV_DOMINANCE);
 }
 
+/* Analogous to the previous function but compute dominators or postdominators
+   for innermost loops with pre-header and post-header to form SESE region.  */
+
+void
+calculate_dominance_info_for_region (struct loop *loop, cdi_direction dir)
+{
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+  unsigned int n_nodes = loop->num_nodes + 2;
+
+  if (dom_computed[dir_index] == DOM_OK)
+    return;
+
+  timevar_push (TV_DOMINANCE);
+  if (!dom_info_available_p (dir))
+    {
+      basic_block bb;
+      gcc_assert (loop->aux);
+
+      for (unsigned int i = 0; i < n_nodes; i++)
+	{
+	  bb = ((basic_block*)loop->aux)[i];
+	  bb->dom[dir_index] = et_new_tree (bb);
+	}
+      dom_info di (loop, dir);
+      di.calc_dfs_tree (true);
+      di.calc_idoms (true);
+      for (unsigned int i = 0; i < n_nodes; i++)
+	{
+	  bb = ((basic_block*)loop->aux)[i];
+	  if (basic_block d = di.get_idom (bb))
+	    et_set_father (bb->dom[dir_index], d->dom[dir_index]);
+	}
+      dom_computed[dir_index] = DOM_NO_FAST_QUERY;
+    }
+  compute_dom_fast_query_in_region (loop, dir);
+
+  timevar_pop (TV_DOMINANCE);
+}
+
 /* Free dominance information for direction DIR.  */
 void
 free_dominance_info (function *fn, enum cdi_direction dir)
@@ -696,6 +819,26 @@ free_dominance_info (enum cdi_direction dir)
   free_dominance_info (cfun, dir);
 }
 
+void
+free_dominance_info_for_region (struct loop *loop, enum cdi_direction dir)
+{
+  basic_block bb;
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+
+  if (!dom_info_available_p (dir))
+    return;
+
+  for (unsigned int i = 0; i < loop->num_nodes + 2; i++)
+    {
+      bb = ((basic_block*)loop->aux)[i];
+      et_free_tree_force (bb->dom[dir_index]);
+      bb->dom[dir_index] = NULL;
+    }
+  et_free_pools ();
+  cfun->cfg->x_dom_computed[dir_index] = DOM_NONE;
+  cfun->cfg->x_n_bbs_in_dom_tree[dir_index] = 0;
+}
+
 /* Return the immediate dominator of basic block BB.  */
 basic_block
 get_immediate_dominator (enum cdi_direction dir, basic_block bb)
@@ -1012,8 +1155,8 @@ verify_dominators (cdi_direction dir)
   gcc_assert (dom_info_available_p (dir));
 
   dom_info di (cfun, dir);
-  di.calc_dfs_tree ();
-  di.calc_idoms ();
+  di.calc_dfs_tree (false);
+  di.calc_idoms (false);
 
   bool err = false;
   basic_block bb;
diff --git a/gcc/dominance.h b/gcc/dominance.h
index 961c4e2..3dbe526 100644
--- a/gcc/dominance.h
+++ b/gcc/dominance.h
@@ -36,8 +36,11 @@ enum dom_state
 };
 
 extern void calculate_dominance_info (enum cdi_direction);
+extern void calculate_dominance_info_for_region (struct loop *,
+						 enum cdi_direction);
 extern void free_dominance_info (function *, enum cdi_direction);
 extern void free_dominance_info (enum cdi_direction);
+extern void free_dominance_info_for_region (struct loop *, enum cdi_direction);
 extern basic_block get_immediate_dominator (enum cdi_direction, basic_block);
 extern void set_immediate_dominator (enum cdi_direction, basic_block,
 				     basic_block);
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index eec431e..46e793b 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -186,6 +186,12 @@ innermost_loop_behavior_hash::equal (const value_type &e1,
 /* List of basic blocks in if-conversion-suitable order.  */
 static basic_block *ifc_bbs;
 
+/* List of basic blocks representing SESE region.  */
+static basic_block *ifc_sese_bbs = NULL;
+
+/* Fake postheader basic block to build SESE region.  */
+static basic_block fake_postheader;
+
 /* Hash table to store <DR's innermost loop behavior, DR> pairs.  */
 static hash_map<innermost_loop_behavior_hash,
 		data_reference_p> *innermost_DR_map;
@@ -1309,6 +1315,53 @@ predicate_bbs (loop_p loop)
 	      && bb_predicate_gimplified_stmts (loop->latch) == NULL);
 }
 
+/* Build SESE region by adding loop pre-header and post-header blocks.  */
+
+static basic_block *
+build_sese_region (struct loop *loop)
+{
+  basic_block *blocks;
+
+  gcc_assert (ifc_bbs);
+  fake_postheader = NULL;
+  /* Add loop preheader and postheader blocks to form SESE region.  */
+  blocks = XCNEWVEC (basic_block, loop->num_nodes + 2);
+  blocks[0] = loop_preheader_edge (loop)->src;
+  gcc_assert (EDGE_COUNT (blocks[0]->succs) == 1);
+  for (unsigned int i = 0; i < loop->num_nodes; i++)
+    {
+      basic_block bb = ifc_bbs[i];
+      blocks[i + 1] = bb;
+      /* Find loop postheader.  */
+      edge e;
+      edge_iterator ei;
+      FOR_EACH_EDGE (e, ei, bb->succs)
+	if (loop_exit_edge_p (loop, e))
+	  {
+	    if (EDGE_COUNT (e->dest->preds) == 1)
+	      blocks[loop->num_nodes + 1] = e->dest;
+	    else
+	      {
+		/* Post-header must have the only predecessor,
+		   so create fake basic block the only predecessor
+		   of which is exit block to form SESE region.
+		   After completion basic block predication
+		   this block will be deleted and branch/edge of exit
+		   block will be restored.  */
+		basic_block bb = create_empty_bb (e->src);
+		unchecked_make_edge (e->src, bb, 0);
+		fake_postheader = bb;
+		blocks[loop->num_nodes + 1] = bb;
+		if (dump_file && (dump_flags & TDF_DETAILS))
+		  fprintf (dump_file, "Create fake post-header#%d!\n",
+			   bb->index);
+	      }
+	    break;
+	  }
+    }
+  return blocks;
+}
+
 /* Return true when LOOP is if-convertible.  This is a helper function
    for if_convertible_loop_p.  REFS and DDRS are initialized and freed
    in if_convertible_loop_p.  */
@@ -1322,7 +1375,8 @@ if_convertible_loop_p_1 (struct loop *loop, vec<data_reference_p> *refs)
   if (find_data_references_in_loop (loop, refs) == chrec_dont_know)
     return false;
 
-  calculate_dominance_info (CDI_DOMINATORS);
+  if (dom_info_state (CDI_DOMINATORS) != DOM_OK)
+    calculate_dominance_info (CDI_DOMINATORS);
 
   /* Allow statements that can be handled during if-conversion.  */
   ifc_bbs = get_loop_body_in_if_conv_order (loop);
@@ -1370,9 +1424,33 @@ if_convertible_loop_p_1 (struct loop *loop, vec<data_reference_p> *refs)
 	  = new hash_map<innermost_loop_behavior_hash, data_reference_p>;
   baseref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
 
-  calculate_dominance_info (CDI_POST_DOMINATORS);
+  /* Compute post-dominator tree locally.  */
+  gcc_assert (ifc_sese_bbs == NULL);
+  ifc_sese_bbs = build_sese_region (loop);
+  loop->aux = ifc_sese_bbs;
+  calculate_dominance_info_for_region (loop, CDI_POST_DOMINATORS);
+
   predicate_bbs (loop);
 
+  /* Free post-dominator tree since it is not used after predication.  */
+  free_dominance_info_for_region (loop, CDI_POST_DOMINATORS);
+  if (fake_postheader != NULL)
+    {
+      edge_iterator ei;
+      edge e;
+      basic_block exit_bb = single_pred (fake_postheader);
+      gcc_assert (exit_bb);
+      FOR_EACH_EDGE (e, ei, exit_bb->succs)
+      if (e->dest == fake_postheader)
+	{
+	  /* Delete this fake edge.  */
+	  remove_edge_raw (e);
+	  break;
+	}
+      expunge_block (fake_postheader);
+      fake_postheader = NULL;
+    }
+
   for (i = 0; refs->iterate (i, &dr); i++)
     {
       tree ref = DR_REF (dr);
@@ -2752,7 +2830,13 @@ tree_if_conversion (struct loop *loop)
       free (ifc_bbs);
       ifc_bbs = NULL;
     }
-  free_dominance_info (CDI_POST_DOMINATORS);
+
+  if (ifc_sese_bbs)
+    {
+      free (ifc_sese_bbs);
+      ifc_sese_bbs = NULL;
+    }
+  loop->aux = NULL;
 
   return todo;
 }
@@ -2805,14 +2889,6 @@ pass_if_conversion::execute (function *fun)
   if (number_of_loops (fun) <= 1)
     return 0;
 
-  /* If there are infinite loops, during CDI_POST_DOMINATORS computation
-     we can pick pretty much random bb inside of the infinite loop that
-     has the fake edge.  If we are unlucky enough, this can confuse the
-     add_to_predicate_list post-dominator check to optimize as if that
-     bb or some other one is a join block when it actually is not.
-     See PR70916.  */
-  connect_infinite_loops_to_exit ();
-
   FOR_EACH_LOOP (loop, 0)
     if (flag_tree_loop_if_convert == 1
 	|| flag_tree_loop_if_convert_stores == 1
@@ -2820,8 +2896,6 @@ pass_if_conversion::execute (function *fun)
 	    && !loop->dont_vectorize))
       todo |= tree_if_conversion (loop);
 
-  remove_fake_exit_edges ();
-
   if (flag_checking)
     {
       basic_block bb;

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-10-17 11:53 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-05 13:22 Compile-time improvement for if conversion Yuri Rumyantsev
2016-10-10  9:52 ` Richard Biener
2016-10-10 11:42   ` Yuri Rumyantsev
2016-10-10 12:00     ` Richard Biener
2016-10-10 14:17       ` Yuri Rumyantsev
2016-10-11 10:33         ` Richard Biener
2016-10-11 13:23           ` Yuri Rumyantsev
2016-10-11 13:48             ` Richard Biener
2016-10-12 13:37               ` Yuri Rumyantsev
2016-10-13 10:15                 ` Richard Biener
2016-10-14 12:07                   ` Yuri Rumyantsev
2016-10-17 11:53                     ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).