public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Compile-time improvement for if conversion.
@ 2016-10-05 13:22 Yuri Rumyantsev
  2016-10-10  9:52 ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Yuri Rumyantsev @ 2016-10-05 13:22 UTC (permalink / raw)
  To: Richard Biener, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2239 bytes --]

Hi All,

Here is implementation of Richard proposal:

< For general infrastructure it would be nice to expose a (post-)dominator
< compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
< what makes if-conversion expensive is the post-dom compute which happens
< for each loop for the whole function.  It shouldn't be very difficult
< to write this,
< sharing as much as possible code with the current DOM code might need
< quite some refactoring though.

I implemented this proposal by adding calculation of dominance info
for SESE regions and incorporate this change to if conversion pass.
SESE region is built by adding loop pre-header and possibly fake
post-header blocks to loop body. Fake post-header is deleted after
predication completion.

Bootstrapping and regression testing did not show any new failures.

Is it OK for trunk?

ChangeLog:
2016-10-05  Yuri Rumyantsev  <ysrumyan@gmail.com>

* dominance.c : Include cfgloop.h for loop recognition.
(dom_info): Add new functions and add boolean argument to recognize
computation for loop region.
(dom_info::dom_info): New function.
(dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
handle unvisited blocks.
(dom_info::calc_idoms): Likewise.
(compute_dom_fast_query_in_region): New function.
(calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
false argument.
(calculate_dominance_info_for_region): New function.
(free_dominance_info_for_region): Likewise.
(verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
argument.
* dominance.h: Add prototype for introduced functions
calculate_dominance_info_for_region and
free_dominance_info_for_region.
tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
(build_sese_region): New function.
(if_convertible_loop_p_1): Invoke local version of post-dominators
calculation, free it after basic block predication and delete created
fake post-header block if any.
(tree_if_conversion): Delete call of free_dominance_info for
post-dominators, free ifc_sese_bbs which represents SESE region.
(pass_if_conversion::execute): Delete detection of infinite loops
and fake edges to exit block since post-dominator calculation is
performed per if-converted loop only.

[-- Attachment #2: patch --]
[-- Type: application/octet-stream, Size: 13830 bytes --]

diff --git a/gcc/dominance.c b/gcc/dominance.c
index e3308cc..1caed7c 100644
--- a/gcc/dominance.c
+++ b/gcc/dominance.c
@@ -41,6 +41,7 @@
 #include "cfganal.h"
 #include "et-forest.h"
 #include "graphds.h"
+#include "cfgloop.h"
 
 /* We name our nodes with integers, beginning with 1.  Zero is reserved for
    'undefined' or 'end of list'.  The name of each node is given by the dfs
@@ -60,9 +61,10 @@ class dom_info
 {
 public:
   dom_info (function *, cdi_direction);
+  dom_info (struct loop *, cdi_direction);
   ~dom_info ();
-  void calc_dfs_tree ();
-  void calc_idoms ();
+  void calc_dfs_tree (bool);
+  void calc_idoms (bool);
 
   inline basic_block get_idom (basic_block);
 private:
@@ -204,6 +206,61 @@ dom_info::dom_info (function *fn, cdi_direction dir)
     }
 }
 
+/* Constructor for innermost loop which is SESE region.  */
+
+dom_info::dom_info (struct loop *loop, cdi_direction dir)
+{
+  /* We need memory for n_basic_blocks nodes.  */
+  size_t num = m_n_basic_blocks = loop->num_nodes + 2;
+  m_dfs_parent = new_zero_array <TBB> (num);
+  m_dom = new_zero_array <TBB> (num);
+
+  m_path_min = new TBB[num];
+  m_key = new TBB[num];
+  m_set_size = new unsigned int[num];
+  for (size_t i = 0; i < num; i++)
+    {
+      m_path_min[i] = m_key[i] = i;
+      m_set_size[i] = 1;
+    }
+
+  m_bucket = new_zero_array <TBB> (num);
+  m_next_bucket = new_zero_array <TBB> (num);
+
+  m_set_chain = new_zero_array <TBB> (num);
+  m_set_child = new_zero_array <TBB> (num);
+
+  /* Determine max basic block index in region.  */
+  int max_index = ((basic_block*)loop->aux)[0]->index;
+  for (size_t i = 1; i < num; i++)
+    if (((basic_block*)loop->aux)[i]->index > max_index)
+      max_index = ((basic_block*)loop->aux)[i]->index;
+  max_index += 1;
+  m_dfs_order = new_zero_array <TBB> (max_index + 1);
+  m_dfs_last = &m_dfs_order[max_index];
+  m_dfs_to_bb = new_zero_array <basic_block> (num);
+
+  m_dfsnum = 1;
+  m_nodes = 0;
+  m_fake_exit_edge = NULL; /* Assume that region is SCC.  */
+
+  switch (dir)
+    {
+      case CDI_DOMINATORS:
+	m_reverse = false;
+	m_start_block = ((basic_block*)loop->aux)[0];
+	m_end_block = ((basic_block*)loop->aux)[num - 1];
+	break;
+      case CDI_POST_DOMINATORS:
+	m_reverse = true;
+	m_start_block = ((basic_block*)loop->aux)[num - 1];
+	m_end_block = ((basic_block*)loop->aux)[0];
+	break;
+      default:
+	gcc_unreachable ();
+    }
+}
+
 inline basic_block
 dom_info::get_idom (basic_block bb)
 {
@@ -336,10 +393,11 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
 /* The main entry for calculating the DFS tree or forest.  m_reverse is true,
    if we are interested in the reverse flow graph.  In that case the result is
    not necessarily a tree but a forest, because there may be nodes from which
-   the EXIT_BLOCK is unreachable.  */
+   the EXIT_BLOCK is unreachable.  IN_REGION is true if dfs is performed
+   for innermost loop considering as SESE region.  */
 
 void
-dom_info::calc_dfs_tree ()
+dom_info::calc_dfs_tree (bool in_region)
 {
   *m_dfs_last = m_dfsnum;
   m_dfs_to_bb[m_dfsnum] = m_start_block;
@@ -347,7 +405,7 @@ dom_info::calc_dfs_tree ()
 
   calc_dfs_tree_nonrec (m_start_block);
 
-  if (m_reverse)
+  if (m_reverse && !in_region)
     {
       /* In the post-dom case we may have nodes without a path to EXIT_BLOCK.
          They are reverse-unreachable.  In the dom-case we disallow such
@@ -493,10 +551,11 @@ dom_info::link_roots (TBB v, TBB w)
 
 /* This calculates the immediate dominators (or post-dominators). THIS is our
    working structure and should hold the DFS forest.
-   On return the immediate dominator to node V is in m_dom[V].  */
+   On return the immediate dominator to node V is in m_dom[V].
+   IN_REGIOS is true when dominator tree is built innermost loop.  */
 
 void
-dom_info::calc_idoms ()
+dom_info::calc_idoms (bool in_region)
 {
   /* Go backwards in DFS order, to first look at the leafs.  */
   for (TBB v = m_nodes; v > 1; v--)
@@ -511,7 +570,7 @@ dom_info::calc_idoms ()
 				   : ei_start (bb->preds);
       edge_iterator einext;
 
-      if (m_reverse)
+      if (m_reverse && !in_region)
 	{
 	  /* If this block has a fake edge to exit, process that first.  */
 	  if (bitmap_bit_p (m_fake_exit_edge, bb->index))
@@ -622,6 +681,31 @@ compute_dom_fast_query (enum cdi_direction dir)
   dom_computed[dir_index] = DOM_OK;
 }
 
+/* Analogous to the previous function but compute the data for loop body.  */
+
+static void
+compute_dom_fast_query_in_region (struct loop *loop, enum cdi_direction dir)
+{
+  int num = 0;
+  basic_block bb;
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+
+  gcc_checking_assert (dom_info_available_p (dir));
+
+  if (dom_computed[dir_index] == DOM_OK)
+    return;
+
+  /* Assign dfs numbers for loop nodes only.  */
+  for (unsigned int i = 1; i < loop->num_nodes + 1; i++)
+    {
+      bb = ((basic_block*)loop->aux)[i];
+      if (!bb->dom[dir_index]->father)
+	assign_dfs_numbers (bb->dom[dir_index], &num);
+    }
+
+  dom_computed[dir_index] = DOM_OK;
+}
+
 /* The main entry point into this module.  DIR is set depending on whether
    we want to compute dominators or postdominators.  */
 
@@ -649,8 +733,8 @@ calculate_dominance_info (cdi_direction dir)
       n_bbs_in_dom_tree[dir_index] = n_basic_blocks_for_fn (cfun);
 
       dom_info di (cfun, dir);
-      di.calc_dfs_tree ();
-      di.calc_idoms ();
+      di.calc_dfs_tree (false);
+      di.calc_idoms (false);
 
       FOR_EACH_BB_FN (b, cfun)
 	{
@@ -668,6 +752,45 @@ calculate_dominance_info (cdi_direction dir)
   timevar_pop (TV_DOMINANCE);
 }
 
+/* Analogous to the previous function but compute dominators or postdominators
+   for innermost loops with pre-header and post-header to form SESE region.  */
+
+void
+calculate_dominance_info_for_region (struct loop *loop, cdi_direction dir)
+{
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+  unsigned int n_nodes = loop->num_nodes + 2;
+
+  if (dom_computed[dir_index] == DOM_OK)
+    return;
+
+  timevar_push (TV_DOMINANCE);
+  if (!dom_info_available_p (dir))
+    {
+      basic_block bb;
+      gcc_assert (loop->aux);
+
+      for (unsigned int i = 0; i < n_nodes; i++)
+	{
+	  bb = ((basic_block*)loop->aux)[i];
+	  bb->dom[dir_index] = et_new_tree (bb);
+	}
+      dom_info di (loop, dir);
+      di.calc_dfs_tree (true);
+      di.calc_idoms (true);
+      for (unsigned int i = 0; i < n_nodes; i++)
+	{
+	  bb = ((basic_block*)loop->aux)[i];
+	  if (basic_block d = di.get_idom (bb))
+	    et_set_father (bb->dom[dir_index], d->dom[dir_index]);
+	}
+      dom_computed[dir_index] = DOM_NO_FAST_QUERY;
+    }
+  compute_dom_fast_query_in_region (loop, dir);
+
+  timevar_pop (TV_DOMINANCE);
+}
+
 /* Free dominance information for direction DIR.  */
 void
 free_dominance_info (function *fn, enum cdi_direction dir)
@@ -696,6 +819,26 @@ free_dominance_info (enum cdi_direction dir)
   free_dominance_info (cfun, dir);
 }
 
+void
+free_dominance_info_for_region (struct loop *loop, enum cdi_direction dir)
+{
+  basic_block bb;
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+
+  if (!dom_info_available_p (dir))
+    return;
+
+  for (unsigned int i = 0; i < loop->num_nodes + 2; i++)
+    {
+      bb = ((basic_block*)loop->aux)[i];
+      et_free_tree_force (bb->dom[dir_index]);
+      bb->dom[dir_index] = NULL;
+    }
+  et_free_pools ();
+  cfun->cfg->x_dom_computed[dir_index] = DOM_NONE;
+  cfun->cfg->x_n_bbs_in_dom_tree[dir_index] = 0;
+}
+
 /* Return the immediate dominator of basic block BB.  */
 basic_block
 get_immediate_dominator (enum cdi_direction dir, basic_block bb)
@@ -1012,8 +1155,8 @@ verify_dominators (cdi_direction dir)
   gcc_assert (dom_info_available_p (dir));
 
   dom_info di (cfun, dir);
-  di.calc_dfs_tree ();
-  di.calc_idoms ();
+  di.calc_dfs_tree (false);
+  di.calc_idoms (false);
 
   bool err = false;
   basic_block bb;
diff --git a/gcc/dominance.h b/gcc/dominance.h
index 961c4e2..3dbe526 100644
--- a/gcc/dominance.h
+++ b/gcc/dominance.h
@@ -36,8 +36,11 @@ enum dom_state
 };
 
 extern void calculate_dominance_info (enum cdi_direction);
+extern void calculate_dominance_info_for_region (struct loop *,
+						 enum cdi_direction);
 extern void free_dominance_info (function *, enum cdi_direction);
 extern void free_dominance_info (enum cdi_direction);
+extern void free_dominance_info_for_region (struct loop *, enum cdi_direction);
 extern basic_block get_immediate_dominator (enum cdi_direction, basic_block);
 extern void set_immediate_dominator (enum cdi_direction, basic_block,
 				     basic_block);
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index eec431e..46e793b 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -186,6 +186,12 @@ innermost_loop_behavior_hash::equal (const value_type &e1,
 /* List of basic blocks in if-conversion-suitable order.  */
 static basic_block *ifc_bbs;
 
+/* List of basic blocks representing SESE region.  */
+static basic_block *ifc_sese_bbs = NULL;
+
+/* Fake postheader basic block to build SESE region.  */
+static basic_block fake_postheader;
+
 /* Hash table to store <DR's innermost loop behavior, DR> pairs.  */
 static hash_map<innermost_loop_behavior_hash,
 		data_reference_p> *innermost_DR_map;
@@ -1309,6 +1315,53 @@ predicate_bbs (loop_p loop)
 	      && bb_predicate_gimplified_stmts (loop->latch) == NULL);
 }
 
+/* Build SESE region by adding loop pre-header and post-header blocks.  */
+
+static basic_block *
+build_sese_region (struct loop *loop)
+{
+  basic_block *blocks;
+
+  gcc_assert (ifc_bbs);
+  fake_postheader = NULL;
+  /* Add loop preheader and postheader blocks to form SESE region.  */
+  blocks = XCNEWVEC (basic_block, loop->num_nodes + 2);
+  blocks[0] = loop_preheader_edge (loop)->src;
+  gcc_assert (EDGE_COUNT (blocks[0]->succs) == 1);
+  for (unsigned int i = 0; i < loop->num_nodes; i++)
+    {
+      basic_block bb = ifc_bbs[i];
+      blocks[i + 1] = bb;
+      /* Find loop postheader.  */
+      edge e;
+      edge_iterator ei;
+      FOR_EACH_EDGE (e, ei, bb->succs)
+	if (loop_exit_edge_p (loop, e))
+	  {
+	    if (EDGE_COUNT (e->dest->preds) == 1)
+	      blocks[loop->num_nodes + 1] = e->dest;
+	    else
+	      {
+		/* Post-header must have the only predecessor,
+		   so create fake basic block the only predecessor
+		   of which is exit block to form SESE region.
+		   After completion basic block predication
+		   this block will be deleted and branch/edge of exit
+		   block will be restored.  */
+		basic_block bb = create_empty_bb (e->src);
+		unchecked_make_edge (e->src, bb, 0);
+		fake_postheader = bb;
+		blocks[loop->num_nodes + 1] = bb;
+		if (dump_file && (dump_flags & TDF_DETAILS))
+		  fprintf (dump_file, "Create fake post-header#%d!\n",
+			   bb->index);
+	      }
+	    break;
+	  }
+    }
+  return blocks;
+}
+
 /* Return true when LOOP is if-convertible.  This is a helper function
    for if_convertible_loop_p.  REFS and DDRS are initialized and freed
    in if_convertible_loop_p.  */
@@ -1322,7 +1375,8 @@ if_convertible_loop_p_1 (struct loop *loop, vec<data_reference_p> *refs)
   if (find_data_references_in_loop (loop, refs) == chrec_dont_know)
     return false;
 
-  calculate_dominance_info (CDI_DOMINATORS);
+  if (dom_info_state (CDI_DOMINATORS) != DOM_OK)
+    calculate_dominance_info (CDI_DOMINATORS);
 
   /* Allow statements that can be handled during if-conversion.  */
   ifc_bbs = get_loop_body_in_if_conv_order (loop);
@@ -1370,9 +1424,33 @@ if_convertible_loop_p_1 (struct loop *loop, vec<data_reference_p> *refs)
 	  = new hash_map<innermost_loop_behavior_hash, data_reference_p>;
   baseref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
 
-  calculate_dominance_info (CDI_POST_DOMINATORS);
+  /* Compute post-dominator tree locally.  */
+  gcc_assert (ifc_sese_bbs == NULL);
+  ifc_sese_bbs = build_sese_region (loop);
+  loop->aux = ifc_sese_bbs;
+  calculate_dominance_info_for_region (loop, CDI_POST_DOMINATORS);
+
   predicate_bbs (loop);
 
+  /* Free post-dominator tree since it is not used after predication.  */
+  free_dominance_info_for_region (loop, CDI_POST_DOMINATORS);
+  if (fake_postheader != NULL)
+    {
+      edge_iterator ei;
+      edge e;
+      basic_block exit_bb = single_pred (fake_postheader);
+      gcc_assert (exit_bb);
+      FOR_EACH_EDGE (e, ei, exit_bb->succs)
+      if (e->dest == fake_postheader)
+	{
+	  /* Delete this fake edge.  */
+	  remove_edge_raw (e);
+	  break;
+	}
+      expunge_block (fake_postheader);
+      fake_postheader = NULL;
+    }
+
   for (i = 0; refs->iterate (i, &dr); i++)
     {
       tree ref = DR_REF (dr);
@@ -2752,7 +2830,13 @@ tree_if_conversion (struct loop *loop)
       free (ifc_bbs);
       ifc_bbs = NULL;
     }
-  free_dominance_info (CDI_POST_DOMINATORS);
+
+  if (ifc_sese_bbs)
+    {
+      free (ifc_sese_bbs);
+      ifc_sese_bbs = NULL;
+    }
+  loop->aux = NULL;
 
   return todo;
 }
@@ -2805,14 +2889,6 @@ pass_if_conversion::execute (function *fun)
   if (number_of_loops (fun) <= 1)
     return 0;
 
-  /* If there are infinite loops, during CDI_POST_DOMINATORS computation
-     we can pick pretty much random bb inside of the infinite loop that
-     has the fake edge.  If we are unlucky enough, this can confuse the
-     add_to_predicate_list post-dominator check to optimize as if that
-     bb or some other one is a join block when it actually is not.
-     See PR70916.  */
-  connect_infinite_loops_to_exit ();
-
   FOR_EACH_LOOP (loop, 0)
     if (flag_tree_loop_if_convert == 1
 	|| flag_tree_loop_if_convert_stores == 1
@@ -2820,8 +2896,6 @@ pass_if_conversion::execute (function *fun)
 	    && !loop->dont_vectorize))
       todo |= tree_if_conversion (loop);
 
-  remove_fake_exit_edges ();
-
   if (flag_checking)
     {
       basic_block bb;

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Compile-time improvement for if conversion.
  2016-10-05 13:22 Compile-time improvement for if conversion Yuri Rumyantsev
@ 2016-10-10  9:52 ` Richard Biener
  2016-10-10 11:42   ` Yuri Rumyantsev
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Biener @ 2016-10-10  9:52 UTC (permalink / raw)
  To: Yuri Rumyantsev; +Cc: gcc-patches

On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Hi All,
>
> Here is implementation of Richard proposal:
>
> < For general infrastructure it would be nice to expose a (post-)dominator
> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
> < what makes if-conversion expensive is the post-dom compute which happens
> < for each loop for the whole function.  It shouldn't be very difficult
> < to write this,
> < sharing as much as possible code with the current DOM code might need
> < quite some refactoring though.
>
> I implemented this proposal by adding calculation of dominance info
> for SESE regions and incorporate this change to if conversion pass.
> SESE region is built by adding loop pre-header and possibly fake
> post-header blocks to loop body. Fake post-header is deleted after
> predication completion.
>
> Bootstrapping and regression testing did not show any new failures.
>
> Is it OK for trunk?

It's mostly reasonable but I have a few comments.  First, re-using
bb->dom[] for the dominator info is somewhat fragile but indeed
a requirement to make the patch reasonably small.  Please,
in calculate_dominance_info_for_region, make sure that
!dom_info_available_p (dir).

You pass loop * everywhere but require ->aux to be set up as
an array of BBs forming the region with special BBs at array ends.

Please instead pass in a vec<basic_block> which avoids using ->aux
and also allows other non-loop-based SESE regions to be used
(I couldn't spot anything that relies on this being a loop).

Adding a convenience wrapper for loop  * would be of course nice,
to cover the special pre/post-header code in tree-if-conv.c.

In theory a SESE region is fully specified by its entry end exit _edge_,
so you might want to see if it's possible to use such a pair of edges
to guard the dfs/idom walks to avoid the need to create fake blocks.

Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
please use split_edge() of the entry/exit edges.

Richard.

> ChangeLog:
> 2016-10-05  Yuri Rumyantsev  <ysrumyan@gmail.com>
>
> * dominance.c : Include cfgloop.h for loop recognition.
> (dom_info): Add new functions and add boolean argument to recognize
> computation for loop region.
> (dom_info::dom_info): New function.
> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
> handle unvisited blocks.
> (dom_info::calc_idoms): Likewise.
> (compute_dom_fast_query_in_region): New function.
> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
> false argument.
> (calculate_dominance_info_for_region): New function.
> (free_dominance_info_for_region): Likewise.
> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
> argument.
> * dominance.h: Add prototype for introduced functions
> calculate_dominance_info_for_region and
> free_dominance_info_for_region.
> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
> (build_sese_region): New function.
> (if_convertible_loop_p_1): Invoke local version of post-dominators
> calculation, free it after basic block predication and delete created
> fake post-header block if any.
> (tree_if_conversion): Delete call of free_dominance_info for
> post-dominators, free ifc_sese_bbs which represents SESE region.
> (pass_if_conversion::execute): Delete detection of infinite loops
> and fake edges to exit block since post-dominator calculation is
> performed per if-converted loop only.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Compile-time improvement for if conversion.
  2016-10-10  9:52 ` Richard Biener
@ 2016-10-10 11:42   ` Yuri Rumyantsev
  2016-10-10 12:00     ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Yuri Rumyantsev @ 2016-10-10 11:42 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Thanks Richard for your comments.
I'd like to answer on your last comment regarding use split_edge()
instead of creating fake post-header. I started with this splitting
but it requires to fix-up closed ssa form by creating additional phi
nodes, so I decided to use only cfg change without updating ssa form.
Other changes look reasonable and will fix them.

2016-10-10 12:52 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>> Hi All,
>>
>> Here is implementation of Richard proposal:
>>
>> < For general infrastructure it would be nice to expose a (post-)dominator
>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>> < what makes if-conversion expensive is the post-dom compute which happens
>> < for each loop for the whole function.  It shouldn't be very difficult
>> < to write this,
>> < sharing as much as possible code with the current DOM code might need
>> < quite some refactoring though.
>>
>> I implemented this proposal by adding calculation of dominance info
>> for SESE regions and incorporate this change to if conversion pass.
>> SESE region is built by adding loop pre-header and possibly fake
>> post-header blocks to loop body. Fake post-header is deleted after
>> predication completion.
>>
>> Bootstrapping and regression testing did not show any new failures.
>>
>> Is it OK for trunk?
>
> It's mostly reasonable but I have a few comments.  First, re-using
> bb->dom[] for the dominator info is somewhat fragile but indeed
> a requirement to make the patch reasonably small.  Please,
> in calculate_dominance_info_for_region, make sure that
> !dom_info_available_p (dir).
>
> You pass loop * everywhere but require ->aux to be set up as
> an array of BBs forming the region with special BBs at array ends.
>
> Please instead pass in a vec<basic_block> which avoids using ->aux
> and also allows other non-loop-based SESE regions to be used
> (I couldn't spot anything that relies on this being a loop).
>
> Adding a convenience wrapper for loop  * would be of course nice,
> to cover the special pre/post-header code in tree-if-conv.c.
>
> In theory a SESE region is fully specified by its entry end exit _edge_,
> so you might want to see if it's possible to use such a pair of edges
> to guard the dfs/idom walks to avoid the need to create fake blocks.
>
> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
> please use split_edge() of the entry/exit edges.
>
> Richard.
>
>> ChangeLog:
>> 2016-10-05  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>
>> * dominance.c : Include cfgloop.h for loop recognition.
>> (dom_info): Add new functions and add boolean argument to recognize
>> computation for loop region.
>> (dom_info::dom_info): New function.
>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>> handle unvisited blocks.
>> (dom_info::calc_idoms): Likewise.
>> (compute_dom_fast_query_in_region): New function.
>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>> false argument.
>> (calculate_dominance_info_for_region): New function.
>> (free_dominance_info_for_region): Likewise.
>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>> argument.
>> * dominance.h: Add prototype for introduced functions
>> calculate_dominance_info_for_region and
>> free_dominance_info_for_region.
>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>> (build_sese_region): New function.
>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>> calculation, free it after basic block predication and delete created
>> fake post-header block if any.
>> (tree_if_conversion): Delete call of free_dominance_info for
>> post-dominators, free ifc_sese_bbs which represents SESE region.
>> (pass_if_conversion::execute): Delete detection of infinite loops
>> and fake edges to exit block since post-dominator calculation is
>> performed per if-converted loop only.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Compile-time improvement for if conversion.
  2016-10-10 11:42   ` Yuri Rumyantsev
@ 2016-10-10 12:00     ` Richard Biener
  2016-10-10 14:17       ` Yuri Rumyantsev
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Biener @ 2016-10-10 12:00 UTC (permalink / raw)
  To: Yuri Rumyantsev; +Cc: gcc-patches

On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Thanks Richard for your comments.
> I'd like to answer on your last comment regarding use split_edge()
> instead of creating fake post-header. I started with this splitting
> but it requires to fix-up closed ssa form by creating additional phi
> nodes, so I decided to use only cfg change without updating ssa form.
> Other changes look reasonable and will fix them.

Ah.  In this case can you investigate what it takes to make the entry/exit
edges rather than BBs?  That is, introduce those "fakes" only internally
in dominance.c?

> 2016-10-10 12:52 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Hi All,
>>>
>>> Here is implementation of Richard proposal:
>>>
>>> < For general infrastructure it would be nice to expose a (post-)dominator
>>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>>> < what makes if-conversion expensive is the post-dom compute which happens
>>> < for each loop for the whole function.  It shouldn't be very difficult
>>> < to write this,
>>> < sharing as much as possible code with the current DOM code might need
>>> < quite some refactoring though.
>>>
>>> I implemented this proposal by adding calculation of dominance info
>>> for SESE regions and incorporate this change to if conversion pass.
>>> SESE region is built by adding loop pre-header and possibly fake
>>> post-header blocks to loop body. Fake post-header is deleted after
>>> predication completion.
>>>
>>> Bootstrapping and regression testing did not show any new failures.
>>>
>>> Is it OK for trunk?
>>
>> It's mostly reasonable but I have a few comments.  First, re-using
>> bb->dom[] for the dominator info is somewhat fragile but indeed
>> a requirement to make the patch reasonably small.  Please,
>> in calculate_dominance_info_for_region, make sure that
>> !dom_info_available_p (dir).
>>
>> You pass loop * everywhere but require ->aux to be set up as
>> an array of BBs forming the region with special BBs at array ends.
>>
>> Please instead pass in a vec<basic_block> which avoids using ->aux
>> and also allows other non-loop-based SESE regions to be used
>> (I couldn't spot anything that relies on this being a loop).
>>
>> Adding a convenience wrapper for loop  * would be of course nice,
>> to cover the special pre/post-header code in tree-if-conv.c.
>>
>> In theory a SESE region is fully specified by its entry end exit _edge_,
>> so you might want to see if it's possible to use such a pair of edges
>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>
>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>> please use split_edge() of the entry/exit edges.
>>
>> Richard.
>>
>>> ChangeLog:
>>> 2016-10-05  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>
>>> * dominance.c : Include cfgloop.h for loop recognition.
>>> (dom_info): Add new functions and add boolean argument to recognize
>>> computation for loop region.
>>> (dom_info::dom_info): New function.
>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>>> handle unvisited blocks.
>>> (dom_info::calc_idoms): Likewise.
>>> (compute_dom_fast_query_in_region): New function.
>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>>> false argument.
>>> (calculate_dominance_info_for_region): New function.
>>> (free_dominance_info_for_region): Likewise.
>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>>> argument.
>>> * dominance.h: Add prototype for introduced functions
>>> calculate_dominance_info_for_region and
>>> free_dominance_info_for_region.
>>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>>> (build_sese_region): New function.
>>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>>> calculation, free it after basic block predication and delete created
>>> fake post-header block if any.
>>> (tree_if_conversion): Delete call of free_dominance_info for
>>> post-dominators, free ifc_sese_bbs which represents SESE region.
>>> (pass_if_conversion::execute): Delete detection of infinite loops
>>> and fake edges to exit block since post-dominator calculation is
>>> performed per if-converted loop only.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Compile-time improvement for if conversion.
  2016-10-10 12:00     ` Richard Biener
@ 2016-10-10 14:17       ` Yuri Rumyantsev
  2016-10-11 10:33         ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Yuri Rumyantsev @ 2016-10-10 14:17 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Richard,

If "fake" exit or entry block is created in dominance how we can
determine what is its the only  predecessor or successor without using
a notion of loop?

2016-10-10 15:00 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>> Thanks Richard for your comments.
>> I'd like to answer on your last comment regarding use split_edge()
>> instead of creating fake post-header. I started with this splitting
>> but it requires to fix-up closed ssa form by creating additional phi
>> nodes, so I decided to use only cfg change without updating ssa form.
>> Other changes look reasonable and will fix them.
>
> Ah.  In this case can you investigate what it takes to make the entry/exit
> edges rather than BBs?  That is, introduce those "fakes" only internally
> in dominance.c?
>
>> 2016-10-10 12:52 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>> Hi All,
>>>>
>>>> Here is implementation of Richard proposal:
>>>>
>>>> < For general infrastructure it would be nice to expose a (post-)dominator
>>>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>>>> < what makes if-conversion expensive is the post-dom compute which happens
>>>> < for each loop for the whole function.  It shouldn't be very difficult
>>>> < to write this,
>>>> < sharing as much as possible code with the current DOM code might need
>>>> < quite some refactoring though.
>>>>
>>>> I implemented this proposal by adding calculation of dominance info
>>>> for SESE regions and incorporate this change to if conversion pass.
>>>> SESE region is built by adding loop pre-header and possibly fake
>>>> post-header blocks to loop body. Fake post-header is deleted after
>>>> predication completion.
>>>>
>>>> Bootstrapping and regression testing did not show any new failures.
>>>>
>>>> Is it OK for trunk?
>>>
>>> It's mostly reasonable but I have a few comments.  First, re-using
>>> bb->dom[] for the dominator info is somewhat fragile but indeed
>>> a requirement to make the patch reasonably small.  Please,
>>> in calculate_dominance_info_for_region, make sure that
>>> !dom_info_available_p (dir).
>>>
>>> You pass loop * everywhere but require ->aux to be set up as
>>> an array of BBs forming the region with special BBs at array ends.
>>>
>>> Please instead pass in a vec<basic_block> which avoids using ->aux
>>> and also allows other non-loop-based SESE regions to be used
>>> (I couldn't spot anything that relies on this being a loop).
>>>
>>> Adding a convenience wrapper for loop  * would be of course nice,
>>> to cover the special pre/post-header code in tree-if-conv.c.
>>>
>>> In theory a SESE region is fully specified by its entry end exit _edge_,
>>> so you might want to see if it's possible to use such a pair of edges
>>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>>
>>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>>> please use split_edge() of the entry/exit edges.
>>>
>>> Richard.
>>>
>>>> ChangeLog:
>>>> 2016-10-05  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>
>>>> * dominance.c : Include cfgloop.h for loop recognition.
>>>> (dom_info): Add new functions and add boolean argument to recognize
>>>> computation for loop region.
>>>> (dom_info::dom_info): New function.
>>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>>>> handle unvisited blocks.
>>>> (dom_info::calc_idoms): Likewise.
>>>> (compute_dom_fast_query_in_region): New function.
>>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>>>> false argument.
>>>> (calculate_dominance_info_for_region): New function.
>>>> (free_dominance_info_for_region): Likewise.
>>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>>>> argument.
>>>> * dominance.h: Add prototype for introduced functions
>>>> calculate_dominance_info_for_region and
>>>> free_dominance_info_for_region.
>>>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>>>> (build_sese_region): New function.
>>>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>>>> calculation, free it after basic block predication and delete created
>>>> fake post-header block if any.
>>>> (tree_if_conversion): Delete call of free_dominance_info for
>>>> post-dominators, free ifc_sese_bbs which represents SESE region.
>>>> (pass_if_conversion::execute): Delete detection of infinite loops
>>>> and fake edges to exit block since post-dominator calculation is
>>>> performed per if-converted loop only.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Compile-time improvement for if conversion.
  2016-10-10 14:17       ` Yuri Rumyantsev
@ 2016-10-11 10:33         ` Richard Biener
  2016-10-11 13:23           ` Yuri Rumyantsev
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Biener @ 2016-10-11 10:33 UTC (permalink / raw)
  To: Yuri Rumyantsev; +Cc: gcc-patches

On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Richard,
>
> If "fake" exit or entry block is created in dominance how we can
> determine what is its the only  predecessor or successor without using
> a notion of loop?

The caller passes in an entry and exit edge instead of a block or loop.

Richard.

> 2016-10-10 15:00 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Thanks Richard for your comments.
>>> I'd like to answer on your last comment regarding use split_edge()
>>> instead of creating fake post-header. I started with this splitting
>>> but it requires to fix-up closed ssa form by creating additional phi
>>> nodes, so I decided to use only cfg change without updating ssa form.
>>> Other changes look reasonable and will fix them.
>>
>> Ah.  In this case can you investigate what it takes to make the entry/exit
>> edges rather than BBs?  That is, introduce those "fakes" only internally
>> in dominance.c?
>>
>>> 2016-10-10 12:52 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>> Hi All,
>>>>>
>>>>> Here is implementation of Richard proposal:
>>>>>
>>>>> < For general infrastructure it would be nice to expose a (post-)dominator
>>>>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>>>>> < what makes if-conversion expensive is the post-dom compute which happens
>>>>> < for each loop for the whole function.  It shouldn't be very difficult
>>>>> < to write this,
>>>>> < sharing as much as possible code with the current DOM code might need
>>>>> < quite some refactoring though.
>>>>>
>>>>> I implemented this proposal by adding calculation of dominance info
>>>>> for SESE regions and incorporate this change to if conversion pass.
>>>>> SESE region is built by adding loop pre-header and possibly fake
>>>>> post-header blocks to loop body. Fake post-header is deleted after
>>>>> predication completion.
>>>>>
>>>>> Bootstrapping and regression testing did not show any new failures.
>>>>>
>>>>> Is it OK for trunk?
>>>>
>>>> It's mostly reasonable but I have a few comments.  First, re-using
>>>> bb->dom[] for the dominator info is somewhat fragile but indeed
>>>> a requirement to make the patch reasonably small.  Please,
>>>> in calculate_dominance_info_for_region, make sure that
>>>> !dom_info_available_p (dir).
>>>>
>>>> You pass loop * everywhere but require ->aux to be set up as
>>>> an array of BBs forming the region with special BBs at array ends.
>>>>
>>>> Please instead pass in a vec<basic_block> which avoids using ->aux
>>>> and also allows other non-loop-based SESE regions to be used
>>>> (I couldn't spot anything that relies on this being a loop).
>>>>
>>>> Adding a convenience wrapper for loop  * would be of course nice,
>>>> to cover the special pre/post-header code in tree-if-conv.c.
>>>>
>>>> In theory a SESE region is fully specified by its entry end exit _edge_,
>>>> so you might want to see if it's possible to use such a pair of edges
>>>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>>>
>>>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>>>> please use split_edge() of the entry/exit edges.
>>>>
>>>> Richard.
>>>>
>>>>> ChangeLog:
>>>>> 2016-10-05  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>
>>>>> * dominance.c : Include cfgloop.h for loop recognition.
>>>>> (dom_info): Add new functions and add boolean argument to recognize
>>>>> computation for loop region.
>>>>> (dom_info::dom_info): New function.
>>>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>>>>> handle unvisited blocks.
>>>>> (dom_info::calc_idoms): Likewise.
>>>>> (compute_dom_fast_query_in_region): New function.
>>>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>>>>> false argument.
>>>>> (calculate_dominance_info_for_region): New function.
>>>>> (free_dominance_info_for_region): Likewise.
>>>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>>>>> argument.
>>>>> * dominance.h: Add prototype for introduced functions
>>>>> calculate_dominance_info_for_region and
>>>>> free_dominance_info_for_region.
>>>>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>>>>> (build_sese_region): New function.
>>>>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>>>>> calculation, free it after basic block predication and delete created
>>>>> fake post-header block if any.
>>>>> (tree_if_conversion): Delete call of free_dominance_info for
>>>>> post-dominators, free ifc_sese_bbs which represents SESE region.
>>>>> (pass_if_conversion::execute): Delete detection of infinite loops
>>>>> and fake edges to exit block since post-dominator calculation is
>>>>> performed per if-converted loop only.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Compile-time improvement for if conversion.
  2016-10-11 10:33         ` Richard Biener
@ 2016-10-11 13:23           ` Yuri Rumyantsev
  2016-10-11 13:48             ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Yuri Rumyantsev @ 2016-10-11 13:23 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5297 bytes --]

Richard,

I implemented this by passing callback function in_region which
returns true if block belongs to region.
I am testing it now

I attach modified patch for your quick review.

Thanks.

2016-10-11 13:33 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
> On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>> Richard,
>>
>> If "fake" exit or entry block is created in dominance how we can
>> determine what is its the only  predecessor or successor without using
>> a notion of loop?
>
> The caller passes in an entry and exit edge instead of a block or loop.
>
> Richard.
>
>> 2016-10-10 15:00 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>> Thanks Richard for your comments.
>>>> I'd like to answer on your last comment regarding use split_edge()
>>>> instead of creating fake post-header. I started with this splitting
>>>> but it requires to fix-up closed ssa form by creating additional phi
>>>> nodes, so I decided to use only cfg change without updating ssa form.
>>>> Other changes look reasonable and will fix them.
>>>
>>> Ah.  In this case can you investigate what it takes to make the entry/exit
>>> edges rather than BBs?  That is, introduce those "fakes" only internally
>>> in dominance.c?
>>>
>>>> 2016-10-10 12:52 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> Here is implementation of Richard proposal:
>>>>>>
>>>>>> < For general infrastructure it would be nice to expose a (post-)dominator
>>>>>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>>>>>> < what makes if-conversion expensive is the post-dom compute which happens
>>>>>> < for each loop for the whole function.  It shouldn't be very difficult
>>>>>> < to write this,
>>>>>> < sharing as much as possible code with the current DOM code might need
>>>>>> < quite some refactoring though.
>>>>>>
>>>>>> I implemented this proposal by adding calculation of dominance info
>>>>>> for SESE regions and incorporate this change to if conversion pass.
>>>>>> SESE region is built by adding loop pre-header and possibly fake
>>>>>> post-header blocks to loop body. Fake post-header is deleted after
>>>>>> predication completion.
>>>>>>
>>>>>> Bootstrapping and regression testing did not show any new failures.
>>>>>>
>>>>>> Is it OK for trunk?
>>>>>
>>>>> It's mostly reasonable but I have a few comments.  First, re-using
>>>>> bb->dom[] for the dominator info is somewhat fragile but indeed
>>>>> a requirement to make the patch reasonably small.  Please,
>>>>> in calculate_dominance_info_for_region, make sure that
>>>>> !dom_info_available_p (dir).
>>>>>
>>>>> You pass loop * everywhere but require ->aux to be set up as
>>>>> an array of BBs forming the region with special BBs at array ends.
>>>>>
>>>>> Please instead pass in a vec<basic_block> which avoids using ->aux
>>>>> and also allows other non-loop-based SESE regions to be used
>>>>> (I couldn't spot anything that relies on this being a loop).
>>>>>
>>>>> Adding a convenience wrapper for loop  * would be of course nice,
>>>>> to cover the special pre/post-header code in tree-if-conv.c.
>>>>>
>>>>> In theory a SESE region is fully specified by its entry end exit _edge_,
>>>>> so you might want to see if it's possible to use such a pair of edges
>>>>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>>>>
>>>>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>>>>> please use split_edge() of the entry/exit edges.
>>>>>
>>>>> Richard.
>>>>>
>>>>>> ChangeLog:
>>>>>> 2016-10-05  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>
>>>>>> * dominance.c : Include cfgloop.h for loop recognition.
>>>>>> (dom_info): Add new functions and add boolean argument to recognize
>>>>>> computation for loop region.
>>>>>> (dom_info::dom_info): New function.
>>>>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>>>>>> handle unvisited blocks.
>>>>>> (dom_info::calc_idoms): Likewise.
>>>>>> (compute_dom_fast_query_in_region): New function.
>>>>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>>>>>> false argument.
>>>>>> (calculate_dominance_info_for_region): New function.
>>>>>> (free_dominance_info_for_region): Likewise.
>>>>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>>>>>> argument.
>>>>>> * dominance.h: Add prototype for introduced functions
>>>>>> calculate_dominance_info_for_region and
>>>>>> free_dominance_info_for_region.
>>>>>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>>>>>> (build_sese_region): New function.
>>>>>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>>>>>> calculation, free it after basic block predication and delete created
>>>>>> fake post-header block if any.
>>>>>> (tree_if_conversion): Delete call of free_dominance_info for
>>>>>> post-dominators, free ifc_sese_bbs which represents SESE region.
>>>>>> (pass_if_conversion::execute): Delete detection of infinite loops
>>>>>> and fake edges to exit block since post-dominator calculation is
>>>>>> performed per if-converted loop only.

[-- Attachment #2: patch.2 --]
[-- Type: application/octet-stream, Size: 14622 bytes --]

diff --git a/gcc/dominance.c b/gcc/dominance.c
index e3308cc..00410e9 100644
--- a/gcc/dominance.c
+++ b/gcc/dominance.c
@@ -41,6 +41,7 @@
 #include "cfganal.h"
 #include "et-forest.h"
 #include "graphds.h"
+#include "cfghooks.h"
 
 /* We name our nodes with integers, beginning with 1.  Zero is reserved for
    'undefined' or 'end of list'.  The name of each node is given by the dfs
@@ -60,9 +61,10 @@ class dom_info
 {
 public:
   dom_info (function *, cdi_direction);
+  dom_info (vec <basic_block>, cdi_direction);
   ~dom_info ();
-  void calc_dfs_tree ();
-  void calc_idoms ();
+  void calc_dfs_tree (bool);
+  void calc_idoms (bool);
 
   inline basic_block get_idom (basic_block);
 private:
@@ -204,6 +206,61 @@ dom_info::dom_info (function *fn, cdi_direction dir)
     }
 }
 
+/* Constructor for SESE region.  */
+
+dom_info::dom_info (vec<basic_block> region, cdi_direction dir)
+{
+  /* We need memory for n_basic_blocks nodes.  */
+  size_t num = m_n_basic_blocks = region.length ();
+  m_dfs_parent = new_zero_array <TBB> (num);
+  m_dom = new_zero_array <TBB> (num);
+
+  m_path_min = new TBB[num];
+  m_key = new TBB[num];
+  m_set_size = new unsigned int[num];
+  for (size_t i = 0; i < num; i++)
+    {
+      m_path_min[i] = m_key[i] = i;
+      m_set_size[i] = 1;
+    }
+
+  m_bucket = new_zero_array <TBB> (num);
+  m_next_bucket = new_zero_array <TBB> (num);
+
+  m_set_chain = new_zero_array <TBB> (num);
+  m_set_child = new_zero_array <TBB> (num);
+
+  /* Determine max basic block index in region.  */
+  int max_index = region[0]->index;
+  for (size_t i = 1; i < num; i++)
+    if (region[i]->index > max_index)
+      max_index = region[i]->index;
+  max_index += 1;
+  m_dfs_order = new_zero_array <TBB> (max_index + 1);
+  m_dfs_last = &m_dfs_order[max_index];
+  m_dfs_to_bb = new_zero_array <basic_block> (num);
+
+  m_dfsnum = 1;
+  m_nodes = 0;
+  m_fake_exit_edge = NULL; /* Assume that region is SCC.  */
+
+  switch (dir)
+    {
+      case CDI_DOMINATORS:
+	m_reverse = false;
+	m_start_block = region[0];
+	m_end_block = region[num - 1];
+	break;
+      case CDI_POST_DOMINATORS:
+	m_reverse = true;
+	m_start_block = region[num - 1];
+	m_end_block = region[0];
+	break;
+      default:
+	gcc_unreachable ();
+    }
+}
+
 inline basic_block
 dom_info::get_idom (basic_block bb)
 {
@@ -336,10 +393,11 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
 /* The main entry for calculating the DFS tree or forest.  m_reverse is true,
    if we are interested in the reverse flow graph.  In that case the result is
    not necessarily a tree but a forest, because there may be nodes from which
-   the EXIT_BLOCK is unreachable.  */
+   the EXIT_BLOCK is unreachable.  IN_REGION is true if dfs is performed
+   for SESE region.  */
 
 void
-dom_info::calc_dfs_tree ()
+dom_info::calc_dfs_tree (bool in_region)
 {
   *m_dfs_last = m_dfsnum;
   m_dfs_to_bb[m_dfsnum] = m_start_block;
@@ -347,7 +405,7 @@ dom_info::calc_dfs_tree ()
 
   calc_dfs_tree_nonrec (m_start_block);
 
-  if (m_reverse)
+  if (m_reverse && !in_region)
     {
       /* In the post-dom case we may have nodes without a path to EXIT_BLOCK.
          They are reverse-unreachable.  In the dom-case we disallow such
@@ -493,10 +551,11 @@ dom_info::link_roots (TBB v, TBB w)
 
 /* This calculates the immediate dominators (or post-dominators). THIS is our
    working structure and should hold the DFS forest.
-   On return the immediate dominator to node V is in m_dom[V].  */
+   On return the immediate dominator to node V is in m_dom[V].
+   IN_REGION is true when dominator tree is built fro SESE region.  */
 
 void
-dom_info::calc_idoms ()
+dom_info::calc_idoms (bool in_region)
 {
   /* Go backwards in DFS order, to first look at the leafs.  */
   for (TBB v = m_nodes; v > 1; v--)
@@ -511,7 +570,7 @@ dom_info::calc_idoms ()
 				   : ei_start (bb->preds);
       edge_iterator einext;
 
-      if (m_reverse)
+      if (m_reverse && !in_region)
 	{
 	  /* If this block has a fake edge to exit, process that first.  */
 	  if (bitmap_bit_p (m_fake_exit_edge, bb->index))
@@ -622,6 +681,32 @@ compute_dom_fast_query (enum cdi_direction dir)
   dom_computed[dir_index] = DOM_OK;
 }
 
+/* Analogous to the previous function but compute the data for SESE region.  */
+
+static void
+compute_dom_fast_query_in_region (enum cdi_direction dir,
+				  vec<basic_block> region)
+{
+  int num = 0;
+  basic_block bb;
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+
+  gcc_checking_assert (dom_info_available_p (dir));
+
+  if (dom_computed[dir_index] == DOM_OK)
+    return;
+
+  /* Assign dfs numbers for region nodes except for entry and exit nodes.  */ 
+  for (unsigned int i = 1; i < region.length () - 1; i++)
+    {
+      bb = region[i];
+      if (!bb->dom[dir_index]->father)
+	assign_dfs_numbers (bb->dom[dir_index], &num);
+    }
+
+  dom_computed[dir_index] = DOM_OK;
+}
+
 /* The main entry point into this module.  DIR is set depending on whether
    we want to compute dominators or postdominators.  */
 
@@ -649,8 +734,8 @@ calculate_dominance_info (cdi_direction dir)
       n_bbs_in_dom_tree[dir_index] = n_basic_blocks_for_fn (cfun);
 
       dom_info di (cfun, dir);
-      di.calc_dfs_tree ();
-      di.calc_idoms ();
+      di.calc_dfs_tree (false);
+      di.calc_idoms (false);
 
       FOR_EACH_BB_FN (b, cfun)
 	{
@@ -668,6 +753,123 @@ calculate_dominance_info (cdi_direction dir)
   timevar_pop (TV_DOMINANCE);
 }
 
+/* Analogous to the previous function but compute dominators or postdominators
+   for innermost loops with pre-header and post-header to form SESE region.
+   New entry or exit node can be created if region is not SESE.  REGION is
+   released after dominance tree was built.  Callback function IN_REGION is
+   supplied together with and it returns true if basic block belongs to
+   REGION.  */
+
+void
+calculate_dominance_info_for_region (cdi_direction dir,
+				     vec<basic_block> region,
+				     bool (*in_region) (vec<basic_block>,
+							basic_block))
+{
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+  basic_block bb;
+  unsigned int i;
+  basic_block entry, new_entry = NULL;
+  basic_block exit, new_exit = NULL;
+  edge e;
+  edge_iterator ei;
+
+  if (dom_computed[dir_index] == DOM_OK)
+    return;
+
+  timevar_push (TV_DOMINANCE);
+  /* Assume that dom info is not partially computed.  */
+  gcc_assert (!dom_info_available_p (dir));
+
+  FOR_EACH_VEC_ELT (region, i, bb)
+    {
+      bb->dom[dir_index] = et_new_tree (bb);
+    }
+  /* Check that region is SESE region.  */
+  entry = region[0];
+  if ( EDGE_COUNT (entry->succs) > 1)
+    {
+      /* Create new entry block with the only successor.  */
+      basic_block succ = NULL;
+      bool found = false;
+      FOR_EACH_EDGE (e, ei, entry->succs)
+	if (in_region (region, e->dest))
+	  {
+	    gcc_assert (!found);
+	    found = true;
+	    succ = e->dest;
+	  }
+    gcc_assert (found);
+    new_entry = create_empty_bb (entry);
+    unchecked_make_edge (new_entry, succ, 0);
+    /* Put it to region as entry node.  */
+    region[0] = new_entry;
+    }
+  /* Check that exit has the only predecessor.  */
+  exit = region.pop ();
+  if (EDGE_COUNT (exit->preds) == 1)
+    region.safe_push (exit);
+  else
+    {
+      /* Create new exit block with the only predecessor.  */
+      basic_block pred = NULL;
+      bool found = false;
+      FOR_EACH_EDGE (e, ei, exit->preds)
+	if (in_region (region, e->src))
+	  {
+	    gcc_assert (!found);
+	    found = true;
+	    pred = e->src;
+	  }
+      gcc_assert (found);
+      new_exit = create_empty_bb (exit);
+      unchecked_make_edge (pred, new_exit, 0);
+      region.safe_push (new_exit);
+    }
+  dom_info di (region, dir);
+  di.calc_dfs_tree (true);
+  di.calc_idoms (true);
+
+  FOR_EACH_VEC_ELT (region, i, bb)
+    if (basic_block d = di.get_idom (bb))
+      et_set_father (bb->dom[dir_index], d->dom[dir_index]);
+
+  dom_computed[dir_index] = DOM_NO_FAST_QUERY;
+  compute_dom_fast_query_in_region (dir, region);
+
+  /* Delete created entry/exit nodes.  */
+  if (new_entry)
+    {
+      region[0] = entry;  /* restore entry node.  */
+      bb = single_succ (new_entry);
+      FOR_EACH_EDGE (e, ei, bb->preds)
+	if (e->src == new_entry)
+	  {
+	    remove_edge_raw (e);
+	    break;
+	  }
+      et_free_tree_force (new_entry->dom[dir_index]);
+      expunge_block (new_entry);
+    }
+  if (new_exit)
+    {
+      /* Restore exit node.  */
+      region.pop ();
+      region.safe_push (exit);
+      bb = single_pred (new_exit);
+      FOR_EACH_EDGE (e, ei, bb->succs)
+	if (e->dest == new_exit)
+	  {
+	    remove_edge_raw (e);
+	    break;
+	  }
+      et_free_tree_force (new_exit->dom[dir_index]);
+      expunge_block (new_exit);
+    }
+
+  timevar_pop (TV_DOMINANCE);
+}
+
 /* Free dominance information for direction DIR.  */
 void
 free_dominance_info (function *fn, enum cdi_direction dir)
@@ -696,6 +898,27 @@ free_dominance_info (enum cdi_direction dir)
   free_dominance_info (cfun, dir);
 }
 
+void
+free_dominance_info_for_region (enum cdi_direction dir,
+				vec<basic_block> region)
+{
+  basic_block bb;
+  unsigned int i;
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+
+  if (!dom_info_available_p (dir))
+    return;
+
+  FOR_EACH_VEC_ELT (region, i, bb)
+    {
+      et_free_tree_force (bb->dom[dir_index]);
+      bb->dom[dir_index] = NULL;
+    }
+  et_free_pools ();
+  cfun->cfg->x_dom_computed[dir_index] = DOM_NONE;
+  cfun->cfg->x_n_bbs_in_dom_tree[dir_index] = 0;
+}
+
 /* Return the immediate dominator of basic block BB.  */
 basic_block
 get_immediate_dominator (enum cdi_direction dir, basic_block bb)
@@ -1012,8 +1235,8 @@ verify_dominators (cdi_direction dir)
   gcc_assert (dom_info_available_p (dir));
 
   dom_info di (cfun, dir);
-  di.calc_dfs_tree ();
-  di.calc_idoms ();
+  di.calc_dfs_tree (false);
+  di.calc_idoms (false);
 
   bool err = false;
   basic_block bb;
diff --git a/gcc/dominance.h b/gcc/dominance.h
index 961c4e2..8483cb5 100644
--- a/gcc/dominance.h
+++ b/gcc/dominance.h
@@ -36,8 +36,13 @@ enum dom_state
 };
 
 extern void calculate_dominance_info (enum cdi_direction);
+void calculate_dominance_info_for_region (enum cdi_direction,
+					  vec<basic_block>,
+				          bool (*in_region) (vec<basic_block>,
+							     basic_block));
 extern void free_dominance_info (function *, enum cdi_direction);
 extern void free_dominance_info (enum cdi_direction);
+void free_dominance_info_for_region (enum cdi_direction, vec<basic_block>);
 extern basic_block get_immediate_dominator (enum cdi_direction, basic_block);
 extern void set_immediate_dominator (enum cdi_direction, basic_block,
 				     basic_block);
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index eec431e..5237f9e 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -1309,6 +1309,47 @@ predicate_bbs (loop_p loop)
 	      && bb_predicate_gimplified_stmts (loop->latch) == NULL);
 }
 
+/* Build SESE region by adding loop pre-header and post-header blocks.  */
+
+static vec<basic_block>
+build_sese_region (struct loop *loop)
+{
+  vec<basic_block> region = vNULL;
+  basic_block exit_bb = NULL;
+
+  gcc_assert (ifc_bbs);
+  /* The first element is loop pre-header.  */
+  region.safe_push (loop_preheader_edge (loop)->src);
+
+  for (unsigned int i = 0; i < loop->num_nodes; i++)
+    {
+      basic_block bb = ifc_bbs[i];
+      region.safe_push (bb);
+      /* Find loop postheader.  */
+      edge e;
+      edge_iterator ei;
+      FOR_EACH_EDGE (e, ei, bb->succs)
+	if (loop_exit_edge_p (loop, e))
+	  {
+	      exit_bb = e->dest;
+	      break;
+	  }
+    }
+  /* The last element is loop post-header.  */
+  gcc_assert (exit_bb);
+  region.safe_push (exit_bb);
+  return region;
+}
+
+/* Callback function for post-dominator tree calculattion for loop region.
+   Returns true if basic_block BB belongs to region REGION.  */
+static bool
+in_region (vec<basic_block> region, basic_block bb)
+{
+  struct loop *loop = region[1]->loop_father; /* Use element inside region.  */
+  return bb->loop_father == loop;
+}
+
 /* Return true when LOOP is if-convertible.  This is a helper function
    for if_convertible_loop_p.  REFS and DDRS are initialized and freed
    in if_convertible_loop_p.  */
@@ -1318,11 +1359,13 @@ if_convertible_loop_p_1 (struct loop *loop, vec<data_reference_p> *refs)
 {
   unsigned int i;
   basic_block exit_bb = NULL;
+  vec<basic_block> region;
 
   if (find_data_references_in_loop (loop, refs) == chrec_dont_know)
     return false;
 
-  calculate_dominance_info (CDI_DOMINATORS);
+  if (dom_info_state (CDI_DOMINATORS) != DOM_OK)
+    calculate_dominance_info (CDI_DOMINATORS);
 
   /* Allow statements that can be handled during if-conversion.  */
   ifc_bbs = get_loop_body_in_if_conv_order (loop);
@@ -1370,9 +1413,16 @@ if_convertible_loop_p_1 (struct loop *loop, vec<data_reference_p> *refs)
 	  = new hash_map<innermost_loop_behavior_hash, data_reference_p>;
   baseref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
 
-  calculate_dominance_info (CDI_POST_DOMINATORS);
+  /* Compute post-dominator tree locally.  */
+  region = build_sese_region (loop);
+  calculate_dominance_info_for_region (CDI_POST_DOMINATORS, region, in_region);
+
   predicate_bbs (loop);
 
+  /* Free post-dominator tree since it is not used after predication.  */
+  free_dominance_info_for_region (CDI_POST_DOMINATORS, region);
+  region.release ();
+
   for (i = 0; refs->iterate (i, &dr); i++)
     {
       tree ref = DR_REF (dr);
@@ -2752,7 +2802,6 @@ tree_if_conversion (struct loop *loop)
       free (ifc_bbs);
       ifc_bbs = NULL;
     }
-  free_dominance_info (CDI_POST_DOMINATORS);
 
   return todo;
 }
@@ -2805,14 +2854,6 @@ pass_if_conversion::execute (function *fun)
   if (number_of_loops (fun) <= 1)
     return 0;
 
-  /* If there are infinite loops, during CDI_POST_DOMINATORS computation
-     we can pick pretty much random bb inside of the infinite loop that
-     has the fake edge.  If we are unlucky enough, this can confuse the
-     add_to_predicate_list post-dominator check to optimize as if that
-     bb or some other one is a join block when it actually is not.
-     See PR70916.  */
-  connect_infinite_loops_to_exit ();
-
   FOR_EACH_LOOP (loop, 0)
     if (flag_tree_loop_if_convert == 1
 	|| flag_tree_loop_if_convert_stores == 1
@@ -2820,8 +2861,6 @@ pass_if_conversion::execute (function *fun)
 	    && !loop->dont_vectorize))
       todo |= tree_if_conversion (loop);
 
-  remove_fake_exit_edges ();
-
   if (flag_checking)
     {
       basic_block bb;

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Compile-time improvement for if conversion.
  2016-10-11 13:23           ` Yuri Rumyantsev
@ 2016-10-11 13:48             ` Richard Biener
  2016-10-12 13:37               ` Yuri Rumyantsev
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Biener @ 2016-10-11 13:48 UTC (permalink / raw)
  To: Yuri Rumyantsev; +Cc: gcc-patches

On Tue, Oct 11, 2016 at 3:23 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Richard,
>
> I implemented this by passing callback function in_region which
> returns true if block belongs to region.
> I am testing it now
>
> I attach modified patch for your quick review.

+  FOR_EACH_VEC_ELT (region, i, bb)
+    {
+      bb->dom[dir_index] = et_new_tree (bb);
+    }
+  /* Check that region is SESE region.  */
+  entry = region[0];
+  if ( EDGE_COUNT (entry->succs) > 1)
+    {
+      /* Create new entry block with the only successor.  */
+      basic_block succ = NULL;
+      bool found = false;
+      FOR_EACH_EDGE (e, ei, entry->succs)
+       if (in_region (region, e->dest))
+         {
+           gcc_assert (!found);

is that check equal to e->dest->dom[dir_index] != NULL?  I think we
shouldn't need the callback.

+    new_entry = create_empty_bb (entry);
+    unchecked_make_edge (new_entry, succ, 0);

I still think this is somewhat gross and we should try to avoid it
somehow - at least it's well-hidden now ;)

+    /* Put it to region as entry node.  */
+    region[0] = new_entry;

so region[0] is overwritten now?

As far as I understand calc_dfs_tree it should be possible to compute
the region on-the-fly
from calling calc_dfs_tree_nonrec on the entry/exit (also maybe
avoiding some of the still
"large" complexity of zeroing arrays in the constructor).

And if we use that DFS walk directly we should be able to avoid
creating those fake entry/exit
blocks by using entry/exit edges instead... (somehow).

Richard.



> Thanks.
>
> 2016-10-11 13:33 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>> On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Richard,
>>>
>>> If "fake" exit or entry block is created in dominance how we can
>>> determine what is its the only  predecessor or successor without using
>>> a notion of loop?
>>
>> The caller passes in an entry and exit edge instead of a block or loop.
>>
>> Richard.
>>
>>> 2016-10-10 15:00 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>> Thanks Richard for your comments.
>>>>> I'd like to answer on your last comment regarding use split_edge()
>>>>> instead of creating fake post-header. I started with this splitting
>>>>> but it requires to fix-up closed ssa form by creating additional phi
>>>>> nodes, so I decided to use only cfg change without updating ssa form.
>>>>> Other changes look reasonable and will fix them.
>>>>
>>>> Ah.  In this case can you investigate what it takes to make the entry/exit
>>>> edges rather than BBs?  That is, introduce those "fakes" only internally
>>>> in dominance.c?
>>>>
>>>>> 2016-10-10 12:52 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> Here is implementation of Richard proposal:
>>>>>>>
>>>>>>> < For general infrastructure it would be nice to expose a (post-)dominator
>>>>>>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>>>>>>> < what makes if-conversion expensive is the post-dom compute which happens
>>>>>>> < for each loop for the whole function.  It shouldn't be very difficult
>>>>>>> < to write this,
>>>>>>> < sharing as much as possible code with the current DOM code might need
>>>>>>> < quite some refactoring though.
>>>>>>>
>>>>>>> I implemented this proposal by adding calculation of dominance info
>>>>>>> for SESE regions and incorporate this change to if conversion pass.
>>>>>>> SESE region is built by adding loop pre-header and possibly fake
>>>>>>> post-header blocks to loop body. Fake post-header is deleted after
>>>>>>> predication completion.
>>>>>>>
>>>>>>> Bootstrapping and regression testing did not show any new failures.
>>>>>>>
>>>>>>> Is it OK for trunk?
>>>>>>
>>>>>> It's mostly reasonable but I have a few comments.  First, re-using
>>>>>> bb->dom[] for the dominator info is somewhat fragile but indeed
>>>>>> a requirement to make the patch reasonably small.  Please,
>>>>>> in calculate_dominance_info_for_region, make sure that
>>>>>> !dom_info_available_p (dir).
>>>>>>
>>>>>> You pass loop * everywhere but require ->aux to be set up as
>>>>>> an array of BBs forming the region with special BBs at array ends.
>>>>>>
>>>>>> Please instead pass in a vec<basic_block> which avoids using ->aux
>>>>>> and also allows other non-loop-based SESE regions to be used
>>>>>> (I couldn't spot anything that relies on this being a loop).
>>>>>>
>>>>>> Adding a convenience wrapper for loop  * would be of course nice,
>>>>>> to cover the special pre/post-header code in tree-if-conv.c.
>>>>>>
>>>>>> In theory a SESE region is fully specified by its entry end exit _edge_,
>>>>>> so you might want to see if it's possible to use such a pair of edges
>>>>>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>>>>>
>>>>>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>>>>>> please use split_edge() of the entry/exit edges.
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>>>> ChangeLog:
>>>>>>> 2016-10-05  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>
>>>>>>> * dominance.c : Include cfgloop.h for loop recognition.
>>>>>>> (dom_info): Add new functions and add boolean argument to recognize
>>>>>>> computation for loop region.
>>>>>>> (dom_info::dom_info): New function.
>>>>>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>>>>>>> handle unvisited blocks.
>>>>>>> (dom_info::calc_idoms): Likewise.
>>>>>>> (compute_dom_fast_query_in_region): New function.
>>>>>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>>>>>>> false argument.
>>>>>>> (calculate_dominance_info_for_region): New function.
>>>>>>> (free_dominance_info_for_region): Likewise.
>>>>>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>>>>>>> argument.
>>>>>>> * dominance.h: Add prototype for introduced functions
>>>>>>> calculate_dominance_info_for_region and
>>>>>>> free_dominance_info_for_region.
>>>>>>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>>>>>>> (build_sese_region): New function.
>>>>>>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>>>>>>> calculation, free it after basic block predication and delete created
>>>>>>> fake post-header block if any.
>>>>>>> (tree_if_conversion): Delete call of free_dominance_info for
>>>>>>> post-dominators, free ifc_sese_bbs which represents SESE region.
>>>>>>> (pass_if_conversion::execute): Delete detection of infinite loops
>>>>>>> and fake edges to exit block since post-dominator calculation is
>>>>>>> performed per if-converted loop only.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Compile-time improvement for if conversion.
  2016-10-11 13:48             ` Richard Biener
@ 2016-10-12 13:37               ` Yuri Rumyantsev
  2016-10-13 10:15                 ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Yuri Rumyantsev @ 2016-10-12 13:37 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 7236 bytes --]

Richard,

Here is updated patch . I avoided creation of new entry/exit blocks
but instead add check to border cases - do not consider also blocks
which are out of region.

Any comments will be appreciated.

2016-10-11 16:48 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
> On Tue, Oct 11, 2016 at 3:23 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>> Richard,
>>
>> I implemented this by passing callback function in_region which
>> returns true if block belongs to region.
>> I am testing it now
>>
>> I attach modified patch for your quick review.
>
> +  FOR_EACH_VEC_ELT (region, i, bb)
> +    {
> +      bb->dom[dir_index] = et_new_tree (bb);
> +    }
> +  /* Check that region is SESE region.  */
> +  entry = region[0];
> +  if ( EDGE_COUNT (entry->succs) > 1)
> +    {
> +      /* Create new entry block with the only successor.  */
> +      basic_block succ = NULL;
> +      bool found = false;
> +      FOR_EACH_EDGE (e, ei, entry->succs)
> +       if (in_region (region, e->dest))
> +         {
> +           gcc_assert (!found);
>
> is that check equal to e->dest->dom[dir_index] != NULL?  I think we
> shouldn't need the callback.
>
> +    new_entry = create_empty_bb (entry);
> +    unchecked_make_edge (new_entry, succ, 0);
>
> I still think this is somewhat gross and we should try to avoid it
> somehow - at least it's well-hidden now ;)
>
> +    /* Put it to region as entry node.  */
> +    region[0] = new_entry;
>
> so region[0] is overwritten now?
>
> As far as I understand calc_dfs_tree it should be possible to compute
> the region on-the-fly
> from calling calc_dfs_tree_nonrec on the entry/exit (also maybe
> avoiding some of the still
> "large" complexity of zeroing arrays in the constructor).
>
> And if we use that DFS walk directly we should be able to avoid
> creating those fake entry/exit
> blocks by using entry/exit edges instead... (somehow).
>
> Richard.
>
>
>
>> Thanks.
>>
>> 2016-10-11 13:33 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>> On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>> Richard,
>>>>
>>>> If "fake" exit or entry block is created in dominance how we can
>>>> determine what is its the only  predecessor or successor without using
>>>> a notion of loop?
>>>
>>> The caller passes in an entry and exit edge instead of a block or loop.
>>>
>>> Richard.
>>>
>>>> 2016-10-10 15:00 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>> Thanks Richard for your comments.
>>>>>> I'd like to answer on your last comment regarding use split_edge()
>>>>>> instead of creating fake post-header. I started with this splitting
>>>>>> but it requires to fix-up closed ssa form by creating additional phi
>>>>>> nodes, so I decided to use only cfg change without updating ssa form.
>>>>>> Other changes look reasonable and will fix them.
>>>>>
>>>>> Ah.  In this case can you investigate what it takes to make the entry/exit
>>>>> edges rather than BBs?  That is, introduce those "fakes" only internally
>>>>> in dominance.c?
>>>>>
>>>>>> 2016-10-10 12:52 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> Here is implementation of Richard proposal:
>>>>>>>>
>>>>>>>> < For general infrastructure it would be nice to expose a (post-)dominator
>>>>>>>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>>>>>>>> < what makes if-conversion expensive is the post-dom compute which happens
>>>>>>>> < for each loop for the whole function.  It shouldn't be very difficult
>>>>>>>> < to write this,
>>>>>>>> < sharing as much as possible code with the current DOM code might need
>>>>>>>> < quite some refactoring though.
>>>>>>>>
>>>>>>>> I implemented this proposal by adding calculation of dominance info
>>>>>>>> for SESE regions and incorporate this change to if conversion pass.
>>>>>>>> SESE region is built by adding loop pre-header and possibly fake
>>>>>>>> post-header blocks to loop body. Fake post-header is deleted after
>>>>>>>> predication completion.
>>>>>>>>
>>>>>>>> Bootstrapping and regression testing did not show any new failures.
>>>>>>>>
>>>>>>>> Is it OK for trunk?
>>>>>>>
>>>>>>> It's mostly reasonable but I have a few comments.  First, re-using
>>>>>>> bb->dom[] for the dominator info is somewhat fragile but indeed
>>>>>>> a requirement to make the patch reasonably small.  Please,
>>>>>>> in calculate_dominance_info_for_region, make sure that
>>>>>>> !dom_info_available_p (dir).
>>>>>>>
>>>>>>> You pass loop * everywhere but require ->aux to be set up as
>>>>>>> an array of BBs forming the region with special BBs at array ends.
>>>>>>>
>>>>>>> Please instead pass in a vec<basic_block> which avoids using ->aux
>>>>>>> and also allows other non-loop-based SESE regions to be used
>>>>>>> (I couldn't spot anything that relies on this being a loop).
>>>>>>>
>>>>>>> Adding a convenience wrapper for loop  * would be of course nice,
>>>>>>> to cover the special pre/post-header code in tree-if-conv.c.
>>>>>>>
>>>>>>> In theory a SESE region is fully specified by its entry end exit _edge_,
>>>>>>> so you might want to see if it's possible to use such a pair of edges
>>>>>>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>>>>>>
>>>>>>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>>>>>>> please use split_edge() of the entry/exit edges.
>>>>>>>
>>>>>>> Richard.
>>>>>>>
>>>>>>>> ChangeLog:
>>>>>>>> 2016-10-05  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>>
>>>>>>>> * dominance.c : Include cfgloop.h for loop recognition.
>>>>>>>> (dom_info): Add new functions and add boolean argument to recognize
>>>>>>>> computation for loop region.
>>>>>>>> (dom_info::dom_info): New function.
>>>>>>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>>>>>>>> handle unvisited blocks.
>>>>>>>> (dom_info::calc_idoms): Likewise.
>>>>>>>> (compute_dom_fast_query_in_region): New function.
>>>>>>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>>>>>>>> false argument.
>>>>>>>> (calculate_dominance_info_for_region): New function.
>>>>>>>> (free_dominance_info_for_region): Likewise.
>>>>>>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>>>>>>>> argument.
>>>>>>>> * dominance.h: Add prototype for introduced functions
>>>>>>>> calculate_dominance_info_for_region and
>>>>>>>> free_dominance_info_for_region.
>>>>>>>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>>>>>>>> (build_sese_region): New function.
>>>>>>>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>>>>>>>> calculation, free it after basic block predication and delete created
>>>>>>>> fake post-header block if any.
>>>>>>>> (tree_if_conversion): Delete call of free_dominance_info for
>>>>>>>> post-dominators, free ifc_sese_bbs which represents SESE region.
>>>>>>>> (pass_if_conversion::execute): Delete detection of infinite loops
>>>>>>>> and fake edges to exit block since post-dominator calculation is
>>>>>>>> performed per if-converted loop only.

[-- Attachment #2: patch.3 --]
[-- Type: application/octet-stream, Size: 13109 bytes --]

diff --git a/gcc/dominance.c b/gcc/dominance.c
index e3308cc..4d823bf 100644
--- a/gcc/dominance.c
+++ b/gcc/dominance.c
@@ -60,9 +60,10 @@ class dom_info
 {
 public:
   dom_info (function *, cdi_direction);
+  dom_info (vec <basic_block>, cdi_direction);
   ~dom_info ();
-  void calc_dfs_tree ();
-  void calc_idoms ();
+  void calc_dfs_tree (bool);
+  void calc_idoms (bool);
 
   inline basic_block get_idom (basic_block);
 private:
@@ -204,6 +205,61 @@ dom_info::dom_info (function *fn, cdi_direction dir)
     }
 }
 
+/* Constructor for SESE region.  */
+
+dom_info::dom_info (vec<basic_block> region, cdi_direction dir)
+{
+  /* We need memory for n_basic_blocks nodes.  */
+  size_t num = m_n_basic_blocks = region.length ();
+  m_dfs_parent = new_zero_array <TBB> (num);
+  m_dom = new_zero_array <TBB> (num);
+
+  m_path_min = new TBB[num];
+  m_key = new TBB[num];
+  m_set_size = new unsigned int[num];
+  for (size_t i = 0; i < num; i++)
+    {
+      m_path_min[i] = m_key[i] = i;
+      m_set_size[i] = 1;
+    }
+
+  m_bucket = new_zero_array <TBB> (num);
+  m_next_bucket = new_zero_array <TBB> (num);
+
+  m_set_chain = new_zero_array <TBB> (num);
+  m_set_child = new_zero_array <TBB> (num);
+
+  /* Determine max basic block index in region.  */
+  int max_index = region[0]->index;
+  for (size_t i = 1; i < num; i++)
+    if (region[i]->index > max_index)
+      max_index = region[i]->index;
+  max_index += 1;
+  m_dfs_order = new_zero_array <TBB> (max_index + 1);
+  m_dfs_last = &m_dfs_order[max_index];
+  m_dfs_to_bb = new_zero_array <basic_block> (num);
+
+  m_dfsnum = 1;
+  m_nodes = 0;
+  m_fake_exit_edge = NULL; /* Assume that region is SCC.  */
+
+  switch (dir)
+    {
+      case CDI_DOMINATORS:
+	m_reverse = false;
+	m_start_block = region[0];
+	m_end_block = region[num - 1];
+	break;
+      case CDI_POST_DOMINATORS:
+	m_reverse = true;
+	m_start_block = region[num - 1];
+	m_end_block = region[0];
+	break;
+      default:
+	gcc_unreachable ();
+    }
+}
+
 inline basic_block
 dom_info::get_idom (basic_block bb)
 {
@@ -252,6 +308,8 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
 {
   edge_iterator *stack = new edge_iterator[m_n_basic_blocks + 1];
   int sp = 0;
+  unsigned d_i = dom_convert_dir_to_idx (m_reverse ? CDI_POST_DOMINATORS
+					 : CDI_DOMINATORS);
 
   /* Initialize the first edge.  */
   edge_iterator ei = m_reverse ? ei_start (bb->preds)
@@ -276,9 +334,10 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
 	      bn = e->src;
 
 	      /* If the next node BN is either already visited or a border
-	         block the current edge is useless, and simply overwritten
-	         with the next edge out of the current node.  */
-	      if (bn == m_end_block || m_dfs_order[bn->index])
+	         block or out of region the current edge is useless, and simply
+		 overwritten with the next edge out of the current node.  */
+	      if (bn == m_end_block || bn->dom[d_i] == NULL 
+		  || m_dfs_order[bn->index])
 		{
 		  ei_next (&ei);
 		  continue;
@@ -289,7 +348,8 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
 	  else
 	    {
 	      bn = e->dest;
-	      if (bn == m_end_block || m_dfs_order[bn->index])
+	      if (bn == m_end_block || bn->dom[d_i] == NULL
+		  || m_dfs_order[bn->index])
 		{
 		  ei_next (&ei);
 		  continue;
@@ -336,10 +396,11 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
 /* The main entry for calculating the DFS tree or forest.  m_reverse is true,
    if we are interested in the reverse flow graph.  In that case the result is
    not necessarily a tree but a forest, because there may be nodes from which
-   the EXIT_BLOCK is unreachable.  */
+   the EXIT_BLOCK is unreachable.  IN_REGION is true if dfs is performed
+   for SESE region.  */
 
 void
-dom_info::calc_dfs_tree ()
+dom_info::calc_dfs_tree (bool in_region)
 {
   *m_dfs_last = m_dfsnum;
   m_dfs_to_bb[m_dfsnum] = m_start_block;
@@ -347,7 +408,7 @@ dom_info::calc_dfs_tree ()
 
   calc_dfs_tree_nonrec (m_start_block);
 
-  if (m_reverse)
+  if (m_reverse && !in_region)
     {
       /* In the post-dom case we may have nodes without a path to EXIT_BLOCK.
          They are reverse-unreachable.  In the dom-case we disallow such
@@ -493,10 +554,11 @@ dom_info::link_roots (TBB v, TBB w)
 
 /* This calculates the immediate dominators (or post-dominators). THIS is our
    working structure and should hold the DFS forest.
-   On return the immediate dominator to node V is in m_dom[V].  */
+   On return the immediate dominator to node V is in m_dom[V].
+   IN_REGION is true when dominator tree is built fro SESE region.  */
 
 void
-dom_info::calc_idoms ()
+dom_info::calc_idoms (bool in_region)
 {
   /* Go backwards in DFS order, to first look at the leafs.  */
   for (TBB v = m_nodes; v > 1; v--)
@@ -511,7 +573,7 @@ dom_info::calc_idoms ()
 				   : ei_start (bb->preds);
       edge_iterator einext;
 
-      if (m_reverse)
+      if (m_reverse && !in_region)
 	{
 	  /* If this block has a fake edge to exit, process that first.  */
 	  if (bitmap_bit_p (m_fake_exit_edge, bb->index))
@@ -622,6 +684,32 @@ compute_dom_fast_query (enum cdi_direction dir)
   dom_computed[dir_index] = DOM_OK;
 }
 
+/* Analogous to the previous function but compute the data for SESE region.  */
+
+static void
+compute_dom_fast_query_in_region (enum cdi_direction dir,
+				  vec<basic_block> region)
+{
+  int num = 0;
+  basic_block bb;
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+
+  gcc_checking_assert (dom_info_available_p (dir));
+
+  if (dom_computed[dir_index] == DOM_OK)
+    return;
+
+  /* Assign dfs numbers for region nodes except for entry and exit nodes.  */ 
+  for (unsigned int i = 1; i < region.length () - 1; i++)
+    {
+      bb = region[i];
+      if (!bb->dom[dir_index]->father)
+	assign_dfs_numbers (bb->dom[dir_index], &num);
+    }
+
+  dom_computed[dir_index] = DOM_OK;
+}
+
 /* The main entry point into this module.  DIR is set depending on whether
    we want to compute dominators or postdominators.  */
 
@@ -649,8 +737,8 @@ calculate_dominance_info (cdi_direction dir)
       n_bbs_in_dom_tree[dir_index] = n_basic_blocks_for_fn (cfun);
 
       dom_info di (cfun, dir);
-      di.calc_dfs_tree ();
-      di.calc_idoms ();
+      di.calc_dfs_tree (false);
+      di.calc_idoms (false);
 
       FOR_EACH_BB_FN (b, cfun)
 	{
@@ -668,6 +756,46 @@ calculate_dominance_info (cdi_direction dir)
   timevar_pop (TV_DOMINANCE);
 }
 
+/* Analogous to the previous function but compute dominators or postdominators
+   for innermost loops with pre-header and post-header to form SESE region.
+   New entry or exit node can be created if region is not SESE.  REGION is
+   released after dominance tree was built.  Callback function IN_REGION is
+   supplied together with and it returns true if basic block belongs to
+   REGION.  */
+
+void
+calculate_dominance_info_for_region (cdi_direction dir,
+				     vec<basic_block> region)
+{
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+  basic_block bb;
+  unsigned int i;
+
+  if (dom_computed[dir_index] == DOM_OK)
+    return;
+
+  timevar_push (TV_DOMINANCE);
+  /* Assume that dom info is not partially computed.  */
+  gcc_assert (!dom_info_available_p (dir));
+
+  FOR_EACH_VEC_ELT (region, i, bb)
+    {
+      bb->dom[dir_index] = et_new_tree (bb);
+    }
+  dom_info di (region, dir);
+  di.calc_dfs_tree (true);
+  di.calc_idoms (true);
+
+  FOR_EACH_VEC_ELT (region, i, bb)
+    if (basic_block d = di.get_idom (bb))
+      et_set_father (bb->dom[dir_index], d->dom[dir_index]);
+
+  dom_computed[dir_index] = DOM_NO_FAST_QUERY;
+  compute_dom_fast_query_in_region (dir, region);
+
+  timevar_pop (TV_DOMINANCE);
+}
+
 /* Free dominance information for direction DIR.  */
 void
 free_dominance_info (function *fn, enum cdi_direction dir)
@@ -696,6 +824,27 @@ free_dominance_info (enum cdi_direction dir)
   free_dominance_info (cfun, dir);
 }
 
+void
+free_dominance_info_for_region (enum cdi_direction dir,
+				vec<basic_block> region)
+{
+  basic_block bb;
+  unsigned int i;
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+
+  if (!dom_info_available_p (dir))
+    return;
+
+  FOR_EACH_VEC_ELT (region, i, bb)
+    {
+      et_free_tree_force (bb->dom[dir_index]);
+      bb->dom[dir_index] = NULL;
+    }
+  et_free_pools ();
+  cfun->cfg->x_dom_computed[dir_index] = DOM_NONE;
+  cfun->cfg->x_n_bbs_in_dom_tree[dir_index] = 0;
+}
+
 /* Return the immediate dominator of basic block BB.  */
 basic_block
 get_immediate_dominator (enum cdi_direction dir, basic_block bb)
@@ -1012,8 +1161,8 @@ verify_dominators (cdi_direction dir)
   gcc_assert (dom_info_available_p (dir));
 
   dom_info di (cfun, dir);
-  di.calc_dfs_tree ();
-  di.calc_idoms ();
+  di.calc_dfs_tree (false);
+  di.calc_idoms (false);
 
   bool err = false;
   basic_block bb;
diff --git a/gcc/dominance.h b/gcc/dominance.h
index 961c4e2..1c5898b 100644
--- a/gcc/dominance.h
+++ b/gcc/dominance.h
@@ -36,8 +36,12 @@ enum dom_state
 };
 
 extern void calculate_dominance_info (enum cdi_direction);
+extern void calculate_dominance_info_for_region (enum cdi_direction,
+						 vec<basic_block>);
 extern void free_dominance_info (function *, enum cdi_direction);
 extern void free_dominance_info (enum cdi_direction);
+extern void free_dominance_info_for_region (enum cdi_direction,
+					    vec<basic_block>);
 extern basic_block get_immediate_dominator (enum cdi_direction, basic_block);
 extern void set_immediate_dominator (enum cdi_direction, basic_block,
 				     basic_block);
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index eec431e..2000bf0 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -1309,6 +1309,38 @@ predicate_bbs (loop_p loop)
 	      && bb_predicate_gimplified_stmts (loop->latch) == NULL);
 }
 
+/* Build SESE region by adding loop pre-header and post-header blocks.  */
+
+static vec<basic_block>
+build_sese_region (struct loop *loop)
+{
+  vec<basic_block> region = vNULL;
+  basic_block exit_bb = NULL;
+
+  gcc_assert (ifc_bbs);
+  /* The first element is loop pre-header.  */
+  region.safe_push (loop_preheader_edge (loop)->src);
+
+  for (unsigned int i = 0; i < loop->num_nodes; i++)
+    {
+      basic_block bb = ifc_bbs[i];
+      region.safe_push (bb);
+      /* Find loop postheader.  */
+      edge e;
+      edge_iterator ei;
+      FOR_EACH_EDGE (e, ei, bb->succs)
+	if (loop_exit_edge_p (loop, e))
+	  {
+	      exit_bb = e->dest;
+	      break;
+	  }
+    }
+  /* The last element is loop post-header.  */
+  gcc_assert (exit_bb);
+  region.safe_push (exit_bb);
+  return region;
+}
+
 /* Return true when LOOP is if-convertible.  This is a helper function
    for if_convertible_loop_p.  REFS and DDRS are initialized and freed
    in if_convertible_loop_p.  */
@@ -1318,11 +1350,13 @@ if_convertible_loop_p_1 (struct loop *loop, vec<data_reference_p> *refs)
 {
   unsigned int i;
   basic_block exit_bb = NULL;
+  vec<basic_block> region;
 
   if (find_data_references_in_loop (loop, refs) == chrec_dont_know)
     return false;
 
-  calculate_dominance_info (CDI_DOMINATORS);
+  if (dom_info_state (CDI_DOMINATORS) != DOM_OK)
+    calculate_dominance_info (CDI_DOMINATORS);
 
   /* Allow statements that can be handled during if-conversion.  */
   ifc_bbs = get_loop_body_in_if_conv_order (loop);
@@ -1370,9 +1404,16 @@ if_convertible_loop_p_1 (struct loop *loop, vec<data_reference_p> *refs)
 	  = new hash_map<innermost_loop_behavior_hash, data_reference_p>;
   baseref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
 
-  calculate_dominance_info (CDI_POST_DOMINATORS);
+  /* Compute post-dominator tree locally.  */
+  region = build_sese_region (loop);
+  calculate_dominance_info_for_region (CDI_POST_DOMINATORS, region);
+
   predicate_bbs (loop);
 
+  /* Free post-dominator tree since it is not used after predication.  */
+  free_dominance_info_for_region (CDI_POST_DOMINATORS, region);
+  region.release ();
+
   for (i = 0; refs->iterate (i, &dr); i++)
     {
       tree ref = DR_REF (dr);
@@ -2752,7 +2793,6 @@ tree_if_conversion (struct loop *loop)
       free (ifc_bbs);
       ifc_bbs = NULL;
     }
-  free_dominance_info (CDI_POST_DOMINATORS);
 
   return todo;
 }
@@ -2805,14 +2845,6 @@ pass_if_conversion::execute (function *fun)
   if (number_of_loops (fun) <= 1)
     return 0;
 
-  /* If there are infinite loops, during CDI_POST_DOMINATORS computation
-     we can pick pretty much random bb inside of the infinite loop that
-     has the fake edge.  If we are unlucky enough, this can confuse the
-     add_to_predicate_list post-dominator check to optimize as if that
-     bb or some other one is a join block when it actually is not.
-     See PR70916.  */
-  connect_infinite_loops_to_exit ();
-
   FOR_EACH_LOOP (loop, 0)
     if (flag_tree_loop_if_convert == 1
 	|| flag_tree_loop_if_convert_stores == 1
@@ -2820,8 +2852,6 @@ pass_if_conversion::execute (function *fun)
 	    && !loop->dont_vectorize))
       todo |= tree_if_conversion (loop);
 
-  remove_fake_exit_edges ();
-
   if (flag_checking)
     {
       basic_block bb;

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Compile-time improvement for if conversion.
  2016-10-12 13:37               ` Yuri Rumyantsev
@ 2016-10-13 10:15                 ` Richard Biener
  2016-10-14 12:07                   ` Yuri Rumyantsev
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Biener @ 2016-10-13 10:15 UTC (permalink / raw)
  To: Yuri Rumyantsev; +Cc: gcc-patches

On Wed, Oct 12, 2016 at 3:37 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Richard,
>
> Here is updated patch . I avoided creation of new entry/exit blocks
> but instead add check to border cases - do not consider also blocks
> which are out of region.
>
> Any comments will be appreciated.

I mostly like it now.  Can you split out all the common parts from the
dom_info constructor
to a helper method please and just compute m_n_basic_blocks and a max_index and
do all the memory allocation in that common function?

@@ -276,9 +334,10 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
              bn = e->src;

              /* If the next node BN is either already visited or a border
-                block the current edge is useless, and simply overwritten
-                with the next edge out of the current node.  */
-             if (bn == m_end_block || m_dfs_order[bn->index])
+                block or out of region the current edge is useless, and simply
+                overwritten with the next edge out of the current node.  */
+             if (bn == m_end_block || bn->dom[d_i] == NULL

clever idea ;)  Just thinking if this means we support single entry,
multiple exit
regions for CDI_DOMINATORs and multiple entry, single exit regions for
CDI_POST_DOMINATORs.  I think so.  Please update the function comments
accordingly.


+  m_dfsnum = 1;
+  m_nodes = 0;
+  m_fake_exit_edge = NULL; /* Assume that region is SCC.  */

you mean SESE rather than SCC?

@@ -511,7 +573,7 @@ dom_info::calc_idoms ()
                                   : ei_start (bb->preds);
       edge_iterator einext;

-      if (m_reverse)
+      if (m_reverse && !in_region)
        {
          /* If this block has a fake edge to exit, process that first.  */
          if (bitmap_bit_p (m_fake_exit_edge, bb->index))

I think it's better to simply change the if (m_reverse) test to a if
(m_fake_exit_edge) test.

I noticed you do not initialize n_bbs_in_dom_tree (just set it to
zero), it's not really used
anywhere (but in an assert).

free_dominance_info_for_region needs a function level comment (per
coding guidelines).

Please make free_dominance_info_for_region take a struct function * pointer.

 @@ -1318,11 +1350,13 @@ if_convertible_loop_p_1 (struct loop *loop,
vec<data_reference_p> *refs)
 {
   unsigned int i;
   basic_block exit_bb = NULL;
+  vec<basic_block> region;

   if (find_data_references_in_loop (loop, refs) == chrec_dont_know)
     return false;

-  calculate_dominance_info (CDI_DOMINATORS);
+  if (dom_info_state (CDI_DOMINATORS) != DOM_OK)
+    calculate_dominance_info (CDI_DOMINATORS);

This is a premature (unwanted) change.

Other than that the only O(function-size) part left is the zeroing in

+  /* Determine max basic block index in region.  */
+  int max_index = region[0]->index;
+  for (size_t i = 1; i < num; i++)
+    if (region[i]->index > max_index)
+      max_index = region[i]->index;
+  max_index += 1;
+  m_dfs_order = new_zero_array <TBB> (max_index + 1);
+  m_dfs_last = &m_dfs_order[max_index];

I suppose we can try addressing this as followup.

Thanks a lot for doing this.
Richard.

> 2016-10-11 16:48 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>> On Tue, Oct 11, 2016 at 3:23 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Richard,
>>>
>>> I implemented this by passing callback function in_region which
>>> returns true if block belongs to region.
>>> I am testing it now
>>>
>>> I attach modified patch for your quick review.
>>
>> +  FOR_EACH_VEC_ELT (region, i, bb)
>> +    {
>> +      bb->dom[dir_index] = et_new_tree (bb);
>> +    }
>> +  /* Check that region is SESE region.  */
>> +  entry = region[0];
>> +  if ( EDGE_COUNT (entry->succs) > 1)
>> +    {
>> +      /* Create new entry block with the only successor.  */
>> +      basic_block succ = NULL;
>> +      bool found = false;
>> +      FOR_EACH_EDGE (e, ei, entry->succs)
>> +       if (in_region (region, e->dest))
>> +         {
>> +           gcc_assert (!found);
>>
>> is that check equal to e->dest->dom[dir_index] != NULL?  I think we
>> shouldn't need the callback.
>>
>> +    new_entry = create_empty_bb (entry);
>> +    unchecked_make_edge (new_entry, succ, 0);
>>
>> I still think this is somewhat gross and we should try to avoid it
>> somehow - at least it's well-hidden now ;)
>>
>> +    /* Put it to region as entry node.  */
>> +    region[0] = new_entry;
>>
>> so region[0] is overwritten now?
>>
>> As far as I understand calc_dfs_tree it should be possible to compute
>> the region on-the-fly
>> from calling calc_dfs_tree_nonrec on the entry/exit (also maybe
>> avoiding some of the still
>> "large" complexity of zeroing arrays in the constructor).
>>
>> And if we use that DFS walk directly we should be able to avoid
>> creating those fake entry/exit
>> blocks by using entry/exit edges instead... (somehow).
>>
>> Richard.
>>
>>
>>
>>> Thanks.
>>>
>>> 2016-10-11 13:33 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>> On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>> Richard,
>>>>>
>>>>> If "fake" exit or entry block is created in dominance how we can
>>>>> determine what is its the only  predecessor or successor without using
>>>>> a notion of loop?
>>>>
>>>> The caller passes in an entry and exit edge instead of a block or loop.
>>>>
>>>> Richard.
>>>>
>>>>> 2016-10-10 15:00 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>> Thanks Richard for your comments.
>>>>>>> I'd like to answer on your last comment regarding use split_edge()
>>>>>>> instead of creating fake post-header. I started with this splitting
>>>>>>> but it requires to fix-up closed ssa form by creating additional phi
>>>>>>> nodes, so I decided to use only cfg change without updating ssa form.
>>>>>>> Other changes look reasonable and will fix them.
>>>>>>
>>>>>> Ah.  In this case can you investigate what it takes to make the entry/exit
>>>>>> edges rather than BBs?  That is, introduce those "fakes" only internally
>>>>>> in dominance.c?
>>>>>>
>>>>>>> 2016-10-10 12:52 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> Here is implementation of Richard proposal:
>>>>>>>>>
>>>>>>>>> < For general infrastructure it would be nice to expose a (post-)dominator
>>>>>>>>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>>>>>>>>> < what makes if-conversion expensive is the post-dom compute which happens
>>>>>>>>> < for each loop for the whole function.  It shouldn't be very difficult
>>>>>>>>> < to write this,
>>>>>>>>> < sharing as much as possible code with the current DOM code might need
>>>>>>>>> < quite some refactoring though.
>>>>>>>>>
>>>>>>>>> I implemented this proposal by adding calculation of dominance info
>>>>>>>>> for SESE regions and incorporate this change to if conversion pass.
>>>>>>>>> SESE region is built by adding loop pre-header and possibly fake
>>>>>>>>> post-header blocks to loop body. Fake post-header is deleted after
>>>>>>>>> predication completion.
>>>>>>>>>
>>>>>>>>> Bootstrapping and regression testing did not show any new failures.
>>>>>>>>>
>>>>>>>>> Is it OK for trunk?
>>>>>>>>
>>>>>>>> It's mostly reasonable but I have a few comments.  First, re-using
>>>>>>>> bb->dom[] for the dominator info is somewhat fragile but indeed
>>>>>>>> a requirement to make the patch reasonably small.  Please,
>>>>>>>> in calculate_dominance_info_for_region, make sure that
>>>>>>>> !dom_info_available_p (dir).
>>>>>>>>
>>>>>>>> You pass loop * everywhere but require ->aux to be set up as
>>>>>>>> an array of BBs forming the region with special BBs at array ends.
>>>>>>>>
>>>>>>>> Please instead pass in a vec<basic_block> which avoids using ->aux
>>>>>>>> and also allows other non-loop-based SESE regions to be used
>>>>>>>> (I couldn't spot anything that relies on this being a loop).
>>>>>>>>
>>>>>>>> Adding a convenience wrapper for loop  * would be of course nice,
>>>>>>>> to cover the special pre/post-header code in tree-if-conv.c.
>>>>>>>>
>>>>>>>> In theory a SESE region is fully specified by its entry end exit _edge_,
>>>>>>>> so you might want to see if it's possible to use such a pair of edges
>>>>>>>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>>>>>>>
>>>>>>>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>>>>>>>> please use split_edge() of the entry/exit edges.
>>>>>>>>
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>> ChangeLog:
>>>>>>>>> 2016-10-05  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>>>
>>>>>>>>> * dominance.c : Include cfgloop.h for loop recognition.
>>>>>>>>> (dom_info): Add new functions and add boolean argument to recognize
>>>>>>>>> computation for loop region.
>>>>>>>>> (dom_info::dom_info): New function.
>>>>>>>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>>>>>>>>> handle unvisited blocks.
>>>>>>>>> (dom_info::calc_idoms): Likewise.
>>>>>>>>> (compute_dom_fast_query_in_region): New function.
>>>>>>>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>>>>>>>>> false argument.
>>>>>>>>> (calculate_dominance_info_for_region): New function.
>>>>>>>>> (free_dominance_info_for_region): Likewise.
>>>>>>>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>>>>>>>>> argument.
>>>>>>>>> * dominance.h: Add prototype for introduced functions
>>>>>>>>> calculate_dominance_info_for_region and
>>>>>>>>> free_dominance_info_for_region.
>>>>>>>>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>>>>>>>>> (build_sese_region): New function.
>>>>>>>>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>>>>>>>>> calculation, free it after basic block predication and delete created
>>>>>>>>> fake post-header block if any.
>>>>>>>>> (tree_if_conversion): Delete call of free_dominance_info for
>>>>>>>>> post-dominators, free ifc_sese_bbs which represents SESE region.
>>>>>>>>> (pass_if_conversion::execute): Delete detection of infinite loops
>>>>>>>>> and fake edges to exit block since post-dominator calculation is
>>>>>>>>> performed per if-converted loop only.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Compile-time improvement for if conversion.
  2016-10-13 10:15                 ` Richard Biener
@ 2016-10-14 12:07                   ` Yuri Rumyantsev
  2016-10-17 11:53                     ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Yuri Rumyantsev @ 2016-10-14 12:07 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 12084 bytes --]

Richard,

Here is updated patch with the changes proposed by you.

Bootstrapping and regression testing did not show any new failures.
Is it OK for trunk?

ChangeLog:
2016-10-14  Yuri Rumyantsev  <ysrumyan@gmail.com>

* dominance.c (dom_info::dom_info): Add new constructor for region
presented by vector of basic blocks.
(dom_init): New method to initialize members common for both
constructors.
(dom_info::dom_info): Invoke dom_init for partial initialization.
(dom_info::get_idom): Add check to corner cases on basic blocks which
are not in region.
(dom_info::calc_dfs_tree): Check M_FAKE_EXIT_EDGE instead of M_REVERSE
to detect unreachable bbs.
(dom_info::calc_idoms): Likewise.
(compute_dom_fast_query_in_region): New function.
(calculate_dominance_info_for_region): Likewise.
(free_dominance_info_for_region): Add couple functions to free
dominance info for region.
* dominance.h: Add prototypes for introduced region-based functions
tree-if-conv.c: (build_region): New function.
(if_convertible_loop_p_1): Invoke local version of post-dominators
calculation before basic block predication with subsequent freeing
post-dominator info.
(tree_if_conversion): Remove free of post-dominator info
(pass_if_conversion::execute): Delete detection of infinite loops
and fake edges to exit block since post-dominator calculation is
performed per if-converted loop only.

2016-10-13 13:15 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
> On Wed, Oct 12, 2016 at 3:37 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>> Richard,
>>
>> Here is updated patch . I avoided creation of new entry/exit blocks
>> but instead add check to border cases - do not consider also blocks
>> which are out of region.
>>
>> Any comments will be appreciated.
>
> I mostly like it now.  Can you split out all the common parts from the
> dom_info constructor
> to a helper method please and just compute m_n_basic_blocks and a max_index and
> do all the memory allocation in that common function?
>
> @@ -276,9 +334,10 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
>               bn = e->src;
>
>               /* If the next node BN is either already visited or a border
> -                block the current edge is useless, and simply overwritten
> -                with the next edge out of the current node.  */
> -             if (bn == m_end_block || m_dfs_order[bn->index])
> +                block or out of region the current edge is useless, and simply
> +                overwritten with the next edge out of the current node.  */
> +             if (bn == m_end_block || bn->dom[d_i] == NULL
>
> clever idea ;)  Just thinking if this means we support single entry,
> multiple exit
> regions for CDI_DOMINATORs and multiple entry, single exit regions for
> CDI_POST_DOMINATORs.  I think so.  Please update the function comments
> accordingly.
>
>
> +  m_dfsnum = 1;
> +  m_nodes = 0;
> +  m_fake_exit_edge = NULL; /* Assume that region is SCC.  */
>
> you mean SESE rather than SCC?
>
> @@ -511,7 +573,7 @@ dom_info::calc_idoms ()
>                                    : ei_start (bb->preds);
>        edge_iterator einext;
>
> -      if (m_reverse)
> +      if (m_reverse && !in_region)
>         {
>           /* If this block has a fake edge to exit, process that first.  */
>           if (bitmap_bit_p (m_fake_exit_edge, bb->index))
>
> I think it's better to simply change the if (m_reverse) test to a if
> (m_fake_exit_edge) test.
>
> I noticed you do not initialize n_bbs_in_dom_tree (just set it to
> zero), it's not really used
> anywhere (but in an assert).
>
> free_dominance_info_for_region needs a function level comment (per
> coding guidelines).
>
> Please make free_dominance_info_for_region take a struct function * pointer.
>
>  @@ -1318,11 +1350,13 @@ if_convertible_loop_p_1 (struct loop *loop,
> vec<data_reference_p> *refs)
>  {
>    unsigned int i;
>    basic_block exit_bb = NULL;
> +  vec<basic_block> region;
>
>    if (find_data_references_in_loop (loop, refs) == chrec_dont_know)
>      return false;
>
> -  calculate_dominance_info (CDI_DOMINATORS);
> +  if (dom_info_state (CDI_DOMINATORS) != DOM_OK)
> +    calculate_dominance_info (CDI_DOMINATORS);
>
> This is a premature (unwanted) change.
>
> Other than that the only O(function-size) part left is the zeroing in
>
> +  /* Determine max basic block index in region.  */
> +  int max_index = region[0]->index;
> +  for (size_t i = 1; i < num; i++)
> +    if (region[i]->index > max_index)
> +      max_index = region[i]->index;
> +  max_index += 1;
> +  m_dfs_order = new_zero_array <TBB> (max_index + 1);
> +  m_dfs_last = &m_dfs_order[max_index];
>
> I suppose we can try addressing this as followup.
>
> Thanks a lot for doing this.
> Richard.
>
>> 2016-10-11 16:48 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>> On Tue, Oct 11, 2016 at 3:23 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>> Richard,
>>>>
>>>> I implemented this by passing callback function in_region which
>>>> returns true if block belongs to region.
>>>> I am testing it now
>>>>
>>>> I attach modified patch for your quick review.
>>>
>>> +  FOR_EACH_VEC_ELT (region, i, bb)
>>> +    {
>>> +      bb->dom[dir_index] = et_new_tree (bb);
>>> +    }
>>> +  /* Check that region is SESE region.  */
>>> +  entry = region[0];
>>> +  if ( EDGE_COUNT (entry->succs) > 1)
>>> +    {
>>> +      /* Create new entry block with the only successor.  */
>>> +      basic_block succ = NULL;
>>> +      bool found = false;
>>> +      FOR_EACH_EDGE (e, ei, entry->succs)
>>> +       if (in_region (region, e->dest))
>>> +         {
>>> +           gcc_assert (!found);
>>>
>>> is that check equal to e->dest->dom[dir_index] != NULL?  I think we
>>> shouldn't need the callback.
>>>
>>> +    new_entry = create_empty_bb (entry);
>>> +    unchecked_make_edge (new_entry, succ, 0);
>>>
>>> I still think this is somewhat gross and we should try to avoid it
>>> somehow - at least it's well-hidden now ;)
>>>
>>> +    /* Put it to region as entry node.  */
>>> +    region[0] = new_entry;
>>>
>>> so region[0] is overwritten now?
>>>
>>> As far as I understand calc_dfs_tree it should be possible to compute
>>> the region on-the-fly
>>> from calling calc_dfs_tree_nonrec on the entry/exit (also maybe
>>> avoiding some of the still
>>> "large" complexity of zeroing arrays in the constructor).
>>>
>>> And if we use that DFS walk directly we should be able to avoid
>>> creating those fake entry/exit
>>> blocks by using entry/exit edges instead... (somehow).
>>>
>>> Richard.
>>>
>>>
>>>
>>>> Thanks.
>>>>
>>>> 2016-10-11 13:33 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>> On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>> Richard,
>>>>>>
>>>>>> If "fake" exit or entry block is created in dominance how we can
>>>>>> determine what is its the only  predecessor or successor without using
>>>>>> a notion of loop?
>>>>>
>>>>> The caller passes in an entry and exit edge instead of a block or loop.
>>>>>
>>>>> Richard.
>>>>>
>>>>>> 2016-10-10 15:00 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>> Thanks Richard for your comments.
>>>>>>>> I'd like to answer on your last comment regarding use split_edge()
>>>>>>>> instead of creating fake post-header. I started with this splitting
>>>>>>>> but it requires to fix-up closed ssa form by creating additional phi
>>>>>>>> nodes, so I decided to use only cfg change without updating ssa form.
>>>>>>>> Other changes look reasonable and will fix them.
>>>>>>>
>>>>>>> Ah.  In this case can you investigate what it takes to make the entry/exit
>>>>>>> edges rather than BBs?  That is, introduce those "fakes" only internally
>>>>>>> in dominance.c?
>>>>>>>
>>>>>>>> 2016-10-10 12:52 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>>> Hi All,
>>>>>>>>>>
>>>>>>>>>> Here is implementation of Richard proposal:
>>>>>>>>>>
>>>>>>>>>> < For general infrastructure it would be nice to expose a (post-)dominator
>>>>>>>>>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>>>>>>>>>> < what makes if-conversion expensive is the post-dom compute which happens
>>>>>>>>>> < for each loop for the whole function.  It shouldn't be very difficult
>>>>>>>>>> < to write this,
>>>>>>>>>> < sharing as much as possible code with the current DOM code might need
>>>>>>>>>> < quite some refactoring though.
>>>>>>>>>>
>>>>>>>>>> I implemented this proposal by adding calculation of dominance info
>>>>>>>>>> for SESE regions and incorporate this change to if conversion pass.
>>>>>>>>>> SESE region is built by adding loop pre-header and possibly fake
>>>>>>>>>> post-header blocks to loop body. Fake post-header is deleted after
>>>>>>>>>> predication completion.
>>>>>>>>>>
>>>>>>>>>> Bootstrapping and regression testing did not show any new failures.
>>>>>>>>>>
>>>>>>>>>> Is it OK for trunk?
>>>>>>>>>
>>>>>>>>> It's mostly reasonable but I have a few comments.  First, re-using
>>>>>>>>> bb->dom[] for the dominator info is somewhat fragile but indeed
>>>>>>>>> a requirement to make the patch reasonably small.  Please,
>>>>>>>>> in calculate_dominance_info_for_region, make sure that
>>>>>>>>> !dom_info_available_p (dir).
>>>>>>>>>
>>>>>>>>> You pass loop * everywhere but require ->aux to be set up as
>>>>>>>>> an array of BBs forming the region with special BBs at array ends.
>>>>>>>>>
>>>>>>>>> Please instead pass in a vec<basic_block> which avoids using ->aux
>>>>>>>>> and also allows other non-loop-based SESE regions to be used
>>>>>>>>> (I couldn't spot anything that relies on this being a loop).
>>>>>>>>>
>>>>>>>>> Adding a convenience wrapper for loop  * would be of course nice,
>>>>>>>>> to cover the special pre/post-header code in tree-if-conv.c.
>>>>>>>>>
>>>>>>>>> In theory a SESE region is fully specified by its entry end exit _edge_,
>>>>>>>>> so you might want to see if it's possible to use such a pair of edges
>>>>>>>>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>>>>>>>>
>>>>>>>>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>>>>>>>>> please use split_edge() of the entry/exit edges.
>>>>>>>>>
>>>>>>>>> Richard.
>>>>>>>>>
>>>>>>>>>> ChangeLog:
>>>>>>>>>> 2016-10-05  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>>>>
>>>>>>>>>> * dominance.c : Include cfgloop.h for loop recognition.
>>>>>>>>>> (dom_info): Add new functions and add boolean argument to recognize
>>>>>>>>>> computation for loop region.
>>>>>>>>>> (dom_info::dom_info): New function.
>>>>>>>>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>>>>>>>>>> handle unvisited blocks.
>>>>>>>>>> (dom_info::calc_idoms): Likewise.
>>>>>>>>>> (compute_dom_fast_query_in_region): New function.
>>>>>>>>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>>>>>>>>>> false argument.
>>>>>>>>>> (calculate_dominance_info_for_region): New function.
>>>>>>>>>> (free_dominance_info_for_region): Likewise.
>>>>>>>>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>>>>>>>>>> argument.
>>>>>>>>>> * dominance.h: Add prototype for introduced functions
>>>>>>>>>> calculate_dominance_info_for_region and
>>>>>>>>>> free_dominance_info_for_region.
>>>>>>>>>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>>>>>>>>>> (build_sese_region): New function.
>>>>>>>>>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>>>>>>>>>> calculation, free it after basic block predication and delete created
>>>>>>>>>> fake post-header block if any.
>>>>>>>>>> (tree_if_conversion): Delete call of free_dominance_info for
>>>>>>>>>> post-dominators, free ifc_sese_bbs which represents SESE region.
>>>>>>>>>> (pass_if_conversion::execute): Delete detection of infinite loops
>>>>>>>>>> and fake edges to exit block since post-dominator calculation is
>>>>>>>>>> performed per if-converted loop only.

[-- Attachment #2: patch.4 --]
[-- Type: application/octet-stream, Size: 12172 bytes --]

diff --git a/gcc/dominance.c b/gcc/dominance.c
index e3308cc..9d7d9e9 100644
--- a/gcc/dominance.c
+++ b/gcc/dominance.c
@@ -60,6 +60,7 @@ class dom_info
 {
 public:
   dom_info (function *, cdi_direction);
+  dom_info (vec <basic_block>, cdi_direction);
   ~dom_info ();
   void calc_dfs_tree ();
   void calc_idoms ();
@@ -68,6 +69,7 @@ public:
 private:
   void calc_dfs_tree_nonrec (basic_block);
   void compress (TBB);
+  void dom_init (void);
   TBB eval (TBB);
   void link_roots (TBB, TBB);
 
@@ -153,12 +155,12 @@ inline T *new_zero_array (size_t num)
   return result;
 }
 
-/* Allocate all needed memory in a pessimistic fashion (so we round up).  */
+/* Helper function for constructors to initialize a part of class members.  */
 
-dom_info::dom_info (function *fn, cdi_direction dir)
+void
+dom_info::dom_init (void)
 {
-  /* We need memory for n_basic_blocks nodes.  */
-  size_t num = m_n_basic_blocks = n_basic_blocks_for_fn (fn);
+  size_t num = m_n_basic_blocks;
   m_dfs_parent = new_zero_array <TBB> (num);
   m_dom = new_zero_array <TBB> (num);
 
@@ -177,13 +179,23 @@ dom_info::dom_info (function *fn, cdi_direction dir)
   m_set_chain = new_zero_array <TBB> (num);
   m_set_child = new_zero_array <TBB> (num);
 
-  unsigned last_bb_index = last_basic_block_for_fn (fn);
-  m_dfs_order = new_zero_array <TBB> (last_bb_index + 1);
-  m_dfs_last = &m_dfs_order[last_bb_index];
   m_dfs_to_bb = new_zero_array <basic_block> (num);
 
   m_dfsnum = 1;
   m_nodes = 0;
+}
+
+/* Allocate all needed memory in a pessimistic fashion (so we round up).  */
+
+dom_info::dom_info (function *fn, cdi_direction dir)
+{
+  m_n_basic_blocks = n_basic_blocks_for_fn (fn);
+
+  dom_init ();
+
+  unsigned last_bb_index = last_basic_block_for_fn (fn);
+  m_dfs_order = new_zero_array <TBB> (last_bb_index + 1);
+  m_dfs_last = &m_dfs_order[last_bb_index];
 
   switch (dir)
     {
@@ -204,6 +216,44 @@ dom_info::dom_info (function *fn, cdi_direction dir)
     }
 }
 
+/* Constructor for reducible region REGION.  */
+
+dom_info::dom_info (vec<basic_block> region, cdi_direction dir)
+{
+  m_n_basic_blocks = region.length ();
+  unsigned int nm1 = m_n_basic_blocks - 1;
+
+  dom_init ();
+
+  /* Determine max basic block index in region.  */
+  int max_index = region[0]->index;
+  for (size_t i = 1; i <= nm1; i++)
+    if (region[i]->index > max_index)
+      max_index = region[i]->index;
+  max_index += 1;  /* set index on the first bb out of region.  */
+
+  m_dfs_order = new_zero_array <TBB> (max_index + 1);
+  m_dfs_last = &m_dfs_order[max_index];
+
+  m_fake_exit_edge = NULL; /* Assume that region is reducible.  */
+
+  switch (dir)
+    {
+      case CDI_DOMINATORS:
+	m_reverse = false;
+	m_start_block = region[0];
+	m_end_block = region[nm1];
+	break;
+      case CDI_POST_DOMINATORS:
+	m_reverse = true;
+	m_start_block = region[nm1];
+	m_end_block = region[0];
+	break;
+      default:
+	gcc_unreachable ();
+    }
+}
+
 inline basic_block
 dom_info::get_idom (basic_block bb)
 {
@@ -252,6 +302,8 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
 {
   edge_iterator *stack = new edge_iterator[m_n_basic_blocks + 1];
   int sp = 0;
+  unsigned d_i = dom_convert_dir_to_idx (m_reverse ? CDI_POST_DOMINATORS
+					 : CDI_DOMINATORS);
 
   /* Initialize the first edge.  */
   edge_iterator ei = m_reverse ? ei_start (bb->preds)
@@ -276,9 +328,10 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
 	      bn = e->src;
 
 	      /* If the next node BN is either already visited or a border
-	         block the current edge is useless, and simply overwritten
-	         with the next edge out of the current node.  */
-	      if (bn == m_end_block || m_dfs_order[bn->index])
+		 block or out of region the current edge is useless, and simply
+		 overwritten with the next edge out of the current node.  */
+	      if (bn == m_end_block || bn->dom[d_i] == NULL
+		  || m_dfs_order[bn->index])
 		{
 		  ei_next (&ei);
 		  continue;
@@ -289,7 +342,8 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
 	  else
 	    {
 	      bn = e->dest;
-	      if (bn == m_end_block || m_dfs_order[bn->index])
+	      if (bn == m_end_block || bn->dom[d_i] == NULL
+		  || m_dfs_order[bn->index])
 		{
 		  ei_next (&ei);
 		  continue;
@@ -347,7 +401,7 @@ dom_info::calc_dfs_tree ()
 
   calc_dfs_tree_nonrec (m_start_block);
 
-  if (m_reverse)
+  if (m_fake_exit_edge)
     {
       /* In the post-dom case we may have nodes without a path to EXIT_BLOCK.
          They are reverse-unreachable.  In the dom-case we disallow such
@@ -511,7 +565,7 @@ dom_info::calc_idoms ()
 				   : ei_start (bb->preds);
       edge_iterator einext;
 
-      if (m_reverse)
+      if (m_fake_exit_edge)
 	{
 	  /* If this block has a fake edge to exit, process that first.  */
 	  if (bitmap_bit_p (m_fake_exit_edge, bb->index))
@@ -622,6 +676,33 @@ compute_dom_fast_query (enum cdi_direction dir)
   dom_computed[dir_index] = DOM_OK;
 }
 
+/* Analogous to the previous function but compute the data for reducible
+   region REGION.  */
+
+static void
+compute_dom_fast_query_in_region (enum cdi_direction dir,
+				  vec<basic_block> region)
+{
+  int num = 0;
+  basic_block bb;
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+
+  gcc_checking_assert (dom_info_available_p (dir));
+
+  if (dom_computed[dir_index] == DOM_OK)
+    return;
+
+  /* Assign dfs numbers for region nodes except for entry and exit nodes.  */
+  for (unsigned int i = 1; i < region.length () - 1; i++)
+    {
+      bb = region[i];
+      if (!bb->dom[dir_index]->father)
+	assign_dfs_numbers (bb->dom[dir_index], &num);
+    }
+
+  dom_computed[dir_index] = DOM_OK;
+}
+
 /* The main entry point into this module.  DIR is set depending on whether
    we want to compute dominators or postdominators.  */
 
@@ -668,6 +749,43 @@ calculate_dominance_info (cdi_direction dir)
   timevar_pop (TV_DOMINANCE);
 }
 
+/* Analogous to the previous function but compute dominance info for regions
+   which are single entry, multiple exit regions for CDI_DOMINATORs and
+   multiple entry, single exit regions for CDI_POST_DOMINATORs.  */
+
+void
+calculate_dominance_info_for_region (cdi_direction dir,
+				     vec<basic_block> region)
+{
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+  basic_block bb;
+  unsigned int i;
+
+  if (dom_computed[dir_index] == DOM_OK)
+    return;
+
+  timevar_push (TV_DOMINANCE);
+  /* Assume that dom info is not partially computed.  */
+  gcc_assert (!dom_info_available_p (dir));
+
+  FOR_EACH_VEC_ELT (region, i, bb)
+    {
+      bb->dom[dir_index] = et_new_tree (bb);
+    }
+  dom_info di (region, dir);
+  di.calc_dfs_tree ();
+  di.calc_idoms ();
+
+  FOR_EACH_VEC_ELT (region, i, bb)
+    if (basic_block d = di.get_idom (bb))
+      et_set_father (bb->dom[dir_index], d->dom[dir_index]);
+
+  dom_computed[dir_index] = DOM_NO_FAST_QUERY;
+  compute_dom_fast_query_in_region (dir, region);
+
+  timevar_pop (TV_DOMINANCE);
+}
+
 /* Free dominance information for direction DIR.  */
 void
 free_dominance_info (function *fn, enum cdi_direction dir)
@@ -696,6 +814,39 @@ free_dominance_info (enum cdi_direction dir)
   free_dominance_info (cfun, dir);
 }
 
+/* Free dominance information for direction DIR in region REGION.  */
+
+void
+free_dominance_info_for_region (enum cdi_direction dir,
+				vec<basic_block> region,
+				function *fn)
+{
+  basic_block bb;
+  unsigned int i;
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+
+  if (!dom_info_available_p (dir))
+    return;
+
+  FOR_EACH_VEC_ELT (region, i, bb)
+    {
+      et_free_tree_force (bb->dom[dir_index]);
+      bb->dom[dir_index] = NULL;
+    }
+  et_free_pools ();
+
+  fn->cfg->x_dom_computed[dir_index] = DOM_NONE;
+
+  fn->cfg->x_n_bbs_in_dom_tree[dir_index] = 0;
+}
+
+void
+free_dominance_info_for_region (enum cdi_direction dir,
+				vec<basic_block> region)
+{
+  free_dominance_info_for_region (dir, region, cfun);
+}
+
 /* Return the immediate dominator of basic block BB.  */
 basic_block
 get_immediate_dominator (enum cdi_direction dir, basic_block bb)
diff --git a/gcc/dominance.h b/gcc/dominance.h
index 961c4e2..8ed7f5c 100644
--- a/gcc/dominance.h
+++ b/gcc/dominance.h
@@ -36,8 +36,15 @@ enum dom_state
 };
 
 extern void calculate_dominance_info (enum cdi_direction);
+extern void calculate_dominance_info_for_region (enum cdi_direction,
+						 vec<basic_block>);
 extern void free_dominance_info (function *, enum cdi_direction);
 extern void free_dominance_info (enum cdi_direction);
+extern void free_dominance_info_for_region (enum cdi_direction,
+					    vec<basic_block>,
+					    function *);
+extern void free_dominance_info_for_region (enum cdi_direction,
+					    vec<basic_block>);
 extern basic_block get_immediate_dominator (enum cdi_direction, basic_block);
 extern void set_immediate_dominator (enum cdi_direction, basic_block,
 				     basic_block);
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index eec431e..aa26c59 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -1309,6 +1309,38 @@ predicate_bbs (loop_p loop)
 	      && bb_predicate_gimplified_stmts (loop->latch) == NULL);
 }
 
+/* Build region by adding loop pre-header and post-header blocks.  */
+
+static vec<basic_block>
+build_region (struct loop *loop)
+{
+  vec<basic_block> region = vNULL;
+  basic_block exit_bb = NULL;
+
+  gcc_assert (ifc_bbs);
+  /* The first element is loop pre-header.  */
+  region.safe_push (loop_preheader_edge (loop)->src);
+
+  for (unsigned int i = 0; i < loop->num_nodes; i++)
+    {
+      basic_block bb = ifc_bbs[i];
+      region.safe_push (bb);
+      /* Find loop postheader.  */
+      edge e;
+      edge_iterator ei;
+      FOR_EACH_EDGE (e, ei, bb->succs)
+	if (loop_exit_edge_p (loop, e))
+	  {
+	      exit_bb = e->dest;
+	      break;
+	  }
+    }
+  /* The last element is loop post-header.  */
+  gcc_assert (exit_bb);
+  region.safe_push (exit_bb);
+  return region;
+}
+
 /* Return true when LOOP is if-convertible.  This is a helper function
    for if_convertible_loop_p.  REFS and DDRS are initialized and freed
    in if_convertible_loop_p.  */
@@ -1318,6 +1350,7 @@ if_convertible_loop_p_1 (struct loop *loop, vec<data_reference_p> *refs)
 {
   unsigned int i;
   basic_block exit_bb = NULL;
+  vec<basic_block> region;
 
   if (find_data_references_in_loop (loop, refs) == chrec_dont_know)
     return false;
@@ -1370,9 +1403,16 @@ if_convertible_loop_p_1 (struct loop *loop, vec<data_reference_p> *refs)
 	  = new hash_map<innermost_loop_behavior_hash, data_reference_p>;
   baseref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
 
-  calculate_dominance_info (CDI_POST_DOMINATORS);
+  /* Compute post-dominator tree locally.  */
+  region = build_region (loop);
+  calculate_dominance_info_for_region (CDI_POST_DOMINATORS, region);
+
   predicate_bbs (loop);
 
+  /* Free post-dominator tree since it is not used after predication.  */
+  free_dominance_info_for_region (CDI_POST_DOMINATORS, region);
+  region.release ();
+
   for (i = 0; refs->iterate (i, &dr); i++)
     {
       tree ref = DR_REF (dr);
@@ -2752,7 +2792,6 @@ tree_if_conversion (struct loop *loop)
       free (ifc_bbs);
       ifc_bbs = NULL;
     }
-  free_dominance_info (CDI_POST_DOMINATORS);
 
   return todo;
 }
@@ -2805,14 +2844,6 @@ pass_if_conversion::execute (function *fun)
   if (number_of_loops (fun) <= 1)
     return 0;
 
-  /* If there are infinite loops, during CDI_POST_DOMINATORS computation
-     we can pick pretty much random bb inside of the infinite loop that
-     has the fake edge.  If we are unlucky enough, this can confuse the
-     add_to_predicate_list post-dominator check to optimize as if that
-     bb or some other one is a join block when it actually is not.
-     See PR70916.  */
-  connect_infinite_loops_to_exit ();
-
   FOR_EACH_LOOP (loop, 0)
     if (flag_tree_loop_if_convert == 1
 	|| flag_tree_loop_if_convert_stores == 1
@@ -2820,8 +2851,6 @@ pass_if_conversion::execute (function *fun)
 	    && !loop->dont_vectorize))
       todo |= tree_if_conversion (loop);
 
-  remove_fake_exit_edges ();
-
   if (flag_checking)
     {
       basic_block bb;

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Compile-time improvement for if conversion.
  2016-10-14 12:07                   ` Yuri Rumyantsev
@ 2016-10-17 11:53                     ` Richard Biener
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Biener @ 2016-10-17 11:53 UTC (permalink / raw)
  To: Yuri Rumyantsev; +Cc: gcc-patches

On Fri, Oct 14, 2016 at 2:07 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Richard,
>
> Here is updated patch with the changes proposed by you.
>
> Bootstrapping and regression testing did not show any new failures.
> Is it OK for trunk?

Ok with dropping the free_dominance_info_for_region variant without
struct function argument and making the one with take struct function *
as _first_ argument.

Thanks,
Richard.

> ChangeLog:
> 2016-10-14  Yuri Rumyantsev  <ysrumyan@gmail.com>
>
> * dominance.c (dom_info::dom_info): Add new constructor for region
> presented by vector of basic blocks.
> (dom_init): New method to initialize members common for both
> constructors.
> (dom_info::dom_info): Invoke dom_init for partial initialization.
> (dom_info::get_idom): Add check to corner cases on basic blocks which
> are not in region.
> (dom_info::calc_dfs_tree): Check M_FAKE_EXIT_EDGE instead of M_REVERSE
> to detect unreachable bbs.
> (dom_info::calc_idoms): Likewise.
> (compute_dom_fast_query_in_region): New function.
> (calculate_dominance_info_for_region): Likewise.
> (free_dominance_info_for_region): Add couple functions to free
> dominance info for region.
> * dominance.h: Add prototypes for introduced region-based functions
> tree-if-conv.c: (build_region): New function.
> (if_convertible_loop_p_1): Invoke local version of post-dominators
> calculation before basic block predication with subsequent freeing
> post-dominator info.
> (tree_if_conversion): Remove free of post-dominator info
> (pass_if_conversion::execute): Delete detection of infinite loops
> and fake edges to exit block since post-dominator calculation is
> performed per if-converted loop only.
>
> 2016-10-13 13:15 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>> On Wed, Oct 12, 2016 at 3:37 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Richard,
>>>
>>> Here is updated patch . I avoided creation of new entry/exit blocks
>>> but instead add check to border cases - do not consider also blocks
>>> which are out of region.
>>>
>>> Any comments will be appreciated.
>>
>> I mostly like it now.  Can you split out all the common parts from the
>> dom_info constructor
>> to a helper method please and just compute m_n_basic_blocks and a max_index and
>> do all the memory allocation in that common function?
>>
>> @@ -276,9 +334,10 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb)
>>               bn = e->src;
>>
>>               /* If the next node BN is either already visited or a border
>> -                block the current edge is useless, and simply overwritten
>> -                with the next edge out of the current node.  */
>> -             if (bn == m_end_block || m_dfs_order[bn->index])
>> +                block or out of region the current edge is useless, and simply
>> +                overwritten with the next edge out of the current node.  */
>> +             if (bn == m_end_block || bn->dom[d_i] == NULL
>>
>> clever idea ;)  Just thinking if this means we support single entry,
>> multiple exit
>> regions for CDI_DOMINATORs and multiple entry, single exit regions for
>> CDI_POST_DOMINATORs.  I think so.  Please update the function comments
>> accordingly.
>>
>>
>> +  m_dfsnum = 1;
>> +  m_nodes = 0;
>> +  m_fake_exit_edge = NULL; /* Assume that region is SCC.  */
>>
>> you mean SESE rather than SCC?
>>
>> @@ -511,7 +573,7 @@ dom_info::calc_idoms ()
>>                                    : ei_start (bb->preds);
>>        edge_iterator einext;
>>
>> -      if (m_reverse)
>> +      if (m_reverse && !in_region)
>>         {
>>           /* If this block has a fake edge to exit, process that first.  */
>>           if (bitmap_bit_p (m_fake_exit_edge, bb->index))
>>
>> I think it's better to simply change the if (m_reverse) test to a if
>> (m_fake_exit_edge) test.
>>
>> I noticed you do not initialize n_bbs_in_dom_tree (just set it to
>> zero), it's not really used
>> anywhere (but in an assert).
>>
>> free_dominance_info_for_region needs a function level comment (per
>> coding guidelines).
>>
>> Please make free_dominance_info_for_region take a struct function * pointer.
>>
>>  @@ -1318,11 +1350,13 @@ if_convertible_loop_p_1 (struct loop *loop,
>> vec<data_reference_p> *refs)
>>  {
>>    unsigned int i;
>>    basic_block exit_bb = NULL;
>> +  vec<basic_block> region;
>>
>>    if (find_data_references_in_loop (loop, refs) == chrec_dont_know)
>>      return false;
>>
>> -  calculate_dominance_info (CDI_DOMINATORS);
>> +  if (dom_info_state (CDI_DOMINATORS) != DOM_OK)
>> +    calculate_dominance_info (CDI_DOMINATORS);
>>
>> This is a premature (unwanted) change.
>>
>> Other than that the only O(function-size) part left is the zeroing in
>>
>> +  /* Determine max basic block index in region.  */
>> +  int max_index = region[0]->index;
>> +  for (size_t i = 1; i < num; i++)
>> +    if (region[i]->index > max_index)
>> +      max_index = region[i]->index;
>> +  max_index += 1;
>> +  m_dfs_order = new_zero_array <TBB> (max_index + 1);
>> +  m_dfs_last = &m_dfs_order[max_index];
>>
>> I suppose we can try addressing this as followup.
>>
>> Thanks a lot for doing this.
>> Richard.
>>
>>> 2016-10-11 16:48 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>> On Tue, Oct 11, 2016 at 3:23 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>> Richard,
>>>>>
>>>>> I implemented this by passing callback function in_region which
>>>>> returns true if block belongs to region.
>>>>> I am testing it now
>>>>>
>>>>> I attach modified patch for your quick review.
>>>>
>>>> +  FOR_EACH_VEC_ELT (region, i, bb)
>>>> +    {
>>>> +      bb->dom[dir_index] = et_new_tree (bb);
>>>> +    }
>>>> +  /* Check that region is SESE region.  */
>>>> +  entry = region[0];
>>>> +  if ( EDGE_COUNT (entry->succs) > 1)
>>>> +    {
>>>> +      /* Create new entry block with the only successor.  */
>>>> +      basic_block succ = NULL;
>>>> +      bool found = false;
>>>> +      FOR_EACH_EDGE (e, ei, entry->succs)
>>>> +       if (in_region (region, e->dest))
>>>> +         {
>>>> +           gcc_assert (!found);
>>>>
>>>> is that check equal to e->dest->dom[dir_index] != NULL?  I think we
>>>> shouldn't need the callback.
>>>>
>>>> +    new_entry = create_empty_bb (entry);
>>>> +    unchecked_make_edge (new_entry, succ, 0);
>>>>
>>>> I still think this is somewhat gross and we should try to avoid it
>>>> somehow - at least it's well-hidden now ;)
>>>>
>>>> +    /* Put it to region as entry node.  */
>>>> +    region[0] = new_entry;
>>>>
>>>> so region[0] is overwritten now?
>>>>
>>>> As far as I understand calc_dfs_tree it should be possible to compute
>>>> the region on-the-fly
>>>> from calling calc_dfs_tree_nonrec on the entry/exit (also maybe
>>>> avoiding some of the still
>>>> "large" complexity of zeroing arrays in the constructor).
>>>>
>>>> And if we use that DFS walk directly we should be able to avoid
>>>> creating those fake entry/exit
>>>> blocks by using entry/exit edges instead... (somehow).
>>>>
>>>> Richard.
>>>>
>>>>
>>>>
>>>>> Thanks.
>>>>>
>>>>> 2016-10-11 13:33 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>> On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>> Richard,
>>>>>>>
>>>>>>> If "fake" exit or entry block is created in dominance how we can
>>>>>>> determine what is its the only  predecessor or successor without using
>>>>>>> a notion of loop?
>>>>>>
>>>>>> The caller passes in an entry and exit edge instead of a block or loop.
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>>>> 2016-10-10 15:00 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>> Thanks Richard for your comments.
>>>>>>>>> I'd like to answer on your last comment regarding use split_edge()
>>>>>>>>> instead of creating fake post-header. I started with this splitting
>>>>>>>>> but it requires to fix-up closed ssa form by creating additional phi
>>>>>>>>> nodes, so I decided to use only cfg change without updating ssa form.
>>>>>>>>> Other changes look reasonable and will fix them.
>>>>>>>>
>>>>>>>> Ah.  In this case can you investigate what it takes to make the entry/exit
>>>>>>>> edges rather than BBs?  That is, introduce those "fakes" only internally
>>>>>>>> in dominance.c?
>>>>>>>>
>>>>>>>>> 2016-10-10 12:52 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>>>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> Here is implementation of Richard proposal:
>>>>>>>>>>>
>>>>>>>>>>> < For general infrastructure it would be nice to expose a (post-)dominator
>>>>>>>>>>> < compute for MESE (post-dominators) / SEME (dominators) regions.  I believe
>>>>>>>>>>> < what makes if-conversion expensive is the post-dom compute which happens
>>>>>>>>>>> < for each loop for the whole function.  It shouldn't be very difficult
>>>>>>>>>>> < to write this,
>>>>>>>>>>> < sharing as much as possible code with the current DOM code might need
>>>>>>>>>>> < quite some refactoring though.
>>>>>>>>>>>
>>>>>>>>>>> I implemented this proposal by adding calculation of dominance info
>>>>>>>>>>> for SESE regions and incorporate this change to if conversion pass.
>>>>>>>>>>> SESE region is built by adding loop pre-header and possibly fake
>>>>>>>>>>> post-header blocks to loop body. Fake post-header is deleted after
>>>>>>>>>>> predication completion.
>>>>>>>>>>>
>>>>>>>>>>> Bootstrapping and regression testing did not show any new failures.
>>>>>>>>>>>
>>>>>>>>>>> Is it OK for trunk?
>>>>>>>>>>
>>>>>>>>>> It's mostly reasonable but I have a few comments.  First, re-using
>>>>>>>>>> bb->dom[] for the dominator info is somewhat fragile but indeed
>>>>>>>>>> a requirement to make the patch reasonably small.  Please,
>>>>>>>>>> in calculate_dominance_info_for_region, make sure that
>>>>>>>>>> !dom_info_available_p (dir).
>>>>>>>>>>
>>>>>>>>>> You pass loop * everywhere but require ->aux to be set up as
>>>>>>>>>> an array of BBs forming the region with special BBs at array ends.
>>>>>>>>>>
>>>>>>>>>> Please instead pass in a vec<basic_block> which avoids using ->aux
>>>>>>>>>> and also allows other non-loop-based SESE regions to be used
>>>>>>>>>> (I couldn't spot anything that relies on this being a loop).
>>>>>>>>>>
>>>>>>>>>> Adding a convenience wrapper for loop  * would be of course nice,
>>>>>>>>>> to cover the special pre/post-header code in tree-if-conv.c.
>>>>>>>>>>
>>>>>>>>>> In theory a SESE region is fully specified by its entry end exit _edge_,
>>>>>>>>>> so you might want to see if it's possible to use such a pair of edges
>>>>>>>>>> to guard the dfs/idom walks to avoid the need to create fake blocks.
>>>>>>>>>>
>>>>>>>>>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc.
>>>>>>>>>> please use split_edge() of the entry/exit edges.
>>>>>>>>>>
>>>>>>>>>> Richard.
>>>>>>>>>>
>>>>>>>>>>> ChangeLog:
>>>>>>>>>>> 2016-10-05  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>>>>>
>>>>>>>>>>> * dominance.c : Include cfgloop.h for loop recognition.
>>>>>>>>>>> (dom_info): Add new functions and add boolean argument to recognize
>>>>>>>>>>> computation for loop region.
>>>>>>>>>>> (dom_info::dom_info): New function.
>>>>>>>>>>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not
>>>>>>>>>>> handle unvisited blocks.
>>>>>>>>>>> (dom_info::calc_idoms): Likewise.
>>>>>>>>>>> (compute_dom_fast_query_in_region): New function.
>>>>>>>>>>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with
>>>>>>>>>>> false argument.
>>>>>>>>>>> (calculate_dominance_info_for_region): New function.
>>>>>>>>>>> (free_dominance_info_for_region): Likewise.
>>>>>>>>>>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false
>>>>>>>>>>> argument.
>>>>>>>>>>> * dominance.h: Add prototype for introduced functions
>>>>>>>>>>> calculate_dominance_info_for_region and
>>>>>>>>>>> free_dominance_info_for_region.
>>>>>>>>>>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader.
>>>>>>>>>>> (build_sese_region): New function.
>>>>>>>>>>> (if_convertible_loop_p_1): Invoke local version of post-dominators
>>>>>>>>>>> calculation, free it after basic block predication and delete created
>>>>>>>>>>> fake post-header block if any.
>>>>>>>>>>> (tree_if_conversion): Delete call of free_dominance_info for
>>>>>>>>>>> post-dominators, free ifc_sese_bbs which represents SESE region.
>>>>>>>>>>> (pass_if_conversion::execute): Delete detection of infinite loops
>>>>>>>>>>> and fake edges to exit block since post-dominator calculation is
>>>>>>>>>>> performed per if-converted loop only.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-10-17 11:53 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-05 13:22 Compile-time improvement for if conversion Yuri Rumyantsev
2016-10-10  9:52 ` Richard Biener
2016-10-10 11:42   ` Yuri Rumyantsev
2016-10-10 12:00     ` Richard Biener
2016-10-10 14:17       ` Yuri Rumyantsev
2016-10-11 10:33         ` Richard Biener
2016-10-11 13:23           ` Yuri Rumyantsev
2016-10-11 13:48             ` Richard Biener
2016-10-12 13:37               ` Yuri Rumyantsev
2016-10-13 10:15                 ` Richard Biener
2016-10-14 12:07                   ` Yuri Rumyantsev
2016-10-17 11:53                     ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).