public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] tree-optimization/102943 - avoid (re-)computing dominance bitmap
@ 2022-03-10 11:50 Richard Biener
  2022-03-11  3:31 ` Jeff Law
  0 siblings, 1 reply; 2+ messages in thread
From: Richard Biener @ 2022-03-10 11:50 UTC (permalink / raw)
  To: gcc-patches

Currently back_propagate_equivalences tries to optimize dominance
queries in a smart way but it fails to notice that when fast indexes
are available the dominance query is fast (when called from DOM).
It also re-computes the dominance bitmap for each equivalence recorded
on an edge, which for FP are usually several.  Finally it fails to
use the tree bitmap view for efficiency.  Overall this cuts 7
seconds of compile-time from originally 77 in the slowest LTRANS
unit when building 521.wrf_r.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2022-03-10  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/102943
	* tree-ssa-dom.cc (back_propagate_equivalences): Only
	populate the dominance bitmap if fast queries are not
	available.  Use a tree view bitmap.
	(record_temporary_equivalences): Cache the dominance bitmap
	across all equivalences on the edge.
---
 gcc/tree-ssa-dom.cc | 58 +++++++++++++++++++++++++++------------------
 1 file changed, 35 insertions(+), 23 deletions(-)

diff --git a/gcc/tree-ssa-dom.cc b/gcc/tree-ssa-dom.cc
index fc90c207b52..21745bf31d3 100644
--- a/gcc/tree-ssa-dom.cc
+++ b/gcc/tree-ssa-dom.cc
@@ -1025,12 +1025,13 @@ dom_valueize (tree t)
    additional equivalences that are valid on edge E.  */
 static void
 back_propagate_equivalences (tree lhs, edge e,
-			     class const_and_copies *const_and_copies)
+			     class const_and_copies *const_and_copies,
+			     bitmap *domby)
 {
   use_operand_p use_p;
   imm_use_iterator iter;
-  bitmap domby = NULL;
   basic_block dest = e->dest;
+  bool domok = (dom_info_state (CDI_DOMINATORS) == DOM_OK);
 
   /* Iterate over the uses of LHS to see if any dominate E->dest.
      If so, they may create useful equivalences too.
@@ -1053,27 +1054,38 @@ back_propagate_equivalences (tree lhs, edge e,
       if (!lhs2 || TREE_CODE (lhs2) != SSA_NAME)
 	continue;
 
-      /* Profiling has shown the domination tests here can be fairly
-	 expensive.  We get significant improvements by building the
-	 set of blocks that dominate BB.  We can then just test
-	 for set membership below.
-
-	 We also initialize the set lazily since often the only uses
-	 are going to be in the same block as DEST.  */
-      if (!domby)
+      if (domok)
 	{
-	  domby = BITMAP_ALLOC (NULL);
-	  basic_block bb = get_immediate_dominator (CDI_DOMINATORS, dest);
-	  while (bb)
+	  if (!dominated_by_p (CDI_DOMINATORS, dest, gimple_bb (use_stmt)))
+	    continue;
+	}
+      else
+	{
+	  /* Profiling has shown the domination tests here can be fairly
+	     expensive when the fast indexes are not computed.
+	     We get significant improvements by building the
+	     set of blocks that dominate BB.  We can then just test
+	     for set membership below.
+
+	     We also initialize the set lazily since often the only uses
+	     are going to be in the same block as DEST.  */
+
+	  if (!*domby)
 	    {
-	      bitmap_set_bit (domby, bb->index);
-	      bb = get_immediate_dominator (CDI_DOMINATORS, bb);
+	      *domby = BITMAP_ALLOC (NULL);
+	      bitmap_tree_view (*domby);
+	      basic_block bb = get_immediate_dominator (CDI_DOMINATORS, dest);
+	      while (bb)
+		{
+		  bitmap_set_bit (*domby, bb->index);
+		  bb = get_immediate_dominator (CDI_DOMINATORS, bb);
+		}
 	    }
-	}
 
-      /* This tests if USE_STMT does not dominate DEST.  */
-      if (!bitmap_bit_p (domby, gimple_bb (use_stmt)->index))
-	continue;
+	  /* This tests if USE_STMT does not dominate DEST.  */
+	  if (!bitmap_bit_p (*domby, gimple_bb (use_stmt)->index))
+	    continue;
+	}
 
       /* At this point USE_STMT dominates DEST and may result in a
 	 useful equivalence.  Try to simplify its RHS to a constant
@@ -1083,9 +1095,6 @@ back_propagate_equivalences (tree lhs, edge e,
       if (res && (TREE_CODE (res) == SSA_NAME || is_gimple_min_invariant (res)))
 	record_equality (lhs2, res, const_and_copies);
     }
-
-  if (domby)
-    BITMAP_FREE (domby);
 }
 
 /* Record into CONST_AND_COPIES and AVAIL_EXPRS_STACK any equivalences implied
@@ -1110,6 +1119,7 @@ record_temporary_equivalences (edge e,
       for (i = 0; edge_info->cond_equivalences.iterate (i, &eq); ++i)
 	avail_exprs_stack->record_cond (eq);
 
+      bitmap domby = NULL;
       edge_info::equiv_pair *seq;
       for (i = 0; edge_info->simple_equivalences.iterate (i, &seq); ++i)
 	{
@@ -1146,8 +1156,10 @@ record_temporary_equivalences (edge e,
 	  /* Any equivalence found for LHS may result in additional
 	     equivalences for other uses of LHS that we have already
 	     processed.  */
-	  back_propagate_equivalences (lhs, e, const_and_copies);
+	  back_propagate_equivalences (lhs, e, const_and_copies, &domby);
 	}
+      if (domby)
+	BITMAP_FREE (domby);
     }
 }
 
-- 
2.34.1

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH] tree-optimization/102943 - avoid (re-)computing dominance bitmap
  2022-03-10 11:50 [PATCH] tree-optimization/102943 - avoid (re-)computing dominance bitmap Richard Biener
@ 2022-03-11  3:31 ` Jeff Law
  0 siblings, 0 replies; 2+ messages in thread
From: Jeff Law @ 2022-03-11  3:31 UTC (permalink / raw)
  To: Richard Biener, gcc-patches



On 3/10/2022 4:50 AM, Richard Biener via Gcc-patches wrote:
> Currently back_propagate_equivalences tries to optimize dominance
> queries in a smart way but it fails to notice that when fast indexes
> are available the dominance query is fast (when called from DOM).
> It also re-computes the dominance bitmap for each equivalence recorded
> on an edge, which for FP are usually several.  Finally it fails to
> use the tree bitmap view for efficiency.  Overall this cuts 7
> seconds of compile-time from originally 77 in the slowest LTRANS
> unit when building 521.wrf_r.
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>
> Richard.
>
> 2022-03-10  Richard Biener  <rguenther@suse.de>
>
> 	PR tree-optimization/102943
> 	* tree-ssa-dom.cc (back_propagate_equivalences): Only
> 	populate the dominance bitmap if fast queries are not
> 	available.  Use a tree view bitmap.
> 	(record_temporary_equivalences): Cache the dominance bitmap
> 	across all equivalences on the edge.
Anything that helps WRF build times is a win in my book.    IIRC the 
worst module in there takes something like 27 minutes to cross compile 
for our target.  Annoying as hell when you have to do it multiple times 
a day.

jeff


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2022-03-11  3:31 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-10 11:50 [PATCH] tree-optimization/102943 - avoid (re-)computing dominance bitmap Richard Biener
2022-03-11  3:31 ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).