public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [PATCH] middle-end/114579 - speed up add_scope_conflicts
@ 2024-04-04 13:32 Michael Matz
  0 siblings, 0 replies; 3+ messages in thread
From: Michael Matz @ 2024-04-04 13:32 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Jakub Jelinek

Hello,

On Thu, 4 Apr 2024, Richard Biener wrote:

> I have reworded the comment before the all-to-all conflict recording
> since it seemed to be confusing and missing the point - but maybe I
> am also missing something here.

The point of the comment was to explain how this differs from building a 
conflict graph for named objects without indirect accesses, i.e. how e.g. 
a conflict graph for register allocation is done.  There conflicts are 
added at the points where objects are defined (between that def and all 
currently live objects), and _only_ there.  I.e. not at CFG merge points 
and not at uses.  So something like so:

  foreach INSN in block backwards:
    foreach DEF in INSN:
      foreach O in active:
        add_conflict (DEF, O);
      remove (DEF from active);

That's for the more usual backwards direction, but it would be similar 
for forward one.  The point is that this is the only place where conflicts 
are added, not at CFG splits/merges.

The above relies on being able to see all places where the objects are 
written.  With indirect accesses that's not the case anymore:

   ptr = cond : &foo : &bar;   // MENTION foo, bar
   *ptr = 42;                  // no DEF of a named object
   escape (ptr);
   CLOBBER (foo)
   CLOBBER (bar)

We somewhere need to record the conflict between foo and bar.  
Conservatively we did so at the first real instruction of a block.  As 
minor optimization we also start a block in a mode where we just record 
mentions, and switch to also recording conflicts at the first real insn. 
(An empty block, or just PHIs do not cause conflicts in themself)

This need for an N-to-N conflict generation at some point is the major 
difference between building the conflict graph for named vs. 
potentionally indirect objects, and the comment tried to explain that 
from the perspective of someone familiar with conflict graph building 
in regalloc :-)

Where exactly these N-to-N conflicts are generated doesn't matter so much, 
as long as everything is there.  The current way was the most obvious one.

The observation that these N-to-N conflicts only need to be done on the 
commonly disjoint sets of the inputs (which is trivially empty for a 
single predecessor) is correct, i.e. only doing it with more than a single 
predecessor makes sense.  This requires adding the N-to-Ns always even in 
mostly emtpy blocks, for the danger of that being missed in the successor 
block, as you are doing.

So, fine with me, also with the changed comment if you think it explains 
stuff better :)


Ciao,
Michael.

> 
> Nevertheless for the testcase in the PR the compile-time spend in
> add_scope_conflicts at -O1 drops from previously 67s (39%) to 10s (9%).
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk?
> 
> Thanks,
> Richard.
> 
> 	PR middle-end/114579
> 	* cfgexpand.cc (add_scope_conflicts_1): Record all-to-all
> 	conflicts only when there's a CFG merge but for all CFG merges.
> ---
>  gcc/cfgexpand.cc | 46 +++++++++++++++++++++++++++++++++-------------
>  1 file changed, 33 insertions(+), 13 deletions(-)
> 
> diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
> index eef565eddb5..fa48a4c633f 100644
> --- a/gcc/cfgexpand.cc
> +++ b/gcc/cfgexpand.cc
> @@ -640,21 +640,26 @@ add_scope_conflicts_1 (basic_block bb, bitmap work, bool for_conflict)
>  	{
>  	  if (for_conflict && visit == visit_op)
>  	    {
> -	      /* If this is the first real instruction in this BB we need
> -	         to add conflicts for everything live at this point now.
> -		 Unlike classical liveness for named objects we can't
> -		 rely on seeing a def/use of the names we're interested in.
> -		 There might merely be indirect loads/stores.  We'd not add any
> -		 conflicts for such partitions.  */
> +	      /* When we are inheriting live variables from our predecessors
> +		 through a CFG merge we might not see an actual mention of
> +		 the variables to record the approprate conflict as defs/uses
> +		 might be through indirect stores/loads.  For this reason
> +		 we have to make sure each live variable conflicts with
> +		 each other.  When there's just a single predecessor the
> +		 set of conflicts is already up-to-date.
> +		 We perform this delayed at the first real instruction to
> +		 allow clobbers starting this block to remove variables from
> +		 the set of live variables.  */
>  	      bitmap_iterator bi;
>  	      unsigned i;
> -	      EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
> -		{
> -		  class stack_var *a = &stack_vars[i];
> -		  if (!a->conflicts)
> -		    a->conflicts = BITMAP_ALLOC (&stack_var_bitmap_obstack);
> -		  bitmap_ior_into (a->conflicts, work);
> -		}
> +	      if (EDGE_COUNT (bb->preds) > 1)
> +		EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
> +		  {
> +		    class stack_var *a = &stack_vars[i];
> +		    if (!a->conflicts)
> +		      a->conflicts = BITMAP_ALLOC (&stack_var_bitmap_obstack);
> +		    bitmap_ior_into (a->conflicts, work);
> +		  }
>  	      visit = visit_conflict;
>  	    }
>  	  walk_stmt_load_store_addr_ops (stmt, work, visit, visit, visit);
> @@ -662,6 +667,21 @@ add_scope_conflicts_1 (basic_block bb, bitmap work, bool for_conflict)
>  	    add_scope_conflicts_2 (USE_FROM_PTR (use_p), work, visit);
>  	}
>      }
> +
> +  /* When there was no real instruction but there's a CFG merge we need
> +     to add the conflicts now.  */
> +  if (for_conflict && visit == visit_op && EDGE_COUNT (bb->preds) > 1)
> +    {
> +      bitmap_iterator bi;
> +      unsigned i;
> +      EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
> +	{
> +	  class stack_var *a = &stack_vars[i];
> +	  if (!a->conflicts)
> +	    a->conflicts = BITMAP_ALLOC (&stack_var_bitmap_obstack);
> +	  bitmap_ior_into (a->conflicts, work);
> +	}
> +    }
>  }
>  
>  /* Generate stack partition conflicts between all partitions that are
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] middle-end/114579 - speed up add_scope_conflicts
       [not found] <20240404124128.752263858C78@sourceware.org>
@ 2024-04-05 15:09 ` Jeff Law
  0 siblings, 0 replies; 3+ messages in thread
From: Jeff Law @ 2024-04-05 15:09 UTC (permalink / raw)
  To: Richard Biener, gcc-patches; +Cc: Jakub Jelinek, Michael Matz



On 4/4/24 6:41 AM, Richard Biener wrote:
> The following speeds up stack variable conflict detection by recognizing
> that the all-to-all conflict recording is only necessary for CFG merges
> as it's the unioning of the live variable sets that doesn't come with
> explicit mentions we record conflicts for.
> 
> If we employ this optimization we have to make sure to perform the
> all-to-all conflict recording for all CFG merges even those into
> empty blocks where we might previously have skipped this.
> 
> I have reworded the comment before the all-to-all conflict recording
> since it seemed to be confusing and missing the point - but maybe I
> am also missing something here.
> 
> Nevertheless for the testcase in the PR the compile-time spend in
> add_scope_conflicts at -O1 drops from previously 67s (39%) to 10s (9%).
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk?
> 
> Thanks,
> Richard.
> 
> 	PR middle-end/114579
> 	* cfgexpand.cc (add_scope_conflicts_1): Record all-to-all
> 	conflicts only when there's a CFG merge but for all CFG merges.
OK.  Your call on whether or not to include in gcc-14 or wait for gcc-15.

jeff


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH] middle-end/114579 - speed up add_scope_conflicts
@ 2024-04-04 12:41 Richard Biener
  0 siblings, 0 replies; 3+ messages in thread
From: Richard Biener @ 2024-04-04 12:41 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Michael Matz

The following speeds up stack variable conflict detection by recognizing
that the all-to-all conflict recording is only necessary for CFG merges
as it's the unioning of the live variable sets that doesn't come with
explicit mentions we record conflicts for.

If we employ this optimization we have to make sure to perform the
all-to-all conflict recording for all CFG merges even those into
empty blocks where we might previously have skipped this.

I have reworded the comment before the all-to-all conflict recording
since it seemed to be confusing and missing the point - but maybe I
am also missing something here.

Nevertheless for the testcase in the PR the compile-time spend in
add_scope_conflicts at -O1 drops from previously 67s (39%) to 10s (9%).

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk?

Thanks,
Richard.

	PR middle-end/114579
	* cfgexpand.cc (add_scope_conflicts_1): Record all-to-all
	conflicts only when there's a CFG merge but for all CFG merges.
---
 gcc/cfgexpand.cc | 46 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 33 insertions(+), 13 deletions(-)

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index eef565eddb5..fa48a4c633f 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -640,21 +640,26 @@ add_scope_conflicts_1 (basic_block bb, bitmap work, bool for_conflict)
 	{
 	  if (for_conflict && visit == visit_op)
 	    {
-	      /* If this is the first real instruction in this BB we need
-	         to add conflicts for everything live at this point now.
-		 Unlike classical liveness for named objects we can't
-		 rely on seeing a def/use of the names we're interested in.
-		 There might merely be indirect loads/stores.  We'd not add any
-		 conflicts for such partitions.  */
+	      /* When we are inheriting live variables from our predecessors
+		 through a CFG merge we might not see an actual mention of
+		 the variables to record the approprate conflict as defs/uses
+		 might be through indirect stores/loads.  For this reason
+		 we have to make sure each live variable conflicts with
+		 each other.  When there's just a single predecessor the
+		 set of conflicts is already up-to-date.
+		 We perform this delayed at the first real instruction to
+		 allow clobbers starting this block to remove variables from
+		 the set of live variables.  */
 	      bitmap_iterator bi;
 	      unsigned i;
-	      EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
-		{
-		  class stack_var *a = &stack_vars[i];
-		  if (!a->conflicts)
-		    a->conflicts = BITMAP_ALLOC (&stack_var_bitmap_obstack);
-		  bitmap_ior_into (a->conflicts, work);
-		}
+	      if (EDGE_COUNT (bb->preds) > 1)
+		EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
+		  {
+		    class stack_var *a = &stack_vars[i];
+		    if (!a->conflicts)
+		      a->conflicts = BITMAP_ALLOC (&stack_var_bitmap_obstack);
+		    bitmap_ior_into (a->conflicts, work);
+		  }
 	      visit = visit_conflict;
 	    }
 	  walk_stmt_load_store_addr_ops (stmt, work, visit, visit, visit);
@@ -662,6 +667,21 @@ add_scope_conflicts_1 (basic_block bb, bitmap work, bool for_conflict)
 	    add_scope_conflicts_2 (USE_FROM_PTR (use_p), work, visit);
 	}
     }
+
+  /* When there was no real instruction but there's a CFG merge we need
+     to add the conflicts now.  */
+  if (for_conflict && visit == visit_op && EDGE_COUNT (bb->preds) > 1)
+    {
+      bitmap_iterator bi;
+      unsigned i;
+      EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi)
+	{
+	  class stack_var *a = &stack_vars[i];
+	  if (!a->conflicts)
+	    a->conflicts = BITMAP_ALLOC (&stack_var_bitmap_obstack);
+	  bitmap_ior_into (a->conflicts, work);
+	}
+    }
 }
 
 /* Generate stack partition conflicts between all partitions that are
-- 
2.35.3

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-04-05 15:09 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-04 13:32 [PATCH] middle-end/114579 - speed up add_scope_conflicts Michael Matz
     [not found] <20240404124128.752263858C78@sourceware.org>
2024-04-05 15:09 ` Jeff Law
  -- strict thread matches above, loose matches on Subject: below --
2024-04-04 12:41 Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).