public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: Fix PR rtl-optimization/34035
@ 2007-11-12 22:50 Steven Bosscher
  2007-11-12 23:00 ` Eric Botcazou
  0 siblings, 1 reply; 10+ messages in thread
From: Steven Bosscher @ 2007-11-12 22:50 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

Eric Botcazou wrote in http://gcc.gnu.org/ml/gcc-patches/2007-11/msg00666.html:
> Fixed by running cleanup_cfg (0) in this case.

This is how compile time always creeps up in gcc.

Please just call delete_unreachable_blocks() instead of cleanup_cfg().

Gr.
Steven

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Fix PR rtl-optimization/34035
  2007-11-12 22:50 Fix PR rtl-optimization/34035 Steven Bosscher
@ 2007-11-12 23:00 ` Eric Botcazou
  2007-11-12 23:10   ` David Edelsohn
  2007-11-13  0:29   ` Steven Bosscher
  0 siblings, 2 replies; 10+ messages in thread
From: Eric Botcazou @ 2007-11-12 23:00 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc-patches

> This is how compile time always creeps up in gcc.

Figures?

> Please just call delete_unreachable_blocks() instead of cleanup_cfg().

I think that cleanup_cfg (0) is a good compromise in this case.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Fix PR rtl-optimization/34035
  2007-11-12 23:00 ` Eric Botcazou
@ 2007-11-12 23:10   ` David Edelsohn
  2007-11-13  0:27     ` Eric Botcazou
  2007-11-13  0:29   ` Steven Bosscher
  1 sibling, 1 reply; 10+ messages in thread
From: David Edelsohn @ 2007-11-12 23:10 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Steven Bosscher

Steven> Please just call delete_unreachable_blocks() instead of cleanup_cfg().

Eric> I think that cleanup_cfg (0) is a good compromise in this case.

	Why?  How about some justification for your assertion.

Thanks, David

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Fix PR rtl-optimization/34035
  2007-11-12 23:10   ` David Edelsohn
@ 2007-11-13  0:27     ` Eric Botcazou
  0 siblings, 0 replies; 10+ messages in thread
From: Eric Botcazou @ 2007-11-13  0:27 UTC (permalink / raw)
  To: David Edelsohn; +Cc: gcc-patches, Steven Bosscher

> 	Why?  How about some justification for your assertion.

It's the fastest mode of cleanup_cfg and purging dead edges can lead to 
opportunities of CFG simplification.  And it's the canonical cleanup routine 
at the end of RTL passes.  Note that it is only invoked on demand here.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Fix PR rtl-optimization/34035
  2007-11-12 23:00 ` Eric Botcazou
  2007-11-12 23:10   ` David Edelsohn
@ 2007-11-13  0:29   ` Steven Bosscher
  2007-11-13  1:51     ` Eric Botcazou
  1 sibling, 1 reply; 10+ messages in thread
From: Steven Bosscher @ 2007-11-13  0:29 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

On Nov 12, 2007 10:49 PM, Eric Botcazou <ebotcazou@libertysurf.fr> wrote:
> > This is how compile time always creeps up in gcc.
>
> Figures?
>
> > Please just call delete_unreachable_blocks() instead of cleanup_cfg().
>
> I think that cleanup_cfg (0) is a good compromise in this case.

I respectfully disagree.

I appreciate you fixed this bug, first of all.  But if deleting just
the unreachable blocks is sufficient, then there is no reason that I
can think of to do a cfg cleanup.

With cleanup_cfg(0) you add another delete_trivially_dead_insns()
pass, and three iterations over all basic blocks.  This does seem, at
first, to be quite cheap.  The problem is that we do it all over the
place when it actually doesn't buy you anything: The "good" rtl passes
look through the dead code that d_t_d_i can remove, and even the
semi-good passes don't care about an extra forwarder block or a
single_succ block here and there.

IMO much of gcc's abysmal compile time performance can be blamed on
doing redundant work all over the place, trying hard for little gain.
Sorry to jump on your patch like this, but it just looks like
"obvious" redundant cfg cleanup work  (bikeshed, bikeshed! ;-)

But, many thanks for fixing the bug, at least! :-)

Gr.
Steven

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Fix PR rtl-optimization/34035
  2007-11-13  0:29   ` Steven Bosscher
@ 2007-11-13  1:51     ` Eric Botcazou
  2007-11-13  2:16       ` Steven Bosscher
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Botcazou @ 2007-11-13  1:51 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc-patches

> With cleanup_cfg(0) you add another delete_trivially_dead_insns()
> pass, and three iterations over all basic blocks.  This does seem, at
> first, to be quite cheap.  The problem is that we do it all over the
> place when it actually doesn't buy you anything: The "good" rtl passes
> look through the dead code that d_t_d_i can remove, and even the
> semi-good passes don't care about an extra forwarder block or a
> single_succ block here and there.

cleanup_cfg (0) is the canonical invocation to minimally clean up the CFG at 
the end of the RTL passes.  It is already run at the end of every CSE and 
GCSE pass if a single jump is modified.  We now also invoke it on demand in, 
I believe, relatively rare cases.

> IMO much of gcc's abysmal compile time performance can be blamed on
> doing redundant work all over the place, trying hard for little gain.
> Sorry to jump on your patch like this, but it just looks like
> "obvious" redundant cfg cleanup work  (bikeshed, bikeshed! ;-)

I don't think cleanup_cfg (0) is "trying hard"; otherwise tree optimization 
passes shouldn't be allowed to set TODO_cleanup_cfg.  That being said, if you 
can come up with a testcase, I'll revise my position.  Of course don't try 
with C, C++ or Fortran, the change only affects those languages for which 
-fnon-call-exceptions is set, e.g. Ada that I specially care about.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Fix PR rtl-optimization/34035
  2007-11-13  1:51     ` Eric Botcazou
@ 2007-11-13  2:16       ` Steven Bosscher
  2007-11-13 15:45         ` Eric Botcazou
  2007-11-14 12:46         ` Eric Botcazou
  0 siblings, 2 replies; 10+ messages in thread
From: Steven Bosscher @ 2007-11-13  2:16 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

On Nov 13, 2007 12:01 AM, Eric Botcazou <ebotcazou@libertysurf.fr> wrote:
> > With cleanup_cfg(0) you add another delete_trivially_dead_insns()
> > pass, and three iterations over all basic blocks.  This does seem, at
> > first, to be quite cheap.  The problem is that we do it all over the
> > place when it actually doesn't buy you anything: The "good" rtl passes
> > look through the dead code that d_t_d_i can remove, and even the
> > semi-good passes don't care about an extra forwarder block or a
> > single_succ block here and there.
>
> cleanup_cfg (0) is the canonical invocation to minimally clean up the CFG at
> the end of the RTL passes.  It is already run at the end of every CSE and
> GCSE pass if a single jump is modified.

True.  And whether that is a good thing?  I seriously doubt it.


> > IMO much of gcc's abysmal compile time performance can be blamed on
> > doing redundant work all over the place, trying hard for little gain.
> > Sorry to jump on your patch like this, but it just looks like
> > "obvious" redundant cfg cleanup work  (bikeshed, bikeshed! ;-)
>
> I don't think cleanup_cfg (0) is "trying hard"; otherwise tree optimization
> passes shouldn't be allowed to set TODO_cleanup_cfg.

Tree cleanup cfg takes about 3.5% of the total compile time for my
cc1-i files.  'nuff said.


> That being said, if you
> can come up with a testcase, I'll revise my position.

I think any random large function will show some compile time
increase, although obviously no compile time explosions.  As I said,
the problem is not so much to do it here, but that we do it
everywhere.  Anyway, ...


>  Of course don't try
> with C, C++ or Fortran, the change only affects those languages for which
> -fnon-call-exceptions is set, e.g. Ada that I specially care about.

...I had overlooked it only affects compilation with
-fnon-call-exceptions.  Not worth the battle :-)

Gr.
Steven

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Fix PR rtl-optimization/34035
  2007-11-13  2:16       ` Steven Bosscher
@ 2007-11-13 15:45         ` Eric Botcazou
  2007-11-14 12:46         ` Eric Botcazou
  1 sibling, 0 replies; 10+ messages in thread
From: Eric Botcazou @ 2007-11-13 15:45 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc-patches

> I think any random large function will show some compile time
> increase, although obviously no compile time explosions.  As I said,
> the problem is not so much to do it here, but that we do it
> everywhere.

Here is at least a figure: 'make all-target-libjava' on x86-64/Linux with 
default+RTL checking, before the patch

real    117m59.962s
user    108m4.086s
sys     3m13.106s

and after the patch:

real    117m32.815s
user    107m57.010s
sys     3m16.274s

> ...I had overlooked it only affects compilation with -fnon-call-exceptions. 
> Not worth the battle :-) 

Looks like I've eventually found the killer argument. :-)

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Fix PR rtl-optimization/34035
  2007-11-13  2:16       ` Steven Bosscher
  2007-11-13 15:45         ` Eric Botcazou
@ 2007-11-14 12:46         ` Eric Botcazou
  1 sibling, 0 replies; 10+ messages in thread
From: Eric Botcazou @ 2007-11-14 12:46 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc-patches

> I think any random large function will show some compile time
> increase, although obviously no compile time explosions.  As I said,
> the problem is not so much to do it here, but that we do it
> everywhere.

Same wash for the Ada library:

real    8m4.547s
user    7m19.187s
sys     0m21.496s

real    7m47.673s
user    7m17.818s
sys     0m21.297s

-ftime-report yields no (significant) differences for the biggest Ada files.

The compiler is even slightly but consistently faster on make.adb (300KB).

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Fix PR rtl-optimization/34035
@ 2007-11-12 22:22 Eric Botcazou
  0 siblings, 0 replies; 10+ messages in thread
From: Eric Botcazou @ 2007-11-12 22:22 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1303 bytes --]

This is a regression present on the mainline at -O2 -fnon-call-exceptions
-ffast-math -fno-gcse.  An assertion in the dominance.c code is triggered
at the beginning of the second fwprop pass because there is a dangling
basic block in the CFG.

It has been created by the second CSE pass, more specifically:

      /* With non-call exceptions, we are not always able to update
	 the CFG properly inside cse_insn.  So clean up possibly
	 redundant EH edges here.  */
      if (flag_non_call_exceptions && have_eh_succ_edges (bb))
	purge_dead_edges (bb);

Fixed by running cleanup_cfg (0) in this case.

Bootstrapped/regtested on x86_64-suse-linux, applied on the mainline.


2007-11-12  Eric Botcazou  <ebotcazou@libertysurf.fr>

	PR rtl-optimization/34035
	* cse.c (cse_cfg_altered): New global variable.
	(cse_jumps_altered): Make boolean.
	(recorded_label_ref): Likewise.
	(cse_insn): Adjust for above changes.
	(cse_extended_basic_block): Likewise.  Set cse_cfg_altered
	if dead edges have been purged.
	(cse_main): Change return value specification and adjust code.
	(rest_of_handle_cse): Adjust for above change.
	(rest_of_handle_cse2): Likewise.
	* gcse.c (rest_of_handle_gcse): Likewise.


2007-11-12  Eric Botcazou  <ebotcazou@libertysurf.fr>

	* g++.dg/opt/cfg5.C: New test.


-- 
Eric Botcazou

[-- Attachment #2: pr34035.diff --]
[-- Type: text/x-diff, Size: 5661 bytes --]

Index: cse.c
===================================================================
--- cse.c	(revision 130003)
+++ cse.c	(working copy)
@@ -348,15 +348,17 @@ static unsigned int cse_reg_info_timesta
 
 static HARD_REG_SET hard_regs_in_table;
 
-/* Nonzero if cse has altered conditional jump insns
-   in such a way that jump optimization should be redone.  */
+/* True if CSE has altered the CFG.  */
+static bool cse_cfg_altered;
 
-static int cse_jumps_altered;
-
-/* Nonzero if we put a LABEL_REF into the hash table for an INSN
-   without a REG_LABEL_OPERAND, we have to rerun jump after CSE to put
-   in the note.  */
-static int recorded_label_ref;
+/* True if CSE has altered conditional jump insns in such a way
+   that jump optimization should be redone.  */
+static bool cse_jumps_altered;
+
+/* True if we put a LABEL_REF into the hash table for an INSN
+   without a REG_LABEL_OPERAND, we have to rerun jump after CSE
+   to put in the note.  */
+static bool recorded_label_ref;
 
 /* canon_hash stores 1 in do_not_record
    if it notices a reference to CC0, PC, or some other volatile
@@ -4761,7 +4763,7 @@ cse_insn (rtx insn, rtx libcall_insn)
 		continue;
 
 	      SET_SRC (sets[i].rtl) = trial;
-	      cse_jumps_altered = 1;
+	      cse_jumps_altered = true;
 	      break;
 	    }
 
@@ -4987,7 +4989,7 @@ cse_insn (rtx insn, rtx libcall_insn)
 	{
 	  /* One less use of the label this insn used to jump to.  */
 	  delete_insn_and_edges (insn);
-	  cse_jumps_altered = 1;
+	  cse_jumps_altered = true;
 	  /* No more processing for this set.  */
 	  sets[i].rtl = 0;
 	}
@@ -5026,10 +5028,8 @@ cse_insn (rtx insn, rtx libcall_insn)
 	  else
 	    INSN_CODE (insn) = -1;
 
-	  /* Do not bother deleting any unreachable code,
-	     let jump/flow do that.  */
-
-	  cse_jumps_altered = 1;
+	  /* Do not bother deleting any unreachable code, let jump do it.  */
+	  cse_jumps_altered = true;
 	  sets[i].rtl = 0;
 	}
 
@@ -6033,10 +6033,10 @@ cse_extended_basic_block (struct cse_bas
 	    
 	      /* If we haven't already found an insn where we added a LABEL_REF,
 		 check this one.  */
-	      if (INSN_P (insn) && ! recorded_label_ref
+	      if (INSN_P (insn) && !recorded_label_ref
 		  && for_each_rtx (&PATTERN (insn), check_for_label_ref,
 				   (void *) insn))
-		recorded_label_ref = 1;
+		recorded_label_ref = true;
 
 #ifdef HAVE_cc0
 	      /* If the previous insn set CC0 and this insn no longer
@@ -6074,7 +6074,7 @@ cse_extended_basic_block (struct cse_bas
 	 the CFG properly inside cse_insn.  So clean up possibly
 	 redundant EH edges here.  */
       if (flag_non_call_exceptions && have_eh_succ_edges (bb))
-	purge_dead_edges (bb);
+	cse_cfg_altered |= purge_dead_edges (bb);
 
       /* If we changed a conditional jump, we may have terminated
 	 the path we are following.  Check that by verifying that
@@ -6132,8 +6132,10 @@ cse_extended_basic_block (struct cse_bas
    F is the first instruction.
    NREGS is one plus the highest pseudo-reg number used in the instruction.
 
-   Returns 1 if jump_optimize should be redone due to simplifications
-   in conditional jump instructions.  */
+   Return 2 if jump optimizations should be redone due to simplifications
+   in conditional jump instructions.
+   Return 1 if the CFG should be cleaned up because it has been modified.
+   Return 0 otherwise.  */
 
 int
 cse_main (rtx f ATTRIBUTE_UNUSED, int nregs)
@@ -6153,8 +6155,9 @@ cse_main (rtx f ATTRIBUTE_UNUSED, int nr
   ebb_data.path = XNEWVEC (struct branch_path,
 			   PARAM_VALUE (PARAM_MAX_CSE_PATH_LENGTH));
 
-  cse_jumps_altered = 0;
-  recorded_label_ref = 0;
+  cse_cfg_altered = false;
+  cse_jumps_altered = false;
+  recorded_label_ref = false;
   constant_pool_entries_cost = 0;
   constant_pool_entries_regcost = 0;
   ebb_data.path_size = 0;
@@ -6216,7 +6219,12 @@ cse_main (rtx f ATTRIBUTE_UNUSED, int nr
   free (rc_order);
   rtl_hooks = general_rtl_hooks;
 
-  return cse_jumps_altered || recorded_label_ref;
+  if (cse_jumps_altered || recorded_label_ref)
+    return 2;
+  else if (cse_cfg_altered)
+    return 1;
+  else
+    return 0;
 }
 \f
 /* Called via for_each_rtx to see if an insn is using a LABEL_REF for
@@ -6948,10 +6956,14 @@ rest_of_handle_cse (void)
      expecting CSE to be run.  But always rerun it in a cheap mode.  */
   cse_not_expected = !flag_rerun_cse_after_loop && !flag_gcse;
 
-  if (tem)
-    rebuild_jump_labels (get_insns ());
-
-  if (tem || optimize > 1)
+  if (tem == 2)
+    {
+      timevar_push (TV_JUMP);
+      rebuild_jump_labels (get_insns ());
+      cleanup_cfg (0);
+      timevar_pop (TV_JUMP);
+    }
+  else if (tem == 1 || optimize > 1)
     cleanup_cfg (0);
 
   return 0;
@@ -7003,13 +7015,16 @@ rest_of_handle_cse2 (void)
 
   delete_trivially_dead_insns (get_insns (), max_reg_num ());
 
-  if (tem)
+  if (tem == 2)
     {
       timevar_push (TV_JUMP);
       rebuild_jump_labels (get_insns ());
       cleanup_cfg (0);
       timevar_pop (TV_JUMP);
     }
+  else if (tem == 1)
+    cleanup_cfg (0);
+
   cse_not_expected = 1;
   return 0;
 }
Index: gcse.c
===================================================================
--- gcse.c	(revision 130003)
+++ gcse.c	(working copy)
@@ -6752,13 +6752,15 @@ rest_of_handle_gcse (void)
 
   /* If gcse or cse altered any jumps, rerun jump optimizations to clean
      things up.  */
-  if (tem || tem2)
+  if (tem || tem2 == 2)
     {
       timevar_push (TV_JUMP);
       rebuild_jump_labels (get_insns ());
       cleanup_cfg (0);
       timevar_pop (TV_JUMP);
     }
+  else if (tem2 == 1)
+    cleanup_cfg (0);
 
   flag_cse_skip_blocks = save_csb;
   flag_cse_follow_jumps = save_cfj;

[-- Attachment #3: pr34035.C --]
[-- Type: text/x-c++src, Size: 561 bytes --]

/* PR rtl-optimization/34035 */
/* Origin: Janis Johnson <janis@gcc.gnu.org> */

/* { dg-do compile } */
/* { dg-options "-O2 -fnon-call-exceptions -ffast-math -fno-gcse" } */

class One {
public:
    One () { e[0] = e[1] = 0.0; }
    double e[2];
};

template <class T>
class Two {
public:
    Two();
private:
    T *data;
    int arraySize;
};

template <class T>
Two<T>::Two() {
   data = new T[arraySize];
}

class Three {
protected:
  Two<One> data;
};

class Four : public Three {
public:
  Four ();
  void Foo (int n);
};

Four :: Four (){
   Foo (1);
}

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2007-11-14 11:19 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-11-12 22:50 Fix PR rtl-optimization/34035 Steven Bosscher
2007-11-12 23:00 ` Eric Botcazou
2007-11-12 23:10   ` David Edelsohn
2007-11-13  0:27     ` Eric Botcazou
2007-11-13  0:29   ` Steven Bosscher
2007-11-13  1:51     ` Eric Botcazou
2007-11-13  2:16       ` Steven Bosscher
2007-11-13 15:45         ` Eric Botcazou
2007-11-14 12:46         ` Eric Botcazou
  -- strict thread matches above, loose matches on Subject: below --
2007-11-12 22:22 Eric Botcazou

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).