* Re: Fix PR rtl-optimization/34035
From: Steven Bosscher @ 2007-11-12 22:50 UTC (permalink / raw)
To: Eric Botcazou; +Cc: gcc-patches
Eric Botcazou wrote in http://gcc.gnu.org/ml/gcc-patches/2007-11/msg00666.html:
> Fixed by running cleanup_cfg (0) in this case.
This is how compile time always creeps up in gcc.
Please just call delete_unreachable_blocks() instead of cleanup_cfg().
Gr.
Steven
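For context, deleting only the unreachable blocks, as suggested here, amounts to a single reachability walk from the entry block plus one sweep. A minimal sketch on a hypothetical toy CFG (array-based, not GCC's real basic-block data structures):

```c
#include <assert.h>
#include <stdbool.h>

#define NBLOCKS 6

/* Toy CFG: succ[b] lists the successors of block b, -1 terminated.
   Block 0 is the entry block, block 3 the exit block.  */
static const int succ[NBLOCKS][3] = {
  {1, 2, -1},    /* 0 -> 1, 2 */
  {3, -1, -1},   /* 1 -> 3 */
  {3, -1, -1},   /* 2 -> 3 */
  {-1, -1, -1},  /* 3: exit */
  {3, -1, -1},   /* 4 -> 3, but nothing reaches 4 */
  {4, -1, -1},   /* 5 -> 4, but nothing reaches 5 */
};

static bool reachable[NBLOCKS];

/* Depth-first walk marking every block reachable from B.  */
static void dfs (int b)
{
  if (reachable[b])
    return;
  reachable[b] = true;
  for (int i = 0; succ[b][i] >= 0; i++)
    dfs (succ[b][i]);
}

/* Mark blocks reachable from the entry block; anything left unmarked
   is dead and would be deleted.  Returns the number of dead blocks.  */
static int delete_unreachable_blocks_sketch (void)
{
  dfs (0);
  int dead = 0;
  for (int b = 0; b < NBLOCKS; b++)
    if (!reachable[b])
      dead++;
  return dead;
}
```

That walk-and-sweep is essentially the entire cost being argued for here, as opposed to the additional iterations a full cleanup_cfg (0) performs.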
* Re: Fix PR rtl-optimization/34035
From: Eric Botcazou @ 2007-11-12 23:00 UTC (permalink / raw)
To: Steven Bosscher; +Cc: gcc-patches
> This is how compile time always creeps up in gcc.
Figures?
> Please just call delete_unreachable_blocks() instead of cleanup_cfg().
I think that cleanup_cfg (0) is a good compromise in this case.
--
Eric Botcazou
* Re: Fix PR rtl-optimization/34035
From: David Edelsohn @ 2007-11-12 23:10 UTC (permalink / raw)
To: Eric Botcazou; +Cc: gcc-patches, Steven Bosscher
Steven> Please just call delete_unreachable_blocks() instead of cleanup_cfg().
Eric> I think that cleanup_cfg (0) is a good compromise in this case.
Why? How about some justification for your assertion?
Thanks, David
* Re: Fix PR rtl-optimization/34035
From: Eric Botcazou @ 2007-11-13 0:27 UTC (permalink / raw)
To: David Edelsohn; +Cc: gcc-patches, Steven Bosscher
> Why? How about some justification for your assertion.
It's the fastest mode of cleanup_cfg, and purging dead edges can create
opportunities for CFG simplification. It's also the canonical cleanup routine
at the end of RTL passes. Note that it is only invoked on demand here.
--
Eric Botcazou
* Re: Fix PR rtl-optimization/34035
From: Steven Bosscher @ 2007-11-13 0:29 UTC (permalink / raw)
To: Eric Botcazou; +Cc: gcc-patches
On Nov 12, 2007 10:49 PM, Eric Botcazou <ebotcazou@libertysurf.fr> wrote:
> > This is how compile time always creeps up in gcc.
>
> Figures?
>
> > Please just call delete_unreachable_blocks() instead of cleanup_cfg().
>
> I think that cleanup_cfg (0) is a good compromise in this case.
I respectfully disagree.
First of all, I appreciate that you fixed this bug. But if deleting just
the unreachable blocks is sufficient, then there is no reason that I
can think of to do a full CFG cleanup.
With cleanup_cfg(0) you add another delete_trivially_dead_insns()
pass, and three iterations over all basic blocks. This does seem, at
first, to be quite cheap. The problem is that we do it all over the
place when it actually doesn't buy you anything: The "good" rtl passes
look through the dead code that d_t_d_i can remove, and even the
semi-good passes don't care about an extra forwarder block or a
single_succ block here and there.
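For illustration, a forwarder block is an empty block that does nothing but pass control to a single successor; the edge redirection cleanup_cfg performs around such blocks can be sketched on a toy CFG like this (hypothetical representation and names, not GCC's):

```c
#include <assert.h>

#define NB 5

/* Toy CFG: nxt[b] is the single successor of block b (-1 for none);
   is_fwd[b] says whether b contains no real insns, i.e. is a
   forwarder candidate.  Block 1 forwards to 2, which forwards to 4.  */
static const int nxt[NB]    = { 1,  2,  4, -1, -1 };
static const int is_fwd[NB] = { 0,  1,  1,  0,  0 };

/* Redirect an edge around a chain of forwarder blocks: follow empty
   single-successor blocks until a block with real contents is found.  */
static int skip_forwarders (int b)
{
  while (b >= 0 && is_fwd[b] && nxt[b] >= 0)
    b = nxt[b];
  return b;
}
```

The argument above is that passes which already look through such blocks gain nothing from running this redirection yet again.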
IMO much of gcc's abysmal compile time performance can be blamed on
doing redundant work all over the place, trying hard for little gain.
Sorry to jump on your patch like this, but it just looks like
"obvious" redundant cfg cleanup work (bikeshed, bikeshed! ;-)
But, many thanks for fixing the bug, at least! :-)
Gr.
Steven
* Re: Fix PR rtl-optimization/34035
From: Eric Botcazou @ 2007-11-13 1:51 UTC (permalink / raw)
To: Steven Bosscher; +Cc: gcc-patches
> With cleanup_cfg(0) you add another delete_trivially_dead_insns()
> pass, and three iterations over all basic blocks. This does seem, at
> first, to be quite cheap. The problem is that we do it all over the
> place when it actually doesn't buy you anything: The "good" rtl passes
> look through the dead code that d_t_d_i can remove, and even the
> semi-good passes don't care about an extra forwarder block or a
> single_succ block here and there.
cleanup_cfg (0) is the canonical invocation to minimally clean up the CFG at
the end of the RTL passes. It is already run at the end of every CSE and
GCSE pass if a single jump is modified. We now also invoke it on demand in,
I believe, relatively rare cases.
> IMO much of gcc's abysmal compile time performance can be blamed on
> doing redundant work all over the place, trying hard for little gain.
> Sorry to jump on your patch like this, but it just looks like
> "obvious" redundant cfg cleanup work (bikeshed, bikeshed! ;-)
I don't think cleanup_cfg (0) is "trying hard"; otherwise tree optimization
passes shouldn't be allowed to set TODO_cleanup_cfg. That being said, if you
can come up with a testcase, I'll revise my position. Of course, don't try
with C, C++ or Fortran: the change only affects languages for which
-fnon-call-exceptions is set, e.g. Ada, which I especially care about.
--
Eric Botcazou
* Re: Fix PR rtl-optimization/34035
From: Steven Bosscher @ 2007-11-13 2:16 UTC (permalink / raw)
To: Eric Botcazou; +Cc: gcc-patches
On Nov 13, 2007 12:01 AM, Eric Botcazou <ebotcazou@libertysurf.fr> wrote:
> > With cleanup_cfg(0) you add another delete_trivially_dead_insns()
> > pass, and three iterations over all basic blocks. This does seem, at
> > first, to be quite cheap. The problem is that we do it all over the
> > place when it actually doesn't buy you anything: The "good" rtl passes
> > look through the dead code that d_t_d_i can remove, and even the
> > semi-good passes don't care about an extra forwarder block or a
> > single_succ block here and there.
>
> cleanup_cfg (0) is the canonical invocation to minimally clean up the CFG at
> the end of the RTL passes. It is already run at the end of every CSE and
> GCSE pass if a single jump is modified.
True. But is that a good thing? I seriously doubt it.
> > IMO much of gcc's abysmal compile time performance can be blamed on
> > doing redundant work all over the place, trying hard for little gain.
> > Sorry to jump on your patch like this, but it just looks like
> > "obvious" redundant cfg cleanup work (bikeshed, bikeshed! ;-)
>
> I don't think cleanup_cfg (0) is "trying hard"; otherwise tree optimization
> passes shouldn't be allowed to set TODO_cleanup_cfg.
Tree cleanup cfg takes about 3.5% of the total compile time for my
cc1-i files. 'nuff said.
> That being said, if you
> can come up with a testcase, I'll revise my position.
I think any random large function will show some compile time
increase, although obviously no compile time explosions. As I said,
the problem is not so much to do it here, but that we do it
everywhere. Anyway, ...
> Of course don't try
> with C, C++ or Fortran, the change only affects those languages for which
> -fnon-call-exceptions is set, e.g. Ada that I specially care about.
...I had overlooked that it only affects compilation with
-fnon-call-exceptions. Not worth the battle :-)
Gr.
Steven
* Re: Fix PR rtl-optimization/34035
From: Eric Botcazou @ 2007-11-13 15:45 UTC (permalink / raw)
To: Steven Bosscher; +Cc: gcc-patches
> I think any random large function will show some compile time
> increase, although obviously no compile time explosions. As I said,
> the problem is not so much to do it here, but that we do it
> everywhere.
Here is at least a figure: 'make all-target-libjava' on x86-64/Linux with
default+RTL checking, before the patch
real 117m59.962s
user 108m4.086s
sys 3m13.106s
and after the patch:
real 117m32.815s
user 107m57.010s
sys 3m16.274s
> ...I had overlooked it only affects compilation with -fnon-call-exceptions.
> Not worth the battle :-)
Looks like I've finally found the killer argument. :-)
--
Eric Botcazou
* Re: Fix PR rtl-optimization/34035
From: Eric Botcazou @ 2007-11-14 12:46 UTC (permalink / raw)
To: Steven Bosscher; +Cc: gcc-patches
> I think any random large function will show some compile time
> increase, although obviously no compile time explosions. As I said,
> the problem is not so much to do it here, but that we do it
> everywhere.
Same wash for the Ada library. Before the patch:
real 8m4.547s
user 7m19.187s
sys 0m21.496s
and after the patch:
real 7m47.673s
user 7m17.818s
sys 0m21.297s
-ftime-report yields no (significant) differences for the biggest Ada files.
The compiler is even slightly but consistently faster on make.adb (300KB).
--
Eric Botcazou
* Fix PR rtl-optimization/34035
From: Eric Botcazou @ 2007-11-12 22:22 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1303 bytes --]
This is a regression present on the mainline at -O2 -fnon-call-exceptions
-ffast-math -fno-gcse. An assertion in the dominance.c code is triggered
at the beginning of the second fwprop pass because there is a dangling
basic block in the CFG.
It has been created by the second CSE pass, more specifically:
  /* With non-call exceptions, we are not always able to update
     the CFG properly inside cse_insn.  So clean up possibly
     redundant EH edges here.  */
  if (flag_non_call_exceptions && have_eh_succ_edges (bb))
    purge_dead_edges (bb);
Fixed by running cleanup_cfg (0) in this case.
Bootstrapped/regtested on x86_64-suse-linux, applied on the mainline.
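The return-value protocol the patch gives cse_main — 2 when jump optimization must be redone, 1 when only the CFG needs cleaning, 0 otherwise — can be modeled in isolation like this (a sketch of the logic in the diff, with made-up names, not the actual GCC code):

```c
#include <assert.h>
#include <stdbool.h>

/* Mirror of the three outcomes cse_main can now report.  */
enum cse_outcome
{
  CSE_DID_NOTHING = 0,   /* nothing to do */
  CSE_CFG_ALTERED = 1,   /* cleanup_cfg only */
  CSE_JUMPS_ALTERED = 2  /* rebuild jump labels, then cleanup_cfg */
};

/* Combine the three flags the pass tracks into the single return
   value its callers dispatch on.  Jump changes take priority over
   mere CFG changes.  */
static int cse_main_outcome (bool jumps_altered, bool recorded_label_ref,
                             bool cfg_altered)
{
  if (jumps_altered || recorded_label_ref)
    return CSE_JUMPS_ALTERED;
  else if (cfg_altered)
    return CSE_CFG_ALTERED;
  else
    return CSE_DID_NOTHING;
}
```

A caller such as rest_of_handle_cse2 then only pays for rebuild_jump_labels when the result is 2, and for cleanup_cfg when it is nonzero.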
2007-11-12  Eric Botcazou  <ebotcazou@libertysurf.fr>

	PR rtl-optimization/34035
	* cse.c (cse_cfg_altered): New global variable.
	(cse_jumps_altered): Make boolean.
	(recorded_label_ref): Likewise.
	(cse_insn): Adjust for above changes.
	(cse_extended_basic_block): Likewise.  Set cse_cfg_altered
	if dead edges have been purged.
	(cse_main): Change return value specification and adjust code.
	(rest_of_handle_cse): Adjust for above change.
	(rest_of_handle_cse2): Likewise.
	* gcse.c (rest_of_handle_gcse): Likewise.

2007-11-12  Eric Botcazou  <ebotcazou@libertysurf.fr>

	* g++.dg/opt/cfg5.C: New test.
--
Eric Botcazou
[-- Attachment #2: pr34035.diff --]
[-- Type: text/x-diff, Size: 5661 bytes --]
Index: cse.c
===================================================================
--- cse.c (revision 130003)
+++ cse.c (working copy)
@@ -348,15 +348,17 @@ static unsigned int cse_reg_info_timesta
static HARD_REG_SET hard_regs_in_table;
-/* Nonzero if cse has altered conditional jump insns
- in such a way that jump optimization should be redone. */
+/* True if CSE has altered the CFG. */
+static bool cse_cfg_altered;
-static int cse_jumps_altered;
-
-/* Nonzero if we put a LABEL_REF into the hash table for an INSN
- without a REG_LABEL_OPERAND, we have to rerun jump after CSE to put
- in the note. */
-static int recorded_label_ref;
+/* True if CSE has altered conditional jump insns in such a way
+ that jump optimization should be redone. */
+static bool cse_jumps_altered;
+
+/* True if we put a LABEL_REF into the hash table for an INSN
+ without a REG_LABEL_OPERAND, we have to rerun jump after CSE
+ to put in the note. */
+static bool recorded_label_ref;
/* canon_hash stores 1 in do_not_record
if it notices a reference to CC0, PC, or some other volatile
@@ -4761,7 +4763,7 @@ cse_insn (rtx insn, rtx libcall_insn)
continue;
SET_SRC (sets[i].rtl) = trial;
- cse_jumps_altered = 1;
+ cse_jumps_altered = true;
break;
}
@@ -4987,7 +4989,7 @@ cse_insn (rtx insn, rtx libcall_insn)
{
/* One less use of the label this insn used to jump to. */
delete_insn_and_edges (insn);
- cse_jumps_altered = 1;
+ cse_jumps_altered = true;
/* No more processing for this set. */
sets[i].rtl = 0;
}
@@ -5026,10 +5028,8 @@ cse_insn (rtx insn, rtx libcall_insn)
else
INSN_CODE (insn) = -1;
- /* Do not bother deleting any unreachable code,
- let jump/flow do that. */
-
- cse_jumps_altered = 1;
+ /* Do not bother deleting any unreachable code, let jump do it. */
+ cse_jumps_altered = true;
sets[i].rtl = 0;
}
@@ -6033,10 +6033,10 @@ cse_extended_basic_block (struct cse_bas
/* If we haven't already found an insn where we added a LABEL_REF,
check this one. */
- if (INSN_P (insn) && ! recorded_label_ref
+ if (INSN_P (insn) && !recorded_label_ref
&& for_each_rtx (&PATTERN (insn), check_for_label_ref,
(void *) insn))
- recorded_label_ref = 1;
+ recorded_label_ref = true;
#ifdef HAVE_cc0
/* If the previous insn set CC0 and this insn no longer
@@ -6074,7 +6074,7 @@ cse_extended_basic_block (struct cse_bas
the CFG properly inside cse_insn. So clean up possibly
redundant EH edges here. */
if (flag_non_call_exceptions && have_eh_succ_edges (bb))
- purge_dead_edges (bb);
+ cse_cfg_altered |= purge_dead_edges (bb);
/* If we changed a conditional jump, we may have terminated
the path we are following. Check that by verifying that
@@ -6132,8 +6132,10 @@ cse_extended_basic_block (struct cse_bas
F is the first instruction.
NREGS is one plus the highest pseudo-reg number used in the instruction.
- Returns 1 if jump_optimize should be redone due to simplifications
- in conditional jump instructions. */
+ Return 2 if jump optimizations should be redone due to simplifications
+ in conditional jump instructions.
+ Return 1 if the CFG should be cleaned up because it has been modified.
+ Return 0 otherwise. */
int
cse_main (rtx f ATTRIBUTE_UNUSED, int nregs)
@@ -6153,8 +6155,9 @@ cse_main (rtx f ATTRIBUTE_UNUSED, int nr
ebb_data.path = XNEWVEC (struct branch_path,
PARAM_VALUE (PARAM_MAX_CSE_PATH_LENGTH));
- cse_jumps_altered = 0;
- recorded_label_ref = 0;
+ cse_cfg_altered = false;
+ cse_jumps_altered = false;
+ recorded_label_ref = false;
constant_pool_entries_cost = 0;
constant_pool_entries_regcost = 0;
ebb_data.path_size = 0;
@@ -6216,7 +6219,12 @@ cse_main (rtx f ATTRIBUTE_UNUSED, int nr
free (rc_order);
rtl_hooks = general_rtl_hooks;
- return cse_jumps_altered || recorded_label_ref;
+ if (cse_jumps_altered || recorded_label_ref)
+ return 2;
+ else if (cse_cfg_altered)
+ return 1;
+ else
+ return 0;
}
\f
/* Called via for_each_rtx to see if an insn is using a LABEL_REF for
@@ -6948,10 +6956,14 @@ rest_of_handle_cse (void)
expecting CSE to be run. But always rerun it in a cheap mode. */
cse_not_expected = !flag_rerun_cse_after_loop && !flag_gcse;
- if (tem)
- rebuild_jump_labels (get_insns ());
-
- if (tem || optimize > 1)
+ if (tem == 2)
+ {
+ timevar_push (TV_JUMP);
+ rebuild_jump_labels (get_insns ());
+ cleanup_cfg (0);
+ timevar_pop (TV_JUMP);
+ }
+ else if (tem == 1 || optimize > 1)
cleanup_cfg (0);
return 0;
@@ -7003,13 +7015,16 @@ rest_of_handle_cse2 (void)
delete_trivially_dead_insns (get_insns (), max_reg_num ());
- if (tem)
+ if (tem == 2)
{
timevar_push (TV_JUMP);
rebuild_jump_labels (get_insns ());
cleanup_cfg (0);
timevar_pop (TV_JUMP);
}
+ else if (tem == 1)
+ cleanup_cfg (0);
+
cse_not_expected = 1;
return 0;
}
Index: gcse.c
===================================================================
--- gcse.c (revision 130003)
+++ gcse.c (working copy)
@@ -6752,13 +6752,15 @@ rest_of_handle_gcse (void)
/* If gcse or cse altered any jumps, rerun jump optimizations to clean
things up. */
- if (tem || tem2)
+ if (tem || tem2 == 2)
{
timevar_push (TV_JUMP);
rebuild_jump_labels (get_insns ());
cleanup_cfg (0);
timevar_pop (TV_JUMP);
}
+ else if (tem2 == 1)
+ cleanup_cfg (0);
flag_cse_skip_blocks = save_csb;
flag_cse_follow_jumps = save_cfj;
[-- Attachment #3: pr34035.C --]
[-- Type: text/x-c++src, Size: 561 bytes --]
/* PR rtl-optimization/34035 */
/* Origin: Janis Johnson <janis@gcc.gnu.org> */
/* { dg-do compile } */
/* { dg-options "-O2 -fnon-call-exceptions -ffast-math -fno-gcse" } */
class One {
public:
One () { e[0] = e[1] = 0.0; }
double e[2];
};
template <class T>
class Two {
public:
Two();
private:
T *data;
int arraySize;
};
template <class T>
Two<T>::Two() {
data = new T[arraySize];
}
class Three {
protected:
Two<One> data;
};
class Four : public Three {
public:
Four ();
void Foo (int n);
};
Four :: Four (){
Foo (1);
}