* Re: [RFA] expand from SSA form (1/2)
@ 2009-04-27 14:15 David Edelsohn
2009-04-27 14:43 ` H.J. Lu
2009-04-27 15:08 ` Michael Matz
0 siblings, 2 replies; 56+ messages in thread
From: David Edelsohn @ 2009-04-27 14:15 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches
Michael,
A patch in the last 24 hours has caused a mis-optimization on PowerPC.
The error appears as a failure while compiling libobjc/linking.m.
A function in cgraphunit.c is being mis-compiled, probably
build_cdtor(). This causes cgraph_build_static_cdtor() to be called
with an invalid priority of "-1", instead of 65535. The priority
should not be negative. This negative priority value generates an
invalid global file function name:
_GLOBAL__I_-00001_0___objc_linking
A function name should not contain a minus sign, which is a syntax
error. Recompiling cgraphunit.c without optimization fixes the
miscompilation.
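For illustration, the malformed name can be reproduced with the sprintf
format used in cgraph_build_static_cdtor() (the format string is visible in
the GDB session later in this thread). The fragment below is a standalone
sketch, not GCC code; format_cdtor_suffix is a hypothetical helper name, and
the buffer handling is an assumption made for the example.

```c
#include <stdio.h>

/* Hypothetical helper: formats the "<which>_<priority>_<counter>" part of
   the generated function name with the same format string that
   cgraph_build_static_cdtor() passes to sprintf.  */
static int
format_cdtor_suffix (char *buf, size_t size, char which, int priority,
                     int counter)
{
  /* "%.5d" asks for at least five digits; a negative value keeps its
     sign in front of the zero padding.  */
  return snprintf (buf, size, "%c_%.5d_%d", which, priority, counter);
}
```

With which = 'I' and counter = 0, a priority of 65535 formats as I_65535_0,
while a priority of -1 formats as I_-00001_0, which is the minus sign seen in
the symbol above.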
Darwin displays a similar error earlier in bootstrap.
Jakub tested his patch on PowerPC and Geoff's tester failed before
Honza's patch, so the "expand from SSA" patch appears to be the likely
culprit.
Both Darwin and AIX are bootstrapping in 32 bit mode. Richi suggested
on IRC that this may be related to some known CONST_INT issue for
which you already have a patch.
Thanks, David
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-27 14:15 [RFA] expand from SSA form (1/2) David Edelsohn
@ 2009-04-27 14:43 ` H.J. Lu
2009-04-27 15:08 ` Michael Matz
1 sibling, 0 replies; 56+ messages in thread
From: H.J. Lu @ 2009-04-27 14:43 UTC (permalink / raw)
To: David Edelsohn; +Cc: Michael Matz, gcc-patches
On Mon, Apr 27, 2009 at 7:11 AM, David Edelsohn <dje.gcc@gmail.com> wrote:
> Michael,
>
> A patch in the last 24 hours has caused a mis-optimization on PowerPC.
> The error appears as a failure while compiling libobjc/linking.m.
>
> A function in cgraphunit.c is being mis-compiled, probably
> build_cdtor(). This causes cgraph_build_static_cdtor() to be called
> with an invalid priority of "-1", instead of 65535. The priority
> should not be negative. This negative priority value generates an
> invalid global file function name:
>
> _GLOBAL__I_-00001_0___objc_linking
>
> A function name should not contain a minus sign, which is a syntax
> error. Recompiling cgraphunit.c without optimization fixes the
> miscompilation.
>
> Darwin displays a similar error earlier in bootstrap.
>
> Jakub tested his patch on PowerPC and Geoff's tester failed before
> Honza's patch, so the "expand from SSA" patch appears to be the likely
> culprit.
>
> Both Darwin and AIX are bootstrapping in 32 bit mode. Richi suggested
> on IRC that this may be related to some known CONST_INT issue for
> which you already have a patch.
>
It could be:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39922
--
H.J.
* Re: [RFA] expand from SSA form (1/2)
2009-04-27 14:15 [RFA] expand from SSA form (1/2) David Edelsohn
2009-04-27 14:43 ` H.J. Lu
@ 2009-04-27 15:08 ` Michael Matz
2009-04-27 15:11 ` David Edelsohn
1 sibling, 1 reply; 56+ messages in thread
From: Michael Matz @ 2009-04-27 15:08 UTC (permalink / raw)
To: David Edelsohn; +Cc: gcc-patches
Hi,
On Mon, 27 Apr 2009, David Edelsohn wrote:
> A function in cgraphunit.c is being mis-compiled, probably
> build_cdtor(). This causes cgraph_build_static_cdtor() to be called
> with an invalid priority of "-1", instead of 65535. The priority should
> not be negative. This negative priority value generates an invalid
> global file function name:
> ...
> Both Darwin and AIX are bootstrapping in 32 bit mode. Richi suggested
> on IRC that this may be related to some known CONST_INT issue for
> which you already have a patch.
This confusion with the numbers seems indeed related to PR39922. If you
can test the below patch on powerpc that would be nice. I know that it
fixes the PR on i686-linux where something similar happens.
Ciao,
Michael.
--
PR middle-end/39922
* tree-outof-ssa.c (insert_value_copy_on_edge): Don't convert constants.
Index: tree-outof-ssa.c
===================================================================
--- tree-outof-ssa.c (revision 146829)
+++ tree-outof-ssa.c (working copy)
@@ -184,7 +184,7 @@ insert_value_copy_on_edge (edge e, int d
start_sequence ();
mode = GET_MODE (SA.partition_to_pseudo[dest]);
x = expand_expr (src, SA.partition_to_pseudo[dest], mode, EXPAND_NORMAL);
- if (GET_MODE (x) != mode)
+ if (GET_MODE (x) != VOIDmode && GET_MODE (x) != mode)
x = convert_to_mode (mode, x, TYPE_UNSIGNED (TREE_TYPE (src)));
if (x != SA.partition_to_pseudo[dest])
emit_move_insn (SA.partition_to_pseudo[dest], x);
* Re: [RFA] expand from SSA form (1/2)
2009-04-27 15:08 ` Michael Matz
@ 2009-04-27 15:11 ` David Edelsohn
2009-04-27 15:51 ` Michael Matz
0 siblings, 1 reply; 56+ messages in thread
From: David Edelsohn @ 2009-04-27 15:11 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches
On Mon, Apr 27, 2009 at 10:43 AM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Mon, 27 Apr 2009, David Edelsohn wrote:
>
>> A function in cgraphunit.c is being mis-compiled, probably
>> build_cdtor(). This causes cgraph_build_static_cdtor() to be called
>> with an invalid priority of "-1", instead of 65535. The priority should
>> not be negative. This negative priority value generates an invalid
>> global file function name:
>> ...
>> Both Darwin and AIX are bootstrapping in 32 bit mode. Richi suggested
>> on IRC that this may be related to some known CONST_INT issue for
>> which you already have a patch.
>
> This confusion with the numbers seems indeed related to PR39922. If you
> can test the below patch on powerpc that would be nice. I know that it
> fixes the PR on i686-linux where something similar happens.
A quick test of the patch does not fix the failure on AIX. I will try a
complete bootstrap in case the bug was causing other miscompilations.
David
* Re: [RFA] expand from SSA form (1/2)
2009-04-27 15:11 ` David Edelsohn
@ 2009-04-27 15:51 ` Michael Matz
2009-04-27 17:03 ` David Edelsohn
2009-04-27 19:15 ` David Edelsohn
0 siblings, 2 replies; 56+ messages in thread
From: Michael Matz @ 2009-04-27 15:51 UTC (permalink / raw)
To: David Edelsohn; +Cc: gcc-patches
Hi,
On Mon, 27 Apr 2009, David Edelsohn wrote:
> > This confusion with the numbers seems indeed related to PR39922. If
> > you can test the below patch on powerpc that would be nice. I know
> > that it fixes the PR on i686-linux where something similar happens.
>
> A quick test of the patch does not fix the failure on AIX. I will try a
> complete bootstrap in case the bug was causing other miscompilations.
Hmpf. There's another problem (promoted parameters), which might also
cause this. The patch I just sent to Andreas Krebbel should fix that one.
If it doesn't help AIX/darwin I fear I need a testcase. There probably
are some failures in execute.exp which also would be miscompiled, that
might be easier to extract than the miscompilation itself.
Ciao,
Michael.
* Re: [RFA] expand from SSA form (1/2)
2009-04-27 15:51 ` Michael Matz
@ 2009-04-27 17:03 ` David Edelsohn
2009-04-27 17:27 ` David Edelsohn
2009-04-27 19:15 ` David Edelsohn
1 sibling, 1 reply; 56+ messages in thread
From: David Edelsohn @ 2009-04-27 17:03 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches
On Mon, Apr 27, 2009 at 11:51 AM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Mon, 27 Apr 2009, David Edelsohn wrote:
>
>> > This confusion with the numbers seems indeed related to PR39922. If
>> > you can test the below patch on powerpc that would be nice. I know
>> > that it fixes the PR on i686-linux where something similar happens.
>>
>> A quick test of the patch does not fix the failure on AIX. I will try a
>> complete bootstrap in case the bug was causing other miscompilations.
>
> Hmpf. There's another problem (promoted parameters), which might also
> cause this. The patch I just sent to Andreas Krebbel should fix that one.
> If it doesn't help AIX/darwin I fear I need a testcase. There probably
> are some failures in execute.exp which also would be miscompiled, that
> might be easier to extract than the miscompilation itself.
Bootstrapping from scratch with the patch you sent me still shows the
problem. I will try with the patch you sent to Andreas.
As I mentioned in the PR, for AIX one can reproduce the problem by
compiling libobjc/linking.m. That is a fairly small file and produces
invalid assembly due to ctor priority of -1.
David
* Re: [RFA] expand from SSA form (1/2)
2009-04-27 17:03 ` David Edelsohn
@ 2009-04-27 17:27 ` David Edelsohn
0 siblings, 0 replies; 56+ messages in thread
From: David Edelsohn @ 2009-04-27 17:27 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches
On Mon, Apr 27, 2009 at 12:57 PM, David Edelsohn <dje.gcc@gmail.com> wrote:
> On Mon, Apr 27, 2009 at 11:51 AM, Michael Matz <matz@suse.de> wrote:
>> Hi,
>>
>> On Mon, 27 Apr 2009, David Edelsohn wrote:
>>
>>> > This confusion with the numbers seems indeed related to PR39922. If
>>> > you can test the below patch on powerpc that would be nice. I know
>>> > that it fixes the PR on i686-linux where something similar happens.
>>>
>>> A quick test of the patch does not fix the failure on AIX. I will try a
>>> complete bootstrap in case the bug was causing other miscompilations.
>>
>> Hmpf. There's another problem (promoted parameters), which might also
>> cause this. The patch I just sent to Andreas Krebbel should fix that one.
>> If it doesn't help AIX/darwin I fear I need a testcase. There probably
>> are some failures in execute.exp which also would be miscompiled, that
>> might be easier to extract than the miscompilation itself.
>
> Bootstrapping from scratch with the patch you sent me still shows the
> problem. I will try with the patch you sent to Andreas.
>
> As I mentioned in the PR, for AIX one can reproduce the problem by
> compiling libobjc/linking.m. That is a fairly small file and produces
> invalid assembly due to ctor priority of -1.
Unfortunately, the second patch does not fix the PowerPC failure either.
I am running the C testsuite to see what new failures occur.
David
* Re: [RFA] expand from SSA form (1/2)
2009-04-27 15:51 ` Michael Matz
2009-04-27 17:03 ` David Edelsohn
@ 2009-04-27 19:15 ` David Edelsohn
2009-04-28 0:48 ` Michael Matz
1 sibling, 1 reply; 56+ messages in thread
From: David Edelsohn @ 2009-04-27 19:15 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches
gcc.c-torture/execute/20030916-1.c is a new testsuite failure.
David
* Re: [RFA] expand from SSA form (1/2)
2009-04-27 19:15 ` David Edelsohn
@ 2009-04-28 0:48 ` Michael Matz
2009-04-28 0:54 ` Luis Machado
2009-04-28 16:05 ` David Edelsohn
0 siblings, 2 replies; 56+ messages in thread
From: Michael Matz @ 2009-04-28 0:48 UTC (permalink / raw)
To: David Edelsohn; +Cc: gcc-patches
Hi David,
On Mon, 27 Apr 2009, David Edelsohn wrote:
> gcc.c-torture/execute/20030916-1.c is a new testsuite failure.
Excellent! (or not, depending on the p.o.v. :-) ). I'll try to analyze
this. I feared already that this wouldn't help powerpc, but this most
probably is some confusion in the RTL move instructions generated on the
edges. No doubt I overlooked some target peculiarity in generating them
:-(
Meanwhile Andrew has analyzed this somewhat (PR39929). If that turns out
to be correct we would need to amend the target to check for
currently_expanding_to_rtl, not just for current_ir_type(). That wouldn't
influence the AIX breakage though.
Ciao,
Michael.
* Re: [RFA] expand from SSA form (1/2)
2009-04-28 0:48 ` Michael Matz
@ 2009-04-28 0:54 ` Luis Machado
2009-04-28 1:22 ` Michael Matz
2009-04-28 16:05 ` David Edelsohn
1 sibling, 1 reply; 56+ messages in thread
From: Luis Machado @ 2009-04-28 0:54 UTC (permalink / raw)
To: Michael Matz; +Cc: David Edelsohn, gcc-patches
Hi,
On Tue, 2009-04-28 at 02:23 +0200, Michael Matz wrote:
> Hi David,
>
> On Mon, 27 Apr 2009, David Edelsohn wrote:
>
> > gcc.c-torture/execute/20030916-1.c is a new testsuite failure.
>
> Excellent! (or not, depending on the p.o.v. :-) ). I'll try to analyze
> this. I feared already that this wouldn't help powerpc, but this most
> probably is some confusion in the RTL move instructions generated on the
> edges. No doubt I overlooked some target peculiarity in generating them
> :-(
Speaking about powerpc, I've tracked down a 19% degradation on cpu2000's
32-bit sixtrack and found that revision 146817 caused/revealed it.
I'll have more details on it soon.
Regards,
Luis
* Re: [RFA] expand from SSA form (1/2)
2009-04-28 0:54 ` Luis Machado
@ 2009-04-28 1:22 ` Michael Matz
2009-04-28 13:24 ` Luis Machado
2009-04-30 17:55 ` Luis Machado
0 siblings, 2 replies; 56+ messages in thread
From: Michael Matz @ 2009-04-28 1:22 UTC (permalink / raw)
To: Luis Machado; +Cc: David Edelsohn, gcc-patches
Hi,
On Mon, 27 Apr 2009, Luis Machado wrote:
> Speaking about powerpc, I've tracked down a 19% degradation on cpu2000's
> 32-bit sixtrack and found that revision 146817 caused/revealed it.
>
> I'll have more details on it soon.
It seems also x86_64 is affected, so anything you find is very welcome.
If I may speculate it could be related to the half TER we're now doing.
As in, we're not feeding large trees to expand anymore, so there're no
opportunities to cleverly expand them to short insn sequences. For
cross-checking try to build with -fno-tree-ter (before the patch) and see
if it's resulting in the same slowdown.
Ciao,
Michael.
* Re: [RFA] expand from SSA form (1/2)
2009-04-28 1:22 ` Michael Matz
@ 2009-04-28 13:24 ` Luis Machado
2009-04-30 17:55 ` Luis Machado
1 sibling, 0 replies; 56+ messages in thread
From: Luis Machado @ 2009-04-28 13:24 UTC (permalink / raw)
To: Michael Matz; +Cc: David Edelsohn, gcc-patches
Hi,
> > Speaking about powerpc, I've tracked down a 19% degradation on cpu2000's
> > 32-bit sixtrack and found that revision 146817 caused/revealed it.
> >
> > I'll have more details on it soon.
>
> It seems also x86_64 is affected, so anything you find is very welcome.
>
> If I may speculate it could be related to the half TER we're now doing.
> As in, we're not feeding large trees to expand anymore, so there're no
> opportunities to cleverly expand them to short insn sequences. For
> cross-checking try to build with -fno-tree-ter (before the patch) and see
> if it's resulting in the same slowdown.
It results in some slowdown (~7%), but it's not as big as the one I've
seen with the SSA patch.
Regards,
Luis
* Re: [RFA] expand from SSA form (1/2)
2009-04-28 0:48 ` Michael Matz
2009-04-28 0:54 ` Luis Machado
@ 2009-04-28 16:05 ` David Edelsohn
2009-04-28 16:19 ` Michael Matz
2009-04-28 23:49 ` Michael Matz
1 sibling, 2 replies; 56+ messages in thread
From: David Edelsohn @ 2009-04-28 16:05 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches
On Mon, Apr 27, 2009 at 8:23 PM, Michael Matz <matz@suse.de> wrote:
> Hi David,
>
> On Mon, 27 Apr 2009, David Edelsohn wrote:
>
>> gcc.c-torture/execute/20030916-1.c is a new testsuite failure.
>
> Excellent! (or not, depending on the p.o.v. :-) ). I'll try to analyze
> this. I feared already that this wouldn't help powerpc, but this most
> probably is some confusion in the RTL move instructions generated on the
> edges. No doubt I overlooked some target peculiarity in generating them
> :-(
>
> Meanwhile Andrew has analyzed this somewhat (PR39929). If that turns out
> to be correct we would need to amend the target to check for
> currently_expanding_to_rtl, not just for current_ir_type(). That wouldn't
> influence the AIX breakage though.
I tried Andreas Krebbel's patch, but that, unfortunately, did not fix
the AIX libobjc build.
I am able to bootstrap C, C++, and Fortran, and run the testsuite.
Objective C is not required on AIX, but the build problem is exposing a
miscompilation of GCC on PowerPC after the expand SSA merge.
David
* Re: [RFA] expand from SSA form (1/2)
2009-04-28 16:05 ` David Edelsohn
@ 2009-04-28 16:19 ` Michael Matz
2009-04-28 23:49 ` Michael Matz
1 sibling, 0 replies; 56+ messages in thread
From: Michael Matz @ 2009-04-28 16:19 UTC (permalink / raw)
To: David Edelsohn; +Cc: gcc-patches
Hi,
On Tue, 28 Apr 2009, David Edelsohn wrote:
> > Meanwhile Andrew has analyzed this somewhat (PR39929). If that turns
> > out to be correct we would need to amend the target to check for
> > currently_expanding_to_rtl, not just for current_ir_type(). That
> > wouldn't influence the AIX breakage though.
>
> I tried Andreas Krebbel's patch, but that, unfortunately, did not fix
> the AIX libobjc build.
>
> I am able to bootstrap C, C++, and Fortran, and run the testsuite.
> Objective C is not required on AIX, but the build problem is exposing a
> miscompilation of GCC on PowerPC after the expand SSA merge.
Yeah, I also seem to see a miscompilation on powerpc64-linux somewhere in
df-core.c (two times the same bitmap free'd which breaks the
df_bitmap_obstack leading to breakage downstream). Will analyze this some
more.
Ciao,
Michael.
* Re: [RFA] expand from SSA form (1/2)
2009-04-28 16:05 ` David Edelsohn
2009-04-28 16:19 ` Michael Matz
@ 2009-04-28 23:49 ` Michael Matz
2009-04-29 5:50 ` David Edelsohn
1 sibling, 1 reply; 56+ messages in thread
From: Michael Matz @ 2009-04-28 23:49 UTC (permalink / raw)
To: David Edelsohn; +Cc: gcc-patches
Hi,
On Tue, 28 Apr 2009, David Edelsohn wrote:
> I tried Andreas Krebbel's patch, but that, unfortunately, did not fix
> the AIX libobjc build.
>
> I am able to bootstrap C, C++, and Fortran, and run the testsuite.
> Objective C is not required on AIX, but the build problem is exposing a
> miscompilation of GCC on PowerPC after the expand SSA merge.
I tried to reproduce this on powerpc64-linux (default languages,
multilibs) but failed. It just bootstraps, and testresults look mostly
sane, the only additional FAIL that I didn't find in older gcc-testresults
posts were execute/va-arg-22.c and dfp/pr3903[45].c :-(
That is with Andreas' patch (the approved one).
I'm at a loss now without a testcase. You said you found the
miscompilation in cgraphunit.c:build_cdtor(), so it possibly helps if you
provide the preprocessed version of that and the command line for
compilation (perhaps also giving a snippet of asm where the wrong
instructions are).
Ciao,
Michael.
* Re: [RFA] expand from SSA form (1/2)
2009-04-28 23:49 ` Michael Matz
@ 2009-04-29 5:50 ` David Edelsohn
2009-04-29 12:48 ` Michael Matz
0 siblings, 1 reply; 56+ messages in thread
From: David Edelsohn @ 2009-04-29 5:50 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 4191 bytes --]
On Tue, Apr 28, 2009 at 7:48 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Tue, 28 Apr 2009, David Edelsohn wrote:
>
>> I tried Andreas Krebbel's patch, but that, unfortunately, did not fix
>> the AIX libobjc build.
>>
>> I am able to bootstrap C, C++, and Fortran, and run the testsuite.
>> Objective C is not required on AIX, but the build problem is exposing a
>> miscompilation of GCC on PowerPC after the expand SSA merge.
>
> I tried to reproduce this on powerpc64-linux (default languages,
> multilibs) but failed. It just bootstraps, and testresults look mostly
> sane, the only additional FAIL that I didn't find in older gcc-testresults
> posts were execute/va-arg-22.c and dfp/pr3903[45].c :-(
>
> That is with Andreas' patch (the approved one).
>
> I'm at a loss now without a testcase. You said you found the
> miscompilation in cgraphunit.c:build_cdtor(), so it possibly helps if you
> provide the preprocessed version of that and the command line for
> compilation (perhaps also giving a snippet of asm where the wrong
> instructions are).
The command to compile libobjc/linking.m is in libobjc.sh.
The pre-processed linking.m is attached as linking.i.
The command to compile cgraphunit.c is in cgraphunit.sh
The pre-processed cgraphunit.c is attached as cgraphunit.i
The pre-processed tree.c is attached as tree.i
I have not had an opportunity to investigate which function is
miscompiled. cgraphunit.c compiled with -O1 or -O2 produces incorrect
assembly language. The inlining and additional jumps introduced by
optimization make pinpointing the exact wrong code sequence
difficult. GDB shows the priority returned by
decl_init_priority_lookup() as 65535 but the value in the register is
-1, which is the value that cgraph_build_static_cdtor() receives:
priority = -1:
(gdb) n
220 cgraph_build_static_cdtor (ctor_p ? 'I' : 'D', body, priority);
(gdb) step
200 body = NULL_TREE;
(gdb)
decl_init_priority_lookup (decl=0x300e2980)
at /farm/dje/src/src/gcc/tree.c:4347
4347 gcc_assert (VAR_OR_FUNCTION_DECL_P (decl));
(gdb) finish
Run till exit from #0 decl_init_priority_lookup (decl=0x300e2980)
at /farm/dje/src/src/gcc/tree.c:4347
0x10609ce8 in build_cdtor (ctor_p=255 'ÿ', cdtors=0x30097b78, len=1)
at /farm/dje/src/src/gcc/cgraphunit.c:200
200 body = NULL_TREE;
Value returned is $5 = 65535
(gdb) finish
Run till exit from #0 0x10609ce8 in build_cdtor (ctor_p=255 'ÿ',
cdtors=0x30097b78, len=1) at /farm/dje/src/src/gcc/cgraphunit.c:200
Breakpoint 6, cgraph_build_static_cdtor (which=73 'I', body=0x3009e820,
priority=-1) at /farm/dje/src/src/gcc/cgraphunit.c:1374
1374 sprintf (which_buf, "%c_%.5d_%d", which, priority, counter++);
It may be that cgraphunit.c is not miscompiled, but massages the value
in extra ways that maintain the unsigned short, and that this is somehow
lost with optimization enabled, assuming the value returned is
properly zero-extended.
decl_init_priority_lookup() appears to be setting the wrong value:
/* An initialization priority. */
typedef unsigned short priority_type;
/* The initialization priority for entities for which no explicit
initialization priority has been specified. */
#define DEFAULT_INIT_PRIORITY 65535
4349 h = (struct tree_priority_map *) htab_find (init_priority_for_decl, &in);
4350 return h ? h->init : DEFAULT_INIT_PRIORITY;
0x1002af5c <decl_init_priority_lookup+48>: bl 0x100494cc <htab_find>
0x1002af60 <decl_init_priority_lookup+52>: nop
0x1002af64 <decl_init_priority_lookup+56>: li r0,-1 <------------
0x1002af68 <decl_init_priority_lookup+60>: cmpwi r3,0
0x1002af6c <decl_init_priority_lookup+64>: beq- 0x1002af74 <decl_init_priority_lookup+72>
0x1002af70 <decl_init_priority_lookup+68>: lhz r0,4(r3)
0x1002af74 <decl_init_priority_lookup+72>: addi r1,r1,80
0x1002af78 <decl_init_priority_lookup+76>: mr r3,r0 <-------------
When compiling libobjc/linking.m, the default priority is used, which
should be 65535, but it is being sign-extended to -1.
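The two paths in the disassembly above can be modeled in plain C. This is
only a sketch of what ends up in the return register, assuming a 32-bit int;
lookup_as_written and lookup_as_miscompiled are hypothetical names used for
the illustration and do not exist in GCC.

```c
/* Same typedef and default value as quoted from the GCC sources above.  */
typedef unsigned short priority_type;
#define DEFAULT_INIT_PRIORITY 65535

/* What the source asks for: the result is an unsigned short, so a caller
   widening it to int sees a zero-extended value.  */
static int
lookup_as_written (const priority_type *entry)
{
  priority_type p = entry ? *entry : DEFAULT_INIT_PRIORITY;
  return p;
}

/* Model of the generated code: the "found" path zero-extends the halfword
   (lhz), but the default path puts the immediate -1 (li r0,-1) into the
   whole register.  */
static int
lookup_as_miscompiled (const priority_type *entry)
{
  return entry ? (int) *entry : -1;
}
```

Both functions agree whenever an entry is found, and the low 16 bits also
agree on the default path. Only a caller that reads the full register as an
int sees -1 instead of 65535.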
David
[-- Attachment #2: libobjc.sh --]
[-- Type: application/x-sh, Size: 760 bytes --]
[-- Attachment #3: linking.i.bz2 --]
[-- Type: application/x-bzip2, Size: 5450 bytes --]
[-- Attachment #4: cgraphunit.sh --]
[-- Type: application/x-sh, Size: 787 bytes --]
[-- Attachment #5: cgraphunit.i.bz2 --]
[-- Type: application/x-bzip2, Size: 132479 bytes --]
[-- Attachment #6: tree.i.bz2 --]
[-- Type: application/x-bzip2, Size: 146244 bytes --]
* Re: [RFA] expand from SSA form (1/2)
2009-04-29 5:50 ` David Edelsohn
@ 2009-04-29 12:48 ` Michael Matz
2009-04-29 13:21 ` David Edelsohn
` (2 more replies)
0 siblings, 3 replies; 56+ messages in thread
From: Michael Matz @ 2009-04-29 12:48 UTC (permalink / raw)
To: David Edelsohn; +Cc: gcc-patches
Hi David,
On Tue, 28 Apr 2009, David Edelsohn wrote:
> decl_init_priority_lookup() appears to be setting the wrong value:
>
> 0x1002af64 <decl_init_priority_lookup+56>: li r0,-1 <------------
Thanks! That was very helpful. It's another case of RTL constants
handled incorrectly because they are modeless :-( I've modified Andreas'
patch somewhat to take care of this, see below. The important change from
his patch is the hunk in insert_value_copy_on_edge().
I'm regstrapping this currently on x86_64-linux and powerpc64-linux (plus
the fix for PR39955).
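To illustrate why a modeless constant needs the source mode for a correct
conversion, here is a toy model in plain C. It is not GCC's convert_modes,
just a sketch under the assumption that a constant is kept as a
sign-extended wide integer with no width attached; toy_const and
toy_convert_modes are hypothetical names.

```c
#include <stdint.h>

/* Toy stand-in for a modeless constant: only a sign-extended value,
   with no record of how many bits it was meant to have.  */
typedef int64_t toy_const;

/* Hypothetical helper: interpret VALUE as an OLD_BITS-wide integer and
   widen it, zero-extending if UNSIGNEDP is nonzero and sign-extending
   otherwise.  OLD_BITS must be between 1 and 64.  */
static int64_t
toy_convert_modes (toy_const value, unsigned old_bits, int unsignedp)
{
  uint64_t mask = old_bits >= 64 ? ~UINT64_C (0)
                                 : (UINT64_C (1) << old_bits) - 1;
  uint64_t low = (uint64_t) value & mask;

  if (unsignedp || old_bits >= 64)
    return (int64_t) low;

  /* Sign-extend from bit OLD_BITS - 1.  */
  uint64_t sign = UINT64_C (1) << (old_bits - 1);
  return (int64_t) ((low ^ sign) - sign);
}
```

In this model the unsigned 16-bit value 65535 is stored as -1. Copying it
unchanged into a wider register keeps -1, whereas
toy_convert_modes (-1, 16, 1) gives 65535. The old width has to be supplied
from outside, which corresponds to passing TYPE_MODE (TREE_TYPE (src)) in
the hunk for insert_value_copy_on_edge().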
Ciao,
Michael.
--
Index: tree-outof-ssa.c
===================================================================
--- tree-outof-ssa.c (revision 146952)
+++ tree-outof-ssa.c (working copy)
@@ -128,6 +128,25 @@ set_location_for_edge (edge e)
}
}
+/* Emit insns to copy SRC into DEST converting SRC if necessary. */
+
+static inline rtx
+emit_partition_copy (rtx dest, rtx src, int unsignedsrcp)
+{
+ rtx seq;
+
+ start_sequence ();
+
+ if (GET_MODE (src) != VOIDmode && GET_MODE (src) != GET_MODE (dest))
+ src = convert_to_mode (GET_MODE (dest), src, unsignedsrcp);
+ emit_move_insn (dest, src);
+
+ seq = get_insns ();
+ end_sequence ();
+
+ return seq;
+}
+
/* Insert a copy instruction from partition SRC to DEST onto edge E. */
static void
@@ -149,12 +168,10 @@ insert_partition_copy_on_edge (edge e, i
set_location_for_edge (e);
- /* Partition copy between same base variables only, so it's the same mode,
- hence we can use emit_move_insn. */
- start_sequence ();
- emit_move_insn (SA.partition_to_pseudo[dest], SA.partition_to_pseudo[src]);
- seq = get_insns ();
- end_sequence ();
+ seq = emit_partition_copy (SA.partition_to_pseudo[dest],
+ SA.partition_to_pseudo[src],
+ TYPE_UNSIGNED (TREE_TYPE (
+ partition_to_var (SA.map, src))));
insert_insn_on_edge (seq, e);
}
@@ -166,7 +183,6 @@ static void
insert_value_copy_on_edge (edge e, int dest, tree src)
{
rtx seq, x;
- enum machine_mode mode;
if (dump_file && (dump_flags & TDF_DETAILS))
{
fprintf (dump_file,
@@ -186,6 +202,10 @@ insert_value_copy_on_edge (edge e, int d
x = expand_expr (src, SA.partition_to_pseudo[dest], mode, EXPAND_NORMAL);
if (GET_MODE (x) != VOIDmode && GET_MODE (x) != mode)
x = convert_to_mode (mode, x, TYPE_UNSIGNED (TREE_TYPE (src)));
+ if (CONSTANT_P (x) && GET_MODE (x) == VOIDmode
+ && mode != TYPE_MODE (TREE_TYPE (src)))
+ x = convert_modes (mode, TYPE_MODE (TREE_TYPE (src)),
+ x, TYPE_UNSIGNED (TREE_TYPE (src)));
if (x != SA.partition_to_pseudo[dest])
emit_move_insn (SA.partition_to_pseudo[dest], x);
seq = get_insns ();
@@ -198,7 +218,7 @@ insert_value_copy_on_edge (edge e, int d
onto edge E. */
static void
-insert_rtx_to_part_on_edge (edge e, int dest, rtx src)
+insert_rtx_to_part_on_edge (edge e, int dest, rtx src, int unsignedsrcp)
{
rtx seq;
if (dump_file && (dump_flags & TDF_DETAILS))
@@ -214,11 +234,9 @@ insert_rtx_to_part_on_edge (edge e, int
gcc_assert (SA.partition_to_pseudo[dest]);
set_location_for_edge (e);
- start_sequence ();
- gcc_assert (GET_MODE (src) == GET_MODE (SA.partition_to_pseudo[dest]));
- emit_move_insn (SA.partition_to_pseudo[dest], src);
- seq = get_insns ();
- end_sequence ();
+ seq = emit_partition_copy (SA.partition_to_pseudo[dest],
+ src,
+ unsignedsrcp);
insert_insn_on_edge (seq, e);
}
@@ -243,11 +261,10 @@ insert_part_to_rtx_on_edge (edge e, rtx
gcc_assert (SA.partition_to_pseudo[src]);
set_location_for_edge (e);
- start_sequence ();
- gcc_assert (GET_MODE (dest) == GET_MODE (SA.partition_to_pseudo[src]));
- emit_move_insn (dest, SA.partition_to_pseudo[src]);
- seq = get_insns ();
- end_sequence ();
+ seq = emit_partition_copy (dest,
+ SA.partition_to_pseudo[src],
+ TYPE_UNSIGNED (TREE_TYPE (
+ partition_to_var (SA.map, src))));
insert_insn_on_edge (seq, e);
}
@@ -522,14 +539,17 @@ elim_create (elim_graph g, int T)
if (elim_unvisited_predecessor (g, T))
{
- rtx U = get_temp_reg (partition_to_var (g->map, T));
+ tree var = partition_to_var (g->map, T);
+ rtx U = get_temp_reg (var);
+ int unsignedsrcp = TYPE_UNSIGNED (TREE_TYPE (var));
+
insert_part_to_rtx_on_edge (g->e, U, T);
FOR_EACH_ELIM_GRAPH_PRED (g, T, P,
{
if (!TEST_BIT (g->visited, P))
{
elim_backward (g, P);
- insert_rtx_to_part_on_edge (g->e, P, U);
+ insert_rtx_to_part_on_edge (g->e, P, U, unsignedsrcp);
}
});
}
* Re: [RFA] expand from SSA form (1/2)
2009-04-29 12:48 ` Michael Matz
@ 2009-04-29 13:21 ` David Edelsohn
2009-04-29 13:35 ` Michael Matz
2009-04-29 14:38 ` David Edelsohn
2009-04-29 15:03 ` Andreas Krebbel
2 siblings, 1 reply; 56+ messages in thread
From: David Edelsohn @ 2009-04-29 13:21 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches
On Wed, Apr 29, 2009 at 8:39 AM, Michael Matz <matz@suse.de> wrote:
> @@ -166,7 +183,6 @@ static void
> insert_value_copy_on_edge (edge e, int dest, tree src)
> {
> rtx seq, x;
> - enum machine_mode mode;
Did you intend to remove the declaration of mode?
David
* Re: [RFA] expand from SSA form (1/2)
2009-04-29 13:21 ` David Edelsohn
@ 2009-04-29 13:35 ` Michael Matz
0 siblings, 0 replies; 56+ messages in thread
From: Michael Matz @ 2009-04-29 13:35 UTC (permalink / raw)
To: David Edelsohn; +Cc: gcc-patches
Hi,
On Wed, 29 Apr 2009, David Edelsohn wrote:
> > @@ -166,7 +183,6 @@ static void
> > insert_value_copy_on_edge (edge e, int dest, tree src)
> > {
> > rtx seq, x;
> > - enum machine_mode mode;
>
> Did you intend to remove the declaration of mode?
Nope, oversight when merging the patches for sending, my testing trees
have exactly this patch minus that hunk applied.
Ciao,
Michael.
* Re: [RFA] expand from SSA form (1/2)
2009-04-29 12:48 ` Michael Matz
2009-04-29 13:21 ` David Edelsohn
@ 2009-04-29 14:38 ` David Edelsohn
2009-04-29 14:50 ` Richard Guenther
2009-04-29 15:03 ` Andreas Krebbel
2 siblings, 1 reply; 56+ messages in thread
From: David Edelsohn @ 2009-04-29 14:38 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches
On Wed, Apr 29, 2009 at 8:39 AM, Michael Matz <matz@suse.de> wrote:
> Thanks! That was very helpful. It's another case of RTL constants
> handled incorrectly because they are modeless :-( I've modified Andreas'
> patch somewhat to take care of this, see below. The important change from
> his patch is the hunk in insert_value_copy_on_edge().
>
> I'm regstrapping this currently on x86_64-linux and powerpc64-linux (plus
> the fix for PR39955).
This latest patch fixes the libobjc build problem on AIX. GCC now
bootstraps successfully with all libraries.
Thanks, David
* Re: [RFA] expand from SSA form (1/2)
2009-04-29 14:38 ` David Edelsohn
@ 2009-04-29 14:50 ` Richard Guenther
0 siblings, 0 replies; 56+ messages in thread
From: Richard Guenther @ 2009-04-29 14:50 UTC (permalink / raw)
To: David Edelsohn; +Cc: Michael Matz, gcc-patches
On Wed, Apr 29, 2009 at 4:34 PM, David Edelsohn <dje.gcc@gmail.com> wrote:
> On Wed, Apr 29, 2009 at 8:39 AM, Michael Matz <matz@suse.de> wrote:
>
>> Thanks! That was very helpful. It's another case of RTL constants
>> handled incorrectly because they are modeless :-( I've modified Andreas'
>> patch somewhat to take care of this, see below. The important change from
>> his patch is the hunk in insert_value_copy_on_edge().
>>
>> I'm regstrapping this currently on x86_64-linux and powerpc64-linux (plus
>> the fix for PR39955).
>
> This latest patch fixes the libobjc build problem on AIX. GCC now
> bootstraps successfully with all libraries.
Thus, that patch is ok for mainline.
Thanks,
Richard.
> Thanks, David
>
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-29 12:48 ` Michael Matz
2009-04-29 13:21 ` David Edelsohn
2009-04-29 14:38 ` David Edelsohn
@ 2009-04-29 15:03 ` Andreas Krebbel
2009-04-29 15:11 ` Michael Matz
2 siblings, 1 reply; 56+ messages in thread
From: Andreas Krebbel @ 2009-04-29 15:03 UTC (permalink / raw)
To: Michael Matz; +Cc: David Edelsohn, gcc-patches
Hi Michael,
> I'm regstrapping this currently on x86_64-linux and powerpc64-linux (plus
> the fix for PR39955).
I've bootstrapped the patch on s390x. On s390 I am seeing an RTL
sharing bug which I haven't tracked down yet.
Bye,
-Andreas-
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-29 15:03 ` Andreas Krebbel
@ 2009-04-29 15:11 ` Michael Matz
2009-04-29 15:40 ` Andreas Krebbel
0 siblings, 1 reply; 56+ messages in thread
From: Michael Matz @ 2009-04-29 15:11 UTC (permalink / raw)
To: Andreas Krebbel; +Cc: David Edelsohn, gcc-patches
Hi,
On Wed, 29 Apr 2009, Andreas Krebbel wrote:
> > I'm regstrapping this currently on x86_64-linux and powerpc64-linux
> > (plus the fix for PR39955).
>
> I've bootstrapped the patch on s390x. On s390 I am seeing an RTL
> sharing bug which I haven't tracked down yet.
Your original one, or the amended one (for AIX)? I'm asking because I
noticed a problem when fiddling with the AIX problem. Your change to
insert_value_copy_on_edge might not work in all cases as you're doing the
expand_expr outside of a start_sequence. I decided to simply not use the
helper function as I had to also change the code to use convert_modes.
It shouldn't expose itself as an RTL sharing problem, but it definitely
might have undesired effects. Unfortunately our s390 is off today, so I
can't help without a testcase.
Ciao,
Michael.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-29 15:11 ` Michael Matz
@ 2009-04-29 15:40 ` Andreas Krebbel
2009-04-29 17:33 ` Michael Matz
0 siblings, 1 reply; 56+ messages in thread
From: Andreas Krebbel @ 2009-04-29 15:40 UTC (permalink / raw)
To: Michael Matz; +Cc: David Edelsohn, gcc-patches
> Your original one, or the amended one (for AIX)? I'm asking because I
> noticed a problem when fiddling with the AIX problem. Your change to
> insert_value_copy_on_edge might not work in all cases as you're doing the
> expand_expr outside of a start_sequence. I decided to simply not use the
> helper function as I had to also change the code to use convert_modes.
>
> It shouldn't expose itself as an RTL sharing problem, but it definitely
> might have undesired effects. Unfortunately our s390 is off today, so I
> can't help without a testcase.
I've tested the modified version from your last posting
(http://gcc.gnu.org/ml/gcc-patches/2009-04/msg02325.html). I already
debugged a problem caused by my insert_value_copy_on_edge change - an
infinite loop due to a miscompile of do_add in real.c arrgh.
Bye,
-Andreas-
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-29 15:40 ` Andreas Krebbel
@ 2009-04-29 17:33 ` Michael Matz
2009-04-29 17:41 ` Michael Matz
0 siblings, 1 reply; 56+ messages in thread
From: Michael Matz @ 2009-04-29 17:33 UTC (permalink / raw)
To: Andreas Krebbel; +Cc: David Edelsohn, gcc-patches
Hi,
On Wed, 29 Apr 2009, Andreas Krebbel wrote:
> > Your original one, or the amended one (for AIX)? I'm asking because I
> > noticed a problem when fiddling with the AIX problem. Your change to
> > insert_value_copy_on_edge might not work in all cases as you're doing the
> > expand_expr outside of a start_sequence. I decided to simply not use the
> > helper function as I had to also change the code to use convert_modes.
> >
> > It shouldn't expose itself as an RTL sharing problem, but it definitely
> > might have undesired effects. Unfortunately our s390 is off today, so I
> > can't help without a testcase.
>
> I've tested the modified version from your last posting
> (http://gcc.gnu.org/ml/gcc-patches/2009-04/msg02325.html). I already
> debugged a problem caused by my insert_value_copy_on_edge change - an
> infinite loop due to a miscompile of do_add in real.c arrgh.
The machine is online again, so I had a look. cprop creates this sharing,
but it seems it's a preexisting bug (no idea why it didn't trigger
before).
The problem happens in this insn:
(insn 26 25 27 7 ../../../gcc/libgomp/config/linux/affinity.c:82
(set (reg:DI 65 [ cpu+-4 ])
(ior:DI (ashift:DI (zero_extend:DI (truncate:SI (mod:DI (reg:DI 65 [ cpu+-4 ])
(sign_extend:DI (reg:SI 66 [ gomp_cpu_affinity_len ] )))))
(const_int 32 [0x20]))
(zero_extend:DI (truncate:SI
(div:DI (reg:DI 65 [ cpu+-4 ])
(sign_extend:DI (reg:SI 66 [ gomp_cpu_affinity_len ]))))))) 352 {divmoddisi3}
(expr_list:REG_EQUAL (ior:DI (ashift:DI (zero_extend:DI (umod:SI (reg:SI 61)
(mem/c/i:SI (plus:SI (reg:SI 12 %r12)
(reg:SI 64)) [7 gomp_cpu_affinity_len+0 S4 A32])))
(const_int 32 [0x20]))
(zero_extend:DI (udiv:SI (reg:SI 61)
(mem/c/i:SI (plus:SI (reg:SI 12 %r12)
(reg:SI 64)) [7 gomp_cpu_affinity_len+0 S4 A32]))))
(nil)))
pseudo 64 is the interesting one. Note how it occurs only in the
REG_EQUAL note, but there it occurs twice. try_replace_reg won't find anything
to substitute in the PATTERN itself, but it unconditionally wants to
replace occurrences in the notes as well:
if (note != 0 && REG_NOTE_KIND (note) == REG_EQUAL)
set_unique_reg_note (insn, REG_EQUAL,
simplify_replace_rtx (XEXP (note, 0), from,
copy_rtx (to)));
Now, simplify_replace_rtx doesn't do any unsharing, and apply_change_group
doesn't work for notes. So reg 64 is replaced by this:
(const:SI (unspec:SI [
(symbol_ref:SI ("gomp_cpu_affinity_len") [flags 0x42]
<var_decl 0x40938540 gomp_cpu_affinity_len>)
] 112))
Voila, RTL sharing in the note. When using simplify_replace_rtx() we need
to unshare explicitly unless we can be sure that FROM occurs only once
or that TO is sharable. Like the patch below. It lets me build libgomp on
s390.
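To make the sharing concrete, here is a standalone toy analogue in C. It is only a sketch under stated assumptions: struct node is a made-up stand-in for an rtx, and make_node, copy_node, replace_shared and replace_unshared are hypothetical helpers, not GCC functions. It also swaps in a simpler technique than the patch: it copies TO once per occurrence, whereas the patch below replaces first and unshares afterwards.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy expression node standing in for an rtx; this is not GCC's type. */
struct node {
    int reg;                 /* register number for a leaf, -1 for an operator */
    struct node *left;
    struct node *right;
};

struct node *make_node(int reg, struct node *left, struct node *right)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->reg = reg;
    n->left = left;
    n->right = right;
    return n;
}

/* Deep copy of a tree. */
struct node *copy_node(const struct node *n)
{
    if (n == NULL)
        return NULL;
    return make_node(n->reg, copy_node(n->left), copy_node(n->right));
}

static int is_leaf_for(const struct node *x, int reg)
{
    return x->left == NULL && x->right == NULL && x->reg == reg;
}

/* Replace each leaf for register FROM by the very same object TO.
   With two occurrences of FROM the result points at TO twice, which
   is the sharing described above. */
struct node *replace_shared(struct node *x, int from, struct node *to)
{
    if (x == NULL)
        return NULL;
    if (is_leaf_for(x, from)) {
        free(x);
        return to;
    }
    x->left = replace_shared(x->left, from, to);
    x->right = replace_shared(x->right, from, to);
    return x;
}

/* Same replacement, but every occurrence receives its own copy of TO. */
struct node *replace_unshared(struct node *x, int from, const struct node *to)
{
    if (x == NULL)
        return NULL;
    if (is_leaf_for(x, from)) {
        free(x);
        return copy_node(to);
    }
    x->left = replace_unshared(x->left, from, to);
    x->right = replace_unshared(x->right, from, to);
    return x;
}
```

With a tree that mentions register 64 in both operands, replace_shared leaves both operand pointers equal to TO, while replace_unshared yields two distinct objects with the same contents.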
On this basis I'm checking in our combined patch fixing the AIX problem.
I'd be grateful for any further testing of this patch for the RTL sharing
problem; I have not bootstrapped it on s390.
Ciao,
Michael.
--
* gcse.c (try_replace_reg): Unshare RTL when substituting into
notes.
--- gcse.c 2009-04-29 14:34:41.000000000 +0200
+++ /suse/matz/gcse.c 2009-04-29 19:19:44.833381000 +0200
@@ -2272,9 +2272,14 @@ try_replace_reg (rtx from, rtx to, rtx i
   /* If there is already a REG_EQUAL note, update the expression in it
      with our replacement. */
   if (note != 0 && REG_NOTE_KIND (note) == REG_EQUAL)
-    set_unique_reg_note (insn, REG_EQUAL,
-                         simplify_replace_rtx (XEXP (note, 0), from,
-                                               copy_rtx (to)));
+    {
+      rtx x = XEXP (note, 0);
+      x = simplify_replace_rtx (x, from, copy_rtx (to));
+      reset_used_flags (x);
+      x = copy_rtx_if_shared (x);
+      set_unique_reg_note (insn, REG_EQUAL, x);
+    }
+
   if (!success && set && reg_mentioned_p (from, SET_SRC (set)))
     {
       /* If above failed and this is a single set, try to simplify the source of
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-29 17:33 ` Michael Matz
@ 2009-04-29 17:41 ` Michael Matz
0 siblings, 0 replies; 56+ messages in thread
From: Michael Matz @ 2009-04-29 17:41 UTC (permalink / raw)
To: Andreas Krebbel; +Cc: David Edelsohn, gcc-patches
On Wed, 29 Apr 2009, Michael Matz wrote:
> The machine is online again, so I had a look. cprop creates this
> sharing, but it seems it's a preexisting bug (no idea why it didn't
> trigger before).
Bah, wires crossing :-) I retract the unsharing patch then.
Ciao,
Michael.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-28 1:22 ` Michael Matz
2009-04-28 13:24 ` Luis Machado
@ 2009-04-30 17:55 ` Luis Machado
2009-05-01 19:33 ` Richard Guenther
1 sibling, 1 reply; 56+ messages in thread
From: Luis Machado @ 2009-04-30 17:55 UTC (permalink / raw)
To: Michael Matz; +Cc: David Edelsohn, gcc-patches
Hi,
On Tue, 2009-04-28 at 02:48 +0200, Michael Matz wrote:
> Hi,
>
> On Mon, 27 Apr 2009, Luis Machado wrote:
>
> > Speaking about powerpc, i've tracked down a 19% degradation on cpu2000's
> > 32-bit sixtrack and found that revision 146817 caused/revealed it.
> >
> > I'll have more details on it soon.
>
> It seems also x86_64 is affected, so anything you find is very welcome.
>
> If I may speculate it could be related to the half TER we're now doing.
> As in, we're not feeding large trees to expand anymore, so there're no
> opportunities to cleverly expand them to short insn sequences. For
> cross-checking try to build with -fno-tree-ter (before the patch) and see
> if it's resulting in the same slowdown.
I've tracked down the cause of the degradation on sixtrack.
We have a hot spot in sixtrack: a loop in a function called thin6d.
The old (pre-146817) gcc generates that loop as a single BB, so the
only way into the loop is to fall through from the code before it.
The post-146817 gcc splits that loop into two BBs, so that we can
actually branch into the middle of the loop on the first iteration;
after that the loop runs just like in pre-146817.
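As a rough illustration of why one extra entry point splits the loop, here is a small standalone sketch in C. It is not GCC code: struct insn and count_basic_blocks are hypothetical, and it uses the textbook rule that an instruction starts a new basic block if it is the first instruction, the target of a branch, or the instruction following a branch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction: TARGET is the index this insn may branch to,
   or -1 if it only falls through. */
struct insn {
    int target;
};

/* Count basic blocks in INSNS[0..N) by counting block leaders. */
size_t count_basic_blocks(const struct insn *insns, size_t n)
{
    size_t blocks = 0;

    for (size_t i = 0; i < n; i++) {
        /* First insn, or the insn right after a branch. */
        int leader = (i == 0) || (insns[i - 1].target >= 0);

        /* Otherwise, a leader if some insn branches here. */
        for (size_t j = 0; j < n && !leader; j++)
            if (insns[j].target == (int)i)
                leader = 1;

        blocks += (size_t)leader;
    }
    return blocks;
}
```

If insn 0 is a preheader that falls through into a loop made of insns 1 to 4, with insn 4 branching back to insn 1, the count is two: the preheader and one loop block. If insn 0 instead jumps to insn 3, the count is three, because the loop body is now cut at the new entry point. A scheduler that works one block at a time then cannot move the loads across that cut, which matches the listings that follow.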
The degradation comes from the fact that having two BBs for
that single loop prevents good scheduling of the instructions inside it, like
this:
Good code: All the fp load instructions are grouped in the upper portion
of the code.
fmul f22,f11,f13
fmul f23,f11,f0
addis r12,r6,-27
lfd f3,0(r6)
addi r4,r6,8
lfd f1,9472(r12)
addis r12,r4,-27
fmadd f8,f12,f0,f22
fmsub f4,f12,f13,f23
lfd f22,9472(r12)
lfd f23,8(r6)
addi r6,r4,8
fmul f11,f8,f13
fmul f24,f8,f1
fmul f25,f8,f3
fmul f5,f8,f0
fmadd f11,f4,f0,f11
fmadd f21,f4,f3,f24
fmsub f2,f4,f1,f25
fmsub f12,f4,f13,f5
fmul f1,f11,f23
fmul f8,f11,f22
fadd f9,f9,f21
fadd f10,f10,f2
fmsub f24,f12,f22,f1
fmadd f25,f12,f23,f8
fadd f10,f10,f24
fadd f9,f9,f25
bdnz 100ca878 <thin6d_+0x1018>
Bad code: The second pair of loads is pushed down into the second BB,
causing slowdowns.
fmul f5,f8,f0
addis r3,r4,-27
lfd f22,8(r7)
addi r7,r4,8
lfd f6,9472(r3)
fmadd f10,f9,f0,f10
fmsub f23,f9,f13,f5
fmul f2,f10,f22
fmul f9,f10,f6
fmr f7,f23
fmsub f25,f23,f6,f2
fmadd f26,f23,f22,f9
fadd f12,f12,f25
fadd f11,f11,f26
fmul f8,f10,f13
>> BB mark
fmul f22,f10,f0
addis r3,r7,-27
lfd f21,0(r7)
addi r4,r7,8
lfd f25,9472(r3)
fmadd f8,f7,f0,f8
fmsub f9,f7,f13,f22
fmul f23,f8,f21
fmul f26,f8,f25
fmsub f24,f9,f25,f23
fmadd f7,f9,f21,f26
fadd f12,f12,f24
fadd f11,f11,f7
bdnz 100c9fe0 <thin6d_+0xfd0>
I've opened bugzilla http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
for this.
Best regards,
Luis
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-30 17:55 ` Luis Machado
@ 2009-05-01 19:33 ` Richard Guenther
2009-05-04 13:38 ` Luis Machado
0 siblings, 1 reply; 56+ messages in thread
From: Richard Guenther @ 2009-05-01 19:33 UTC (permalink / raw)
To: luisgpm; +Cc: Michael Matz, David Edelsohn, gcc-patches
On Thu, Apr 30, 2009 at 6:26 PM, Luis Machado
<luisgpm@linux.vnet.ibm.com> wrote:
> Hi,
>
> On Tue, 2009-04-28 at 02:48 +0200, Michael Matz wrote:
>> Hi,
>>
>> On Mon, 27 Apr 2009, Luis Machado wrote:
>>
>> > Speaking about powerpc, i've tracked down a 19% degradation on cpu2000's
>> > 32-bit sixtrack and found that revision 146817 caused/revealed it.
>> >
>> > I'll have more details on it soon.
>>
>> It seems also x86_64 is affected, so anything you find is very welcome.
>>
>> If I may speculate it could be related to the half TER we're now doing.
>> As in, we're not feeding large trees to expand anymore, so there're no
>> opportunities to cleverly expand them to short insn sequences. For
>> cross-checking try to build with -fno-tree-ter (before the patch) and see
>> if it's resulting in the same slowdown.
>
> I've tracked down the cause of the degradation on sixtrack.
>
> We have a hot spot in sixtrack: a loop in a function called thin6d.
>
> The old (pre-146817) gcc generates that loop as a single BB, so the
> only way into the loop is to fall through from the code before it.
>
> The post-146817 gcc splits that loop into two BBs, so that we can
> actually branch into the middle of the loop on the first iteration;
> after that the loop runs just like in pre-146817.
>
> The degradation comes from the fact that having two BBs for
> that single loop prevents good scheduling of the instructions inside it, like
> this:
>
> Good code: All the fp load instructions are grouped in the upper portion
> of the code.
>
> fmul f22,f11,f13
> fmul f23,f11,f0
> addis r12,r6,-27
> lfd f3,0(r6)
> addi r4,r6,8
> lfd f1,9472(r12)
> addis r12,r4,-27
> fmadd f8,f12,f0,f22
> fmsub f4,f12,f13,f23
> lfd f22,9472(r12)
> lfd f23,8(r6)
> addi r6,r4,8
> fmul f11,f8,f13
> fmul f24,f8,f1
> fmul f25,f8,f3
> fmul f5,f8,f0
> fmadd f11,f4,f0,f11
> fmadd f21,f4,f3,f24
> fmsub f2,f4,f1,f25
> fmsub f12,f4,f13,f5
> fmul f1,f11,f23
> fmul f8,f11,f22
> fadd f9,f9,f21
> fadd f10,f10,f2
> fmsub f24,f12,f22,f1
> fmadd f25,f12,f23,f8
> fadd f10,f10,f24
> fadd f9,f9,f25
> bdnz 100ca878 <thin6d_+0x1018>
>
> Bad code: The second pair of loads is pushed down into the second BB,
> causing slowdowns.
>
> fmul f5,f8,f0
> addis r3,r4,-27
> lfd f22,8(r7)
> addi r7,r4,8
> lfd f6,9472(r3)
> fmadd f10,f9,f0,f10
> fmsub f23,f9,f13,f5
> fmul f2,f10,f22
> fmul f9,f10,f6
> fmr f7,f23
> fmsub f25,f23,f6,f2
> fmadd f26,f23,f22,f9
> fadd f12,f12,f25
> fadd f11,f11,f26
> fmul f8,f10,f13
> >> BB mark
> fmul f22,f10,f0
> addis r3,r7,-27
> lfd f21,0(r7)
> addi r4,r7,8
> lfd f25,9472(r3)
> fmadd f8,f7,f0,f8
> fmsub f9,f7,f13,f22
> fmul f23,f8,f21
> fmul f26,f8,f25
> fmsub f24,f9,f25,f23
> fmadd f7,f9,f21,f26
> fadd f12,f12,f24
> fadd f11,f11,f7
> bdnz 100c9fe0 <thin6d_+0xfd0>
>
> I've opened bugzilla http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39976
> for this.
Does enabling the selective scheduler work around this problem?
Richard.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-05-01 19:33 ` Richard Guenther
@ 2009-05-04 13:38 ` Luis Machado
0 siblings, 0 replies; 56+ messages in thread
From: Luis Machado @ 2009-05-04 13:38 UTC (permalink / raw)
To: Richard Guenther; +Cc: Michael Matz, David Edelsohn, gcc-patches
Hi,
On Fri, 2009-05-01 at 21:33 +0200, Richard Guenther wrote:
> Does enabling the selective scheduler work around this problem?
>
> Richard.
Not really. The numbers are the same with -fselective-scheduling.
Luis
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2010-10-20 18:02 ` H.J. Lu
@ 2011-02-14 18:02 ` H.J. Lu
0 siblings, 0 replies; 56+ messages in thread
From: H.J. Lu @ 2011-02-14 18:02 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev
On Wed, Oct 20, 2010 at 10:00 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, May 28, 2009 at 8:24 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Thu, Apr 30, 2009 at 6:06 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Tue, Apr 28, 2009 at 4:44 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Sun, Apr 26, 2009 at 7:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>> On Wed, Apr 22, 2009 at 9:42 AM, Michael Matz <matz@suse.de> wrote:
>>>>>> On Wed, 22 Apr 2009, Michael Matz wrote:
>>>>>>
>>>>>>> I'll soon send a new version of the patch that fixes all problems and
>>>>>>> testcases I encountered.
>>>>>>
>>>>>> Like so. This is the full patch, i.e. including the cleanups, but
>>>>>> excluding the testsuite changes. It should incorporate all feedback.
>>>>>> Compared to the last version it adds comments for new functions, fixes
>>>>>> mudflap2, and generally some other minor problems showing when I started
>>>>>> testing Ada and a bug reported by Andrey.
>>>>>>
>>>>>> This patch (plus testsuite changes) was bootstrapped with Ada on
>>>>>> x86_64-linux. There are no testsuite regressions:
>>>>>> FAIL: gcc.dg/tree-prof/bb-reorg.c compilation, -fprofile-use -D_PROFILE_USE
>>>>>> FAIL: gcc.dg/tree-prof/pr34999.c compilation, -fprofile-use -D_PROFILE_USE
>>>>>> FAIL: gcc.target/i386/avx-vmovntdq-256-1.c (test for excess errors)
>>>>>> FAIL: gcc.target/i386/avx-vmovntpd-256-1.c (test for excess errors)
>>>>>> FAIL: gcc.target/i386/avx-vmovntps-256-1.c (test for excess errors)
>>>>>> FAIL: libmudflap.c++/pass41-frag.cxx execution test
>>>>>> FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test
>>>>>> FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test
>>>>>> FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test
>>>>>>
>>>>>> All of these happen without the patch too (known bugs, old binutils, and
>>>>>> pass41-frag never seems to work anyway).
>>>>>>
>>>>>> I'd like to ask for approval for the series.
>>>>>>
>>>>>>
>>>>>> Ciao,
>>>>>> Michael.
>>>>>> --
>>>>>> * builtins.c (fold_builtin_next_arg): Handle SSA names.
>>>>>> * tree-ssa-copyrename.c (rename_ssa_copies): Don't iterate
>>>>>> beyond num_ssa_names, use ssa_name() directly.
>>>>>> * tree-ssa-ter.c (free_temp_expr_table): Likewise.
>>>>>> * tree-ssa-coalesce.c (create_outofssa_var_map): Likewise,
>>>>>> mark only useful SSA names.
>>>>>> (compare_pairs): Swap cost comparison.
>>>>>> (coalesce_ssa_name): Don't use change_partition_var.
>>>>>> * tree-nrv.c (struct nrv_data): Add modified member.
>>>>>> (finalize_nrv_r): Set it.
>>>>>> (tree_nrv): Use it to update statements.
>>>>>> (pass_nrv): Require PROP_ssa.
>>>>>> * tree-mudflap.c (create_referenced_var): New static helper.
>>>>>> (mf_decl_cache_locals, mf_build_check_statement_for): Use it.
>>>>>> (pass_mudflap_2): Require PROP_ssa, run ssa update at finish.
>>>>>> * alias.c (find_base_decl): Handle SSA names.
>>>>>> * emit-rtl.c (set_reg_attrs_for_parm): Make non-static.
>>>>>> (component_ref_for_mem_expr): Don't leak SSA names into RTL.
>>>>>> * rtl.h (set_reg_attrs_for_parm): Declare.
>>>>>> * tree-optimize.c (pass_cleanup_cfg_post_optimizing): Rename
>>>>>> to "optimized", remove unused locals at finish.
>>>>>> (execute_free_datastructures): Make global, call
>>>>>> delete_tree_cfg_annotations.
>>>>>> (execute_free_cfg_annotations): Don't call
>>>>>> delete_tree_cfg_annotations.
>>>>>>
>>>>>> * ssaexpand.h: New file.
>>>>>> * expr.c (toplevel): Include ssaexpand.h.
>>>>>> (expand_assignment): Handle SSA names the same as register
>>>>>> variables.
>>>>>> (expand_expr_real_1): Expand SSA names.
>>>>>> * cfgexpand.c (toplevel): Include ssaexpand.h.
>>>>>> (SA): New global variable.
>>>>>> (gimple_cond_pred_to_tree): Fold TERed comparisons into predicates.
>>>>>> (SSAVAR): New macro.
>>>>>> (set_rtl): New helper function.
>>>>>> (add_stack_var): Deal with SSA names, use set_rtl.
>>>>>> (expand_one_stack_var_at): Likewise.
>>>>>> (expand_one_stack_var): Deal with SSA names.
>>>>>> (stack_var_size_cmp): Use code (SSA_NAME / DECL) as tie breaker
>>>>>> before unique numbers.
>>>>>> (expand_stack_vars): Use set_rtl.
>>>>>> (expand_one_var): Accept SSA names, add asserts for them, feed them
>>>>>> to above subroutines.
>>>>>> (expand_used_vars): Expand all partitions (without default defs),
>>>>>> then only the local decls (ignoring those expanded already).
>>>>>> (expand_gimple_cond): Remove edges when jumpif() expands an
>>>>>> unconditional jump.
>>>>>> (expand_gimple_basic_block): Don't clear EDGE_EXECUTABLE here,
>>>>>> or remove abnormal edges. Ignore insns setting the LHS of a TERed
>>>>>> SSA name.
>>>>>> (gimple_expand_cfg): Call into rewrite_out_of_ssa, initialize
>>>>>> members of SA; deal with PARM_DECL partitions here; expand
>>>>>> all PHI nodes, free tree datastructures and SA. Commit instructions
>>>>>> on edges, clear EDGE_EXECUTABLE and remove abnormal edges here.
>>>>>> (pass_expand): Require and destroy PROP_ssa, verify SSA form, flow
>>>>>> info and statements at start, collect garbage at finish.
>>>>>> * tree-ssa-live.h (struct _var_map): Remove partition_to_var member.
>>>>>> (VAR_ANN_PARTITION): Remove.
>>>>>> (change_partition_var): Don't declare.
>>>>>> (partition_to_var): Always return SSA names.
>>>>>> (var_to_partition): Only accept SSA names.
>>>>>> (register_ssa_partition): Only check argument.
>>>>>> * tree-ssa-live.c (init_var_map): Don't allocate partition_to_var
>>>>>> member.
>>>>>> (delete_var_map): Don't free it.
>>>>>> (var_union): Only accept SSA names, simplify.
>>>>>> (partition_view_init): Mark only useful SSA names as used.
>>>>>> (partition_view_fini): Only deal with SSA names.
>>>>>> (change_partition_var): Remove.
>>>>>> (dump_var_map): Use ssa_name instead of partition_to_var member.
>>>>>> * tree-ssa.c (delete_tree_ssa): Don't remove PHI nodes on RTL
>>>>>> basic blocks.
>>>>>> * tree-outof-ssa.c (toplevel): Include ssaexpand.h and expr.h.
>>>>>> (struct _elim_graph): New member const_dests; nodes member vector of
>>>>>> ints.
>>>>>> (set_location_for_edge): New static helper.
>>>>>> (create_temp): Remove.
>>>>>> (insert_partition_copy_on_edge, insert_part_to_rtx_on_edge,
>>>>>> insert_value_copy_on_edge, insert_rtx_to_part_on_edge): New
>>>>>> functions.
>>>>>> (new_elim_graph): Allocate const_dests member.
>>>>>> (clean_elim_graph): Truncate const_dests member.
>>>>>> (delete_elim_graph): Free const_dests member.
>>>>>> (elim_graph_size): Adapt to new type of nodes member.
>>>>>> (elim_graph_add_node): Likewise.
>>>>>> (eliminate_name): Likewise.
>>>>>> (eliminate_build): Don't take basic block argument, deal only with
>>>>>> partition numbers, not variables.
>>>>>> (get_temp_reg): New static helper.
>>>>>> (elim_create): Use it, deal with RTL temporaries instead of trees.
>>>>>> (eliminate_phi): Adjust all calls to new signature.
>>>>>> (assign_vars, replace_use_variable, replace_def_variable): Remove.
>>>>>> (rewrite_trees): Only do checking.
>>>>>> (edge_leader, stmt_list, leader_has_match, leader_match): Remove.
>>>>>> (same_stmt_list_p, identical_copies_p, identical_stmt_lists_p,
>>>>>> init_analyze_edges_for_bb, fini_analyze_edges_for_bb,
>>>>>> contains_tree_r, MAX_STMTS_IN_LATCH,
>>>>>> process_single_block_loop_latch, analyze_edges_for_bb,
>>>>>> perform_edge_inserts): Remove.
>>>>>> (expand_phi_nodes): New global function.
>>>>>> (remove_ssa_form): Take ssaexpand parameter. Don't call removed
>>>>>> functions, initialize new parameter, remember partitions having a
>>>>>> default def.
>>>>>> (finish_out_of_ssa): New global function.
>>>>>> (rewrite_out_of_ssa): Make global. Adjust call to remove_ssa_form,
>>>>>> don't reset in_ssa_p here.
>>>>>> (pass_del_ssa): Remove.
>>>>>> * tree-flow.h (struct var_ann_d): Remove out_of_ssa_tag and
>>>>>> partition members.
>>>>>> (execute_free_datastructures): Declare.
>>>>>> * Makefile.in (SSAEXPAND_H): New variable.
>>>>>> (tree-outof-ssa.o, expr.o, cfgexpand.o): Depend on SSAEXPAND_H.
>>>>>> * basic-block.h (commit_one_edge_insertion): Declare.
>>>>>> * passes.c (init_optimization_passes): Move pass_nrv and
>>>>>> pass_mudflap2 before pass_cleanup_cfg_post_optimizing, remove
>>>>>> pass_del_ssa, pass_free_datastructures, pass_free_cfg_annotations.
>>>>>> * cfgrtl.c (commit_one_edge_insertion): Make global, don't declare.
>>>>>> (redirect_branch_edge): Deal with super block when expanding, split
>>>>>> out jump patching itself into ...
>>>>>> (patch_jump_insn): ... here, new static helper.
>>>>>>
>>>>>
>>>>> This patch caused:
>>>>>
>>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39922
>>>>>
>>>>> You may need a 32bit host to see it since I didn't see it on
>>>>> Linux/x86-64 with -m32.
>>>>>
>>>>
>>>> This also caused:
>>>>
>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39954
>>>>
>>>
>>> This also caused:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39973
>>>
>>> and may cause:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39972
>>>
>>
>> This also caused:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40012
>>
>>
>
> This also caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46098
>
This also caused:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47735
--
H.J.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-05-29 3:47 ` H.J. Lu
@ 2010-10-20 18:02 ` H.J. Lu
2011-02-14 18:02 ` H.J. Lu
0 siblings, 1 reply; 56+ messages in thread
From: H.J. Lu @ 2010-10-20 18:02 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev
On Thu, May 28, 2009 at 8:24 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Apr 30, 2009 at 6:06 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Tue, Apr 28, 2009 at 4:44 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Sun, Apr 26, 2009 at 7:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Wed, Apr 22, 2009 at 9:42 AM, Michael Matz <matz@suse.de> wrote:
>>>>> On Wed, 22 Apr 2009, Michael Matz wrote:
>>>>>
>>>>>> I'll soon send a new version of the patch that fixes all problems and
>>>>>> testcases I encountered.
>>>>>
>>>>> Like so. This is the full patch, i.e. including the cleanups, but
>>>>> excluding the testsuite changes. It should incorporate all feedback.
>>>>> Compared to the last version it adds comments for new functions, fixes
>>>>> mudflap2, and generally some other minor problems showing when I started
>>>>> testing Ada and a bug reported by Andrey.
>>>>>
>>>>> This patch (plus testsuite changes) was bootstrapped with Ada on
>>>>> x86_64-linux. There are no testsuite regressions:
>>>>> FAIL: gcc.dg/tree-prof/bb-reorg.c compilation, -fprofile-use -D_PROFILE_USE
>>>>> FAIL: gcc.dg/tree-prof/pr34999.c compilation, -fprofile-use -D_PROFILE_USE
>>>>> FAIL: gcc.target/i386/avx-vmovntdq-256-1.c (test for excess errors)
>>>>> FAIL: gcc.target/i386/avx-vmovntpd-256-1.c (test for excess errors)
>>>>> FAIL: gcc.target/i386/avx-vmovntps-256-1.c (test for excess errors)
>>>>> FAIL: libmudflap.c++/pass41-frag.cxx execution test
>>>>> FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test
>>>>> FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test
>>>>> FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test
>>>>>
>>>>> All of these happen without the patch too (known bugs, old binutils, and
>>>>> pass41-frag never seems to work anyway).
>>>>>
>>>>> I'd like to ask for approval for the series.
>>>>>
>>>>>
>>>>> Ciao,
>>>>> Michael.
>>>>> --
>>>>> * builtins.c (fold_builtin_next_arg): Handle SSA names.
>>>>> * tree-ssa-copyrename.c (rename_ssa_copies): Don't iterate
>>>>> beyond num_ssa_names, use ssa_name() directly.
>>>>> * tree-ssa-ter.c (free_temp_expr_table): Likewise.
>>>>> * tree-ssa-coalesce.c (create_outofssa_var_map): Likewise,
>>>>> mark only useful SSA names.
>>>>> (compare_pairs): Swap cost comparison.
>>>>> (coalesce_ssa_name): Don't use change_partition_var.
>>>>> * tree-nrv.c (struct nrv_data): Add modified member.
>>>>> (finalize_nrv_r): Set it.
>>>>> (tree_nrv): Use it to update statements.
>>>>> (pass_nrv): Require PROP_ssa.
>>>>> * tree-mudflap.c (create_referenced_var): New static helper.
>>>>> (mf_decl_cache_locals, mf_build_check_statement_for): Use it.
>>>>> (pass_mudflap_2): Require PROP_ssa, run ssa update at finish.
>>>>> * alias.c (find_base_decl): Handle SSA names.
>>>>> * emit-rtl.c (set_reg_attrs_for_parm): Make non-static.
>>>>> (component_ref_for_mem_expr): Don't leak SSA names into RTL.
>>>>> * rtl.h (set_reg_attrs_for_parm): Declare.
>>>>> * tree-optimize.c (pass_cleanup_cfg_post_optimizing): Rename
>>>>> to "optimized", remove unused locals at finish.
>>>>> (execute_free_datastructures): Make global, call
>>>>> delete_tree_cfg_annotations.
>>>>> (execute_free_cfg_annotations): Don't call
>>>>> delete_tree_cfg_annotations.
>>>>>
>>>>> * ssaexpand.h: New file.
>>>>> * expr.c (toplevel): Include ssaexpand.h.
>>>>> (expand_assignment): Handle SSA names the same as register
>>>>> variables.
>>>>> (expand_expr_real_1): Expand SSA names.
>>>>> * cfgexpand.c (toplevel): Include ssaexpand.h.
>>>>> (SA): New global variable.
>>>>> (gimple_cond_pred_to_tree): Fold TERed comparisons into predicates.
>>>>> (SSAVAR): New macro.
>>>>> (set_rtl): New helper function.
>>>>> (add_stack_var): Deal with SSA names, use set_rtl.
>>>>> (expand_one_stack_var_at): Likewise.
>>>>> (expand_one_stack_var): Deal with SSA names.
>>>>> (stack_var_size_cmp): Use code (SSA_NAME / DECL) as tie breaker
>>>>> before unique numbers.
>>>>> (expand_stack_vars): Use set_rtl.
>>>>> (expand_one_var): Accept SSA names, add asserts for them, feed them
>>>>> to above subroutines.
>>>>> (expand_used_vars): Expand all partitions (without default defs),
>>>>> then only the local decls (ignoring those expanded already).
>>>>> (expand_gimple_cond): Remove edges when jumpif() expands an
>>>>> unconditional jump.
>>>>> (expand_gimple_basic_block): Don't clear EDGE_EXECUTABLE here,
>>>>> or remove abnormal edges. Ignore insns setting the LHS of a TERed
>>>>> SSA name.
>>>>> (gimple_expand_cfg): Call into rewrite_out_of_ssa, initialize
>>>>> members of SA; deal with PARM_DECL partitions here; expand
>>>>> all PHI nodes, free tree datastructures and SA. Commit instructions
>>>>> on edges, clear EDGE_EXECUTABLE and remove abnormal edges here.
>>>>> (pass_expand): Require and destroy PROP_ssa, verify SSA form, flow
>>>>> info and statements at start, collect garbage at finish.
>>>>> * tree-ssa-live.h (struct _var_map): Remove partition_to_var member.
>>>>> (VAR_ANN_PARTITION): Remove.
>>>>> (change_partition_var): Don't declare.
>>>>> (partition_to_var): Always return SSA names.
>>>>> (var_to_partition): Only accept SSA names.
>>>>> (register_ssa_partition): Only check argument.
>>>>> * tree-ssa-live.c (init_var_map): Don't allocate partition_to_var
>>>>> member.
>>>>> (delete_var_map): Don't free it.
>>>>> (var_union): Only accept SSA names, simplify.
>>>>> (partition_view_init): Mark only useful SSA names as used.
>>>>> (partition_view_fini): Only deal with SSA names.
>>>>> (change_partition_var): Remove.
>>>>> (dump_var_map): Use ssa_name instead of partition_to_var member.
>>>>> * tree-ssa.c (delete_tree_ssa): Don't remove PHI nodes on RTL
>>>>> basic blocks.
>>>>> * tree-outof-ssa.c (toplevel): Include ssaexpand.h and expr.h.
>>>>> (struct _elim_graph): New member const_dests; nodes member vector of
>>>>> ints.
>>>>> (set_location_for_edge): New static helper.
>>>>> (create_temp): Remove.
>>>>> (insert_partition_copy_on_edge, insert_part_to_rtx_on_edge,
>>>>> insert_value_copy_on_edge, insert_rtx_to_part_on_edge): New
>>>>> functions.
>>>>> (new_elim_graph): Allocate const_dests member.
>>>>> (clean_elim_graph): Truncate const_dests member.
>>>>> (delete_elim_graph): Free const_dests member.
>>>>> (elim_graph_size): Adapt to new type of nodes member.
>>>>> (elim_graph_add_node): Likewise.
>>>>> (eliminate_name): Likewise.
>>>>> (eliminate_build): Don't take basic block argument, deal only with
>>>>> partition numbers, not variables.
>>>>> (get_temp_reg): New static helper.
>>>>> (elim_create): Use it, deal with RTL temporaries instead of trees.
>>>>> (eliminate_phi): Adjust all calls to new signature.
>>>>> (assign_vars, replace_use_variable, replace_def_variable): Remove.
>>>>> (rewrite_trees): Only do checking.
>>>>> (edge_leader, stmt_list, leader_has_match, leader_match): Remove.
>>>>> (same_stmt_list_p, identical_copies_p, identical_stmt_lists_p,
>>>>> init_analyze_edges_for_bb, fini_analyze_edges_for_bb,
>>>>> contains_tree_r, MAX_STMTS_IN_LATCH,
>>>>> process_single_block_loop_latch, analyze_edges_for_bb,
>>>>> perform_edge_inserts): Remove.
>>>>> (expand_phi_nodes): New global function.
>>>>> (remove_ssa_form): Take ssaexpand parameter. Don't call removed
>>>>> functions, initialize new parameter, remember partitions having a
>>>>> default def.
>>>>> (finish_out_of_ssa): New global function.
>>>>> (rewrite_out_of_ssa): Make global. Adjust call to remove_ssa_form,
>>>>> don't reset in_ssa_p here.
>>>>> (pass_del_ssa): Remove.
>>>>> * tree-flow.h (struct var_ann_d): Remove out_of_ssa_tag and
>>>>> partition members.
>>>>> (execute_free_datastructures): Declare.
>>>>> * Makefile.in (SSAEXPAND_H): New variable.
>>>>> (tree-outof-ssa.o, expr.o, cfgexpand.o): Depend on SSAEXPAND_H.
>>>>> * basic-block.h (commit_one_edge_insertion): Declare.
>>>>> * passes.c (init_optimization_passes): Move pass_nrv and
>>>>> pass_mudflap2 before pass_cleanup_cfg_post_optimizing, remove
>>>>> pass_del_ssa, pass_free_datastructures, pass_free_cfg_annotations.
>>>>> * cfgrtl.c (commit_one_edge_insertion): Make global, don't declare.
>>>>> (redirect_branch_edge): Deal with super block when expanding, split
>>>>> out jump patching itself into ...
>>>>> (patch_jump_insn): ... here, new static helper.
>>>>>
>>>>
>>>> This patch caused:
>>>>
>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39922
>>>>
>>>> You may need a 32bit host to see it since I didn't see it on
>>>> Linux/x86-64 with -m32.
>>>>
>>>
>>> This also caused:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39954
>>>
>>
>> This also caused:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39973
>>
>> and may cause:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39972
>>
>
> This also caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40012
>
>
This also caused:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46098
--
H.J.
* Re: [RFA] expand from SSA form (1/2)
2009-04-26 20:27 ` Michael Matz
@ 2010-01-19 15:48 ` H.J. Lu
0 siblings, 0 replies; 56+ messages in thread
From: H.J. Lu @ 2010-01-19 15:48 UTC (permalink / raw)
To: Michael Matz; +Cc: Andrew MacLeod, gcc-patches, Andrey Belevantsev
On Sun, Apr 26, 2009 at 12:21 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Thu, 23 Apr 2009, Andrew MacLeod wrote:
>
>> Looks good to me.
>
> It's now in.
>
It caused:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42800
--
H.J.
* Re: [RFA] expand from SSA form (1/2)
2009-04-30 13:47 ` H.J. Lu
@ 2009-05-29 3:47 ` H.J. Lu
2010-10-20 18:02 ` H.J. Lu
0 siblings, 1 reply; 56+ messages in thread
From: H.J. Lu @ 2009-05-29 3:47 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev
On Thu, Apr 30, 2009 at 6:06 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Apr 28, 2009 at 4:44 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Sun, Apr 26, 2009 at 7:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Wed, Apr 22, 2009 at 9:42 AM, Michael Matz <matz@suse.de> wrote:
>>>> On Wed, 22 Apr 2009, Michael Matz wrote:
>>>>
>>>>> I'll soon send a new version of the patch that fixes all problems and
>>>>> testcases I encountered.
>>>>
>>>> Like so. This is the full patch, i.e. including the cleanups, but
>>>> excluding the testsuite changes. It should incorporate all feedback.
>>>> Compared to the last version it adds comments for new functions, fixes
>>>> mudflap2, and generally some other minor problems showing when I started
>>>> testing Ada and a bug reported by Andrey.
>>>>
>>>> This patch (plus testsuite changes) was bootstrapped with Ada on
>>>> x86_64-linux. There are no testsuite regressions:
>>>> FAIL: gcc.dg/tree-prof/bb-reorg.c compilation, -fprofile-use -D_PROFILE_USE
>>>> FAIL: gcc.dg/tree-prof/pr34999.c compilation, -fprofile-use -D_PROFILE_USE
>>>> FAIL: gcc.target/i386/avx-vmovntdq-256-1.c (test for excess errors)
>>>> FAIL: gcc.target/i386/avx-vmovntpd-256-1.c (test for excess errors)
>>>> FAIL: gcc.target/i386/avx-vmovntps-256-1.c (test for excess errors)
>>>> FAIL: libmudflap.c++/pass41-frag.cxx execution test
>>>> FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test
>>>> FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test
>>>> FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test
>>>>
>>>> All of these happen without the patch too (known bugs, old binutils, and
>>>> pass41-frag never seems to work anyway).
>>>>
>>>> I'd like to ask for approval for the series.
>>>>
>>>>
>>>> Ciao,
>>>> Michael.
>>>> --
>>>> * builtins.c (fold_builtin_next_arg): Handle SSA names.
>>>> * tree-ssa-copyrename.c (rename_ssa_copies): Don't iterate
>>>> beyond num_ssa_names, use ssa_name() directly.
>>>> * tree-ssa-ter.c (free_temp_expr_table): Likewise.
>>>> * tree-ssa-coalesce.c (create_outofssa_var_map): Likewise,
>>>> mark only useful SSA names.
>>>> (compare_pairs): Swap cost comparison.
>>>> (coalesce_ssa_name): Don't use change_partition_var.
>>>> * tree-nrv.c (struct nrv_data): Add modified member.
>>>> (finalize_nrv_r): Set it.
>>>> (tree_nrv): Use it to update statements.
>>>> (pass_nrv): Require PROP_ssa.
>>>> * tree-mudflap.c (create_referenced_var): New static helper.
>>>> (mf_decl_cache_locals, mf_build_check_statement_for): Use it.
>>>> (pass_mudflap_2): Require PROP_ssa, run ssa update at finish.
>>>> * alias.c (find_base_decl): Handle SSA names.
>>>> * emit-rtl.c (set_reg_attrs_for_parm): Make non-static.
>>>> (component_ref_for_mem_expr): Don't leak SSA names into RTL.
>>>> * rtl.h (set_reg_attrs_for_parm): Declare.
>>>> * tree-optimize.c (pass_cleanup_cfg_post_optimizing): Rename
>>>> to "optimized", remove unused locals at finish.
>>>> (execute_free_datastructures): Make global, call
>>>> delete_tree_cfg_annotations.
>>>> (execute_free_cfg_annotations): Don't call
>>>> delete_tree_cfg_annotations.
>>>>
>>>> * ssaexpand.h: New file.
>>>> * expr.c (toplevel): Include ssaexpand.h.
>>>> (expand_assignment): Handle SSA names the same as register
>>>> variables.
>>>> (expand_expr_real_1): Expand SSA names.
>>>> * cfgexpand.c (toplevel): Include ssaexpand.h.
>>>> (SA): New global variable.
>>>> (gimple_cond_pred_to_tree): Fold TERed comparisons into predicates.
>>>> (SSAVAR): New macro.
>>>> (set_rtl): New helper function.
>>>> (add_stack_var): Deal with SSA names, use set_rtl.
>>>> (expand_one_stack_var_at): Likewise.
>>>> (expand_one_stack_var): Deal with SSA names.
>>>> (stack_var_size_cmp): Use code (SSA_NAME / DECL) as tie breaker
>>>> before unique numbers.
>>>> (expand_stack_vars): Use set_rtl.
>>>> (expand_one_var): Accept SSA names, add asserts for them, feed them
>>>> to above subroutines.
>>>> (expand_used_vars): Expand all partitions (without default defs),
>>>> then only the local decls (ignoring those expanded already).
>>>> (expand_gimple_cond): Remove edges when jumpif() expands an
>>>> unconditional jump.
>>>> (expand_gimple_basic_block): Don't clear EDGE_EXECUTABLE here,
>>>> or remove abnormal edges. Ignore insns setting the LHS of a TERed
>>>> SSA name.
>>>> (gimple_expand_cfg): Call into rewrite_out_of_ssa, initialize
>>>> members of SA; deal with PARM_DECL partitions here; expand
>>>> all PHI nodes, free tree datastructures and SA. Commit instructions
>>>> on edges, clear EDGE_EXECUTABLE and remove abnormal edges here.
>>>> (pass_expand): Require and destroy PROP_ssa, verify SSA form, flow
>>>> info and statements at start, collect garbage at finish.
>>>> * tree-ssa-live.h (struct _var_map): Remove partition_to_var member.
>>>> (VAR_ANN_PARTITION): Remove.
>>>> (change_partition_var): Don't declare.
>>>> (partition_to_var): Always return SSA names.
>>>> (var_to_partition): Only accept SSA names.
>>>> (register_ssa_partition): Only check argument.
>>>> * tree-ssa-live.c (init_var_map): Don't allocate partition_to_var
>>>> member.
>>>> (delete_var_map): Don't free it.
>>>> (var_union): Only accept SSA names, simplify.
>>>> (partition_view_init): Mark only useful SSA names as used.
>>>> (partition_view_fini): Only deal with SSA names.
>>>> (change_partition_var): Remove.
>>>> (dump_var_map): Use ssa_name instead of partition_to_var member.
>>>> * tree-ssa.c (delete_tree_ssa): Don't remove PHI nodes on RTL
>>>> basic blocks.
>>>> * tree-outof-ssa.c (toplevel): Include ssaexpand.h and expr.h.
>>>> (struct _elim_graph): New member const_dests; nodes member vector of
>>>> ints.
>>>> (set_location_for_edge): New static helper.
>>>> (create_temp): Remove.
>>>> (insert_partition_copy_on_edge, insert_part_to_rtx_on_edge,
>>>> insert_value_copy_on_edge, insert_rtx_to_part_on_edge): New
>>>> functions.
>>>> (new_elim_graph): Allocate const_dests member.
>>>> (clean_elim_graph): Truncate const_dests member.
>>>> (delete_elim_graph): Free const_dests member.
>>>> (elim_graph_size): Adapt to new type of nodes member.
>>>> (elim_graph_add_node): Likewise.
>>>> (eliminate_name): Likewise.
>>>> (eliminate_build): Don't take basic block argument, deal only with
>>>> partition numbers, not variables.
>>>> (get_temp_reg): New static helper.
>>>> (elim_create): Use it, deal with RTL temporaries instead of trees.
>>>> (eliminate_phi): Adjust all calls to new signature.
>>>> (assign_vars, replace_use_variable, replace_def_variable): Remove.
>>>> (rewrite_trees): Only do checking.
>>>> (edge_leader, stmt_list, leader_has_match, leader_match): Remove.
>>>> (same_stmt_list_p, identical_copies_p, identical_stmt_lists_p,
>>>> init_analyze_edges_for_bb, fini_analyze_edges_for_bb,
>>>> contains_tree_r, MAX_STMTS_IN_LATCH,
>>>> process_single_block_loop_latch, analyze_edges_for_bb,
>>>> perform_edge_inserts): Remove.
>>>> (expand_phi_nodes): New global function.
>>>> (remove_ssa_form): Take ssaexpand parameter. Don't call removed
>>>> functions, initialize new parameter, remember partitions having a
>>>> default def.
>>>> (finish_out_of_ssa): New global function.
>>>> (rewrite_out_of_ssa): Make global. Adjust call to remove_ssa_form,
>>>> don't reset in_ssa_p here.
>>>> (pass_del_ssa): Remove.
>>>> * tree-flow.h (struct var_ann_d): Remove out_of_ssa_tag and
>>>> partition members.
>>>> (execute_free_datastructures): Declare.
>>>> * Makefile.in (SSAEXPAND_H): New variable.
>>>> (tree-outof-ssa.o, expr.o, cfgexpand.o): Depend on SSAEXPAND_H.
>>>> * basic-block.h (commit_one_edge_insertion): Declare.
>>>> * passes.c (init_optimization_passes): Move pass_nrv and
>>>> pass_mudflap2 before pass_cleanup_cfg_post_optimizing, remove
>>>> pass_del_ssa, pass_free_datastructures, pass_free_cfg_annotations.
>>>> * cfgrtl.c (commit_one_edge_insertion): Make global, don't declare.
>>>> (redirect_branch_edge): Deal with super block when expanding, split
>>>> out jump patching itself into ...
>>>> (patch_jump_insn): ... here, new static helper.
>>>>
>>>
>>> This patch caused:
>>>
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39922
>>>
>>> You may need a 32bit host to see it since I didn't see it on
>>> Linux/x86-64 with -m32.
>>>
>>
>> This also caused:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39954
>>
>
> This also caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39973
>
> and may cause:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39972
>
This also caused:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40012
--
H.J.
* Re: [RFA] expand from SSA form (1/2)
2009-04-30 18:18 ` Steve Ellcey
@ 2009-05-01 17:40 ` Michael Matz
0 siblings, 0 replies; 56+ messages in thread
From: Michael Matz @ 2009-05-01 17:40 UTC (permalink / raw)
To: Steve Ellcey; +Cc: gcc-patches, dave
Hi,
On Thu, 30 Apr 2009, Steve Ellcey wrote:
> FYI: I have submitted a bug about the hppa2.0w-hp-hpux11.11 bootstrap
> not working after version r146817. The bug number is 39977.
>
> I get an ICE while compiling libgcc and there is a cutdown source
> program in the bug report.
I've attached a possible patch for this problem there; it would be nice if
you could test it on a hppa machine (I don't have one available).
Ciao,
Michael.
* Re: [RFA] expand from SSA form (1/2)
2009-04-22 16:45 ` [RFA] " Michael Matz
` (3 preceding siblings ...)
2009-04-27 7:22 ` Hans-Peter Nilsson
@ 2009-04-30 18:18 ` Steve Ellcey
2009-05-01 17:40 ` Michael Matz
4 siblings, 1 reply; 56+ messages in thread
From: Steve Ellcey @ 2009-04-30 18:18 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches, dave
FYI: I have submitted a bug about the hppa2.0w-hp-hpux11.11 bootstrap
not working after version r146817. The bug number is 39977.
I get an ICE while compiling libgcc and there is a cutdown source
program in the bug report.
Steve Ellcey
sje@cup.hp.com
* Re: [RFA] expand from SSA form (1/2)
2009-04-28 23:49 ` H.J. Lu
2009-04-29 0:21 ` Andrew Pinski
@ 2009-04-30 13:47 ` H.J. Lu
2009-05-29 3:47 ` H.J. Lu
1 sibling, 1 reply; 56+ messages in thread
From: H.J. Lu @ 2009-04-30 13:47 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev
On Tue, Apr 28, 2009 at 4:44 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sun, Apr 26, 2009 at 7:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Wed, Apr 22, 2009 at 9:42 AM, Michael Matz <matz@suse.de> wrote:
>>> On Wed, 22 Apr 2009, Michael Matz wrote:
>>>
>>>> I'll soon send a new version of the patch that fixes all problems and
>>>> testcases I encountered.
>>>
>>> Like so. This is the full patch, i.e. including the cleanups, but
>>> excluding the testsuite changes. It should incorporate all feedback.
>>> Compared to the last version it adds comments for new functions, fixes
>>> mudflap2, and generally some other minor problems showing when I started
>>> testing Ada and a bug reported by Andrey.
>>>
>>> This patch (plus testsuite changes) was bootstrapped with Ada on
>>> x86_64-linux. There are no testsuite regressions:
>>> FAIL: gcc.dg/tree-prof/bb-reorg.c compilation, -fprofile-use -D_PROFILE_USE
>>> FAIL: gcc.dg/tree-prof/pr34999.c compilation, -fprofile-use -D_PROFILE_USE
>>> FAIL: gcc.target/i386/avx-vmovntdq-256-1.c (test for excess errors)
>>> FAIL: gcc.target/i386/avx-vmovntpd-256-1.c (test for excess errors)
>>> FAIL: gcc.target/i386/avx-vmovntps-256-1.c (test for excess errors)
>>> FAIL: libmudflap.c++/pass41-frag.cxx execution test
>>> FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test
>>> FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test
>>> FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test
>>>
>>> All of these happen without the patch too (known bugs, old binutils, and
>>> pass41-frag never seems to work anyway).
>>>
>>> I'd like to ask for approval for the series.
>>>
>>>
>>> Ciao,
>>> Michael.
>>> --
>>> * builtins.c (fold_builtin_next_arg): Handle SSA names.
>>> * tree-ssa-copyrename.c (rename_ssa_copies): Don't iterate
>>> beyond num_ssa_names, use ssa_name() directly.
>>> * tree-ssa-ter.c (free_temp_expr_table): Likewise.
>>> * tree-ssa-coalesce.c (create_outofssa_var_map): Likewise,
>>> mark only useful SSA names.
>>> (compare_pairs): Swap cost comparison.
>>> (coalesce_ssa_name): Don't use change_partition_var.
>>> * tree-nrv.c (struct nrv_data): Add modified member.
>>> (finalize_nrv_r): Set it.
>>> (tree_nrv): Use it to update statements.
>>> (pass_nrv): Require PROP_ssa.
>>> * tree-mudflap.c (create_referenced_var): New static helper.
>>> (mf_decl_cache_locals, mf_build_check_statement_for): Use it.
>>> (pass_mudflap_2): Require PROP_ssa, run ssa update at finish.
>>> * alias.c (find_base_decl): Handle SSA names.
>>> * emit-rtl.c (set_reg_attrs_for_parm): Make non-static.
>>> (component_ref_for_mem_expr): Don't leak SSA names into RTL.
>>> * rtl.h (set_reg_attrs_for_parm): Declare.
>>> * tree-optimize.c (pass_cleanup_cfg_post_optimizing): Rename
>>> to "optimized", remove unused locals at finish.
>>> (execute_free_datastructures): Make global, call
>>> delete_tree_cfg_annotations.
>>> (execute_free_cfg_annotations): Don't call
>>> delete_tree_cfg_annotations.
>>>
>>> * ssaexpand.h: New file.
>>> * expr.c (toplevel): Include ssaexpand.h.
>>> (expand_assignment): Handle SSA names the same as register
>>> variables.
>>> (expand_expr_real_1): Expand SSA names.
>>> * cfgexpand.c (toplevel): Include ssaexpand.h.
>>> (SA): New global variable.
>>> (gimple_cond_pred_to_tree): Fold TERed comparisons into predicates.
>>> (SSAVAR): New macro.
>>> (set_rtl): New helper function.
>>> (add_stack_var): Deal with SSA names, use set_rtl.
>>> (expand_one_stack_var_at): Likewise.
>>> (expand_one_stack_var): Deal with SSA names.
>>> (stack_var_size_cmp): Use code (SSA_NAME / DECL) as tie breaker
>>> before unique numbers.
>>> (expand_stack_vars): Use set_rtl.
>>> (expand_one_var): Accept SSA names, add asserts for them, feed them
>>> to above subroutines.
>>> (expand_used_vars): Expand all partitions (without default defs),
>>> then only the local decls (ignoring those expanded already).
>>> (expand_gimple_cond): Remove edges when jumpif() expands an
>>> unconditional jump.
>>> (expand_gimple_basic_block): Don't clear EDGE_EXECUTABLE here,
>>> or remove abnormal edges. Ignore insns setting the LHS of a TERed
>>> SSA name.
>>> (gimple_expand_cfg): Call into rewrite_out_of_ssa, initialize
>>> members of SA; deal with PARM_DECL partitions here; expand
>>> all PHI nodes, free tree datastructures and SA. Commit instructions
>>> on edges, clear EDGE_EXECUTABLE and remove abnormal edges here.
>>> (pass_expand): Require and destroy PROP_ssa, verify SSA form, flow
>>> info and statements at start, collect garbage at finish.
>>> * tree-ssa-live.h (struct _var_map): Remove partition_to_var member.
>>> (VAR_ANN_PARTITION): Remove.
>>> (change_partition_var): Don't declare.
>>> (partition_to_var): Always return SSA names.
>>> (var_to_partition): Only accept SSA names.
>>> (register_ssa_partition): Only check argument.
>>> * tree-ssa-live.c (init_var_map): Don't allocate partition_to_var
>>> member.
>>> (delete_var_map): Don't free it.
>>> (var_union): Only accept SSA names, simplify.
>>> (partition_view_init): Mark only useful SSA names as used.
>>> (partition_view_fini): Only deal with SSA names.
>>> (change_partition_var): Remove.
>>> (dump_var_map): Use ssa_name instead of partition_to_var member.
>>> * tree-ssa.c (delete_tree_ssa): Don't remove PHI nodes on RTL
>>> basic blocks.
>>> * tree-outof-ssa.c (toplevel): Include ssaexpand.h and expr.h.
>>> (struct _elim_graph): New member const_dests; nodes member vector of
>>> ints.
>>> (set_location_for_edge): New static helper.
>>> (create_temp): Remove.
>>> (insert_partition_copy_on_edge, insert_part_to_rtx_on_edge,
>>> insert_value_copy_on_edge, insert_rtx_to_part_on_edge): New
>>> functions.
>>> (new_elim_graph): Allocate const_dests member.
>>> (clean_elim_graph): Truncate const_dests member.
>>> (delete_elim_graph): Free const_dests member.
>>> (elim_graph_size): Adapt to new type of nodes member.
>>> (elim_graph_add_node): Likewise.
>>> (eliminate_name): Likewise.
>>> (eliminate_build): Don't take basic block argument, deal only with
>>> partition numbers, not variables.
>>> (get_temp_reg): New static helper.
>>> (elim_create): Use it, deal with RTL temporaries instead of trees.
>>> (eliminate_phi): Adjust all calls to new signature.
>>> (assign_vars, replace_use_variable, replace_def_variable): Remove.
>>> (rewrite_trees): Only do checking.
>>> (edge_leader, stmt_list, leader_has_match, leader_match): Remove.
>>> (same_stmt_list_p, identical_copies_p, identical_stmt_lists_p,
>>> init_analyze_edges_for_bb, fini_analyze_edges_for_bb,
>>> contains_tree_r, MAX_STMTS_IN_LATCH,
>>> process_single_block_loop_latch, analyze_edges_for_bb,
>>> perform_edge_inserts): Remove.
>>> (expand_phi_nodes): New global function.
>>> (remove_ssa_form): Take ssaexpand parameter. Don't call removed
>>> functions, initialize new parameter, remember partitions having a
>>> default def.
>>> (finish_out_of_ssa): New global function.
>>> (rewrite_out_of_ssa): Make global. Adjust call to remove_ssa_form,
>>> don't reset in_ssa_p here.
>>> (pass_del_ssa): Remove.
>>> * tree-flow.h (struct var_ann_d): Remove out_of_ssa_tag and
>>> partition members.
>>> (execute_free_datastructures): Declare.
>>> * Makefile.in (SSAEXPAND_H): New variable.
>>> (tree-outof-ssa.o, expr.o, cfgexpand.o): Depend on SSAEXPAND_H.
>>> * basic-block.h (commit_one_edge_insertion): Declare.
>>> * passes.c (init_optimization_passes): Move pass_nrv and
>>> pass_mudflap2 before pass_cleanup_cfg_post_optimizing, remove
>>> pass_del_ssa, pass_free_datastructures, pass_free_cfg_annotations.
>>> * cfgrtl.c (commit_one_edge_insertion): Make global, don't declare.
>>> (redirect_branch_edge): Deal with super block when expanding, split
>>> out jump patching itself into ...
>>> (patch_jump_insn): ... here, new static helper.
>>>
>>
>> This patch caused:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39922
>>
>> You may need a 32bit host to see it since I didn't see it on
>> Linux/x86-64 with -m32.
>>
>
> This also caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39954
>
This also caused:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39973
and may cause:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39972
--
H.J.
* Re: [RFA] expand from SSA form (1/2)
2009-04-28 23:49 ` H.J. Lu
@ 2009-04-29 0:21 ` Andrew Pinski
2009-04-30 13:47 ` H.J. Lu
1 sibling, 0 replies; 56+ messages in thread
From: Andrew Pinski @ 2009-04-29 0:21 UTC (permalink / raw)
To: H.J. Lu; +Cc: Michael Matz, gcc-patches, Andrew MacLeod, Andrey Belevantsev
On Tue, Apr 28, 2009 at 4:44 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
> This also caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39954
But as I mentioned in that bug report, I don't think taking the address
here is a valid thing to do.
-- Pinski
* Re: [RFA] expand from SSA form (1/2)
2009-04-27 5:47 ` H.J. Lu
@ 2009-04-28 23:49 ` H.J. Lu
2009-04-29 0:21 ` Andrew Pinski
2009-04-30 13:47 ` H.J. Lu
0 siblings, 2 replies; 56+ messages in thread
From: H.J. Lu @ 2009-04-28 23:49 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev
On Sun, Apr 26, 2009 at 7:40 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Apr 22, 2009 at 9:42 AM, Michael Matz <matz@suse.de> wrote:
>> On Wed, 22 Apr 2009, Michael Matz wrote:
>>
>>> I'll soon send a new version of the patch that fixes all problems and
>>> testcases I encountered.
>>
>> Like so. This is the full patch, i.e. including the cleanups, but
>> excluding the testsuite changes. It should incorporate all feedback.
>> Compared to the last version it adds comments for new functions, fixes
>> mudflap2, and generally some other minor problems showing when I started
>> testing Ada and a bug reported by Andrey.
>>
>> This patch (plus testsuite changes) was bootstrapped with Ada on
>> x86_64-linux. There are no testsuite regressions:
>> FAIL: gcc.dg/tree-prof/bb-reorg.c compilation, -fprofile-use -D_PROFILE_USE
>> FAIL: gcc.dg/tree-prof/pr34999.c compilation, -fprofile-use -D_PROFILE_USE
>> FAIL: gcc.target/i386/avx-vmovntdq-256-1.c (test for excess errors)
>> FAIL: gcc.target/i386/avx-vmovntpd-256-1.c (test for excess errors)
>> FAIL: gcc.target/i386/avx-vmovntps-256-1.c (test for excess errors)
>> FAIL: libmudflap.c++/pass41-frag.cxx execution test
>> FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test
>> FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test
>> FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test
>>
>> All of these happen without the patch too (known bugs, old binutils, and
>> pass41-frag never seems to work anyway).
>>
>> I'd like to ask for approval for the series.
>>
>>
>> Ciao,
>> Michael.
>> --
>> * builtins.c (fold_builtin_next_arg): Handle SSA names.
>> * tree-ssa-copyrename.c (rename_ssa_copies): Don't iterate
>> beyond num_ssa_names, use ssa_name() directly.
>> * tree-ssa-ter.c (free_temp_expr_table): Likewise.
>> * tree-ssa-coalesce.c (create_outofssa_var_map): Likewise,
>> mark only useful SSA names.
>> (compare_pairs): Swap cost comparison.
>> (coalesce_ssa_name): Don't use change_partition_var.
>> * tree-nrv.c (struct nrv_data): Add modified member.
>> (finalize_nrv_r): Set it.
>> (tree_nrv): Use it to update statements.
>> (pass_nrv): Require PROP_ssa.
>> * tree-mudflap.c (create_referenced_var): New static helper.
>> (mf_decl_cache_locals, mf_build_check_statement_for): Use it.
>> (pass_mudflap_2): Require PROP_ssa, run ssa update at finish.
>> * alias.c (find_base_decl): Handle SSA names.
>> * emit-rtl.c (set_reg_attrs_for_parm): Make non-static.
>> (component_ref_for_mem_expr): Don't leak SSA names into RTL.
>> * rtl.h (set_reg_attrs_for_parm): Declare.
>> * tree-optimize.c (pass_cleanup_cfg_post_optimizing): Rename
>> to "optimized", remove unused locals at finish.
>> (execute_free_datastructures): Make global, call
>> delete_tree_cfg_annotations.
>> (execute_free_cfg_annotations): Don't call
>> delete_tree_cfg_annotations.
>>
>> * ssaexpand.h: New file.
>> * expr.c (toplevel): Include ssaexpand.h.
>> (expand_assignment): Handle SSA names the same as register
>> variables.
>> (expand_expr_real_1): Expand SSA names.
>> * cfgexpand.c (toplevel): Include ssaexpand.h.
>> (SA): New global variable.
>> (gimple_cond_pred_to_tree): Fold TERed comparisons into predicates.
>> (SSAVAR): New macro.
>> (set_rtl): New helper function.
>> (add_stack_var): Deal with SSA names, use set_rtl.
>> (expand_one_stack_var_at): Likewise.
>> (expand_one_stack_var): Deal with SSA names.
>> (stack_var_size_cmp): Use code (SSA_NAME / DECL) as tie breaker
>> before unique numbers.
>> (expand_stack_vars): Use set_rtl.
>> (expand_one_var): Accept SSA names, add asserts for them, feed them
>> to above subroutines.
>> (expand_used_vars): Expand all partitions (without default defs),
>> then only the local decls (ignoring those expanded already).
>> (expand_gimple_cond): Remove edges when jumpif() expands an
>> unconditional jump.
>> (expand_gimple_basic_block): Don't clear EDGE_EXECUTABLE here,
>> or remove abnormal edges. Ignore insns setting the LHS of a TERed
>> SSA name.
>> (gimple_expand_cfg): Call into rewrite_out_of_ssa, initialize
>> members of SA; deal with PARM_DECL partitions here; expand
>> all PHI nodes, free tree datastructures and SA. Commit instructions
>> on edges, clear EDGE_EXECUTABLE and remove abnormal edges here.
>> (pass_expand): Require and destroy PROP_ssa, verify SSA form, flow
>> info and statements at start, collect garbage at finish.
>> * tree-ssa-live.h (struct _var_map): Remove partition_to_var member.
>> (VAR_ANN_PARTITION): Remove.
>> (change_partition_var): Don't declare.
>> (partition_to_var): Always return SSA names.
>> (var_to_partition): Only accept SSA names.
>> (register_ssa_partition): Only check argument.
>> * tree-ssa-live.c (init_var_map): Don't allocate partition_to_var
>> member.
>> (delete_var_map): Don't free it.
>> (var_union): Only accept SSA names, simplify.
>> (partition_view_init): Mark only useful SSA names as used.
>> (partition_view_fini): Only deal with SSA names.
>> (change_partition_var): Remove.
>> (dump_var_map): Use ssa_name instead of partition_to_var member.
>> * tree-ssa.c (delete_tree_ssa): Don't remove PHI nodes on RTL
>> basic blocks.
>> * tree-outof-ssa.c (toplevel): Include ssaexpand.h and expr.h.
>> (struct _elim_graph): New member const_dests; nodes member vector of
>> ints.
>> (set_location_for_edge): New static helper.
>> (create_temp): Remove.
>> (insert_partition_copy_on_edge, insert_part_to_rtx_on_edge,
>> insert_value_copy_on_edge, insert_rtx_to_part_on_edge): New
>> functions.
>> (new_elim_graph): Allocate const_dests member.
>> (clean_elim_graph): Truncate const_dests member.
>> (delete_elim_graph): Free const_dests member.
>> (elim_graph_size): Adapt to new type of nodes member.
>> (elim_graph_add_node): Likewise.
>> (eliminate_name): Likewise.
>> (eliminate_build): Don't take basic block argument, deal only with
>> partition numbers, not variables.
>> (get_temp_reg): New static helper.
>> (elim_create): Use it, deal with RTL temporaries instead of trees.
>> (eliminate_phi): Adjust all calls to new signature.
>> (assign_vars, replace_use_variable, replace_def_variable): Remove.
>> (rewrite_trees): Only do checking.
>> (edge_leader, stmt_list, leader_has_match, leader_match): Remove.
>> (same_stmt_list_p, identical_copies_p, identical_stmt_lists_p,
>> init_analyze_edges_for_bb, fini_analyze_edges_for_bb,
>> contains_tree_r, MAX_STMTS_IN_LATCH,
>> process_single_block_loop_latch, analyze_edges_for_bb,
>> perform_edge_inserts): Remove.
>> (expand_phi_nodes): New global function.
>> (remove_ssa_form): Take ssaexpand parameter. Don't call removed
>> functions, initialize new parameter, remember partitions having a
>> default def.
>> (finish_out_of_ssa): New global function.
>> (rewrite_out_of_ssa): Make global. Adjust call to remove_ssa_form,
>> don't reset in_ssa_p here.
>> (pass_del_ssa): Remove.
>> * tree-flow.h (struct var_ann_d): Remove out_of_ssa_tag and
>> partition members.
>> (execute_free_datastructures): Declare.
>> * Makefile.in (SSAEXPAND_H): New variable.
>> (tree-outof-ssa.o, expr.o, cfgexpand.o): Depend on SSAEXPAND_H.
>> * basic-block.h (commit_one_edge_insertion): Declare.
>> * passes.c (init_optimization_passes): Move pass_nrv and
>> pass_mudflap2 before pass_cleanup_cfg_post_optimizing, remove
>> pass_del_ssa, pass_free_datastructures, pass_free_cfg_annotations.
>> * cfgrtl.c (commit_one_edge_insertion): Make global, don't declare.
>> (redirect_branch_edge): Deal with super block when expanding, split
>> out jump patching itself into ...
>> (patch_jump_insn): ... here, new static helper.
>>
>
> This patch caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39922
>
> You may need a 32bit host to see it since I didn't see it on
> Linux/x86-64 with -m32.
>
This also caused:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39954
--
H.J.
* Re: [RFA] expand from SSA form (1/2)
2009-04-26 20:21 ` Michael Matz
2009-04-26 20:34 ` Richard Guenther
2009-04-26 21:42 ` Michael Matz
@ 2009-04-27 12:34 ` Michael Matz
2 siblings, 0 replies; 56+ messages in thread
From: Michael Matz @ 2009-04-27 12:34 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches
[-- Attachment #1: Type: TEXT/PLAIN, Size: 7187 bytes --]
Hi,
On Sun, 26 Apr 2009, Michael Matz wrote:
> > > +   if (SA.values
> > > +       && TREE_CODE (lhs) == SSA_NAME
> > > +       && SA.values[SSA_NAME_VERSION (lhs)])
> > > +     lhs = gimple_assign_rhs_to_tree (SA.values[SSA_NAME_VERSION (lhs)]);
> >
> > Do we really need the SA.values array here? It seems that
> > a bitmap would be enough, as the definition should be still
> > reachable via SSA_NAME_DEF_STMT (lhs). (Thanks Paolo
> > for noticing this)
>
> This should indeed be possible, but I want to do this as cleanup next
> week.
Like so. Already regstrapped without regressions on x86_64-linux.
Committed as r146837.
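For readers following along, here is a minimal standalone sketch of the idea,
not the GCC code itself. All toy_* names are made up for illustration and
only mimic the roles of SSA_NAME_VERSION, SSA_NAME_DEF_STMT and
bitmap_bit_p. The point is that a table of statement pointers indexed by
version is redundant when each name already records its defining statement:
one bit per version saying "replaceable" is enough.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins, NOT the GCC types.  */
struct toy_stmt { const char *text; };

struct toy_ssa_name
{
  unsigned version;           /* plays the role of SSA_NAME_VERSION */
  struct toy_stmt *def_stmt;  /* plays the role of SSA_NAME_DEF_STMT */
};

/* Hypothetical fixed-size bitmap, one bit per SSA version.  */
struct toy_bitmap
{
  unsigned nbits;
  unsigned char *bits;
};

static struct toy_bitmap *
toy_bitmap_alloc (unsigned nbits)
{
  struct toy_bitmap *b = malloc (sizeof *b);
  if (!b)
    abort ();
  b->nbits = nbits;
  /* Always allocate at least one byte so a zero-size map is still valid.  */
  b->bits = calloc (nbits / 8 + 1, 1);
  if (!b->bits)
    abort ();
  return b;
}

static void
toy_bitmap_free (struct toy_bitmap *b)
{
  free (b->bits);
  free (b);
}

static void
toy_bitmap_set_bit (struct toy_bitmap *b, unsigned i)
{
  assert (i < b->nbits);
  b->bits[i / 8] |= (unsigned char) (1u << (i % 8));
}

static bool
toy_bitmap_bit_p (const struct toy_bitmap *b, unsigned i)
{
  return i < b->nbits && ((b->bits[i / 8] >> (i % 8)) & 1u) != 0;
}

/* Return the defining statement of NAME if its version is marked
   replaceable in VALUES, else NULL.  The statement comes from the name
   itself; the bitmap only answers yes or no.  */
static struct toy_stmt *
toy_replaceable_def (const struct toy_bitmap *values,
                     const struct toy_ssa_name *name)
{
  if (values && toy_bitmap_bit_p (values, name->version))
    return name->def_stmt;
  return NULL;
}
```

A caller would mark a version with toy_bitmap_set_bit and then ask
toy_replaceable_def for the statement, which mirrors the shape of the
bitmap_bit_p / SSA_NAME_DEF_STMT pair in the patch.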
Ciao,
Michael.
--
* ssaexpand.h (struct ssaexpand): Member 'values' is a bitmap.
(get_gimple_for_ssa_name): Adjust, lookup using SSA_NAME_DEF_STMT.
* tree-ssa-live.h (find_replaceable_exprs): Return a bitmap.
(dump_replaceable_exprs): Take a bitmap.
* cfgexpand.c (gimple_cond_pred_to_tree): Handle bitmap instead of
array.
(expand_gimple_basic_block): Likewise.
* tree-ssa-ter.c (struct temp_expr_table_d): Make
replaceable_expressions member a bitmap.
(free_temp_expr_table): Pass back and deal with bitmap, not gimple*.
(mark_replaceable): Likewise.
(find_replaceable_in_bb, dump_replaceable_exprs): Likewise.
* tree-outof-ssa.c (remove_ssa_form): 'values' is a bitmap.
Index: cfgexpand.c
===================================================================
--- cfgexpand.c (revision 146817)
+++ cfgexpand.c (working copy)
@@ -94,8 +94,8 @@ gimple_cond_pred_to_tree (gimple stmt)
tree lhs = gimple_cond_lhs (stmt);
if (SA.values
&& TREE_CODE (lhs) == SSA_NAME
- && SA.values[SSA_NAME_VERSION (lhs)])
- lhs = gimple_assign_rhs_to_tree (SA.values[SSA_NAME_VERSION (lhs)]);
+ && bitmap_bit_p (SA.values, SSA_NAME_VERSION (lhs)))
+ lhs = gimple_assign_rhs_to_tree (SSA_NAME_DEF_STMT (lhs));
return build2 (gimple_cond_code (stmt), boolean_type_node,
lhs, gimple_cond_rhs (stmt));
@@ -2078,7 +2078,8 @@ expand_gimple_basic_block (basic_block b
/* Ignore this stmt if it is in the list of
replaceable expressions. */
if (SA.values
- && SA.values[SSA_NAME_VERSION (DEF_FROM_PTR (def_p))])
+ && bitmap_bit_p (SA.values,
+ SSA_NAME_VERSION (DEF_FROM_PTR (def_p))))
continue;
}
stmt_tree = gimple_to_tree (stmt);
Index: tree-ssa-live.h
===================================================================
--- tree-ssa-live.h (Revision 146817)
+++ tree-ssa-live.h (Arbeitskopie)
@@ -340,8 +340,8 @@ extern var_map coalesce_ssa_name (void);
/* From tree-ssa-ter.c */
-extern gimple *find_replaceable_exprs (var_map);
-extern void dump_replaceable_exprs (FILE *, gimple *);
+extern bitmap find_replaceable_exprs (var_map);
+extern void dump_replaceable_exprs (FILE *, bitmap);
#endif /* _TREE_SSA_LIVE_H */
Index: tree-ssa-ter.c
===================================================================
--- tree-ssa-ter.c (Revision 146815)
+++ tree-ssa-ter.c (Arbeitskopie)
@@ -159,7 +159,7 @@ typedef struct temp_expr_table_d
{
var_map map;
bitmap *partition_dependencies; /* Partitions expr is dependent on. */
- gimple *replaceable_expressions; /* Replacement expression table. */
+ bitmap replaceable_expressions; /* Replacement expression table. */
bitmap *expr_decl_uids; /* Base uids of exprs. */
bitmap *kill_list; /* Expr's killed by a partition. */
int virtual_partition; /* Pseudo partition for virtual ops. */
@@ -216,10 +216,10 @@ new_temp_expr_table (var_map map)
/* Free TER table T. If there are valid replacements, return the expression
vector. */
-static gimple *
+static bitmap
free_temp_expr_table (temp_expr_table_p t)
{
- gimple *ret = NULL;
+ bitmap ret = NULL;
#ifdef ENABLE_CHECKING
unsigned x;
@@ -255,7 +255,7 @@ version_to_be_replaced_p (temp_expr_tabl
{
if (!tab->replaceable_expressions)
return false;
- return tab->replaceable_expressions[version] != NULL;
+ return bitmap_bit_p (tab->replaceable_expressions, version);
}
@@ -562,8 +562,8 @@ mark_replaceable (temp_expr_table_p tab,
/* Set the replaceable expression. */
if (!tab->replaceable_expressions)
- tab->replaceable_expressions = XCNEWVEC (gimple, num_ssa_names + 1);
- tab->replaceable_expressions[version] = SSA_NAME_DEF_STMT (var);
+ tab->replaceable_expressions = BITMAP_ALLOC (NULL);
+ bitmap_set_bit (tab->replaceable_expressions, version);
}
@@ -653,12 +653,12 @@ find_replaceable_in_bb (temp_expr_table_
NULL is returned by the function, otherwise an expression vector indexed
by SSA_NAME version numbers. */
-extern gimple *
+extern bitmap
find_replaceable_exprs (var_map map)
{
basic_block bb;
temp_expr_table_p table;
- gimple *ret;
+ bitmap ret;
table = new_temp_expr_table (map);
FOR_EACH_BB (bb)
@@ -676,19 +676,19 @@ find_replaceable_exprs (var_map map)
/* Dump TER expression table EXPR to file F. */
void
-dump_replaceable_exprs (FILE *f, gimple *expr)
+dump_replaceable_exprs (FILE *f, bitmap expr)
{
tree var;
unsigned x;
fprintf (f, "\nReplacing Expressions\n");
for (x = 0; x < num_ssa_names; x++)
- if (expr[x])
+ if (bitmap_bit_p (expr, x))
{
var = ssa_name (x);
print_generic_expr (f, var, TDF_SLIM);
fprintf (f, " replace with --> ");
- print_gimple_stmt (f, expr[x], 0, TDF_SLIM);
+ print_gimple_stmt (f, SSA_NAME_DEF_STMT (var), 0, TDF_SLIM);
fprintf (f, "\n");
}
fprintf (f, "\n");
Index: tree-outof-ssa.c
===================================================================
--- tree-outof-ssa.c (Revision 146817)
+++ tree-outof-ssa.c (Arbeitskopie)
@@ -791,7 +791,7 @@ expand_phi_nodes (struct ssaexpand *sa)
static void
remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
{
- gimple *values = NULL;
+ bitmap values = NULL;
var_map map;
unsigned i;
@@ -926,7 +926,7 @@ finish_out_of_ssa (struct ssaexpand *sa)
{
free (sa->partition_to_pseudo);
if (sa->values)
- free (sa->values);
+ BITMAP_FREE (sa->values);
delete_var_map (sa->map);
BITMAP_FREE (sa->partition_has_default_def);
memset (sa, 0, sizeof *sa);
Index: ssaexpand.h
===================================================================
--- ssaexpand.h (Revision 146817)
+++ ssaexpand.h (Arbeitskopie)
@@ -31,10 +31,9 @@ struct ssaexpand
/* The computed partitions of SSA names are stored here. */
var_map map;
- /* For a SSA name version V values[V] contains the gimple statement
- defining it iff TER decided that it should be forwarded, NULL
- otherwise. */
- gimple *values;
+ /* For an SSA name version V bit V is set iff TER decided that
+ its definition should be forwarded. */
+ bitmap values;
/* For a partition number I partition_to_pseudo[I] contains the
RTL expression of the allocated space of it (either a MEM or
@@ -67,8 +66,8 @@ static inline gimple
get_gimple_for_ssa_name (tree exp)
{
int v = SSA_NAME_VERSION (exp);
- if (SA.values)
- return SA.values[v];
+ if (SA.values && bitmap_bit_p (SA.values, v))
+ return SSA_NAME_DEF_STMT (exp);
return NULL;
}
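
The idea behind this cleanup can be sketched outside of GCC. What follows is a minimal, self-contained illustration, not the GCC code: the toy_* types and functions are hypothetical stand-ins for gimple, SSA names and GCC's bitmap, and the fixed-size bit array is a simplification. It shows why one bit per SSA version is enough once every name can already reach its defining statement on its own, which is the role SSA_NAME_DEF_STMT plays in the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Toy stand-ins (assumptions for illustration, not GCC types).  */
struct toy_stmt { const char *text; };
struct toy_ssa_name { unsigned version; struct toy_stmt *def_stmt; };

#define WORD_BITS (sizeof (unsigned long) * CHAR_BIT)

/* A fixed-size bit set; GCC's real bitmap is a sparse linked structure.  */
struct toy_bitmap { size_t nwords; unsigned long *words; };

static struct toy_bitmap *
toy_bitmap_alloc (size_t nbits)
{
  struct toy_bitmap *b = malloc (sizeof *b);
  if (!b)
    abort ();
  b->nwords = (nbits + WORD_BITS - 1) / WORD_BITS;
  b->words = calloc (b->nwords ? b->nwords : 1, sizeof *b->words);
  if (!b->words)
    abort ();
  return b;
}

static void
toy_bitmap_set (struct toy_bitmap *b, unsigned bit)
{
  assert (bit / WORD_BITS < b->nwords);
  b->words[bit / WORD_BITS] |= 1UL << (bit % WORD_BITS);
}

static int
toy_bitmap_bit_p (const struct toy_bitmap *b, unsigned bit)
{
  if (bit / WORD_BITS >= b->nwords)
    return 0;
  return (int) ((b->words[bit / WORD_BITS] >> (bit % WORD_BITS)) & 1);
}

static void
toy_bitmap_free (struct toy_bitmap *b)
{
  free (b->words);
  free (b);
}

/* Same shape as get_gimple_for_ssa_name after the patch: the bit only
   says "forward this definition"; the statement itself comes from the
   name, so no per-version pointer array is needed.  */
static struct toy_stmt *
toy_stmt_for_name (const struct toy_bitmap *values,
                   const struct toy_ssa_name *name)
{
  if (values && toy_bitmap_bit_p (values, name->version))
    return name->def_stmt;
  return NULL;
}
```

In this sketch a caller sets the bit for each replaceable version once and then queries through toy_stmt_for_name; a NULL set means that no replacement analysis was run, mirroring the "if (SA.values && ...)" guard in the patch.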
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-22 16:45 ` [RFA] " Michael Matz
` (2 preceding siblings ...)
2009-04-27 5:47 ` H.J. Lu
@ 2009-04-27 7:22 ` Hans-Peter Nilsson
2009-04-30 18:18 ` Steve Ellcey
4 siblings, 0 replies; 56+ messages in thread
From: Hans-Peter Nilsson @ 2009-04-27 7:22 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches
On Wed, 22 Apr 2009, Michael Matz wrote:
> * builtins.c (fold_builtin_next_arg): Handle SSA names.
<long expand-from-ssa CL elided>
On the off-chance that matz-at-gcc doesn't reach you, a change
in the range 146814:146820 (i.e. this patch) caused PR39927.
No good deed goes unpunished!
brgds, H-P
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-22 16:45 ` [RFA] " Michael Matz
2009-04-23 15:10 ` Andrew MacLeod
2009-04-24 14:32 ` Richard Guenther
@ 2009-04-27 5:47 ` H.J. Lu
2009-04-28 23:49 ` H.J. Lu
2009-04-27 7:22 ` Hans-Peter Nilsson
2009-04-30 18:18 ` Steve Ellcey
4 siblings, 1 reply; 56+ messages in thread
From: H.J. Lu @ 2009-04-27 5:47 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev
On Wed, Apr 22, 2009 at 9:42 AM, Michael Matz <matz@suse.de> wrote:
> On Wed, 22 Apr 2009, Michael Matz wrote:
>
>> I'll soon send a new version of the patch that fixes all problems and
>> testcases I encountered.
>
> Like so. This is the full patch, i.e. including the cleanups, but
> excluding the testsuite changes. It should incorporate all feedback.
> Compared to the last version it adds comments for new functions, fixes
> mudflap2, and generally some other minor problems showing when I started
> testing Ada and a bug reported by Andrey.
>
> This patch (plus testsuite changes) was bootstrapped with Ada on
> x86_64-linux. There are no testsuite regressions:
> FAIL: gcc.dg/tree-prof/bb-reorg.c compilation, -fprofile-use -D_PROFILE_USE
> FAIL: gcc.dg/tree-prof/pr34999.c compilation, -fprofile-use -D_PROFILE_USE
> FAIL: gcc.target/i386/avx-vmovntdq-256-1.c (test for excess errors)
> FAIL: gcc.target/i386/avx-vmovntpd-256-1.c (test for excess errors)
> FAIL: gcc.target/i386/avx-vmovntps-256-1.c (test for excess errors)
> FAIL: libmudflap.c++/pass41-frag.cxx execution test
> FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test
> FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test
> FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test
>
> All of these happen without the patch too (known bugs, old binutils, and
> pass41-frag never seems to work anyway).
>
> I'd like to ask for approval for the series.
>
>
> Ciao,
> Michael.
> --
> * builtins.c (fold_builtin_next_arg): Handle SSA names.
> * tree-ssa-copyrename.c (rename_ssa_copies): Don't iterate
> beyond num_ssa_names, use ssa_name() directly.
> * tree-ssa-ter.c (free_temp_expr_table): Likewise.
> * tree-ssa-coalesce.c (create_outofssa_var_map): Likewise,
> mark only useful SSA names.
> (compare_pairs): Swap cost comparison.
> (coalesce_ssa_name): Don't use change_partition_var.
> * tree-nrv.c (struct nrv_data): Add modified member.
> (finalize_nrv_r): Set it.
> (tree_nrv): Use it to update statements.
> (pass_nrv): Require PROP_ssa.
> * tree-mudflap.c (create_referenced_var): New static helper.
> (mf_decl_cache_locals, mf_build_check_statement_for): Use it.
> (pass_mudflap_2): Require PROP_ssa, run ssa update at finish.
> * alias.c (find_base_decl): Handle SSA names.
> * emit-rtl.c (set_reg_attrs_for_parm): Make non-static.
> (component_ref_for_mem_expr): Don't leak SSA names into RTL.
> * rtl.h (set_reg_attrs_for_parm): Declare.
> * tree-optimize.c (pass_cleanup_cfg_post_optimizing): Rename
> to "optimized", remove unused locals at finish.
> (execute_free_datastructures): Make global, call
> delete_tree_cfg_annotations.
> (execute_free_cfg_annotations): Don't call
> delete_tree_cfg_annotations.
>
> * ssaexpand.h: New file.
> * expr.c (toplevel): Include ssaexpand.h.
> (expand_assignment): Handle SSA names the same as register
> variables.
> (expand_expr_real_1): Expand SSA names.
> * cfgexpand.c (toplevel): Include ssaexpand.h.
> (SA): New global variable.
> (gimple_cond_pred_to_tree): Fold TERed comparisons into predicates.
> (SSAVAR): New macro.
> (set_rtl): New helper function.
> (add_stack_var): Deal with SSA names, use set_rtl.
> (expand_one_stack_var_at): Likewise.
> (expand_one_stack_var): Deal with SSA names.
> (stack_var_size_cmp): Use code (SSA_NAME / DECL) as tie breaker
> before unique numbers.
> (expand_stack_vars): Use set_rtl.
> (expand_one_var): Accept SSA names, add asserts for them, feed them
> to above subroutines.
> (expand_used_vars): Expand all partitions (without default defs),
> then only the local decls (ignoring those expanded already).
> (expand_gimple_cond): Remove edges when jumpif() expands an
> unconditional jump.
> (expand_gimple_basic_block): Don't clear EDGE_EXECUTABLE here,
> or remove abnormal edges. Ignore insns setting the LHS of a TERed
> SSA name.
> (gimple_expand_cfg): Call into rewrite_out_of_ssa, initialize
> members of SA; deal with PARM_DECL partitions here; expand
> all PHI nodes, free tree datastructures and SA. Commit instructions
> on edges, clear EDGE_EXECUTABLE and remove abnormal edges here.
> (pass_expand): Require and destroy PROP_ssa, verify SSA form, flow
> info and statements at start, collect garbage at finish.
> * tree-ssa-live.h (struct _var_map): Remove partition_to_var member.
> (VAR_ANN_PARTITION): Remove.
> (change_partition_var): Don't declare.
> (partition_to_var): Always return SSA names.
> (var_to_partition): Only accept SSA names.
> (register_ssa_partition): Only check argument.
> * tree-ssa-live.c (init_var_map): Don't allocate partition_to_var
> member.
> (delete_var_map): Don't free it.
> (var_union): Only accept SSA names, simplify.
> (partition_view_init): Mark only useful SSA names as used.
> (partition_view_fini): Only deal with SSA names.
> (change_partition_var): Remove.
> (dump_var_map): Use ssa_name instead of partition_to_var member.
> * tree-ssa.c (delete_tree_ssa): Don't remove PHI nodes on RTL
> basic blocks.
> * tree-outof-ssa.c (toplevel): Include ssaexpand.h and expr.h.
> (struct _elim_graph): New member const_dests; nodes member vector of
> ints.
> (set_location_for_edge): New static helper.
> (create_temp): Remove.
> (insert_partition_copy_on_edge, insert_part_to_rtx_on_edge,
> insert_value_copy_on_edge, insert_rtx_to_part_on_edge): New
> functions.
> (new_elim_graph): Allocate const_dests member.
> (clean_elim_graph): Truncate const_dests member.
> (delete_elim_graph): Free const_dests member.
> (elim_graph_size): Adapt to new type of nodes member.
> (elim_graph_add_node): Likewise.
> (eliminate_name): Likewise.
> (eliminate_build): Don't take basic block argument, deal only with
> partition numbers, not variables.
> (get_temp_reg): New static helper.
> (elim_create): Use it, deal with RTL temporaries instead of trees.
> (eliminate_phi): Adjust all calls to new signature.
> (assign_vars, replace_use_variable, replace_def_variable): Remove.
> (rewrite_trees): Only do checking.
> (edge_leader, stmt_list, leader_has_match, leader_match): Remove.
> (same_stmt_list_p, identical_copies_p, identical_stmt_lists_p,
> init_analyze_edges_for_bb, fini_analyze_edges_for_bb,
> contains_tree_r, MAX_STMTS_IN_LATCH,
> process_single_block_loop_latch, analyze_edges_for_bb,
> perform_edge_inserts): Remove.
> (expand_phi_nodes): New global function.
> (remove_ssa_form): Take ssaexpand parameter. Don't call removed
> functions, initialize new parameter, remember partitions having a
> default def.
> (finish_out_of_ssa): New global function.
> (rewrite_out_of_ssa): Make global. Adjust call to remove_ssa_form,
> don't reset in_ssa_p here.
> (pass_del_ssa): Remove.
> * tree-flow.h (struct var_ann_d): Remove out_of_ssa_tag and
> partition members.
> (execute_free_datastructures): Declare.
> * Makefile.in (SSAEXPAND_H): New variable.
> (tree-outof-ssa.o, expr.o, cfgexpand.o): Depend on SSAEXPAND_H.
> * basic-block.h (commit_one_edge_insertion): Declare.
> * passes.c (init_optimization_passes): Move pass_nrv and
> pass_mudflap2 before pass_cleanup_cfg_post_optimizing, remove
> pass_del_ssa, pass_free_datastructures, pass_free_cfg_annotations.
> * cfgrtl.c (commit_one_edge_insertion): Make global, don't declare.
> (redirect_branch_edge): Deal with super block when expanding, split
> out jump patching itself into ...
> (patch_jump_insn): ... here, new static helper.
>
This patch caused:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39922
You may need a 32bit host to see it since I didn't see it on
Linux/x86-64 with -m32.
--
H.J.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-26 21:17 ` Richard Guenther
@ 2009-04-26 22:21 ` Michael Matz
0 siblings, 0 replies; 56+ messages in thread
From: Michael Matz @ 2009-04-26 22:21 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches
Hi,
On Sun, 26 Apr 2009, Richard Guenther wrote:
> > I had one in testing also removing the value_handle field.  It's
> > unused since some time too (it's moved over to the ssa_name structure
> > directly).
>
> Pre-approved if it passes bootstrap.
Committed this as r146820.
Ciao,
Michael.
--
* tree-flow.h (tree_ann_common_d): Remove aux and value_handle members.
Index: tree-flow.h
===================================================================
--- tree-flow.h (Revision 146817)
+++ tree-flow.h (Arbeitskopie)
@@ -136,13 +136,6 @@ struct GTY(()) tree_ann_common_d {
expansion (see gimple_to_tree). */
int rn;
- /* Auxiliary info specific to a pass. At all times, this
- should either point to valid data or be NULL. */
- PTR GTY ((skip (""))) aux;
-
- /* The value handle for this expression. Used by GVN-PRE. */
- tree GTY((skip)) value_handle;
-
/* Pointer to original GIMPLE statement. Used during RTL expansion
(see gimple_to_tree). */
gimple stmt;
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-26 21:42 ` Michael Matz
@ 2009-04-26 22:15 ` Michael Matz
0 siblings, 0 replies; 56+ messages in thread
From: Michael Matz @ 2009-04-26 22:15 UTC (permalink / raw)
To: gcc-patches
Hi,
On Sun, 26 Apr 2009, Michael Matz wrote:
> Like so. Will commit if regstrapping passes.
Which it did on x86_64-linux --> r146819.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-26 20:21 ` Michael Matz
2009-04-26 20:34 ` Richard Guenther
@ 2009-04-26 21:42 ` Michael Matz
2009-04-26 22:15 ` Michael Matz
2009-04-27 12:34 ` Michael Matz
2 siblings, 1 reply; 56+ messages in thread
From: Michael Matz @ 2009-04-26 21:42 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches
Hi,
On Sun, 26 Apr 2009, Michael Matz wrote:
> > It looks like the pass referencing this is now dead. Please remove
> > this function and the pass structure.
>
> It is, but done as followup.
>
> > pass_mark_used_blocks is no longer needed now as you remove unused
> > locals and that marks blocks used? Please remove this pass (and its
> > code).
>
> Also followup.
Like so. Will commit if regstrapping passes.
Ciao,
Michael.
--
* tree-pass.h (pass_del_ssa, pass_mark_used_blocks,
pass_free_cfg_annotations, pass_free_datastructures): Remove decls.
* gimple-low.c (mark_blocks_with_used_vars, mark_used_blocks,
pass_mark_used_blocks): Remove.
* tree-optimize.c (pass_free_datastructures,
execute_free_cfg_annotations, pass_free_cfg_annotations): Remove.
* passes.c (init_optimization_passes): Don't call
pass_mark_used_blocks, remove dead code.
Index: tree-pass.h
===================================================================
--- tree-pass.h (Revision 146806)
+++ tree-pass.h (Arbeitskopie)
@@ -344,7 +344,6 @@ extern struct gimple_opt_pass pass_ch;
extern struct gimple_opt_pass pass_ccp;
extern struct gimple_opt_pass pass_phi_only_cprop;
extern struct gimple_opt_pass pass_build_ssa;
-extern struct gimple_opt_pass pass_del_ssa;
extern struct gimple_opt_pass pass_build_alias;
extern struct gimple_opt_pass pass_dominator;
extern struct gimple_opt_pass pass_dce;
@@ -380,7 +379,6 @@ extern struct gimple_opt_pass pass_phipr
extern struct gimple_opt_pass pass_tree_ifcombine;
extern struct gimple_opt_pass pass_dse;
extern struct gimple_opt_pass pass_nrv;
-extern struct gimple_opt_pass pass_mark_used_blocks;
extern struct gimple_opt_pass pass_rename_ssa_copies;
extern struct gimple_opt_pass pass_rest_of_compilation;
extern struct gimple_opt_pass pass_sink_code;
@@ -414,8 +412,6 @@ extern struct simple_ipa_opt_pass pass_i
extern struct gimple_opt_pass pass_all_optimizations;
extern struct gimple_opt_pass pass_cleanup_cfg_post_optimizing;
-extern struct gimple_opt_pass pass_free_cfg_annotations;
-extern struct gimple_opt_pass pass_free_datastructures;
extern struct gimple_opt_pass pass_init_datastructures;
extern struct gimple_opt_pass pass_fixup_cfg;
Index: gimple-low.c
===================================================================
--- gimple-low.c (Revision 146806)
+++ gimple-low.c (Arbeitskopie)
@@ -900,61 +900,3 @@ record_vars (tree vars)
{
record_vars_into (vars, current_function_decl);
}
-
-
-/* Mark BLOCK used if it has a used variable in it, then recurse over its
- subblocks. */
-
-static void
-mark_blocks_with_used_vars (tree block)
-{
- tree var;
- tree subblock;
-
- if (!TREE_USED (block))
- {
- for (var = BLOCK_VARS (block);
- var;
- var = TREE_CHAIN (var))
- {
- if (TREE_USED (var))
- {
- TREE_USED (block) = true;
- break;
- }
- }
- }
- for (subblock = BLOCK_SUBBLOCKS (block);
- subblock;
- subblock = BLOCK_CHAIN (subblock))
- mark_blocks_with_used_vars (subblock);
-}
-
-/* Mark the used attribute on blocks correctly. */
-
-static unsigned int
-mark_used_blocks (void)
-{
- mark_blocks_with_used_vars (DECL_INITIAL (current_function_decl));
- return 0;
-}
-
-
-struct gimple_opt_pass pass_mark_used_blocks =
-{
- {
- GIMPLE_PASS,
- "blocks", /* name */
- NULL, /* gate */
- mark_used_blocks, /* execute */
- NULL, /* sub */
- NULL, /* next */
- 0, /* static_pass_number */
- TV_NONE, /* tv_id */
- 0, /* properties_required */
- 0, /* properties_provided */
- 0, /* properties_destroyed */
- 0, /* todo_flags_start */
- TODO_dump_func /* todo_flags_finish */
- }
-};
Index: tree-optimize.c
===================================================================
--- tree-optimize.c (Revision 146817)
+++ tree-optimize.c (Arbeitskopie)
@@ -236,51 +236,6 @@ execute_free_datastructures (void)
return 0;
}
-struct gimple_opt_pass pass_free_datastructures =
-{
- {
- GIMPLE_PASS,
- NULL, /* name */
- NULL, /* gate */
- execute_free_datastructures, /* execute */
- NULL, /* sub */
- NULL, /* next */
- 0, /* static_pass_number */
- TV_NONE, /* tv_id */
- PROP_cfg, /* properties_required */
- 0, /* properties_provided */
- 0, /* properties_destroyed */
- 0, /* todo_flags_start */
- 0 /* todo_flags_finish */
- }
-};
-/* Pass: free cfg annotations. */
-
-static unsigned int
-execute_free_cfg_annotations (void)
-{
- return 0;
-}
-
-struct gimple_opt_pass pass_free_cfg_annotations =
-{
- {
- GIMPLE_PASS,
- NULL, /* name */
- NULL, /* gate */
- execute_free_cfg_annotations, /* execute */
- NULL, /* sub */
- NULL, /* next */
- 0, /* static_pass_number */
- TV_NONE, /* tv_id */
- PROP_cfg, /* properties_required */
- 0, /* properties_provided */
- 0, /* properties_destroyed */
- 0, /* todo_flags_start */
- 0 /* todo_flags_finish */
- }
-};
-
/* Pass: fixup_cfg. IPA passes, compilation of earlier functions or inlining
might have changed some properties, such as marked functions nothrow.
Remove redundant edges and basic blocks, and create new ones if necessary.
Index: passes.c
===================================================================
--- passes.c (Revision 146817)
+++ passes.c (Arbeitskopie)
@@ -709,13 +709,9 @@ init_optimization_passes (void)
NEXT_PASS (pass_cleanup_eh);
NEXT_PASS (pass_nrv);
NEXT_PASS (pass_mudflap_2);
- NEXT_PASS (pass_mark_used_blocks);
NEXT_PASS (pass_cleanup_cfg_post_optimizing);
NEXT_PASS (pass_warn_function_noreturn);
-/* NEXT_PASS (pass_del_ssa);
- NEXT_PASS (pass_free_datastructures);
- NEXT_PASS (pass_free_cfg_annotations);*/
NEXT_PASS (pass_expand);
NEXT_PASS (pass_rest_of_compilation);
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-26 21:15 ` Michael Matz
@ 2009-04-26 21:17 ` Richard Guenther
2009-04-26 22:21 ` Michael Matz
0 siblings, 1 reply; 56+ messages in thread
From: Richard Guenther @ 2009-04-26 21:17 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev
On Sun, Apr 26, 2009 at 11:13 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Sun, 26 Apr 2009, Richard Guenther wrote:
>
>> > I'd rather remove that field and the fields in the var annotation.
>>
>> Indeed. I'm testing a patch to remove the aux field.
>
> I had one in testing also removing the value_handle field. It's unused
> since some time too (it's moved over to the ssa_name structure directly).
Pre-approved if it passes bootstrap.
Thanks,
Richard.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-26 21:14 ` Richard Guenther
@ 2009-04-26 21:15 ` Michael Matz
2009-04-26 21:17 ` Richard Guenther
0 siblings, 1 reply; 56+ messages in thread
From: Michael Matz @ 2009-04-26 21:15 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev
Hi,
On Sun, 26 Apr 2009, Richard Guenther wrote:
> > I'd rather remove that field and the fields in the var annotation.
>
> Indeed. I'm testing a patch to remove the aux field.
I had one in testing also removing the value_handle field. It's unused
since some time too (it's moved over to the ssa_name structure directly).
> The partial transition to tuples (the hack going back to GENERIC for
> expansion) also introduced two members, stmt and rn. Fixing that
> partial transition should shrink tree_ann_common_d further.
Yes.
Ciao,
Michael.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-26 20:53 ` Michael Matz
@ 2009-04-26 21:14 ` Richard Guenther
2009-04-26 21:15 ` Michael Matz
0 siblings, 1 reply; 56+ messages in thread
From: Richard Guenther @ 2009-04-26 21:14 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev
On Sun, Apr 26, 2009 at 10:47 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Sun, 26 Apr 2009, Richard Guenther wrote:
>
>> > into the var annotation (to be able to read it out again when the same
>> > basevar is seen for a different partition). But this whole info is
>> > strictly local to the above function, so it doesn't need to live in the
>> > annotation. I could very well implement this as an array indexed by
>> > DECL_UID. The UIDs shouldn't become exceptionally large, so that seems
>>
>> Hm. DECL_UIDs are sparse (and global), it would be a bad idea to index
>> an array with it.
>
> Those few 100k entries ... nobody will notice. Hmm, well, maybe someone
> does :-)
>
>> Is var_ann->common.aux already used by out-of-SSA?
>
> Oh joy. This field is unused by out-of-SSA. Which makes sense, as -
> drumroll - nothing (!) at all uses it (in the sense of deleting it doesn't
> break compile).
>
> I'd rather remove that field and the fields in the var annotation.
Indeed. I'm testing a patch to remove the aux field. The partial
transition to tuples (the hack going back to GENERIC for expansion)
also introduced two members, stmt and rn. Fixing that partial
transition should shrink tree_ann_common_d further.
In the end we would like to get rid of tree annotations completely
of course.
Richard.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-26 20:34 ` Richard Guenther
@ 2009-04-26 20:53 ` Michael Matz
2009-04-26 21:14 ` Richard Guenther
0 siblings, 1 reply; 56+ messages in thread
From: Michael Matz @ 2009-04-26 20:53 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev
Hi,
On Sun, 26 Apr 2009, Richard Guenther wrote:
> > into the var annotation (to be able to read it out again when the same
> > basevar is seen for a different partition).  But this whole info is
> > strictly local to the above function, so it doesn't need to live in the
> > annotation.  I could very well implement this as an array indexed by
> > DECL_UID.  The UIDs shouldn't become exceptionally large, so that seems
>
> Hm. DECL_UIDs are sparse (and global), it would be a bad idea to index
> an array with it.
Those few 100k entries ... nobody will notice. Hmm, well, maybe someone
does :-)
> Is var_ann->common.aux already used by out-of-SSA?
Oh joy. This field is unused by out-of-SSA. Which makes sense, as -
drumroll - nothing (!) at all uses it (in the sense of deleting it doesn't
break compile).
I'd rather remove that field and the fields in the var annotation.
> > feasible.  I wouldn't want to use a hash-table for fear of slowing down
> > var_map_base_init().
>
> I think it would be not too bad ;)
I'll experiment a bit.
Ciao,
Michael.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-26 20:21 ` Michael Matz
@ 2009-04-26 20:34 ` Richard Guenther
2009-04-26 20:53 ` Michael Matz
2009-04-26 21:42 ` Michael Matz
2009-04-27 12:34 ` Michael Matz
2 siblings, 1 reply; 56+ messages in thread
From: Richard Guenther @ 2009-04-26 20:34 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev
On Sun, Apr 26, 2009 at 10:19 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Fri, 24 Apr 2009, Richard Guenther wrote:
>
>> *************** struct var_ann_d GTY(())
>> *** 234,243 ****
>> /* Used by var_map for the base index of ssa base variables. */
>> unsigned base_index;
>>
>> hmm, so what is needed to remove base_index as well?
>
> Rewriting var_map_base_init() to not use var annotations. SSA name
> coalescing and conflict building wants to have a DECL->dense-integer
> mapping for various reasons (two-stage approach for building
> conflicts).
>
> Currently that mapping is generated by storing the next available index
> into the var annotation (to be able to read it out again when the same
> basevar is seen for a different partition). But this whole info is
> strictly local to the above function, so it doesn't need to live in the
> annotation. I could very well implement this as an array indexed by
> DECL_UID. The UIDs shouldn't become exceptionally large, so that seems
Hm. DECL_UIDs are sparse (and global), it would be a bad idea to index
an array with it. Is var_ann->common.aux already used by out-of-SSA?
(also common.rn and common.value_handle are "local" values, but they
should go as well in the end).
> feasible. I wouldn't want to use a hash-table for fear of slowing down
> var_map_base_init().
I think it would be not too bad ;)
Richard.
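
The DECL->dense-integer mapping discussed above can be sketched as a small hash table, the alternative to an array indexed by the sparse, global DECL_UID. This is a minimal illustration under stated assumptions, not what var_map_base_init() does: the dense_map type, its functions and the multiplicative hash are all hypothetical, and plain unsigned keys stand in for DECL_UIDs. Each key seen for the first time gets the next free index (0, 1, 2, ...); a key seen again gets its old index back.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical open-addressing table: sparse key -> dense index.  */
struct dense_map
{
  size_t capacity;   /* always a power of two */
  size_t count;      /* number of keys stored == next dense index */
  unsigned *keys;
  int *index;        /* -1 marks an empty slot */
};

static void
dense_map_alloc (struct dense_map *m, size_t capacity)
{
  size_t i;
  m->capacity = capacity;
  m->count = 0;
  m->keys = malloc (capacity * sizeof *m->keys);
  m->index = malloc (capacity * sizeof *m->index);
  if (!m->keys || !m->index)
    abort ();
  for (i = 0; i < capacity; i++)
    m->index[i] = -1;
}

static void
dense_map_init (struct dense_map *m)
{
  dense_map_alloc (m, 16);
}

static void
dense_map_free (struct dense_map *m)
{
  free (m->keys);
  free (m->index);
}

/* First probe position for UID (multiplicative hash, linear probing).  */
static size_t
dense_map_slot (const struct dense_map *m, unsigned uid)
{
  return (size_t) (uid * 2654435761u) & (m->capacity - 1);
}

/* Double the table, keeping every key's already assigned index.  */
static void
dense_map_grow (struct dense_map *m)
{
  struct dense_map bigger;
  size_t i, slot;
  dense_map_alloc (&bigger, m->capacity * 2);
  for (i = 0; i < m->capacity; i++)
    if (m->index[i] != -1)
      {
        slot = dense_map_slot (&bigger, m->keys[i]);
        while (bigger.index[slot] != -1)
          slot = (slot + 1) & (bigger.capacity - 1);
        bigger.keys[slot] = m->keys[i];
        bigger.index[slot] = m->index[i];
      }
  bigger.count = m->count;
  dense_map_free (m);
  *m = bigger;
}

/* Return the dense index of UID, assigning the next free one if new.  */
static int
dense_map_get_or_add (struct dense_map *m, unsigned uid)
{
  size_t slot;
  /* Keep the load factor at or below 1/2 so probing always terminates.  */
  if ((m->count + 1) * 2 > m->capacity)
    dense_map_grow (m);
  slot = dense_map_slot (m, uid);
  while (m->index[slot] != -1)
    {
      if (m->keys[slot] == uid)
        return m->index[slot];
      slot = (slot + 1) & (m->capacity - 1);
    }
  m->keys[slot] = uid;
  m->index[slot] = (int) m->count;
  return (int) m->count++;
}
```

The memory used grows with the number of distinct keys actually seen rather than with the largest UID, which is the trade-off against the plain array: an extra hash and probe per lookup in exchange for not allocating one slot per possible UID.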
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-23 15:10 ` Andrew MacLeod
2009-04-24 9:42 ` Richard Guenther
@ 2009-04-26 20:27 ` Michael Matz
2010-01-19 15:48 ` H.J. Lu
1 sibling, 1 reply; 56+ messages in thread
From: Michael Matz @ 2009-04-26 20:27 UTC (permalink / raw)
To: Andrew MacLeod; +Cc: gcc-patches, Andrey Belevantsev
Hi,
On Thu, 23 Apr 2009, Andrew MacLeod wrote:
> Looks good to me.
It's now in.
> I do think this should be fixed in commit_edge_insertions and the common
> code factored out and called from here, but that can be done separately.
> Then commit_one_edge_insertion() could return to being a static function
> as well.
Yes, I plan to do that. There are also some other cleanups on the plate.
> I'm guessing current_ir_type() is not IR_RTL_CFGLAYOUT, or you could just
> call commit_edge_insertions directly...
No, unfortunately it's not.
Ciao,
Michael.
^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFA] expand from SSA form (1/2)
2009-04-24 14:46 ` Richard Guenther
@ 2009-04-26 20:21 ` Michael Matz
2009-04-26 20:34 ` Richard Guenther
` (2 more replies)
0 siblings, 3 replies; 56+ messages in thread
From: Michael Matz @ 2009-04-26 20:21 UTC (permalink / raw)
To: Richard Guenther; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev
Hi,
On Fri, 24 Apr 2009, Richard Guenther wrote:
> > !   map = init_var_map (num_ssa_names + 1);
> > --- 291,297 ----
> > !   map = init_var_map (num_ssa_names);
>
> This piece is ok as obvious. Please commit it separately.
This and the other two instances of num_ssa_names confusions committed as
r146815.
The rest of the patch (with adjustments except as noted below)
committed as r146817. I had to apply the below additional change to be
able to build Ada, as we can't redirect (and hence split) EH edges in RTL
land. So I have to pre-split them when queueing insns.
Reregstrapped on x86_64-linux.
I haven't dealt with these comments:
> > +   if (SA.values
> > +       && TREE_CODE (lhs) == SSA_NAME
> > +       && SA.values[SSA_NAME_VERSION (lhs)])
> > +     lhs = gimple_assign_rhs_to_tree (SA.values[SSA_NAME_VERSION (lhs)]);
>
> Do we really need the SA.values array here? It seems that
> a bitmap would be enough, as the definition should be still
> reachable via SSA_NAME_DEF_STMT (lhs). (Thanks Paolo
> for noticing this)
This should indeed be possible, but I want to do this as cleanup next
week.
> > + #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>
> This should be in some public place, ideally with a more sound name.
> Any ideas?
No bright idea. STRIP_SSA perhaps? If someone has a better name or
agrees to that one, I'd move it to tree.h and make a pass over the
compiler to substitute the above pattern with the macro.
> >   execute_free_cfg_annotations (void)
> >   {
> > -   /* And get rid of annotations we no longer need.  */
> > -   delete_tree_cfg_annotations ();
> > -
>
> It looks like the pass referencing this is now dead. Please remove
> this function and the pass structure.
It is, but done as followup.
> *************** struct var_ann_d GTY(())
> *** 234,243 ****
> /* Used by var_map for the base index of ssa base variables. */
> unsigned base_index;
>
> hmm, so what is needed to remove base_index as well?
Rewriting var_map_base_init() to not use var annotations. SSA name
coalescing and conflict building wants to have a DECL->dense-integer
mapping for various reasons (two-stage approach for building
conflicts).
Currently that mapping is generated by storing the next available index
into the var annotation (to be able to read it out again when the same
basevar is seen for a different partition). But this whole info is
strictly local to the above function, so it doesn't need to live in the
annotation. I could very well implement this as an array indexed by
DECL_UID. The UIDs shouldn't become exceptionally large, so that seems
feasible. I wouldn't want to use a hash-table for fear of slowing down
var_map_base_init().
> + NEXT_PASS (pass_mudflap_2);
> NEXT_PASS (pass_mark_used_blocks);
>
> pass_mark_used_blocks is no longer needed now as you remove unused
> locals and that marks blocks used? Please remove this pass (and its
> code).
Also followup.
Ciao,
Michael.
* Re: [RFA] expand from SSA form (1/2)
2009-04-24 14:32 ` Richard Guenther
@ 2009-04-24 14:46 ` Richard Guenther
2009-04-26 20:21 ` Michael Matz
0 siblings, 1 reply; 56+ messages in thread
From: Richard Guenther @ 2009-04-24 14:46 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev
On Fri, Apr 24, 2009 at 4:21 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Wed, Apr 22, 2009 at 6:42 PM, Michael Matz <matz@suse.de> wrote:
>> On Wed, 22 Apr 2009, Michael Matz wrote:
>>
>>> I'll soon send a new version of the patch that fixes all problems and
>>> testcases I encountered.
>>
>> Like so. This is the full patch, i.e. including the cleanups, but
>> excluding the testsuite changes. It should incorporate all feedback.
>> Compared to the last version it adds comments for new functions, fixes
>> mudflap2, and generally some other minor problems showing when I started
>> testing Ada and a bug reported by Andrey.
>>
>> This patch (plus testsuite changes) was bootstrapped with Ada on
>> x86_64-linux. There are no testsuite regressions; the remaining failures are:
>> FAIL: gcc.dg/tree-prof/bb-reorg.c compilation, -fprofile-use -D_PROFILE_USE
>> FAIL: gcc.dg/tree-prof/pr34999.c compilation, -fprofile-use -D_PROFILE_USE
>> FAIL: gcc.target/i386/avx-vmovntdq-256-1.c (test for excess errors)
>> FAIL: gcc.target/i386/avx-vmovntpd-256-1.c (test for excess errors)
>> FAIL: gcc.target/i386/avx-vmovntps-256-1.c (test for excess errors)
>> FAIL: libmudflap.c++/pass41-frag.cxx execution test
>> FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test
>> FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test
>> FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test
>>
>> All of these happen without the patch too (known bugs, old binutils, and
>> pass41-frag never seems to work anyway).
>>
>> I'd like to ask for approval for the series.
>>
>>
>> Ciao,
>> Michael.
>> --
>> * builtins.c (fold_builtin_next_arg): Handle SSA names.
>> * tree-ssa-copyrename.c (rename_ssa_copies): Don't iterate
>> beyond num_ssa_names, use ssa_name() directly.
>> * tree-ssa-ter.c (free_temp_expr_table): Likewise.
>> * tree-ssa-coalesce.c (create_outofssa_var_map): Likewise,
>> mark only useful SSA names.
>> (compare_pairs): Swap cost comparison.
>> (coalesce_ssa_name): Don't use change_partition_var.
>> * tree-nrv.c (struct nrv_data): Add modified member.
>> (finalize_nrv_r): Set it.
>> (tree_nrv): Use it to update statements.
>> (pass_nrv): Require PROP_ssa.
>> * tree-mudflap.c (create_referenced_var): New static helper.
>> (mf_decl_cache_locals, mf_build_check_statement_for): Use it.
>> (pass_mudflap_2): Require PROP_ssa, run ssa update at finish.
>> * alias.c (find_base_decl): Handle SSA names.
>> * emit-rtl.c (set_reg_attrs_for_parm): Make non-static.
>> (component_ref_for_mem_expr): Don't leak SSA names into RTL.
>> * rtl.h (set_reg_attrs_for_parm): Declare.
>> * tree-optimize.c (pass_cleanup_cfg_post_optimizing): Rename
>> to "optimized", remove unused locals at finish.
>> (execute_free_datastructures): Make global, call
>> delete_tree_cfg_annotations.
>> (execute_free_cfg_annotations): Don't call
>> delete_tree_cfg_annotations.
>>
>> * ssaexpand.h: New file.
>> * expr.c (toplevel): Include ssaexpand.h.
>> (expand_assignment): Handle SSA names the same as register
>> variables.
>> (expand_expr_real_1): Expand SSA names.
>> * cfgexpand.c (toplevel): Include ssaexpand.h.
>> (SA): New global variable.
>> (gimple_cond_pred_to_tree): Fold TERed comparisons into predicates.
>> (SSAVAR): New macro.
>> (set_rtl): New helper function.
>> (add_stack_var): Deal with SSA names, use set_rtl.
>> (expand_one_stack_var_at): Likewise.
>> (expand_one_stack_var): Deal with SSA names.
>> (stack_var_size_cmp): Use code (SSA_NAME / DECL) as tie breaker
>> before unique numbers.
>> (expand_stack_vars): Use set_rtl.
>> (expand_one_var): Accept SSA names, add asserts for them, feed them
>> to above subroutines.
>> (expand_used_vars): Expand all partitions (without default defs),
>> then only the local decls (ignoring those expanded already).
>> (expand_gimple_cond): Remove edges when jumpif() expands an
>> unconditional jump.
>> (expand_gimple_basic_block): Don't clear EDGE_EXECUTABLE here,
>> or remove abnormal edges. Ignore insns setting the LHS of a TERed
>> SSA name.
>> (gimple_expand_cfg): Call into rewrite_out_of_ssa, initialize
>> members of SA; deal with PARM_DECL partitions here; expand
>> all PHI nodes, free tree datastructures and SA. Commit instructions
>> on edges, clear EDGE_EXECUTABLE and remove abnormal edges here.
>> (pass_expand): Require and destroy PROP_ssa, verify SSA form, flow
>> info and statements at start, collect garbage at finish.
>> * tree-ssa-live.h (struct _var_map): Remove partition_to_var member.
>> (VAR_ANN_PARTITION): Remove.
>> (change_partition_var): Don't declare.
>> (partition_to_var): Always return SSA names.
>> (var_to_partition): Only accept SSA names.
>> (register_ssa_partition): Only check argument.
>> * tree-ssa-live.c (init_var_map): Don't allocate partition_to_var
>> member.
>> (delete_var_map): Don't free it.
>> (var_union): Only accept SSA names, simplify.
>> (partition_view_init): Mark only useful SSA names as used.
>> (partition_view_fini): Only deal with SSA names.
>> (change_partition_var): Remove.
>> (dump_var_map): Use ssa_name instead of partition_to_var member.
>> * tree-ssa.c (delete_tree_ssa): Don't remove PHI nodes on RTL
>> basic blocks.
>> * tree-outof-ssa.c (toplevel): Include ssaexpand.h and expr.h.
>> (struct _elim_graph): New member const_dests; nodes member vector of
>> ints.
>> (set_location_for_edge): New static helper.
>> (create_temp): Remove.
>> (insert_partition_copy_on_edge, insert_part_to_rtx_on_edge,
>> insert_value_copy_on_edge, insert_rtx_to_part_on_edge): New
>> functions.
>> (new_elim_graph): Allocate const_dests member.
>> (clean_elim_graph): Truncate const_dests member.
>> (delete_elim_graph): Free const_dests member.
>> (elim_graph_size): Adapt to new type of nodes member.
>> (elim_graph_add_node): Likewise.
>> (eliminate_name): Likewise.
>> (eliminate_build): Don't take basic block argument, deal only with
>> partition numbers, not variables.
>> (get_temp_reg): New static helper.
>> (elim_create): Use it, deal with RTL temporaries instead of trees.
>> (eliminate_phi): Adjust all calls to new signature.
>> (assign_vars, replace_use_variable, replace_def_variable): Remove.
>> (rewrite_trees): Only do checking.
>> (edge_leader, stmt_list, leader_has_match, leader_match): Remove.
>> (same_stmt_list_p, identical_copies_p, identical_stmt_lists_p,
>> init_analyze_edges_for_bb, fini_analyze_edges_for_bb,
>> contains_tree_r, MAX_STMTS_IN_LATCH,
>> process_single_block_loop_latch, analyze_edges_for_bb,
>> perform_edge_inserts): Remove.
>> (expand_phi_nodes): New global function.
>> (remove_ssa_form): Take ssaexpand parameter. Don't call removed
>> functions, initialize new parameter, remember partitions having a
>> default def.
>> (finish_out_of_ssa): New global function.
>> (rewrite_out_of_ssa): Make global. Adjust call to remove_ssa_form,
>> don't reset in_ssa_p here.
>> (pass_del_ssa): Remove.
>> * tree-flow.h (struct var_ann_d): Remove out_of_ssa_tag and
>> partition members.
>> (execute_free_datastructures): Declare.
>> * Makefile.in (SSAEXPAND_H): New variable.
>> (tree-outof-ssa.o, expr.o, cfgexpand.o): Depend on SSAEXPAND_H.
>> * basic-block.h (commit_one_edge_insertion): Declare.
>> * passes.c (init_optimization_passes): Move pass_nrv and
>> pass_mudflap2 before pass_cleanup_cfg_post_optimizing, remove
>> pass_del_ssa, pass_free_datastructures, pass_free_cfg_annotations.
>> * cfgrtl.c (commit_one_edge_insertion): Make global, don't declare.
>> (redirect_branch_edge): Deal with super block when expanding, split
>> out jump patching itself into ...
>> (patch_jump_insn): ... here, new static helper.
>
> Some comments inline...
>
>> Index: tree-ssa-copyrename.c
>> ===================================================================
>> *** tree-ssa-copyrename.c (revision 146576)
>> --- tree-ssa-copyrename.c (working copy)
>> *************** rename_ssa_copies (void)
>> *** 291,297 ****
>> else
>> debug = NULL;
>>
>> ! map = init_var_map (num_ssa_names + 1);
>>
>> FOR_EACH_BB (bb)
>> {
>> --- 291,297 ----
>> else
>> debug = NULL;
>>
>> ! map = init_var_map (num_ssa_names);
>>
>> FOR_EACH_BB (bb)
>> {
>> *************** rename_ssa_copies (void)
>> *** 339,350 ****
>> /* Now one more pass to make all elements of a partition share the same
>> root variable. */
>>
>> ! for (x = 1; x <= num_ssa_names; x++)
>> {
>> part_var = partition_to_var (map, x);
>> if (!part_var)
>> continue;
>> ! var = map->partition_to_var[x];
>> if (debug)
>> {
>> if (SSA_NAME_VAR (var) != SSA_NAME_VAR (part_var))
>> --- 339,350 ----
>> /* Now one more pass to make all elements of a partition share the same
>> root variable. */
>>
>> ! for (x = 1; x < num_ssa_names; x++)
>> {
>> part_var = partition_to_var (map, x);
>> if (!part_var)
>> continue;
>> ! var = ssa_name (x);
>> if (debug)
>> {
>> if (SSA_NAME_VAR (var) != SSA_NAME_VAR (part_var))
>
> This piece is ok as obvious. Please commit it separately.
>
>
>> Index: cfgexpand.c
>> ===================================================================
>> *** cfgexpand.c (revision 146576)
>> --- cfgexpand.c (working copy)
>> *************** along with GCC; see the file COPYING3.
>> *** 42,49 ****
>> --- 42,54 ----
>> #include "tree-inline.h"
>> #include "value-prof.h"
>> #include "target.h"
>> + #include "ssaexpand.h"
>>
>>
>> + /* This variable holds information helping the rewriting of SSA trees
>> + into RTL. */
>> + struct ssaexpand SA;
>> +
>> /* Return an expression tree corresponding to the RHS of GIMPLE
>> statement STMT. */
>>
>> *************** gimple_assign_rhs_to_tree (gimple stmt)
>> *** 78,85 ****
>> static tree
>> gimple_cond_pred_to_tree (gimple stmt)
>> {
>> return build2 (gimple_cond_code (stmt), boolean_type_node,
>> ! gimple_cond_lhs (stmt), gimple_cond_rhs (stmt));
>> }
>>
>> /* Helper for gimple_to_tree. Set EXPR_LOCATION for every expression
>> --- 83,104 ----
>> static tree
>> gimple_cond_pred_to_tree (gimple stmt)
>> {
>> + /* We're sometimes presented with such code:
>> + D.123_1 = x < y;
>> + if (D.123_1 != 0)
>> + ...
>> + This would expand to two comparisons which then later might
>> + be cleaned up by combine. But some pattern matchers like if-conversion
>> + work better when there's only one compare, so make up for this
>> + here as special exception if TER would have made the same change. */
>> + tree lhs = gimple_cond_lhs (stmt);
>> + if (SA.values
>> + && TREE_CODE (lhs) == SSA_NAME
>> + && SA.values[SSA_NAME_VERSION (lhs)])
>> + lhs = gimple_assign_rhs_to_tree (SA.values[SSA_NAME_VERSION (lhs)]);
>
> Do we really need the SA.values array here? It seems that
> a bitmap would be enough, as the definition should be still
> reachable via SSA_NAME_DEF_STMT (lhs). (Thanks Paolo
> for noticing this)
>
>> +
>> return build2 (gimple_cond_code (stmt), boolean_type_node,
>> ! lhs, gimple_cond_rhs (stmt));
>> }
>>
>> /* Helper for gimple_to_tree. Set EXPR_LOCATION for every expression
>> *************** failed:
>> *** 423,428 ****
>> --- 442,464 ----
>> #define STACK_ALIGNMENT_NEEDED 1
>> #endif
>>
>> + #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
>
> This should be in some public place, ideally with a more
> sound name. Any ideas?
>
>> + /* Associate declaration T with storage space X. If T is no
>> + SSA name this is exactly SET_DECL_RTL, otherwise make the
>> + partition of T associated with X. */
>> + static inline void
>> + set_rtl (tree t, rtx x)
>> + {
>> + if (TREE_CODE (t) == SSA_NAME)
>> + {
>> + SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
>> + if (x && !MEM_P (x))
>> + set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
>> + }
>> + else
>> + SET_DECL_RTL (t, x);
>> + }
>>
>> /* This structure holds data relevant to one variable that will be
>> placed in a stack slot. */
>> *************** add_stack_var (tree decl)
>> *** 561,575 ****
>> }
>> stack_vars[stack_vars_num].decl = decl;
>> stack_vars[stack_vars_num].offset = 0;
>> ! stack_vars[stack_vars_num].size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
>> ! stack_vars[stack_vars_num].alignb = get_decl_align_unit (decl);
>>
>> /* All variables are initially in their own partition. */
>> stack_vars[stack_vars_num].representative = stack_vars_num;
>> stack_vars[stack_vars_num].next = EOC;
>>
>> /* Ensure that this decl doesn't get put onto the list twice. */
>> ! SET_DECL_RTL (decl, pc_rtx);
>>
>> stack_vars_num++;
>> }
>> --- 597,611 ----
>> }
>> stack_vars[stack_vars_num].decl = decl;
>> stack_vars[stack_vars_num].offset = 0;
>> ! stack_vars[stack_vars_num].size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (decl)), 1);
>> ! stack_vars[stack_vars_num].alignb = get_decl_align_unit (SSAVAR (decl));
>>
>> /* All variables are initially in their own partition. */
>> stack_vars[stack_vars_num].representative = stack_vars_num;
>> stack_vars[stack_vars_num].next = EOC;
>>
>> /* Ensure that this decl doesn't get put onto the list twice. */
>> ! set_rtl (decl, pc_rtx);
>>
>> stack_vars_num++;
>> }
>> *************** add_alias_set_conflicts (void)
>> *** 688,709 ****
>> }
>>
>> /* A subroutine of partition_stack_vars. A comparison function for qsort,
>> ! sorting an array of indices by the size of the object. */
>>
>> static int
>> stack_var_size_cmp (const void *a, const void *b)
>> {
>> HOST_WIDE_INT sa = stack_vars[*(const size_t *)a].size;
>> HOST_WIDE_INT sb = stack_vars[*(const size_t *)b].size;
>> ! unsigned int uida = DECL_UID (stack_vars[*(const size_t *)a].decl);
>> ! unsigned int uidb = DECL_UID (stack_vars[*(const size_t *)b].decl);
>>
>> if (sa < sb)
>> return -1;
>> if (sa > sb)
>> return 1;
>> ! /* For stack variables of the same size use the uid of the decl
>> ! to make the sort stable. */
>> if (uida < uidb)
>> return -1;
>> if (uida > uidb)
>> --- 724,760 ----
>> }
>>
>> /* A subroutine of partition_stack_vars. A comparison function for qsort,
>> ! sorting an array of indices by the size and type of the object. */
>>
>> static int
>> stack_var_size_cmp (const void *a, const void *b)
>> {
>> HOST_WIDE_INT sa = stack_vars[*(const size_t *)a].size;
>> HOST_WIDE_INT sb = stack_vars[*(const size_t *)b].size;
>> ! tree decla, declb;
>> ! unsigned int uida, uidb;
>>
>> if (sa < sb)
>> return -1;
>> if (sa > sb)
>> return 1;
>> ! decla = stack_vars[*(const size_t *)a].decl;
>> ! declb = stack_vars[*(const size_t *)b].decl;
>> ! /* For stack variables of the same size use an id of the decls
>> ! to make the sort stable. Two SSA names are compared by their
>> ! version, SSA names come before non-SSA names, and two normal
>> ! decls are compared by their DECL_UID. */
>> ! if (TREE_CODE (decla) == SSA_NAME)
>> ! {
>> ! if (TREE_CODE (declb) == SSA_NAME)
>> ! uida = SSA_NAME_VERSION (decla), uidb = SSA_NAME_VERSION (declb);
>> ! else
>> ! return -1;
>> ! }
>> ! else if (TREE_CODE (declb) == SSA_NAME)
>> ! return 1;
>> ! else
>> ! uida = DECL_UID (decla), uidb = DECL_UID (declb);
>> if (uida < uidb)
>> return -1;
>> if (uida > uidb)
>> *************** expand_one_stack_var_at (tree decl, HOST
>> *** 874,894 ****
>> gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>>
>> x = plus_constant (virtual_stack_vars_rtx, offset);
>> ! x = gen_rtx_MEM (DECL_MODE (decl), x);
>>
>> ! /* Set alignment we actually gave this decl. */
>> ! offset -= frame_phase;
>> ! align = offset & -offset;
>> ! align *= BITS_PER_UNIT;
>> ! if (align == 0)
>> ! align = STACK_BOUNDARY;
>> ! else if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
>> ! align = MAX_SUPPORTED_STACK_ALIGNMENT;
>> ! DECL_ALIGN (decl) = align;
>> ! DECL_USER_ALIGN (decl) = 0;
>>
>> ! set_mem_attributes (x, decl, true);
>> ! SET_DECL_RTL (decl, x);
>> }
>>
>> /* A subroutine of expand_used_vars. Give each partition representative
>> --- 925,951 ----
>> gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>>
>> x = plus_constant (virtual_stack_vars_rtx, offset);
>> ! x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
>>
>> ! if (TREE_CODE (decl) != SSA_NAME)
>> ! {
>> ! /* Set alignment we actually gave this decl if it isn't an SSA name.
>> ! If it is we generate stack slots only accidentally so it isn't as
>> ! important, we'll simply use the alignment that is already set. */
>> ! offset -= frame_phase;
>> ! align = offset & -offset;
>> ! align *= BITS_PER_UNIT;
>> ! if (align == 0)
>> ! align = STACK_BOUNDARY;
>> ! else if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
>> ! align = MAX_SUPPORTED_STACK_ALIGNMENT;
>>
>> ! DECL_ALIGN (decl) = align;
>> ! DECL_USER_ALIGN (decl) = 0;
>> ! }
>> !
>> ! set_mem_attributes (x, SSAVAR (decl), true);
>> ! set_rtl (decl, x);
>> }
>>
>> /* A subroutine of expand_used_vars. Give each partition representative
>> *************** expand_stack_vars (bool (*pred) (tree))
>> *** 912,918 ****
>>
>> /* Skip variables that have already had rtl assigned. See also
>> add_stack_var where we perpetrate this pc_rtx hack. */
>> ! if (DECL_RTL (stack_vars[i].decl) != pc_rtx)
>> continue;
>>
>> /* Check the predicate to see whether this variable should be
>> --- 969,977 ----
>>
>> /* Skip variables that have already had rtl assigned. See also
>> add_stack_var where we perpetrate this pc_rtx hack. */
>> ! if ((TREE_CODE (stack_vars[i].decl) == SSA_NAME
>> ! ? SA.partition_to_pseudo[var_to_partition (SA.map, stack_vars[i].decl)]
>> ! : DECL_RTL (stack_vars[i].decl)) != pc_rtx)
>> continue;
>>
>> /* Check the predicate to see whether this variable should be
>> *************** account_stack_vars (void)
>> *** 951,957 ****
>>
>> size += stack_vars[i].size;
>> for (j = i; j != EOC; j = stack_vars[j].next)
>> ! SET_DECL_RTL (stack_vars[j].decl, NULL);
>> }
>> return size;
>> }
>> --- 1010,1016 ----
>>
>> size += stack_vars[i].size;
>> for (j = i; j != EOC; j = stack_vars[j].next)
>> ! set_rtl (stack_vars[j].decl, NULL);
>> }
>> return size;
>> }
>> *************** expand_one_stack_var (tree var)
>> *** 964,971 ****
>> {
>> HOST_WIDE_INT size, offset, align;
>>
>> ! size = tree_low_cst (DECL_SIZE_UNIT (var), 1);
>> ! align = get_decl_align_unit (var);
>> offset = alloc_stack_frame_space (size, align);
>>
>> expand_one_stack_var_at (var, offset);
>> --- 1023,1030 ----
>> {
>> HOST_WIDE_INT size, offset, align;
>>
>> ! size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (var)), 1);
>> ! align = get_decl_align_unit (SSAVAR (var));
>> offset = alloc_stack_frame_space (size, align);
>>
>> expand_one_stack_var_at (var, offset);
>> *************** expand_one_hard_reg_var (tree var)
>> *** 986,1005 ****
>> static void
>> expand_one_register_var (tree var)
>> {
>> ! tree type = TREE_TYPE (var);
>> int unsignedp = TYPE_UNSIGNED (type);
>> enum machine_mode reg_mode
>> ! = promote_mode (type, DECL_MODE (var), &unsignedp, 0);
>> rtx x = gen_reg_rtx (reg_mode);
>>
>> ! SET_DECL_RTL (var, x);
>>
>> /* Note if the object is a user variable. */
>> ! if (!DECL_ARTIFICIAL (var))
>> ! mark_user_reg (x);
>>
>> if (POINTER_TYPE_P (type))
>> ! mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
>> }
>>
>> /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL that
>> --- 1045,1065 ----
>> static void
>> expand_one_register_var (tree var)
>> {
>
>
>> *************** struct rtl_opt_pass pass_expand =
>> *** 2471,2480 ****
>> 0, /* static_pass_number */
>> TV_EXPAND, /* tv_id */
>> /* ??? If TER is enabled, we actually receive GENERIC. */
>> ! PROP_gimple_leh | PROP_cfg, /* properties_required */
>> PROP_rtl, /* properties_provided */
>> ! PROP_trees, /* properties_destroyed */
>> ! 0, /* todo_flags_start */
>> ! TODO_dump_func, /* todo_flags_finish */
>> }
>> };
>> --- 2644,2655 ----
>> 0, /* static_pass_number */
>> TV_EXPAND, /* tv_id */
>> /* ??? If TER is enabled, we actually receive GENERIC. */
>
> This is no longer true ;)
>
>> ! PROP_ssa | PROP_gimple_leh | PROP_cfg, /* properties_required */
>> PROP_rtl, /* properties_provided */
>> ! PROP_ssa | PROP_trees, /* properties_destroyed */
>> ! TODO_verify_ssa | TODO_verify_flow
>> ! | TODO_verify_stmts, /* todo_flags_start */
>> ! TODO_dump_func
>> ! | TODO_ggc_collect /* todo_flags_finish */
>> }
>> };
>
>
>> Index: tree-mudflap.c
>> ===================================================================
>> *** tree-mudflap.c (revision 146576)
>> --- tree-mudflap.c (working copy)
>> *************** execute_mudflap_function_ops (void)
>> *** 447,452 ****
>> --- 447,465 ----
>> return 0;
>> }
>>
>> + /* Construct a new temporary variable with TYPE
>> + as type and PREFIX as name prefix, add it to referenced vars
>> + and mark it for renaming. */
>> +
>> + static tree
>> + create_referenced_var (tree type, const char *prefix)
>> + {
>> + tree var = create_tmp_var (type, prefix);
>> + add_referenced_var (var);
>> + mark_sym_for_renaming (var);
>> + return var;
>> + }
>
> This is exactly the same as make_rename_temp () from tree-dfa.c,
> so use that.
>
>> /* Create and initialize local shadow variables for the lookup cache
>> globals. Put their decls in the *_l globals for use by
>> mf_build_check_statement_for. */
>> *************** mf_decl_cache_locals (void)
>> *** 459,469 ****
>>
>> /* Build the cache vars. */
>> mf_cache_shift_decl_l
>> ! = mf_mark (create_tmp_var (TREE_TYPE (mf_cache_shift_decl),
>> "__mf_lookup_shift_l"));
>>
>> mf_cache_mask_decl_l
>> ! = mf_mark (create_tmp_var (TREE_TYPE (mf_cache_mask_decl),
>> "__mf_lookup_mask_l"));
>>
>> /* Build initialization nodes for the cache vars. We just load the
>> --- 472,482 ----
>>
>> /* Build the cache vars. */
>> mf_cache_shift_decl_l
>> ! = mf_mark (create_referenced_var (TREE_TYPE (mf_cache_shift_decl),
>> "__mf_lookup_shift_l"));
>>
>> mf_cache_mask_decl_l
>> ! = mf_mark (create_referenced_var (TREE_TYPE (mf_cache_mask_decl),
>> "__mf_lookup_mask_l"));
>>
>> /* Build initialization nodes for the cache vars. We just load the
>> *************** mf_build_check_statement_for (tree base,
>> *** 546,554 ****
>> }
>>
>> /* Build our local variables. */
>> ! mf_elem = create_tmp_var (mf_cache_structptr_type, "__mf_elem");
>> ! mf_base = create_tmp_var (mf_uintptr_type, "__mf_base");
>> ! mf_limit = create_tmp_var (mf_uintptr_type, "__mf_limit");
>>
>> /* Build: __mf_base = (uintptr_t) <base address expression>. */
>> seq = gimple_seq_alloc ();
>> --- 559,567 ----
>> }
>>
>> /* Build our local variables. */
>> ! mf_elem = create_referenced_var (mf_cache_structptr_type, "__mf_elem");
>> ! mf_base = create_referenced_var (mf_uintptr_type, "__mf_base");
>> ! mf_limit = create_referenced_var (mf_uintptr_type, "__mf_limit");
>>
>> /* Build: __mf_base = (uintptr_t) <base address expression>. */
>> seq = gimple_seq_alloc ();
>> *************** mf_build_check_statement_for (tree base,
>> *** 627,633 ****
>> t = build2 (TRUTH_OR_EXPR, boolean_type_node, t, u);
>> t = force_gimple_operand (t, &stmts, false, NULL_TREE);
>> gimple_seq_add_seq (&seq, stmts);
>> ! cond = create_tmp_var (boolean_type_node, "__mf_unlikely_cond");
>> g = gimple_build_assign (cond, t);
>> gimple_set_location (g, location);
>> gimple_seq_add_stmt (&seq, g);
>> --- 640,646 ----
>> t = build2 (TRUTH_OR_EXPR, boolean_type_node, t, u);
>> t = force_gimple_operand (t, &stmts, false, NULL_TREE);
>> gimple_seq_add_seq (&seq, stmts);
>> ! cond = create_referenced_var (boolean_type_node, "__mf_unlikely_cond");
>> g = gimple_build_assign (cond, t);
>> gimple_set_location (g, location);
>> gimple_seq_add_stmt (&seq, g);
>> *************** struct gimple_opt_pass pass_mudflap_2 =
>> *** 1366,1377 ****
>> NULL, /* next */
>> 0, /* static_pass_number */
>> TV_NONE, /* tv_id */
>> ! PROP_gimple_leh, /* properties_required */
>> 0, /* properties_provided */
>> 0, /* properties_destroyed */
>> 0, /* todo_flags_start */
>> TODO_verify_flow | TODO_verify_stmts
>> ! | TODO_dump_func /* todo_flags_finish */
>> }
>> };
>>
>> --- 1379,1390 ----
>> NULL, /* next */
>> 0, /* static_pass_number */
>> TV_NONE, /* tv_id */
>> ! PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required */
>> 0, /* properties_provided */
>> 0, /* properties_destroyed */
>> 0, /* todo_flags_start */
>> TODO_verify_flow | TODO_verify_stmts
>> ! | TODO_dump_func | TODO_update_ssa /* todo_flags_finish */
>> }
>> };
>>
>> Index: tree-ssa-ter.c
>> ===================================================================
>> *** tree-ssa-ter.c (revision 146576)
>> --- tree-ssa-ter.c (working copy)
>> *************** free_temp_expr_table (temp_expr_table_p
>> *** 225,231 ****
>> unsigned x;
>> for (x = 0; x <= num_var_partitions (t->map); x++)
>> gcc_assert (!t->kill_list[x]);
>> ! for (x = 0; x < num_ssa_names + 1; x++)
>> {
>> gcc_assert (t->expr_decl_uids[x] == NULL);
>> gcc_assert (t->partition_dependencies[x] == NULL);
>> --- 225,231 ----
>> unsigned x;
>> for (x = 0; x <= num_var_partitions (t->map); x++)
>> gcc_assert (!t->kill_list[x]);
>> ! for (x = 0; x < num_ssa_names; x++)
>> {
>> gcc_assert (t->expr_decl_uids[x] == NULL);
>> gcc_assert (t->partition_dependencies[x] == NULL);
>
> Obvious, commit with the other similar piece separately.
>
>
>> Index: tree-ssa.c
>> ===================================================================
>> *** tree-ssa.c (revision 146576)
>> --- tree-ssa.c (working copy)
>> *************** delete_tree_ssa (void)
>> *** 844,850 ****
>>
>> gimple_set_modified (stmt, true);
>> }
>> ! set_phi_nodes (bb, NULL);
>> }
>>
>> /* Remove annotations from every referenced local variable. */
>> --- 844,851 ----
>>
>> gimple_set_modified (stmt, true);
>> }
>> ! if (!(bb->flags & BB_RTL))
>> ! set_phi_nodes (bb, NULL);
>> }
>>
>> /* Remove annotations from every referenced local variable. */
>> Index: rtl.h
>> ===================================================================
>> *** rtl.h (revision 146576)
>> --- rtl.h (working copy)
>> *************** extern rtx gen_int_mode (HOST_WIDE_INT,
>> *** 1494,1499 ****
>> --- 1494,1500 ----
>> extern rtx emit_copy_of_insn_after (rtx, rtx);
>> extern void set_reg_attrs_from_value (rtx, rtx);
>> extern void set_reg_attrs_for_parm (rtx, rtx);
>> + extern void set_reg_attrs_for_decl_rtl (tree t, rtx x);
>> extern void adjust_reg_mode (rtx, enum machine_mode);
>> extern int mem_expr_equal_p (const_tree, const_tree);
>>
>> Index: tree-optimize.c
>> ===================================================================
>> *** tree-optimize.c (revision 146576)
>> --- tree-optimize.c (working copy)
>> *************** struct gimple_opt_pass pass_cleanup_cfg_
>> *** 201,207 ****
>> {
>> {
>> GIMPLE_PASS,
>> ! "final_cleanup", /* name */
>> NULL, /* gate */
>> execute_cleanup_cfg_post_optimizing, /* execute */
>> NULL, /* sub */
>> --- 201,207 ----
>> {
>> {
>> GIMPLE_PASS,
>> ! "optimized", /* name */
>> NULL, /* gate */
>> execute_cleanup_cfg_post_optimizing, /* execute */
>> NULL, /* sub */
>> *************** struct gimple_opt_pass pass_cleanup_cfg_
>> *** 213,225 ****
>> 0, /* properties_destroyed */
>> 0, /* todo_flags_start */
>> TODO_dump_func /* todo_flags_finish */
>> }
>> };
>>
>> /* Pass: do the actions required to finish with tree-ssa optimization
>> passes. */
>>
>> ! static unsigned int
>> execute_free_datastructures (void)
>> {
>> free_dominance_info (CDI_DOMINATORS);
>> --- 213,226 ----
>> 0, /* properties_destroyed */
>> 0, /* todo_flags_start */
>> TODO_dump_func /* todo_flags_finish */
>> + | TODO_remove_unused_locals
>> }
>> };
>>
>> /* Pass: do the actions required to finish with tree-ssa optimization
>> passes. */
>>
>> ! unsigned int
>> execute_free_datastructures (void)
>> {
>> free_dominance_info (CDI_DOMINATORS);
>> *************** execute_free_datastructures (void)
>> *** 228,233 ****
>> --- 229,238 ----
>> /* Remove the ssa structures. */
>> if (cfun->gimple_df)
>> delete_tree_ssa ();
>> +
>> + /* And get rid of annotations we no longer need. */
>> + delete_tree_cfg_annotations ();
>> +
>> return 0;
>> }
>>
>> *************** struct gimple_opt_pass pass_free_datastr
>> *** 254,262 ****
>> static unsigned int
>> execute_free_cfg_annotations (void)
>> {
>> - /* And get rid of annotations we no longer need. */
>> - delete_tree_cfg_annotations ();
>> -
>> return 0;
>> }
>
> It looks like the pass referencing this is now dead. Please remove
> this function and the pass structure.
>
>> [Message clipped]
>
> bah ... ;) Other half in next mail.
Continued (without quoting for the patch)
--- 932,941 ----
if (dump_file && (dump_flags & TDF_DETAILS))
gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS);
! remove_ssa_form (flag_tree_ter && !flag_mudflap, sa);
if (dump_file && (dump_flags & TDF_DETAILS))
gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS);
it looks like this restriction for !flag_mudflap can now be lifted.
+ struct ssaexpand
+ {
+ /* The computed partitions of SSA names are stored here. */
+ var_map map;
+
+ /* For a SSA name version V values[V] contains the gimple statement
+ defining it iff TER decided that it should be forwarded, NULL
+ otherwise. */
+ gimple *values;
as said above this could be a bitmap.
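A rough standalone sketch of that suggestion follows. It uses toy types rather than GCC's bitmap and gimple APIs; struct toy_stmt, struct toy_ssa_name, struct toy_bitmap and forwarded_def are hypothetical names, with the def_stmt field standing in for what SSA_NAME_DEF_STMT would return. One bit per SSA name version records whether TER forwards it, and the defining statement is read back from the name itself.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Toy stand-ins for a gimple statement and an SSA name.  */
struct toy_stmt
{
  int id;
};

struct toy_ssa_name
{
  unsigned version;
  struct toy_stmt *def_stmt;	/* Plays the role of SSA_NAME_DEF_STMT.  */
};

#define WORD_BITS (sizeof (unsigned long) * CHAR_BIT)

/* Minimal fixed-size bitmap, one bit per SSA name version.  */
struct toy_bitmap
{
  unsigned long *words;
  unsigned nbits;
};

/* Allocate a cleared bitmap of NBITS bits.  Returns nonzero on success.  */
static int
toy_bitmap_init (struct toy_bitmap *b, unsigned nbits)
{
  size_t nwords = (nbits + WORD_BITS - 1) / WORD_BITS;
  b->words = calloc (nwords ? nwords : 1, sizeof (unsigned long));
  b->nbits = nbits;
  return b->words != NULL;
}

/* Mark version BIT as forwarded by TER.  */
static void
toy_bitmap_set (struct toy_bitmap *b, unsigned bit)
{
  assert (bit < b->nbits);
  b->words[bit / WORD_BITS] |= 1UL << (bit % WORD_BITS);
}

/* Return nonzero if version BIT is marked.  */
static int
toy_bitmap_test (const struct toy_bitmap *b, unsigned bit)
{
  assert (bit < b->nbits);
  return (b->words[bit / WORD_BITS] >> (bit % WORD_BITS)) & 1UL;
}

/* Equivalent of reading values[V]: the defining statement if the
   name is marked as forwarded, NULL otherwise.  */
static struct toy_stmt *
forwarded_def (const struct toy_bitmap *forwarded,
	       const struct toy_ssa_name *name)
{
  return toy_bitmap_test (forwarded, name->version) ? name->def_stmt : NULL;
}
```

Compared to an array of statement pointers this needs one bit instead of one pointer per SSA name, at the price of an extra indirection through the name when the statement is actually wanted.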
*************** struct var_ann_d GTY(())
*** 234,243 ****
information on each attribute. */
ENUM_BITFIELD (noalias_state) noalias_state : 2;
- /* Used when going out of SSA form to indicate which partition this
- variable represents storage for. */
- unsigned partition;
-
/* Used by var_map for the base index of ssa base variables. */
unsigned base_index;
hmm, so what is needed to remove base_index as well?
--- 707,723 ----
NEXT_PASS (pass_local_pure_const);
}
NEXT_PASS (pass_cleanup_eh);
NEXT_PASS (pass_nrv);
+ NEXT_PASS (pass_mudflap_2);
NEXT_PASS (pass_mark_used_blocks);
NEXT_PASS (pass_cleanup_cfg_post_optimizing);
NEXT_PASS (pass_warn_function_noreturn);
pass_mark_used_blocks is no longer needed now as you remove unused
locals and that marks blocks used? Please remove this pass (and its
code).
The patches are ok with the suggested minor changes and with the
larger cleanups as followup (or the same patch if you prefer).
Thanks,
Richard.
* Re: [RFA] expand from SSA form (1/2)
2009-04-22 16:45 ` [RFA] " Michael Matz
2009-04-23 15:10 ` Andrew MacLeod
@ 2009-04-24 14:32 ` Richard Guenther
2009-04-24 14:46 ` Richard Guenther
2009-04-27 5:47 ` H.J. Lu
` (2 subsequent siblings)
4 siblings, 1 reply; 56+ messages in thread
From: Richard Guenther @ 2009-04-24 14:32 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches, Andrew MacLeod, Andrey Belevantsev
On Wed, Apr 22, 2009 at 6:42 PM, Michael Matz <matz@suse.de> wrote:
> On Wed, 22 Apr 2009, Michael Matz wrote:
>
>> I'll soon send a new version of the patch that fixes all problems and
>> testcases I encountered.
>
> Like so. This is the full patch, i.e. including the cleanups, but
> excluding the testsuite changes. It should incorporate all feedback.
> Compared to the last version it adds comments for new functions, fixes
> mudflap2, and generally some other minor problems showing when I started
> testing Ada and a bug reported by Andrey.
>
> This patch (plus testsuite changes) was bootstrapped with Ada on
> x86_64-linux. There are no testsuite regressions; the remaining failures are:
> FAIL: gcc.dg/tree-prof/bb-reorg.c compilation, -fprofile-use -D_PROFILE_USE
> FAIL: gcc.dg/tree-prof/pr34999.c compilation, -fprofile-use -D_PROFILE_USE
> FAIL: gcc.target/i386/avx-vmovntdq-256-1.c (test for excess errors)
> FAIL: gcc.target/i386/avx-vmovntpd-256-1.c (test for excess errors)
> FAIL: gcc.target/i386/avx-vmovntps-256-1.c (test for excess errors)
> FAIL: libmudflap.c++/pass41-frag.cxx execution test
> FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test
> FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test
> FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test
>
> All of these happen without the patch too (known bugs, old binutils, and
> pass41-frag never seems to work anyway).
>
> I'd like to ask for approval for the series.
>
>
> Ciao,
> Michael.
> --
> * builtins.c (fold_builtin_next_arg): Handle SSA names.
> * tree-ssa-copyrename.c (rename_ssa_copies): Don't iterate
> beyond num_ssa_names, use ssa_name() directly.
> * tree-ssa-ter.c (free_temp_expr_table): Likewise.
> * tree-ssa-coalesce.c (create_outofssa_var_map): Likewise,
> mark only useful SSA names.
> (compare_pairs): Swap cost comparison.
> (coalesce_ssa_name): Don't use change_partition_var.
> * tree-nrv.c (struct nrv_data): Add modified member.
> (finalize_nrv_r): Set it.
> (tree_nrv): Use it to update statements.
> (pass_nrv): Require PROP_ssa.
> * tree-mudflap.c (create_referenced_var): New static helper.
> (mf_decl_cache_locals, mf_build_check_statement_for): Use it.
> (pass_mudflap_2): Require PROP_ssa, run ssa update at finish.
> * alias.c (find_base_decl): Handle SSA names.
> * emit-rtl.c (set_reg_attrs_for_parm): Make non-static.
> (component_ref_for_mem_expr): Don't leak SSA names into RTL.
> * rtl.h (set_reg_attrs_for_parm): Declare.
> * tree-optimize.c (pass_cleanup_cfg_post_optimizing): Rename
> to "optimized", remove unused locals at finish.
> (execute_free_datastructures): Make global, call
> delete_tree_cfg_annotations.
> (execute_free_cfg_annotations): Don't call
> delete_tree_cfg_annotations.
>
> * ssaexpand.h: New file.
> * expr.c (toplevel): Include ssaexpand.h.
> (expand_assignment): Handle SSA names the same as register
> variables.
> (expand_expr_real_1): Expand SSA names.
> * cfgexpand.c (toplevel): Include ssaexpand.h.
> (SA): New global variable.
> (gimple_cond_pred_to_tree): Fold TERed comparisons into predicates.
> (SSAVAR): New macro.
> (set_rtl): New helper function.
> (add_stack_var): Deal with SSA names, use set_rtl.
> (expand_one_stack_var_at): Likewise.
> (expand_one_stack_var): Deal with SSA names.
> (stack_var_size_cmp): Use code (SSA_NAME / DECL) as tie breaker
> before unique numbers.
> (expand_stack_vars): Use set_rtl.
> (expand_one_var): Accept SSA names, add asserts for them, feed them
> to above subroutines.
> (expand_used_vars): Expand all partitions (without default defs),
> then only the local decls (ignoring those expanded already).
> (expand_gimple_cond): Remove edges when jumpif() expands an
> unconditional jump.
> (expand_gimple_basic_block): Don't clear EDGE_EXECUTABLE here,
> or remove abnormal edges. Ignore insns setting the LHS of a TERed
> SSA name.
> (gimple_expand_cfg): Call into rewrite_out_of_ssa, initialize
> members of SA; deal with PARM_DECL partitions here; expand
> all PHI nodes, free tree datastructures and SA. Commit instructions
> on edges, clear EDGE_EXECUTABLE and remove abnormal edges here.
> (pass_expand): Require and destroy PROP_ssa, verify SSA form, flow
> info and statements at start, collect garbage at finish.
> * tree-ssa-live.h (struct _var_map): Remove partition_to_var member.
> (VAR_ANN_PARTITION): Remove.
> (change_partition_var): Don't declare.
> (partition_to_var): Always return SSA names.
> (var_to_partition): Only accept SSA names.
> (register_ssa_partition): Only check argument.
> * tree-ssa-live.c (init_var_map): Don't allocate partition_to_var
> member.
> (delete_var_map): Don't free it.
> (var_union): Only accept SSA names, simplify.
> (partition_view_init): Mark only useful SSA names as used.
> (partition_view_fini): Only deal with SSA names.
> (change_partition_var): Remove.
> (dump_var_map): Use ssa_name instead of partition_to_var member.
> * tree-ssa.c (delete_tree_ssa): Don't remove PHI nodes on RTL
> basic blocks.
> * tree-outof-ssa.c (toplevel): Include ssaexpand.h and expr.h.
> (struct _elim_graph): New member const_dests; nodes member vector of
> ints.
> (set_location_for_edge): New static helper.
> (create_temp): Remove.
> (insert_partition_copy_on_edge, insert_part_to_rtx_on_edge,
> insert_value_copy_on_edge, insert_rtx_to_part_on_edge): New
> functions.
> (new_elim_graph): Allocate const_dests member.
> (clean_elim_graph): Truncate const_dests member.
> (delete_elim_graph): Free const_dests member.
> (elim_graph_size): Adapt to new type of nodes member.
> (elim_graph_add_node): Likewise.
> (eliminate_name): Likewise.
> (eliminate_build): Don't take basic block argument, deal only with
> partition numbers, not variables.
> (get_temp_reg): New static helper.
> (elim_create): Use it, deal with RTL temporaries instead of trees.
> (eliminate_phi): Adjust all calls to new signature.
> (assign_vars, replace_use_variable, replace_def_variable): Remove.
> (rewrite_trees): Only do checking.
> (edge_leader, stmt_list, leader_has_match, leader_match): Remove.
> (same_stmt_list_p, identical_copies_p, identical_stmt_lists_p,
> init_analyze_edges_for_bb, fini_analyze_edges_for_bb,
> contains_tree_r, MAX_STMTS_IN_LATCH,
> process_single_block_loop_latch, analyze_edges_for_bb,
> perform_edge_inserts): Remove.
> (expand_phi_nodes): New global function.
> (remove_ssa_form): Take ssaexpand parameter. Don't call removed
> functions, initialize new parameter, remember partitions having a
> default def.
> (finish_out_of_ssa): New global function.
> (rewrite_out_of_ssa): Make global. Adjust call to remove_ssa_form,
> don't reset in_ssa_p here.
> (pass_del_ssa): Remove.
> * tree-flow.h (struct var_ann_d): Remove out_of_ssa_tag and
> partition members.
> (execute_free_datastructures): Declare.
> * Makefile.in (SSAEXPAND_H): New variable.
> (tree-outof-ssa.o, expr.o, cfgexpand.o): Depend on SSAEXPAND_H.
> * basic-block.h (commit_one_edge_insertion): Declare.
> * passes.c (init_optimization_passes): Move pass_nrv and
> pass_mudflap2 before pass_cleanup_cfg_post_optimizing, remove
> pass_del_ssa, pass_free_datastructures, pass_free_cfg_annotations.
> * cfgrtl.c (commit_one_edge_insertion): Make global, don't declare.
> (redirect_branch_edge): Deal with super block when expanding, split
> out jump patching itself into ...
> (patch_jump_insn): ... here, new static helper.
Some comments inline...
> Index: tree-ssa-copyrename.c
> ===================================================================
> *** tree-ssa-copyrename.c (revision 146576)
> --- tree-ssa-copyrename.c (working copy)
> *************** rename_ssa_copies (void)
> *** 291,297 ****
> else
> debug = NULL;
>
> ! map = init_var_map (num_ssa_names + 1);
>
> FOR_EACH_BB (bb)
> {
> --- 291,297 ----
> else
> debug = NULL;
>
> ! map = init_var_map (num_ssa_names);
>
> FOR_EACH_BB (bb)
> {
> *************** rename_ssa_copies (void)
> *** 339,350 ****
> /* Now one more pass to make all elements of a partition share the same
> root variable. */
>
> ! for (x = 1; x <= num_ssa_names; x++)
> {
> part_var = partition_to_var (map, x);
> if (!part_var)
> continue;
> ! var = map->partition_to_var[x];
> if (debug)
> {
> if (SSA_NAME_VAR (var) != SSA_NAME_VAR (part_var))
> --- 339,350 ----
> /* Now one more pass to make all elements of a partition share the same
> root variable. */
>
> ! for (x = 1; x < num_ssa_names; x++)
> {
> part_var = partition_to_var (map, x);
> if (!part_var)
> continue;
> ! var = ssa_name (x);
> if (debug)
> {
> if (SSA_NAME_VAR (var) != SSA_NAME_VAR (part_var))
This piece is ok as obvious. Please commit it separately.
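For reference, the new bounds rely on SSA version 0 being reserved, so
the valid versions run from 1 to num_ssa_names - 1. A minimal sketch of
that loop shape in plain C, using a toy array rather than GCC's SSA name
table (toy_count_live_names and the names array are made up for
illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Count the live entries of a toy "SSA name table".  Slot 0 is
   reserved and NUM_NAMES is the table length, so the valid indices
   are 1 .. NUM_NAMES - 1.  Looping with "x <= num_names" would read
   one element past the end of the table.  */
static unsigned
toy_count_live_names (const char *const *names, unsigned num_names)
{
  unsigned x, live = 0;

  for (x = 1; x < num_names; x++)
    if (names[x] != NULL)   /* released names leave a NULL hole */
      live++;
  return live;
}
```

With a table of length 4 holding two live names, the function visits
slots 1, 2 and 3 only.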
> Index: cfgexpand.c
> ===================================================================
> *** cfgexpand.c (revision 146576)
> --- cfgexpand.c (working copy)
> *************** along with GCC; see the file COPYING3.
> *** 42,49 ****
> --- 42,54 ----
> #include "tree-inline.h"
> #include "value-prof.h"
> #include "target.h"
> + #include "ssaexpand.h"
>
>
> + /* This variable holds information helping the rewriting of SSA trees
> + into RTL. */
> + struct ssaexpand SA;
> +
> /* Return an expression tree corresponding to the RHS of GIMPLE
> statement STMT. */
>
> *************** gimple_assign_rhs_to_tree (gimple stmt)
> *** 78,85 ****
> static tree
> gimple_cond_pred_to_tree (gimple stmt)
> {
> return build2 (gimple_cond_code (stmt), boolean_type_node,
> ! gimple_cond_lhs (stmt), gimple_cond_rhs (stmt));
> }
>
> /* Helper for gimple_to_tree. Set EXPR_LOCATION for every expression
> --- 83,104 ----
> static tree
> gimple_cond_pred_to_tree (gimple stmt)
> {
> + /* We're sometimes presented with such code:
> + D.123_1 = x < y;
> + if (D.123_1 != 0)
> + ...
> + This would expand to two comparisons which then later might
> + be cleaned up by combine. But some pattern matchers like if-conversion
> + work better when there's only one compare, so make up for this
> + here as special exception if TER would have made the same change. */
> + tree lhs = gimple_cond_lhs (stmt);
> + if (SA.values
> + && TREE_CODE (lhs) == SSA_NAME
> + && SA.values[SSA_NAME_VERSION (lhs)])
> + lhs = gimple_assign_rhs_to_tree (SA.values[SSA_NAME_VERSION (lhs)]);
Do we really need the SA.values array here? It seems that
a bitmap would be enough, as the definition should still be
reachable via SSA_NAME_DEF_STMT (lhs). (Thanks to Paolo
for noticing this.)
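Roughly what I have in mind, as a toy model only (none of these names
exist in GCC; toy_ssa_name.def_stmt stands in for SSA_NAME_DEF_STMT and
a 64-bit word stands in for a real bitmap):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for gimple statements and SSA names.  */
struct toy_stmt { int code; };
struct toy_ssa_name
{
  unsigned version;
  struct toy_stmt *def_stmt;   /* plays the role of SSA_NAME_DEF_STMT */
};

#define TOY_MAX_NAMES 64

/* One bit per SSA version, set when TER decided to forward the
   definition into its single use.  */
static unsigned long long toy_replaceable;

static void
toy_mark_replaceable (const struct toy_ssa_name *name)
{
  assert (name->version < TOY_MAX_NAMES);
  toy_replaceable |= 1ULL << name->version;
}

/* Return the statement to substitute for NAME, or NULL.  No separate
   array of statements is needed because the definition is reachable
   from the name itself.  */
static struct toy_stmt *
toy_get_replacement (const struct toy_ssa_name *name)
{
  assert (name->version < TOY_MAX_NAMES);
  if (toy_replaceable & (1ULL << name->version))
    return name->def_stmt;
  return NULL;
}
```

The lookup cost stays the same; only the per-name pointer storage goes
away.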
> +
> return build2 (gimple_cond_code (stmt), boolean_type_node,
> ! lhs, gimple_cond_rhs (stmt));
> }
>
> /* Helper for gimple_to_tree. Set EXPR_LOCATION for every expression
> *************** failed:
> *** 423,428 ****
> --- 442,464 ----
> #define STACK_ALIGNMENT_NEEDED 1
> #endif
>
> + #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
This should be in some public place, ideally with a
better name. Any ideas?
> + /* Associate declaration T with storage space X. If T is no
> + SSA name this is exactly SET_DECL_RTL, otherwise make the
> + partition of T associated with X. */
> + static inline void
> + set_rtl (tree t, rtx x)
> + {
> + if (TREE_CODE (t) == SSA_NAME)
> + {
> + SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
> + if (x && !MEM_P (x))
> + set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
> + }
> + else
> + SET_DECL_RTL (t, x);
> + }
>
> /* This structure holds data relevant to one variable that will be
> placed in a stack slot. */
> *************** add_stack_var (tree decl)
> *** 561,575 ****
> }
> stack_vars[stack_vars_num].decl = decl;
> stack_vars[stack_vars_num].offset = 0;
> ! stack_vars[stack_vars_num].size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
> ! stack_vars[stack_vars_num].alignb = get_decl_align_unit (decl);
>
> /* All variables are initially in their own partition. */
> stack_vars[stack_vars_num].representative = stack_vars_num;
> stack_vars[stack_vars_num].next = EOC;
>
> /* Ensure that this decl doesn't get put onto the list twice. */
> ! SET_DECL_RTL (decl, pc_rtx);
>
> stack_vars_num++;
> }
> --- 597,611 ----
> }
> stack_vars[stack_vars_num].decl = decl;
> stack_vars[stack_vars_num].offset = 0;
> ! stack_vars[stack_vars_num].size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (decl)), 1);
> ! stack_vars[stack_vars_num].alignb = get_decl_align_unit (SSAVAR (decl));
>
> /* All variables are initially in their own partition. */
> stack_vars[stack_vars_num].representative = stack_vars_num;
> stack_vars[stack_vars_num].next = EOC;
>
> /* Ensure that this decl doesn't get put onto the list twice. */
> ! set_rtl (decl, pc_rtx);
>
> stack_vars_num++;
> }
> *************** add_alias_set_conflicts (void)
> *** 688,709 ****
> }
>
> /* A subroutine of partition_stack_vars. A comparison function for qsort,
> ! sorting an array of indices by the size of the object. */
>
> static int
> stack_var_size_cmp (const void *a, const void *b)
> {
> HOST_WIDE_INT sa = stack_vars[*(const size_t *)a].size;
> HOST_WIDE_INT sb = stack_vars[*(const size_t *)b].size;
> ! unsigned int uida = DECL_UID (stack_vars[*(const size_t *)a].decl);
> ! unsigned int uidb = DECL_UID (stack_vars[*(const size_t *)b].decl);
>
> if (sa < sb)
> return -1;
> if (sa > sb)
> return 1;
> ! /* For stack variables of the same size use the uid of the decl
> ! to make the sort stable. */
> if (uida < uidb)
> return -1;
> if (uida > uidb)
> --- 724,760 ----
> }
>
> /* A subroutine of partition_stack_vars. A comparison function for qsort,
> ! sorting an array of indices by the size and type of the object. */
>
> static int
> stack_var_size_cmp (const void *a, const void *b)
> {
> HOST_WIDE_INT sa = stack_vars[*(const size_t *)a].size;
> HOST_WIDE_INT sb = stack_vars[*(const size_t *)b].size;
> ! tree decla, declb;
> ! unsigned int uida, uidb;
>
> if (sa < sb)
> return -1;
> if (sa > sb)
> return 1;
> ! decla = stack_vars[*(const size_t *)a].decl;
> ! declb = stack_vars[*(const size_t *)b].decl;
> ! /* For stack variables of the same size use and id of the decls
> ! to make the sort stable. Two SSA names are compared by their
> ! version, SSA names come before non-SSA names, and two normal
> ! decls are compared by their DECL_UID. */
> ! if (TREE_CODE (decla) == SSA_NAME)
> ! {
> ! if (TREE_CODE (declb) == SSA_NAME)
> ! uida = SSA_NAME_VERSION (decla), uidb = SSA_NAME_VERSION (declb);
> ! else
> ! return -1;
> ! }
> ! else if (TREE_CODE (declb) == SSA_NAME)
> ! return 1;
> ! else
> ! uida = DECL_UID (decla), uidb = DECL_UID (declb);
> if (uida < uidb)
> return -1;
> if (uida > uidb)
> *************** expand_one_stack_var_at (tree decl, HOST
> *** 874,894 ****
> gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>
> x = plus_constant (virtual_stack_vars_rtx, offset);
> ! x = gen_rtx_MEM (DECL_MODE (decl), x);
>
> ! /* Set alignment we actually gave this decl. */
> ! offset -= frame_phase;
> ! align = offset & -offset;
> ! align *= BITS_PER_UNIT;
> ! if (align == 0)
> ! align = STACK_BOUNDARY;
> ! else if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
> ! align = MAX_SUPPORTED_STACK_ALIGNMENT;
> ! DECL_ALIGN (decl) = align;
> ! DECL_USER_ALIGN (decl) = 0;
>
> ! set_mem_attributes (x, decl, true);
> ! SET_DECL_RTL (decl, x);
> }
>
> /* A subroutine of expand_used_vars. Give each partition representative
> --- 925,951 ----
> gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
>
> x = plus_constant (virtual_stack_vars_rtx, offset);
> ! x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
>
> ! if (TREE_CODE (decl) != SSA_NAME)
> ! {
> ! /* Set alignment we actually gave this decl if it isn't an SSA name.
> ! If it is we generate stack slots only accidentally so it isn't as
> ! important, we'll simply use the alignment that is already set. */
> ! offset -= frame_phase;
> ! align = offset & -offset;
> ! align *= BITS_PER_UNIT;
> ! if (align == 0)
> ! align = STACK_BOUNDARY;
> ! else if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
> ! align = MAX_SUPPORTED_STACK_ALIGNMENT;
>
> ! DECL_ALIGN (decl) = align;
> ! DECL_USER_ALIGN (decl) = 0;
> ! }
> !
> ! set_mem_attributes (x, SSAVAR (decl), true);
> ! set_rtl (decl, x);
> }
>
> /* A subroutine of expand_used_vars. Give each partition representative
> *************** expand_stack_vars (bool (*pred) (tree))
> *** 912,918 ****
>
> /* Skip variables that have already had rtl assigned. See also
> add_stack_var where we perpetrate this pc_rtx hack. */
> ! if (DECL_RTL (stack_vars[i].decl) != pc_rtx)
> continue;
>
> /* Check the predicate to see whether this variable should be
> --- 969,977 ----
>
> /* Skip variables that have already had rtl assigned. See also
> add_stack_var where we perpetrate this pc_rtx hack. */
> ! if ((TREE_CODE (stack_vars[i].decl) == SSA_NAME
> ! ? SA.partition_to_pseudo[var_to_partition (SA.map, stack_vars[i].decl)]
> ! : DECL_RTL (stack_vars[i].decl)) != pc_rtx)
> continue;
>
> /* Check the predicate to see whether this variable should be
> *************** account_stack_vars (void)
> *** 951,957 ****
>
> size += stack_vars[i].size;
> for (j = i; j != EOC; j = stack_vars[j].next)
> ! SET_DECL_RTL (stack_vars[j].decl, NULL);
> }
> return size;
> }
> --- 1010,1016 ----
>
> size += stack_vars[i].size;
> for (j = i; j != EOC; j = stack_vars[j].next)
> ! set_rtl (stack_vars[j].decl, NULL);
> }
> return size;
> }
> *************** expand_one_stack_var (tree var)
> *** 964,971 ****
> {
> HOST_WIDE_INT size, offset, align;
>
> ! size = tree_low_cst (DECL_SIZE_UNIT (var), 1);
> ! align = get_decl_align_unit (var);
> offset = alloc_stack_frame_space (size, align);
>
> expand_one_stack_var_at (var, offset);
> --- 1023,1030 ----
> {
> HOST_WIDE_INT size, offset, align;
>
> ! size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (var)), 1);
> ! align = get_decl_align_unit (SSAVAR (var));
> offset = alloc_stack_frame_space (size, align);
>
> expand_one_stack_var_at (var, offset);
> *************** expand_one_hard_reg_var (tree var)
> *** 986,1005 ****
> static void
> expand_one_register_var (tree var)
> {
> ! tree type = TREE_TYPE (var);
> int unsignedp = TYPE_UNSIGNED (type);
> enum machine_mode reg_mode
> ! = promote_mode (type, DECL_MODE (var), &unsignedp, 0);
> rtx x = gen_reg_rtx (reg_mode);
>
> ! SET_DECL_RTL (var, x);
>
> /* Note if the object is a user variable. */
> ! if (!DECL_ARTIFICIAL (var))
> ! mark_user_reg (x);
>
> if (POINTER_TYPE_P (type))
> ! mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
> }
>
> /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL that
> --- 1045,1065 ----
> static void
> expand_one_register_var (tree var)
> {
> *************** struct rtl_opt_pass pass_expand =
> *** 2471,2480 ****
> 0, /* static_pass_number */
> TV_EXPAND, /* tv_id */
> /* ??? If TER is enabled, we actually receive GENERIC. */
> ! PROP_gimple_leh | PROP_cfg, /* properties_required */
> PROP_rtl, /* properties_provided */
> ! PROP_trees, /* properties_destroyed */
> ! 0, /* todo_flags_start */
> ! TODO_dump_func, /* todo_flags_finish */
> }
> };
> --- 2644,2655 ----
> 0, /* static_pass_number */
> TV_EXPAND, /* tv_id */
> /* ??? If TER is enabled, we actually receive GENERIC. */
This is no longer true ;)
> ! PROP_ssa | PROP_gimple_leh | PROP_cfg, /* properties_required */
> PROP_rtl, /* properties_provided */
> ! PROP_ssa | PROP_trees, /* properties_destroyed */
> ! TODO_verify_ssa | TODO_verify_flow
> ! | TODO_verify_stmts, /* todo_flags_start */
> ! TODO_dump_func
> ! | TODO_ggc_collect /* todo_flags_finish */
> }
> };
> Index: tree-mudflap.c
> ===================================================================
> *** tree-mudflap.c (revision 146576)
> --- tree-mudflap.c (working copy)
> *************** execute_mudflap_function_ops (void)
> *** 447,452 ****
> --- 447,465 ----
> return 0;
> }
>
> + /* Construct a new temporary variable with TYPE
> + as type and PREFIX as name prefix, add it to referenced vars
> + and mark it for renaming. */
> +
> + static tree
> + create_referenced_var (tree type, const char *prefix)
> + {
> + tree var = create_tmp_var (type, prefix);
> + add_referenced_var (var);
> + mark_sym_for_renaming (var);
> + return var;
> + }
This is exactly the same as make_rename_temp () from tree-dfa.c,
so use that.
> /* Create and initialize local shadow variables for the lookup cache
> globals. Put their decls in the *_l globals for use by
> mf_build_check_statement_for. */
> *************** mf_decl_cache_locals (void)
> *** 459,469 ****
>
> /* Build the cache vars. */
> mf_cache_shift_decl_l
> ! = mf_mark (create_tmp_var (TREE_TYPE (mf_cache_shift_decl),
> "__mf_lookup_shift_l"));
>
> mf_cache_mask_decl_l
> ! = mf_mark (create_tmp_var (TREE_TYPE (mf_cache_mask_decl),
> "__mf_lookup_mask_l"));
>
> /* Build initialization nodes for the cache vars. We just load the
> --- 472,482 ----
>
> /* Build the cache vars. */
> mf_cache_shift_decl_l
> ! = mf_mark (create_referenced_var (TREE_TYPE (mf_cache_shift_decl),
> "__mf_lookup_shift_l"));
>
> mf_cache_mask_decl_l
> ! = mf_mark (create_referenced_var (TREE_TYPE (mf_cache_mask_decl),
> "__mf_lookup_mask_l"));
>
> /* Build initialization nodes for the cache vars. We just load the
> *************** mf_build_check_statement_for (tree base,
> *** 546,554 ****
> }
>
> /* Build our local variables. */
> ! mf_elem = create_tmp_var (mf_cache_structptr_type, "__mf_elem");
> ! mf_base = create_tmp_var (mf_uintptr_type, "__mf_base");
> ! mf_limit = create_tmp_var (mf_uintptr_type, "__mf_limit");
>
> /* Build: __mf_base = (uintptr_t) <base address expression>. */
> seq = gimple_seq_alloc ();
> --- 559,567 ----
> }
>
> /* Build our local variables. */
> ! mf_elem = create_referenced_var (mf_cache_structptr_type, "__mf_elem");
> ! mf_base = create_referenced_var (mf_uintptr_type, "__mf_base");
> ! mf_limit = create_referenced_var (mf_uintptr_type, "__mf_limit");
>
> /* Build: __mf_base = (uintptr_t) <base address expression>. */
> seq = gimple_seq_alloc ();
> *************** mf_build_check_statement_for (tree base,
> *** 627,633 ****
> t = build2 (TRUTH_OR_EXPR, boolean_type_node, t, u);
> t = force_gimple_operand (t, &stmts, false, NULL_TREE);
> gimple_seq_add_seq (&seq, stmts);
> ! cond = create_tmp_var (boolean_type_node, "__mf_unlikely_cond");
> g = gimple_build_assign (cond, t);
> gimple_set_location (g, location);
> gimple_seq_add_stmt (&seq, g);
> --- 640,646 ----
> t = build2 (TRUTH_OR_EXPR, boolean_type_node, t, u);
> t = force_gimple_operand (t, &stmts, false, NULL_TREE);
> gimple_seq_add_seq (&seq, stmts);
> ! cond = create_referenced_var (boolean_type_node, "__mf_unlikely_cond");
> g = gimple_build_assign (cond, t);
> gimple_set_location (g, location);
> gimple_seq_add_stmt (&seq, g);
> *************** struct gimple_opt_pass pass_mudflap_2 =
> *** 1366,1377 ****
> NULL, /* next */
> 0, /* static_pass_number */
> TV_NONE, /* tv_id */
> ! PROP_gimple_leh, /* properties_required */
> 0, /* properties_provided */
> 0, /* properties_destroyed */
> 0, /* todo_flags_start */
> TODO_verify_flow | TODO_verify_stmts
> ! | TODO_dump_func /* todo_flags_finish */
> }
> };
>
> --- 1379,1390 ----
> NULL, /* next */
> 0, /* static_pass_number */
> TV_NONE, /* tv_id */
> ! PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required */
> 0, /* properties_provided */
> 0, /* properties_destroyed */
> 0, /* todo_flags_start */
> TODO_verify_flow | TODO_verify_stmts
> ! | TODO_dump_func | TODO_update_ssa /* todo_flags_finish */
> }
> };
>
> Index: tree-ssa-ter.c
> ===================================================================
> *** tree-ssa-ter.c (revision 146576)
> --- tree-ssa-ter.c (working copy)
> *************** free_temp_expr_table (temp_expr_table_p
> *** 225,231 ****
> unsigned x;
> for (x = 0; x <= num_var_partitions (t->map); x++)
> gcc_assert (!t->kill_list[x]);
> ! for (x = 0; x < num_ssa_names + 1; x++)
> {
> gcc_assert (t->expr_decl_uids[x] == NULL);
> gcc_assert (t->partition_dependencies[x] == NULL);
> --- 225,231 ----
> unsigned x;
> for (x = 0; x <= num_var_partitions (t->map); x++)
> gcc_assert (!t->kill_list[x]);
> ! for (x = 0; x < num_ssa_names; x++)
> {
> gcc_assert (t->expr_decl_uids[x] == NULL);
> gcc_assert (t->partition_dependencies[x] == NULL);
Obvious, commit with the other similar piece separately.
> Index: tree-ssa.c
> ===================================================================
> *** tree-ssa.c (revision 146576)
> --- tree-ssa.c (working copy)
> *************** delete_tree_ssa (void)
> *** 844,850 ****
>
> gimple_set_modified (stmt, true);
> }
> ! set_phi_nodes (bb, NULL);
> }
>
> /* Remove annotations from every referenced local variable. */
> --- 844,851 ----
>
> gimple_set_modified (stmt, true);
> }
> ! if (!(bb->flags & BB_RTL))
> ! set_phi_nodes (bb, NULL);
> }
>
> /* Remove annotations from every referenced local variable. */
> Index: rtl.h
> ===================================================================
> *** rtl.h (revision 146576)
> --- rtl.h (working copy)
> *************** extern rtx gen_int_mode (HOST_WIDE_INT,
> *** 1494,1499 ****
> --- 1494,1500 ----
> extern rtx emit_copy_of_insn_after (rtx, rtx);
> extern void set_reg_attrs_from_value (rtx, rtx);
> extern void set_reg_attrs_for_parm (rtx, rtx);
> + extern void set_reg_attrs_for_decl_rtl (tree t, rtx x);
> extern void adjust_reg_mode (rtx, enum machine_mode);
> extern int mem_expr_equal_p (const_tree, const_tree);
>
> Index: tree-optimize.c
> ===================================================================
> *** tree-optimize.c (revision 146576)
> --- tree-optimize.c (working copy)
> *************** struct gimple_opt_pass pass_cleanup_cfg_
> *** 201,207 ****
> {
> {
> GIMPLE_PASS,
> ! "final_cleanup", /* name */
> NULL, /* gate */
> execute_cleanup_cfg_post_optimizing, /* execute */
> NULL, /* sub */
> --- 201,207 ----
> {
> {
> GIMPLE_PASS,
> ! "optimized", /* name */
> NULL, /* gate */
> execute_cleanup_cfg_post_optimizing, /* execute */
> NULL, /* sub */
> *************** struct gimple_opt_pass pass_cleanup_cfg_
> *** 213,225 ****
> 0, /* properties_destroyed */
> 0, /* todo_flags_start */
> TODO_dump_func /* todo_flags_finish */
> }
> };
>
> /* Pass: do the actions required to finish with tree-ssa optimization
> passes. */
>
> ! static unsigned int
> execute_free_datastructures (void)
> {
> free_dominance_info (CDI_DOMINATORS);
> --- 213,226 ----
> 0, /* properties_destroyed */
> 0, /* todo_flags_start */
> TODO_dump_func /* todo_flags_finish */
> + | TODO_remove_unused_locals
> }
> };
>
> /* Pass: do the actions required to finish with tree-ssa optimization
> passes. */
>
> ! unsigned int
> execute_free_datastructures (void)
> {
> free_dominance_info (CDI_DOMINATORS);
> *************** execute_free_datastructures (void)
> *** 228,233 ****
> --- 229,238 ----
> /* Remove the ssa structures. */
> if (cfun->gimple_df)
> delete_tree_ssa ();
> +
> + /* And get rid of annotations we no longer need. */
> + delete_tree_cfg_annotations ();
> +
> return 0;
> }
>
> *************** struct gimple_opt_pass pass_free_datastr
> *** 254,262 ****
> static unsigned int
> execute_free_cfg_annotations (void)
> {
> - /* And get rid of annotations we no longer need. */
> - delete_tree_cfg_annotations ();
> -
> return 0;
> }
It looks like the pass referencing this is now dead. Please remove
this function and the pass structure.
> [Message clipped]
bah ... ;) Other half in next mail.
Richard.
* Re: [RFA] expand from SSA form (1/2)
2009-04-23 15:10 ` Andrew MacLeod
@ 2009-04-24 9:42 ` Richard Guenther
2009-04-26 20:27 ` Michael Matz
1 sibling, 0 replies; 56+ messages in thread
From: Richard Guenther @ 2009-04-24 9:42 UTC (permalink / raw)
To: Andrew MacLeod; +Cc: Michael Matz, gcc-patches, Andrey Belevantsev
On Thu, Apr 23, 2009 at 4:30 PM, Andrew MacLeod <amacleod@redhat.com> wrote:
> Michael Matz wrote:
>>
>> On Wed, 22 Apr 2009, Michael Matz wrote:
>>
>> I'd like to ask for approval for the series.
>>
>>
>> Ciao,
>> Michael.
>>
>
> Looks good to me.
>
> Andrew
>
>> --- 233,251 ----
> if (gimple_assign_copy_p (stmt)
> && gimple_assign_lhs (stmt) == result
> && gimple_assign_rhs1 (stmt) == found)
>> ! {
>> ! unlink_stmt_vdef (stmt);
>> ! gsi_remove (&gsi, true);
>> ! }
> else
>
> Unrelated to your patch, but out of curiosity, shouldn't gsi_remove()
> automatically do an unlink_stmt_vdef() if the remove_permanently flag is
> true like it is here? Seems like an oversight bug waiting to happen...
No. As with release_defs, unlink_stmt_vdef should not be done
if you plan to re-use the virtual operands of the stmt, for example
when replacing the stmt with a newly built one.
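A toy model of the difference, assuming the usual picture where each
store reads one virtual operand (vuse) and writes a new one (vdef).
The types and helper names here are invented for illustration and are
not the GCC API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical statement: versions of the virtual operand it reads
   and writes, 0 meaning "none".  */
struct toy_stmt
{
  int vuse;
  int vdef;
};

/* Model of unlinking: every other statement that read S's vdef is
   rewired to read S's vuse, so S can be deleted without leaving
   dangling uses.  */
static void
toy_unlink_vdef (struct toy_stmt *stmts, size_t n, struct toy_stmt *s)
{
  size_t i;

  if (s->vdef == 0)
    return;
  for (i = 0; i < n; i++)
    if (&stmts[i] != s && stmts[i].vuse == s->vdef)
      stmts[i].vuse = s->vuse;
  s->vdef = 0;
}

/* Model of replacement: the new statement inherits the old virtual
   operands, so the users of the vdef must be left alone.  Unlinking
   first would have thrown that link away.  */
static void
toy_replace_stmt (struct toy_stmt *old_stmt, struct toy_stmt *new_stmt)
{
  new_stmt->vuse = old_stmt->vuse;
  new_stmt->vdef = old_stmt->vdef;
  old_stmt->vuse = old_stmt->vdef = 0;
}
```

So a gsi_remove that always unlinked would break the second case.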
Richard.
* Re: [RFA] expand from SSA form (1/2)
2009-04-22 16:45 ` [RFA] " Michael Matz
@ 2009-04-23 15:10 ` Andrew MacLeod
2009-04-24 9:42 ` Richard Guenther
2009-04-26 20:27 ` Michael Matz
2009-04-24 14:32 ` Richard Guenther
` (3 subsequent siblings)
4 siblings, 2 replies; 56+ messages in thread
From: Andrew MacLeod @ 2009-04-23 15:10 UTC (permalink / raw)
To: Michael Matz; +Cc: gcc-patches, Andrey Belevantsev
[-- Attachment #1: Type: text/plain, Size: 173 bytes --]
Michael Matz wrote:
> On Wed, 22 Apr 2009, Michael Matz wrote:
>
>
> I'd like to ask for approval for the series.
>
>
> Ciao,
> Michael.
>
Looks good to me.
Andrew
[-- Attachment #2: com --]
[-- Type: text/plain, Size: 1229 bytes --]
> --- 233,251 ----
if (gimple_assign_copy_p (stmt)
&& gimple_assign_lhs (stmt) == result
&& gimple_assign_rhs1 (stmt) == found)
> ! {
> ! unlink_stmt_vdef (stmt);
> ! gsi_remove (&gsi, true);
> ! }
else
Unrelated to your patch, but out of curiosity, shouldn't gsi_remove() automatically do an unlink_stmt_vdef() if the remove_permanently flag is true like it is here? Seems like an oversight bug waiting to happen...
> --- 2537,2584 ----
> rebuild_jump_labels (get_insns ());
> find_exception_handler_labels ();
>
> + FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, EXIT_BLOCK_PTR, next_bb)
> + {
> + edge e;
> + edge_iterator ei;
> + for (ei = ei_start (bb->succs); (e = ei_safe_edge (ei)); )
> + {
> + if (e->insns.r)
> + commit_one_edge_insertion (e);
> + else
> + ei_next (&ei);
> + }
> + }
I do think this should be fixed in commit_edge_insertions and the common code factored out and called from here, but that can be done separately. Then commit_one_edge_insertion() could return to being a static function as well.
I'm guessing current_ir_type() is not IR_RTL_CFGLAYOUT, or you could just call commit_edge_insertions directly...
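Something along these lines is what I mean by factoring out the common
code. This is only a toy sketch: the edge and block types and all the
toy_* names are invented, and the real commit_one_edge_insertion does
far more than clear a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for edges and basic blocks; not GCC types.  */
struct toy_edge { int pending_insns; };
struct toy_block { struct toy_edge *succs; size_t n_succs; };

/* Stand-in for commit_one_edge_insertion: emit and clear the pending
   instructions of E.  */
static void
toy_commit_one_edge (struct toy_edge *e)
{
  e->pending_insns = 0;
}

/* The shared helper: commit everything pending on BB's outgoing edges
   and return how many edges were committed.  As in the quoted loop,
   the index only advances when the current edge has nothing pending,
   so a slot is re-examined after each commit.  */
static unsigned
toy_commit_block_edges (struct toy_block *bb)
{
  unsigned committed = 0;
  size_t i = 0;

  while (i < bb->n_succs)
    {
      struct toy_edge *e = &bb->succs[i];
      if (e->pending_insns)
        {
          toy_commit_one_edge (e);
          committed++;
        }
      else
        i++;
    }
  return committed;
}

/* Both the generic entry point and the expander would then reduce to
   a walk over the blocks that calls the helper.  */
static unsigned
toy_commit_all_edges (struct toy_block *blocks, size_t n_blocks)
{
  unsigned committed = 0;
  size_t i;

  for (i = 0; i < n_blocks; i++)
    committed += toy_commit_block_edges (&blocks[i]);
  return committed;
}
```

With that split, the single-edge routine has only one caller and can
stay private to its file.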
* [RFA] expand from SSA form (1/2)
2009-04-22 10:54 ` Michael Matz
@ 2009-04-22 16:45 ` Michael Matz
2009-04-23 15:10 ` Andrew MacLeod
` (4 more replies)
0 siblings, 5 replies; 56+ messages in thread
From: Michael Matz @ 2009-04-22 16:45 UTC (permalink / raw)
To: gcc-patches; +Cc: Andrew MacLeod, Andrey Belevantsev
On Wed, 22 Apr 2009, Michael Matz wrote:
> I'll soon send a new version of the patch that fixes all problems and
> testcases I encountered.
Like so. This is the full patch, i.e. including the cleanups, but
excluding the testsuite changes. It should incorporate all feedback.
Compared to the last version it adds comments for new functions, fixes
mudflap2, and generally some other minor problems showing when I started
testing Ada and a bug reported by Andrey.
This patch (plus testsuite changes) was bootstrapped with Ada on
x86_64-linux. There are no testsuite regressions:
FAIL: gcc.dg/tree-prof/bb-reorg.c compilation, -fprofile-use -D_PROFILE_USE
FAIL: gcc.dg/tree-prof/pr34999.c compilation, -fprofile-use -D_PROFILE_USE
FAIL: gcc.target/i386/avx-vmovntdq-256-1.c (test for excess errors)
FAIL: gcc.target/i386/avx-vmovntpd-256-1.c (test for excess errors)
FAIL: gcc.target/i386/avx-vmovntps-256-1.c (test for excess errors)
FAIL: libmudflap.c++/pass41-frag.cxx execution test
FAIL: libmudflap.c++/pass41-frag.cxx (-static) execution test
FAIL: libmudflap.c++/pass41-frag.cxx (-O2) execution test
FAIL: libmudflap.c++/pass41-frag.cxx (-O3) execution test
All of these happen without the patch too (known bugs, old binutils, and
pass41-frag never seems to work anyway).
I'd like to ask for approval for the series.
Ciao,
Michael.
--
* builtins.c (fold_builtin_next_arg): Handle SSA names.
* tree-ssa-copyrename.c (rename_ssa_copies): Don't iterate
beyond num_ssa_names, use ssa_name() directly.
* tree-ssa-ter.c (free_temp_expr_table): Likewise.
* tree-ssa-coalesce.c (create_outofssa_var_map): Likewise,
mark only useful SSA names.
(compare_pairs): Swap cost comparison.
(coalesce_ssa_name): Don't use change_partition_var.
* tree-nrv.c (struct nrv_data): Add modified member.
(finalize_nrv_r): Set it.
(tree_nrv): Use it to update statements.
(pass_nrv): Require PROP_ssa.
* tree-mudflap.c (create_referenced_var): New static helper.
(mf_decl_cache_locals, mf_build_check_statement_for): Use it.
(pass_mudflap_2): Require PROP_ssa, run ssa update at finish.
* alias.c (find_base_decl): Handle SSA names.
* emit-rtl.c (set_reg_attrs_for_parm): Make non-static.
(component_ref_for_mem_expr): Don't leak SSA names into RTL.
* rtl.h (set_reg_attrs_for_parm): Declare.
* tree-optimize.c (pass_cleanup_cfg_post_optimizing): Rename
to "optimized", remove unused locals at finish.
(execute_free_datastructures): Make global, call
delete_tree_cfg_annotations.
(execute_free_cfg_annotations): Don't call
delete_tree_cfg_annotations.
* ssaexpand.h: New file.
* expr.c (toplevel): Include ssaexpand.h.
(expand_assignment): Handle SSA names the same as register
variables.
(expand_expr_real_1): Expand SSA names.
* cfgexpand.c (toplevel): Include ssaexpand.h.
(SA): New global variable.
(gimple_cond_pred_to_tree): Fold TERed comparisons into predicates.
(SSAVAR): New macro.
(set_rtl): New helper function.
(add_stack_var): Deal with SSA names, use set_rtl.
(expand_one_stack_var_at): Likewise.
(expand_one_stack_var): Deal with SSA names.
(stack_var_size_cmp): Use code (SSA_NAME / DECL) as tie breaker
before unique numbers.
(expand_stack_vars): Use set_rtl.
(expand_one_var): Accept SSA names, add asserts for them, feed them
to above subroutines.
(expand_used_vars): Expand all partitions (without default defs),
then only the local decls (ignoring those expanded already).
(expand_gimple_cond): Remove edges when jumpif() expands an
unconditional jump.
(expand_gimple_basic_block): Don't clear EDGE_EXECUTABLE here,
or remove abnormal edges. Ignore insns setting the LHS of a TERed
SSA name.
(gimple_expand_cfg): Call into rewrite_out_of_ssa, initialize
members of SA; deal with PARM_DECL partitions here; expand
all PHI nodes, free tree datastructures and SA. Commit instructions
on edges, clear EDGE_EXECUTABLE and remove abnormal edges here.
(pass_expand): Require and destroy PROP_ssa, verify SSA form, flow
info and statements at start, collect garbage at finish.
* tree-ssa-live.h (struct _var_map): Remove partition_to_var member.
(VAR_ANN_PARTITION): Remove.
(change_partition_var): Don't declare.
(partition_to_var): Always return SSA names.
(var_to_partition): Only accept SSA names.
(register_ssa_partition): Only check argument.
* tree-ssa-live.c (init_var_map): Don't allocate partition_to_var
member.
(delete_var_map): Don't free it.
(var_union): Only accept SSA names, simplify.
(partition_view_init): Mark only useful SSA names as used.
(partition_view_fini): Only deal with SSA names.
(change_partition_var): Remove.
(dump_var_map): Use ssa_name instead of partition_to_var member.
* tree-ssa.c (delete_tree_ssa): Don't remove PHI nodes on RTL
basic blocks.
* tree-outof-ssa.c (toplevel): Include ssaexpand.h and expr.h.
(struct _elim_graph): New member const_dests; nodes member vector of
ints.
(set_location_for_edge): New static helper.
(create_temp): Remove.
(insert_partition_copy_on_edge, insert_part_to_rtx_on_edge,
insert_value_copy_on_edge, insert_rtx_to_part_on_edge): New
functions.
(new_elim_graph): Allocate const_dests member.
(clean_elim_graph): Truncate const_dests member.
(delete_elim_graph): Free const_dests member.
(elim_graph_size): Adapt to new type of nodes member.
(elim_graph_add_node): Likewise.
(eliminate_name): Likewise.
(eliminate_build): Don't take basic block argument, deal only with
partition numbers, not variables.
(get_temp_reg): New static helper.
(elim_create): Use it, deal with RTL temporaries instead of trees.
(eliminate_phi): Adjust all calls to new signature.
(assign_vars, replace_use_variable, replace_def_variable): Remove.
(rewrite_trees): Only do checking.
(edge_leader, stmt_list, leader_has_match, leader_match): Remove.
(same_stmt_list_p, identical_copies_p, identical_stmt_lists_p,
init_analyze_edges_for_bb, fini_analyze_edges_for_bb,
contains_tree_r, MAX_STMTS_IN_LATCH,
process_single_block_loop_latch, analyze_edges_for_bb,
perform_edge_inserts): Remove.
(expand_phi_nodes): New global function.
(remove_ssa_form): Take ssaexpand parameter. Don't call removed
functions, initialize new parameter, remember partitions having a
default def.
(finish_out_of_ssa): New global function.
(rewrite_out_of_ssa): Make global. Adjust call to remove_ssa_form,
don't reset in_ssa_p here.
(pass_del_ssa): Remove.
* tree-flow.h (struct var_ann_d): Remove out_of_ssa_tag and
partition members.
(execute_free_datastructures): Declare.
* Makefile.in (SSAEXPAND_H): New variable.
(tree-outof-ssa.o, expr.o, cfgexpand.o): Depend on SSAEXPAND_H.
* basic-block.h (commit_one_edge_insertion): Declare.
* passes.c (init_optimization_passes): Move pass_nrv and
pass_mudflap2 before pass_cleanup_cfg_post_optimizing, remove
pass_del_ssa, pass_free_datastructures, pass_free_cfg_annotations.
* cfgrtl.c (commit_one_edge_insertion): Make global, don't declare.
(redirect_branch_edge): Deal with super block when expanding, split
out jump patching itself into ...
(patch_jump_insn): ... here, new static helper.
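For readers who want the stack_var_size_cmp tie-breaking order from the
ChangeLog in isolation, here is a minimal standalone sketch. It is not the
GCC code: struct entry, entry_cmp and sort_entries are hypothetical names,
and it sorts the entries directly, whereas the patch sorts an array of
indices into stack_vars. It only illustrates the ordering: size first, then
SSA names before plain decls, then the unique number within each kind.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a stack variable entry; not GCC's struct.  */
struct entry
{
  long size;        /* object size in bytes */
  int is_ssa;       /* nonzero for an SSA name, zero for a plain decl */
  unsigned int id;  /* SSA version or decl UID, unique within its kind */
};

/* qsort comparator: size first, then kind (SSA names before decls),
   then the unique id, so no two distinct entries compare equal.  */
static int
entry_cmp (const void *a, const void *b)
{
  const struct entry *ea = (const struct entry *) a;
  const struct entry *eb = (const struct entry *) b;

  if (ea->size != eb->size)
    return ea->size < eb->size ? -1 : 1;
  if (ea->is_ssa != eb->is_ssa)
    return ea->is_ssa ? -1 : 1;
  if (ea->id != eb->id)
    return ea->id < eb->id ? -1 : 1;
  return 0;
}

/* Sort N entries in place.  */
static void
sort_entries (struct entry *v, size_t n)
{
  qsort (v, n, sizeof (struct entry), entry_cmp);
}
```

Because the comparator never returns 0 for two distinct entries, the result
does not depend on whether the host's qsort is stable.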
Index: builtins.c
===================================================================
*** builtins.c (revision 146576)
--- builtins.c (working copy)
*************** fold_builtin_next_arg (tree exp, bool va
*** 11801,11806 ****
--- 11801,11809 ----
arg = CALL_EXPR_ARG (exp, 0);
}
+ if (TREE_CODE (arg) == SSA_NAME)
+ arg = SSA_NAME_VAR (arg);
+
/* We destructively modify the call to be __builtin_va_start (ap, 0)
or __builtin_next_arg (0) the first time we see it, after checking
the arguments and if needed issuing a warning. */
Index: tree-ssa-copyrename.c
===================================================================
*** tree-ssa-copyrename.c (revision 146576)
--- tree-ssa-copyrename.c (working copy)
*************** rename_ssa_copies (void)
*** 291,297 ****
else
debug = NULL;
! map = init_var_map (num_ssa_names + 1);
FOR_EACH_BB (bb)
{
--- 291,297 ----
else
debug = NULL;
! map = init_var_map (num_ssa_names);
FOR_EACH_BB (bb)
{
*************** rename_ssa_copies (void)
*** 339,350 ****
/* Now one more pass to make all elements of a partition share the same
root variable. */
! for (x = 1; x <= num_ssa_names; x++)
{
part_var = partition_to_var (map, x);
if (!part_var)
continue;
! var = map->partition_to_var[x];
if (debug)
{
if (SSA_NAME_VAR (var) != SSA_NAME_VAR (part_var))
--- 339,350 ----
/* Now one more pass to make all elements of a partition share the same
root variable. */
! for (x = 1; x < num_ssa_names; x++)
{
part_var = partition_to_var (map, x);
if (!part_var)
continue;
! var = ssa_name (x);
if (debug)
{
if (SSA_NAME_VAR (var) != SSA_NAME_VAR (part_var))
Index: tree-nrv.c
===================================================================
*** tree-nrv.c (revision 146576)
--- tree-nrv.c (working copy)
*************** struct nrv_data
*** 56,61 ****
--- 56,62 ----
/* This is the function's RESULT_DECL. We will replace all occurrences
of VAR with RESULT_DECL when we apply this optimization. */
tree result;
+ int modified;
};
static tree finalize_nrv_r (tree *, int *, void *);
*************** finalize_nrv_r (tree *tp, int *walk_subt
*** 83,89 ****
/* Otherwise replace all occurrences of VAR with RESULT. */
else if (*tp == dp->var)
! *tp = dp->result;
/* Keep iterating. */
return NULL_TREE;
--- 84,93 ----
/* Otherwise replace all occurrences of VAR with RESULT. */
else if (*tp == dp->var)
! {
! *tp = dp->result;
! dp->modified = 1;
! }
/* Keep iterating. */
return NULL_TREE;
*************** tree_nrv (void)
*** 229,241 ****
if (gimple_assign_copy_p (stmt)
&& gimple_assign_lhs (stmt) == result
&& gimple_assign_rhs1 (stmt) == found)
! gsi_remove (&gsi, true);
else
{
struct walk_stmt_info wi;
memset (&wi, 0, sizeof (wi));
wi.info = &data;
walk_gimple_op (stmt, finalize_nrv_r, &wi);
gsi_next (&gsi);
}
}
--- 233,251 ----
if (gimple_assign_copy_p (stmt)
&& gimple_assign_lhs (stmt) == result
&& gimple_assign_rhs1 (stmt) == found)
! {
! unlink_stmt_vdef (stmt);
! gsi_remove (&gsi, true);
! }
else
{
struct walk_stmt_info wi;
memset (&wi, 0, sizeof (wi));
wi.info = &data;
+ data.modified = 0;
walk_gimple_op (stmt, finalize_nrv_r, &wi);
+ if (data.modified)
+ update_stmt (stmt);
gsi_next (&gsi);
}
}
*************** struct gimple_opt_pass pass_nrv =
*** 263,269 ****
NULL, /* next */
0, /* static_pass_number */
TV_TREE_NRV, /* tv_id */
! PROP_cfg, /* properties_required */
0, /* properties_provided */
0, /* properties_destroyed */
0, /* todo_flags_start */
--- 273,279 ----
NULL, /* next */
0, /* static_pass_number */
TV_TREE_NRV, /* tv_id */
! PROP_ssa | PROP_cfg, /* properties_required */
0, /* properties_provided */
0, /* properties_destroyed */
0, /* todo_flags_start */
Index: expr.c
===================================================================
*** expr.c (revision 146576)
--- expr.c (working copy)
*************** along with GCC; see the file COPYING3.
*** 54,59 ****
--- 54,60 ----
#include "timevar.h"
#include "df.h"
#include "diagnostic.h"
+ #include "ssaexpand.h"
/* Decide whether a function's arguments should be processed
from first to last or from last to first.
*************** expand_assignment (tree to, tree from, b
*** 4284,4295 ****
Don't do this if TO is a VAR_DECL or PARM_DECL whose DECL_RTL is REG
since it might be a promoted variable where the zero- or sign- extension
needs to be done. Handling this in the normal way is safe because no
! computation is done before the call. */
if (TREE_CODE (from) == CALL_EXPR && ! aggregate_value_p (from, from)
&& COMPLETE_TYPE_P (TREE_TYPE (from))
&& TREE_CODE (TYPE_SIZE (TREE_TYPE (from))) == INTEGER_CST
! && ! ((TREE_CODE (to) == VAR_DECL || TREE_CODE (to) == PARM_DECL)
! && REG_P (DECL_RTL (to))))
{
rtx value;
--- 4285,4297 ----
Don't do this if TO is a VAR_DECL or PARM_DECL whose DECL_RTL is REG
since it might be a promoted variable where the zero- or sign- extension
needs to be done. Handling this in the normal way is safe because no
! computation is done before the call. The same is true for SSA names. */
if (TREE_CODE (from) == CALL_EXPR && ! aggregate_value_p (from, from)
&& COMPLETE_TYPE_P (TREE_TYPE (from))
&& TREE_CODE (TYPE_SIZE (TREE_TYPE (from))) == INTEGER_CST
! && ! (((TREE_CODE (to) == VAR_DECL || TREE_CODE (to) == PARM_DECL)
! && REG_P (DECL_RTL (to)))
! || TREE_CODE (to) == SSA_NAME))
{
rtx value;
*************** expand_expr_real_1 (tree exp, rtx target
*** 7223,7230 ****
}
case SSA_NAME:
! return expand_expr_real_1 (SSA_NAME_VAR (exp), target, tmode, modifier,
! NULL);
case PARM_DECL:
case VAR_DECL:
--- 7225,7245 ----
}
case SSA_NAME:
! /* ??? ivopts calls expander, without any preparation from
! out-of-ssa. So fake instructions as if this was an access to the
! base variable. This unnecessarily allocates a pseudo, see how we can
! reuse it, if partition base vars have it set already. */
! if (!currently_expanding_to_rtl)
! return expand_expr_real_1 (SSA_NAME_VAR (exp), target, tmode, modifier, NULL);
! {
! gimple g = get_gimple_for_ssa_name (exp);
! if (g)
! return expand_expr_real_1 (gimple_assign_rhs_to_tree (g), target,
! tmode, modifier, NULL);
! }
! decl_rtl = get_rtx_for_ssa_name (exp);
! exp = SSA_NAME_VAR (exp);
! goto expand_decl_rtl;
case PARM_DECL:
case VAR_DECL:
*************** expand_expr_real_1 (tree exp, rtx target
*** 7250,7255 ****
--- 7265,7271 ----
case FUNCTION_DECL:
case RESULT_DECL:
decl_rtl = DECL_RTL (exp);
+ expand_decl_rtl:
gcc_assert (decl_rtl);
decl_rtl = copy_rtx (decl_rtl);
Index: alias.c
===================================================================
*** alias.c (revision 146576)
--- alias.c (working copy)
*************** find_base_decl (tree t)
*** 436,441 ****
--- 436,444 ----
if (t == 0 || t == error_mark_node || ! POINTER_TYPE_P (TREE_TYPE (t)))
return 0;
+ if (TREE_CODE (t) == SSA_NAME)
+ t = SSA_NAME_VAR (t);
+
/* If this is a declaration, return it. If T is based on a restrict
qualified decl, return that decl. */
if (DECL_P (t))
Index: tree-ssa-coalesce.c
===================================================================
*** tree-ssa-coalesce.c (revision 146576)
--- tree-ssa-coalesce.c (working copy)
*************** compare_pairs (const void *p1, const voi
*** 314,320 ****
const_coalesce_pair_p const *const pp2 = (const_coalesce_pair_p const *) p2;
int result;
! result = (* pp2)->cost - (* pp1)->cost;
/* Since qsort does not guarantee stability we use the elements
as a secondary key. This provides us with independence from
the host's implementation of the sorting algorithm. */
--- 314,320 ----
const_coalesce_pair_p const *const pp2 = (const_coalesce_pair_p const *) p2;
int result;
! result = (* pp1)->cost - (* pp2)->cost;
/* Since qsort does not guarantee stability we use the elements
as a secondary key. This provides us with independence from
the host's implementation of the sorting algorithm. */
*************** create_outofssa_var_map (coalesce_list_p
*** 974,980 ****
used_in_virtual_ops = BITMAP_ALLOC (NULL);
#endif
! map = init_var_map (num_ssa_names + 1);
FOR_EACH_BB (bb)
{
--- 974,980 ----
used_in_virtual_ops = BITMAP_ALLOC (NULL);
#endif
! map = init_var_map (num_ssa_names);
FOR_EACH_BB (bb)
{
*************** create_outofssa_var_map (coalesce_list_p
*** 1126,1133 ****
first = NULL_TREE;
for (i = 1; i < num_ssa_names; i++)
{
! var = map->partition_to_var[i];
! if (var != NULL_TREE)
{
/* Add coalesces between all the result decls. */
if (TREE_CODE (SSA_NAME_VAR (var)) == RESULT_DECL)
--- 1126,1133 ----
first = NULL_TREE;
for (i = 1; i < num_ssa_names; i++)
{
! var = ssa_name (i);
! if (var != NULL_TREE && is_gimple_reg (var))
{
/* Add coalesces between all the result decls. */
if (TREE_CODE (SSA_NAME_VAR (var)) == RESULT_DECL)
*************** create_outofssa_var_map (coalesce_list_p
*** 1148,1154 ****
/* Mark any default_def variables as being in the coalesce list
since they will have to be coalesced with the base variable. If
not marked as present, they won't be in the coalesce view. */
! if (gimple_default_def (cfun, SSA_NAME_VAR (var)) == var)
bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
}
}
--- 1148,1155 ----
/* Mark any default_def variables as being in the coalesce list
since they will have to be coalesced with the base variable. If
not marked as present, they won't be in the coalesce view. */
! if (gimple_default_def (cfun, SSA_NAME_VAR (var)) == var
! && !has_zero_uses (var))
bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
}
}
*************** eq_ssa_name_by_var (const void *p1, cons
*** 1329,1335 ****
extern var_map
coalesce_ssa_name (void)
{
- unsigned num, x;
tree_live_info_p liveinfo;
ssa_conflicts_p graph;
coalesce_list_p cl;
--- 1330,1335 ----
*************** coalesce_ssa_name (void)
*** 1406,1436 ****
/* First, coalesce all live on entry variables to their base variable.
This will ensure the first use is coming from the correct location. */
- num = num_var_partitions (map);
- for (x = 0 ; x < num; x++)
- {
- tree var = partition_to_var (map, x);
- tree root;
-
- if (TREE_CODE (var) != SSA_NAME)
- continue;
-
- root = SSA_NAME_VAR (var);
- if (gimple_default_def (cfun, root) == var)
- {
- /* This root variable should have not already been assigned
- to another partition which is not coalesced with this one. */
- gcc_assert (!var_ann (root)->out_of_ssa_tag);
-
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- print_exprs (dump_file, "Must coalesce ", var,
- " with the root variable ", root, ".\n");
- }
- change_partition_var (map, root, x);
- }
- }
-
if (dump_file && (dump_flags & TDF_DETAILS))
dump_var_map (dump_file, map);
--- 1406,1411 ----
Index: emit-rtl.c
===================================================================
*** emit-rtl.c (revision 146576)
--- emit-rtl.c (working copy)
*************** set_reg_attrs_for_parm (rtx parm_rtx, rt
*** 1028,1034 ****
/* Set the REG_ATTRS for registers in value X, given that X represents
decl T. */
! static void
set_reg_attrs_for_decl_rtl (tree t, rtx x)
{
if (GET_CODE (x) == SUBREG)
--- 1028,1034 ----
/* Set the REG_ATTRS for registers in value X, given that X represents
decl T. */
! void
set_reg_attrs_for_decl_rtl (tree t, rtx x)
{
if (GET_CODE (x) == SUBREG)
*************** component_ref_for_mem_expr (tree ref)
*** 1449,1455 ****
inner = NULL_TREE;
}
! if (inner == TREE_OPERAND (ref, 0))
return ref;
else
return build3 (COMPONENT_REF, TREE_TYPE (ref), inner,
--- 1449,1458 ----
inner = NULL_TREE;
}
! if (inner == TREE_OPERAND (ref, 0)
! /* Don't leak SSA-names in the third operand. */
! && (!TREE_OPERAND (ref, 2)
! || TREE_CODE (TREE_OPERAND (ref, 2)) != SSA_NAME))
return ref;
else
return build3 (COMPONENT_REF, TREE_TYPE (ref), inner,
Index: cfgexpand.c
===================================================================
*** cfgexpand.c (revision 146576)
--- cfgexpand.c (working copy)
*************** along with GCC; see the file COPYING3.
*** 42,49 ****
--- 42,54 ----
#include "tree-inline.h"
#include "value-prof.h"
#include "target.h"
+ #include "ssaexpand.h"
+ /* This variable holds information helping the rewriting of SSA trees
+ into RTL. */
+ struct ssaexpand SA;
+
/* Return an expression tree corresponding to the RHS of GIMPLE
statement STMT. */
*************** gimple_assign_rhs_to_tree (gimple stmt)
*** 78,85 ****
static tree
gimple_cond_pred_to_tree (gimple stmt)
{
return build2 (gimple_cond_code (stmt), boolean_type_node,
! gimple_cond_lhs (stmt), gimple_cond_rhs (stmt));
}
/* Helper for gimple_to_tree. Set EXPR_LOCATION for every expression
--- 83,104 ----
static tree
gimple_cond_pred_to_tree (gimple stmt)
{
+ /* We're sometimes presented with such code:
+ D.123_1 = x < y;
+ if (D.123_1 != 0)
+ ...
+ This would expand to two comparisons which then later might
+ be cleaned up by combine. But some pattern matchers like if-conversion
+ work better when there's only one compare, so make up for this
+ here as special exception if TER would have made the same change. */
+ tree lhs = gimple_cond_lhs (stmt);
+ if (SA.values
+ && TREE_CODE (lhs) == SSA_NAME
+ && SA.values[SSA_NAME_VERSION (lhs)])
+ lhs = gimple_assign_rhs_to_tree (SA.values[SSA_NAME_VERSION (lhs)]);
+
return build2 (gimple_cond_code (stmt), boolean_type_node,
! lhs, gimple_cond_rhs (stmt));
}
/* Helper for gimple_to_tree. Set EXPR_LOCATION for every expression
*************** failed:
*** 423,428 ****
--- 442,464 ----
#define STACK_ALIGNMENT_NEEDED 1
#endif
+ #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x)
+
+ /* Associate declaration T with storage space X. If T is no
+ SSA name this is exactly SET_DECL_RTL, otherwise make the
+ partition of T associated with X. */
+ static inline void
+ set_rtl (tree t, rtx x)
+ {
+ if (TREE_CODE (t) == SSA_NAME)
+ {
+ SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x;
+ if (x && !MEM_P (x))
+ set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x);
+ }
+ else
+ SET_DECL_RTL (t, x);
+ }
/* This structure holds data relevant to one variable that will be
placed in a stack slot. */
*************** add_stack_var (tree decl)
*** 561,575 ****
}
stack_vars[stack_vars_num].decl = decl;
stack_vars[stack_vars_num].offset = 0;
! stack_vars[stack_vars_num].size = tree_low_cst (DECL_SIZE_UNIT (decl), 1);
! stack_vars[stack_vars_num].alignb = get_decl_align_unit (decl);
/* All variables are initially in their own partition. */
stack_vars[stack_vars_num].representative = stack_vars_num;
stack_vars[stack_vars_num].next = EOC;
/* Ensure that this decl doesn't get put onto the list twice. */
! SET_DECL_RTL (decl, pc_rtx);
stack_vars_num++;
}
--- 597,611 ----
}
stack_vars[stack_vars_num].decl = decl;
stack_vars[stack_vars_num].offset = 0;
! stack_vars[stack_vars_num].size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (decl)), 1);
! stack_vars[stack_vars_num].alignb = get_decl_align_unit (SSAVAR (decl));
/* All variables are initially in their own partition. */
stack_vars[stack_vars_num].representative = stack_vars_num;
stack_vars[stack_vars_num].next = EOC;
/* Ensure that this decl doesn't get put onto the list twice. */
! set_rtl (decl, pc_rtx);
stack_vars_num++;
}
*************** add_alias_set_conflicts (void)
*** 688,709 ****
}
/* A subroutine of partition_stack_vars. A comparison function for qsort,
! sorting an array of indices by the size of the object. */
static int
stack_var_size_cmp (const void *a, const void *b)
{
HOST_WIDE_INT sa = stack_vars[*(const size_t *)a].size;
HOST_WIDE_INT sb = stack_vars[*(const size_t *)b].size;
! unsigned int uida = DECL_UID (stack_vars[*(const size_t *)a].decl);
! unsigned int uidb = DECL_UID (stack_vars[*(const size_t *)b].decl);
if (sa < sb)
return -1;
if (sa > sb)
return 1;
! /* For stack variables of the same size use the uid of the decl
! to make the sort stable. */
if (uida < uidb)
return -1;
if (uida > uidb)
--- 724,760 ----
}
/* A subroutine of partition_stack_vars. A comparison function for qsort,
! sorting an array of indices by the size and type of the object. */
static int
stack_var_size_cmp (const void *a, const void *b)
{
HOST_WIDE_INT sa = stack_vars[*(const size_t *)a].size;
HOST_WIDE_INT sb = stack_vars[*(const size_t *)b].size;
! tree decla, declb;
! unsigned int uida, uidb;
if (sa < sb)
return -1;
if (sa > sb)
return 1;
! decla = stack_vars[*(const size_t *)a].decl;
! declb = stack_vars[*(const size_t *)b].decl;
! /* For stack variables of the same size use an id of the decls
! to make the sort stable. Two SSA names are compared by their
! version, SSA names come before non-SSA names, and two normal
! decls are compared by their DECL_UID. */
! if (TREE_CODE (decla) == SSA_NAME)
! {
! if (TREE_CODE (declb) == SSA_NAME)
! uida = SSA_NAME_VERSION (decla), uidb = SSA_NAME_VERSION (declb);
! else
! return -1;
! }
! else if (TREE_CODE (declb) == SSA_NAME)
! return 1;
! else
! uida = DECL_UID (decla), uidb = DECL_UID (declb);
if (uida < uidb)
return -1;
if (uida > uidb)
*************** expand_one_stack_var_at (tree decl, HOST
*** 874,894 ****
gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
x = plus_constant (virtual_stack_vars_rtx, offset);
! x = gen_rtx_MEM (DECL_MODE (decl), x);
! /* Set alignment we actually gave this decl. */
! offset -= frame_phase;
! align = offset & -offset;
! align *= BITS_PER_UNIT;
! if (align == 0)
! align = STACK_BOUNDARY;
! else if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
! align = MAX_SUPPORTED_STACK_ALIGNMENT;
! DECL_ALIGN (decl) = align;
! DECL_USER_ALIGN (decl) = 0;
! set_mem_attributes (x, decl, true);
! SET_DECL_RTL (decl, x);
}
/* A subroutine of expand_used_vars. Give each partition representative
--- 925,951 ----
gcc_assert (offset == trunc_int_for_mode (offset, Pmode));
x = plus_constant (virtual_stack_vars_rtx, offset);
! x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x);
! if (TREE_CODE (decl) != SSA_NAME)
! {
! /* Set alignment we actually gave this decl if it isn't an SSA name.
! If it is we generate stack slots only accidentally so it isn't as
! important, we'll simply use the alignment that is already set. */
! offset -= frame_phase;
! align = offset & -offset;
! align *= BITS_PER_UNIT;
! if (align == 0)
! align = STACK_BOUNDARY;
! else if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
! align = MAX_SUPPORTED_STACK_ALIGNMENT;
! DECL_ALIGN (decl) = align;
! DECL_USER_ALIGN (decl) = 0;
! }
!
! set_mem_attributes (x, SSAVAR (decl), true);
! set_rtl (decl, x);
}
/* A subroutine of expand_used_vars. Give each partition representative
*************** expand_stack_vars (bool (*pred) (tree))
*** 912,918 ****
/* Skip variables that have already had rtl assigned. See also
add_stack_var where we perpetrate this pc_rtx hack. */
! if (DECL_RTL (stack_vars[i].decl) != pc_rtx)
continue;
/* Check the predicate to see whether this variable should be
--- 969,977 ----
/* Skip variables that have already had rtl assigned. See also
add_stack_var where we perpetrate this pc_rtx hack. */
! if ((TREE_CODE (stack_vars[i].decl) == SSA_NAME
! ? SA.partition_to_pseudo[var_to_partition (SA.map, stack_vars[i].decl)]
! : DECL_RTL (stack_vars[i].decl)) != pc_rtx)
continue;
/* Check the predicate to see whether this variable should be
*************** account_stack_vars (void)
*** 951,957 ****
size += stack_vars[i].size;
for (j = i; j != EOC; j = stack_vars[j].next)
! SET_DECL_RTL (stack_vars[j].decl, NULL);
}
return size;
}
--- 1010,1016 ----
size += stack_vars[i].size;
for (j = i; j != EOC; j = stack_vars[j].next)
! set_rtl (stack_vars[j].decl, NULL);
}
return size;
}
*************** expand_one_stack_var (tree var)
*** 964,971 ****
{
HOST_WIDE_INT size, offset, align;
! size = tree_low_cst (DECL_SIZE_UNIT (var), 1);
! align = get_decl_align_unit (var);
offset = alloc_stack_frame_space (size, align);
expand_one_stack_var_at (var, offset);
--- 1023,1030 ----
{
HOST_WIDE_INT size, offset, align;
! size = tree_low_cst (DECL_SIZE_UNIT (SSAVAR (var)), 1);
! align = get_decl_align_unit (SSAVAR (var));
offset = alloc_stack_frame_space (size, align);
expand_one_stack_var_at (var, offset);
*************** expand_one_hard_reg_var (tree var)
*** 986,1005 ****
static void
expand_one_register_var (tree var)
{
! tree type = TREE_TYPE (var);
int unsignedp = TYPE_UNSIGNED (type);
enum machine_mode reg_mode
! = promote_mode (type, DECL_MODE (var), &unsignedp, 0);
rtx x = gen_reg_rtx (reg_mode);
! SET_DECL_RTL (var, x);
/* Note if the object is a user variable. */
! if (!DECL_ARTIFICIAL (var))
! mark_user_reg (x);
if (POINTER_TYPE_P (type))
! mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
}
/* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL that
--- 1045,1065 ----
static void
expand_one_register_var (tree var)
{
! tree decl = SSAVAR (var);
! tree type = TREE_TYPE (decl);
int unsignedp = TYPE_UNSIGNED (type);
enum machine_mode reg_mode
! = promote_mode (type, DECL_MODE (decl), &unsignedp, 0);
rtx x = gen_reg_rtx (reg_mode);
! set_rtl (var, x);
/* Note if the object is a user variable. */
! if (!DECL_ARTIFICIAL (decl))
! mark_user_reg (x);
if (POINTER_TYPE_P (type))
! mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type)));
}
/* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL that
*************** defer_stack_allocation (tree var, bool t
*** 1067,1072 ****
--- 1127,1135 ----
static HOST_WIDE_INT
expand_one_var (tree var, bool toplevel, bool really_expand)
{
+ tree origvar = var;
+ var = SSAVAR (var);
+
if (SUPPORTS_STACK_ALIGNMENT
&& TREE_TYPE (var) != error_mark_node
&& TREE_CODE (var) == VAR_DECL)
*************** expand_one_var (tree var, bool toplevel,
*** 1092,1098 ****
}
}
! if (TREE_CODE (var) != VAR_DECL)
;
else if (DECL_EXTERNAL (var))
;
--- 1155,1172 ----
}
}
! if (TREE_CODE (origvar) == SSA_NAME)
! {
! gcc_assert (TREE_CODE (var) != VAR_DECL
! || (!DECL_EXTERNAL (var)
! && !DECL_HAS_VALUE_EXPR_P (var)
! && !TREE_STATIC (var)
! && !DECL_RTL_SET_P (var)
! && TREE_TYPE (var) != error_mark_node
! && !DECL_HARD_REGISTER (var)
! && really_expand));
! }
! if (TREE_CODE (var) != VAR_DECL && TREE_CODE (origvar) != SSA_NAME)
;
else if (DECL_EXTERNAL (var))
;
*************** expand_one_var (tree var, bool toplevel,
*** 1107,1113 ****
if (really_expand)
expand_one_error_var (var);
}
! else if (DECL_HARD_REGISTER (var))
{
if (really_expand)
expand_one_hard_reg_var (var);
--- 1181,1187 ----
if (really_expand)
expand_one_error_var (var);
}
! else if (TREE_CODE (var) == VAR_DECL && DECL_HARD_REGISTER (var))
{
if (really_expand)
expand_one_hard_reg_var (var);
*************** expand_one_var (tree var, bool toplevel,
*** 1115,1128 ****
else if (use_register_for_decl (var))
{
if (really_expand)
! expand_one_register_var (var);
}
else if (defer_stack_allocation (var, toplevel))
! add_stack_var (var);
else
{
if (really_expand)
! expand_one_stack_var (var);
return tree_low_cst (DECL_SIZE_UNIT (var), 1);
}
return 0;
--- 1189,1202 ----
else if (use_register_for_decl (var))
{
if (really_expand)
! expand_one_register_var (origvar);
}
else if (defer_stack_allocation (var, toplevel))
! add_stack_var (origvar);
else
{
if (really_expand)
! expand_one_stack_var (origvar);
return tree_low_cst (DECL_SIZE_UNIT (var), 1);
}
return 0;
*************** static void
*** 1441,1446 ****
--- 1515,1521 ----
expand_used_vars (void)
{
tree t, next, outer_block = DECL_INITIAL (current_function_decl);
+ unsigned i;
/* Compute the phase of the stack frame for this function. */
{
*************** expand_used_vars (void)
*** 1451,1456 ****
--- 1526,1553 ----
init_vars_expansion ();
+ for (i = 0; i < SA.map->num_partitions; i++)
+ {
+ tree var = partition_to_var (SA.map, i);
+
+ gcc_assert (is_gimple_reg (var));
+ if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL)
+ expand_one_var (var, true, true);
+ else
+ {
+ /* This is a PARM_DECL or RESULT_DECL. For those partitions that
+ contain the default def (representing the parm or result itself)
+ we don't do anything here. But those which don't contain the
+ default def (representing a temporary based on the parm/result)
+ we need to allocate space just like for normal VAR_DECLs. */
+ if (!bitmap_bit_p (SA.partition_has_default_def, i))
+ {
+ expand_one_var (var, true, true);
+ gcc_assert (SA.partition_to_pseudo[i]);
+ }
+ }
+ }
+
/* At this point all variables on the local_decls with TREE_USED
set are not associated with any block scope. Lay them out. */
t = cfun->local_decls;
*************** expand_used_vars (void)
*** 1462,1480 ****
next = TREE_CHAIN (t);
/* We didn't set a block for static or extern because it's hard
to tell the difference between a global variable (re)declared
in a local scope, and one that's really declared there to
begin with. And it doesn't really matter much, since we're
not giving them stack space. Expand them now. */
! if (TREE_STATIC (var) || DECL_EXTERNAL (var))
! expand_now = true;
!
! /* Any variable that could have been hoisted into an SSA_NAME
! will have been propagated anywhere the optimizers chose,
! i.e. not confined to their original block. Allocate them
! as if they were defined in the outermost scope. */
! else if (is_gimple_reg (var))
expand_now = true;
/* If the variable is not associated with any block, then it
--- 1559,1573 ----
next = TREE_CHAIN (t);
+ /* Expanded above already. */
+ if (is_gimple_reg (var))
+ ;
/* We didn't set a block for static or extern because it's hard
to tell the difference between a global variable (re)declared
in a local scope, and one that's really declared there to
begin with. And it doesn't really matter much, since we're
not giving them stack space. Expand them now. */
! else if (TREE_STATIC (var) || DECL_EXTERNAL (var))
expand_now = true;
/* If the variable is not associated with any block, then it
*************** expand_gimple_cond (basic_block bb, gimp
*** 1674,1679 ****
--- 1767,1785 ----
true_edge->goto_block = NULL;
false_edge->flags |= EDGE_FALLTHRU;
ggc_free (pred);
+ /* Special case: when jumpif decides that the condition is
+ trivial it emits an unconditional jump (and the necessary
+ barrier). But we still have two edges, the fallthru one is
+ wrong. purge_dead_edges would clean this up later. Unfortunately
+ we have to insert insns (and split edges) before
+ find_many_sub_basic_blocks and hence before purge_dead_edges.
+ But splitting edges might create new blocks which depend on the
+ fact that if there are two edges there's no barrier. So the
+ barrier would get lost and verify_flow_info would ICE. Instead
+ of auditing all edge splitters to care for the barrier (which
+ normally isn't there in a cleaned CFG), fix it here. */
+ if (BARRIER_P (get_last_insn ()))
+ remove_edge (false_edge);
return NULL;
}
if (true_edge->dest == bb->next_bb)
*************** expand_gimple_cond (basic_block bb, gimp
*** 1690,1695 ****
--- 1796,1803 ----
false_edge->goto_block = NULL;
true_edge->flags |= EDGE_FALLTHRU;
ggc_free (pred);
+ if (BARRIER_P (get_last_insn ()))
+ remove_edge (true_edge);
return NULL;
}
*************** expand_gimple_basic_block (basic_block b
*** 1932,1951 ****
NOTE_BASIC_BLOCK (note) = bb;
- for (ei = ei_start (bb->succs); (e = ei_safe_edge (ei)); )
- {
- /* Clear EDGE_EXECUTABLE. This flag is never used in the backend. */
- e->flags &= ~EDGE_EXECUTABLE;
-
- /* At the moment not all abnormal edges match the RTL representation.
- It is safe to remove them here as find_many_sub_basic_blocks will
- rediscover them. In the future we should get this fixed properly. */
- if (e->flags & EDGE_ABNORMAL)
- remove_edge (e);
- else
- ei_next (&ei);
- }
-
for (; !gsi_end_p (gsi); gsi_next (&gsi))
{
gimple stmt = gsi_stmt (gsi);
--- 2040,2045 ----
*************** expand_gimple_basic_block (basic_block b
*** 1975,1981 ****
}
else if (gimple_code (stmt) != GIMPLE_CHANGE_DYNAMIC_TYPE)
{
! tree stmt_tree = gimple_to_tree (stmt);
last = get_last_insn ();
expand_expr_stmt (stmt_tree);
maybe_dump_rtl_for_gimple_stmt (stmt, last);
--- 2069,2087 ----
}
else if (gimple_code (stmt) != GIMPLE_CHANGE_DYNAMIC_TYPE)
{
! def_operand_p def_p;
! tree stmt_tree;
! def_p = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF);
!
! if (def_p != NULL)
! {
! /* Ignore this stmt if it is in the list of
! replaceable expressions. */
! if (SA.values
! && SA.values[SSA_NAME_VERSION (DEF_FROM_PTR (def_p))])
! continue;
! }
! stmt_tree = gimple_to_tree (stmt);
last = get_last_insn ();
expand_expr_stmt (stmt_tree);
maybe_dump_rtl_for_gimple_stmt (stmt, last);
*************** gimple_expand_cfg (void)
*** 2286,2291 ****
--- 2392,2402 ----
sbitmap blocks;
edge_iterator ei;
edge e;
+ unsigned i;
+
+ rewrite_out_of_ssa (&SA);
+ SA.partition_to_pseudo = (rtx *)xcalloc (SA.map->num_partitions,
+ sizeof (rtx));
/* Some backends want to know that we are expanding to RTL. */
currently_expanding_to_rtl = 1;
*************** gimple_expand_cfg (void)
*** 2339,2344 ****
--- 2450,2478 ----
/* Set up parameters and prepare for return, for the function. */
expand_function_start (current_function_decl);
+ /* Now that we also have the parameter RTXs, copy them over to our
+ partitions. */
+ for (i = 0; i < SA.map->num_partitions; i++)
+ {
+ tree var = SSA_NAME_VAR (partition_to_var (SA.map, i));
+
+ if (TREE_CODE (var) != VAR_DECL
+ && !SA.partition_to_pseudo[i])
+ SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var);
+ gcc_assert (SA.partition_to_pseudo[i]);
+ /* Some RTL parts really want to look at DECL_RTL(x) when x
+ was a decl marked in REG_ATTR or MEM_ATTR. We could use
+ SET_DECL_RTL here to make this available, but that would mean
+ selecting one of the potentially many RTLs for one DECL. Instead
+ of doing that we simply reset the MEM_EXPR of the RTL in question;
+ then nobody can get at it and hence nobody can call DECL_RTL on it. */
+ if (!DECL_RTL_SET_P (var))
+ {
+ if (MEM_P (SA.partition_to_pseudo[i]))
+ set_mem_expr (SA.partition_to_pseudo[i], NULL);
+ }
+ }
+
/* If this function is `main', emit a call to `__main'
to run global initializers, etc. */
if (DECL_NAME (current_function_decl)
*************** gimple_expand_cfg (void)
*** 2371,2380 ****
/* Register rtl specific functions for cfg. */
rtl_register_cfg_hooks ();
init_block = construct_init_block ();
/* Clear EDGE_EXECUTABLE on the entry edge(s). It is cleaned from the
! remaining edges in expand_gimple_basic_block. */
FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR->succs)
e->flags &= ~EDGE_EXECUTABLE;
--- 2505,2516 ----
/* Register rtl specific functions for cfg. */
rtl_register_cfg_hooks ();
+ expand_phi_nodes (&SA);
+
init_block = construct_init_block ();
/* Clear EDGE_EXECUTABLE on the entry edge(s). It is cleaned from the
! remaining edges later. */
FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR->succs)
e->flags &= ~EDGE_EXECUTABLE;
*************** gimple_expand_cfg (void)
*** 2382,2387 ****
--- 2518,2526 ----
FOR_BB_BETWEEN (bb, init_block->next_bb, EXIT_BLOCK_PTR, next_bb)
bb = expand_gimple_basic_block (bb);
+ execute_free_datastructures ();
+ finish_out_of_ssa (&SA);
+
/* Expansion is used by optimization passes too, set maybe_hot_insn_p
conservatively to true until they are all profile aware. */
pointer_map_destroy (lab_rtx_for_bb);
*************** gimple_expand_cfg (void)
*** 2391,2399 ****
set_curr_insn_block (DECL_INITIAL (current_function_decl));
insn_locators_finalize ();
- /* We're done expanding trees to RTL. */
- currently_expanding_to_rtl = 0;
-
/* Convert tree EH labels to RTL EH labels and zap the tree EH table. */
convert_from_eh_region_ranges ();
set_eh_throw_stmt_table (cfun, NULL);
--- 2530,2535 ----
*************** gimple_expand_cfg (void)
*** 2401,2411 ****
rebuild_jump_labels (get_insns ());
find_exception_handler_labels ();
blocks = sbitmap_alloc (last_basic_block);
sbitmap_ones (blocks);
find_many_sub_basic_blocks (blocks);
- purge_all_dead_edges ();
sbitmap_free (blocks);
compact_blocks ();
--- 2537,2584 ----
rebuild_jump_labels (get_insns ());
find_exception_handler_labels ();
+ FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, EXIT_BLOCK_PTR, next_bb)
+ {
+ edge e;
+ edge_iterator ei;
+ for (ei = ei_start (bb->succs); (e = ei_safe_edge (ei)); )
+ {
+ if (e->insns.r)
+ commit_one_edge_insertion (e);
+ else
+ ei_next (&ei);
+ }
+ }
+
+ /* We're done expanding trees to RTL. */
+ currently_expanding_to_rtl = 0;
+
+ FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR->next_bb, EXIT_BLOCK_PTR, next_bb)
+ {
+ edge e;
+ edge_iterator ei;
+ for (ei = ei_start (bb->succs); (e = ei_safe_edge (ei)); )
+ {
+ /* Clear EDGE_EXECUTABLE. This flag is never used in the backend. */
+ e->flags &= ~EDGE_EXECUTABLE;
+
+ /* At the moment not all abnormal edges match the RTL
+ representation. It is safe to remove them here as
+ find_many_sub_basic_blocks will rediscover them.
+ In the future we should get this fixed properly. */
+ if ((e->flags & EDGE_ABNORMAL)
+ && !(e->flags & EDGE_SIBCALL))
+ remove_edge (e);
+ else
+ ei_next (&ei);
+ }
+ }
+
blocks = sbitmap_alloc (last_basic_block);
sbitmap_ones (blocks);
find_many_sub_basic_blocks (blocks);
sbitmap_free (blocks);
+ purge_all_dead_edges ();
compact_blocks ();
*************** struct rtl_opt_pass pass_expand =
*** 2471,2480 ****
0, /* static_pass_number */
TV_EXPAND, /* tv_id */
/* ??? If TER is enabled, we actually receive GENERIC. */
! PROP_gimple_leh | PROP_cfg, /* properties_required */
PROP_rtl, /* properties_provided */
! PROP_trees, /* properties_destroyed */
! 0, /* todo_flags_start */
! TODO_dump_func, /* todo_flags_finish */
}
};
--- 2644,2655 ----
0, /* static_pass_number */
TV_EXPAND, /* tv_id */
/* ??? If TER is enabled, we actually receive GENERIC. */
! PROP_ssa | PROP_gimple_leh | PROP_cfg, /* properties_required */
PROP_rtl, /* properties_provided */
! PROP_ssa | PROP_trees, /* properties_destroyed */
! TODO_verify_ssa | TODO_verify_flow
! | TODO_verify_stmts, /* todo_flags_start */
! TODO_dump_func
! | TODO_ggc_collect /* todo_flags_finish */
}
};
Index: tree-ssa-live.c
===================================================================
*** tree-ssa-live.c (revision 146576)
--- tree-ssa-live.c (working copy)
*************** init_var_map (int size)
*** 136,144 ****
map = (var_map) xmalloc (sizeof (struct _var_map));
map->var_partition = partition_new (size);
- map->partition_to_var
- = (tree *)xmalloc (size * sizeof (tree));
- memset (map->partition_to_var, 0, size * sizeof (tree));
map->partition_to_view = NULL;
map->view_to_partition = NULL;
--- 136,141 ----
*************** void
*** 157,163 ****
delete_var_map (var_map map)
{
var_map_base_fini (map);
- free (map->partition_to_var);
partition_delete (map->var_partition);
if (map->partition_to_view)
free (map->partition_to_view);
--- 154,159 ----
*************** int
*** 175,215 ****
var_union (var_map map, tree var1, tree var2)
{
int p1, p2, p3;
! tree root_var = NULL_TREE;
! tree other_var = NULL_TREE;
/* This is independent of partition_to_view. If partition_to_view is
on, then whichever one of these partitions is absorbed will never have a
dereference into the partition_to_view array any more. */
! if (TREE_CODE (var1) == SSA_NAME)
! p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
! else
! {
! p1 = var_to_partition (map, var1);
! if (map->view_to_partition)
! p1 = map->view_to_partition[p1];
! root_var = var1;
! }
!
! if (TREE_CODE (var2) == SSA_NAME)
! p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
! else
! {
! p2 = var_to_partition (map, var2);
! if (map->view_to_partition)
! p2 = map->view_to_partition[p2];
!
! /* If there is no root_var set, or it's not a user variable, set the
! root_var to this one. */
! if (!root_var || (DECL_P (root_var) && DECL_IGNORED_P (root_var)))
! {
! other_var = root_var;
! root_var = var2;
! }
! else
! other_var = var2;
! }
gcc_assert (p1 != NO_PARTITION);
gcc_assert (p2 != NO_PARTITION);
--- 171,186 ----
var_union (var_map map, tree var1, tree var2)
{
int p1, p2, p3;
!
! gcc_assert (TREE_CODE (var1) == SSA_NAME);
! gcc_assert (TREE_CODE (var2) == SSA_NAME);
/* This is independent of partition_to_view. If partition_to_view is
on, then whichever one of these partitions is absorbed will never have a
dereference into the partition_to_view array any more. */
! p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1));
! p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2));
gcc_assert (p1 != NO_PARTITION);
gcc_assert (p2 != NO_PARTITION);
*************** var_union (var_map map, tree var1, tree
*** 222,232 ****
if (map->partition_to_view)
p3 = map->partition_to_view[p3];
- if (root_var)
- change_partition_var (map, root_var, p3);
- if (other_var)
- change_partition_var (map, other_var, p3);
-
return p3;
}
--- 193,198 ----
*************** partition_view_init (var_map map)
*** 278,284 ****
for (x = 0; x < map->partition_size; x++)
{
tmp = partition_find (map->var_partition, x);
! if (map->partition_to_var[tmp] != NULL_TREE && !bitmap_bit_p (used, tmp))
bitmap_set_bit (used, tmp);
}
--- 244,252 ----
for (x = 0; x < map->partition_size; x++)
{
tmp = partition_find (map->var_partition, x);
! if (ssa_name (tmp) != NULL_TREE && is_gimple_reg (ssa_name (tmp))
! && (!has_zero_uses (ssa_name (tmp))
! || !SSA_NAME_IS_DEFAULT_DEF (ssa_name (tmp))))
bitmap_set_bit (used, tmp);
}
*************** partition_view_fini (var_map map, bitmap
*** 297,303 ****
{
bitmap_iterator bi;
unsigned count, i, x, limit;
- tree var;
gcc_assert (selected);
--- 265,270 ----
*************** partition_view_fini (var_map map, bitmap
*** 317,327 ****
{
map->partition_to_view[x] = i;
map->view_to_partition[i] = x;
- var = map->partition_to_var[x];
- /* If any one of the members of a partition is not an SSA_NAME, make
- sure it is the representative. */
- if (TREE_CODE (var) != SSA_NAME)
- change_partition_var (map, var, i);
i++;
}
gcc_assert (i == count);
--- 284,289 ----
*************** partition_view_bitmap (var_map map, bitm
*** 379,403 ****
}
- /* This function is used to change the representative variable in MAP for VAR's
- partition to a regular non-ssa variable. This allows partitions to be
- mapped back to real variables. */
-
- void
- change_partition_var (var_map map, tree var, int part)
- {
- var_ann_t ann;
-
- gcc_assert (TREE_CODE (var) != SSA_NAME);
-
- ann = var_ann (var);
- ann->out_of_ssa_tag = 1;
- VAR_ANN_PARTITION (ann) = part;
- if (map->view_to_partition)
- map->partition_to_var[map->view_to_partition[part]] = var;
- }
-
-
static inline void mark_all_vars_used (tree *, void *data);
/* Helper function for mark_all_vars_used, called via walk_tree. */
--- 341,346 ----
*************** dump_var_map (FILE *f, var_map map)
*** 1105,1111 ****
else
p = x;
! if (map->partition_to_var[p] == NULL_TREE)
continue;
t = 0;
--- 1048,1054 ----
else
p = x;
! if (ssa_name (p) == NULL_TREE)
continue;
t = 0;
Index: tree-ssa-live.h
===================================================================
*** tree-ssa-live.h (revision 146576)
--- tree-ssa-live.h (working copy)
*************** typedef struct _var_map
*** 60,68 ****
int *partition_to_view;
int *view_to_partition;
- /* Mapping of partition numbers to variables. */
- tree *partition_to_var;
-
/* Current number of partitions in var_map based on the current view. */
unsigned int num_partitions;
--- 60,65 ----
*************** typedef struct _var_map
*** 80,87 ****
} *var_map;
- /* Partition number of a non ssa-name variable. */
- #define VAR_ANN_PARTITION(ann) (ann->partition)
/* Index to the basevar table of a non ssa-name variable. */
#define VAR_ANN_BASE_INDEX(ann) (ann->base_index)
--- 77,82 ----
*************** extern var_map init_var_map (int);
*** 93,99 ****
extern void delete_var_map (var_map);
extern void dump_var_map (FILE *, var_map);
extern int var_union (var_map, tree, tree);
- extern void change_partition_var (var_map, tree, int);
extern void partition_view_normal (var_map, bool);
extern void partition_view_bitmap (var_map, bitmap, bool);
#ifdef ENABLE_CHECKING
--- 88,93 ----
*************** num_var_partitions (var_map map)
*** 116,125 ****
static inline tree
partition_to_var (var_map map, int i)
{
if (map->view_to_partition)
i = map->view_to_partition[i];
i = partition_find (map->var_partition, i);
! return map->partition_to_var[i];
}
--- 110,121 ----
static inline tree
partition_to_var (var_map map, int i)
{
+ tree name;
if (map->view_to_partition)
i = map->view_to_partition[i];
i = partition_find (map->var_partition, i);
! name = ssa_name (i);
! return name;
}
*************** version_to_var (var_map map, int version
*** 146,168 ****
static inline int
var_to_partition (var_map map, tree var)
{
- var_ann_t ann;
int part;
! if (TREE_CODE (var) == SSA_NAME)
! {
! part = partition_find (map->var_partition, SSA_NAME_VERSION (var));
! if (map->partition_to_view)
! part = map->partition_to_view[part];
! }
! else
! {
! ann = var_ann (var);
! if (ann && ann->out_of_ssa_tag)
! part = VAR_ANN_PARTITION (ann);
! else
! part = NO_PARTITION;
! }
return part;
}
--- 142,153 ----
static inline int
var_to_partition (var_map map, tree var)
{
int part;
! gcc_assert (TREE_CODE (var) == SSA_NAME);
! part = partition_find (map->var_partition, SSA_NAME_VERSION (var));
! if (map->partition_to_view)
! part = map->partition_to_view[part];
return part;
}
*************** num_basevars (var_map map)
*** 207,223 ****
partitions may be filtered out by a view later. */
static inline void
! register_ssa_partition (var_map map, tree ssa_var)
{
- int version;
-
#if defined ENABLE_CHECKING
register_ssa_partition_check (ssa_var);
#endif
-
- version = SSA_NAME_VERSION (ssa_var);
- if (map->partition_to_var[version] == NULL_TREE)
- map->partition_to_var[version] = ssa_var;
}
--- 192,202 ----
partitions may be filtered out by a view later. */
static inline void
! register_ssa_partition (var_map map ATTRIBUTE_UNUSED, tree ssa_var)
{
#if defined ENABLE_CHECKING
register_ssa_partition_check (ssa_var);
#endif
}
Index: tree-mudflap.c
===================================================================
*** tree-mudflap.c (revision 146576)
--- tree-mudflap.c (working copy)
*************** execute_mudflap_function_ops (void)
*** 447,452 ****
--- 447,465 ----
return 0;
}
+ /* Construct a new temporary variable with TYPE
+ as type and PREFIX as name prefix, add it to referenced vars
+ and mark it for renaming. */
+
+ static tree
+ create_referenced_var (tree type, const char *prefix)
+ {
+ tree var = create_tmp_var (type, prefix);
+ add_referenced_var (var);
+ mark_sym_for_renaming (var);
+ return var;
+ }
+
/* Create and initialize local shadow variables for the lookup cache
globals. Put their decls in the *_l globals for use by
mf_build_check_statement_for. */
*************** mf_decl_cache_locals (void)
*** 459,469 ****
/* Build the cache vars. */
mf_cache_shift_decl_l
! = mf_mark (create_tmp_var (TREE_TYPE (mf_cache_shift_decl),
"__mf_lookup_shift_l"));
mf_cache_mask_decl_l
! = mf_mark (create_tmp_var (TREE_TYPE (mf_cache_mask_decl),
"__mf_lookup_mask_l"));
/* Build initialization nodes for the cache vars. We just load the
--- 472,482 ----
/* Build the cache vars. */
mf_cache_shift_decl_l
! = mf_mark (create_referenced_var (TREE_TYPE (mf_cache_shift_decl),
"__mf_lookup_shift_l"));
mf_cache_mask_decl_l
! = mf_mark (create_referenced_var (TREE_TYPE (mf_cache_mask_decl),
"__mf_lookup_mask_l"));
/* Build initialization nodes for the cache vars. We just load the
*************** mf_build_check_statement_for (tree base,
*** 546,554 ****
}
/* Build our local variables. */
! mf_elem = create_tmp_var (mf_cache_structptr_type, "__mf_elem");
! mf_base = create_tmp_var (mf_uintptr_type, "__mf_base");
! mf_limit = create_tmp_var (mf_uintptr_type, "__mf_limit");
/* Build: __mf_base = (uintptr_t) <base address expression>. */
seq = gimple_seq_alloc ();
--- 559,567 ----
}
/* Build our local variables. */
! mf_elem = create_referenced_var (mf_cache_structptr_type, "__mf_elem");
! mf_base = create_referenced_var (mf_uintptr_type, "__mf_base");
! mf_limit = create_referenced_var (mf_uintptr_type, "__mf_limit");
/* Build: __mf_base = (uintptr_t) <base address expression>. */
seq = gimple_seq_alloc ();
*************** mf_build_check_statement_for (tree base,
*** 627,633 ****
t = build2 (TRUTH_OR_EXPR, boolean_type_node, t, u);
t = force_gimple_operand (t, &stmts, false, NULL_TREE);
gimple_seq_add_seq (&seq, stmts);
! cond = create_tmp_var (boolean_type_node, "__mf_unlikely_cond");
g = gimple_build_assign (cond, t);
gimple_set_location (g, location);
gimple_seq_add_stmt (&seq, g);
--- 640,646 ----
t = build2 (TRUTH_OR_EXPR, boolean_type_node, t, u);
t = force_gimple_operand (t, &stmts, false, NULL_TREE);
gimple_seq_add_seq (&seq, stmts);
! cond = create_referenced_var (boolean_type_node, "__mf_unlikely_cond");
g = gimple_build_assign (cond, t);
gimple_set_location (g, location);
gimple_seq_add_stmt (&seq, g);
*************** struct gimple_opt_pass pass_mudflap_2 =
*** 1366,1377 ****
NULL, /* next */
0, /* static_pass_number */
TV_NONE, /* tv_id */
! PROP_gimple_leh, /* properties_required */
0, /* properties_provided */
0, /* properties_destroyed */
0, /* todo_flags_start */
TODO_verify_flow | TODO_verify_stmts
! | TODO_dump_func /* todo_flags_finish */
}
};
--- 1379,1390 ----
NULL, /* next */
0, /* static_pass_number */
TV_NONE, /* tv_id */
! PROP_ssa | PROP_cfg | PROP_gimple_leh,/* properties_required */
0, /* properties_provided */
0, /* properties_destroyed */
0, /* todo_flags_start */
TODO_verify_flow | TODO_verify_stmts
! | TODO_dump_func | TODO_update_ssa /* todo_flags_finish */
}
};
Index: tree-ssa-ter.c
===================================================================
*** tree-ssa-ter.c (revision 146576)
--- tree-ssa-ter.c (working copy)
*************** free_temp_expr_table (temp_expr_table_p
*** 225,231 ****
unsigned x;
for (x = 0; x <= num_var_partitions (t->map); x++)
gcc_assert (!t->kill_list[x]);
! for (x = 0; x < num_ssa_names + 1; x++)
{
gcc_assert (t->expr_decl_uids[x] == NULL);
gcc_assert (t->partition_dependencies[x] == NULL);
--- 225,231 ----
unsigned x;
for (x = 0; x <= num_var_partitions (t->map); x++)
gcc_assert (!t->kill_list[x]);
! for (x = 0; x < num_ssa_names; x++)
{
gcc_assert (t->expr_decl_uids[x] == NULL);
gcc_assert (t->partition_dependencies[x] == NULL);
Index: tree-ssa.c
===================================================================
*** tree-ssa.c (revision 146576)
--- tree-ssa.c (working copy)
*************** delete_tree_ssa (void)
*** 844,850 ****
gimple_set_modified (stmt, true);
}
! set_phi_nodes (bb, NULL);
}
/* Remove annotations from every referenced local variable. */
--- 844,851 ----
gimple_set_modified (stmt, true);
}
! if (!(bb->flags & BB_RTL))
! set_phi_nodes (bb, NULL);
}
/* Remove annotations from every referenced local variable. */
Index: rtl.h
===================================================================
*** rtl.h (revision 146576)
--- rtl.h (working copy)
*************** extern rtx gen_int_mode (HOST_WIDE_INT,
*** 1494,1499 ****
--- 1494,1500 ----
extern rtx emit_copy_of_insn_after (rtx, rtx);
extern void set_reg_attrs_from_value (rtx, rtx);
extern void set_reg_attrs_for_parm (rtx, rtx);
+ extern void set_reg_attrs_for_decl_rtl (tree t, rtx x);
extern void adjust_reg_mode (rtx, enum machine_mode);
extern int mem_expr_equal_p (const_tree, const_tree);
Index: tree-optimize.c
===================================================================
*** tree-optimize.c (revision 146576)
--- tree-optimize.c (working copy)
*************** struct gimple_opt_pass pass_cleanup_cfg_
*** 201,207 ****
{
{
GIMPLE_PASS,
! "final_cleanup", /* name */
NULL, /* gate */
execute_cleanup_cfg_post_optimizing, /* execute */
NULL, /* sub */
--- 201,207 ----
{
{
GIMPLE_PASS,
! "optimized", /* name */
NULL, /* gate */
execute_cleanup_cfg_post_optimizing, /* execute */
NULL, /* sub */
*************** struct gimple_opt_pass pass_cleanup_cfg_
*** 213,225 ****
0, /* properties_destroyed */
0, /* todo_flags_start */
TODO_dump_func /* todo_flags_finish */
}
};
/* Pass: do the actions required to finish with tree-ssa optimization
passes. */
! static unsigned int
execute_free_datastructures (void)
{
free_dominance_info (CDI_DOMINATORS);
--- 213,226 ----
0, /* properties_destroyed */
0, /* todo_flags_start */
TODO_dump_func /* todo_flags_finish */
+ | TODO_remove_unused_locals
}
};
/* Pass: do the actions required to finish with tree-ssa optimization
passes. */
! unsigned int
execute_free_datastructures (void)
{
free_dominance_info (CDI_DOMINATORS);
*************** execute_free_datastructures (void)
*** 228,233 ****
--- 229,238 ----
/* Remove the ssa structures. */
if (cfun->gimple_df)
delete_tree_ssa ();
+
+ /* And get rid of annotations we no longer need. */
+ delete_tree_cfg_annotations ();
+
return 0;
}
*************** struct gimple_opt_pass pass_free_datastr
*** 254,262 ****
static unsigned int
execute_free_cfg_annotations (void)
{
- /* And get rid of annotations we no longer need. */
- delete_tree_cfg_annotations ();
-
return 0;
}
--- 259,264 ----
Index: tree-outof-ssa.c
===================================================================
*** tree-outof-ssa.c (revision 146576)
--- tree-outof-ssa.c (working copy)
*************** along with GCC; see the file COPYING3.
*** 30,38 ****
#include "tree-flow.h"
#include "timevar.h"
#include "tree-dump.h"
- #include "tree-ssa-live.h"
#include "tree-pass.h"
#include "toplev.h"
/* Used to hold all the components required to do SSA PHI elimination.
--- 30,39 ----
#include "tree-flow.h"
#include "timevar.h"
#include "tree-dump.h"
#include "tree-pass.h"
#include "toplev.h"
+ #include "expr.h"
+ #include "ssaexpand.h"
/* Used to hold all the components required to do SSA PHI elimination.
*************** typedef struct _elim_graph {
*** 61,67 ****
int size;
/* List of nodes in the elimination graph. */
! VEC(tree,heap) *nodes;
/* The predecessor and successor edge list. */
VEC(int,heap) *edge_list;
--- 62,68 ----
int size;
/* List of nodes in the elimination graph. */
! VEC(int,heap) *nodes;
/* The predecessor and successor edge list. */
VEC(int,heap) *edge_list;
*************** typedef struct _elim_graph {
*** 79,163 ****
edge e;
/* List of constant copies to emit. These are pushed on in pairs. */
VEC(tree,heap) *const_copies;
} *elim_graph;
! /* Create a temporary variable based on the type of variable T. Use T's name
! as the prefix. */
! static tree
! create_temp (tree t)
{
! tree tmp;
! const char *name = NULL;
! tree type;
!
! if (TREE_CODE (t) == SSA_NAME)
! t = SSA_NAME_VAR (t);
!
! gcc_assert (TREE_CODE (t) == VAR_DECL || TREE_CODE (t) == PARM_DECL);
! type = TREE_TYPE (t);
! tmp = DECL_NAME (t);
! if (tmp)
! name = IDENTIFIER_POINTER (tmp);
! if (name == NULL)
! name = "temp";
! tmp = create_tmp_var (type, name);
! if (DECL_DEBUG_EXPR_IS_FROM (t) && DECL_DEBUG_EXPR (t))
{
! SET_DECL_DEBUG_EXPR (tmp, DECL_DEBUG_EXPR (t));
! DECL_DEBUG_EXPR_IS_FROM (tmp) = 1;
}
! else if (!DECL_IGNORED_P (t))
{
! SET_DECL_DEBUG_EXPR (tmp, t);
! DECL_DEBUG_EXPR_IS_FROM (tmp) = 1;
}
- DECL_ARTIFICIAL (tmp) = DECL_ARTIFICIAL (t);
- DECL_IGNORED_P (tmp) = DECL_IGNORED_P (t);
- DECL_GIMPLE_REG_P (tmp) = DECL_GIMPLE_REG_P (t);
- add_referenced_var (tmp);
! /* We should never have copied variables in non-automatic storage
! or variables that have their address taken. So it is pointless
! to try to copy call-clobber state here. */
! gcc_assert (!may_be_aliased (t) && !is_global_var (t));
! return tmp;
! }
! /* This helper function fill insert a copy from a constant or variable SRC to
! variable DEST on edge E. */
static void
! insert_copy_on_edge (edge e, tree dest, tree src)
{
! gimple copy;
! copy = gimple_build_assign (dest, src);
! set_is_used (dest);
! if (TREE_CODE (src) == ADDR_EXPR)
! src = TREE_OPERAND (src, 0);
! if (TREE_CODE (src) == VAR_DECL || TREE_CODE (src) == PARM_DECL)
! set_is_used (src);
if (dump_file && (dump_flags & TDF_DETAILS))
{
fprintf (dump_file,
! "Inserting a copy on edge BB%d->BB%d :",
e->src->index,
e->dest->index);
! print_gimple_stmt (dump_file, copy, 0, dump_flags);
! fprintf (dump_file, "\n");
}
! gsi_insert_on_edge (e, copy);
}
--- 80,255 ----
edge e;
/* List of constant copies to emit. These are pushed on in pairs. */
+ VEC(int,heap) *const_dests;
VEC(tree,heap) *const_copies;
} *elim_graph;
! /* For an edge E find out a good source location to associate with
! instructions inserted on edge E. If E has an implicit goto set,
! use its location. Otherwise search instructions in predecessors
! of E for a location, and use that one. That makes sense because
! we insert on edges for PHI nodes, and the effects of PHIs conceptually
! happen at the end of the predecessor. */
! static void
! set_location_for_edge (edge e)
{
! if (e->goto_locus)
! {
! set_curr_insn_source_location (e->goto_locus);
! set_curr_insn_block (e->goto_block);
! }
! else
! {
! basic_block bb = e->src;
! gimple_stmt_iterator gsi;
! do
! {
! for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi))
! {
! gimple stmt = gsi_stmt (gsi);
! if (gimple_has_location (stmt) || gimple_block (stmt))
! {
! set_curr_insn_source_location (gimple_location (stmt));
! set_curr_insn_block (gimple_block (stmt));
! return;
! }
! }
! /* Nothing found in this basic block. Make a half-assed attempt
! to continue with another block. */
! if (single_pred_p (bb))
! bb = single_pred (bb);
! else
! bb = e->src;
! }
! while (bb != e->src);
! }
! }
! /* Insert a copy instruction from partition SRC to DEST onto edge E. */
! static void
! insert_partition_copy_on_edge (edge e, int dest, int src)
! {
! rtx seq;
! if (dump_file && (dump_flags & TDF_DETAILS))
{
! fprintf (dump_file,
! "Inserting a partition copy on edge BB%d->BB%d : "
! "PART.%d = PART.%d",
! e->src->index,
! e->dest->index, dest, src);
! fprintf (dump_file, "\n");
}
!
! gcc_assert (SA.partition_to_pseudo[dest]);
! gcc_assert (SA.partition_to_pseudo[src]);
!
! set_location_for_edge (e);
!
! /* Partition copies happen only between the same base variables, so the
! modes match and we can use emit_move_insn. */
! start_sequence ();
! emit_move_insn (SA.partition_to_pseudo[dest], SA.partition_to_pseudo[src]);
! seq = get_insns ();
! end_sequence ();
!
! insert_insn_on_edge (seq, e);
! }
!
! /* Insert a copy instruction from expression SRC to partition DEST
! onto edge E. */
!
! static void
! insert_value_copy_on_edge (edge e, int dest, tree src)
! {
! rtx seq, x;
! enum machine_mode mode;
! if (dump_file && (dump_flags & TDF_DETAILS))
{
! fprintf (dump_file,
! "Inserting a value copy on edge BB%d->BB%d : PART.%d = ",
! e->src->index,
! e->dest->index, dest);
! print_generic_expr (dump_file, src, TDF_SLIM);
! fprintf (dump_file, "\n");
}
! gcc_assert (SA.partition_to_pseudo[dest]);
! set_location_for_edge (e);
+ start_sequence ();
+ mode = GET_MODE (SA.partition_to_pseudo[dest]);
+ x = expand_expr (src, SA.partition_to_pseudo[dest], mode, EXPAND_NORMAL);
+ if (GET_MODE (x) != mode)
+ x = convert_to_mode (mode, x, TYPE_UNSIGNED (TREE_TYPE (src)));
+ if (x != SA.partition_to_pseudo[dest])
+ emit_move_insn (SA.partition_to_pseudo[dest], x);
+ seq = get_insns ();
+ end_sequence ();
! insert_insn_on_edge (seq, e);
! }
!
! /* Insert a copy instruction from RTL expression SRC to partition DEST
! onto edge E. */
static void
! insert_rtx_to_part_on_edge (edge e, int dest, rtx src)
{
! rtx seq;
! if (dump_file && (dump_flags & TDF_DETAILS))
! {
! fprintf (dump_file,
! "Inserting a temp copy on edge BB%d->BB%d : PART.%d = ",
! e->src->index,
! e->dest->index, dest);
! print_simple_rtl (dump_file, src);
! fprintf (dump_file, "\n");
! }
!
! gcc_assert (SA.partition_to_pseudo[dest]);
! set_location_for_edge (e);
! start_sequence ();
! gcc_assert (GET_MODE (src) == GET_MODE (SA.partition_to_pseudo[dest]));
! emit_move_insn (SA.partition_to_pseudo[dest], src);
! seq = get_insns ();
! end_sequence ();
! insert_insn_on_edge (seq, e);
! }
!
! /* Insert a copy instruction from partition SRC to RTL lvalue DEST
! onto edge E. */
+ static void
+ insert_part_to_rtx_on_edge (edge e, rtx dest, int src)
+ {
+ rtx seq;
if (dump_file && (dump_flags & TDF_DETAILS))
{
fprintf (dump_file,
! "Inserting a temp copy on edge BB%d->BB%d : ",
e->src->index,
e->dest->index);
! print_simple_rtl (dump_file, dest);
! fprintf (dump_file, "= PART.%d\n", src);
}
! gcc_assert (SA.partition_to_pseudo[src]);
! set_location_for_edge (e);
!
! start_sequence ();
! gcc_assert (GET_MODE (dest) == GET_MODE (SA.partition_to_pseudo[src]));
! emit_move_insn (dest, SA.partition_to_pseudo[src]);
! seq = get_insns ();
! end_sequence ();
!
! insert_insn_on_edge (seq, e);
}
*************** new_elim_graph (int size)
*** 169,175 ****
{
elim_graph g = (elim_graph) xmalloc (sizeof (struct _elim_graph));
! g->nodes = VEC_alloc (tree, heap, 30);
g->const_copies = VEC_alloc (tree, heap, 20);
g->edge_list = VEC_alloc (int, heap, 20);
g->stack = VEC_alloc (int, heap, 30);
--- 261,268 ----
{
elim_graph g = (elim_graph) xmalloc (sizeof (struct _elim_graph));
! g->nodes = VEC_alloc (int, heap, 30);
! g->const_dests = VEC_alloc (int, heap, 20);
g->const_copies = VEC_alloc (tree, heap, 20);
g->edge_list = VEC_alloc (int, heap, 20);
g->stack = VEC_alloc (int, heap, 30);
*************** new_elim_graph (int size)
*** 185,191 ****
static inline void
clear_elim_graph (elim_graph g)
{
! VEC_truncate (tree, g->nodes, 0);
VEC_truncate (int, g->edge_list, 0);
}
--- 278,284 ----
static inline void
clear_elim_graph (elim_graph g)
{
! VEC_truncate (int, g->nodes, 0);
VEC_truncate (int, g->edge_list, 0);
}
*************** delete_elim_graph (elim_graph g)
*** 199,205 ****
VEC_free (int, heap, g->stack);
VEC_free (int, heap, g->edge_list);
VEC_free (tree, heap, g->const_copies);
! VEC_free (tree, heap, g->nodes);
free (g);
}
--- 292,299 ----
VEC_free (int, heap, g->stack);
VEC_free (int, heap, g->edge_list);
VEC_free (tree, heap, g->const_copies);
! VEC_free (int, heap, g->const_dests);
! VEC_free (int, heap, g->nodes);
free (g);
}
*************** delete_elim_graph (elim_graph g)
*** 209,230 ****
static inline int
elim_graph_size (elim_graph g)
{
! return VEC_length (tree, g->nodes);
}
/* Add NODE to graph G, if it doesn't exist already. */
static inline void
! elim_graph_add_node (elim_graph g, tree node)
{
int x;
! tree t;
! for (x = 0; VEC_iterate (tree, g->nodes, x, t); x++)
if (t == node)
return;
! VEC_safe_push (tree, heap, g->nodes, node);
}
--- 303,324 ----
static inline int
elim_graph_size (elim_graph g)
{
! return VEC_length (int, g->nodes);
}
/* Add NODE to graph G, if it doesn't exist already. */
static inline void
! elim_graph_add_node (elim_graph g, int node)
{
int x;
! int t;
! for (x = 0; VEC_iterate (int, g->nodes, x, t); x++)
if (t == node)
return;
! VEC_safe_push (int, heap, g->nodes, node);
}
*************** do { \
*** 299,305 ****
/* Add T to elimination graph G. */
static inline void
! eliminate_name (elim_graph g, tree T)
{
elim_graph_add_node (g, T);
}
--- 393,399 ----
/* Add T to elimination graph G. */
static inline void
! eliminate_name (elim_graph g, int T)
{
elim_graph_add_node (g, T);
}
*************** eliminate_name (elim_graph g, tree T)
*** 309,330 ****
G->e. */
static void
! eliminate_build (elim_graph g, basic_block B)
{
! tree T0, Ti;
int p0, pi;
gimple_stmt_iterator gsi;
clear_elim_graph (g);
! for (gsi = gsi_start_phis (B); !gsi_end_p (gsi); gsi_next (&gsi))
{
gimple phi = gsi_stmt (gsi);
! T0 = var_to_partition_to_var (g->map, gimple_phi_result (phi));
!
/* Ignore results which are not in partitions. */
! if (T0 == NULL_TREE)
continue;
Ti = PHI_ARG_DEF (phi, g->e->dest_idx);
--- 403,423 ----
G->e. */
static void
! eliminate_build (elim_graph g)
{
! tree Ti;
int p0, pi;
gimple_stmt_iterator gsi;
clear_elim_graph (g);
! for (gsi = gsi_start_phis (g->e->dest); !gsi_end_p (gsi); gsi_next (&gsi))
{
gimple phi = gsi_stmt (gsi);
! p0 = var_to_partition (g->map, gimple_phi_result (phi));
/* Ignore results which are not in partitions. */
! if (p0 == NO_PARTITION)
continue;
Ti = PHI_ARG_DEF (phi, g->e->dest_idx);
*************** eliminate_build (elim_graph g, basic_blo
*** 338,355 ****
{
/* Save constant copies until all other copies have been emitted
on this edge. */
! VEC_safe_push (tree, heap, g->const_copies, T0);
VEC_safe_push (tree, heap, g->const_copies, Ti);
}
else
{
! Ti = var_to_partition_to_var (g->map, Ti);
! if (T0 != Ti)
{
! eliminate_name (g, T0);
! eliminate_name (g, Ti);
! p0 = var_to_partition (g->map, T0);
! pi = var_to_partition (g->map, Ti);
elim_graph_add_edge (g, p0, pi);
}
}
--- 431,446 ----
{
/* Save constant copies until all other copies have been emitted
on this edge. */
! VEC_safe_push (int, heap, g->const_dests, p0);
VEC_safe_push (tree, heap, g->const_copies, Ti);
}
else
{
! pi = var_to_partition (g->map, Ti);
! if (p0 != pi)
{
! eliminate_name (g, p0);
! eliminate_name (g, pi);
elim_graph_add_edge (g, p0, pi);
}
}
*************** elim_backward (elim_graph g, int T)
*** 399,430 ****
if (!TEST_BIT (g->visited, P))
{
elim_backward (g, P);
! insert_copy_on_edge (g->e,
! partition_to_var (g->map, P),
! partition_to_var (g->map, T));
}
});
}
/* Insert required copies for T in graph G. Check for a strongly connected
region, and create a temporary to break the cycle if one is found. */
static void
elim_create (elim_graph g, int T)
{
- tree U;
int P, S;
if (elim_unvisited_predecessor (g, T))
{
! U = create_temp (partition_to_var (g->map, T));
! insert_copy_on_edge (g->e, U, partition_to_var (g->map, T));
FOR_EACH_ELIM_GRAPH_PRED (g, T, P,
{
if (!TEST_BIT (g->visited, P))
{
elim_backward (g, P);
! insert_copy_on_edge (g->e, partition_to_var (g->map, P), U);
}
});
}
--- 490,535 ----
if (!TEST_BIT (g->visited, P))
{
elim_backward (g, P);
! insert_partition_copy_on_edge (g->e, P, T);
}
});
}
+ /* Allocate a new pseudo register usable for storing values sitting
+ in NAME (a decl or SSA name), i.e. with matching mode and attributes. */
+
+ static rtx
+ get_temp_reg (tree name)
+ {
+ tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
+ tree type = TREE_TYPE (var);
+ int unsignedp = TYPE_UNSIGNED (type);
+ enum machine_mode reg_mode
+ = promote_mode (type, DECL_MODE (var), &unsignedp, 0);
+ rtx x = gen_reg_rtx (reg_mode);
+ if (POINTER_TYPE_P (type))
+ mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var))));
+ return x;
+ }
+
/* Insert required copies for T in graph G. Check for a strongly connected
region, and create a temporary to break the cycle if one is found. */
static void
elim_create (elim_graph g, int T)
{
int P, S;
if (elim_unvisited_predecessor (g, T))
{
! rtx U = get_temp_reg (partition_to_var (g->map, T));
! insert_part_to_rtx_on_edge (g->e, U, T);
FOR_EACH_ELIM_GRAPH_PRED (g, T, P,
{
if (!TEST_BIT (g->visited, P))
{
elim_backward (g, P);
! insert_rtx_to_part_on_edge (g->e, P, U);
}
});
}
*************** elim_create (elim_graph g, int T)
*** 434,445 ****
if (S != -1)
{
SET_BIT (g->visited, T);
! insert_copy_on_edge (g->e,
! partition_to_var (g->map, T),
! partition_to_var (g->map, S));
}
}
-
}
--- 539,547 ----
if (S != -1)
{
SET_BIT (g->visited, T);
! insert_partition_copy_on_edge (g->e, T, S);
}
}
}
*************** static void
*** 449,455 ****
eliminate_phi (edge e, elim_graph g)
{
int x;
- basic_block B = e->dest;
gcc_assert (VEC_length (tree, g->const_copies) == 0);
--- 551,556 ----
*************** eliminate_phi (edge e, elim_graph g)
*** 459,478 ****
g->e = e;
! eliminate_build (g, B);
if (elim_graph_size (g) != 0)
{
! tree var;
sbitmap_zero (g->visited);
VEC_truncate (int, g->stack, 0);
! for (x = 0; VEC_iterate (tree, g->nodes, x, var); x++)
{
! int p = var_to_partition (g->map, var);
! if (!TEST_BIT (g->visited, p))
! elim_forward (g, p);
}
sbitmap_zero (g->visited);
--- 560,578 ----
g->e = e;
! eliminate_build (g);
if (elim_graph_size (g) != 0)
{
! int part;
sbitmap_zero (g->visited);
VEC_truncate (int, g->stack, 0);
! for (x = 0; VEC_iterate (int, g->nodes, x, part); x++)
{
! if (!TEST_BIT (g->visited, part))
! elim_forward (g, part);
}
sbitmap_zero (g->visited);
*************** eliminate_phi (edge e, elim_graph g)
*** 487,607 ****
/* If there are any pending constant copies, issue them now. */
while (VEC_length (tree, g->const_copies) > 0)
{
! tree src, dest;
src = VEC_pop (tree, g->const_copies);
! dest = VEC_pop (tree, g->const_copies);
! insert_copy_on_edge (e, dest, src);
}
}
- /* Take the ssa-name var_map MAP, and assign real variables to each
- partition. */
-
- static void
- assign_vars (var_map map)
- {
- int x, num;
- tree var, root;
- var_ann_t ann;
-
- num = num_var_partitions (map);
- for (x = 0; x < num; x++)
- {
- var = partition_to_var (map, x);
- if (TREE_CODE (var) != SSA_NAME)
- {
- ann = var_ann (var);
- /* It must already be coalesced. */
- gcc_assert (ann->out_of_ssa_tag == 1);
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- fprintf (dump_file, "partition %d already has variable ", x);
- print_generic_expr (dump_file, var, TDF_SLIM);
- fprintf (dump_file, " assigned to it.\n");
- }
- }
- else
- {
- root = SSA_NAME_VAR (var);
- ann = var_ann (root);
- /* If ROOT is already associated, create a new one. */
- if (ann->out_of_ssa_tag)
- {
- root = create_temp (root);
- ann = var_ann (root);
- }
- /* ROOT has not been coalesced yet, so use it. */
- if (dump_file && (dump_flags & TDF_DETAILS))
- {
- fprintf (dump_file, "Partition %d is assigned to var ", x);
- print_generic_stmt (dump_file, root, TDF_SLIM);
- }
- change_partition_var (map, root, x);
- }
- }
- }
-
-
- /* Replace use operand P with whatever variable it has been rewritten to based
- on the partitions in MAP. EXPR is an optional expression vector over SSA
- versions which is used to replace P with an expression instead of a variable.
- If the stmt is changed, return true. */
-
- static inline bool
- replace_use_variable (var_map map, use_operand_p p, gimple *expr)
- {
- tree new_var;
- tree var = USE_FROM_PTR (p);
-
- /* Check if we are replacing this variable with an expression. */
- if (expr)
- {
- int version = SSA_NAME_VERSION (var);
- if (expr[version])
- {
- SET_USE (p, gimple_assign_rhs_to_tree (expr[version]));
- return true;
- }
- }
-
- new_var = var_to_partition_to_var (map, var);
- if (new_var)
- {
- SET_USE (p, new_var);
- set_is_used (new_var);
- return true;
- }
- return false;
- }
-
-
- /* Replace def operand DEF_P with whatever variable it has been rewritten to
- based on the partitions in MAP. EXPR is an optional expression vector over
- SSA versions which is used to replace DEF_P with an expression instead of a
- variable. If the stmt is changed, return true. */
-
- static inline bool
- replace_def_variable (var_map map, def_operand_p def_p, tree *expr)
- {
- tree new_var;
- tree var = DEF_FROM_PTR (def_p);
-
- /* Do nothing if we are replacing this variable with an expression. */
- if (expr && expr[SSA_NAME_VERSION (var)])
- return true;
-
- new_var = var_to_partition_to_var (map, var);
- if (new_var)
- {
- SET_DEF (def_p, new_var);
- set_is_used (new_var);
- return true;
- }
- return false;
- }
-
-
/* Remove each argument from PHI. If an arg was the last use of an SSA_NAME,
check to see if this allows another PHI node to be removed. */
--- 587,601 ----
/* If there are any pending constant copies, issue them now. */
while (VEC_length (tree, g->const_copies) > 0)
{
! int dest;
! tree src;
src = VEC_pop (tree, g->const_copies);
! dest = VEC_pop (int, g->const_dests);
! insert_value_copy_on_edge (e, dest, src);
}
}
/* Remove each argument from PHI. If an arg was the last use of an SSA_NAME,
check to see if this allows another PHI node to be removed. */
*************** eliminate_useless_phis (void)
*** 704,724 ****
variable. */
static void
! rewrite_trees (var_map map, gimple *values)
{
- elim_graph g;
- basic_block bb;
- gimple_stmt_iterator gsi;
- edge e;
- gimple_seq phi;
- bool changed;
-
#ifdef ENABLE_CHECKING
/* Search for PHIs where the destination has no partition, but one
or more arguments has a partition. This should not happen and can
create incorrect code. */
FOR_EACH_BB (bb)
{
for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
{
gimple phi = gsi_stmt (gsi);
--- 698,713 ----
variable. */
static void
! rewrite_trees (var_map map)
{
#ifdef ENABLE_CHECKING
+ basic_block bb;
/* Search for PHIs where the destination has no partition, but one
or more arguments has a partition. This should not happen and can
create incorrect code. */
FOR_EACH_BB (bb)
{
+ gimple_stmt_iterator gsi;
for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
{
gimple phi = gsi_stmt (gsi);
*************** rewrite_trees (var_map map, gimple *valu
*** 744,1350 ****
}
}
#endif
-
- /* Replace PHI nodes with any required copies. */
- g = new_elim_graph (map->num_partitions);
- g->map = map;
- FOR_EACH_BB (bb)
- {
- for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
- {
- gimple stmt = gsi_stmt (gsi);
- use_operand_p use_p, copy_use_p;
- def_operand_p def_p;
- bool remove = false, is_copy = false;
- int num_uses = 0;
- ssa_op_iter iter;
-
- changed = false;
-
- if (gimple_assign_copy_p (stmt))
- is_copy = true;
-
- copy_use_p = NULL_USE_OPERAND_P;
- FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE)
- {
- if (replace_use_variable (map, use_p, values))
- changed = true;
- copy_use_p = use_p;
- num_uses++;
- }
-
- if (num_uses != 1)
- is_copy = false;
-
- def_p = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF);
-
- if (def_p != NULL)
- {
- /* Mark this stmt for removal if it is the list of replaceable
- expressions. */
- if (values && values[SSA_NAME_VERSION (DEF_FROM_PTR (def_p))])
- remove = true;
- else
- {
- if (replace_def_variable (map, def_p, NULL))
- changed = true;
- /* If both SSA_NAMEs coalesce to the same variable,
- mark the now redundant copy for removal. */
- if (is_copy)
- {
- gcc_assert (copy_use_p != NULL_USE_OPERAND_P);
- if (DEF_FROM_PTR (def_p) == USE_FROM_PTR (copy_use_p))
- remove = true;
- }
- }
- }
- else
- FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, iter, SSA_OP_DEF)
- if (replace_def_variable (map, def_p, NULL))
- changed = true;
-
- /* Remove any stmts marked for removal. */
- if (remove)
- gsi_remove (&gsi, true);
- else
- {
- if (changed)
- if (maybe_clean_or_replace_eh_stmt (stmt, stmt))
- gimple_purge_dead_eh_edges (bb);
- gsi_next (&gsi);
- }
- }
-
- phi = phi_nodes (bb);
- if (phi)
- {
- edge_iterator ei;
- FOR_EACH_EDGE (e, ei, bb->preds)
- eliminate_phi (e, g);
- }
- }
-
- delete_elim_graph (g);
- }
-
- /* These are the local work structures used to determine the best place to
- insert the copies that were placed on edges by the SSA->normal pass.. */
- static VEC(edge,heap) *edge_leader;
- static VEC(gimple_seq,heap) *stmt_list;
- static bitmap leader_has_match = NULL;
- static edge leader_match = NULL;
-
-
- /* Pass this function to make_forwarder_block so that all the edges with
- matching PENDING_STMT lists to 'curr_stmt_list' get redirected. E is the
- edge to test for a match. */
-
- static inline bool
- same_stmt_list_p (edge e)
- {
- return (e->aux == (PTR) leader_match) ? true : false;
- }
-
-
- /* Return TRUE if S1 and S2 are equivalent copies. */
-
- static inline bool
- identical_copies_p (const_gimple s1, const_gimple s2)
- {
- #ifdef ENABLE_CHECKING
- gcc_assert (is_gimple_assign (s1));
- gcc_assert (is_gimple_assign (s2));
- gcc_assert (DECL_P (gimple_assign_lhs (s1)));
- gcc_assert (DECL_P (gimple_assign_lhs (s2)));
- #endif
-
- if (gimple_assign_lhs (s1) != gimple_assign_lhs (s2))
- return false;
-
- if (gimple_assign_rhs1 (s1) != gimple_assign_rhs1 (s2))
- return false;
-
- return true;
- }
-
-
- /* Compare the PENDING_STMT list for edges E1 and E2. Return true if the lists
- contain the same sequence of copies. */
-
- static inline bool
- identical_stmt_lists_p (const_edge e1, const_edge e2)
- {
- gimple_seq t1 = PENDING_STMT (e1);
- gimple_seq t2 = PENDING_STMT (e2);
- gimple_stmt_iterator gsi1, gsi2;
-
- for (gsi1 = gsi_start (t1), gsi2 = gsi_start (t2);
- !gsi_end_p (gsi1) && !gsi_end_p (gsi2);
- gsi_next (&gsi1), gsi_next (&gsi2))
- {
- if (!identical_copies_p (gsi_stmt (gsi1), gsi_stmt (gsi2)))
- break;
- }
-
- if (!gsi_end_p (gsi1) || !gsi_end_p (gsi2))
- return false;
-
- return true;
- }
-
-
- /* Allocate data structures used in analyze_edges_for_bb. */
-
- static void
- init_analyze_edges_for_bb (void)
- {
- edge_leader = VEC_alloc (edge, heap, 25);
- stmt_list = VEC_alloc (gimple_seq, heap, 25);
- leader_has_match = BITMAP_ALLOC (NULL);
- }
-
-
- /* Free data structures used in analyze_edges_for_bb. */
-
- static void
- fini_analyze_edges_for_bb (void)
- {
- VEC_free (edge, heap, edge_leader);
- VEC_free (gimple_seq, heap, stmt_list);
- BITMAP_FREE (leader_has_match);
- }
-
- /* A helper function to be called via walk_tree. Return DATA if it is
- contained in subtree TP. */
-
- static tree
- contains_tree_r (tree * tp, int *walk_subtrees, void *data)
- {
- if (*tp == data)
- {
- *walk_subtrees = 0;
- return (tree) data;
- }
- else
- return NULL_TREE;
- }
-
- /* A threshold for the number of insns contained in the latch block.
- It is used to prevent blowing the loop with too many copies from
- the latch. */
- #define MAX_STMTS_IN_LATCH 2
-
- /* Return TRUE if the stmts on SINGLE-EDGE can be moved to the
- body of the loop. This should be permitted only if SINGLE-EDGE is a
- single-basic-block latch edge and thus cleaning the latch will help
- to create a single-basic-block loop. Otherwise return FALSE. */
-
- static bool
- process_single_block_loop_latch (edge single_edge)
- {
- gimple_seq stmts;
- basic_block b_exit, b_pheader, b_loop = single_edge->src;
- edge_iterator ei;
- edge e;
- gimple_stmt_iterator gsi, gsi_exit;
- gimple_stmt_iterator tsi;
- tree expr;
- gimple stmt;
- unsigned int count = 0;
-
- if (single_edge == NULL || (single_edge->dest != single_edge->src)
- || (EDGE_COUNT (b_loop->succs) != 2)
- || (EDGE_COUNT (b_loop->preds) != 2))
- return false;
-
- /* Get the stmts on the latch edge. */
- stmts = PENDING_STMT (single_edge);
-
- /* Find the successor edge which is not the latch edge. */
- FOR_EACH_EDGE (e, ei, b_loop->succs)
- if (e->dest != b_loop)
- break;
-
- b_exit = e->dest;
-
- /* Check that the exit block has only the loop as a predecessor,
- and that there are no pending stmts on that edge as well. */
- if (EDGE_COUNT (b_exit->preds) != 1 || PENDING_STMT (e))
- return false;
-
- /* Find the predecessor edge which is not the latch edge. */
- FOR_EACH_EDGE (e, ei, b_loop->preds)
- if (e->src != b_loop)
- break;
-
- b_pheader = e->src;
-
- if (b_exit == b_pheader || b_exit == b_loop || b_pheader == b_loop)
- return false;
-
- gsi_exit = gsi_after_labels (b_exit);
-
- /* Get the last stmt in the loop body. */
- gsi = gsi_last_bb (single_edge->src);
- stmt = gsi_stmt (gsi);
-
- if (gimple_code (stmt) != GIMPLE_COND)
- return false;
-
-
- expr = build2 (gimple_cond_code (stmt), boolean_type_node,
- gimple_cond_lhs (stmt), gimple_cond_rhs (stmt));
- /* Iterate over the insns on the latch and count them. */
- for (tsi = gsi_start (stmts); !gsi_end_p (tsi); gsi_next (&tsi))
- {
- gimple stmt1 = gsi_stmt (tsi);
- tree var;
-
- count++;
- /* Check that the condition does not contain any new definition
- created in the latch as the stmts from the latch intended
- to precede it. */
- if (gimple_code (stmt1) != GIMPLE_ASSIGN)
- return false;
- var = gimple_assign_lhs (stmt1);
- if (TREE_THIS_VOLATILE (var)
- || TYPE_VOLATILE (TREE_TYPE (var))
- || walk_tree (&expr, contains_tree_r, var, NULL))
- return false;
- }
- /* Check that the latch does not contain more than MAX_STMTS_IN_LATCH
- insns. The purpose of this restriction is to prevent blowing the
- loop with too many copies from the latch. */
- if (count > MAX_STMTS_IN_LATCH)
- return false;
-
- /* Apply the transformation - clean up the latch block:
-
- var = something;
- L1:
- x1 = expr;
- if (cond) goto L2 else goto L3;
- L2:
- var = x1;
- goto L1
- L3:
- ...
-
- ==>
-
- var = something;
- L1:
- x1 = expr;
- tmp_var = var;
- var = x1;
- if (cond) goto L1 else goto L2;
- L2:
- var = tmp_var;
- ...
- */
- for (tsi = gsi_start (stmts); !gsi_end_p (tsi); gsi_next (&tsi))
- {
- gimple stmt1 = gsi_stmt (tsi);
- tree var, tmp_var;
- gimple copy;
-
- /* Create a new variable to load back the value of var in case
- we exit the loop. */
- var = gimple_assign_lhs (stmt1);
- tmp_var = create_temp (var);
- copy = gimple_build_assign (tmp_var, var);
- set_is_used (tmp_var);
- gsi_insert_before (&gsi, copy, GSI_SAME_STMT);
- copy = gimple_build_assign (var, tmp_var);
- gsi_insert_before (&gsi_exit, copy, GSI_SAME_STMT);
- }
-
- PENDING_STMT (single_edge) = 0;
- /* Insert the new stmts to the loop body. */
- gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
-
- if (dump_file)
- fprintf (dump_file,
- "\nCleaned-up latch block of loop with single BB: %d\n\n",
- single_edge->dest->index);
-
- return true;
- }
-
- /* Look at all the incoming edges to block BB, and decide where the best place
- to insert the stmts on each edge are, and perform those insertions. */
-
- static void
- analyze_edges_for_bb (basic_block bb)
- {
- edge e;
- edge_iterator ei;
- int count;
- unsigned int x;
- bool have_opportunity;
- gimple_stmt_iterator gsi;
- gimple stmt;
- edge single_edge = NULL;
- bool is_label;
- edge leader;
-
- count = 0;
-
- /* Blocks which contain at least one abnormal edge cannot use
- make_forwarder_block. Look for these blocks, and commit any PENDING_STMTs
- found on edges in these block. */
- have_opportunity = true;
- FOR_EACH_EDGE (e, ei, bb->preds)
- if (e->flags & EDGE_ABNORMAL)
- {
- have_opportunity = false;
- break;
- }
-
- if (!have_opportunity)
- {
- FOR_EACH_EDGE (e, ei, bb->preds)
- if (PENDING_STMT (e))
- gsi_commit_one_edge_insert (e, NULL);
- return;
- }
-
- /* Find out how many edges there are with interesting pending stmts on them.
- Commit the stmts on edges we are not interested in. */
- FOR_EACH_EDGE (e, ei, bb->preds)
- {
- if (PENDING_STMT (e))
- {
- gcc_assert (!(e->flags & EDGE_ABNORMAL));
- if (e->flags & EDGE_FALLTHRU)
- {
- gsi = gsi_start_bb (e->src);
- if (!gsi_end_p (gsi))
- {
- stmt = gsi_stmt (gsi);
- gsi_next (&gsi);
- gcc_assert (stmt != NULL);
- is_label = (gimple_code (stmt) == GIMPLE_LABEL);
- /* Punt if it has non-label stmts, or isn't local. */
- if (!is_label
- || DECL_NONLOCAL (gimple_label_label (stmt))
- || !gsi_end_p (gsi))
- {
- gsi_commit_one_edge_insert (e, NULL);
- continue;
- }
- }
- }
- single_edge = e;
- count++;
- }
- }
-
- /* If there aren't at least 2 edges, no sharing will happen. */
- if (count < 2)
- {
- if (single_edge)
- {
- /* Add stmts to the edge unless processed specially as a
- single-block loop latch edge. */
- if (!process_single_block_loop_latch (single_edge))
- gsi_commit_one_edge_insert (single_edge, NULL);
- }
- return;
- }
-
- /* Ensure that we have empty worklists. */
- #ifdef ENABLE_CHECKING
- gcc_assert (VEC_length (edge, edge_leader) == 0);
- gcc_assert (VEC_length (gimple_seq, stmt_list) == 0);
- gcc_assert (bitmap_empty_p (leader_has_match));
- #endif
-
- /* Find the "leader" block for each set of unique stmt lists. Preference is
- given to FALLTHRU blocks since they would need a GOTO to arrive at another
- block. The leader edge destination is the block which all the other edges
- with the same stmt list will be redirected to. */
- have_opportunity = false;
- FOR_EACH_EDGE (e, ei, bb->preds)
- {
- if (PENDING_STMT (e))
- {
- bool found = false;
-
- /* Look for the same stmt list in edge leaders list. */
- for (x = 0; VEC_iterate (edge, edge_leader, x, leader); x++)
- {
- if (identical_stmt_lists_p (leader, e))
- {
- /* Give this edge the same stmt list pointer. */
- PENDING_STMT (e) = NULL;
- e->aux = leader;
- bitmap_set_bit (leader_has_match, x);
- have_opportunity = found = true;
- break;
- }
- }
-
- /* If no similar stmt list, add this edge to the leader list. */
- if (!found)
- {
- VEC_safe_push (edge, heap, edge_leader, e);
- VEC_safe_push (gimple_seq, heap, stmt_list, PENDING_STMT (e));
- }
- }
- }
-
- /* If there are no similar lists, just issue the stmts. */
- if (!have_opportunity)
- {
- for (x = 0; VEC_iterate (edge, edge_leader, x, leader); x++)
- gsi_commit_one_edge_insert (leader, NULL);
- VEC_truncate (edge, edge_leader, 0);
- VEC_truncate (gimple_seq, stmt_list, 0);
- bitmap_clear (leader_has_match);
- return;
- }
-
- if (dump_file)
- fprintf (dump_file, "\nOpportunities in BB %d for stmt/block reduction:\n",
- bb->index);
-
- /* For each common list, create a forwarding block and issue the stmt's
- in that block. */
- for (x = 0; VEC_iterate (edge, edge_leader, x, leader); x++)
- if (bitmap_bit_p (leader_has_match, x))
- {
- edge new_edge;
- gimple_stmt_iterator gsi;
- gimple_seq curr_stmt_list;
-
- leader_match = leader;
-
- /* The tree_* cfg manipulation routines use the PENDING_EDGE field
- for various PHI manipulations, so it gets cleared when calls are
- made to make_forwarder_block(). So make sure the edge is clear,
- and use the saved stmt list. */
- PENDING_STMT (leader) = NULL;
- leader->aux = leader;
- curr_stmt_list = VEC_index (gimple_seq, stmt_list, x);
-
- new_edge = make_forwarder_block (leader->dest, same_stmt_list_p,
- NULL);
- bb = new_edge->dest;
- if (dump_file)
- {
- fprintf (dump_file, "Splitting BB %d for Common stmt list. ",
- leader->dest->index);
- fprintf (dump_file, "Original block is now BB%d.\n", bb->index);
- print_gimple_seq (dump_file, curr_stmt_list, 0, TDF_VOPS);
- }
-
- FOR_EACH_EDGE (e, ei, new_edge->src->preds)
- {
- e->aux = NULL;
- if (dump_file)
- fprintf (dump_file, " Edge (%d->%d) lands here.\n",
- e->src->index, e->dest->index);
- }
-
- gsi = gsi_last_bb (leader->dest);
- gsi_insert_seq_after (&gsi, curr_stmt_list, GSI_NEW_STMT);
-
- leader_match = NULL;
- /* We should never get a new block now. */
- }
- else
- {
- PENDING_STMT (leader) = VEC_index (gimple_seq, stmt_list, x);
- gsi_commit_one_edge_insert (leader, NULL);
- }
-
-
- /* Clear the working data structures. */
- VEC_truncate (edge, edge_leader, 0);
- VEC_truncate (gimple_seq, stmt_list, 0);
- bitmap_clear (leader_has_match);
}
! /* This function will analyze the insertions which were performed on edges,
! and decide whether they should be left on that edge, or whether it is more
! efficient to emit some subset of them in a single block. All stmts are
! inserted somewhere. */
!
! static void
! perform_edge_inserts (void)
{
basic_block bb;
! if (dump_file)
! fprintf(dump_file, "Analyzing Edge Insertions.\n");
!
! /* analyze_edges_for_bb calls make_forwarder_block, which tries to
! incrementally update the dominator information. Since we don't
! need dominator information after this pass, go ahead and free the
! dominator information. */
! free_dominance_info (CDI_DOMINATORS);
! free_dominance_info (CDI_POST_DOMINATORS);
!
! /* Allocate data structures used in analyze_edges_for_bb. */
! init_analyze_edges_for_bb ();
!
! FOR_EACH_BB (bb)
! analyze_edges_for_bb (bb);
!
! analyze_edges_for_bb (EXIT_BLOCK_PTR);
!
! /* Free data structures used in analyze_edges_for_bb. */
! fini_analyze_edges_for_bb ();
!
! #ifdef ENABLE_CHECKING
! {
! edge_iterator ei;
! edge e;
! FOR_EACH_BB (bb)
{
FOR_EACH_EDGE (e, ei, bb->preds)
! {
! if (PENDING_STMT (e))
! error (" Pending stmts not issued on PRED edge (%d, %d)\n",
! e->src->index, e->dest->index);
! }
! FOR_EACH_EDGE (e, ei, bb->succs)
! {
! if (PENDING_STMT (e))
! error (" Pending stmts not issued on SUCC edge (%d, %d)\n",
! e->src->index, e->dest->index);
! }
! }
! FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR->succs)
! {
! if (PENDING_STMT (e))
! error (" Pending stmts not issued on ENTRY edge (%d, %d)\n",
! e->src->index, e->dest->index);
}
! FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR->preds)
! {
! if (PENDING_STMT (e))
! error (" Pending stmts not issued on EXIT edge (%d, %d)\n",
! e->src->index, e->dest->index);
! }
! }
! #endif
}
/* Remove the ssa-names in the current function and translate them into normal
compiler variables. PERFORM_TER is true if Temporary Expression Replacement
should also be used. */
static void
! remove_ssa_form (bool perform_ter)
{
- basic_block bb;
gimple *values = NULL;
var_map map;
! gimple_stmt_iterator gsi;
map = coalesce_ssa_name ();
--- 733,777 ----
}
}
#endif
}
+ /* Given the out-of-ssa info object SA (with prepared partitions)
+ eliminate all phi nodes in all basic blocks. Afterwards no
+ basic block will have phi nodes anymore and there are possibly
+ some RTL instructions inserted on edges. */
! void
! expand_phi_nodes (struct ssaexpand *sa)
{
basic_block bb;
+ elim_graph g = new_elim_graph (sa->map->num_partitions);
+ g->map = sa->map;
! FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR->next_bb, EXIT_BLOCK_PTR, next_bb)
! if (!gimple_seq_empty_p (phi_nodes (bb)))
{
+ edge e;
+ edge_iterator ei;
FOR_EACH_EDGE (e, ei, bb->preds)
! eliminate_phi (e, g);
! set_phi_nodes (bb, NULL);
}
!
! delete_elim_graph (g);
}
+
/* Remove the ssa-names in the current function and translate them into normal
compiler variables. PERFORM_TER is true if Temporary Expression Replacement
should also be used. */
static void
! remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
{
gimple *values = NULL;
var_map map;
! unsigned i;
map = coalesce_ssa_name ();
*************** remove_ssa_form (bool perform_ter)
*** 1365,1393 ****
dump_replaceable_exprs (dump_file, values);
}
! /* Assign real variables to the partitions now. */
! assign_vars (map);
! if (dump_file && (dump_flags & TDF_DETAILS))
{
! fprintf (dump_file, "After Base variable replacement:\n");
! dump_var_map (dump_file, map);
}
-
- rewrite_trees (map, values);
-
- if (values)
- free (values);
-
- /* Remove PHI nodes which have been translated back to real variables. */
- FOR_EACH_BB (bb)
- for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);)
- remove_phi_node (&gsi, true);
-
- /* If any copies were inserted on edges, analyze and insert them now. */
- perform_edge_inserts ();
-
- delete_var_map (map);
}
--- 792,812 ----
dump_replaceable_exprs (dump_file, values);
}
! rewrite_trees (map);
! sa->map = map;
! sa->values = values;
! sa->partition_has_default_def = BITMAP_ALLOC (NULL);
! for (i = 1; i < num_ssa_names; i++)
{
! tree t = ssa_name (i);
! if (t && SSA_NAME_IS_DEFAULT_DEF (t))
! {
! int p = var_to_partition (map, t);
! if (p != NO_PARTITION)
! bitmap_set_bit (sa->partition_has_default_def, p);
! }
}
}
*************** insert_backedge_copies (void)
*** 1477,1488 ****
}
}
/* Take the current function out of SSA form, translating PHIs as described in
R. Morgan, ``Building an Optimizing Compiler'',
Butterworth-Heinemann, Boston, MA, 1998. pp 176-186. */
! static unsigned int
! rewrite_out_of_ssa (void)
{
/* If elimination of a PHI requires inserting a copy on a backedge,
then we will have to split the backedge which has numerous
--- 896,921 ----
}
}
+ /* Free all memory associated with going out of SSA form. SA is
+ the outof-SSA info object. */
+
+ void
+ finish_out_of_ssa (struct ssaexpand *sa)
+ {
+ free (sa->partition_to_pseudo);
+ if (sa->values)
+ free (sa->values);
+ delete_var_map (sa->map);
+ BITMAP_FREE (sa->partition_has_default_def);
+ memset (sa, 0, sizeof *sa);
+ }
+
/* Take the current function out of SSA form, translating PHIs as described in
R. Morgan, ``Building an Optimizing Compiler'',
Butterworth-Heinemann, Boston, MA, 1998. pp 176-186. */
! unsigned int
! rewrite_out_of_ssa (struct ssaexpand *sa)
{
/* If elimination of a PHI requires inserting a copy on a backedge,
then we will have to split the backedge which has numerous
*************** rewrite_out_of_ssa (void)
*** 1499,1535 ****
if (dump_file && (dump_flags & TDF_DETAILS))
gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS);
! remove_ssa_form (flag_tree_ter && !flag_mudflap);
if (dump_file && (dump_flags & TDF_DETAILS))
gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS);
- cfun->gimple_df->in_ssa_p = false;
return 0;
}
-
-
- /* Define the parameters of the out of SSA pass. */
-
- struct gimple_opt_pass pass_del_ssa =
- {
- {
- GIMPLE_PASS,
- "optimized", /* name */
- NULL, /* gate */
- rewrite_out_of_ssa, /* execute */
- NULL, /* sub */
- NULL, /* next */
- 0, /* static_pass_number */
- TV_TREE_SSA_TO_NORMAL, /* tv_id */
- PROP_cfg | PROP_ssa, /* properties_required */
- 0, /* properties_provided */
- /* ??? If TER is enabled, we also kill gimple. */
- PROP_ssa, /* properties_destroyed */
- TODO_verify_ssa | TODO_verify_flow
- | TODO_verify_stmts, /* todo_flags_start */
- TODO_dump_func
- | TODO_ggc_collect
- | TODO_remove_unused_locals /* todo_flags_finish */
- }
- };
--- 932,941 ----
if (dump_file && (dump_flags & TDF_DETAILS))
gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS);
! remove_ssa_form (flag_tree_ter && !flag_mudflap, sa);
if (dump_file && (dump_flags & TDF_DETAILS))
gimple_dump_cfg (dump_file, dump_flags & ~TDF_DETAILS);
return 0;
}
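
For readers following the tree-outof-ssa.c changes above: eliminate_phi and elim_create emit the copies required by the PHI nodes on an edge, and elim_create allocates a temporary (now a pseudo from get_temp_reg) when the copies form a cycle. The sketch below is only a simplified, standalone illustration of that idea. It is not the depth-first elimination-graph walk the patch uses, and struct copy, is_pending_source and sequentialize are hypothetical names that do not exist in GCC. It assumes every destination partition appears at most once.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_COPIES 16

/* One pending copy "dest <- src" between partition numbers
   (hypothetical type, for illustration only).  */
struct copy { int dest, src; };

/* Return nonzero if partition P is still read by a pending copy
   other than SELF.  */
static int
is_pending_source (const struct copy *c, const int *done, int n,
                   int p, int self)
{
  int i;
  for (i = 0; i < n; i++)
    if (i != self && !done[i] && c[i].src == p)
      return 1;
  return 0;
}

/* Perform the parallel copies C[0..N-1] on VAL one move at a time.
   TEMP is the index of a spare slot in VAL used to break cycles.
   N must not exceed MAX_COPIES.  Returns the number of moves.  */
static int
sequentialize (struct copy *c, int n, int *val, int temp)
{
  int done[MAX_COPIES] = { 0 };
  int remaining = n, moves = 0, i;

  while (remaining > 0)
    {
      int progressed = 0;

      /* A copy is safe once no other pending copy reads its dest.  */
      for (i = 0; i < n; i++)
        if (!done[i] && !is_pending_source (c, done, n, c[i].dest, i))
          {
            val[c[i].dest] = val[c[i].src];
            done[i] = 1;
            remaining--;
            moves++;
            progressed = 1;
          }

      if (!progressed)
        {
          /* Only cycles remain: save one destination in TEMP and
             redirect its readers to TEMP.  */
          int d;
          for (i = 0; done[i]; i++)
            ;
          d = c[i].dest;
          val[temp] = val[d];
          moves++;
          for (i = 0; i < n; i++)
            if (!done[i] && c[i].src == d)
              c[i].src = temp;
        }
    }
  return moves;
}
```

For a two-partition swap (0 <- 1 and 1 <- 0) this takes three moves: one to save partition 0 in the temporary and two for the copies themselves. A copy chain without a cycle never touches the temporary.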
Index: ssaexpand.h
===================================================================
*** ssaexpand.h (revision 0)
--- ssaexpand.h (revision 0)
***************
*** 0 ****
--- 1,80 ----
+ /* Routines for expanding from SSA form to RTL.
+ Copyright (C) 2009 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3. If not see
+ <http://www.gnu.org/licenses/>. */
+
+
+ #ifndef _SSAEXPAND_H
+ #define _SSAEXPAND_H 1
+
+ #include "tree-ssa-live.h"
+
+ /* This structure (of which only a singleton SA exists) is used to
+ pass around information between the outof-SSA functions, cfgexpand
+ and expand itself. */
+ struct ssaexpand
+ {
+ /* The computed partitions of SSA names are stored here. */
+ var_map map;
+
+ /* For an SSA name version V, values[V] contains the gimple statement
+ defining it iff TER decided that it should be forwarded, NULL
+ otherwise. */
+ gimple *values;
+
+ /* For a partition number I, partition_to_pseudo[I] contains the
+ RTL expression for its allocated storage (either a MEM or
+ a pseudo REG). */
+ rtx *partition_to_pseudo;
+
+ /* If partition I contains an SSA name that has a default def,
+ bit I will be set in this bitmap. */
+ bitmap partition_has_default_def;
+ };
+
+ /* This is the singleton described above. */
+ extern struct ssaexpand SA;
+
+ /* Returns the RTX expression representing the storage of the outof-SSA
+ partition that the SSA name EXP is a member of. */
+ static inline rtx
+ get_rtx_for_ssa_name (tree exp)
+ {
+ int p = partition_find (SA.map->var_partition, SSA_NAME_VERSION (exp));
+ if (SA.map->partition_to_view)
+ p = SA.map->partition_to_view[p];
+ gcc_assert (p != NO_PARTITION);
+ return SA.partition_to_pseudo[p];
+ }
+
+ /* If TER decided to forward the definition of SSA name EXP this function
+ returns the defining statement, otherwise NULL. */
+ static inline gimple
+ get_gimple_for_ssa_name (tree exp)
+ {
+ int v = SSA_NAME_VERSION (exp);
+ if (SA.values)
+ return SA.values[v];
+ return NULL;
+ }
+
+ /* In tree-outof-ssa.c. */
+ void finish_out_of_ssa (struct ssaexpand *sa);
+ unsigned int rewrite_out_of_ssa (struct ssaexpand *sa);
+ void expand_phi_nodes (struct ssaexpand *sa);
+
+ #endif
Index: tree-flow.h
===================================================================
*** tree-flow.h (revision 146576)
--- tree-flow.h (working copy)
*************** struct var_ann_d GTY(())
*** 209,218 ****
{
struct tree_ann_common_d common;
- /* Used by the out of SSA pass to determine whether this variable has
- been seen yet or not. */
- unsigned out_of_ssa_tag : 1;
-
/* Used when building base variable structures in a var_map. */
unsigned base_var_processed : 1;
--- 209,214 ----
*************** struct var_ann_d GTY(())
*** 234,243 ****
information on each attribute. */
ENUM_BITFIELD (noalias_state) noalias_state : 2;
- /* Used when going out of SSA form to indicate which partition this
- variable represents storage for. */
- unsigned partition;
-
/* Used by var_map for the base index of ssa base variables. */
unsigned base_index;
--- 230,235 ----
*************** rtx addr_for_mem_ref (struct mem_address
*** 976,981 ****
--- 968,974 ----
void get_address_description (tree, struct mem_address *);
tree maybe_fold_tmr (tree);
+ unsigned int execute_free_datastructures (void);
unsigned int execute_fixup_cfg (void);
#include "tree-flow-inline.h"
Index: Makefile.in
===================================================================
*** Makefile.in (revision 146576)
--- Makefile.in (working copy)
*************** TREE_FLOW_H = tree-flow.h tree-flow-inli
*** 864,869 ****
--- 864,870 ----
$(HASHTAB_H) $(CGRAPH_H) $(IPA_REFERENCE_H) \
tree-ssa-alias.h
TREE_SSA_LIVE_H = tree-ssa-live.h $(PARTITION_H) vecprim.h
+ SSAEXPAND_H = ssaexpand.h $(TREE_SSA_LIVE_H)
PRETTY_PRINT_H = pretty-print.h $(INPUT_H) $(OBSTACK_H)
DIAGNOSTIC_H = diagnostic.h diagnostic.def $(PRETTY_PRINT_H) options.h
C_PRETTY_PRINT_H = c-pretty-print.h $(PRETTY_PRINT_H) $(C_COMMON_H) $(TREE_H)
*************** tree-ssa-coalesce.o : tree-ssa-coalesce.
*** 2106,2112 ****
tree-outof-ssa.o : tree-outof-ssa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
$(TREE_H) $(DIAGNOSTIC_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
$(TREE_PASS_H) $(TREE_SSA_LIVE_H) $(BASIC_BLOCK_H) $(BITMAP_H) $(GGC_H) \
! $(TOPLEV_H)
tree-ssa-dse.o : tree-ssa-dse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(GGC_H) $(TREE_H) $(RTL_H) $(TM_P_H) $(BASIC_BLOCK_H) \
$(TREE_FLOW_H) $(TREE_PASS_H) $(TREE_DUMP_H) domwalk.h $(FLAGS_H) \
--- 2107,2113 ----
tree-outof-ssa.o : tree-outof-ssa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
$(TREE_H) $(DIAGNOSTIC_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
$(TREE_PASS_H) $(TREE_SSA_LIVE_H) $(BASIC_BLOCK_H) $(BITMAP_H) $(GGC_H) \
! $(TOPLEV_H) $(EXPR_H) $(SSAEXPAND_H)
tree-ssa-dse.o : tree-ssa-dse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(GGC_H) $(TREE_H) $(RTL_H) $(TM_P_H) $(BASIC_BLOCK_H) \
$(TREE_FLOW_H) $(TREE_PASS_H) $(TREE_DUMP_H) domwalk.h $(FLAGS_H) \
*************** expr.o : expr.c $(CONFIG_H) $(SYSTEM_H)
*** 2533,2539 ****
typeclass.h hard-reg-set.h $(TOPLEV_H) hard-reg-set.h $(EXCEPT_H) reload.h \
$(GGC_H) langhooks.h intl.h $(TM_P_H) $(REAL_H) $(TARGET_H) \
tree-iterator.h gt-expr.h $(MACHMODE_H) $(TIMEVAR_H) $(TREE_FLOW_H) \
! $(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h
dojump.o : dojump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
$(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(OPTABS_H) $(INSN_ATTR_H) insn-config.h \
langhooks.h $(GGC_H) gt-dojump.h vecprim.h $(BASIC_BLOCK_H)
--- 2534,2540 ----
typeclass.h hard-reg-set.h $(TOPLEV_H) hard-reg-set.h $(EXCEPT_H) reload.h \
$(GGC_H) langhooks.h intl.h $(TM_P_H) $(REAL_H) $(TARGET_H) \
tree-iterator.h gt-expr.h $(MACHMODE_H) $(TIMEVAR_H) $(TREE_FLOW_H) \
! $(TREE_PASS_H) $(DF_H) $(DIAGNOSTIC_H) vecprim.h $(SSAEXPAND_H)
dojump.o : dojump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) \
$(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(OPTABS_H) $(INSN_ATTR_H) insn-config.h \
langhooks.h $(GGC_H) gt-dojump.h vecprim.h $(BASIC_BLOCK_H)
*************** cfgexpand.o : cfgexpand.c $(TREE_FLOW_H)
*** 2805,2811 ****
$(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \
coretypes.h $(TREE_DUMP_H) $(EXCEPT_H) langhooks.h $(TREE_PASS_H) $(RTL_H) \
$(DIAGNOSTIC_H) $(TOPLEV_H) $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
! value-prof.h $(TREE_INLINE_H) $(TARGET_H)
cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
$(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
output.h $(TOPLEV_H) $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) insn-config.h $(EXPR_H) \
--- 2806,2812 ----
$(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) $(FUNCTION_H) $(TIMEVAR_H) $(TM_H) \
coretypes.h $(TREE_DUMP_H) $(EXCEPT_H) langhooks.h $(TREE_PASS_H) $(RTL_H) \
$(DIAGNOSTIC_H) $(TOPLEV_H) $(BASIC_BLOCK_H) $(FLAGS_H) debug.h $(PARAMS_H) \
! value-prof.h $(TREE_INLINE_H) $(TARGET_H) $(SSAEXPAND_H)
cfgrtl.o : cfgrtl.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \
$(FLAGS_H) insn-config.h $(BASIC_BLOCK_H) $(REGS_H) hard-reg-set.h \
output.h $(TOPLEV_H) $(FUNCTION_H) $(EXCEPT_H) $(TM_P_H) insn-config.h $(EXPR_H) \
Index: basic-block.h
===================================================================
*** basic-block.h (revision 146576)
--- basic-block.h (working copy)
*************** extern void update_bb_for_insn (basic_bl
*** 512,517 ****
--- 512,518 ----
extern void insert_insn_on_edge (rtx, edge);
basic_block split_edge_and_insert (edge, rtx);
+ extern void commit_one_edge_insertion (edge e);
extern void commit_edge_insertions (void);
extern void remove_fake_edges (void);
Index: passes.c
===================================================================
*** passes.c (revision 146576)
--- passes.c (working copy)
*************** init_optimization_passes (void)
*** 707,723 ****
NEXT_PASS (pass_local_pure_const);
}
NEXT_PASS (pass_cleanup_eh);
- NEXT_PASS (pass_del_ssa);
NEXT_PASS (pass_nrv);
NEXT_PASS (pass_mark_used_blocks);
NEXT_PASS (pass_cleanup_cfg_post_optimizing);
-
NEXT_PASS (pass_warn_function_noreturn);
- NEXT_PASS (pass_free_datastructures);
- NEXT_PASS (pass_mudflap_2);
! NEXT_PASS (pass_free_cfg_annotations);
NEXT_PASS (pass_expand);
NEXT_PASS (pass_rest_of_compilation);
{
struct opt_pass **p = &pass_rest_of_compilation.pass.sub;
--- 707,723 ----
NEXT_PASS (pass_local_pure_const);
}
NEXT_PASS (pass_cleanup_eh);
NEXT_PASS (pass_nrv);
+ NEXT_PASS (pass_mudflap_2);
NEXT_PASS (pass_mark_used_blocks);
NEXT_PASS (pass_cleanup_cfg_post_optimizing);
NEXT_PASS (pass_warn_function_noreturn);
! /* NEXT_PASS (pass_del_ssa);
! NEXT_PASS (pass_free_datastructures);
! NEXT_PASS (pass_free_cfg_annotations);*/
NEXT_PASS (pass_expand);
+
NEXT_PASS (pass_rest_of_compilation);
{
struct opt_pass **p = &pass_rest_of_compilation.pass.sub;
Index: cfgrtl.c
===================================================================
*** cfgrtl.c (revision 146576)
--- cfgrtl.c (working copy)
*************** along with GCC; see the file COPYING3.
*** 64,70 ****
static int can_delete_note_p (const_rtx);
static int can_delete_label_p (const_rtx);
- static void commit_one_edge_insertion (edge);
static basic_block rtl_split_edge (edge);
static bool rtl_move_block_after (basic_block, basic_block);
static int rtl_verify_flow_info (void);
--- 64,69 ----
*************** try_redirect_by_replacing_jump (edge e,
*** 856,886 ****
return e;
}
! /* Redirect edge representing branch of (un)conditional jump or tablejump,
! NULL on failure */
! static edge
! redirect_branch_edge (edge e, basic_block target)
{
rtx tmp;
- rtx old_label = BB_HEAD (e->dest);
- basic_block src = e->src;
- rtx insn = BB_END (src);
-
- /* We can only redirect non-fallthru edges of jump insn. */
- if (e->flags & EDGE_FALLTHRU)
- return NULL;
- else if (!JUMP_P (insn))
- return NULL;
-
/* Recognize a tablejump and adjust all matching cases. */
if (tablejump_p (insn, NULL, &tmp))
{
rtvec vec;
int j;
! rtx new_label = block_label (target);
! if (target == EXIT_BLOCK_PTR)
! return NULL;
if (GET_CODE (PATTERN (tmp)) == ADDR_VEC)
vec = XVEC (PATTERN (tmp), 0);
else
--- 855,879 ----
return e;
}
! /* Subroutine of redirect_branch_edge that tries to patch the jump
! instruction INSN so that it reaches block NEW. Do this
! only when it originally reached block OLD. Return true if this
! worked or the original target wasn't OLD, return false if redirection
! doesn't work. */
!
! static bool
! patch_jump_insn (rtx insn, rtx old_label, basic_block new_bb)
{
rtx tmp;
/* Recognize a tablejump and adjust all matching cases. */
if (tablejump_p (insn, NULL, &tmp))
{
rtvec vec;
int j;
! rtx new_label = block_label (new_bb);
! if (new_bb == EXIT_BLOCK_PTR)
! return false;
if (GET_CODE (PATTERN (tmp)) == ADDR_VEC)
vec = XVEC (PATTERN (tmp), 0);
else
*************** redirect_branch_edge (edge e, basic_bloc
*** 915,934 ****
if (computed_jump_p (insn)
/* A return instruction can't be redirected. */
|| returnjump_p (insn))
! return NULL;
!
! /* If the insn doesn't go where we think, we're confused. */
! gcc_assert (JUMP_LABEL (insn) == old_label);
! /* If the substitution doesn't succeed, die. This can happen
! if the back end emitted unrecognizable instructions or if
! target is exit block on some arches. */
! if (!redirect_jump (insn, block_label (target), 0))
{
! gcc_assert (target == EXIT_BLOCK_PTR);
! return NULL;
}
}
if (dump_file)
fprintf (dump_file, "Edge %i->%i redirected to %i\n",
--- 908,962 ----
if (computed_jump_p (insn)
/* A return instruction can't be redirected. */
|| returnjump_p (insn))
! return false;
! if (!currently_expanding_to_rtl || JUMP_LABEL (insn) == old_label)
{
! /* If the insn doesn't go where we think, we're confused. */
! gcc_assert (JUMP_LABEL (insn) == old_label);
!
! /* If the substitution doesn't succeed, die. This can happen
! if the back end emitted unrecognizable instructions or if
! target is exit block on some arches. */
! if (!redirect_jump (insn, block_label (new_bb), 0))
! {
! gcc_assert (new_bb == EXIT_BLOCK_PTR);
! return false;
! }
}
}
+ return true;
+ }
+
+
+ /* Redirect edge representing branch of (un)conditional jump or tablejump,
+ NULL on failure */
+ static edge
+ redirect_branch_edge (edge e, basic_block target)
+ {
+ rtx old_label = BB_HEAD (e->dest);
+ basic_block src = e->src;
+ rtx insn = BB_END (src);
+
+ /* We can only redirect non-fallthru edges of jump insn. */
+ if (e->flags & EDGE_FALLTHRU)
+ return NULL;
+ else if (!JUMP_P (insn) && !currently_expanding_to_rtl)
+ return NULL;
+
+ if (!currently_expanding_to_rtl)
+ {
+ if (!patch_jump_insn (insn, old_label, target))
+ return NULL;
+ }
+ else
+ /* When expanding this BB might actually contain multiple
+ jumps (i.e. not yet split by find_many_sub_basic_blocks).
+ Redirect all of those that match our label. */
+ for (insn = BB_HEAD (src); insn != NEXT_INSN (BB_END (src));
+ insn = NEXT_INSN (insn))
+ if (JUMP_P (insn) && !patch_jump_insn (insn, old_label, target))
+ return NULL;
if (dump_file)
fprintf (dump_file, "Edge %i->%i redirected to %i\n",
*************** insert_insn_on_edge (rtx pattern, edge e
*** 1313,1319 ****
/* Update the CFG for the instructions queued on edge E. */
! static void
commit_one_edge_insertion (edge e)
{
rtx before = NULL_RTX, after = NULL_RTX, insns, tmp, last;
--- 1341,1347 ----
/* Update the CFG for the instructions queued on edge E. */
! void
commit_one_edge_insertion (edge e)
{
rtx before = NULL_RTX, after = NULL_RTX, insns, tmp, last;
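
Editorial aside, not part of the patch: the cfgrtl.c change splits
redirect_branch_edge so that, while expanding, every jump in the source
block that targets the old label is retargeted, not just the last
instruction. The standalone C model below sketches that contract under
simplifying assumptions. All toy_* names are hypothetical, labels are plain
integers, and a single TOY_RETURN kind stands in for the jumps that
patch_jump_insn refuses (computed jumps and returns).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_INSN, TOY_JUMP, TOY_RETURN };

struct toy_insn
{
  enum toy_kind kind;
  int label;   /* jump target; only meaningful for TOY_JUMP */
};

/* Model of patch_jump_insn: return false if the jump cannot be
   redirected at all, true if it was retargeted or if it never
   pointed at OLD_LABEL in the first place.  */
bool
toy_patch_jump (struct toy_insn *insn, int old_label, int new_label)
{
  if (insn->kind == TOY_RETURN)
    return false;
  if (insn->label == old_label)
    insn->label = new_label;
  return true;
}

/* Model of the expansion-time loop in redirect_branch_edge: walk the
   whole block and patch each jump, failing as soon as one jump
   cannot be handled.  */
bool
toy_redirect_all (struct toy_insn *insns, size_t n,
                  int old_label, int new_label)
{
  assert (insns != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (insns[i].kind != TOY_INSN
        && !toy_patch_jump (&insns[i], old_label, new_label))
      return false;
  return true;
}

/* Sample block with several jumps, two of them to label 7.  */
struct toy_insn toy_bb[] = {
  { TOY_INSN, 0 }, { TOY_JUMP, 7 }, { TOY_JUMP, 9 }, { TOY_JUMP, 7 }
};

/* Sample block that ends in a return, which cannot be redirected.  */
struct toy_insn toy_bb_ret[] = {
  { TOY_JUMP, 7 }, { TOY_RETURN, 0 }
};
```

Redirecting label 7 to 12 in toy_bb rewrites both matching jumps and leaves
the jump to 9 alone. The same call on toy_bb_ret fails at the return, but
only after the first jump has already been patched, which mirrors the
early exit in the patched loop.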
^ permalink raw reply [flat|nested] 56+ messages in thread
end of thread, other threads:[~2011-02-14 17:38 UTC | newest]
Thread overview: 56+ messages
2009-04-27 14:15 [RFA] expand from SSA form (1/2) David Edelsohn
2009-04-27 14:43 ` H.J. Lu
2009-04-27 15:08 ` Michael Matz
2009-04-27 15:11 ` David Edelsohn
2009-04-27 15:51 ` Michael Matz
2009-04-27 17:03 ` David Edelsohn
2009-04-27 17:27 ` David Edelsohn
2009-04-27 19:15 ` David Edelsohn
2009-04-28 0:48 ` Michael Matz
2009-04-28 0:54 ` Luis Machado
2009-04-28 1:22 ` Michael Matz
2009-04-28 13:24 ` Luis Machado
2009-04-30 17:55 ` Luis Machado
2009-05-01 19:33 ` Richard Guenther
2009-05-04 13:38 ` Luis Machado
2009-04-28 16:05 ` David Edelsohn
2009-04-28 16:19 ` Michael Matz
2009-04-28 23:49 ` Michael Matz
2009-04-29 5:50 ` David Edelsohn
2009-04-29 12:48 ` Michael Matz
2009-04-29 13:21 ` David Edelsohn
2009-04-29 13:35 ` Michael Matz
2009-04-29 14:38 ` David Edelsohn
2009-04-29 14:50 ` Richard Guenther
2009-04-29 15:03 ` Andreas Krebbel
2009-04-29 15:11 ` Michael Matz
2009-04-29 15:40 ` Andreas Krebbel
2009-04-29 17:33 ` Michael Matz
2009-04-29 17:41 ` Michael Matz
-- strict thread matches above, loose matches on Subject: below --
2009-04-13 20:50 RFC: " Michael Matz
2009-04-21 18:23 ` Andrew MacLeod
2009-04-22 10:54 ` Michael Matz
2009-04-22 16:45 ` [RFA] " Michael Matz
2009-04-23 15:10 ` Andrew MacLeod
2009-04-24 9:42 ` Richard Guenther
2009-04-26 20:27 ` Michael Matz
2010-01-19 15:48 ` H.J. Lu
2009-04-24 14:32 ` Richard Guenther
2009-04-24 14:46 ` Richard Guenther
2009-04-26 20:21 ` Michael Matz
2009-04-26 20:34 ` Richard Guenther
2009-04-26 20:53 ` Michael Matz
2009-04-26 21:14 ` Richard Guenther
2009-04-26 21:15 ` Michael Matz
2009-04-26 21:17 ` Richard Guenther
2009-04-26 22:21 ` Michael Matz
2009-04-26 21:42 ` Michael Matz
2009-04-26 22:15 ` Michael Matz
2009-04-27 12:34 ` Michael Matz
2009-04-27 5:47 ` H.J. Lu
2009-04-28 23:49 ` H.J. Lu
2009-04-29 0:21 ` Andrew Pinski
2009-04-30 13:47 ` H.J. Lu
2009-05-29 3:47 ` H.J. Lu
2010-10-20 18:02 ` H.J. Lu
2011-02-14 18:02 ` H.J. Lu
2009-04-27 7:22 ` Hans-Peter Nilsson
2009-04-30 18:18 ` Steve Ellcey
2009-05-01 17:40 ` Michael Matz