* [Bug rtl-optimization/19038] Loop header copying
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
@ 2004-12-16 21:39 ` dje at gcc dot gnu dot org
2004-12-16 22:55 ` steven at gcc dot gnu dot org
` (51 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: dje at gcc dot gnu dot org @ 2004-12-16 21:39 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dje at gcc dot gnu dot org 2004-12-16 21:39 -------
The focus of the problem is the inner loop of functio thin6d at line 572. The
function consumes 97.5% of the cycles and the inner loop consumes 48.5%.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug rtl-optimization/19038] Loop header copying
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
2004-12-16 21:39 ` [Bug rtl-optimization/19038] " dje at gcc dot gnu dot org
@ 2004-12-16 22:55 ` steven at gcc dot gnu dot org
2004-12-17 4:04 ` pinskia at gcc dot gnu dot org
` (50 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: steven at gcc dot gnu dot org @ 2004-12-16 22:55 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2004-12-16 22:55 -------
The interesting thing is that the front end already does the header
"copying" when it generates code:
{
int4 D.1164;
D.1164 = nmz;
k = 3;
k.12 = k;
if (k.12 <= D.1164)
{
<D1500>:;
{
...loop body, straight-line code...
__label_000420:;
k.12 = k;
D.1168 = k.12 == D.1164;
k.12 = k;
D.1522 = k.12 + 1;
k = D.1522;
if (D.1168)
{
goto L.51;
}
else
{
}
}
goto <D1500>;
}
else
{
}
L.51:;
}
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug rtl-optimization/19038] Loop header copying
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
2004-12-16 21:39 ` [Bug rtl-optimization/19038] " dje at gcc dot gnu dot org
2004-12-16 22:55 ` steven at gcc dot gnu dot org
@ 2004-12-17 4:04 ` pinskia at gcc dot gnu dot org
2004-12-17 4:05 ` pinskia at gcc dot gnu dot org
` (49 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-17 4:04 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-17 04:04 -------
If anyone wants the full benchmark I can attach it or sent it to them.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug rtl-optimization/19038] Loop header copying
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (2 preceding siblings ...)
2004-12-17 4:04 ` pinskia at gcc dot gnu dot org
@ 2004-12-17 4:05 ` pinskia at gcc dot gnu dot org
2004-12-17 4:16 ` pinskia at gcc dot gnu dot org
` (48 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-17 4:05 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-17 04:05 -------
(In reply to comment #3)
> If anyone wants the full benchmark I can attach it or sent it to them.
That was not for this bug, woops.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug rtl-optimization/19038] Loop header copying
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (3 preceding siblings ...)
2004-12-17 4:05 ` pinskia at gcc dot gnu dot org
@ 2004-12-17 4:16 ` pinskia at gcc dot gnu dot org
2004-12-17 4:28 ` pinskia at gcc dot gnu dot org
` (47 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-17 4:16 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-17 04:16 -------
I think the problem is the BBs are not being merged if the secondary BB has a user label. If that is so,
what we should do is merge the BB and just move the labels. I will try to this and attach the patch so
someone can test it as I don't really have access to SPEC.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug rtl-optimization/19038] Loop header copying
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (4 preceding siblings ...)
2004-12-17 4:16 ` pinskia at gcc dot gnu dot org
@ 2004-12-17 4:28 ` pinskia at gcc dot gnu dot org
2004-12-20 18:25 ` [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB pinskia at gcc dot gnu dot org
` (46 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-17 4:28 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-17 04:28 -------
Could someone see if the patch attached to PR 19049 fixes the bug, I am going to submit it anyways
since it helps so much fortran code in general and we can vectorize some more loops.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (5 preceding siblings ...)
2004-12-17 4:28 ` pinskia at gcc dot gnu dot org
@ 2004-12-20 18:25 ` pinskia at gcc dot gnu dot org
2004-12-20 19:04 ` pinskia at gcc dot gnu dot org
` (45 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-20 18:25 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-20 18:25 -------
After getting access to sixtrack sources, steven and I found this is a problem in out of ssa committing
on a new BB, I almost have a patch but it has one issue in that we could be the new statement's results
in the control flow statement as we insert before it.
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |pinskia at gcc dot gnu dot
| |org
Severity|enhancement |minor
Status|UNCONFIRMED |NEW
Component|rtl-optimization |tree-optimization
Ever Confirmed| |1
Last reconfirmed|0000-00-00 00:00:00 |2004-12-20 18:25:23
date| |
Summary|Loop header copying |[4.0 Regression] out-of ssa
| |causing loops to have more
| |than one BB
Target Milestone|--- |4.0.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (6 preceding siblings ...)
2004-12-20 18:25 ` [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB pinskia at gcc dot gnu dot org
@ 2004-12-20 19:04 ` pinskia at gcc dot gnu dot org
2004-12-20 19:23 ` pinskia at gcc dot gnu dot org
` (44 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-20 19:04 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-20 19:04 -------
The reduced testcase (well heavely modified) looks like:
int
f (int temp2, int temp3, int xlvj_, int zlvj_, int yv1j_, int yv2j_)
{
int temp1, temp5;
for(;;) {
temp1 = temp2*xlvj_ - temp3*zlvj_;
temp3 = temp2*zlvj_ + temp3*xlvj_;
temp5 = temp2;
temp2 = temp1;
if (temp3)
break;
}
return yv1j_ * yv2j_;
}
Before the tree-ssa we would produce one BB when creating RTL but after out of ssa produces an extra
BB to hold " temp2 = temp1;"
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (7 preceding siblings ...)
2004-12-20 19:04 ` pinskia at gcc dot gnu dot org
@ 2004-12-20 19:23 ` pinskia at gcc dot gnu dot org
2004-12-20 19:47 ` pinskia at gcc dot gnu dot org
` (43 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-20 19:23 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-20 19:23 -------
This also causes a compile time issue as the RTL optimizers and the register allocator have to do more
work.
--
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |compile-time-hog
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (8 preceding siblings ...)
2004-12-20 19:23 ` pinskia at gcc dot gnu dot org
@ 2004-12-20 19:47 ` pinskia at gcc dot gnu dot org
2004-12-22 19:03 ` steven at gcc dot gnu dot org
` (42 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-20 19:47 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-20 19:47 -------
(In reply to comment #8)
> The reduced testcase (well heavely modified) looks like:
Here is the resulting loop asm (on x86 where shows up worse than ppc which is the same because
the RA does its work better there):
.L7:
movl %edx, %ecx
.L3:
movl %ecx, %edx
movl %ebx, %eax
imull %edi, %edx
imull %esi, %eax
imull %esi, %ecx
subl %eax, %edx
movl %ebx, %eax
movl %ecx, %ebx
imull %edi, %eax
addl %eax, %ebx
je .L7
Compared to 3.4.0:
.L2:
movl 16(%ebp), %ebx
movl %eax, %ecx
movl %esi, %edx
imull 16(%ebp), %eax
imull %edi, %ecx
imull %esi, %ebx
imull %edi, %edx
movl %ebx, %esi
subl %ecx, %esi
addl %edx, %eax
je .L2
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (9 preceding siblings ...)
2004-12-20 19:47 ` pinskia at gcc dot gnu dot org
@ 2004-12-22 19:03 ` steven at gcc dot gnu dot org
2004-12-23 2:41 ` pinskia at gcc dot gnu dot org
` (41 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: steven at gcc dot gnu dot org @ 2004-12-22 19:03 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2004-12-22 19:03 -------
This is not a compile time hog, it's a performance regression.
*Please*, I beg you,
DON'T USE KEYWORDS LIKE REGRESSION AND HOG SO G** D*** OFTEN FOR
TESTS THAT ARE NOT REGRESSIONS OR HOGS
--
What |Removed |Added
----------------------------------------------------------------------------
Keywords|compile-time-hog |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (10 preceding siblings ...)
2004-12-22 19:03 ` steven at gcc dot gnu dot org
@ 2004-12-23 2:41 ` pinskia at gcc dot gnu dot org
2004-12-23 15:19 ` dje at gcc dot gnu dot org
` (40 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-23 2:41 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-23 02:41 -------
Note I also see it for some simple fortran code see PR 14741.
--
What |Removed |Added
----------------------------------------------------------------------------
Known to fail| |4.0.0
Known to work| |3.3 3.4.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (11 preceding siblings ...)
2004-12-23 2:41 ` pinskia at gcc dot gnu dot org
@ 2004-12-23 15:19 ` dje at gcc dot gnu dot org
2004-12-23 16:00 ` steven at gcc dot gnu dot org
` (39 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: dje at gcc dot gnu dot org @ 2004-12-23 15:19 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dje at gcc dot gnu dot org 2004-12-23 15:19 -------
Fixing the problem at the RTL evel with loop header copying improves sixtrack
performance 12.5%.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (12 preceding siblings ...)
2004-12-23 15:19 ` dje at gcc dot gnu dot org
@ 2004-12-23 16:00 ` steven at gcc dot gnu dot org
2004-12-24 12:13 ` mustafa at il dot ibm dot com
` (38 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: steven at gcc dot gnu dot org @ 2004-12-23 16:00 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2004-12-23 16:00 -------
s/fixing/papering over/ in the previous comment ;-)
This appears to be a problem with loop-closed ssa form, but I don't
quite understand it yet. But we should fix this on trees no matter
what, we don't need another RTL pass.
But 12.5%, wow! Making this a "normal" bug (not `enhancement') and
upgrading to P1. We really shoud fix this for GCC 4.0. If not for
performance then for the just completely broken idea that we turn a
simple 1 BB loop into a complex multiple-BBs loop. I'm quite sure
that this hurts a lot more than just SMS. (Actually, that is what
that 12.5% shows, of course.)
--
What |Removed |Added
----------------------------------------------------------------------------
Severity|minor |normal
Priority|P2 |P1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (13 preceding siblings ...)
2004-12-23 16:00 ` steven at gcc dot gnu dot org
@ 2004-12-24 12:13 ` mustafa at il dot ibm dot com
2004-12-24 12:42 ` steven at gcc dot gnu dot org
` (37 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: mustafa at il dot ibm dot com @ 2004-12-24 12:13 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From mustafa at il dot ibm dot com 2004-12-24 12:13 -------
Doing loop header copying in RTL works around this problem, doing so
gain us improvements of 3.9% for SPECINT on a G5 machine and a 66%
for gcc benchmark.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (14 preceding siblings ...)
2004-12-24 12:13 ` mustafa at il dot ibm dot com
@ 2004-12-24 12:42 ` steven at gcc dot gnu dot org
2004-12-24 12:46 ` steven at gcc dot gnu dot org
` (36 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: steven at gcc dot gnu dot org @ 2004-12-24 12:42 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2004-12-24 12:42 -------
A smaller test case:
int
f (int temp2, int xlvj_)
{
int temp1;
for(;;) {
temp1 = temp2*xlvj_;
temp2 = temp1;
if (temp1)
break;
}
return xlvj_;
}
After out-of-ssa this gives the following tree dump:
;; Function f (f)
Analyzing Edge Insertions.
f (temp2, xlvj_)
{
int temp1;
int D.1471;
<L4>:;
<L0>:;
temp1 = temp2 * xlvj_;
if (temp1 != 0) goto <L2>; else goto <L7>;
<L7>:;
temp2 = temp1;
goto <bb 1> (<L0>);
<L2>:;
return xlvj_;
}
As far as I can tell the following test case is semantically
equivalent to the one above:
int
f (int temp2, int xlvj_)
{
for(;;) {
temp2 = temp2*xlvj_;
if (temp2)
break;
}
return xlvj_;
}
But the tree dumps are definitely not the same, in the smaller
test case we don't have the extra copy:
;; Function f (f)
Analyzing Edge Insertions.
f (temp2, xlvj_)
{
int D.1470;
<L4>:;
<L0>:;
temp2 = temp2 * xlvj_;
if (temp2 != 0) goto <L2>; else goto <L0>;
<L2>:;
return xlvj_;
}
So apparently something is preventing temp1 and temp2 from being
coalesced when going out of ssa.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (15 preceding siblings ...)
2004-12-24 12:42 ` steven at gcc dot gnu dot org
@ 2004-12-24 12:46 ` steven at gcc dot gnu dot org
2004-12-24 12:52 ` steven at gcc dot gnu dot org
` (35 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: steven at gcc dot gnu dot org @ 2004-12-24 12:46 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2004-12-24 12:46 -------
Side-by-side just before out-of-ssa:
f (temp2D.1464, xlvj_D.1465) f (temp2D.1464,
xlvj_D.1465)
{ {
intD.0 D.1470; | intD.0
temp1D.1468;
> intD.0 D.1471;
# BLOCK 0 # BLOCK 0
# PRED: ENTRY [100.0%] (fallthru,exec) # PRED: ENTRY
[100.0%] (fallthru,exec)
<L4>:; <L4>:;
# SUCC: 1 [100.0%] (fallthru,exec) # SUCC: 1
[100.0%] (fallthru,exec)
# BLOCK 1 # BLOCK 1
# PRED: 0 [100.0%] (fallthru,exec) 1 [89.0%] (dfs_back,fa # PRED: 0
[100.0%] (fallthru,exec) 1 [89.0%] (dfs_back,fa
# temp2D.1464_1 = PHI <temp2D.1464_2(0), temp2D.1464_4(1)>; | #
temp2D.1464_1 = PHI <temp2D.1464_2(0), temp1D.1468_4(1)>;
<L0>:; <L0>:;
temp2D.1464_4 = temp2D.1464_1 * xlvj_D.1465_3; | temp1D.1468_4
= temp2D.1464_1 * xlvj_D.1465_3;
if (temp2D.1464_4 != 0) goto <L2>; else goto <L0>; | if
(temp1D.1468_4 != 0) goto <L2>; else goto <L0>;
# SUCC: 2 [11.0%] (loop_exit,true,exec) 1 [89.0%] (dfs_ba # SUCC: 2
[11.0%] (loop_exit,true,exec) 1 [89.0%] (dfs_ba
# BLOCK 2 # BLOCK 2
# PRED: 1 [11.0%] (loop_exit,true,exec) # PRED: 1
[11.0%] (loop_exit,true,exec)
<L2>:; <L2>:;
return xlvj_D.1465_3; return
xlvj_D.1465_3;
# SUCC: EXIT [100.0%] # SUCC: EXIT
[100.0%]
} }
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (16 preceding siblings ...)
2004-12-24 12:46 ` steven at gcc dot gnu dot org
@ 2004-12-24 12:52 ` steven at gcc dot gnu dot org
2004-12-24 12:55 ` steven at gcc dot gnu dot org
` (34 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: steven at gcc dot gnu dot org @ 2004-12-24 12:52 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2004-12-24 12:52 -------
Silly bugzilla... Sorry about that.
A piece of unidiff, then. (-) is the smaller test that works, (+) is the
test where we get the extra copy. What I wanted to show is that only the
root variable is different in the PHI argument:
@@ -28,10 +29,10 @@
# BLOCK 1
# PRED: 0 [100.0%] (fallthru,exec) 1 [89.0%] (dfs_back,false,exec)
- # temp2D.1464_1 = PHI <temp2D.1464_2(0), temp2D.1464_4(1)>;
+ # temp2D.1464_1 = PHI <temp2D.1464_2(0), temp1D.1468_4(1)>;
<L0>:;
- temp2D.1464_4 = temp2D.1464_1 * xlvj_D.1465_3;
- if (temp2D.1464_4 != 0) goto <L2>; else goto <L0>;
+ temp1D.1468_4 = temp2D.1464_1 * xlvj_D.1465_3;
+ if (temp1D.1468_4 != 0) goto <L2>; else goto <L0>;
# SUCC: 2 [11.0%] (loop_exit,true,exec) 1 [89.0%] (dfs_back,false,exec)
# BLOCK 2
@@ -48,45 +49,45 @@
This comes from DOM copy-proping around the loop. Before DOM1 we have
a function that looks like this (.dce1):
;; Function f (f)
f (temp2, xlvj_)
{
int temp1;
int D.1471;
# temp2_1 = PHI <temp2_2(-1), temp2_5(0)>;
<L0>:;
temp1_4 = temp2_1 * xlvj__3;
temp2_5 = temp1_4;
if (temp1_4 != 0) goto <L2>; else goto <L0>;
<L2>:;
xlvj__6 = xlvj__3;
return xlvj__6;
}
After DOM1 (the .dom1 dump) we have this:
;; Function f (f)
f (temp2, xlvj_)
{
int temp1;
int D.1471;
# temp2_1 = PHI <temp2_2(-1), temp1_4(0)>;
<L0>:;
temp1_4 = temp2_1 * xlvj__3;
temp2_5 = temp1_4;
if (temp1_4 != 0) goto <L2>; else goto <L0>;
<L2>:;
xlvj__6 = xlvj__3;
return xlvj__3;
}
We never recover from this, we don't coalesce, and there's your copy.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (17 preceding siblings ...)
2004-12-24 12:52 ` steven at gcc dot gnu dot org
@ 2004-12-24 12:55 ` steven at gcc dot gnu dot org
2004-12-26 1:09 ` dberlin at gcc dot gnu dot org
` (33 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: steven at gcc dot gnu dot org @ 2004-12-24 12:55 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2004-12-24 12:55 -------
Possible work-arounds/hacks:
- Don't copy-prop into a PHI argument when the root variable of the
PHI result is the same as the root variable of the copy that is
being replaced. Seems silly that this would be necessary.
- Try to insert copies before conditional jumps if the jump condition
does not depend on anything in the copy.
- Figure out why we don't coalesce temp1 and temp2. Can we coalesce
them? If so, why are we failing to do it?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (18 preceding siblings ...)
2004-12-24 12:55 ` steven at gcc dot gnu dot org
@ 2004-12-26 1:09 ` dberlin at gcc dot gnu dot org
2004-12-26 15:06 ` pinskia at gcc dot gnu dot org
` (32 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: dberlin at gcc dot gnu dot org @ 2004-12-26 1:09 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at gcc dot gnu dot org 2004-12-26 01:09 -------
Just to followup on irc discussions.
Being able to coalesce this may require moving to a different out-of-ssa
algorithm that is better at coalescing.
All of the below take different approaches to the coalescing problem, however,
they may provide some insight as to whether/how these two variables can be
coalesced.
The briggs one (which is what morgan is based on, which is what the comments
claim is used) is known not to coalesce a significant number of variables.
A comparison of the two major algorithms for out-of-ssa (briggs and sreedhar's)
can be found at
http://www.is.titech.ac.jp/~sassa/papers-written/sassa-kohama-ito-ipsi04.pdf
Sreedhar's removes roughly 80% more copies than brigg's algorithm.
Note that the authors propose their own algorithm, even though it does
significantly worse than sreedhars, so i don't consider it a serious contender here.
Sreedhar's can be found at http://tinyurl.com/5axtl
i have a copy of the paper if you have no access.
In addition, yet another different approach to out-of-ssa coalescing was
presented at cgo 2004.
http://www.cgo.org/cgo2004/papers/21_14_rastello_f_revised.pdf
I mention it because it supposedly can coalesce even more than sreedhar's
algorithm is able to.
I point these out only as references to hopefully be able coalesce these variables.
I don't believe totally rewriting out-of-ssa would be apropos for 4.0 unless the
new algorithm is significantly simpler, etc (maybe one of the above would be, i
don't know).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (19 preceding siblings ...)
2004-12-26 1:09 ` dberlin at gcc dot gnu dot org
@ 2004-12-26 15:06 ` pinskia at gcc dot gnu dot org
2004-12-27 21:08 ` law at redhat dot com
` (31 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-26 15:06 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-26 15:06 -------
This will happen with any "DO LABEL VAR=START,END" loop in fortran.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (20 preceding siblings ...)
2004-12-26 15:06 ` pinskia at gcc dot gnu dot org
@ 2004-12-27 21:08 ` law at redhat dot com
2004-12-28 8:22 ` law at redhat dot com
` (30 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: law at redhat dot com @ 2004-12-27 21:08 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From law at redhat dot com 2004-12-27 21:08 -------
Trying to solve this problem by avoiding copy-propagation in special cases
is definitely not the right approach.
Any time we do a copy propagation or redundancy elimination we run the risk
of possibly creating overlapping lifetimes for objects which did not previously
overlap.
Our focus needs to be on figuring out why we did not coalesce everything to
a single variable.
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (21 preceding siblings ...)
2004-12-27 21:08 ` law at redhat dot com
@ 2004-12-28 8:22 ` law at redhat dot com
2004-12-28 9:18 ` law at redhat dot com
` (29 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: law at redhat dot com @ 2004-12-28 8:22 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From law at redhat dot com 2004-12-28 08:22 -------
A few more notes.
I would suggest starting the testcase in comment #8. While the simpler
testcase shows the underlying issue, it's simplified so far that the "problem"
could be fixed in various ways which totally avoid the copy, but which do not
help the larger testcases.
If we go back to the testcase in comment #8 and analyze the code closely we
will see that a copy is inevitable. If we copy propagate into the PHI, then
the copy gets inserted into its own block. If we do not copy propagate
into the PHI, then the copy sits just before the loop exit test. Having the
copy before the loop exit test rather than splitting the loop backedge is
better for a number of reasons.
So the issue at hand is not an extraneous copy, but instead the *placement*
of that copy.
We queue the copy to be inserted on the backedge of the loop, which is clearly
a correct insertion point. However, if we use life analysis we can determine
that we can place the copy before the loop exit test. The out-of-ssa code
computes object lifetimes, so, in theory we have the information we need.
What we'd need to do is enhance the copy placement code to be a little smarter
when we want to insert a copy on a critical edge. Given a copy dest = src
that we want to place on a critical edge E, we'd want to check if dest is
live on any of the other edges in e->src->succs -- if dest is not live on
any of those edges, then we can place the copy at the end of e->src. Or
something along those lines.
I have no idea how hard that would be to implement within the current
out-of-ssa framework we have in place.
It's worth nothing that copy placement on loop backedges for out-of-ssa has
been an area that has been problematical before. I ran into a number of
regressions in the RTL SSA code with poor placements (mostly in ways that
confused the old old rtl loop optimizers). So it's not a _huge_ surprise
that we're seeing some issues in the same areas with the tree out-of-ssa code.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (22 preceding siblings ...)
2004-12-28 8:22 ` law at redhat dot com
@ 2004-12-28 9:18 ` law at redhat dot com
2004-12-28 23:32 ` law at redhat dot com
` (28 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: law at redhat dot com @ 2004-12-28 9:18 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From law at redhat dot com 2004-12-28 09:17 -------
One final set of notes before I sign off for the night.
Daniel -- thanks for the pointers to the new papers. I've studied Sreedhar's
algorith in the past and thought it looked promising -- it's good to see
some hard #s comparing it to Briggs. I'm not convinced Sreedhar's is
simpler to implement, but it does look like it's worth some effort.
I'm not yet sure if Sreedhar's algorithm will result in a better placement
for the copy in this testcase or not. I'll have to re-review the algorithm.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (23 preceding siblings ...)
2004-12-28 9:18 ` law at redhat dot com
@ 2004-12-28 23:32 ` law at redhat dot com
2004-12-29 19:33 ` steven at gcc dot gnu dot org
` (27 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: law at redhat dot com @ 2004-12-28 23:32 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From law at redhat dot com 2004-12-28 23:31 -------
Yea, the code I've got lying around here to undo unprofitable const/copy
propagations definitely helps this code. And better yet, we don't need
the full-blown version (which is queued for 4.1). We can get by with the
very simplistic version. For the testcase in comment #8 we generate the
following .optimized dump:
# BLOCK 0
# PRED: ENTRY [100.0%] (fallthru,exec)
<L4>:;
# SUCC: 1 [100.0%] (fallthru,exec)
# BLOCK 1
# PRED: 0 [100.0%] (fallthru,exec) 1 [89.0%] (dfs_back,false,exec)
<L0>:;
temp1 = temp2 * xlvj_ - temp3 * zlvj_;
temp3 = temp2 * zlvj_ + temp3 * xlvj_;
temp2 = temp1;
if (temp3 != 0) goto <L2>; else goto <L0>;
# SUCC: 2 [11.0%] (loop_exit,true,exec) 1 [89.0%] (dfs_back,false,exec)
# BLOCK 2
# PRED: 1 [11.0%] (loop_exit,true,exec)
<L2>:;
return yv1j_ * yv2j_;
# SUCC: EXIT [100.0%]
}
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (24 preceding siblings ...)
2004-12-28 23:32 ` law at redhat dot com
@ 2004-12-29 19:33 ` steven at gcc dot gnu dot org
2004-12-29 20:50 ` pinskia at gcc dot gnu dot org
` (26 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: steven at gcc dot gnu dot org @ 2004-12-29 19:33 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2004-12-29 19:33 -------
http://gcc.gnu.org/ml/gcc-patches/2004-12/msg02063.html
--
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |patch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (25 preceding siblings ...)
2004-12-29 19:33 ` steven at gcc dot gnu dot org
@ 2004-12-29 20:50 ` pinskia at gcc dot gnu dot org
2004-12-29 20:52 ` pinskia at gcc dot gnu dot org
` (25 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-29 20:50 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-29 20:50 -------
The reference patch does not fix the problem all the time, see PR 14741 for a testcase where the patch
fails, and it fails still for sixtrack but I should note that this would be now caused by IV-OPTs.
--
What |Removed |Added
----------------------------------------------------------------------------
Keywords|patch |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (26 preceding siblings ...)
2004-12-29 20:50 ` pinskia at gcc dot gnu dot org
@ 2004-12-29 20:52 ` pinskia at gcc dot gnu dot org
2004-12-29 21:06 ` law at redhat dot com
` (24 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-29 20:52 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-29 20:52 -------
(In reply to comment #29)
> The reference patch does not fix the problem all the time, see PR 14741 for a testcase where the
patch
> fails, and it fails still for sixtrack but I should note that this would be now caused by IV-OPTs.
For that testcase:
without IV-OPTS:
L33:
lfd f13,0(r9)
add r9,r9,r8
lfd f0,0(r11)
addi r11,r11,8
fmadd f0,f13,f0,f12
fmr f12,f0
bdnz L33
With:
L43:
fmr f12,f0
L33:
lfd f13,0(r9)
add r9,r9,r0
lfd f0,0(r11)
addi r11,r11,8
fmadd f0,f13,f0,f12
bdnz L43
Notice how there are two BB for this loop.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (27 preceding siblings ...)
2004-12-29 20:52 ` pinskia at gcc dot gnu dot org
@ 2004-12-29 21:06 ` law at redhat dot com
2004-12-30 17:27 ` pinskia at gcc dot gnu dot org
` (23 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: law at redhat dot com @ 2004-12-29 21:06 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From law at redhat dot com 2004-12-29 21:06 -------
Subject: Re: [4.0 Regression] out-of ssa
causing loops to have more than one BB
On Wed, 2004-12-29 at 20:50 +0000, pinskia at gcc dot gnu dot org wrote:
> ------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-29 20:50 -------
> The reference patch does not fix the problem all the time, see PR 14741 for a testcase where the patch
> fails, and it fails still for sixtrack but I should note that this would be now caused by IV-OPTs.
I don't expect to fix it all the time. There's a fundamental problem in that
copy propagation can lengthen lifetimes and thus require a copy at
out-of-ssa time. The patch is meant to fix those cases which can be
trivially detected and for which we can generate better code.
Changing the out-of-ssa code to use Sreedhar's algorithm (which appears
to do a lot better at avoiding copies) isn't something I'd advocate
for 4.1. But even with a better out-of-ssa algorithm there are still
going to be cases where out-of-ssa is going to have to generate a
copy. No way around it.
Now if you think that PR is a trivial case that should be caught, then,
show me why and I'll take a closer look.
jeff
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (28 preceding siblings ...)
2004-12-29 21:06 ` law at redhat dot com
@ 2004-12-30 17:27 ` pinskia at gcc dot gnu dot org
2004-12-30 18:39 ` law at redhat dot com
` (22 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-30 17:27 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-30 17:27 -------
(In reply to comment #31)
> Subject: Re: [4.0 Regression] out-of ssa
> causing loops to have more than one BB
> Now if you think that PR is a trivial case that should be caught, then,
> show me why and I'll take a closer look.
The reason why it is not caught is because we don't cleanup the cfg while doing the loop optimizations,
this has been fixed already on the tcb.
Oh, by the way I see that sixtrack has regressed on x86 now with your patch applied, I think this is
because we still have the same problem as before as ivopts puts the new instruction in an empty BB
which becomes from not cleaning up the cfg.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (29 preceding siblings ...)
2004-12-30 17:27 ` pinskia at gcc dot gnu dot org
@ 2004-12-30 18:39 ` law at redhat dot com
2004-12-30 18:49 ` pinskia at gcc dot gnu dot org
` (21 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: law at redhat dot com @ 2004-12-30 18:39 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From law at redhat dot com 2004-12-30 18:39 -------
Subject: Re: [4.0 Regression] out-of ssa
causing loops to have more than one BB
On Thu, 2004-12-30 at 17:27 +0000, pinskia at gcc dot gnu dot org wrote:
> ------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-30 17:27 -------
> (In reply to comment #31)
> > Subject: Re: [4.0 Regression] out-of ssa
> > causing loops to have more than one BB
>
> > Now if you think that PR is a trivial case that should be caught, then,
> > show me why and I'll take a closer look.
>
> The reason why it is not caught is because we don't cleanup the cfg while doing the
> loop optimizations, this has been fixed already on the tcb.
Can you be more precise how cleaning up the CFG during the loop
optimizer affects the code that we see during out-of-ssa. Specifically
how does it affect PHI arguments on backedges and the proper marking
of backedges in the CFG?
>
> Oh, by the way I see that sixtrack has regressed on x86 now with your patch applied, I think this is
> because we still have the same problem as before as ivopts puts the new instruction in an empty BB
> which becomes from not cleaning up the cfg.
Again, more information please on how this affects us during out-of-ssa?
I'm happy to look into these problems, but you've apparently got a lot
more state on them than I do. I'd like to learn what you already know
to speed up that process.
jeff
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (30 preceding siblings ...)
2004-12-30 18:39 ` law at redhat dot com
@ 2004-12-30 18:49 ` pinskia at gcc dot gnu dot org
2004-12-30 21:51 ` pinskia at gcc dot gnu dot org
` (20 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-30 18:49 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-30 18:49 -------
(In reply to comment #33)
> Subject: Re: [4.0 Regression] out-of ssa
> causing loops to have more than one BB
> > > > Now if you think that PR is a trivial case that should be caught, then,
> > > show me why and I'll take a closer look.
> >
> > The reason why it is not caught is because we don't cleanup the cfg while doing the
> > loop optimizations, this has been fixed already on the tcb.
> Can you be more precise how cleaning up the CFG during the loop
> optimizer affects the code that we see during out-of-ssa. Specifically
> how does it affect PHI arguments on backedges and the proper marking
> of backedges in the CFG?
It has nothing to do with out-of-ssa any more, sorry for not being clear but here are the chain
of events for the current problem.
We split the critial edges so when we try to create the IV (in iv-opts), we insert an instruction on the bb
which is the empty and otherwise useless but if we had cleaned up the CFG before running IV-OPTS, we
would insert it right before the condition just like your patch does for out of ssa.
And now we have this extra bb that is not useless, out of ssa is going to use instead of using your new
scheme.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (31 preceding siblings ...)
2004-12-30 18:49 ` pinskia at gcc dot gnu dot org
@ 2004-12-30 21:51 ` pinskia at gcc dot gnu dot org
2005-01-04 4:04 ` law at redhat dot com
` (19 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-30 21:51 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-30 21:50 -------
(In reply to comment #34)
Actually it is because we are placing statements in the loop's latch.
It is done in create_new_iv.
--
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed|2004-12-20 18:25:23 |2004-12-30 21:50:59
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (32 preceding siblings ...)
2004-12-30 21:51 ` pinskia at gcc dot gnu dot org
@ 2005-01-04 4:04 ` law at redhat dot com
2005-01-04 4:10 ` pinskia at gcc dot gnu dot org
` (18 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: law at redhat dot com @ 2005-01-04 4:04 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From law at redhat dot com 2005-01-04 04:04 -------
Subject: Re: [4.0 Regression] out-of ssa
causing loops to have more than one BB
On Thu, 2004-12-30 at 21:51 +0000, pinskia at gcc dot gnu dot org wrote:
> ------- Additional Comments From pinskia at gcc dot gnu dot org 2004-12-30 21:50 -------
> (In reply to comment #34)
> Actually it is because we are placing statements in the loop's latch.
> It is done in create_new_iv.
So just to be clear, it's not my change that is causing a regression.
The IV code creates situations which prevent my change from having
any kind of impact because the loop backedge is already split by
the IV code (and thus the loop backedge is no longer critical and
my out-of-ssa code does nothing).
Right?
Jeff
ps. It seems to me that the IV code could use a trick similar to what I
did to the out-of-ssa code. The only significant complication would be
that the IV code would have to verify that the code it wants to insert
is safe on both path (loop backedge and loop exit).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (33 preceding siblings ...)
2005-01-04 4:04 ` law at redhat dot com
@ 2005-01-04 4:10 ` pinskia at gcc dot gnu dot org
2005-01-10 20:02 ` pinskia at gcc dot gnu dot org
` (17 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-04 4:10 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-01-04 04:10 -------
(In reply to comment #36)
> Subject: Re: [4.0 Regression] out-of ssa
> causing loops to have more than one BB> So just to be clear, it's not my change that is causing
a regression.
> The IV code creates situations which prevent my change from having
> any kind of impact because the loop backedge is already split by
> the IV code (and thus the loop backedge is no longer critical and
> my out-of-ssa code does nothing).
>
> Right?
Yes.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (34 preceding siblings ...)
2005-01-04 4:10 ` pinskia at gcc dot gnu dot org
@ 2005-01-10 20:02 ` pinskia at gcc dot gnu dot org
2005-01-12 13:34 ` amacleod at redhat dot com
` (16 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-10 20:02 UTC (permalink / raw)
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
Priority|P1 |P2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (35 preceding siblings ...)
2005-01-10 20:02 ` pinskia at gcc dot gnu dot org
@ 2005-01-12 13:34 ` amacleod at redhat dot com
2005-01-18 0:50 ` [Bug tree-optimization/19038] [4.0 Regression] ivopts " pinskia at gcc dot gnu dot org
` (15 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: amacleod at redhat dot com @ 2005-01-12 13:34 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From amacleod at redhat dot com 2005-01-12 13:34 -------
So let me see, If I read this properly, this is no longer an out of ssa problem?
notes:
- changing where we place copies based on jeffs observations in comment 25 ought
not be difficult.
- If we wish to change the coalesce algorithm, we should be able to simply plug
a new one in. The existing one was just a quick straightforward implementation
to get it working with the full intention of improving it later. Its also
reasonably flexible and cost driven. tree-outof-ssa.c::coalesce_ssa_name() does
all the setup work. A list of possible coalesces is created, given costs and the
sorted. Coalesces are then performed by calling
tree-ssa-live::coalesce_tpa_members() (tpa's are generic lists which associate
trees, ie VAR trees to a partition index). The routine
tree-ssa-live.c::pop_best_coalesce() deterimes the order to attempt coalesces.
currently this simply retreives them in sorted ordered by "cost". cost is
calculated during the building of the coalesce list via calls to
tree-ssa-live.c::add_coalesce(), and is a cost value passed to the routine.
The current implementation simply calls add_coalesce with a value of 1 all the
time, so it ends up being simply a copy count.. ie, we try to coalesce variables
which are copied between each other 4 times before we try those copied once. The
original intention was to multiple each copy by 10^nest_level_of_copy, but that
info wasnt easily available when I implemented it. I beleive it is now, and just
haven't gotten around to doing it. It would be trivial to add that to the calls
to add_coalesce.
We could increase the cost again if its going to insert a copy on a latch edge
quite easily. (maybe 20^nest_level instead or some such thing) Making those even
more important.
add_coalesce should probably be changed to avoid wraparound as well I suppose.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] ivopts causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (36 preceding siblings ...)
2005-01-12 13:34 ` amacleod at redhat dot com
@ 2005-01-18 0:50 ` pinskia at gcc dot gnu dot org
2005-01-18 5:02 ` law at redhat dot com
` (14 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-18 0:50 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-01-18 00:50 -------
Changing the summary to reflect the current problem.
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |rakdver at gcc dot gnu dot
| |org
Summary|[4.0 Regression] out-of ssa |[4.0 Regression] ivopts
|causing loops to have more |causing loops to have more
|than one BB |than one BB
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] ivopts causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (37 preceding siblings ...)
2005-01-18 0:50 ` [Bug tree-optimization/19038] [4.0 Regression] ivopts " pinskia at gcc dot gnu dot org
@ 2005-01-18 5:02 ` law at redhat dot com
2005-01-19 22:50 ` cvs-commit at gcc dot gnu dot org
` (13 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: law at redhat dot com @ 2005-01-18 5:02 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From law at redhat dot com 2005-01-18 05:02 -------
Subject: Re: [4.0 Regression] out-of ssa
causing loops to have more than one BB
On Wed, 2005-01-12 at 13:34 +0000, amacleod at redhat dot com wrote:
> ------- Additional Comments From amacleod at redhat dot com 2005-01-12 13:34 -------
> So let me see, If I read this properly, this is no longer an out of ssa problem?
>
> notes:
> - changing where we place copies based on jeffs observations in comment 25 ought
> not be difficult.
Good to know. Though I doubt it's really necessary.
> - If we wish to change the coalesce algorithm, we should be able to simply plug
> a new one in.
Maybe, maybe not.
The quick and dirty synopsis of Sreedhar's approach is to rewrite PHIs
so that none of the args conflict with each other or the PHI result.
Once that's done you coalesce the PHI args & result together, which you
know is always safe. Coupled with that idea is some improvements in
when/where copies are inserted (which have certain parallels to
the ideas in comment #25.
Contrast that to what we do, which is more like coalescing in a register
allocator. If after coalescing we find that the PHI result and one or
more of its arguments are in different partitions, then we insert a
copy on the edge associated with the PHI argument.
So the algorithms differ first in when they insert copies, where the
copies are inserted and to some extent how they determine when copies
are needed.
[ There's another approach from the Rice team which is similar to
Sreedhar, but from what I could tell does not perform particularly
well. ]
If I were looking for incremental improvements to our exiting
out-of-ssa, I would look at bumping the cost of copies associated
with backedges. Sreedhar's approach isn't really an incremental
improvement -- it's more like a pre-pass for out-of-ssa which
obliviates the need to do copy coalescing during out-of-ssa.
jeff
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] ivopts causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (38 preceding siblings ...)
2005-01-18 5:02 ` law at redhat dot com
@ 2005-01-19 22:50 ` cvs-commit at gcc dot gnu dot org
2005-01-19 23:25 ` [Bug tree-optimization/19038] [4.0 Regression] out of ssa " pinskia at gcc dot gnu dot org
` (12 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: cvs-commit at gcc dot gnu dot org @ 2005-01-19 22:50 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From cvs-commit at gcc dot gnu dot org 2005-01-19 22:50 -------
Subject: Bug 19038
CVSROOT: /cvs/gcc
Module name: gcc
Changes by: rakdver@gcc.gnu.org 2005-01-19 22:50:06
Modified files:
gcc : ChangeLog tree-ssa-loop-ivopts.c
Log message:
PR tree-optimization/19038
* tree-ssa-loop-ivopts.c (allow_ip_end_pos_p): New function.
(add_candidate): Add ivs with increment in latch only if
allow_ip_end_pos_p is true.
(determine_iv_cost): Use empty_block_p.
Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.7191&r2=2.7192
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-ssa-loop-ivopts.c.diff?cvsroot=gcc&r1=2.41&r2=2.42
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (39 preceding siblings ...)
2005-01-19 22:50 ` cvs-commit at gcc dot gnu dot org
@ 2005-01-19 23:25 ` pinskia at gcc dot gnu dot org
2005-01-19 23:34 ` pinskia at gcc dot gnu dot org
` (11 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-19 23:25 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-01-19 23:25 -------
Now it is back to out of ssa causing one extra BB.
--
What |Removed |Added
----------------------------------------------------------------------------
Summary|[4.0 Regression] ivopts |[4.0 Regression] out of ssa
|causing loops to have more |causing loops to have more
|than one BB |than one BB
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (40 preceding siblings ...)
2005-01-19 23:25 ` [Bug tree-optimization/19038] [4.0 Regression] out of ssa " pinskia at gcc dot gnu dot org
@ 2005-01-19 23:34 ` pinskia at gcc dot gnu dot org
2005-01-20 0:31 ` pinskia at gcc dot gnu dot org
` (10 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-19 23:34 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-01-19 23:33 -------
(In reply to comment #42)
> Now it is back to out of ssa causing one extra BB.
I will try get a smaller testcase for this problem now. Thanks Zdenek for fixing the IV-OPTs related
problem.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (41 preceding siblings ...)
2005-01-19 23:34 ` pinskia at gcc dot gnu dot org
@ 2005-01-20 0:31 ` pinskia at gcc dot gnu dot org
2005-01-20 0:47 ` pinskia at gcc dot gnu dot org
` (9 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-20 0:31 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-01-20 00:31 -------
(In reply to comment #43)
> (In reply to comment #42)
> > Now it is back to out of ssa causing one extra BB.
> I will try get a smaller testcase for this problem now. Thanks Zdenek for fixing the IV-OPTs related
> problem.
Here is the reduced fortran testcase (I tried to convert it to C but I must not be matching all the syntax
correctly) and change the names so I would not get into trouble with SPEC:
subroutine function11(max)
parameter(maxindex=64)
common array1(maxindex),array2(maxindex)
save
do 20 j=1,max
array1(j)=array2(j)
20 continue
return
end
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (42 preceding siblings ...)
2005-01-20 0:31 ` pinskia at gcc dot gnu dot org
@ 2005-01-20 0:47 ` pinskia at gcc dot gnu dot org
2005-01-20 1:15 ` pinskia at gcc dot gnu dot org
` (8 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-20 0:47 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-01-20 00:46 -------
Here is another reduced testcase:
subroutine thin6d(max)
parameter(maxindex=64)
common array1(maxindex)
save
do 20 j=1,max
array1(j)=1.0
20 continue
return
end
but note disabling ivopts "fixes" the problem but it is not caused by ivopts per say this time:
Before out of SSA:
# ivtmp.6_21 = PHI <ivtmp.6_24(1), ivtmp.6_22(3)>;
# ivtmp.3_12 = PHI <0(1), ivtmp.3_20(3)>;
<L7>:;
[thin6d.f : 6] *ivtmp.6_21 = 1.0e+0;
__label_000020:;
ivtmp.3_20 = ivtmp.3_12 + 1;
ivtmp.6_22 = ivtmp.6_21 + 4B;
[thin6d.f : 1] D.493_30 = D.451_4 - 1;
[thin6d.f : 1] D.494_31 = (<unnamed type>) D.493_30;
[thin6d.f : 1] D.495_32 = D.494_31 + 1;
[thin6d.f : 5] if (ivtmp.3_20 == D.495_32) [thin6d.f : 5] goto <L11>; else goto <L7>;
Note this is looks very normal and like it should work correctly (except for the ((cast)(a-1))+1 but I
know Zdenek has a patch for that).
But for some reason (have not looked actually) we add an extra BB in out of SSA:
<L7>:;
[thin6d.f : 6] *ivtmp.6 = 1.0e+0;
__label_000020:;
ivtmp.10 = ivtmp.3 + 1;
ivtmp.6 = ivtmp.6 + 4B;
[thin6d.f : 5] if (ivtmp.10 == (<unnamed type>) (D.451 - 1) + 1) [thin6d.f : 5] goto <L11>; else goto
<L13>;
<L13>:;
ivtmp.3 = ivtmp.10;
goto <bb 2> (<L7>);
Looks like we are not coalescing the following PHI:
# ivtmp.3_12 = PHI <0(1), ivtmp.3_20(3)>;
But why not.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (43 preceding siblings ...)
2005-01-20 0:47 ` pinskia at gcc dot gnu dot org
@ 2005-01-20 1:15 ` pinskia at gcc dot gnu dot org
2005-01-21 13:18 ` steven at gcc dot gnu dot org
` (7 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-20 1:15 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-01-20 01:15 -------
(In reply to comment #45)
>From -details .optimized file:
> Coalesce list: (0)ivtmp.3_12 & (1)ivtmp.3_20 [map: 0, 1] : Fail due to conflict
Huh, how do they conflict, I don't see how.
Hmm, I assume that tree-ssa-live is wrong here.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (44 preceding siblings ...)
2005-01-20 1:15 ` pinskia at gcc dot gnu dot org
@ 2005-01-21 13:18 ` steven at gcc dot gnu dot org
2005-01-21 13:22 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
` (6 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-01-21 13:18 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-01-21 13:18 -------
No, tree-ssa-live is quite right. ivtmp.3_12 and ivtmp.3_20 are defined
and live at some common statements, so they conflict:
# BLOCK 1
# PRED: 0 [88.4%] (true,exec) 1 [89.0%] (dfs_back,false,exec)
# ivtmp.3_12 = PHI <0(0), ivtmp.3_20(1)>;
<L8>:;
D.501_28 = &__BLNK__.array1[ivtmp.3_12];
# __BLNK___11 = V_MAY_DEF <__BLNK___2>;
*D.501_28 = 1.0e+0;
ivtmp.3_20 = ivtmp.3_12 + 1;
D.503_30 = D.457_4 - 1;
D.504_31 = (<unnamed type>) D.503_30;
D.505_32 = D.504_31 + 1;
if (ivtmp.3_20 == D.505_32) goto <L12>; else goto <L8>;
# SUCC: 2 [11.0%] (loop_exit,true,exec) 1 [89.0%] (dfs_back,false,exec)
# BLOCK 2
# PRED: 1 [11.0%] (loop_exit,true,exec)
<L12>:;
D.497_22 = ivtmp.3_12 + 2;
lsm_tmp.2_23 = (int4) D.497_22;
# j_7 = V_MUST_DEF <j_6>;
j = lsm_tmp.2_23;
# SUCC: 3 [100.0%] (fallthru)
Between "ivtmp.3_20 = ivtmp.3_12 + 1;" and "D.497_22 = ivtmp.3_12 + 2;"
both versions of ivtmp.3 are live. I have not looked at what causes this,
but IMOH all passes should avoid letting induction variables escape a loop.
Interesting note: I think "D.497_22 = ivtmp.3_12 + 2;" could be replaced
with "D.497_22 = ivtmp.3_20 + 1;" which would fix this problem.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (45 preceding siblings ...)
2005-01-21 13:18 ` steven at gcc dot gnu dot org
@ 2005-01-21 13:22 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-21 13:27 ` steven at gcc dot gnu dot org
` (5 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-21 13:22 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-21 13:22 -------
Subject: Re: [4.0 Regression] out of ssa causing loops to have more than one BB
> No, tree-ssa-live is quite right. ivtmp.3_12 and ivtmp.3_20 are defined
> and live at some common statements, so they conflict:
>
> # BLOCK 1
> # PRED: 0 [88.4%] (true,exec) 1 [89.0%] (dfs_back,false,exec)
> # ivtmp.3_12 = PHI <0(0), ivtmp.3_20(1)>;
> <L8>:;
> D.501_28 = &__BLNK__.array1[ivtmp.3_12];
> # __BLNK___11 = V_MAY_DEF <__BLNK___2>;
> *D.501_28 = 1.0e+0;
> ivtmp.3_20 = ivtmp.3_12 + 1;
> D.503_30 = D.457_4 - 1;
> D.504_31 = (<unnamed type>) D.503_30;
> D.505_32 = D.504_31 + 1;
> if (ivtmp.3_20 == D.505_32) goto <L12>; else goto <L8>;
> # SUCC: 2 [11.0%] (loop_exit,true,exec) 1 [89.0%] (dfs_back,false,exec)
>
> # BLOCK 2
> # PRED: 1 [11.0%] (loop_exit,true,exec)
> <L12>:;
> D.497_22 = ivtmp.3_12 + 2;
> lsm_tmp.2_23 = (int4) D.497_22;
> # j_7 = V_MUST_DEF <j_6>;
> j = lsm_tmp.2_23;
> # SUCC: 3 [100.0%] (fallthru)
>
> Between "ivtmp.3_20 = ivtmp.3_12 + 1;" and "D.497_22 = ivtmp.3_12 + 2;"
> both versions of ivtmp.3 are live. I have not looked at what causes this,
> but IMOH all passes should avoid letting induction variables escape a loop.
There's not much to avoid -- the iv is simply used outside of the loop,
and we cannot do anything about it.
> Interesting note: I think "D.497_22 = ivtmp.3_12 + 2;" could be replaced
> with "D.497_22 = ivtmp.3_20 + 1;" which would fix this problem.
I think this is how it is immediately after ivopts (if not, it would be
a bug); but dom performs this replacement.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] out of ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (46 preceding siblings ...)
2005-01-21 13:22 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-21 13:27 ` steven at gcc dot gnu dot org
2005-01-21 13:29 ` [Bug tree-optimization/19038] [4.0 Regression] tree-ssa " steven at gcc dot gnu dot org
` (4 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-01-21 13:27 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-01-21 13:27 -------
Correctemundo Zdenek!
Before dom3:
# BLOCK 2
# PRED: 1 [100.0%] (fallthru) 2 [89.0%] (dfs_back,false,exec)
# ivtmp.3_12 = PHI <0(1), ivtmp.3_20(2)>;
# __BLNK___2 = PHI <__BLNK___8(1), __BLNK___11(2)>;
<L8>:;
D.498_25 = (<unnamed type>) lsm_tmp.2_14;
D.499_26 = ivtmp.3_12 + D.498_25;
D.500_27 = D.499_26 + ffffffffffffffff;
D.501_28 = &__BLNK__.array1[D.500_27];
ruatmp.9_29 = D.501_28;
# __BLNK___11 = V_MAY_DEF <__BLNK___2>;
*ruatmp.9_29 = 1.0e+0;
ivtmp.3_20 = ivtmp.3_12 + 1;
D.503_30 = D.457_4 - lsm_tmp.2_14;
D.504_31 = (<unnamed type>) D.503_30;
D.505_32 = D.504_31 + 1;
if (ivtmp.3_20 == D.505_32) goto <L12>; else goto <L8>;
# SUCC: 3 [11.0%] (loop_exit,true,exec) 2 [89.0%] (dfs_back,false,exec)
# BLOCK 3
# PRED: 2 [11.0%] (loop_exit,true,exec)
# ivtmp.3_24 = PHI <ivtmp.3_20(2)>;
<L12>:;
D.496_21 = (<unnamed type>) lsm_tmp.2_14;
D.497_22 = ivtmp.3_24 + D.496_21;
lsm_tmp.2_23 = (int4) D.497_22;
lsm_tmp.2_19 = lsm_tmp.2_23;
# j_7 = V_MUST_DEF <j_6>;
j = lsm_tmp.2_19;
# SUCC: 4 [100.0%] (fallthru)
After dom3:
# BLOCK 2
# PRED: 1 [100.0%] (fallthru) 2 [89.0%] (dfs_back,false,exec)
# ivtmp.3_12 = PHI <0(1), ivtmp.3_20(2)>;
# __BLNK___2 = PHI <__BLNK___8(1), __BLNK___11(2)>;
<L8>:;
D.498_25 = 1;
D.499_26 = ivtmp.3_12 + 1;
D.500_27 = ivtmp.3_12;
D.501_28 = &__BLNK__.array1[ivtmp.3_12];
ruatmp.9_29 = D.501_28;
# __BLNK___11 = V_MAY_DEF <__BLNK___2>;
*D.501_28 = 1.0e+0;
ivtmp.3_20 = ivtmp.3_12 + 1;
D.503_30 = D.457_4 - 1;
D.504_31 = (<unnamed type>) D.503_30;
D.505_32 = D.504_31 + 1;
if (ivtmp.3_20 == D.505_32) goto <L12>; else goto <L8>;
# SUCC: 3 [11.0%] (loop_exit,true,exec) 2 [89.0%] (dfs_back,false,exec)
# BLOCK 3
# PRED: 2 [11.0%] (loop_exit,true,exec)
# ivtmp.3_24 = PHI <ivtmp.3_20(2)>;
<L12>:;
D.496_21 = 1;
D.497_22 = ivtmp.3_12 + 2;
lsm_tmp.2_23 = (int4) D.497_22;
lsm_tmp.2_19 = lsm_tmp.2_23;
# j_7 = V_MUST_DEF <j_6>;
j = lsm_tmp.2_23;
# SUCC: 4 [100.0%] (fallthru)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] tree-ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (47 preceding siblings ...)
2005-01-21 13:27 ` steven at gcc dot gnu dot org
@ 2005-01-21 13:29 ` steven at gcc dot gnu dot org
2005-01-21 19:19 ` dberlin at gcc dot gnu dot org
` (3 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-01-21 13:29 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-01-21 13:28 -------
Not an out-of-ssa problem. This is a DOM problem.
--
What |Removed |Added
----------------------------------------------------------------------------
Summary|[4.0 Regression] out of ssa |[4.0 Regression] tree-ssa
|causing loops to have more |causing loops to have more
|than one BB |than one BB
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] tree-ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (48 preceding siblings ...)
2005-01-21 13:29 ` [Bug tree-optimization/19038] [4.0 Regression] tree-ssa " steven at gcc dot gnu dot org
@ 2005-01-21 19:19 ` dberlin at gcc dot gnu dot org
2005-01-22 1:51 ` dberlin at gcc dot gnu dot org
` (2 subsequent siblings)
52 siblings, 0 replies; 54+ messages in thread
From: dberlin at gcc dot gnu dot org @ 2005-01-21 19:19 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at gcc dot gnu dot org 2005-01-21 19:19 -------
I just tired this on struct-aliasing, to come up iwth a dom fix, and it
load-store moved everything, so there was nothing left for ivopts to create
ivtmps of :)
I think that's good.
I'll have to go try my patch on the mainline.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] tree-ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (49 preceding siblings ...)
2005-01-21 19:19 ` dberlin at gcc dot gnu dot org
@ 2005-01-22 1:51 ` dberlin at gcc dot gnu dot org
2005-01-22 16:48 ` cvs-commit at gcc dot gnu dot org
2005-01-22 16:49 ` dberlin at gcc dot gnu dot org
52 siblings, 0 replies; 54+ messages in thread
From: dberlin at gcc dot gnu dot org @ 2005-01-22 1:51 UTC (permalink / raw)
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
AssignedTo|unassigned at gcc dot gnu |dberlin at gcc dot gnu dot
|dot org |org
Status|NEW |ASSIGNED
Last reconfirmed|2004-12-30 21:50:59 |2005-01-22 01:50:52
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] tree-ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (50 preceding siblings ...)
2005-01-22 1:51 ` dberlin at gcc dot gnu dot org
@ 2005-01-22 16:48 ` cvs-commit at gcc dot gnu dot org
2005-01-22 16:49 ` dberlin at gcc dot gnu dot org
52 siblings, 0 replies; 54+ messages in thread
From: cvs-commit at gcc dot gnu dot org @ 2005-01-22 16:48 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From cvs-commit at gcc dot gnu dot org 2005-01-22 16:48 -------
Subject: Bug 19038
CVSROOT: /cvs/gcc
Module name: gcc
Changes by: dberlin@gcc.gnu.org 2005-01-22 16:48:24
Modified files:
gcc : ChangeLog tree-ssa-dom.c
Log message:
2005-01-20 Daniel Berlin <dberlin@dberlin.org>
Fix PR tree-optimization/19038
* tree-ssa-dom.c (cprop_operand): Don't replace loop invaeriant
copies with loop variant ones.
Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.7228&r2=2.7229
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-ssa-dom.c.diff?cvsroot=gcc&r1=2.86&r2=2.87
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread
* [Bug tree-optimization/19038] [4.0 Regression] tree-ssa causing loops to have more than one BB
2004-12-16 15:52 [Bug rtl-optimization/19038] New: Loop header copying dje at gcc dot gnu dot org
` (51 preceding siblings ...)
2005-01-22 16:48 ` cvs-commit at gcc dot gnu dot org
@ 2005-01-22 16:49 ` dberlin at gcc dot gnu dot org
52 siblings, 0 replies; 54+ messages in thread
From: dberlin at gcc dot gnu dot org @ 2005-01-22 16:49 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at gcc dot gnu dot org 2005-01-22 16:49 -------
Fixed
--
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038
^ permalink raw reply [flat|nested] 54+ messages in thread