public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop
@ 2005-09-13 9:46 rguenth at gcc dot gnu dot org
2005-09-13 13:12 ` [Bug tree-optimization/23855] " rguenth at gcc dot gnu dot org
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2005-09-13 9:46 UTC (permalink / raw)
To: gcc-bugs
void bar(void);
void foo(int ie, int je)
{
int i, j;
for (j=0; j<je; ++j)
for (i=0; i<ie; ++i)
bar();
}
should _not_ be transformed to
foo (ie, je)
{
int j;
int i;
<bb 0>:
if (je > 0) goto <L23>; else goto <L5>;
<L23>:;
j = 0;
goto <bb 3> (<L2>);
<L22>:;
i = 0;
<L1>:;
bar ();
i = i + 1;
if (ie != i) goto <L1>; else goto <L3>;
<L3>:;
j = j + 1;
if (je != j) goto <L2>; else goto <L5>;
<L2>:;
if (ie > 0) goto <L22>; else goto <L3>;
<L5>:;
return;
}
i.e. containing an loop-invariant check if (ie > 0).
Both DOM and copy-header do this transformation. Disabling both
we get
;; Function foo (foo)
foo (ie, je)
{
int j;
int i;
<bb 0>:
j = 0;
goto <bb 4> (<L4>);
<L1>:;
bar ();
i = i + 1;
<L2>:;
if (i < ie) goto <L1>; else goto <L3>;
<L3>:;
j = j + 1;
<L4>:;
if (j < je) goto <L8>; else goto <L5>;
<L8>:;
i = 0;
goto <bb 2> (<L2>);
<L5>:;
return;
}
which is a _lot_ faster for small ie. Optimally we would hoist the
loop invariant check out of the j loop.
--
Summary: Should not do loop header copying for inner loop
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P2
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: rguenth at gcc dot gnu dot org
CC: gcc-bugs at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/23855] Should not do loop header copying for inner loop
2005-09-13 9:46 [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop rguenth at gcc dot gnu dot org
@ 2005-09-13 13:12 ` rguenth at gcc dot gnu dot org
2005-09-13 13:30 ` rakdver at gcc dot gnu dot org
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2005-09-13 13:12 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at gcc dot gnu dot org 2005-09-13 13:12 -------
Unswitching should clean this up, but unfortunately(?) only looks at innermost
loops. For a reason it seems, just removing this checks results in an ICE.
Testcase for unswitching:
void bar(void);
void foo(int ie, int je)
{
int j;
int i;
if (je <= 0)
goto L5;
j = 0;
L2:
if (ie <= 0)
goto L3;
i = 0;
L1:
bar ();
i = i + 1;
if (ie != i) goto L1;
L3:
j = j + 1;
if (je != j) goto L2;
L5:
return;
}
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |rakdver at atrey dot karlin
| |dot mff dot cuni dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/23855] Should not do loop header copying for inner loop
2005-09-13 9:46 [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop rguenth at gcc dot gnu dot org
2005-09-13 13:12 ` [Bug tree-optimization/23855] " rguenth at gcc dot gnu dot org
@ 2005-09-13 13:30 ` rakdver at gcc dot gnu dot org
2005-09-13 13:47 ` rguenth at gcc dot gnu dot org
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: rakdver at gcc dot gnu dot org @ 2005-09-13 13:30 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rakdver at gcc dot gnu dot org 2005-09-13 13:29 -------
It is not clear to me why you find the code without header copying better --
number of checks of each condition is exactly the same in both cases, and with
right ordering of basic blocks, there should be one less jump executed.
Depending on branch prediction on the target, either the original code or the
code with copied headers may be faster, but it is hard to determine which one.
Unswitching of non-innermost loops should not be hard to implement (most of
code is in place, it only was quite some time that it was tested for anything
but innermost loops, so there probably is some bitrot; and a few pieces on tree
level may be missing).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/23855] Should not do loop header copying for inner loop
2005-09-13 9:46 [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop rguenth at gcc dot gnu dot org
2005-09-13 13:12 ` [Bug tree-optimization/23855] " rguenth at gcc dot gnu dot org
2005-09-13 13:30 ` rakdver at gcc dot gnu dot org
@ 2005-09-13 13:47 ` rguenth at gcc dot gnu dot org
2005-09-13 14:01 ` pinskia at gcc dot gnu dot org
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2005-09-13 13:47 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at gcc dot gnu dot org 2005-09-13 13:47 -------
Well, it is the case that I have some numerical application that has
such loops and the case of small ie (1 or 2) does happen during boundary
updates, so instead of
if (ie <= 0)
return;
for (j=0; j<je; ++j)
i=0;
do {
foo();
i=i+1;
} while (i<ie);
and executing the if (ie <= 0) exactly one time, we execute it
je times, which is as often as half times the number of total iterations
of the inner loop body. You are probably right that the execution number
without loop header copying is exactly the same, but unswitching this
condition certainly helps for small ie (where loop versioning with constant
ie and completely unrolled inner loop would help even more, of course).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/23855] Should not do loop header copying for inner loop
2005-09-13 9:46 [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop rguenth at gcc dot gnu dot org
` (2 preceding siblings ...)
2005-09-13 13:47 ` rguenth at gcc dot gnu dot org
@ 2005-09-13 14:01 ` pinskia at gcc dot gnu dot org
2005-09-13 14:09 ` rguenth at gcc dot gnu dot org
2005-09-13 20:08 ` pinskia at gcc dot gnu dot org
5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-09-13 14:01 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-09-13 14:01 -------
On PPC, we get optimal and almost unswitched loop:
L4:
ble- cr4,L7
li r31,0
L6:
addi r31,r31,1
bl _bar
cmpw cr7,r30,r31
bne+ cr7,L6
L7:
addi r29,r29,1
cmpw cr7,r28,r29
bne+ cr7,L4
cr4 is a callee save register.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/23855] Should not do loop header copying for inner loop
2005-09-13 9:46 [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop rguenth at gcc dot gnu dot org
` (3 preceding siblings ...)
2005-09-13 14:01 ` pinskia at gcc dot gnu dot org
@ 2005-09-13 14:09 ` rguenth at gcc dot gnu dot org
2005-09-13 20:08 ` pinskia at gcc dot gnu dot org
5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2005-09-13 14:09 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rguenth at gcc dot gnu dot org 2005-09-13 14:09 -------
Heh - you unswitched the comparison but not the jump!
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855
^ permalink raw reply [flat|nested] 7+ messages in thread
* [Bug tree-optimization/23855] Should not do loop header copying for inner loop
2005-09-13 9:46 [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop rguenth at gcc dot gnu dot org
` (4 preceding siblings ...)
2005-09-13 14:09 ` rguenth at gcc dot gnu dot org
@ 2005-09-13 20:08 ` pinskia at gcc dot gnu dot org
5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-09-13 20:08 UTC (permalink / raw)
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2005-09-13 20:08 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-09-13 9:46 [Bug tree-optimization/23855] New: Should not do loop header copying for inner loop rguenth at gcc dot gnu dot org
2005-09-13 13:12 ` [Bug tree-optimization/23855] " rguenth at gcc dot gnu dot org
2005-09-13 13:30 ` rakdver at gcc dot gnu dot org
2005-09-13 13:47 ` rguenth at gcc dot gnu dot org
2005-09-13 14:01 ` pinskia at gcc dot gnu dot org
2005-09-13 14:09 ` rguenth at gcc dot gnu dot org
2005-09-13 20:08 ` pinskia at gcc dot gnu dot org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).