public inbox for gcc-bugs@sourceware.org
* [Bug translation/15632] New: Failure to unroll loop when using FDO causes performance regression
@ 2004-05-25 11:37 steinmtz at us dot ibm dot com
2004-05-25 11:40 ` [Bug rtl-optimization/15632] " pinskia at gcc dot gnu dot org
` (9 more replies)
0 siblings, 10 replies; 11+ messages in thread
From: steinmtz at us dot ibm dot com @ 2004-05-25 11:37 UTC (permalink / raw)
To: gcc-bugs
Using: gcc version 3.5.0 20040430 (experimental) for PowerPC
Here is an example of a performance regression when using FDO
(feedback-directed optimization). The "ParseVid" component of Skidmarks
runs about 5% slower with FDO. I traced it to a loop that is unrolled in
the non-FDO compilation but not in the FDO version. When I hand-unrolled
that loop, I got back the full 5% plus an additional 2%, so the missing
unrolling not only introduces a penalty but also negates other positive
transformations.
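As an illustration of what hand-unrolling means here, the following C sketch contrasts a rolled loop with a manually unrolled one. It is not the Skidmarks source (that lives only in the attached parse.i): the `pair_t` type, the function names, the trip count of 16, and the unroll factor of 4 are all assumptions chosen for the example.

```c
#include <stdint.h>

#define N_PAIRS 16  /* assumed trip count, for illustration only */

/* Hypothetical 8-byte element made of two 32-bit words. */
typedef struct { uint32_t lo, hi; } pair_t;

/* Rolled form: one element stored per iteration. */
void fill_rolled(pair_t *dst, pair_t v)
{
    for (int i = 0; i < N_PAIRS; i++)
        dst[i] = v;
}

/* Hand-unrolled by four; the factor is an arbitrary choice for this sketch
   and relies on N_PAIRS being a multiple of four. */
void fill_unrolled4(pair_t *dst, pair_t v)
{
    for (int i = 0; i < N_PAIRS; i += 4) {
        dst[i]     = v;
        dst[i + 1] = v;
        dst[i + 2] = v;
        dst[i + 3] = v;
    }
}
```

Both functions leave `dst` in the same state; the unrolled form trades code size for fewer loop-control branches.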
To reproduce, using the files I will attach:
gcc -O2 -funroll-loops -m32 -c parse.i
Looking at the code for procedure "ParseVideoSegment", you'll see that the
loop in question has been completely unrolled. It's also interesting to
note that the enclosing loop has become a bct (branch-on-count) loop:
.L8:
        lhz 11,0(4)
        li 0,12
        lhz 8,2(4)
        slwi 9,30,3
        slwi 11,11,16
        sth 0,8(5)
        or 11,11,8
        addi 9,9,-12
        srwi 10,11,20
        sth 9,10(5)
        rlwinm 0,10,0,30,31
        rlwinm 8,11,10,31,31
        add 0,29,0
        stw 3,0(5)
        slwi 0,0,1
        stw 11,4(5)
        lhzx 9,24,0
        stw 6,120(12)
        stw 7,124(12)
        sth 9,128(12)
        stb 8,130(12)
        stb 25,131(12)
        stw 6,0(12)
        stw 7,4(12)
        stw 6,8(12)
        stw 7,12(12)
        stw 6,16(12)
        stw 7,20(12)
        stw 6,24(12)
        stw 7,28(12)
        stw 6,32(12)
        stw 7,36(12)
        stw 6,40(12)
        stw 7,44(12)
        stw 6,48(12)
        stw 7,52(12)
        stw 6,56(12)
        stw 7,60(12)
        stw 6,64(12)
        stw 7,68(12)
        stw 6,72(12)
        stw 7,76(12)
        stw 6,80(12)
        stw 7,84(12)
        stw 6,88(12)
        stw 7,92(12)
        stw 6,96(12)
        stw 7,100(12)
        stw 6,104(12)
        stw 7,108(12)
        stw 6,112(12)
        stw 7,116(12)
        rlwinm 10,10,20,0,8
        srawi 10,10,18
        add 4,4,30
        sth 10,0(12)
        addi 5,5,16
        addi 12,12,132
        addi 31,31,1
        bdz .L119
If the same code is compiled using FDO, the loop is no longer unrolled. It
becomes a bct loop, which in turn prevents the outer loop from becoming a bct
loop. The indexed stores and their associated address computations also form
a very undesirable dependence chain within the loop, as well as within the
two peeled iterations.
gcc -O2 -fprofile-use -funroll-loops -m32 -c parse.i
Generates:
.L111:
        addi 9,27,1
        slwi 0,27,3
        addi 11,9,1
        slwi 9,9,3
        cmplwi 7,11,15
        stwx 4,6,0
        la 6,4(6)
        stwx 5,6,0
        la 6,-4(6)
        stwx 4,9,6
        la 9,4(9)
        stwx 5,9,6
        la 9,-4(9)
        bgt- 7,.L112
        addi 0,11,1
        cmplwi 7,0,16
        subfic 0,11,16
        mtctr 0
        bgt- 7,.L128
.L94:
        slwi 0,11,3
        addi 11,11,1
        stwx 4,6,0
        la 6,4(6)
        stwx 5,6,0
        la 6,-4(6)
        bdnz .L94
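The shape described above (two peeled iterations followed by a counted loop) can be sketched in C roughly as follows. This is a hypothetical reconstruction for illustration, not the code from parse.i: the names, the 16-element trip count, and the fixed starting index of zero are assumptions, and the sketch says nothing about how the compiler actually schedules the stores.

```c
#include <stdint.h>

#define N_PAIRS 16  /* assumed trip count, for illustration only */

/* Hypothetical 8-byte element made of two 32-bit words. */
typedef struct { uint32_t lo, hi; } pair_t;

/* Two iterations peeled off the front, then a down-counting loop for the
   remainder, loosely mirroring a count-register (mtctr/bdnz) loop. */
void fill_peeled2(pair_t *dst, pair_t v)
{
    int i = 0;

    dst[i++] = v;                           /* first peeled iteration  */
    dst[i++] = v;                           /* second peeled iteration */

    for (int n = N_PAIRS - i; n > 0; n--)   /* counted remainder */
        dst[i++] = v;
}
```

In this form every store address depends on the running index `i`, which is the kind of serial dependence the report calls out; a fully unrolled version can use constant offsets from a single base pointer instead.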
I will attach the files needed to reproduce this (parse.i, parse.gcda, and
parse.gcno).
--
Summary: Failure to unroll loop when using FDO causes performance regression
Product: gcc
Version: 3.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: translation
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: steinmtz at us dot ibm dot com
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: PowerPC-Linux
GCC host triplet: PowerPC-Linux
GCC target triplet: PowerPC-Linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15632
* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
From: pinskia at gcc dot gnu dot org @ 2004-05-25 11:40 UTC (permalink / raw)
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
Component|translation |rtl-optimization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15632
* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
From: steinmtz at us dot ibm dot com @ 2004-05-25 11:42 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steinmtz at us dot ibm dot com 2004-05-24 14:51 -------
Created an attachment (id=6368)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=6368&action=view)
Preprocessed source code, compressed with gzip
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15632
* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
From: steinmtz at us dot ibm dot com @ 2004-05-25 12:04 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steinmtz at us dot ibm dot com 2004-05-24 14:52 -------
Created an attachment (id=6369)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=6369&action=view)
FDO data compressed with gzip
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15632
* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
From: steinmtz at us dot ibm dot com @ 2004-05-25 12:06 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steinmtz at us dot ibm dot com 2004-05-24 14:53 -------
Created an attachment (id=6370)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=6370&action=view)
FDO data compressed with gzip
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15632
* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
From: pinskia at gcc dot gnu dot org @ 2004-06-13 22:55 UTC (permalink / raw)
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |missed-optimization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15632
* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
From: pinskia at gcc dot gnu dot org @ 2004-06-13 23:03 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-06-13 23:03 -------
Can you try with <http://gcc.gnu.org/ml/gcc-patches/2004-05/msg01558.html> applied and see if
this is fixed with that patch? If you cannot test, just say so.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15632
* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
From: steinmtz at us dot ibm dot com @ 2004-06-14 13:10 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steinmtz at us dot ibm dot com 2004-06-14 13:10 -------
The patch at http://gcc.gnu.org/ml/gcc-patches/2004-05/msg01558.html addresses
a separate problem and does not fix this bug. Thanks.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15632
* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
From: giovannibajo at libero dot it @ 2004-10-07 9:48 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From giovannibajo at libero dot it 2004-10-07 09:48 -------
Pete, can you double-check whether the problem is still present on mainline?
Now that part of LNO has been merged, this could be fixed already.
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |giovannibajo at libero dot it
Status|UNCONFIRMED |WAITING
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15632
* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
From: steinmtz at us dot ibm dot com @ 2004-10-08 16:28 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steinmtz at us dot ibm dot com 2004-10-08 16:28 -------
It appears that the specific problem documented here has been resolved in
mainline. There is still a performance regression when using FDO, but it
must be for other reasons.
I am marking this one as resolved and will work on figuring out the current
cause of the regression.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|WAITING |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15632
* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
From: reichelt at gcc dot gnu dot org @ 2004-10-08 16:44 UTC (permalink / raw)
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |4.0.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15632