public inbox for gcc-bugs@sourceware.org
* [Bug translation/15632] New: Failure to unroll loop when using FDO causes performance regression
@ 2004-05-25 11:37 steinmtz at us dot ibm dot com
  2004-05-25 11:40 ` [Bug rtl-optimization/15632] " pinskia at gcc dot gnu dot org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: steinmtz at us dot ibm dot com @ 2004-05-25 11:37 UTC (permalink / raw)
  To: gcc-bugs

Using: gcc version 3.5.0 20040430 (experimental) for PowerPC

Here is an example of a performance regression when using FDO.  The
"ParseVid" component of Skidmarks runs about 5% slower with FDO.  I traced
it to a loop that is unrolled in the non-FDO compilation but is not
unrolled in the FDO version.  When I hand-unrolled that loop, I got back
the full 5% plus an additional 2%, so the missing unrolling not only
introduces a penalty but also negates other positive transformations.

To reproduce given the files I will attach:

gcc -O2 -funroll-loops -m32 -c parse.i

Looking at the code for procedure "ParseVideoSegment", you'll see that the
loop in question has been completely unrolled.  It's also interesting to
note that the loop within which it is nested has become a bct loop:

.L8:
      lhz 11,0(4)
      li 0,12
      lhz 8,2(4)
      slwi 9,30,3
      slwi 11,11,16
      sth 0,8(5)
      or 11,11,8
      addi 9,9,-12
      srwi 10,11,20
      sth 9,10(5)
      rlwinm 0,10,0,30,31
      rlwinm 8,11,10,31,31
      add 0,29,0
      stw 3,0(5)
      slwi 0,0,1
      stw 11,4(5)
      lhzx 9,24,0
      stw 6,120(12)
      stw 7,124(12)
      sth 9,128(12)
      stb 8,130(12)
      stb 25,131(12)
      stw 6,0(12)
      stw 7,4(12)
      stw 6,8(12)
      stw 7,12(12)
      stw 6,16(12)
      stw 7,20(12)
      stw 6,24(12)
      stw 7,28(12)
      stw 6,32(12)
      stw 7,36(12)
      stw 6,40(12)
      stw 7,44(12)
      stw 6,48(12)
      stw 7,52(12)
      stw 6,56(12)
      stw 7,60(12)
      stw 6,64(12)
      stw 7,68(12)
      stw 6,72(12)
      stw 7,76(12)
      stw 6,80(12)
      stw 7,84(12)
      stw 6,88(12)
      stw 7,92(12)
      stw 6,96(12)
      stw 7,100(12)
      stw 6,104(12)
      stw 7,108(12)
      stw 6,112(12)
      stw 7,116(12)
      rlwinm 10,10,20,0,8
      srawi 10,10,18
      add 4,4,30
      sth 10,0(12)
      addi 5,5,16
      addi 12,12,132
      addi 31,31,1
      bdz .L119

If the same code is compiled using FDO, the loop is no longer unrolled.  It 
becomes a bct loop, which in turn prevents the outer loop from becoming a bct 
loop.  The use of indexed stores and the associated address computations also 
form a very undesirable dependence chain within the loop, as well as within the 
two peeled iterations.

gcc -O2 -fprofile-use -funroll-loops -m32 -c parse.i

Generates:

.L111:
      addi 9,27,1
      slwi 0,27,3
      addi 11,9,1
      slwi 9,9,3
      cmplwi 7,11,15
      stwx 4,6,0
      la 6,4(6)
      stwx 5,6,0
      la 6,-4(6)
      stwx 4,9,6
      la 9,4(9)
      stwx 5,9,6
      la 9,-4(9)
      bgt- 7,.L112
      addi 0,11,1
      cmplwi 7,0,16
      subfic 0,11,16
      mtctr 0
      bgt- 7,.L128
.L94:
      slwi 0,11,3
      addi 11,11,1
      stwx 4,6,0
      la 6,4(6)
      stwx 5,6,0
      la 6,-4(6)
      bdnz .L94
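
Read back into C, the FDO listing above appears to have roughly the shape
sketched below: two peeled iterations, an early exit, and then a counted
loop whose trip count is loaded into CTR.  This is only a hypothetical
model for discussion.  The names pair_t and fill_peeled, the start
parameter, and the bound of 16 are assumptions read off the excerpt
("cmplwi 7,11,15", "subfic 0,11,16"), and the second guard that branches
to .L128 is omitted.

```c
#include <assert.h>
#include <stdint.h>

#define N_PAIRS 16  /* assumed bound, inferred from the compare immediates */

/* Hypothetical element type: two 32-bit words, matching the 8-byte stride. */
typedef struct { uint32_t lo, hi; } pair_t;

/* Model of the FDO code shape, not the original source. */
static void fill_peeled(pair_t *p, unsigned start, uint32_t a, uint32_t b)
{
    /* The excerpt performs both peeled stores before any bound check. */
    assert(start + 2 <= N_PAIRS);

    unsigned i = start;
    p[i].lo = a;      p[i].hi = b;       /* peeled iteration 1 */
    p[i + 1].lo = a;  p[i + 1].hi = b;   /* peeled iteration 2 */
    i += 2;

    if (i > N_PAIRS - 1)                 /* cmplwi 7,11,15 ; bgt- 7,.L112 */
        return;

    /* subfic 0,11,16 ; mtctr 0 ; ... bdnz .L94 */
    for (unsigned ctr = N_PAIRS - i; ctr != 0; ctr--) {
        p[i].lo = a;
        p[i].hi = b;
        i++;
    }
}
```

Because the inner loop claims the count register here, the enclosing loop
can no longer use it, which matches the observation that the outer loop
stops being a bct loop under FDO.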

I will attach the files needed to reproduce this (parse.i, parse.gcda and parse.gcno).

-- 
           Summary: Failure to unroll loop when using FDO causes performance
                    regression
           Product: gcc
           Version: 3.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: translation
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: steinmtz at us dot ibm dot com
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: PowerPC-Linux
  GCC host triplet: PowerPC-Linux
GCC target triplet: PowerPC-Linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15632



* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
  2004-05-25 11:37 [Bug translation/15632] New: Failure to unroll loop when using FDO causes performance regression steinmtz at us dot ibm dot com
@ 2004-05-25 11:40 ` pinskia at gcc dot gnu dot org
  2004-05-25 11:42 ` steinmtz at us dot ibm dot com
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-05-25 11:40 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|translation                 |rtl-optimization



* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
  2004-05-25 11:37 [Bug translation/15632] New: Failure to unroll loop when using FDO causes performance regression steinmtz at us dot ibm dot com
  2004-05-25 11:40 ` [Bug rtl-optimization/15632] " pinskia at gcc dot gnu dot org
@ 2004-05-25 11:42 ` steinmtz at us dot ibm dot com
  2004-05-25 12:04 ` steinmtz at us dot ibm dot com
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: steinmtz at us dot ibm dot com @ 2004-05-25 11:42 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steinmtz at us dot ibm dot com  2004-05-24 14:51 -------
Created an attachment (id=6368)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=6368&action=view)
Preprocessed source code, compressed with gzip


-- 



* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
  2004-05-25 11:37 [Bug translation/15632] New: Failure to unroll loop when using FDO causes performance regression steinmtz at us dot ibm dot com
  2004-05-25 11:40 ` [Bug rtl-optimization/15632] " pinskia at gcc dot gnu dot org
  2004-05-25 11:42 ` steinmtz at us dot ibm dot com
@ 2004-05-25 12:04 ` steinmtz at us dot ibm dot com
  2004-05-25 12:06 ` steinmtz at us dot ibm dot com
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: steinmtz at us dot ibm dot com @ 2004-05-25 12:04 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steinmtz at us dot ibm dot com  2004-05-24 14:52 -------
Created an attachment (id=6369)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=6369&action=view)
FDO data compressed with gzip


-- 



* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
  2004-05-25 11:37 [Bug translation/15632] New: Failure to unroll loop when using FDO causes performance regression steinmtz at us dot ibm dot com
                   ` (2 preceding siblings ...)
  2004-05-25 12:04 ` steinmtz at us dot ibm dot com
@ 2004-05-25 12:06 ` steinmtz at us dot ibm dot com
  2004-06-13 22:55 ` pinskia at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: steinmtz at us dot ibm dot com @ 2004-05-25 12:06 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steinmtz at us dot ibm dot com  2004-05-24 14:53 -------
Created an attachment (id=6370)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=6370&action=view)
FDO data compressed with gzip


-- 



* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
  2004-05-25 11:37 [Bug translation/15632] New: Failure to unroll loop when using FDO causes performance regression steinmtz at us dot ibm dot com
                   ` (3 preceding siblings ...)
  2004-05-25 12:06 ` steinmtz at us dot ibm dot com
@ 2004-06-13 22:55 ` pinskia at gcc dot gnu dot org
  2004-06-13 23:03 ` pinskia at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-06-13 22:55 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization



* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
  2004-05-25 11:37 [Bug translation/15632] New: Failure to unroll loop when using FDO causes performance regression steinmtz at us dot ibm dot com
                   ` (4 preceding siblings ...)
  2004-06-13 22:55 ` pinskia at gcc dot gnu dot org
@ 2004-06-13 23:03 ` pinskia at gcc dot gnu dot org
  2004-06-14 13:10 ` steinmtz at us dot ibm dot com
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-06-13 23:03 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-06-13 23:03 -------
Can you try with <http://gcc.gnu.org/ml/gcc-patches/2004-05/msg01558.html> applied and see
whether that patch fixes this?  If you cannot test, just say so.

-- 



* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
  2004-05-25 11:37 [Bug translation/15632] New: Failure to unroll loop when using FDO causes performance regression steinmtz at us dot ibm dot com
                   ` (5 preceding siblings ...)
  2004-06-13 23:03 ` pinskia at gcc dot gnu dot org
@ 2004-06-14 13:10 ` steinmtz at us dot ibm dot com
  2004-10-07  9:48 ` giovannibajo at libero dot it
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: steinmtz at us dot ibm dot com @ 2004-06-14 13:10 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steinmtz at us dot ibm dot com  2004-06-14 13:10 -------
http://gcc.gnu.org/ml/gcc-patches/2004-05/msg01558.html is a separate problem 
and does not fix this bug.  Thanks.

-- 



* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
  2004-05-25 11:37 [Bug translation/15632] New: Failure to unroll loop when using FDO causes performance regression steinmtz at us dot ibm dot com
                   ` (6 preceding siblings ...)
  2004-06-14 13:10 ` steinmtz at us dot ibm dot com
@ 2004-10-07  9:48 ` giovannibajo at libero dot it
  2004-10-08 16:28 ` steinmtz at us dot ibm dot com
  2004-10-08 16:44 ` reichelt at gcc dot gnu dot org
  9 siblings, 0 replies; 11+ messages in thread
From: giovannibajo at libero dot it @ 2004-10-07  9:48 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From giovannibajo at libero dot it  2004-10-07 09:48 -------
Pete, can you double-check whether the problem is still present on mainline? Now that
part of LNO has been merged, this could be fixed already.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |giovannibajo at libero dot
                   |                            |it
             Status|UNCONFIRMED                 |WAITING



* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
  2004-05-25 11:37 [Bug translation/15632] New: Failure to unroll loop when using FDO causes performance regression steinmtz at us dot ibm dot com
                   ` (7 preceding siblings ...)
  2004-10-07  9:48 ` giovannibajo at libero dot it
@ 2004-10-08 16:28 ` steinmtz at us dot ibm dot com
  2004-10-08 16:44 ` reichelt at gcc dot gnu dot org
  9 siblings, 0 replies; 11+ messages in thread
From: steinmtz at us dot ibm dot com @ 2004-10-08 16:28 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steinmtz at us dot ibm dot com  2004-10-08 16:28 -------
It appears that the specific problem documented here has been resolved in
mainline.  There is still a performance regression when using FDO, but it
must be for other reasons.

I am marking this one as resolved and will work on finding the current
cause of the regression.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|                            |FIXED



* [Bug rtl-optimization/15632] Failure to unroll loop when using FDO causes performance regression
  2004-05-25 11:37 [Bug translation/15632] New: Failure to unroll loop when using FDO causes performance regression steinmtz at us dot ibm dot com
                   ` (8 preceding siblings ...)
  2004-10-08 16:28 ` steinmtz at us dot ibm dot com
@ 2004-10-08 16:44 ` reichelt at gcc dot gnu dot org
  9 siblings, 0 replies; 11+ messages in thread
From: reichelt at gcc dot gnu dot org @ 2004-10-08 16:44 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.0.0


