public inbox for gcc-bugs@sourceware.org
* [Bug target/29953]  New: [SH-4] Perfomance regression in loops. cmp/eq used instead of dt
@ 2006-11-23 10:14 nbkolchin at gmail dot com
  2006-11-23 10:15 ` [Bug target/29953] " nbkolchin at gmail dot com
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: nbkolchin at gmail dot com @ 2006-11-23 10:14 UTC (permalink / raw)
  To: gcc-bugs

GCC 4.1.1 (and probably all 4.* versions; 4.3.0-svn was tested as well) uses
cmp/eq instead of "dt" in loops. This leads to a ~20% performance decrease.

Technically, the loop processing algorithm is completely different between
versions.

Example (sources attached):

CFLAGS="-m4 -O3 -fomit-frame-pointer"

gcc 3.4.4:
----------------------------
.LFB2:
        mov.l   .L11,r3
        mov     #0,r0
        mov.l   .L12,r2
.L5:
        mov.l   @r3+,r1 ! !!!
        dt      r2      ! !!!
        bf/s    .L5
        add     r1,r0
        rts
        nop
.L13:
        .align 2
.L11:
        .long   -1946157056
.L12:
        .long   1000000
-----------------------------

gcc 4.1.1:
-----------------------------
.LFB2:
        mov.l   .L8,r2
        mov     #0,r0
        mov.l   .L9,r3
.L2:
        mov.l   @r2+,r1 ! !!!
        cmp/eq  r3,r2   ! !!!
        bf/s    .L2
        add     r1,r0
        rts
        nop
.L10:
        .align 2
.L8:
        .long   -1946157056
.L9:
        .long   -1942157056
-----------------------------
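
For readers without the attachment, the two listings correspond roughly to the two loop shapes below. This is only a sketch reconstructed from the literal pools (1000000 iterations, end address = start address + 4000000 bytes); it is not the attached test.cpp. The function names, the unsigned element type, and passing the buffer as a parameter instead of using a fixed address are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Shape of the gcc 3.4.4 loop: a separate down-counter is decremented and
   tested against zero, which maps onto the SH "dt" instruction. */
uint32_t sum_countdown(const uint32_t *p, size_t n)
{
    uint32_t sum = 0;
    if (n == 0)
        return 0;
    do {
        sum += *p++;        /* mov.l @r3+,r1 ... add r1,r0 */
    } while (--n != 0);     /* dt r2 ; bf/s .L5 */
    return sum;
}

/* Shape of the gcc 4.1.1 loop: the counter is eliminated and the exit test
   compares the post-incremented pointer against a precomputed end address,
   so the compare has to wait for the pointer update. */
uint32_t sum_ptrcmp(const uint32_t *p, size_t n)
{
    const uint32_t *end = p + n;
    uint32_t sum = 0;
    if (n == 0)
        return 0;
    do {
        sum += *p++;        /* mov.l @r2+,r1 ... add r1,r0 */
    } while (p != end);     /* cmp/eq r3,r2 ; bf/s .L2 */
    return sum;
}
```

Both functions compute the same sum; they differ only in which value the exit test depends on.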

P.S. We are porting an application from gcc 3.4 to gcc 4.1 and see about a 60%
performance decrease. So this is probably just the first report. :(


-- 
           Summary: [SH-4] Perfomance regression in loops. cmp/eq used
                    instead of dt
           Product: gcc
           Version: 4.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: nbkolchin at gmail dot com
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: sh-rtemself


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29953



* [Bug target/29953] [SH-4] Perfomance regression in loops. cmp/eq used instead of dt
  2006-11-23 10:14 [Bug target/29953] New: [SH-4] Perfomance regression in loops. cmp/eq used instead of dt nbkolchin at gmail dot com
@ 2006-11-23 10:15 ` nbkolchin at gmail dot com
  2007-03-28 10:51 ` mano at roarinelk dot homelinux dot net
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: nbkolchin at gmail dot com @ 2006-11-23 10:15 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from nbkolchin at gmail dot com  2006-11-23 10:15 -------
Created an attachment (id=12671)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12671&action=view)
test.cpp

Testcase


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29953



* [Bug target/29953] [SH-4] Perfomance regression in loops. cmp/eq used instead of dt
  2006-11-23 10:14 [Bug target/29953] New: [SH-4] Perfomance regression in loops. cmp/eq used instead of dt nbkolchin at gmail dot com
  2006-11-23 10:15 ` [Bug target/29953] " nbkolchin at gmail dot com
@ 2007-03-28 10:51 ` mano at roarinelk dot homelinux dot net
  2007-04-03  6:43 ` christian dot bruel at st dot com
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: mano at roarinelk dot homelinux dot net @ 2007-03-28 10:51 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from mano at roarinelk dot homelinux dot net  2007-03-28 11:50 -------
gcc-4.1.2 and 3.4.6 for Linux, and 3.3.5 and 2.95 for QNX, also create
similar code without the dt instruction. How come your 3.4.4 is
so smart?


-- 

mano at roarinelk dot homelinux dot net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mano at roarinelk dot
                   |                            |homelinux dot net


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29953



* [Bug target/29953] [SH-4] Perfomance regression in loops. cmp/eq used instead of dt
  2006-11-23 10:14 [Bug target/29953] New: [SH-4] Perfomance regression in loops. cmp/eq used instead of dt nbkolchin at gmail dot com
  2006-11-23 10:15 ` [Bug target/29953] " nbkolchin at gmail dot com
  2007-03-28 10:51 ` mano at roarinelk dot homelinux dot net
@ 2007-04-03  6:43 ` christian dot bruel at st dot com
  2007-04-03 14:30 ` christian dot bruel at st dot com
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: christian dot bruel at st dot com @ 2007-04-03  6:43 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from christian dot bruel at st dot com  2007-04-03 07:43 -------
Thank you for reporting this.

There is indeed a data dependency on 'r2' introduced by the cmp/eq instruction,
preventing the mov and the comparison from being executed in parallel, unlike
the dt on the induction variable.

The use of dt seems to be quite sensitive. I checked various versions of sh-gcc
(3.3.x, 3.4.x) and it was difficult to find one using the correct strategy.
I'll look into this issue.

Regards,


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29953



* [Bug target/29953] [SH-4] Perfomance regression in loops. cmp/eq used instead of dt
  2006-11-23 10:14 [Bug target/29953] New: [SH-4] Perfomance regression in loops. cmp/eq used instead of dt nbkolchin at gmail dot com
                   ` (2 preceding siblings ...)
  2007-04-03  6:43 ` christian dot bruel at st dot com
@ 2007-04-03 14:30 ` christian dot bruel at st dot com
  2007-04-03 15:34 ` steven at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: christian dot bruel at st dot com @ 2007-04-03 14:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from christian dot bruel at st dot com  2007-04-03 15:30 -------
This missed optimisation appears with all counted loops. The GIMPLE IR
produces

  j = 0;
  <D1202>:;
  j = j + 1;
  if (j <= 999)
    {
      goto <D1202>;
    }

The transformation to apply (j = 1000; j = j - 1; if (j) ...) would allow the
decrement-and-test pattern to be caught by combine.
Since this transformation needs to know about code selection (and is only
useful if the number of issued instructions is > 1), it seems best to do it in
RTL. I'm thinking about strength_reduce in loop.c, when we optimize bivs.
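
A minimal C sketch of the inversion described above, assuming the loop body does not use j for anything except counting. The function names and the trip counter are hypothetical and exist only to show that both forms run the body the same number of times.

```c
/* Count-up form, as in the GIMPLE above: the exit test compares j against
   a non-zero bound. Returns how many times the body ran. */
unsigned count_up(void)
{
    unsigned trips = 0;
    int j = 0;
    do {
        trips++;            /* stands in for the loop body */
        j = j + 1;
    } while (j <= 999);
    return trips;
}

/* Inverted form: j starts at the trip count and is decremented, so the exit
   test is a comparison against zero (the shape a decrement-and-test
   instruction such as SH "dt" can implement). */
unsigned count_down(void)
{
    unsigned trips = 0;
    int j = 1000;
    do {
        trips++;            /* same loop body */
        j = j - 1;
    } while (j != 0);
    return trips;
}
```

If the body did read j (for example as an array index), the inversion would need a second variable or an adjusted index, which is part of why the choice depends on code selection.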

Question: does it make sense to do this transformation in loop.c? I'm thinking
of strength_reduce.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29953



* [Bug target/29953] [SH-4] Perfomance regression in loops. cmp/eq used instead of dt
  2006-11-23 10:14 [Bug target/29953] New: [SH-4] Perfomance regression in loops. cmp/eq used instead of dt nbkolchin at gmail dot com
                   ` (3 preceding siblings ...)
  2007-04-03 14:30 ` christian dot bruel at st dot com
@ 2007-04-03 15:34 ` steven at gcc dot gnu dot org
  2007-05-15  9:31 ` chrbr at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: steven at gcc dot gnu dot org @ 2007-04-03 15:34 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from steven at gcc dot gnu dot org  2007-04-03 16:34 -------
Re. comment #4:

Answer: Go ahead and implement it in loop.c.
If you want to fix it only for GCC 4.1, that is.  There is no loop.c in GCC 4.2
and later.

So does it make sense?  Depends on what you want to achieve.


-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2007-04-03 16:34:17
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29953



* [Bug target/29953] [SH-4] Perfomance regression in loops. cmp/eq used instead of dt
  2006-11-23 10:14 [Bug target/29953] New: [SH-4] Perfomance regression in loops. cmp/eq used instead of dt nbkolchin at gmail dot com
                   ` (4 preceding siblings ...)
  2007-04-03 15:34 ` steven at gcc dot gnu dot org
@ 2007-05-15  9:31 ` chrbr at gcc dot gnu dot org
  2007-06-08  7:59 ` chrbr at gcc dot gnu dot org
  2007-06-08  8:18 ` chrbr at gcc dot gnu dot org
  7 siblings, 0 replies; 9+ messages in thread
From: chrbr at gcc dot gnu dot org @ 2007-05-15  9:31 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from chrbr at gcc dot gnu dot org  2007-05-15 10:30 -------

I dropped the 4.1 work and implemented a -finvert-loops option on the trunk.

This option allows a basic induction variable to be decremented instead of
incremented, so that the exit test can compare against 0.

I'm validating a patch on Intel and SH.


-- 

chrbr at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |chrbr at gcc dot gnu dot org
                   |dot org                     |
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2007-04-03 16:34:17         |2007-05-15 10:30:36
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29953



* [Bug target/29953] [SH-4] Perfomance regression in loops. cmp/eq used instead of dt
  2006-11-23 10:14 [Bug target/29953] New: [SH-4] Perfomance regression in loops. cmp/eq used instead of dt nbkolchin at gmail dot com
                   ` (5 preceding siblings ...)
  2007-05-15  9:31 ` chrbr at gcc dot gnu dot org
@ 2007-06-08  7:59 ` chrbr at gcc dot gnu dot org
  2007-06-08  8:18 ` chrbr at gcc dot gnu dot org
  7 siblings, 0 replies; 9+ messages in thread
From: chrbr at gcc dot gnu dot org @ 2007-06-08  7:59 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from chrbr at gcc dot gnu dot org  2007-06-08 07:58 -------
Subject: Bug 29953

Author: chrbr
Date: Fri Jun  8 07:58:41 2007
New Revision: 125564

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125564
Log:
PR target/29953
* config/sh/sh.md (doloop_end): New pattern and splitter.
* loop-iv.c (simple_rhs_p): Check for hardware registers.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.md
    trunk/gcc/loop-iv.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29953



* [Bug target/29953] [SH-4] Perfomance regression in loops. cmp/eq used instead of dt
  2006-11-23 10:14 [Bug target/29953] New: [SH-4] Perfomance regression in loops. cmp/eq used instead of dt nbkolchin at gmail dot com
                   ` (6 preceding siblings ...)
  2007-06-08  7:59 ` chrbr at gcc dot gnu dot org
@ 2007-06-08  8:18 ` chrbr at gcc dot gnu dot org
  7 siblings, 0 replies; 9+ messages in thread
From: chrbr at gcc dot gnu dot org @ 2007-06-08  8:18 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from chrbr at gcc dot gnu dot org  2007-06-08 08:18 -------
doloop_optimize performs the IV inversion, given doloop_end insn support in the
machine description.


-- 

chrbr at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.3.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29953


