public inbox for gcc-bugs@sourceware.org
* [Bug target/33431]  New: [SH4] performance regression between 3.4.6 and 4.x
@ 2007-09-14  9:56 nbkolchin at gmail dot com
  2007-09-14 11:46 ` [Bug target/33431] " kkojima at gcc dot gnu dot org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: nbkolchin at gmail dot com @ 2007-09-14  9:56 UTC (permalink / raw)
  To: gcc-bugs

I've found a serious performance regression between GCC version 3.4.6 and
4.2/4.3.

SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark
================================================================
           GCC: 3.4.6   4.2.1   4.3.0 (20070907)
     Composite:  6.05    5.01    4.82
           FFT:  4.90    4.15    4.21
           SOR: 10.10    8.36    7.64
    MonteCarlo:  3.68    3.06    3.04
Sparse matmult:  5.45    4.45    4.03
            LU:  6.10    5.03    5.18
================================================================
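
The scores above can be turned into relative changes against the 3.4.6
baseline. The following is a small sketch in Python (not part of the original
report); the dictionary simply repeats the SciMark2 figures from the table,
and the helper names percent_change and changes are made up for illustration.
With these numbers the composite score drops by roughly 17% with 4.2.1 and
roughly 20% with 4.3.0.

```python
# Scores copied from the SciMark2 table above (higher is better).
# Each tuple holds the results for GCC 3.4.6, 4.2.1 and 4.3.0 (20070907).
scimark = {
    "Composite": (6.05, 5.01, 4.82),
    "FFT": (4.90, 4.15, 4.21),
    "SOR": (10.10, 8.36, 7.64),
    "MonteCarlo": (3.68, 3.06, 3.04),
    "Sparse matmult": (5.45, 4.45, 4.03),
    "LU": (6.10, 5.03, 5.18),
}


def percent_change(baseline, value):
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


def changes(table):
    """Map each benchmark to its percent change for every non-baseline column."""
    return {
        name: tuple(percent_change(row[0], v) for v in row[1:])
        for name, row in table.items()
    }


for name, (c421, c430) in changes(scimark).items():
    print(f"{name:>14}: 4.2.1 {c421:+.1f}%   4.3.0 {c430:+.1f}%")
```

Negative values mean the newer compiler scored lower than 3.4.6.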

BYTEmark* Native Mode Benchmark ver. 2 (10/95)
================================================================
             GCC:      3.4.6      4.2.1  4.3.0 (20070907)
    NUMERIC SORT:     35.459       32.2      29.327
     STRING SORT:     0.5943    0.57604      0.8603
        BITFIELD: 1.0585e+07  9.269e+06  9.4138e+06
    FP EMULATION:     4.4944     4.6012       5.364
         FOURIER:     272.28     241.34      259.12
      ASSIGNMENT:    0.35997    0.38373     0.39683
            IDEA:     124.11     95.057      100.07
         HUFFMAN:     45.593     52.083      56.391
      NEURAL NET:    0.36153    0.30922     0.31348
LU DECOMPOSITION:     11.331     9.4938       8.255
================================================================

The "real world application" shows a 20%-200% performance regression with GCC 4.x.

All tests were compiled with these arguments:
 -O3 -ffast-math -fomit-frame-pointer -funroll-loops -ftracer
 -funit-at-a-time
 -m4 -ml

These arguments were tuned for the best results under 3.4.6. I've played with
various settings under 4.x, but couldn't achieve any performance improvement.

I can rerun them with any combination of options you want.

The tests, which also compile under Linux, can be downloaded from:
- scimark: http://oktetlabs.ru/~snob/scimark.tgz
- nbench: http://oktetlabs.ru/~snob/nbench.tgz

I can attach these files to the bug report if that is acceptable and will not
pollute Bugzilla.

Our target hardware has an SH7750 processor running in little-endian mode under
RTEMS. Unfortunately there is no way to boot Linux there.

Could I ask you to run these tests under linux-sh? At least the scimark one.

After looking through the backend sources, I found that m4 has several variants
in GCC 4.x: m4-100, m4-200, etc. I tried to compile these tests with the m4-200
switch, but it looks like m4-200 enforces big-endian.

The backend sources show that a lot of work is going on in the SH4 part of GCC.

I also wrote simple, stupid tests to compare code generation between different
compiler versions (I can mail/attach them to you, but they are really stupid)
to understand what could cause such a performance regression. But the generated
assembly is really different across versions. I could find only two obvious
things:
- GCC4 has much more aggressive inlining and loop unrolling. (Dropping
  -funroll-loops from the compiler arguments gave no improvement.)
- GCC4 has different instruction scheduling, which probably leads to the
  performance regression.


-- 
           Summary: [SH4] performance regression between 3.4.6 and 4.x
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: nbkolchin at gmail dot com
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: sh-unknown-rtemself


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33431



* [Bug target/33431] [SH4] performance regression between 3.4.6 and 4.x
  2007-09-14  9:56 [Bug target/33431] New: [SH4] performance regression between 3.4.6 and 4.x nbkolchin at gmail dot com
@ 2007-09-14 11:46 ` kkojima at gcc dot gnu dot org
  2007-09-14 16:11 ` nbkolchin at gmail dot com
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: kkojima at gcc dot gnu dot org @ 2007-09-14 11:46 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from kkojima at gcc dot gnu dot org  2007-09-14 11:46 -------
I've run scimark on my box:
sh4-unknown-linux-gnu / linux-kernel 2.6.22-rc4 / SH7751R

with -O3 -ffast-math -fomit-frame-pointer -funroll-loops -ftracer
-funit-at-a-time:

                         gcc-3.4.6    gcc-4.2.1    gcc-4.3.0(20070910)
Composite Score:             16.76        16.86        16.99
FFT            Mflops:       12.92        13.36        13.36
SOR            Mflops:       27.88        26.76        28.01
MonteCarlo     Mflops:        9.96         9.73         9.67
Sparse matmult Mflops:       14.95        16.06        14.84
LU             Mflops:       18.08        18.39        19.05

Hmm... I can't reproduce the regression in linux-sh, at least
for SH7751R.


-- 

kkojima at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kkojima at gcc dot gnu dot
                   |                            |org
            Summary|[SH4] performance regression|[SH4] performance regression
                   |between 3.4.6 and 4.x       |between 3.4.6 and 4.x


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33431



* [Bug target/33431] [SH4] performance regression between 3.4.6 and 4.x
  2007-09-14  9:56 [Bug target/33431] New: [SH4] performance regression between 3.4.6 and 4.x nbkolchin at gmail dot com
  2007-09-14 11:46 ` [Bug target/33431] " kkojima at gcc dot gnu dot org
@ 2007-09-14 16:11 ` nbkolchin at gmail dot com
  2007-09-14 22:10 ` kkojima at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: nbkolchin at gmail dot com @ 2007-09-14 16:11 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from nbkolchin at gmail dot com  2007-09-14 16:10 -------
Thank you for your reply. 

Possible explanations:
- you are not using "-m4 -ml", but some other architecture settings.
- SH7751R and SH7750R have different instruction pipelines (probably not; both
  are SH4-200 variants as far as I know).
- gcc for Linux is different from gcc for RTEMS (how can this be checked?)
- the processor endianness is different.


-- 

nbkolchin at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nbkolchin at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33431



* [Bug target/33431] [SH4] performance regression between 3.4.6 and 4.x
  2007-09-14  9:56 [Bug target/33431] New: [SH4] performance regression between 3.4.6 and 4.x nbkolchin at gmail dot com
  2007-09-14 11:46 ` [Bug target/33431] " kkojima at gcc dot gnu dot org
  2007-09-14 16:11 ` nbkolchin at gmail dot com
@ 2007-09-14 22:10 ` kkojima at gcc dot gnu dot org
  2007-09-15 12:13 ` nbkolchin at gmail dot com
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: kkojima at gcc dot gnu dot org @ 2007-09-14 22:10 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from kkojima at gcc dot gnu dot org  2007-09-14 22:10 -------
-ml and -m4 are the defaults on sh4-unknown-linux-gnu compilers.
For 4.3.0, you can see how the default target-specific options
are set with 'cc1 --target-help'.  I've got the following result
on my Linux box.  You can compare it with the one from your RTEMS compiler.

The following options are target specific:
  -m4                                   [enabled]
  -m4-100                               [disabled]
  -m4-200                               [disabled]
  -m4-300                               [disabled]
  -m4a                                  [disabled]
  -mb                                   [disabled]
  -mbigtable                            [disabled]
  -mbranch-cost=                        0xffffffff
  -mcbranchdi                           [enabled]
  -mcmpeqdi                             [disabled]
  -mcut2-workaround                     [disabled]
  -mdalign                              [disabled]
  -mdiv=
  -mdivsi3_libfunc=
  -mexpand-cbranchdi                    [enabled]
  -mfused-madd                          [disabled]
  -mgettrcost=                          0xffffffff
  -mglibc                               [enabled]
  -mhitachi                             [disabled]
  -mieee                                [disabled]
  -minline-ic_invalidate                [disabled]
  -misize                               [disabled]
  -ml                                   [enabled]
  -mnomacsave                           [disabled]
  -mpadstruct                           [disabled]
  -mprefergot                           [disabled]
  -mpretend-cmove                       [disabled]
  -mrelax                               [disabled]
  -mrenesas                             [disabled]
  -mspace                               [disabled]
  -muclibc                              [disabled]
  -multcost=                            0xffffffff
  -musermode                            [enabled]


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33431



* [Bug target/33431] [SH4] performance regression between 3.4.6 and 4.x
  2007-09-14  9:56 [Bug target/33431] New: [SH4] performance regression between 3.4.6 and 4.x nbkolchin at gmail dot com
                   ` (2 preceding siblings ...)
  2007-09-14 22:10 ` kkojima at gcc dot gnu dot org
@ 2007-09-15 12:13 ` nbkolchin at gmail dot com
  2007-09-17 10:30 ` andrew dot stubbs at st dot com
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: nbkolchin at gmail dot com @ 2007-09-15 12:13 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from nbkolchin at gmail dot com  2007-09-15 12:13 -------
There are no differences in the "cc1 --target-help" output. I will try to split
scimark into small pieces and test them separately. Thank you for your help.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33431



* [Bug target/33431] [SH4] performance regression between 3.4.6 and 4.x
  2007-09-14  9:56 [Bug target/33431] New: [SH4] performance regression between 3.4.6 and 4.x nbkolchin at gmail dot com
                   ` (3 preceding siblings ...)
  2007-09-15 12:13 ` nbkolchin at gmail dot com
@ 2007-09-17 10:30 ` andrew dot stubbs at st dot com
  2007-09-17 10:30 ` [Bug target/33431] New: " Andrew STUBBS
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: andrew dot stubbs at st dot com @ 2007-09-17 10:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from andrew dot stubbs at st dot com  2007-09-17 10:30 -------
Subject: Re:   New: [SH4] performance regression between
 3.4.6 and 4.x

nbkolchin at gmail dot com wrote:
> Our target hardware has SH7750 processor running in little endian mode under
> RTEMS. Unfortunately there is no way to boot linux there.
> 
> After lurking inside backend sources, I found that m4 has several variants in
> GCC 4.x: m4-100, m4-200, etc. I've tried to compile this tests with m4-200
> switch, but it looks like m4-200 enforces big-endian.

The 7750 has direct-mapped caches and so is not the best platform for
benchmarking. A slight code perturbation can give a large change in
performance. :(
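
The effect described above can be illustrated with a toy model of a
direct-mapped cache. The sketch below is in Python and assumes an 8 KB
instruction cache with 32-byte lines (figures recalled for the SH7750, to be
checked against the hardware manual) and a made-up code layout: a hot loop
that calls one helper. When the two blocks sit exactly one cache size apart
they evict each other on every iteration; moving the helper by 64 bytes
removes the conflict.

```python
LINE_SIZE = 32           # bytes per cache line (assumed)
CACHE_SIZE = 8 * 1024    # bytes of instruction cache (assumed)
NUM_LINES = CACHE_SIZE // LINE_SIZE


def count_misses(addresses):
    """Simulate a direct-mapped cache and count misses for an address trace."""
    tags = [None] * NUM_LINES
    misses = 0
    for addr in addresses:
        line = addr // LINE_SIZE
        index = line % NUM_LINES     # each line can live in exactly one slot
        tag = line // NUM_LINES
        if tags[index] != tag:
            tags[index] = tag        # evict whatever occupied the slot
            misses += 1
    return misses


def loop_trace(loop_addr, callee_addr, size, iterations):
    """Fetch trace of a loop body that runs one helper on each iteration."""
    trace = []
    for _ in range(iterations):
        # SH instructions are 16 bits wide, hence the step of 2 bytes.
        trace.extend(range(loop_addr, loop_addr + size, 2))
        trace.extend(range(callee_addr, callee_addr + size, 2))
    return trace


# Hypothetical layouts: two 64-byte code blocks, 1000 loop iterations.
aliased = count_misses(loop_trace(0x0000, 0x2000, 64, 1000))  # same cache slots
shifted = count_misses(loop_trace(0x0000, 0x2040, 64, 1000))  # helper moved by 64 bytes

print("misses with aliasing layout:", aliased)
print("misses with shifted layout: ", shifted)
```

The model ignores everything except placement, which is the point: the same
code can behave very differently depending only on where the linker puts it.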

The m4-200 option is NOT suitable for that target. The 7750 is a 100 
series core (not that that was a nomenclature that existed when it came 
out). As far as I know, anybody that has a 200 series or above has an 
official ST toolset to go with it (GCC of course).

Andrew


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33431




* [Bug target/33431] [4.3/4.4/4.5 Regression] [SH4] performance regression between 3.4.6 and 4.x
  2007-09-14  9:56 [Bug target/33431] New: [SH4] performance regression between 3.4.6 and 4.x nbkolchin at gmail dot com
                   ` (5 preceding siblings ...)
  2007-09-17 10:30 ` [Bug target/33431] New: " Andrew STUBBS
@ 2009-03-31 16:05 ` jsm28 at gcc dot gnu dot org
  2009-03-31 19:01 ` jakub at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: jsm28 at gcc dot gnu dot org @ 2009-03-31 16:05 UTC (permalink / raw)
  To: gcc-bugs



-- 

jsm28 at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[SH4] performance regression|[4.3/4.4/4.5 Regression]
                   |between 3.4.6 and 4.x       |[SH4] performance regression
                   |                            |between 3.4.6 and 4.x
   Target Milestone|---                         |4.3.4


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33431



* [Bug target/33431] [4.3/4.4/4.5 Regression] [SH4] performance regression between 3.4.6 and 4.x
  2007-09-14  9:56 [Bug target/33431] New: [SH4] performance regression between 3.4.6 and 4.x nbkolchin at gmail dot com
                   ` (6 preceding siblings ...)
  2009-03-31 16:05 ` [Bug target/33431] [4.3/4.4/4.5 Regression] " jsm28 at gcc dot gnu dot org
@ 2009-03-31 19:01 ` jakub at gcc dot gnu dot org
  2009-08-04 12:37 ` rguenth at gcc dot gnu dot org
  2009-09-23  0:39 ` kkojima at gcc dot gnu dot org
  9 siblings, 0 replies; 11+ messages in thread
From: jakub at gcc dot gnu dot org @ 2009-03-31 19:01 UTC (permalink / raw)
  To: gcc-bugs



-- 

jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P4


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33431



* [Bug target/33431] [4.3/4.4/4.5 Regression] [SH4] performance regression between 3.4.6 and 4.x
  2007-09-14  9:56 [Bug target/33431] New: [SH4] performance regression between 3.4.6 and 4.x nbkolchin at gmail dot com
                   ` (7 preceding siblings ...)
  2009-03-31 19:01 ` jakub at gcc dot gnu dot org
@ 2009-08-04 12:37 ` rguenth at gcc dot gnu dot org
  2009-09-23  0:39 ` kkojima at gcc dot gnu dot org
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-08-04 12:37 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from rguenth at gcc dot gnu dot org  2009-08-04 12:28 -------
GCC 4.3.4 is being released, adjusting target milestone.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.3.4                       |4.3.5


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33431



* [Bug target/33431] [4.3/4.4/4.5 Regression] [SH4] performance regression between 3.4.6 and 4.x
  2007-09-14  9:56 [Bug target/33431] New: [SH4] performance regression between 3.4.6 and 4.x nbkolchin at gmail dot com
                   ` (8 preceding siblings ...)
  2009-08-04 12:37 ` rguenth at gcc dot gnu dot org
@ 2009-09-23  0:39 ` kkojima at gcc dot gnu dot org
  9 siblings, 0 replies; 11+ messages in thread
From: kkojima at gcc dot gnu dot org @ 2009-09-23  0:39 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from kkojima at gcc dot gnu dot org  2009-09-23 00:39 -------
There has been no news for ~2 years and we have no reproducible
test case.  Probably it's due to the 7750's cache pointed out
in #5 by Andrew.  I'd like to close this PR.


-- 

kkojima at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |WORKSFORME


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33431


