* [Bug target/40644]  New: Optimizing for pentium-m gives worse code than optimizing for i486
@ 2009-07-03 18:28 aanisimov at inbox dot ru
  2009-07-03 18:28 ` [Bug target/40644] " aanisimov at inbox dot ru
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: aanisimov at inbox dot ru @ 2009-07-03 18:28 UTC (permalink / raw)
  To: gcc-bugs

Try compiling the attached program with each of the following command lines
(they differ only in the -march setting):

1. gcc -std=c99 -march=i486 -funroll-loops -fprefetch-loop-arrays
-ftree-vectorize -O3 -o gen_weyl_group gen_weyl_group.c
2. gcc -std=c99 -march=i686 -funroll-loops -fprefetch-loop-arrays
-ftree-vectorize -O3 -o gen_weyl_group gen_weyl_group.c
3. gcc -std=c99 -march=pentium-m -funroll-loops -fprefetch-loop-arrays
-ftree-vectorize -O3 -o gen_weyl_group gen_weyl_group.c

On my notebook (Dothan core) I get the following execution times, in seconds:
i486       37.510
i686       37.534
pentium-m  53.959

The results for i486 and i686 are roughly the same, but compiling for
pentium-m makes the program about 44% slower.

I first noticed this behaviour with gcc 4.3.3, which is my system's stock
compiler; the times above were measured with 4.5.0-svn149207, so all versions
from 4.3 to 4.5 are probably affected by this bug.

GCC 4.5.0, used to compile the tests, was configured with the following
options:

--prefix=/home/artem/testing/gcc45 --enable-shared --enable-bootstrap
--enable-languages=c --enable-threads=posix --enable-checking=release
--with-system-zlib --with-gnu-ld --verbose --with-arch=i686


-- 
           Summary: Optimizing for pentium-m gives worse code than
                    optimizing for i486
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: aanisimov at inbox dot ru
 GCC build triplet: i486-slackware-linux
  GCC host triplet: i486-slackware-linux
GCC target triplet: i486-slackware-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40644



* [Bug target/40644] Optimizing for pentium-m gives worse code than optimizing for i486
From: aanisimov at inbox dot ru @ 2009-07-03 18:28 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from aanisimov at inbox dot ru  2009-07-03 18:28 -------
Created an attachment (id=18137)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18137&action=view)
Sample program



* [Bug target/40644] Optimizing for pentium-m gives worse code than optimizing for i486
From: rguenth at gcc dot gnu dot org @ 2009-07-03 18:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from rguenth at gcc dot gnu dot org  2009-07-03 18:55 -------
Try -march=pentium-m -mtune=generic.  Pentium-M never received any special
tuning (it uses the same tuning as pentium-pro).  The same is true of
-march=i686, by the way, but i686 does not have SSE, so it is likely
vectorization and/or prefetching that slows down your case 3.

Try disabling prefetching.
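
The point about SSE can be checked directly. The following is a minimal
sketch, not part of the original report: GCC predefines the macros __SSE__
and __SSE2__ when the selected -march (or -msse/-msse2) enables those
instruction sets, so a small helper can report what the compiler was allowed
to use. The function name simd_level is made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: report which SSE level the compiler was
   allowed to use for this translation unit.  __SSE__ and __SSE2__
   are predefined by GCC when the corresponding instruction sets
   are enabled by -march or -msse/-msse2. */
const char *simd_level(void)
{
#if defined(__SSE2__)
    return "SSE2";
#elif defined(__SSE__)
    return "SSE";
#else
    return "none";
#endif
}

/* Print the result, e.g. from main(). */
void print_simd_level(void)
{
    printf("SIMD level enabled at compile time: %s\n", simd_level());
}
```

Calling print_simd_level() from main and building once with -march=i686 and
once with -march=pentium-m should show whether the vectorizer has SSE
available in each case; on a 32-bit x86 target one would expect "none" for
the former and "SSE2" for the latter.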



* [Bug target/40644] Optimizing for pentium-m gives worse code than optimizing for i486
From: aanisimov at inbox dot ru @ 2009-07-03 19:13 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from aanisimov at inbox dot ru  2009-07-03 19:12 -------
> Try disabling prefetching.

Indeed, removing -fprefetch-loop-arrays brought the run time down to 37.534
seconds, the same as the build compiled for i686.
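
For readers unfamiliar with the flag: -fprefetch-loop-arrays asks GCC to emit
prefetch instructions, where the target supports them, for loops that walk
large arrays. The sketch below is only an illustration of that kind of
transformation, written by hand with GCC's __builtin_prefetch; it is not the
code GCC generated for the attached program. The function names and the
prefetch distance PREFETCH_AHEAD are assumptions chosen for the example.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in elements; not a value taken
   from GCC's tuning tables. */
#define PREFETCH_AHEAD 16

/* Plain summation loop with no prefetch hints. */
long sum_plain(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Same loop with an explicit software prefetch of an element
   PREFETCH_AHEAD iterations ahead.  The hint never changes the
   result, only (possibly) the timing. */
long sum_prefetch(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n)
            /* second argument 0 = prefetch for read,
               third argument 3 = high temporal locality */
            __builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 3);
        total += a[i];
    }
    return total;
}
```

Both functions return the same sum. The extra instructions in the second one
only pay off when the prefetch hides a cache miss; when the data is already
cached, or the hardware prefetcher is already doing the job, they are pure
overhead, which may be what happens in case 3 here.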

