public inbox for gcc-bugs@sourceware.org
* [Bug target/40644] New: Optimizing for pentium-m gives worse code than optimizing for i486
From: aanisimov at inbox dot ru @ 2009-07-03 18:28 UTC
To: gcc-bugs
Try compiling the attached program with the following options (they differ only
in the -march setting):
1. gcc -std=c99 -march=i486 -funroll-loops -fprefetch-loop-arrays
-ftree-vectorize -O3 -o gen_weyl_group gen_weyl_group.c
2. gcc -std=c99 -march=i686 -funroll-loops -fprefetch-loop-arrays
-ftree-vectorize -O3 -o gen_weyl_group gen_weyl_group.c
3. gcc -std=c99 -march=pentium-m -funroll-loops -fprefetch-loop-arrays
-ftree-vectorize -O3 -o gen_weyl_group gen_weyl_group.c
On my notebook (the CPU core is Dothan) I get the following execution times,
in seconds:
-march=i486       37.510
-march=i686       37.534
-march=pentium-m  53.959
Results for i486 and i686 are roughly the same, but compiling for pentium-m
results in seriously degraded performance (about 44% slower).
I first noticed this behaviour with gcc 4.3.3, which is my system's stock
compiler; the times above were measured with 4.5.0-svn149207, so probably all
versions from 4.3 to 4.5 are affected by this bug.
GCC 4.5.0, used to compile the tests, was configured with the following
options:
--prefix=/home/artem/testing/gcc45 --enable-shared --enable-bootstrap
--enable-languages=c --enable-threads=posix --enable-checking=release
--with-system-zlib --with-gnu-ld --verbose --with-arch=i686
--
Summary: Optimizing for pentium-m gives worse code than
optimizing for i486
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: aanisimov at inbox dot ru
GCC build triplet: i486-slackware-linux
GCC host triplet: i486-slackware-linux
GCC target triplet: i486-slackware-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40644
* [Bug target/40644] Optimizing for pentium-m gives worse code than optimizing for i486
From: aanisimov at inbox dot ru @ 2009-07-03 18:28 UTC
To: gcc-bugs
------- Comment #1 from aanisimov at inbox dot ru 2009-07-03 18:28 -------
Created an attachment (id=18137)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18137&action=view)
Sample program
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40644
* [Bug target/40644] Optimizing for pentium-m gives worse code than optimizing for i486
From: rguenth at gcc dot gnu dot org @ 2009-07-03 18:55 UTC
To: gcc-bugs
------- Comment #2 from rguenth at gcc dot gnu dot org 2009-07-03 18:55 -------
Try -march=pentium-m -mtune=generic. Pentium-M never received any special
tuning (its tuning is the same as for pentium-pro). The same is true of
-march=i686, by the way, but i686 does not have SSE, so it is likely
vectorization and/or prefetching that slows down your case 3.
Try disabling prefetching.
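
To make the prefetching suspicion concrete, here is a minimal sketch of the kind of code -fprefetch-loop-arrays may produce on a target that has a prefetch instruction (pentium-m enables SSE; i486 and i686 do not). It is written by hand with GCC's __builtin_prefetch and is not taken from the attached program. The function name sum_with_prefetch and the distance PREFETCH_AHEAD are hypothetical, chosen only for illustration; the compiler picks its own distance from target parameters.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in array elements. */
enum { PREFETCH_AHEAD = 16 };

/* Sum an array while requesting data a few elements ahead of the
   current position.  The prefetch is only a hint: it never changes
   the result, but each one costs an extra instruction per iteration,
   which can hurt when the hardware already streams the data well. */
long sum_with_prefetch(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n) {
            /* rw = 0: prefetch for reading; locality = 3: keep in cache. */
            __builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 3);
        }
        total += a[i];
    }
    return total;
}
```

If a loop like this runs slower than the same loop without the __builtin_prefetch call, that would be consistent with prefetching being the cause, though it does not rule out vectorization.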
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40644
* [Bug target/40644] Optimizing for pentium-m gives worse code than optimizing for i486
From: aanisimov at inbox dot ru @ 2009-07-03 19:13 UTC
To: gcc-bugs
------- Comment #3 from aanisimov at inbox dot ru 2009-07-03 19:12 -------
> Try disabling prefetching.
Indeed, removing -fprefetch-loop-arrays made the program run in 37.534 seconds,
exactly the same as the one compiled for i686.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40644