public inbox for gcc-bugs@sourceware.org
* [Bug optimization/8126] [3.3/3.4 regression] Floating point computation far slower in 3.2 than in 2.95
From: pinskia@physics.uc.edu @ 2003-06-11 22:41 UTC (permalink / raw)
To: gcc-bugs
PLEASE REPLY TO gcc-bugzilla@gcc.gnu.org ONLY, *NOT* gcc-bugs@gcc.gnu.org.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=8126
pinskia@physics.uc.edu changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|3.4 |3.3.1
------- Additional Comments From pinskia@physics.uc.edu 2003-06-11 22:41 -------
Does using -fnew-ra get back to 2.95 speed?
* [Bug optimization/8126] [3.3/3.4 regression] Floating point computation far slower in 3.2 than in 2.95
From: dhazeghi at yahoo dot com @ 2003-06-21 1:44 UTC (permalink / raw)
To: gcc-bugs
dhazeghi at yahoo dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
GCC build triplet| |i686-pc-cygwin
GCC host triplet| |i686-pc-cygwin
GCC target triplet| |i686-pc-cygwin
Priority|P3 |P2
* [Bug optimization/8126] [3.3/3.4 regression] Floating point computation far slower in 3.2 than in 2.95
From: o dot lauffenburger at topsolid dot com @ 2003-06-24 13:23 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From o dot lauffenburger at topsolid dot com 2003-06-24 12:35 -------
I have tested the -fnew-ra option with version 3.3, together with the other
options (-O3 -ffast-math -fomit-frame-pointer).
Without -fnew-ra : 4746 ms
With -fnew-ra : 9063 ms
(With gcc 2.95 : 2914 ms)
So it is apparently worse with the option -fnew-ra.
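For scale, the three timings above can be turned into slowdown factors relative to gcc 2.95. This is only an arithmetic sketch: the dictionary keys and the helper name `slowdown` are labels chosen here, and the numbers are the ones quoted in this comment.

```python
# Timings in milliseconds, copied from the comment above.
timings_ms = {
    "gcc 2.95": 2914,
    "gcc 3.3": 4746,
    "gcc 3.3 -fnew-ra": 9063,
}

BASELINE_MS = timings_ms["gcc 2.95"]


def slowdown(ms, base=BASELINE_MS):
    """Return how many times slower `ms` is than the baseline timing."""
    return ms / base


for name, ms in timings_ms.items():
    print(f"{name:18s} {ms:5d} ms  {slowdown(ms):.2f}x")
```

On these figures gcc 3.3 is about 1.6 times slower than 2.95, and adding -fnew-ra roughly doubles that again.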
* [Bug optimization/8126] [3.3/3.4 regression] Floating point computation far slower in 3.2 than in 2.95
From: pinskia at physics dot uc dot edu @ 2003-06-24 14:38 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at physics dot uc dot edu 2003-06-24 13:28 -------
Then there are two bugs here: a general regression and one due to -fnew-ra.
With -fnew-ra the code should be at least as fast as without it.
* [Bug optimization/8126] [3.3/3.4 regression] Floating point computation far slower in 3.2 than in 2.95
From: mmitchel at gcc dot gnu dot org @ 2003-07-23 7:02 UTC (permalink / raw)
To: gcc-bugs
mmitchel at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|3.3.1 |3.3.2
------- Additional Comments From mmitchel at gcc dot gnu dot org 2003-07-23 07:02 -------
Jan says, via private email, that this is a "random" slowdown. In other words,
regstack makes some decisions that are easily perturbed: sometimes it gets
lucky and sometimes it doesn't.
We should fix that, but not before 3.3.1, so I've postponed this bug until GCC
3.3.2.
* [Bug optimization/8126] [3.3/3.4 regression] Floating point computation far slower in 3.2 than in 2.95
From: mmitchel at gcc dot gnu dot org @ 2003-10-16 2:38 UTC (permalink / raw)
To: gcc-bugs
mmitchel at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|3.3.2 |3.4
------- Additional Comments From mmitchel at gcc dot gnu dot org 2003-10-16 02:38 -------
Postponed until GCC 3.4; this doesn't sound like it's going to have an easy fix.
* [Bug optimization/8126] [3.3/3.4 regression] Floating point computation far slower in 3.2 than in 2.95
From: uros at kss-loka dot si @ 2003-10-30 6:26 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From uros at kss-loka dot si 2003-10-30 06:23 -------
I have some measurements with Red Hat 7.3 gcc-2.96 and gcc-3.3. The results on
a 166 MHz Pentium MMX are quite interesting. They show that for the attached
testcase (test.c in attachments, modified to a plain .c file) gcc-3.3 is faster
than gcc-2.96. Also of interest is gcc 3.3 with the -fnew-ra and
-funroll-all-loops switches. This combination is the fastest one; however,
-fnew-ra on its own is the slowest.
[uros@localhost test]$ gcc -v
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-110)
===
[uros@localhost test]$ gcc -ffast-math -fomit-frame-pointer -O3 test.c
[uros@localhost test]$ time ./a.out
Start?
Stop!
Result = 0.000000, 0.000000, 1.000000
real 0m22.352s
user 0m22.310s
sys 0m0.010s
[uros@localhost test]$ gcc -ffast-math -fomit-frame-pointer -funroll-all-loops -O3 test.c
[uros@localhost test]$ time ./a.out
Start?
Stop!
Result = 0.000000, 0.000000, 1.000000
real 0m19.831s
user 0m19.780s
sys 0m0.020s
===
===
[uros@localhost test]$ gcc -v
Reading specs from /usr/local/lib/gcc-lib/i586-pc-linux-gnu/3.3/specs
Configured with: ../gcc-3.3/configure
Thread model: posix
gcc version 3.3
===
[uros@localhost test]$ gcc -ffast-math -fomit-frame-pointer -O3 test.c
[uros@localhost test]$ time ./a.out
Start?
Stop!
Result = 0.000000, 0.000000, 1.000000
real 0m19.408s
user 0m19.320s
sys 0m0.010s
===
[uros@localhost test]$ gcc -ffast-math -fomit-frame-pointer -funroll-all-loops -O3 test.c
[uros@localhost test]$ time ./a.out
Start?
Stop!
Result = 0.000000, 0.000000, 1.000000
real 0m14.518s
user 0m14.470s
sys 0m0.010s
===
[uros@localhost test]$ gcc -ffast-math -fomit-frame-pointer -funroll-all-loops -fnew-ra -O3 test.c
[uros@localhost test]$ time ./a.out
Start?
Stop!
Result = 0.000000, 0.000000, 1.000000
real 0m13.540s
user 0m13.520s
sys 0m0.010s
===
[uros@localhost test]$ gcc -ffast-math -fomit-frame-pointer -fnew-ra -O3 test.c
[uros@localhost test]$ time ./a.out
Start?
Stop!
Result = 0.000000, 0.000000, 1.000000
real 0m27.185s
user 0m27.140s
sys 0m0.010s
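To compare the six runs above at a glance, the `real` lines can be parsed and sorted. The following is a minimal sketch: the helper `parse_real` and the run labels are names introduced here for illustration, and the timing strings are copied from the transcript above.

```python
import re


def parse_real(line):
    """Convert a bash `time` line such as 'real 0m22.352s' to seconds."""
    match = re.fullmatch(r"real\s+(\d+)m(\d+(?:\.\d+)?)s", line.strip())
    if match is None:
        raise ValueError(f"not a 'real' timing line: {line!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# (label, 'real' line) pairs in the order they appear in the transcript.
runs = [
    ("gcc 2.96", "real 0m22.352s"),
    ("gcc 2.96 -funroll-all-loops", "real 0m19.831s"),
    ("gcc 3.3", "real 0m19.408s"),
    ("gcc 3.3 -funroll-all-loops", "real 0m14.518s"),
    ("gcc 3.3 -funroll-all-loops -fnew-ra", "real 0m13.540s"),
    ("gcc 3.3 -fnew-ra", "real 0m27.185s"),
]

# Sort fastest first.
ranked = sorted((parse_real(line), label) for label, line in runs)
for seconds, label in ranked:
    print(f"{seconds:7.3f} s  {label}")
```

Sorted this way, the -funroll-all-loops -fnew-ra build comes first and the -fnew-ra-only build comes last, matching the summary at the top of this comment.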
* [Bug optimization/8126] [3.3/3.4 regression] Floating point computation far slower in 3.2 than in 2.95
From: pinskia at gcc dot gnu dot org @ 2004-01-01 4:11 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-01-01 04:11 -------
What is weird is that -march=i386 is faster than -march=i686 on a pentium3:
grendel:~/src/gnu/gcctest>gcc -O3 -ffast-math -fomit-frame-pointer pr8126.c -march=i386
grendel:~/src/gnu/gcctest>time ./a.out
Start?
Stop!
Result = 0.000000, 0.000000, 1.000000
2.726u 0.000s 0:02.74 99.2% 0+0k 0+0io 2pf+0w
grendel:~/src/gnu/gcctest>time ./a.out
Start?
Stop!
Result = 0.000000, 0.000000, 1.000000
2.710u 0.000s 0:02.74 98.9% 0+0k 0+0io 0pf+0w
grendel:~/src/gnu/gcctest>gcc -O3 -ffast-math -fomit-frame-pointer pr8126.c -march=i686
grendel:~/src/gnu/gcctest>time ./a.out
Start?
Stop!
Result = 0.000000, 0.000000, 1.000000
2.843u 0.007s 0:02.87 98.9% 0+0k 0+0io 2pf+0w
grendel:~/src/gnu/gcctest>gcc -O3 -ffast-math -fomit-frame-pointer pr8126.c -march=i586
grendel:~/src/gnu/gcctest>time ./a.out
Start?
Stop!
Result = 0.000000, 0.000000, 1.000000
2.703u 0.000s 0:02.72 99.2% 0+0k 0+0io 2pf+0w
grendel:~/src/gnu/gcctest>gcc -O3 -ffast-math -fomit-frame-pointer pr8126.c -march=pentium3
grendel:~/src/gnu/gcctest>time ./a.out
Start?
Stop!
Result = 0.000000, 0.000000, 1.000000
2.843u 0.007s 0:02.87 98.9% 0+0k 0+0io 2pf+0w
It looks like GCC is choosing the wrong instructions for pentium3. (pentium4 is different and
does not matter that much.)
--
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed date|0000-00-00 00:00:00 |2004-01-01 04:11:34
* [Bug optimization/8126] [3.3/3.4 regression] Floating point computation far slower in 3.2 than in 2.95
From: hubicka at ucw dot cz @ 2004-01-01 10:25 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From hubicka at ucw dot cz 2004-01-01 10:25 -------
Subject: Re: [3.3/3.4 regression] Floating point computation far slower in 3.2 than in 2.95
> ------- Additional Comments From pinskia at gcc dot gnu dot org 2004-01-01 04:11 -------
> What is weird is that -march=i386 is faster than -march=i686 on a pentium3:
> [...]
> It looks like GCC is choosing the wrong instructions for pentium3. (pentium4 is different and
> does not matter that much.)
No, it is the scheduler (you will likely reproduce similar results via
-fno-schedule-insns2). The scheduler does not take the stack register
file into account, and reg-stack does not reorder; it works by blindly
inserting exchange operations when the code does not match the stack's
nature, so we get 100% random results, performance-wise, out of the backend.
The unscheduled code usually fares slightly better, as the structure of
the original expression trees is still somewhat preserved, but it is still
far from optimal. There is not much to do on this front in the short term,
unfortunately.
I've had limited luck with a patch teaching the scheduler that two
consecutive FP operations are cheaper when the second uses the
destination of the first as an operand, but it does not fit very well
into the current scheduler model (and it is a misdesign). The proper
solution is to reorganize the scheduler core into a kind of library and
make reg-stack use it to fix the ordering as needed. I am not planning to
dig into it anytime soon, though; I hope that the importance of x87 will fade.
Honza
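The effect described above can be illustrated with a toy model. The sketch assumes a simplified x87-like stack in which every operation needs its destination in st(0) and bringing any other slot to the top costs one exchange. The function `count_fxch` and the value names are hypothetical, and this is not how GCC's reg-stack pass is implemented; it only shows that reordering independent operations, as a scheduler might, changes the number of exchanges that have to be inserted.

```python
def count_fxch(initial_stack, program):
    """Count exchanges needed when each op requires its destination on top.

    initial_stack: list of value names; index 0 is the top of stack, st(0).
    program: list of (dest, src) pairs, each meaning dest = dest OP src.
    """
    stack = list(initial_stack)
    exchanges = 0
    for dest, _src in program:
        pos = stack.index(dest)
        if pos != 0:
            # Emulate `fxch st(pos)`: swap the top of stack with slot `pos`.
            stack[0], stack[pos] = stack[pos], stack[0]
            exchanges += 1
        # In this model the arithmetic itself leaves the stack layout alone.
    return exchanges


regs = ["a", "b", "c", "d"]  # "a" starts in st(0)

# Same four operations, two orders: dependent ops kept together...
grouped = [("a", "b"), ("a", "c"), ("d", "b"), ("d", "c")]
# ...versus the two independent chains interleaved.
interleaved = [("a", "b"), ("d", "b"), ("a", "c"), ("d", "c")]

print("grouped:    ", count_fxch(regs, grouped))
print("interleaved:", count_fxch(regs, interleaved))
```

In this model the grouped order needs one exchange and the interleaved order needs three, even though both compute the same values.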
* [Bug optimization/8126] [3.3/3.4 regression] Floating point computation far slower in 3.2 than in 2.95
From: hubicka at gcc dot gnu dot org @ 2004-01-03 18:39 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From hubicka at gcc dot gnu dot org 2004-01-03 18:39 -------
We are unlikely to redesign reg-stack for this release :(
Hope for the best in the future.
--
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|3.4.0 |3.5.0
* [Bug optimization/8126] [3.3/3.4/3.5 regression] Floating point computation far slower in 3.2 than in 2.95
From: dhazeghi at yahoo dot com @ 2004-01-23 16:58 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dhazeghi at yahoo dot com 2004-01-23 16:58 -------
Are you currently working on this, Jan, or should we unassign it? Thanks.
* [Bug rtl-optimization/8126] [3.3/3.4/4.0 regression] Floating point computation far slower in 3.2 than in 2.95
From: pinskia at gcc dot gnu dot org @ 2004-09-30 16:28 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-09-30 16:28 -------
Hmm, 4.0.0 is faster and smaller at least on a pentium4.
* [Bug rtl-optimization/8126] [3.3/3.4/4.0 regression] Floating point computation far slower in 3.2 than in 2.95
From: uros at gcc dot gnu dot org @ 2004-11-26 9:38 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From uros at gcc dot gnu dot org 2004-11-26 09:38 -------
(In reply to comment #15)
> Hmm, 4.0.0 is faster and smaller at least on a pentium4.
The faster and smaller code is produced because the scheduler is disabled for pentium4.
* [Bug rtl-optimization/8126] [3.3/3.4/4.0 regression] Floating point computation far slower in 3.2 than in 2.95
From: hubicka at gcc dot gnu dot org @ 2005-01-05 21:53 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From hubicka at gcc dot gnu dot org 2005-01-05 21:53 -------
I don't see much to do without regstack reorg and I don't have time for that :(
--
What |Removed |Added
----------------------------------------------------------------------------
AssignedTo|hubicka at gcc dot gnu dot org |unassigned at gcc dot gnu dot org
Status|ASSIGNED |NEW
* [Bug rtl-optimization/8126] [3.3/3.4/4.0 regression] Floating point computation far slower in 3.2 than in 2.95
From: ian at airs dot com @ 2005-01-16 3:35 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From ian at airs dot com 2005-01-16 03:35 -------
If we're going to mark this as a regression, can somebody pin down the cases
where mainline gcc is slower than gcc 2.95?
On my system it is about 35% faster. But that is on a Pentium 4.
I know that Roger Sayle did some work on reg-stack shuffling, but I don't know
how much that affects this PR, if at all.
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |ian at airs dot com
* [Bug rtl-optimization/8126] [3.3/3.4/4.0 regression] Floating point computation far slower in 3.2 than in 2.95
From: steven at gcc dot gnu dot org @ 2005-01-16 13:52 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-01-16 13:52 -------
I think this is a WONTFIX regression.
As Honza pointed out, interactions between regstack and sched2 can sometimes
produce really odd results. I don't see us producing a "sched3"-like pass for
x87 any time soon. It should not be hard to teach regstack to use the DFA
interface, but realistically I don't think anyone is interested in doing so,
except for Roger Sayle maybe...?
IMHO this is a BS bug, because overall we are not worse for FP at all, and
compared to 2.95.x we are in fact *much* better overall.
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |sayle at gcc dot gnu dot org
* [Bug rtl-optimization/8126] [3.3/3.4/4.0 regression] Floating point computation far slower in 3.2 than in 2.95
From: uros at kss-loka dot si @ 2005-01-27 9:08 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From uros at kss-loka dot si 2005-01-27 09:07 -------
I don't think that this has anything to do with regstack and sched2. The fact
is that for FP-intensive applications, 8 FP registers (either stacked x87 or
non-stacked SSE) are not enough. When there is a shortage of registers, gcc
starts to swap registers to and from memory.
Please note that reg/reg and reg/mem FP operations have the same
latency/throughput on P4, but moving FP registers to and from memory introduces
a big performance penalty, and these moves should be minimised as much as possible.
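The shortage can be pictured with a small live-range count. This sketch assumes each FP value is live over a half-open interval of instruction indices and that 8 registers are available. `max_live`, `FP_REGS` and the sample ranges are illustrative assumptions, not output from gcc and not a model of its register allocator.

```python
FP_REGS = 8  # x87 stack depth, and the SSE register count on 32-bit x86


def max_live(ranges):
    """Return the peak number of simultaneously live values.

    ranges: iterable of (start, end) pairs; a value is live on [start, end).
    """
    events = []
    for start, end in ranges:
        events.append((start, 1))   # value becomes live
        events.append((end, -1))    # value dies
    # Sorting puts a death (-1) before a birth (+1) at the same index,
    # so a register freed at index i can be reused at index i.
    events.sort()
    live = peak = 0
    for _point, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


# Hypothetical kernel: six values live across the whole loop body,
# plus four short-lived temporaries.
ranges = [(0, 10)] * 6 + [(2, 5), (3, 6), (4, 7), (8, 9)]

peak = max_live(ranges)
excess = max(0, peak - FP_REGS)
print(f"peak live values: {peak}, values that cannot stay in registers: {excess}")
```

Whenever the peak exceeds the register count, at least that many values have to live in memory at some point, which is where the register-to-memory moves come from.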
There are some measurements to prove this (-O2 only to avoid fast-math intrinsic
shortcuts, P4-3.2 timings):
a) -march=pentium -mfpmath=387: scheduling and reg-stack interactions:
real 0m34.073s
user 0m33.756s
sys 0m0.018s
b) -march=pentium -msse2 -mfpmath=sse: scheduling and no reg-stack:
real 0m35.063s
user 0m34.674s
sys 0m0.076s
c) -march=pentium4 -mfpmath=387: no scheduling with reg-stack:
real 0m33.720s
user 0m33.348s
sys 0m0.037s
d) -march=pentium4 -mfpmath=sse: no scheduling and no reg-stack:
real 0m35.399s
user 0m35.016s
sys 0m0.035s
The question I would like to ask: is there a functionality in gcc to optimise
register moving, considering the cost of reg/reg vs. reg/mem FP operators and
the cost of register<->mem move?
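One way to read the four measurements above is as a 2x2 comparison: scheduling on or off (-march=pentium versus -march=pentium4, as labelled in the list) against x87 versus SSE math. The sketch below averages the reported `real` times along each axis. The data layout and the helper `mean_where` are assumptions made for illustration.

```python
# (march, fpmath) -> reported `real` time in seconds, cases a) to d) above.
runs = {
    ("pentium", "387"): 34.073,   # a) scheduling, reg-stack
    ("pentium", "sse"): 35.063,   # b) scheduling, no reg-stack
    ("pentium4", "387"): 33.720,  # c) no scheduling, reg-stack
    ("pentium4", "sse"): 35.399,  # d) no scheduling, no reg-stack
}


def mean_where(index, value):
    """Average the runs whose key has `value` at position `index`."""
    times = [t for key, t in runs.items() if key[index] == value]
    return sum(times) / len(times)


# Positive means the first setting of each pair is slower on average.
sched_effect = mean_where(0, "pentium") - mean_where(0, "pentium4")
fpmath_effect = mean_where(1, "sse") - mean_where(1, "387")

print(f"scheduling on vs off: {sched_effect:+.4f} s")
print(f"SSE vs x87:           {fpmath_effect:+.4f} s")
```

On these numbers, turning scheduling off changes the mean by under 0.01 s, while the choice of FP unit changes it by about 1.3 s. That is consistent with the claim that the scheduler and reg-stack interaction is not the dominant cost on this P4.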