* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
@ 2007-03-18 9:50 ` pinskia at gcc dot gnu dot org
2007-03-18 10:20 ` dominiq at lps dot ens dot fr
` (17 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2007-03-18 9:50 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from pinskia at gcc dot gnu dot org 2007-03-18 09:49 -------
The only reason why cexp is slow on PPC darwin is because the ABI is stupid.
Complex float arguments are passed via the GPR and returned also the same way
instead of via the FPRs. So you will get a transfer of registers. This is
also true of PPC64 darwin, why they made the same mistake twice I have no idea,
guess they did not expect people to use complex that much.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
2007-03-18 9:50 ` [Bug middle-end/31249] " pinskia at gcc dot gnu dot org
@ 2007-03-18 10:20 ` dominiq at lps dot ens dot fr
2007-03-19 9:28 ` dominiq at lps dot ens dot fr
` (16 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: dominiq at lps dot ens dot fr @ 2007-03-18 10:20 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from dominiq at lps dot ens dot fr 2007-03-18 10:20 -------
Andrew,
Thanks for the answer. Additional timings for AMD Opteron(tm) Processor 250,
2.4Ghz:
Target: x86_64-unknown-linux-gnu
...
gcc version 4.3.0 20061231 (experimental)
[tocata] test/fortran> gfc -O3 sincos.f90
[tocata] test/fortran> time a.out
-6.324121691031215E-002 -2.934957388823078E-003
19.847u 0.001s 0:20.41 97.2% 0+0k 0+0io 0pf+0w
[tocata] test/fortran> gfc -O3 sincos_o.f90
[tocata] test/fortran> time a.out
-6.324121619598655E-002 -2.934957388823078E-003
19.793u 0.000s 0:19.80 99.9% 0+0k 0+0io 0pf+0w
[tocata] test/fortran> gfc -O3 cexp.f90
[tocata] test/fortran> time a.out
-6.324121619598655E-002 -2.934957388823078E-003
15.613u 0.000s 0:15.63 99.8% 0+0k 0+0io 0pf+0w
sin+cos is not optimized as cexpi.
Target: i386-pc-linux-gnu
...
gcc version 4.3.0 20070225 (experimental)
[tocata] test/fortran> gfc32 -Wa,-32 -O3 -fdump-tree-optimized sincos.f90
[tocata] test/fortran> time a.out
-6.324122144403047E-002 -2.934963088285132E-003
10.757u 0.000s 0:10.76 99.9% 0+0k 0+0io 0pf+0w
[tocata] test/fortran> gfc32 -Wa,-32 -O3 -fdump-tree-optimized sincos_o.f90
tocata] test/fortran> time a.out
-6.324122124732012E-002 -2.934963117388848E-003
7.291u 0.001s 0:07.47 97.5% 0+0k 0+0io 4pf+0w
[tocata] test/fortran> gfc32 -Wa,-32 -O3 -fdump-tree-optimized cexp.f90
[tocata] test/fortran> time a.out
-6.324122124732012E-002 -2.934963117388848E-003
11.412u 0.000s 0:11.41 100.0% 0+0k 0+0io 0pf+0w
sin+cos is optimized as cexpi which is faster than cexp -> real optimization!
The i386 code is almost twice as fast as the x86_64 one.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
2007-03-18 9:50 ` [Bug middle-end/31249] " pinskia at gcc dot gnu dot org
2007-03-18 10:20 ` dominiq at lps dot ens dot fr
@ 2007-03-19 9:28 ` dominiq at lps dot ens dot fr
2007-03-19 10:43 ` rguenth at gcc dot gnu dot org
` (15 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: dominiq at lps dot ens dot fr @ 2007-03-19 9:28 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from dominiq at lps dot ens dot fr 2007-03-19 09:28 -------
BTW, did I miss an option to turn this optimization off?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (2 preceding siblings ...)
2007-03-19 9:28 ` dominiq at lps dot ens dot fr
@ 2007-03-19 10:43 ` rguenth at gcc dot gnu dot org
2007-03-19 12:44 ` dominiq at lps dot ens dot fr
` (14 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-03-19 10:43 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from rguenth at gcc dot gnu dot org 2007-03-19 10:43 -------
There is no option to turn it off. But for !TARGET_C99_FUNCTIONS and
!TARGET_HAS_SINCOS targets it's off. Usually (in fact, for every libm I looked
into), cexp is implemented as
complex double cexp (complex double x)
{
double cos = cos (imag(x));
double sin = sin (imag(x));
double e = 1;
if (real(x) != 0)
e = exp (real(x));
...
possibly computing cos and sin in an efficient way (using sincos). So
cexp () should be never slower than calling sin () and cos (). If the ABI
were not stupid of course ;)
Does darwin have a sincos() library function?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (3 preceding siblings ...)
2007-03-19 10:43 ` rguenth at gcc dot gnu dot org
@ 2007-03-19 12:44 ` dominiq at lps dot ens dot fr
2007-03-19 17:52 ` Andrew Pinski
2007-03-19 17:53 ` pinskia at gmail dot com
` (13 subsequent siblings)
18 siblings, 1 reply; 21+ messages in thread
From: dominiq at lps dot ens dot fr @ 2007-03-19 12:44 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from dominiq at lps dot ens dot fr 2007-03-19 12:43 -------
> There is no option to turn it off. But for !TARGET_C99_FUNCTIONS and
> !TARGET_HAS_SINCOS targets it's off.
>From my understanding of the thread
http://gcc.gnu.org/ml/gcc/2007-03/msg00639.html
if !TARGET_64BIT, then TARGET_C99_FUNCTIONS depends on
darwin_macosx_version_min which seems presently default to 10.1.
So TARGET_C99_FUNCTIONS seems to be set to 0 at least for a G5 under
OSX 10.3 and a G4 under 10.4.
> Does darwin have a sincos() library function?
I don't know. If there is no answer in this PR, I can ask the
question on gcc@gcc.gnu.org.
>From the behavior reported in PR30969, PR30980, and PR31161, it seems
that the optimization is off for gcc, but on for g++ and gfortran,
though I cannot figure out why.
> If the ABI were not stupid of course ;)
Since sin() and cos() are non trivial functions, I am very surprised
that a wrong API makes a 50% difference.
If the API is so time consuming, why not inline sin() and cos()?
In addition a decent optimizer should be able to eliminate redundant
part of the codes, making the use of a sincos() function not necessary,
or am I too naive?
What is the best way to collect data on the different platform
to evaluate how this optimization is really working?
I have only access to OSX and Intel or AMD64 under Linux.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-19 12:44 ` dominiq at lps dot ens dot fr
@ 2007-03-19 17:52 ` Andrew Pinski
0 siblings, 0 replies; 21+ messages in thread
From: Andrew Pinski @ 2007-03-19 17:52 UTC (permalink / raw)
To: gcc-bugzilla; +Cc: gcc-bugs
On 19 Mar 2007 12:43:49 -0000, dominiq at lps dot ens dot fr
<gcc-bugzilla@gcc.gnu.org> wrote:
>
> Since sin() and cos() are non trivial functions, I am very surprised
> that a wrong API makes a 50% difference.
Well Here is how it can make a 50% difference (at least on the Cell,
the 970 has less of a restriction and only the dispatch group is
rejected). Modern PowerPC processors like not to store stuff to the
stack and then load it again with in a number of cycles (cell is
around 50 cycles while the 970 is just within a dispatch group).
Transfering between the integer register set and the floating point
register set can only be done via memory so you will get a LHS or a
LRU reject (depending on what processor you are on). This can either
cause a 50 cycle delay or reject of the dispatch group (the later can
cause multiple rejects). The number of cycles used up by this issue
can add up with both sides of the function having this hazard.
Thanks,
Andrew Pinski
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (4 preceding siblings ...)
2007-03-19 12:44 ` dominiq at lps dot ens dot fr
@ 2007-03-19 17:53 ` pinskia at gmail dot com
2007-03-20 11:04 ` rguenth at gcc dot gnu dot org
` (12 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: pinskia at gmail dot com @ 2007-03-19 17:53 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from pinskia at gmail dot com 2007-03-19 17:52 -------
Subject: Re: pseudo-optimzation with sincos/cexpi
On 19 Mar 2007 12:43:49 -0000, dominiq at lps dot ens dot fr
<gcc-bugzilla@gcc.gnu.org> wrote:
>
> Since sin() and cos() are non trivial functions, I am very surprised
> that a wrong API makes a 50% difference.
Well Here is how it can make a 50% difference (at least on the Cell,
the 970 has less of a restriction and only the dispatch group is
rejected). Modern PowerPC processors like not to store stuff to the
stack and then load it again with in a number of cycles (cell is
around 50 cycles while the 970 is just within a dispatch group).
Transfering between the integer register set and the floating point
register set can only be done via memory so you will get a LHS or a
LRU reject (depending on what processor you are on). This can either
cause a 50 cycle delay or reject of the dispatch group (the later can
cause multiple rejects). The number of cycles used up by this issue
can add up with both sides of the function having this hazard.
Thanks,
Andrew Pinski
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (5 preceding siblings ...)
2007-03-19 17:53 ` pinskia at gmail dot com
@ 2007-03-20 11:04 ` rguenth at gcc dot gnu dot org
2007-03-20 13:57 ` dominiq at lps dot ens dot fr
` (11 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-03-20 11:04 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from rguenth at gcc dot gnu dot org 2007-03-20 11:04 -------
I agree it's surprising to get user-visible effects with the
TARGET_C99_FUNCTIONS
difference between the frontends, but they are (supposed to) providing C99
runtime completion by their runtime libraries. And they rely on full C99
support.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (6 preceding siblings ...)
2007-03-20 11:04 ` rguenth at gcc dot gnu dot org
@ 2007-03-20 13:57 ` dominiq at lps dot ens dot fr
2007-03-20 14:03 ` dominiq at lps dot ens dot fr
` (10 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: dominiq at lps dot ens dot fr @ 2007-03-20 13:57 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from dominiq at lps dot ens dot fr 2007-03-20 13:57 -------
> The only reason why cexp is slow on PPC darwin is because the ABI is stupid.
> Complex float arguments are passed via the GPR and returned also the same way
> instead of via the FPRs. So you will get a transfer of registers. This is
> also true of PPC64 darwin, why they made the same mistake twice I have no idea,
> guess they did not expect people to use complex that much.
Is this also true for complex double on 32 bit architectures (i.e., 4 GPRs)
or do you mean the GPR is used to pass a pointer?
> ... . The number of cycles used up by this issue
> can add up with both sides of the function having this hazard.
You are comforting my prejudice against using procedures in critical loops.
Now if you cannot convince darwin people to fix the problem, I cannot how I
could. My short term interests are:
(1) to understand how this reverse optimization is triggered.
(2) to know what are the non-Linux platform that are affected beside
Darwin.
(3) to get a work around less hackish that what I did in my first example.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (7 preceding siblings ...)
2007-03-20 13:57 ` dominiq at lps dot ens dot fr
@ 2007-03-20 14:03 ` dominiq at lps dot ens dot fr
2007-03-20 14:26 ` rguenth at gcc dot gnu dot org
` (9 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: dominiq at lps dot ens dot fr @ 2007-03-20 14:03 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from dominiq at lps dot ens dot fr 2007-03-20 14:03 -------
> I agree it's surprising to get user-visible effects with the
> TARGET_C99_FUNCTIONS difference between the frontends,
> but they are (supposed to) providing C99 runtime completion
> by their runtime libraries. And they rely on full C99 support.
Do you mean that g++ and gfortran set TARGET_C99_FUNCTIONS on
their own?
If yes, the cexpi optimization should probably another condition:
what is the point to replace sin+cos by a call to a function
calling sin+cos?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (8 preceding siblings ...)
2007-03-20 14:03 ` dominiq at lps dot ens dot fr
@ 2007-03-20 14:26 ` rguenth at gcc dot gnu dot org
2007-03-20 14:58 ` dominiq at lps dot ens dot fr
` (8 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-03-20 14:26 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from rguenth at gcc dot gnu dot org 2007-03-20 14:26 -------
That sin+cos is practically sincos (so you get one for free). Just not every
library exports that sincos.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (9 preceding siblings ...)
2007-03-20 14:26 ` rguenth at gcc dot gnu dot org
@ 2007-03-20 14:58 ` dominiq at lps dot ens dot fr
2007-03-20 15:07 ` rguenth at gcc dot gnu dot org
` (7 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: dominiq at lps dot ens dot fr @ 2007-03-20 14:58 UTC (permalink / raw)
To: gcc-bugs
------- Comment #11 from dominiq at lps dot ens dot fr 2007-03-20 14:57 -------
Subject: Re: pseudo-optimzation with sincos/cexpi
> That sin+cos is practically sincos (so you get one for free). Just not every
> library exports that sincos.
Does not this assume that it exists a real sincos(x) and not a faked one
thorugh cexp((1,x))?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (10 preceding siblings ...)
2007-03-20 14:58 ` dominiq at lps dot ens dot fr
@ 2007-03-20 15:07 ` rguenth at gcc dot gnu dot org
2007-03-20 16:09 ` dominiq at lps dot ens dot fr
` (6 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-03-20 15:07 UTC (permalink / raw)
To: gcc-bugs
------- Comment #12 from rguenth at gcc dot gnu dot org 2007-03-20 15:06 -------
Depends on how you name it ;) You can propose that we only enable sincos
transformation if TARGET_HAS_SINCOS is set, I wouldn't necessarily object to
that.
(The targets I care for have a sincos)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (11 preceding siblings ...)
2007-03-20 15:07 ` rguenth at gcc dot gnu dot org
@ 2007-03-20 16:09 ` dominiq at lps dot ens dot fr
2007-03-20 16:12 ` rguenth at gcc dot gnu dot org
` (5 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: dominiq at lps dot ens dot fr @ 2007-03-20 16:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #13 from dominiq at lps dot ens dot fr 2007-03-20 16:08 -------
> You can propose that we only enable sincos transformation
> if TARGET_HAS_SINCOS is set, I wouldn't necessarily object to
> that. (The targets I care for have a sincos)
Sound reasonable: replacing:
return (TARGET_HAS_SINCOS
|| TARGET_C99_FUNCTIONS)
&& optimize;
by
return TARGET_HAS_SINCOS
&& optimize;
in gcc/tree-ssa-math-opts.c, isn't it?
I can even do a preliminary test to check that it
does not break anything.
What's bother me is that i suspect the problem is present
for all non-Linux platforms and to have no feedback from them.
If you have some idea about the way to trigger their interest,
it would be nice.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (12 preceding siblings ...)
2007-03-20 16:09 ` dominiq at lps dot ens dot fr
@ 2007-03-20 16:12 ` rguenth at gcc dot gnu dot org
2007-03-20 16:14 ` pinskia at gcc dot gnu dot org
` (4 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-03-20 16:12 UTC (permalink / raw)
To: gcc-bugs
------- Comment #14 from rguenth at gcc dot gnu dot org 2007-03-20 16:12 -------
The recommended way is to post a message to gcc@gcc.gnu.org or
gcc-patches@gcc.gnu.org
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (13 preceding siblings ...)
2007-03-20 16:12 ` rguenth at gcc dot gnu dot org
@ 2007-03-20 16:14 ` pinskia at gcc dot gnu dot org
2007-03-21 15:36 ` dominiq at lps dot ens dot fr
` (3 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2007-03-20 16:14 UTC (permalink / raw)
To: gcc-bugs
------- Comment #15 from pinskia at gcc dot gnu dot org 2007-03-20 16:14 -------
> Is this also true for complex double on 32 bit architectures (i.e., 4 GPRs)
> or do you mean the GPR is used to pass a pointer?
4 GPRS
Yes this was a stupid decission on Apple's part for not looking at fixing GCC
before setting an ABI.
And really this problem is only with PPC no other target has this stupid ABI
issue.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (14 preceding siblings ...)
2007-03-20 16:14 ` pinskia at gcc dot gnu dot org
@ 2007-03-21 15:36 ` dominiq at lps dot ens dot fr
2007-03-21 15:57 ` rguenth at gcc dot gnu dot org
` (2 subsequent siblings)
18 siblings, 0 replies; 21+ messages in thread
From: dominiq at lps dot ens dot fr @ 2007-03-21 15:36 UTC (permalink / raw)
To: gcc-bugs
------- Comment #16 from dominiq at lps dot ens dot fr 2007-03-21 15:36 -------
> The recommended way is to post a message to gcc@gcc.gnu.org or
> gcc-patches@gcc.gnu.org
I'll follow your advice, but before I'ld like some feedback about what follows.
I have applied the following patch
--- gcc-4.3-20070316/gcc/tree-ssa-math-opts.c Thu Mar 8 20:02:51 2007
+++ gcc-4.3-20070317/gcc/tree-ssa-math-opts.c Tue Mar 20 17:21:16 2007
@@ -704,9 +704,7 @@
gate_cse_sincos (void)
{
/* Make sure we have either sincos or cexp. */
- return (TARGET_HAS_SINCOS
- || TARGET_C99_FUNCTIONS)
- && optimize;
+ return TARGET_HAS_SINCOS && optimize;
}
struct tree_opt_pass pass_cse_sincos =
I have regtested it with no change in the reports for gcc, g++, gfortran,
and objc. The timings before and after are
before after
-O0 -O1 % -O0 -O1 %
g++-4 sincos_o.c 6.2 9.6 +55 6.3 5.6 -11
gfc sincos_o.f90 6.3 9.6 +52 6.4 5.5 -14
for the following C and Fotran tests:
[karma] bug/timing% cat sincos_o.c
#include <math.h>
#include <stdio.h>
int main()
{
long n = 1000000;
long i;
double mo = -1.0;
double pi = acos(mo);
double sc = 0.0;
double ss = 0.0;
double t = 0.0;
double dt = pi/n;
printf("%.17g \n", pi);
printf("%.17g \n", dt);
for (i=0; i< 40*n; i++) {
sc += cos(t);
ss += sin(t);
t += dt;
}
printf("%.17g %.17g \n", sc, ss);
}
[karma] bug/timing% cat sincos_o.f90
integer, parameter :: n=1000000
integer :: i
real(8) :: pi, ss, sc, t, dt
pi = acos(-1.0d0)
dt=pi/n
sc=0
ss=0
t=0
do i= 1, 40*n
sc = sc + cos(t)
ss = ss + sin(t)
t = t+dt
end do
print *, sc, ss
end
So from the PPC Darwin point of view, everything is working as expected.
I have done a search on the regtest list based on
FAIL: gcc.dg/builtins-59.c scan-tree-dump __builtin_cexpi
assuming that platforms that do not pass it, are likely to have not
__builtin_cexpi, thus are exposed to the same bad optimization.
I have found the following list tested on a regular basis:
powerpc-apple-darwin8.5.0
hppa2.0w-hp-hpux11.11
v850-unknown-elf
sparc-unknown-elf
sh-unknown-elf
powerpc-unknown-eabisim
mips-unknown-elf
m32r-unknown-elf
m32c-unknown-elf
avr-unknown-none
frv-unknown-elf
arm-unknown-elf
cris-axis-elf
arm-none-eabi
As far as I can tell, only the first two are tested against g++ and
gfortran.
Note that the list does not include powerpc64-apple-darwin8.8.0.
So it seems that it has __builtin_cexpi.
I don't know what will be the final decision about the proposed patch,
but there is no "emergency" since I can use it for my coming weekly builds.
I would prefer to have some feedback from the listed platforms before
seeing the patch applied.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (15 preceding siblings ...)
2007-03-21 15:36 ` dominiq at lps dot ens dot fr
@ 2007-03-21 15:57 ` rguenth at gcc dot gnu dot org
2007-03-21 16:09 ` dominiq at lps dot ens dot fr
2009-12-04 17:10 ` [Bug middle-end/31249] pseudo-optimization " dominiq at lps dot ens dot fr
18 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-03-21 15:57 UTC (permalink / raw)
To: gcc-bugs
------- Comment #17 from rguenth at gcc dot gnu dot org 2007-03-21 15:57 -------
It would be nice to know whether darwin does not implement cexp in an optimal
way (special casing zero real part and dispatching to a sincos equivalent for
the imaginary part) or if the issue is only a bad ABI for complex values.
Otherwise the proposal looks ok, those other targets are mostly embedded ones
and as such don't care too much about optimized sincos probably.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimzation with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (16 preceding siblings ...)
2007-03-21 15:57 ` rguenth at gcc dot gnu dot org
@ 2007-03-21 16:09 ` dominiq at lps dot ens dot fr
2009-12-04 17:10 ` [Bug middle-end/31249] pseudo-optimization " dominiq at lps dot ens dot fr
18 siblings, 0 replies; 21+ messages in thread
From: dominiq at lps dot ens dot fr @ 2007-03-21 16:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #18 from dominiq at lps dot ens dot fr 2007-03-21 16:09 -------
> It would be nice to know whether darwin does not implement cexp in an optimal way ...
I have forgotten to mention that I did some (quick) profiling: cexp seens
trivially implemented.
It calls sin, cos and exp + the ABI problem mentionned by Andrew. It seems
that any
inplementation of cexp calling sin and cos would at best lead to a draw and
more likely to
a regression in timing. Now I cannot rule out that on some platforms cexp is
implemented
in a clever way, using some kind of sincos alrgorithm (this is what I would
like to know).
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug middle-end/31249] pseudo-optimization with sincos/cexpi
2007-03-17 21:11 [Bug middle-end/31249] New: pseudo-optimzation with sincos/cexpi dominiq at lps dot ens dot fr
` (17 preceding siblings ...)
2007-03-21 16:09 ` dominiq at lps dot ens dot fr
@ 2009-12-04 17:10 ` dominiq at lps dot ens dot fr
18 siblings, 0 replies; 21+ messages in thread
From: dominiq at lps dot ens dot fr @ 2009-12-04 17:10 UTC (permalink / raw)
To: gcc-bugs
--
dominiq at lps dot ens dot fr changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |minor
Summary|pseudo-optimzation with |pseudo-optimization with
|sincos/cexpi |sincos/cexpi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31249
^ permalink raw reply [flat|nested] 21+ messages in thread