public inbox for gcc-bugs@sourceware.org
* [Bug target/18589] New: could optimize FP multiplies better
@ 2004-11-21 9:28 debian-gcc at lists dot debian dot org
2004-11-21 14:24 ` [Bug target/18589] " pinskia at gcc dot gnu dot org
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: debian-gcc at lists dot debian dot org @ 2004-11-21 9:28 UTC (permalink / raw)
To: gcc-bugs
[forwarded from http://bugs.debian.org/268115]
Matthias
The bug submitter writes:
compiling this function:
double baz(double foo, double bar)
{
    return foo*foo*foo*foo*bar*bar*bar*bar;
}
on amd64 with -O6 -ffast-math, gcc emits this code:
foo.o: file format elf64-x86-64
Disassembly of section .text:
... (some similar functions that I was messing around with) ...
0000000000000050 <ddbar>:
  50:   f2 0f 59 c0     mulsd  %xmm0,%xmm0
  54:   f2 0f 59 c0     mulsd  %xmm0,%xmm0
  58:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
  5c:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
  60:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
  64:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
  68:   c3              retq
So, it notices that it can do foo*foo*foo*foo with two mulsd instructions,
but it misses the same optimization for bar*bar*bar*bar. It would save one
FP multiply overall to do:
mulsd %xmm0, %xmm0
mulsd %xmm1, %xmm1
mulsd %xmm0, %xmm0
mulsd %xmm1, %xmm1
mulsd %xmm1, %xmm0
retq
Also, the two non-dependent muls could run in parallel.
Without -ffast-math, of course, gcc can't take advantage of the laws of
arithmetic like that and has to do all the multiplies the straightforward
way.
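
As a rough source-level sketch of the reassociation described above (not something gcc emits; the names baz_naive and baz_squares are made up for illustration), the two forms could be written as follows. The second needs five multiplies instead of seven, and its first two multiplies do not depend on each other. For general inputs the two may differ in the last bits, which is why the rewrite is only legal under -ffast-math.

```c
/* Straightforward left-to-right form: seven multiplies. */
double baz_naive(double foo, double bar)
{
    return foo * foo * foo * foo * bar * bar * bar * bar;
}

/* Hypothetical hand-reassociated form: five multiplies.
 * foo2 and bar2 are independent, as are foo4 and bar4. */
double baz_squares(double foo, double bar)
{
    double foo2 = foo * foo;    /* foo^2 */
    double bar2 = bar * bar;    /* bar^2 */
    double foo4 = foo2 * foo2;  /* foo^4 */
    double bar4 = bar2 * bar2;  /* bar^4 */
    return foo4 * bar4;         /* foo^4 * bar^4 */
}
```

For small integer-valued inputs every intermediate is exactly representable, so both functions return the same value.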
--
Summary: could optimize FP multiplies better
Product: gcc
Version: 3.4.2
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: debian-gcc at lists dot debian dot org
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: amd64-linux
GCC host triplet: amd64-linux
GCC target triplet: amd64-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589
* [Bug target/18589] could optimize FP multiplies better
From: pinskia at gcc dot gnu dot org @ 2004-11-21 14:24 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-11-21 14:24 -------
Confirmed.
Actually, the optimal code would be:
mulsd %xmm1, %xmm0
mulsd %xmm0, %xmm0
mulsd %xmm0, %xmm0
aka
(foo*bar)^4
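
A minimal C sketch of that three-multiply form, assuming -ffast-math semantics so the reordering is permitted (the name baz_pow4 is hypothetical):

```c
/* (foo*bar)^4 computed with three multiplies by repeated squaring. */
double baz_pow4(double foo, double bar)
{
    double p  = foo * bar;  /* foo*bar     */
    double p2 = p * p;      /* (foo*bar)^2 */
    return p2 * p2;         /* (foo*bar)^4 */
}
```

Each multiply depends on the previous one, so the dependency chain is three multiplies long.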
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
           Keywords|                            |missed-optimization
   Last reconfirmed|0000-00-00 00:00:00         |2004-11-21 14:24:08
               date|                            |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589
* [Bug tree-optimization/18589] could optimize FP multiplies better
From: pinskia at gcc dot gnu dot org @ 2005-01-12 6:50 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-01-12 06:50 -------
This is actually not a target issue; it can be shown on ppc and other targets, including x86.
Doing ((f1*f2)^2)^2 will be best everywhere, as it is only three instructions. It would take the
same time as what is proposed if there are two FPU units (both have a dependency chain of three
multiplies), but what I suggested is the smallest and fastest version.
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |tree-optimization
  GCC build triplet|amd64-linux                 |
   GCC host triplet|amd64-linux                 |
 GCC target triplet|amd64-linux                 |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589
* [Bug tree-optimization/18589] could optimize FP multiplies better
From: pinskia at gcc dot gnu dot org @ 2005-07-05 19:37 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-07-05 19:37 -------
I think PR 22312 describes the current problem with reassoc (well, once I submit the patch to
introduce reassociation for FP).
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
  BugsThisDependsOn|                            |22312
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589