public inbox for gcc-bugs@sourceware.org
* [Bug target/18589] New: could optimize FP multiplies better
@ 2004-11-21  9:28 debian-gcc at lists dot debian dot org
  2004-11-21 14:24 ` [Bug target/18589] " pinskia at gcc dot gnu dot org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: debian-gcc at lists dot debian dot org @ 2004-11-21  9:28 UTC (permalink / raw)
  To: gcc-bugs

[forwarded from http://bugs.debian.org/268115]

  Matthias

The bug submitter writes:


compiling this function:
double baz(double foo, double bar)
{
   return foo*foo*foo*foo*bar*bar*bar*bar;
}

 on amd64 with -O6 -ffast-math, gcc emits this code:

foo.o:     file format elf64-x86-64

Disassembly of section .text:

... (some similar functions that I was messing around with) ...
0000000000000050 <ddbar>:
  50:	f2 0f 59 c0          	mulsd  %xmm0,%xmm0
  54:	f2 0f 59 c0          	mulsd  %xmm0,%xmm0
  58:	f2 0f 59 c1          	mulsd  %xmm1,%xmm0
  5c:	f2 0f 59 c1          	mulsd  %xmm1,%xmm0
  60:	f2 0f 59 c1          	mulsd  %xmm1,%xmm0
  64:	f2 0f 59 c1          	mulsd  %xmm1,%xmm0
  68:	c3                   	retq   


 So, it notices that it can do foo*foo*foo*foo with two mulsd instructions,
but it misses the same optimization for bar*bar*bar*bar.  It would save one
FP multiply overall to do:
mulsd %xmm0, %xmm0
mulsd %xmm1, %xmm1
mulsd %xmm0, %xmm0
mulsd %xmm1, %xmm1
mulsd %xmm1, %xmm0
retq
 Also, the two non-dependent muls could run in parallel.
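
For readers who prefer source to assembly, the proposed schedule can be sketched in C. This is only an illustration: the names baz_source_order and baz_five_mul are made up here, and the comments map each statement to the mulsd it is assumed to correspond to. The hand-written version is not guaranteed to be bit-identical to the source-order version for arbitrary inputs, which is why the compiler needs -ffast-math to make this change itself.

```c
/* Source-order evaluation: seven multiplies as written, left to right. */
double baz_source_order(double foo, double bar)
{
    return foo * foo * foo * foo * bar * bar * bar * bar;
}

/* Hypothetical hand-written form of the proposed schedule: five multiplies.
   The two squarings at each level do not depend on each other. */
double baz_five_mul(double foo, double bar)
{
    double f2 = foo * foo;  /* mulsd %xmm0, %xmm0 */
    double b2 = bar * bar;  /* mulsd %xmm1, %xmm1 (independent of f2) */
    double f4 = f2 * f2;    /* mulsd %xmm0, %xmm0 */
    double b4 = b2 * b2;    /* mulsd %xmm1, %xmm1 (independent of f4) */
    return f4 * b4;         /* mulsd %xmm1, %xmm0 */
}
```

For small integer-valued inputs such as foo = 2.0 and bar = 3.0, every intermediate is exactly representable, so both functions return the same value.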
 
 Without -ffast-math, of course, gcc can't take advantage of the laws of
arithmetic like that and has to do all the multiplies the straightforward
way.

-- 
           Summary: could optimize FP multiplies better
           Product: gcc
           Version: 3.4.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: debian-gcc at lists dot debian dot org
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: amd64-linux
  GCC host triplet: amd64-linux
GCC target triplet: amd64-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589



* [Bug target/18589] could optimize FP multiplies better
From: pinskia at gcc dot gnu dot org @ 2004-11-21 14:24 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-11-21 14:24 -------
Confirmed.
Actually the most optimal code would be:
        mulsd   %xmm1, %xmm0
        mulsd   %xmm0, %xmm0
        mulsd   %xmm0, %xmm0
aka
(foo*bar)^4
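
A minimal C sketch of that three-multiply form, assuming the same baz arguments as in the report (the name baz_three_mul is hypothetical, and the instruction comments are the assumed mapping to the sequence above):

```c
/* (foo*bar)^4 computed as ((foo*bar)^2)^2: three dependent multiplies. */
double baz_three_mul(double foo, double bar)
{
    double p  = foo * bar;  /* mulsd %xmm1, %xmm0 */
    double p2 = p * p;      /* mulsd %xmm0, %xmm0 */
    return p2 * p2;         /* mulsd %xmm0, %xmm0 */
}
```

All three multiplies form a single dependency chain, so the chain length matches the five-multiply schedule but with two fewer instructions.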

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
           Keywords|                            |missed-optimization
   Last reconfirmed|0000-00-00 00:00:00         |2004-11-21 14:24:08
               date|                            |





* [Bug tree-optimization/18589] could optimize FP multiplies better
From: pinskia at gcc dot gnu dot org @ 2005-01-12  6:50 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-01-12 06:50 -------
This is actually not a target issue; it can be shown on ppc and on other targets, including x86.
Doing (f1*f2)^2^2 will be the best everywhere, as it is only three instructions. It would take the
same time as what was proposed if there are two FPU units, but what I suggested is the smallest and
fastest version.
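
As the original report notes, this kind of rewrite is only available under -ffast-math. A small sketch of why, assuming IEEE 754 double arithmetic without excess precision (for example SSE2 on amd64) and a build without -ffast-math; the function names are hypothetical:

```c
/* Source-order evaluation, as in the original baz(). */
double pow4_source_order(double f1, double f2)
{
    return f1 * f1 * f1 * f1 * f2 * f2 * f2 * f2;
}

/* Reassociated as (f1*f2)^2^2. */
double pow4_reassociated(double f1, double f2)
{
    double p  = f1 * f2;
    double p2 = p * p;
    return p2 * p2;
}
```

With f1 = 1e100 and f2 = 1e-100, the source-order version overflows to infinity once it reaches f1^4 and stays there, while the reassociated version yields a value very close to 1. The mathematical result is the same, but the floating-point result is not, so the compiler may only reorder when told it is allowed to.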

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |tree-optimization
  GCC build triplet|amd64-linux                 |
   GCC host triplet|amd64-linux                 |
 GCC target triplet|amd64-linux                 |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589



* [Bug tree-optimization/18589] could optimize FP multiplies better
From: pinskia at gcc dot gnu dot org @ 2005-07-05 19:37 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-07-05 19:37 -------
I think PR 22312 describes the current problem with reassoc (well, once I submit the patch to
introduce reassociation for FP).

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
  BugsThisDependsOn|                            |22312


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589


