public inbox for gcc-bugs@sourceware.org
* [Bug target/18589] New: could optimize FP multiplies better
From: debian-gcc at lists dot debian dot org @ 2004-11-21 9:28 UTC
To: gcc-bugs

[forwarded from http://bugs.debian.org/268115]

  Matthias

The bug submitter writes:

compiling this function:

    double baz(double foo, double bar)
    {
        return foo*foo*foo*foo*bar*bar*bar*bar;
    }

on amd64 with -O6 -ffast-math, gcc emits this code:

    foo.o:     file format elf64-x86-64

    Disassembly of section .text:

    ... (some similar functions that I was messing around with) ...

    0000000000000050 <ddbar>:
      50:   f2 0f 59 c0     mulsd  %xmm0,%xmm0
      54:   f2 0f 59 c0     mulsd  %xmm0,%xmm0
      58:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
      5c:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
      60:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
      64:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
      68:   c3              retq

So, it notices that it can do foo*foo*foo*foo with two mulsd
instructions, but it misses the same optimization for bar*bar*bar*bar.
It would save one FP multiply overall to do:

    mulsd  %xmm0, %xmm0
    mulsd  %xmm1, %xmm1
    mulsd  %xmm0, %xmm0
    mulsd  %xmm1, %xmm1
    mulsd  %xmm1, %xmm0
    retq

Also, the two non-dependent muls could run in parallel.

Without -ffast-math, of course, gcc can't take advantage of the laws of
arithmetic like that and has to do all the multiplies the
straightforward way.

-- 
           Summary: could optimize FP multiplies better
           Product: gcc
           Version: 3.4.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: debian-gcc at lists dot debian dot org
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: amd64-linux
  GCC host triplet: amd64-linux
GCC target triplet: amd64-linux

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589
* [Bug target/18589] could optimize FP multiplies better
From: pinskia at gcc dot gnu dot org @ 2004-11-21 14:24 UTC
To: gcc-bugs

------- Additional Comments From pinskia at gcc dot gnu dot org  2004-11-21 14:24 -------
Confirmed. Actually the most optimal code would be:

    mulsd  %xmm1, %xmm0
    mulsd  %xmm0, %xmm0
    mulsd  %xmm0, %xmm0

aka (foo*bar)^4

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
           Keywords|                            |missed-optimization
   Last reconfirmed|0000-00-00 00:00:00         |2004-11-21 14:24:08
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589
* [Bug tree-optimization/18589] could optimize FP multiplies better
From: pinskia at gcc dot gnu dot org @ 2005-01-12 6:50 UTC
To: gcc-bugs

------- Additional Comments From pinskia at gcc dot gnu dot org  2005-01-12 06:50 -------
This is actually not a target issue; it can be shown on ppc and on other
targets, including x86. Doing ((f1*f2)^2)^2 will be the best everywhere,
as it is only three instructions. It would take the same time as what is
proposed if there are two FPUs, but what I said is the smallest and
fastest version.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |tree-optimization
  GCC build triplet|amd64-linux                 |
   GCC host triplet|amd64-linux                 |
 GCC target triplet|amd64-linux                 |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589
* [Bug tree-optimization/18589] could optimize FP multiplies better
From: pinskia at gcc dot gnu dot org @ 2005-07-05 19:37 UTC
To: gcc-bugs

------- Additional Comments From pinskia at gcc dot gnu dot org  2005-07-05 19:37 -------
I think PR 22312 mentions what the current problem with reassoc is (well,
once I submit the patch to introduce reassociation for fp).

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
  BugsThisDependsOn|                            |22312

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18589