public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/40168] missing unrolling/scalarization/reassoc/free
From: Joost.VandeVondele at mat dot ethz.ch @ 2013-03-29 10:51 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40168

Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2009-12-18 14:45:13         |2013-03-29
                 CC|                            |Joost.VandeVondele at mat
                   |                            |dot ethz.ch

--- Comment #21 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2013-03-29 10:50:49 UTC ---
So, the testcase in comment #14 is indeed still (4.9.0) yielding the 324
multiplies for subroutine S2, instead of the more optimal 192 as shown in S1.
Ifort also results in 324 multiplies, but is able to do a couple of them with
mulpd instead of mulsd. So the common subexpressions are still not found.
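To make the multiply-count argument concrete, here is a minimal C sketch of how sharing a common subexpression lowers the number of multiplications. It is not the testcase from comment #14 (which is not reproduced in this thread), and it does not reproduce the 324 and 192 figures; the names `counted_mul`, `products_naive`, `products_shared` and `check_same` are hypothetical and exist only for this illustration.

```c
#include <assert.h>

enum { N = 3 };

/* Hypothetical instrumentation: counts every multiplication performed. */
static long mul_count;

static double counted_mul(double x, double y)
{
    mul_count++;
    return x * y;
}

/* Naive form: a[i] * b[j] is recomputed for every k (2 * N^3 multiplies). */
static long products_naive(const double a[N], const double b[N],
                           const double c[N], double out[N][N][N])
{
    mul_count = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                out[i][j][k] = counted_mul(counted_mul(a[i], b[j]), c[k]);
    return mul_count;
}

/* Shared form: a[i] * b[j] is computed once per (i, j) and reused
   for every k (N^2 + N^3 multiplies). */
static long products_shared(const double a[N], const double b[N],
                            const double c[N], double out[N][N][N])
{
    mul_count = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double ab = counted_mul(a[i], b[j]);
            for (int k = 0; k < N; k++)
                out[i][j][k] = counted_mul(ab, c[k]);
        }
    return mul_count;
}

/* Both forms evaluate (a[i] * b[j]) * c[k] in the same order,
   so the results are expected to match exactly. */
static void check_same(double x[N][N][N], double y[N][N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                assert(x[i][j][k] == y[i][j][k]);
}
```

With N = 3 the naive form performs 54 counted multiplies and the shared form 36. The bug report concerns the compiler failing to find this kind of sharing on its own in the unrolled Fortran code.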
* [Bug tree-optimization/40168] finding common subexpressions
From: pinskia at gcc dot gnu.org @ 2021-12-25 7:01 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40168

--- Comment #22 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
With the trunk on the second testcase, SLP works but we get stuff like:

  _2 = (*b_420(D))[80];
  _4 = (*b_420(D))[79];
  _538 = {_2, _4};
  _980 = _538 * vect__1.182_559;

Shouldn't that just be loading a vector from (*b_420(D))[79] and then doing
a VEC_PERM?

In a reduced C testcase we get the correct thing:

typedef double array[1000];
void f(array *a, array *b, array *c)
{
  double t = (*a)[1] * (*b)[0];
  double t1 = (*a)[0] * (*b)[0];
  double t2 = (*a)[3] * (*b)[1];
  double t3 = (*a)[2] * (*b)[1];
  (*c)[0] = t;
  (*c)[1] = t1;
  (*c)[2] = t2;
  (*c)[3] = t3;
}
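The question in comment #22 can be modelled in portable C. The sketch below contrasts the two ways of forming the vector {b[80], b[79]}: assembling it from two scalar loads, as the dump shows, versus one contiguous two-element load starting at b[79] followed by a lane swap, which is the role a VEC_PERM would play. The `vec2` struct and both function names are hypothetical stand-ins for illustration, not GCC internals or compiler output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a two-lane vector of doubles. */
typedef struct { double lane[2]; } vec2;

/* What the dump shows: two separate scalar loads assembled
   into the vector {b[i + 1], b[i]}. */
static vec2 build_from_scalars(const double *b, size_t i)
{
    vec2 v = { { b[i + 1], b[i] } };
    return v;
}

/* The suggested alternative: one contiguous load of b[i] and b[i + 1],
   then a permute that swaps the two lanes (selector {1, 0}). */
static vec2 load_then_permute(const double *b, size_t i)
{
    vec2 loaded, v;

    /* The model relies on vec2 having no padding. */
    assert(sizeof(vec2) == 2 * sizeof(double));
    memcpy(&loaded, &b[i], sizeof loaded);

    v.lane[0] = loaded.lane[1];
    v.lane[1] = loaded.lane[0];
    return v;
}
```

Calling both functions with i = 79 yields the same two lanes, which is why the single-load-plus-permute form looks like a valid replacement for the pair of scalar loads in the dump.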