public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/50819] New: missed SLP vectorization
From: vincenzo.innocente at cern dot ch @ 2011-10-21 9:29 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50819
Bug #: 50819
Summary: missed SLP vectorization
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: vincenzo.innocente@cern.ch
In this example sum2 is vectorized but sum1 is not.
As you might suspect, most existing code looks more like sum1…
typedef float Value;

struct LorentzVector
{
  LorentzVector(Value x=0, Value y=0, Value z=0, Value t=0) :
    theX(x),theY(y),theZ(z),theT(t){}

  LorentzVector & operator+=(const LorentzVector & a) {
    theX += a.theX;
    theY += a.theY;
    theZ += a.theZ;
    theT += a.theT;
    return *this;
  }

  Value theX;
  Value theY;
  Value theZ;
  Value theT;
} __attribute__ ((aligned(16)));

inline LorentzVector
operator+(LorentzVector const & a, LorentzVector const & b) {
  return LorentzVector(a.theX+b.theX,a.theY+b.theY,a.theZ+b.theZ,a.theT+b.theT);
}

inline LorentzVector
operator*(LorentzVector const & a, Value s) {
  return LorentzVector(a.theX*s,a.theY*s,a.theZ*s,a.theT*s);
}

inline LorentzVector
operator*(Value s, LorentzVector const & a) {
  return a*s;
}

void sum1(LorentzVector & res, Value s,
          LorentzVector const & v1, LorentzVector const & v2) {
  res += s*(v1+v2);
}

void sum2(LorentzVector & res, Value s,
          LorentzVector const & v1, LorentzVector const & v2) {
  res = res + s*(v1+v2);
}

c++ -O3 -c FourVec.cc
Vincenzos-MacBook-Pro:ctest innocent$ otool -V -t -v -X FourVec.o | c++filt
sum1(LorentzVector&, float, LorentzVector const&, LorentzVector const&):
movss 0x0c(%rsi),%xmm1
movss 0x08(%rsi),%xmm2
movss 0x04(%rsi),%xmm3
movss (%rsi),%xmm4
addss 0x0c(%rdx),%xmm1
addss 0x08(%rdx),%xmm2
addss 0x04(%rdx),%xmm3
addss (%rdx),%xmm4
mulss %xmm0,%xmm1
mulss %xmm0,%xmm2
mulss %xmm0,%xmm3
mulss %xmm0,%xmm4
addss 0x0c(%rdi),%xmm1
addss 0x08(%rdi),%xmm2
addss 0x04(%rdi),%xmm3
addss (%rdi),%xmm4
movss %xmm1,0x0c(%rdi)
movss %xmm2,0x08(%rdi)
movss %xmm3,0x04(%rdi)
movss %xmm4,(%rdi)
ret
nopl (%rax)
sum2(LorentzVector&, float, LorentzVector const&, LorentzVector const&):
movaps (%rsi),%xmm1
shufps $0x0,%xmm0,%xmm0
addps (%rdx),%xmm1
mulps %xmm1,%xmm0
addps (%rdi),%xmm0
movaps %xmm0,(%rdi)
ret
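
Editorial note: sum1 and sum2 compute the same values, so the difference in the listings above is purely one of code generation. The following is a minimal sketch that checks this, assuming the LorentzVector definition from the report. The helpers `same` and `check_sums_agree` are hypothetical and are not part of the original test case; the input values are chosen to be exactly representable in float.

```cpp
#include <cassert>

typedef float Value;

// Same layout as in the report (GCC/Clang attribute syntax).
struct LorentzVector {
    LorentzVector(Value x = 0, Value y = 0, Value z = 0, Value t = 0)
        : theX(x), theY(y), theZ(z), theT(t) {}

    LorentzVector& operator+=(const LorentzVector& a) {
        theX += a.theX;
        theY += a.theY;
        theZ += a.theZ;
        theT += a.theT;
        return *this;
    }

    Value theX, theY, theZ, theT;
} __attribute__((aligned(16)));

inline LorentzVector operator+(const LorentzVector& a, const LorentzVector& b) {
    return LorentzVector(a.theX + b.theX, a.theY + b.theY,
                         a.theZ + b.theZ, a.theT + b.theT);
}

inline LorentzVector operator*(const LorentzVector& a, Value s) {
    return LorentzVector(a.theX * s, a.theY * s, a.theZ * s, a.theT * s);
}

inline LorentzVector operator*(Value s, const LorentzVector& a) { return a * s; }

// Compound-assignment form (not vectorized in the report).
void sum1(LorentzVector& res, Value s,
          const LorentzVector& v1, const LorentzVector& v2) {
    res += s * (v1 + v2);
}

// Plain-assignment form (vectorized in the report).
void sum2(LorentzVector& res, Value s,
          const LorentzVector& v1, const LorentzVector& v2) {
    res = res + s * (v1 + v2);
}

// Hypothetical helper: exact component-wise comparison.
bool same(const LorentzVector& a, const LorentzVector& b) {
    return a.theX == b.theX && a.theY == b.theY &&
           a.theZ == b.theZ && a.theT == b.theT;
}

// Hypothetical self-check: both forms must yield the same result.
void check_sums_agree() {
    LorentzVector r1(1, 2, 3, 4), r2(1, 2, 3, 4);
    LorentzVector v1(1, 2, 3, 4), v2(0.5f, 0.5f, 0.5f, 0.5f);
    sum1(r1, 2.0f, v1, v2);
    sum2(r2, 2.0f, v1, v2);
    assert(same(r1, r2));
}
```

Calling `check_sums_agree()` from a test driver should pass silently at any optimization level. It says nothing about whether either function was vectorized; that still has to be read from the generated assembly as above.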
* [Bug tree-optimization/50819] missed SLP vectorization
From: irar at il dot ibm.com @ 2011-10-22 12:26 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50819
Ira Rosen <irar at il dot ibm.com> changed:
What             |Removed                       |Added
----------------------------------------------------------------------------
Status           |UNCONFIRMED                   |ASSIGNED
Last reconfirmed |                              |2011-10-22
CC               |                              |irar at il dot ibm.com
AssignedTo       |unassigned at gcc dot gnu.org |irar at gcc dot gnu.org
Ever Confirmed   |0                             |1
--- Comment #1 from Ira Rosen <irar at il dot ibm.com> 2011-10-22 12:25:30 UTC ---
SLP data dependence analysis checks that all the loads come before all the
stores in the basic block, and for sum1 we get interleaved loads and stores:
res_4(D)->theX = D.2306_31;
D.2305_32 = res_4(D)->theY;
D.2303_34 = D.2289_26 + D.2305_32;
res_4(D)->theY = D.2303_34;
D.2302_35 = res_4(D)->theZ;
D.2300_37 = D.2287_24 + D.2302_35;
res_4(D)->theZ = D.2300_37;
D.2299_38 = res_4(D)->theT;
D.2297_40 = D.2285_22 + D.2299_38;
res_4(D)->theT = D.2297_40;
while for sum2 the loads and stores are not mixed:
D.2391_29 = MEM[(const struct LorentzVector &)res_4(D)].theT;
D.2389_31 = D.2365_22 + D.2391_29;
D.2388_32 = MEM[(const struct LorentzVector &)res_4(D)].theZ;
D.2386_34 = D.2367_24 + D.2388_32;
D.2385_35 = MEM[(const struct LorentzVector &)res_4(D)].theY;
D.2383_37 = D.2369_26 + D.2385_35;
D.2382_38 = MEM[(const struct LorentzVector &)res_4(D)].theX;
D.2380_40 = D.2371_28 + D.2382_38;
res_4(D)->theX = D.2380_40;
res_4(D)->theY = D.2383_37;
res_4(D)->theZ = D.2386_34;
res_4(D)->theT = D.2389_31;
The attached patch relaxes the above check a bit. Since we go through all the
ddrs anyway, we can check the order between the loads and the stores in
vect_analyze_data_ref_dependence. We don't care about independent load-store
pairs, so we only need to add this check to the pairs with unknown dependence.
(Known dependencies are already checked in vect_drs_dependent_in_basic_block).
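
Editorial note: the difference between the old and the relaxed rule can be illustrated with a toy model. This is only a sketch under stated assumptions: the types `Access`, `Dep` and `Pair` and both functions are hypothetical, they do not correspond to GCC's internal data structures or to the attached patch, and known dependences (handled elsewhere, as noted above) are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model, not GCC's data structures: one memory access in a basic block.
// Accesses are stored in statement order.
struct Access {
    bool is_store;  // false means load
};

// Result of dependence analysis for a pair of accesses (toy version).
enum class Dep { Independent, Unknown };

// A pair of accesses identified by position, with first < second.
struct Pair {
    std::size_t first;
    std::size_t second;
    Dep dep;
};

// Old rule: every load in the block must precede every store.
bool loads_before_stores(const std::vector<Access>& bb) {
    bool seen_store = false;
    for (const Access& a : bb) {
        if (a.is_store)
            seen_store = true;
        else if (seen_store)
            return false;  // a load follows some store
    }
    return true;
}

// Relaxed rule: only pairs with unknown dependence are inspected, and
// such a pair is rejected when a load follows a store.
bool relaxed_ok(const std::vector<Access>& bb, const std::vector<Pair>& pairs) {
    for (const Pair& p : pairs) {
        assert(p.first < p.second && p.second < bb.size());
        if (p.dep != Dep::Unknown)
            continue;  // independent pairs do not matter
        if (bb[p.first].is_store && !bb[p.second].is_store)
            return false;  // load after store with unknown dependence
    }
    return true;
}
```

For a sum1-like block (loads from v1 and v2, then load/store of res.theX followed by load/store of res.theY), the old rule rejects the block because the load of res.theY follows the store to res.theX. The relaxed rule accepts it as long as that store/load pair is known to be independent and all loads with unknown dependence on the stores come first.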
* [Bug tree-optimization/50819] missed SLP vectorization
From: irar at il dot ibm.com @ 2011-10-22 12:28 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50819
--- Comment #2 from Ira Rosen <irar at il dot ibm.com> 2011-10-22 12:27:51 UTC ---
Created attachment 25574
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25574
Patch for this PR and also for PR 50730
* [Bug tree-optimization/50819] missed SLP vectorization
From: vincenzo.innocente at cern dot ch @ 2011-10-22 14:50 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50819
--- Comment #3 from vincenzo Innocente <vincenzo.innocente at cern dot ch> 2011-10-22 14:50:01 UTC ---
Excellent!
Thanks, Ira, for the fast fix.
It works, with no side effects at first glance.
* [Bug tree-optimization/50819] missed SLP vectorization
From: irar at gcc dot gnu.org @ 2011-10-23 12:14 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50819
--- Comment #4 from irar at gcc dot gnu.org 2011-10-23 12:13:57 UTC ---
Author: irar
Date: Sun Oct 23 12:13:49 2011
New Revision: 180334
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=180334
Log:
PR tree-optimization/50819
* tree-vectorizer.h (vect_analyze_data_ref_dependences): Remove
the last argument.
* tree-vect-loop.c (vect_analyze_loop_2): Update call to
vect_analyze_data_ref_dependences.
* tree-vect-data-refs.c (vect_analyze_data_ref_dependence):
Remove the last argument. Check load-after-store dependence
for unknown dependencies in basic blocks.
(vect_analyze_data_ref_dependences): Update call to
vect_analyze_data_ref_dependences.
* tree-vect-patterns.c (vect_recog_widen_shift_pattern): Fix
typo.
* tree-vect-slp.c (vect_bb_vectorizable_with_dependencies):
Remove.
(vect_slp_analyze_bb_1): Update call to
vect_analyze_data_ref_dependences. Don't call
vect_bb_vectorizable_with_dependencies.
Added:
trunk/gcc/testsuite/g++.dg/vect/slp-pr50819.cc
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/g++.dg/vect/vect.exp
trunk/gcc/tree-vect-data-refs.c
trunk/gcc/tree-vect-loop.c
trunk/gcc/tree-vect-patterns.c
trunk/gcc/tree-vect-slp.c
trunk/gcc/tree-vectorizer.h
* [Bug tree-optimization/50819] missed SLP vectorization
From: irar at il dot ibm.com @ 2011-11-03 8:51 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50819
Ira Rosen <irar at il dot ibm.com> changed:
What       |Removed   |Added
----------------------------------------------------------------------------
Status     |ASSIGNED  |RESOLVED
Resolution |          |FIXED
--- Comment #5 from Ira Rosen <irar at il dot ibm.com> 2011-11-03 08:50:48 UTC ---
Fixed.