public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/50819] New: missed SLP vectorization
@ 2011-10-21  9:29 vincenzo.innocente at cern dot ch
  2011-10-22 12:26 ` [Bug tree-optimization/50819] " irar at il dot ibm.com
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: vincenzo.innocente at cern dot ch @ 2011-10-21  9:29 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50819

             Bug #: 50819
           Summary: missed SLP vectorization
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: vincenzo.innocente@cern.ch


In this example sum2 is vectorized but sum1 is not.
As you may suspect, all current code looks more like sum1.

typedef float Value;

struct LorentzVector
{

  LorentzVector(Value x=0, Value y=0, Value z=0, Value t=0) :
    theX(x),theY(y),theZ(z),theT(t){}
  LorentzVector & operator+=(const LorentzVector & a) {
    theX += a.theX;
    theY += a.theY;
    theZ += a.theZ;
    theT += a.theT;
    return *this;
  }

  Value theX;
  Value theY;
  Value theZ;
  Value theT;
}  __attribute__ ((aligned(16)));

inline LorentzVector
operator+(LorentzVector const & a, LorentzVector const & b) {
  return LorentzVector(a.theX+b.theX,a.theY+b.theY,a.theZ+b.theZ,a.theT+b.theT);
}

inline LorentzVector
operator*(LorentzVector const & a, Value s) {
    return LorentzVector(a.theX*s,a.theY*s,a.theZ*s,a.theT*s);
}

inline LorentzVector
operator*(Value s, LorentzVector const & a) {
  return a*s;
}


void sum1(LorentzVector & res, Value s, LorentzVector const & v1, LorentzVector const & v2) {
  res += s*(v1+v2);
}

void sum2(LorentzVector & res, Value s, LorentzVector const & v1, LorentzVector const & v2) {
  res = res + s*(v1+v2);
}


c++ -O3 -c FourVec.cc
Vincenzos-MacBook-Pro:ctest innocent$ otool -V -t -v -X  FourVec.o | c++filt
sum1(LorentzVector&, float, LorentzVector const&, LorentzVector const&):
    movss    0x0c(%rsi),%xmm1
    movss    0x08(%rsi),%xmm2
    movss    0x04(%rsi),%xmm3
    movss    (%rsi),%xmm4
    addss    0x0c(%rdx),%xmm1
    addss    0x08(%rdx),%xmm2
    addss    0x04(%rdx),%xmm3
    addss    (%rdx),%xmm4
    mulss    %xmm0,%xmm1
    mulss    %xmm0,%xmm2
    mulss    %xmm0,%xmm3
    mulss    %xmm0,%xmm4
    addss    0x0c(%rdi),%xmm1
    addss    0x08(%rdi),%xmm2
    addss    0x04(%rdi),%xmm3
    addss    (%rdi),%xmm4
    movss    %xmm1,0x0c(%rdi)
    movss    %xmm2,0x08(%rdi)
    movss    %xmm3,0x04(%rdi)
    movss    %xmm4,(%rdi)
    ret
    nopl    (%rax)
sum2(LorentzVector&, float, LorentzVector const&, LorentzVector const&):
    movaps    (%rsi),%xmm1
    shufps    $0x0,%xmm0,%xmm0
    addps    (%rdx),%xmm1
    mulps    %xmm1,%xmm0
    addps    (%rdi),%xmm0
    movaps    %xmm0,(%rdi)
    ret
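
For reference, a self-contained check that the two forms compute the same
result, so the difference above is purely one of code generation. This is a
sketch and not part of the original report: the helper same_components is
hypothetical, and the input values suggested below are an assumption, chosen
so that every intermediate result is exactly representable in float.

```cpp
#include <cassert>

typedef float Value;

// Same layout and operators as in the report above.
struct LorentzVector
{
  LorentzVector(Value x=0, Value y=0, Value z=0, Value t=0) :
    theX(x),theY(y),theZ(z),theT(t){}
  LorentzVector & operator+=(const LorentzVector & a) {
    theX += a.theX;
    theY += a.theY;
    theZ += a.theZ;
    theT += a.theT;
    return *this;
  }

  Value theX;
  Value theY;
  Value theZ;
  Value theT;
}  __attribute__ ((aligned(16)));

inline LorentzVector
operator+(LorentzVector const & a, LorentzVector const & b) {
  return LorentzVector(a.theX+b.theX,a.theY+b.theY,a.theZ+b.theZ,a.theT+b.theT);
}

inline LorentzVector
operator*(LorentzVector const & a, Value s) {
  return LorentzVector(a.theX*s,a.theY*s,a.theZ*s,a.theT*s);
}

inline LorentzVector
operator*(Value s, LorentzVector const & a) {
  return a*s;
}

// Compound-assignment form: the one that was not vectorized.
void sum1(LorentzVector & res, Value s, LorentzVector const & v1, LorentzVector const & v2) {
  res += s*(v1+v2);
}

// Plain-assignment form: the one that was vectorized.
void sum2(LorentzVector & res, Value s, LorentzVector const & v1, LorentzVector const & v2) {
  res = res + s*(v1+v2);
}

// Hypothetical helper (not in the report): exact component-wise comparison.
bool same_components(LorentzVector const & a, LorentzVector const & b) {
  return a.theX == b.theX && a.theY == b.theY &&
         a.theZ == b.theZ && a.theT == b.theT;
}
```

With res = (1,2,3,4), s = 2, v1 = (1,2,3,4) and v2 = (0.5,0.5,0.5,0.5), both
functions should leave res at (4,7,10,13).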



* [Bug tree-optimization/50819] missed SLP vectorization
From: irar at il dot ibm.com @ 2011-10-22 12:26 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50819

Ira Rosen <irar at il dot ibm.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2011-10-22
                 CC|                            |irar at il dot ibm.com
         AssignedTo|unassigned at gcc dot       |irar at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1

--- Comment #1 from Ira Rosen <irar at il dot ibm.com> 2011-10-22 12:25:30 UTC ---
SLP data dependence analysis checks that all the loads are before all the
stores in the basic block, and for sum1 we get

  res_4(D)->theX = D.2306_31;
  D.2305_32 = res_4(D)->theY;
  D.2303_34 = D.2289_26 + D.2305_32;
  res_4(D)->theY = D.2303_34;
  D.2302_35 = res_4(D)->theZ;
  D.2300_37 = D.2287_24 + D.2302_35;
  res_4(D)->theZ = D.2300_37;
  D.2299_38 = res_4(D)->theT;
  D.2297_40 = D.2285_22 + D.2299_38;
  res_4(D)->theT = D.2297_40;

while for sum2 the loads and stores are not mixed:

  D.2391_29 = MEM[(const struct LorentzVector &)res_4(D)].theT;
  D.2389_31 = D.2365_22 + D.2391_29;
  D.2388_32 = MEM[(const struct LorentzVector &)res_4(D)].theZ;
  D.2386_34 = D.2367_24 + D.2388_32;
  D.2385_35 = MEM[(const struct LorentzVector &)res_4(D)].theY;
  D.2383_37 = D.2369_26 + D.2385_35;
  D.2382_38 = MEM[(const struct LorentzVector &)res_4(D)].theX;
  D.2380_40 = D.2371_28 + D.2382_38;
  res_4(D)->theX = D.2380_40;
  res_4(D)->theY = D.2383_37;
  res_4(D)->theZ = D.2386_34;
  res_4(D)->theT = D.2389_31;

The attached patch relaxes the above check a bit. Since we go through all the
ddrs anyway, we can check the order between the loads and the stores in
vect_analyze_data_ref_dependence. We don't care about independent load-store
pairs, so we only need to add this check to the pairs with unknown dependence.
(Known dependencies are already checked in vect_drs_dependent_in_basic_block).
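
To illustrate the difference between the old and the relaxed check, here is a
toy model. It is only a sketch and does not use GCC's data structures: the
types Access and Dep, the alias rule in classify, and the function names are
all assumptions made for illustration. Known dependencies are not modelled,
since they are handled by the separate check mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: none of these types exist in GCC.
enum class Dep { Independent, Unknown, Known };

struct Access {
  bool is_store;  // false means load
  int base;       // which reference is accessed: 0 = res, 1 = v1, 2 = v2
  int field;      // 0..3 for theX..theT
};

// Assumed alias rule: different fields of the same reference never overlap,
// the same field of the same reference is a known dependence, and accesses
// through different references may alias (unknown dependence).
Dep classify(const Access& a, const Access& b) {
  if (a.base != b.base) return Dep::Unknown;
  return a.field == b.field ? Dep::Known : Dep::Independent;
}

// Old rule: every load in the block must precede every store.
bool all_loads_before_stores(const std::vector<Access>& bb) {
  bool seen_store = false;
  for (const Access& acc : bb) {
    if (acc.is_store) seen_store = true;
    else if (seen_store) return false;  // a load after some store
  }
  return true;
}

// Relaxed rule: only a load that follows a store with which it has an
// unknown dependence blocks vectorization. Independent pairs are ignored;
// known dependencies are left to a separate check, not modelled here.
bool relaxed_check(const std::vector<Access>& bb) {
  for (std::size_t i = 0; i < bb.size(); ++i)
    for (std::size_t j = i + 1; j < bb.size(); ++j)
      if (bb[i].is_store && !bb[j].is_store &&
          classify(bb[i], bb[j]) == Dep::Unknown)
        return false;
  return true;
}

// Access order resembling sum1: loads of v1 and v2 first, then the load and
// store of each field of res interleaved.
std::vector<Access> sum1_like() {
  std::vector<Access> bb;
  for (int f = 0; f < 4; ++f) {
    bb.push_back({false, 1, f});
    bb.push_back({false, 2, f});
  }
  for (int f = 0; f < 4; ++f) {
    bb.push_back({false, 0, f});
    bb.push_back({true, 0, f});
  }
  return bb;
}

// Access order resembling sum2: all loads first, then all stores to res.
std::vector<Access> sum2_like() {
  std::vector<Access> bb;
  for (int f = 0; f < 4; ++f) {
    bb.push_back({false, 1, f});
    bb.push_back({false, 2, f});
  }
  for (int f = 0; f < 4; ++f) bb.push_back({false, 0, f});
  for (int f = 0; f < 4; ++f) bb.push_back({true, 0, f});
  return bb;
}
```

In this model the sum1-like order fails the old rule but passes the relaxed
one, because each load that follows a store touches a different field of the
same reference. A store to res followed by a load through v1 would still be
rejected by the relaxed rule.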



* [Bug tree-optimization/50819] missed SLP vectorization
From: irar at il dot ibm.com @ 2011-10-22 12:28 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50819

--- Comment #2 from Ira Rosen <irar at il dot ibm.com> 2011-10-22 12:27:51 UTC ---
Created attachment 25574
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25574
Patch for this PR and also for PR 50730



* [Bug tree-optimization/50819] missed SLP vectorization
From: vincenzo.innocente at cern dot ch @ 2011-10-22 14:50 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50819

--- Comment #3 from vincenzo Innocente <vincenzo.innocente at cern dot ch> 2011-10-22 14:50:01 UTC ---
Excellent!
Thanks, Ira, for the fast fix.
It does work. No side effects at first look.



* [Bug tree-optimization/50819] missed SLP vectorization
From: irar at gcc dot gnu.org @ 2011-10-23 12:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50819

--- Comment #4 from irar at gcc dot gnu.org 2011-10-23 12:13:57 UTC ---
Author: irar
Date: Sun Oct 23 12:13:49 2011
New Revision: 180334

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=180334
Log:

        PR tree-optimization/50819
        * tree-vectorizer.h (vect_analyze_data_ref_dependences): Remove
        the last argument.
        * tree-vect-loop.c (vect_analyze_loop_2): Update call to
        vect_analyze_data_ref_dependences.
        * tree-vect-data-refs.c (vect_analyze_data_ref_dependence):
        Remove the last argument.  Check load-after-store dependence
        for unknown dependencies in basic blocks.
        (vect_analyze_data_ref_dependences): Update call to
        vect_analyze_data_ref_dependences.
        * tree-vect-patterns.c (vect_recog_widen_shift_pattern): Fix
        typo.
        * tree-vect-slp.c (vect_bb_vectorizable_with_dependencies):
        Remove.
        (vect_slp_analyze_bb_1): Update call to
        vect_analyze_data_ref_dependences.  Don't call
        vect_bb_vectorizable_with_dependencies.


Added:
    trunk/gcc/testsuite/g++.dg/vect/slp-pr50819.cc
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/g++.dg/vect/vect.exp
    trunk/gcc/tree-vect-data-refs.c
    trunk/gcc/tree-vect-loop.c
    trunk/gcc/tree-vect-patterns.c
    trunk/gcc/tree-vect-slp.c
    trunk/gcc/tree-vectorizer.h



* [Bug tree-optimization/50819] missed SLP vectorization
From: irar at il dot ibm.com @ 2011-11-03  8:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50819

Ira Rosen <irar at il dot ibm.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #5 from Ira Rosen <irar at il dot ibm.com> 2011-11-03 08:50:48 UTC ---
Fixed.

