public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/52112] New: Vectorizer fails when using CRTP
From: ddesics at gmail dot com @ 2012-02-03 17:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52112

             Bug #: 52112
           Summary: Vectorizer fails when using CRTP
    Classification: Unclassified
           Product: gcc
           Version: 4.6.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: ddesics@gmail.com


Created attachment 26565
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26565
The test case code.

Using the CRTP along with static_cast pointers prevents auto-vectorization.
If the code below is compiled with -ftree-vectorizer-verbose=7, the CRTP method
fails to vectorize with "not vectorized: control flow in loop."  However, if
compiled with -O3 -S -fno-tree-vectorize, both methods produce identical
assembly.

I don't know how difficult this would be to change, but it could certainly
speed up a lot of C++ code.  For instance, this currently prevents Boost uBLAS
from vectorizing.

#include<iostream>

template<typename E, typename Tp> class CRTP_base {
  public:
   typedef E& reference;
   typedef Tp value_type;

   reference operator()() { return *static_cast<E*>(this); }
   value_type square() { return (*this)().x() * (*this)().x(); }
  protected:
   CRTP_base() {}
   ~CRTP_base() {}
};

template<typename Tp> class CRTP_child : public CRTP_base<CRTP_child<Tp>,Tp> {
   Tp xval;
   typedef CRTP_base<CRTP_child<Tp>,Tp> parent;
  public:
   CRTP_child(Tp xv = Tp()) : xval(xv) {}
   Tp x() { return xval; }
   using parent::square;
};

int main() {
  const int N = 100;
  double A[N] __attribute__((aligned(16)));
  double B[N] __attribute__((aligned(16)));
  double sum1=0.0;

  for(int i = 0; i < N; ++i) { A[i] = i; }

  for(int i = 0; i < N; ++i) { B[i] = A[i]*A[i]; }
  for(int i = 0; i < N; ++i) { sum1 += B[i]; }
  std::cout << "Sum of method 1: " << sum1;

  for(int i = 0; i < N; ++i) { B[i] = CRTP_child<double>(A[i]).square(); }
  double sum2=0.0;
  for(int i = 0; i < N; ++i) { sum2 += B[i]; }
  std::cout << "\nSum of method 2: " << sum2 << std::endl;
  return 0;
}



* [Bug tree-optimization/52112] Vectorizer fails when using CRTP
From: pinskia at gcc dot gnu.org @ 2012-02-03 17:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52112

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-02-03 17:40:23 UTC ---
This is vectorized for me on x86_64 on the trunk.



* [Bug tree-optimization/52112] Vectorizer fails when using CRTP
From: pinskia at gcc dot gnu.org @ 2012-02-03 17:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52112

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-02-03 17:43:41 UTC ---
Even in 4.4, both multiplication loops are vectorized.



* [Bug tree-optimization/52112] Vectorizer fails when using CRTP
From: pinskia at gcc dot gnu.org @ 2012-02-03 17:45 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52112

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-02-03 17:44:36 UTC ---
<bb 10>:
  vect_var_.146 = MEM[base: &A, index: ivtmp.196];
  MEM[base: &B, index: ivtmp.196] = [mult_expr] vect_var_.146 * vect_var_.146;
  ivtmp.196 = ivtmp.196 + 16;
  if (ivtmp.196 != 800)
    goto <bb 10>;
  else
    goto <bb 11>;

This is what we get with 4.4, which is clearly vectorized.



* [Bug tree-optimization/52112] Vectorizer fails when using CRTP
From: ddesics at gmail dot com @ 2012-02-03 18:25 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52112

--- Comment #4 from Daniel Davis <ddesics at gmail dot com> 2012-02-03 18:24:49 UTC ---
Any thoughts on why it won't vectorize for me on x86_64 with 4.6.1?



* [Bug tree-optimization/52112] Vectorizer fails when using CRTP
From: jakub at gcc dot gnu.org @ 2012-02-03 19:01 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52112

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |jakub at gcc dot gnu.org
         Resolution|                            |FIXED

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-02-03 19:00:44 UTC ---
Before http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181014
only 2 loops are vectorized; after it, 3.  With -Ofast, 4 loops are vectorized
before that change and 5 after it (the double reduction needs -ffast-math
to be vectorized).

So fixed for 4.7.

