public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/52112] New: Vectorizer fails when using CRTP
From: ddesics at gmail dot com @ 2012-02-03 17:07 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52112

             Bug #: 52112
           Summary: Vectorizer fails when using CRTP
    Classification: Unclassified
           Product: gcc
           Version: 4.6.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: ddesics@gmail.com

Created attachment 26565
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26565
The test case code.

Using the CRTP along with static_cast pointers prevents auto-vectorization. If
the code below is compiled with -ftree-vectorizer-verbose=7, the CRTP method
fails to vectorize with "not vectorized: control flow in loop." However, if
compiled with -O3 -S -fno-tree-vectorize, both methods produce identical
assembly.

I don't know how difficult this would be to change, but it could certainly
speed up a lot of c++ code. For instance, this currently prevents boost ublas
from vectorizing.

#include <iostream>

// CRTP base: E is the derived type, Tp is the value type.
template<typename E, typename Tp>
class CRTP_base
{
public:
  typedef E& reference;
  typedef Tp value_type;

  // Downcast *this to the derived type.
  reference operator()() { return *static_cast<E*>(this); }

  value_type square() { return (*this)().x() * (*this)().x(); }

protected:
  CRTP_base() {}
  ~CRTP_base() {}
};

template<typename Tp>
class CRTP_child : public CRTP_base<CRTP_child<Tp>,Tp>
{
  Tp xval;
  typedef CRTP_base<CRTP_child<Tp>,Tp> parent;

public:
  CRTP_child(Tp xv = Tp()) : xval(xv) {}
  Tp x() { return xval; }
  using parent::square;
};

int main()
{
  const int N = 100;
  double A[N] __attribute__((aligned(16)));
  double B[N] __attribute__((aligned(16)));
  double sum1 = 0.0;
  double sum2 = 0.0;

  for(int i = 0; i < N; ++i) { A[i] = i; }

  // Method 1: plain multiplication.
  for(int i = 0; i < N; ++i) { B[i] = A[i]*A[i]; }
  for(int i = 0; i < N; ++i) { sum1 += B[i]; }
  std::cout << "Sum of method 1: " << sum1;

  // Method 2: the same computation through the CRTP wrapper.
  for(int i = 0; i < N; ++i) { B[i] = CRTP_child<double>(A[i]).square(); }
  for(int i = 0; i < N; ++i) { sum2 += B[i]; }
  std::cout << "\nSum of method 2: " << sum2 << std::endl;

  return 0;
}

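One way to narrow a report like this down is to move each multiplication loop into its own function, so the vectorizer's diagnostics for the plain loop and for the CRTP loop can be compared side by side. The sketch below is not taken from the attachment: the names square_plain, square_crtp and check_methods_agree are hypothetical, and the CRTP classes are a trimmed copy of the test case. Whether either loop is vectorized depends on the GCC version and flags in use.

```cpp
#include <cassert>

// Trimmed copy of the CRTP classes from the test case.
template<typename E, typename Tp>
class CRTP_base
{
public:
  E& operator()() { return *static_cast<E*>(this); }
  Tp square() { return (*this)().x() * (*this)().x(); }

protected:
  CRTP_base() {}
  ~CRTP_base() {}
};

template<typename Tp>
class CRTP_child : public CRTP_base<CRTP_child<Tp>, Tp>
{
  Tp xval;

public:
  CRTP_child(Tp xv = Tp()) : xval(xv) {}
  Tp x() { return xval; }
};

// Hypothetical helper: method 1, plain multiplication.
void square_plain(const double* a, double* b, int n)
{
  for (int i = 0; i < n; ++i) { b[i] = a[i] * a[i]; }
}

// Hypothetical helper: method 2, the same product through the CRTP wrapper.
void square_crtp(const double* a, double* b, int n)
{
  for (int i = 0; i < n; ++i) { b[i] = CRTP_child<double>(a[i]).square(); }
}

// Hypothetical self-check: both methods must produce the same squares.
void check_methods_agree()
{
  const int N = 100;
  double a[N], b1[N], b2[N];
  for (int i = 0; i < N; ++i) { a[i] = i; }
  square_plain(a, b1, N);
  square_crtp(a, b2, N);
  for (int i = 0; i < N; ++i) { assert(b1[i] == b2[i]); }
}
```

Because a and b are plain pointers here, the compiler may also have to reason about aliasing between them, which the local arrays in the original test case avoid. Keep that in mind when comparing the diagnostics for the two functions.
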
* [Bug tree-optimization/52112] Vectorizer fails when using CRTP
From: pinskia at gcc dot gnu.org @ 2012-02-03 17:40 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52112

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-02-03 17:40:23 UTC ---
This is vectorized for me on x86_64 on the trunk.

* [Bug tree-optimization/52112] Vectorizer fails when using CRTP
From: pinskia at gcc dot gnu.org @ 2012-02-03 17:43 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52112

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-02-03 17:43:41 UTC ---
Even in 4.4, both multiplication loops are vectorized.

* [Bug tree-optimization/52112] Vectorizer fails when using CRTP
From: pinskia at gcc dot gnu.org @ 2012-02-03 17:45 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52112

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-02-03 17:44:36 UTC ---
<bb 10>:
  vect_var_.146 = MEM[base: &A, index: ivtmp.196];
  MEM[base: &B, index: ivtmp.196] = [mult_expr] vect_var_.146 * vect_var_.146;
  ivtmp.196 = ivtmp.196 + 16;
  if (ivtmp.196 != 800)
    goto <bb 10>;
  else
    goto <bb 11>;

Is what we get for 4.4 which is obviously vectorized.

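For readers less used to GIMPLE dumps, the loop in comment #3 can be read like this: ivtmp.196 looks like a byte offset that advances by 16 per iteration (two 8-byte doubles) and stops at 800 (100 doubles times 8 bytes), so each iteration squares two elements at once. The portable sketch below mimics that shape with scalar code. It is an illustration only, not what GCC emits; the function name square_two_at_a_time is hypothetical, and real vectorized code would use one vector load, one multiply and one store per iteration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical scalar model of the dumped loop. Each iteration handles the
// two doubles that start at byte offset `off`, mirroring the "+ 16" step.
// With n = 100 and 8-byte doubles the loop bound is 800, as in the dump.
void square_two_at_a_time(const double* a, double* b, std::size_t n)
{
  assert(n % 2 == 0);  // the loop consumes two elements per iteration

  const char* src = reinterpret_cast<const char*>(a);
  char* dst = reinterpret_cast<char*>(b);
  const std::size_t step = 2 * sizeof(double);
  const std::size_t end = n * sizeof(double);

  for (std::size_t off = 0; off != end; off += step)
  {
    const double* in = reinterpret_cast<const double*>(src + off);
    double* out = reinterpret_cast<double*>(dst + off);
    out[0] = in[0] * in[0];  // lane 0
    out[1] = in[1] * in[1];  // lane 1
  }
}
```
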
* [Bug tree-optimization/52112] Vectorizer fails when using CRTP
From: ddesics at gmail dot com @ 2012-02-03 18:25 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52112

--- Comment #4 from Daniel Davis <ddesics at gmail dot com> 2012-02-03 18:24:49 UTC ---
Any thoughts on why it won't vectorize for me on x86_64 4.6.1?

* [Bug tree-optimization/52112] Vectorizer fails when using CRTP
From: jakub at gcc dot gnu.org @ 2012-02-03 19:01 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52112

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |jakub at gcc dot gnu.org
         Resolution|                            |FIXED

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-02-03 19:00:44 UTC ---
Before http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181014
only 2 loops are vectorized, after that 3. With -Ofast before that change
4 loops are vectorized and after that 5 (the double reduction needs -ffast-math
to be vectorized).

So fixed for 4.7.

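The remark that the double reduction needs -ffast-math comes down to floating-point addition not being associative. A vectorized sum typically keeps one partial sum per vector lane and combines them at the end, which adds the elements in a different order than the scalar loop does. The sketch below illustrates that reordering with two lanes. It is not GCC's implementation; the function names are hypothetical, and it assumes IEEE 754 double arithmetic compiled without -ffast-math.

```cpp
#include <cassert>
#include <cstddef>

// Scalar reduction: adds the elements strictly left to right.
double sum_sequential(const double* v, std::size_t n)
{
  double s = 0.0;
  for (std::size_t i = 0; i < n; ++i) { s += v[i]; }
  return s;
}

// Hypothetical model of a two-lane vectorized reduction: even-indexed
// elements go to lane 0, odd-indexed ones to lane 1, and the two partial
// sums are combined only after the loop.
double sum_two_lanes(const double* v, std::size_t n)
{
  assert(n % 2 == 0);  // two elements are consumed per iteration
  double lane0 = 0.0;
  double lane1 = 0.0;
  for (std::size_t i = 0; i < n; i += 2)
  {
    lane0 += v[i];
    lane1 += v[i + 1];
  }
  return lane0 + lane1;
}
```

For the input {1e100, 1.0, -1e100, 1.0}, sum_sequential absorbs the first 1.0 into 1e100 and returns 1.0, while sum_two_lanes cancels the large values in lane 0, adds the two 1.0 values in lane 1, and returns 2.0. Since the two orders can give different results, the compiler is only allowed to reorder the sum when flags such as -ffast-math permit it.
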