From: "vincenzo.innocente at cern dot ch"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/49483] New: unable to vectorize code equivalent to "scalbnf"
Date: Tue, 21 Jun 2011 09:04:00 -0000

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49483

           Summary: unable to vectorize code equivalent to "scalbnf"
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: vincenzo.innocente@cern.ch

I'm trying to write simplified versions of trigonometric and transcendental
functions that gcc can auto-vectorize.
At the moment I'm blocked on the vectorization of "scalbnf".
I'm using code equivalent to the one in glibc
sysdeps/ieee754/flt-32/s_scalbnf.c and math/math_private.h,
which in my C++ version reads

cat vldexpf.cc
inline float i2f(int x) {
  union { float f; int i; } tmp;
  tmp.i=x;
  return tmp.f;
}

inline float vect_ldexpf(float x, int n) {
  n = (n+0x7f)<<23;
  return x * i2f(n);
}

float __attribute__ ((aligned(16))) a[1024];
float __attribute__ ((aligned(16))) b[1024];
float __attribute__ ((aligned(16))) c[1024];
void tV() {
  for (int i=0; i!=1024; ++i) {
    float z = a[i];
    int n = b[i];
    c[i] = vect_ldexpf(z,n);
  }
}

Compiling it produces

c++ -Ofast -c vldexpf.cc -msse4.2 -ftree-vectorizer-verbose=7

vldexpf.cc:16: note: vect_model_load_cost: aligned.
vldexpf.cc:16: note: vect_get_data_access_cost: inside_cost = 1, outside_cost = 0.
vldexpf.cc:16: note: vect_model_load_cost: aligned.
vldexpf.cc:16: note: vect_get_data_access_cost: inside_cost = 2, outside_cost = 0.
vldexpf.cc:16: note: vect_model_store_cost: aligned.
vldexpf.cc:16: note: vect_get_data_access_cost: inside_cost = 3, outside_cost = 0.
vldexpf.cc:16: note: vect_model_load_cost: aligned.
vldexpf.cc:16: note: vect_model_load_cost: inside_cost = 1, outside_cost = 0 .
vldexpf.cc:16: note: vect_model_load_cost: aligned.
vldexpf.cc:16: note: vect_model_load_cost: inside_cost = 1, outside_cost = 0 .
vldexpf.cc:16: note: vect_model_simple_cost: inside_cost = 1, outside_cost = 1 .
vldexpf.cc:16: note: vect_model_simple_cost: inside_cost = 1, outside_cost = 1 .
vldexpf.cc:16: note: not vectorized: relevant stmt not supported: D.2243_14 = VIEW_CONVERT_EXPR(n_13);
vldexpf.cc:15: note: vectorized 0 loops in function.

I'm using

c++ -v
Using built-in specs.
COLLECT_GCC=c++
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-apple-darwin10.7.0/4.7.0/lto-wrapper
Target: x86_64-apple-darwin10.7.0
Configured with: ./configure --enable-languages=c,c++,fortran --enable-lto --with-build-config=bootstrap-lto CFLAGS='-O2 -ftree-vectorize -fPIC' CXXFLAGS='-O2 -fPIC -ftree-vectorize -fvisibility-inlines-hidden'
Thread model: posix
gcc version 4.7.0 20110528 (experimental) (GCC)
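
For reference, here is a minimal sketch of what vect_ldexpf computes. For
-126 <= n <= 127, (n+0x7f)<<23 is the IEEE 754 single-precision bit pattern
of 2^n (biased exponent n+127, zero mantissa), so the product is x * 2^n.
The names bits_to_float, ldexpf_sketch and check_against_ldexp are
hypothetical, and memcpy stands in for the union only to show the same bit
reinterpretation. Whether the vectorizer accepts this form is not
established by this report.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit integer bit pattern as a float.
// Assumption: float is IEEE 754 binary32, as on x86_64.
inline float bits_to_float(std::int32_t bits) {
    static_assert(sizeof(float) == sizeof(std::int32_t),
                  "float must be 32 bits wide");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Hypothetical stand-in for vect_ldexpf: multiply x by 2^n.
// Only meaningful for -126 <= n <= 127, where n + 0x7f is a
// valid biased exponent of a normal float.
inline float ldexpf_sketch(float x, int n) {
    return x * bits_to_float((n + 0x7f) << 23);
}

// Compare against std::ldexp over the whole normal exponent range.
inline void check_against_ldexp() {
    for (int n = -126; n <= 127; ++n) {
        assert(ldexpf_sketch(1.0f, n) == std::ldexp(1.0f, n));
    }
}
```

Outside that range of n the shifted value no longer encodes a normal power
of two, so this sketch (like the test case above) makes no attempt to handle
overflow, underflow or denormals.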