From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 21502 invoked by alias); 11 Dec 2012 06:43:02 -0000
Received: (qmail 21441 invoked by uid 48); 11 Dec 2012 06:42:41 -0000
From: "vincenzo.innocente at cern dot ch"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/55645] New: skipping unlike branch in vectorized loops using movmsk or equivalent
Date: Tue, 11 Dec 2012 06:43:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Keywords:
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: vincenzo.innocente at cern dot ch
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID:
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2012-12/txt/msg01059.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55645

             Bug #: 55645
           Summary: skipping unlike branch in vectorized loops using movmsk
                    or equivalent
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: vincenzo.innocente@cern.ch


I'm wondering whether the vectorization engine could accommodate some
mechanism to skip unlikely branches using a global test based on movmsk
or similar.

Below is a trivial example, including a possible SLP implementation, that
happens to compile with 4.8 as

c++ -std=c++11 -Ofast -mavx2 -S divergent.cc; less divergent.s

float a[1024];
float b[1024];
float c[1024];

#define likely(x) (__builtin_expect(x, true))

// possible syntax
void compute() {
  for (int i=0;i!=1024;++i) {
    if likely(a[i]

typedef float __attribute__( ( vector_size( 32 ) ) ) float32x8_t;
typedef int __attribute__( ( vector_size( 32 ) ) ) int32x8_t;

float32x8_t va[1024];
float32x8_t vb[1024];
float32x8_t vc[1024];

void computeV() {
  for (int i=0;i!=1024;++i) {
    float32x8_t mask = va[i]
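
The request above, one global test on the comparison mask that skips the unlikely branch when no lane needs it, could look roughly like the sketch below. It is not the code from the report: the branch bodies are not visible here, so cheap() and costly() are hypothetical stand-ins, and compute_scalar(), compute_vector() and any_lane() are names made up for the illustration. It also uses 4-wide vectors rather than the 8-wide AVX type so that it builds without -mavx2, and it patches the unlikely lanes one by one instead of with a vector blend.

```cpp
#include <cmath>
#include <cstddef>
#include <cstring>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

// 4-wide vectors via the GCC/Clang vector extension (assumption: the
// report uses 8-wide AVX vectors; 4-wide keeps this buildable with SSE).
typedef float float32x4_t __attribute__((vector_size(16)));
typedef int int32x4_t __attribute__((vector_size(16)));

// Hypothetical branch bodies; the expressions in the report are not shown.
static inline float cheap(float x, float y) { return x + y; }
static inline float costly(float x, float y) { return std::sqrt(x - y); }

// Scalar reference: the branch with a[i] < b[i] is assumed to be the likely one.
void compute_scalar(const float* a, const float* b, float* c, std::size_t n) {
  for (std::size_t i = 0; i != n; ++i)
    c[i] = (a[i] < b[i]) ? cheap(a[i], b[i]) : costly(a[i], b[i]);
}

// True if any lane of the mask is set: movmsk on x86, a lane-wise OR elsewhere.
static inline bool any_lane(int32x4_t m) {
#if defined(__SSE__)
  return _mm_movemask_ps((__m128)m) != 0;  // collects the sign bit of each lane
#else
  return (m[0] | m[1] | m[2] | m[3]) != 0;
#endif
}

void compute_vector(const float* a, const float* b, float* c, std::size_t n) {
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    float32x4_t va, vb;
    std::memcpy(&va, a + i, sizeof va);   // unaligned-safe loads
    std::memcpy(&vb, b + i, sizeof vb);

    int32x4_t likely_mask = va < vb;      // -1 in lanes taking the likely branch
    float32x4_t vc = va + vb;             // likely branch, computed for all lanes

    // Global test: enter the unlikely path only if some lane requires it.
    if (__builtin_expect(any_lane(~likely_mask), false)) {
      for (int k = 0; k != 4; ++k)
        if (!likely_mask[k]) vc[k] = costly(va[k], vb[k]);
    }
    std::memcpy(c + i, &vc, sizeof vc);
  }
  // Scalar tail for the elements that do not fill a whole vector.
  for (; i != n; ++i)
    c[i] = (a[i] < b[i]) ? cheap(a[i], b[i]) : costly(a[i], b[i]);
}
```

When every lane of a vector takes the likely branch, the loop body reduces to a compare, an add and one mask test, which is the saving the report asks the vectorizer to find on its own.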