From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 23687 invoked by alias); 11 Mar 2013 13:35:02 -0000
Received: (qmail 22336 invoked by uid 48); 11 Mar 2013 13:34:39 -0000
From: "ysrumyan at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/56595] New: Tree-ssa-pre can create loop carried dependencies which prevent loop vectorization.
Date: Mon, 11 Mar 2013 13:35:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: ysrumyan at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID:
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2013-03/txt/msg00853.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56595

             Bug #: 56595
           Summary: Tree-ssa-pre can create loop carried dependencies
                    which prevent loop vectorization.
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: ysrumyan@gmail.com

In some cases PRE can create loop-carried dependencies spanning multiple
iterations, which in effect is scalar replacement. This deficiency can be
illustrated with the attached test case. For the statement

      DO I = 0,I2
         T1 = 0.5D0 * (U1(I,J,K) + U1(I+1,J,K))

PRE creates a loop-carried dependence:

  :
  ...
  pretmp_690 = MEM[(real(kind=8)[0:] *)pretmp_675][pretmp_689];
  ...
  :
  # i_1 = PHI <0(172), i_437(175)>
  # prephitmp_691 = PHI

Note that in this particular test case the arrays have unknown stride1.
If the arrays have stride1 == 1, this transformation does not happen, as in
the following simple test case, which is successfully vectorized:

      subroutine bar(a,b,c,d,n, m)
      integer n, m
      real*8 a(n,*), b(n,*), c(n,*), d(n,*)
      do j=1,m
        do i=1,m
          x1 = 0.5 * (a(i,j) + a(i+1,j))
          x2 = 0.5 * (b(i,j) + b(i+1,j))
          x3 = 0.5 * (c(i,j) + c(i+1,j))
          d(i,j) = (x1 + x2 + x3) / 3.0
        enddo
      enddo
      end
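
To make the loop-carried dependence concrete, here is a minimal hand-written sketch in C. It is not taken from the attached test case and it is not GCC output. The function names avg_direct and avg_carried, and the flat array with an explicit stride parameter, are assumptions made for illustration only. avg_direct corresponds to the source statement, while avg_carried approximates by hand the shape that the prephitmp PHI node expresses: the value loaded for element i+1 is kept in a scalar and reused as element i in the next iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Source-level form: every iteration loads both neighbours itself,
   so no value flows from one iteration to the next.
   Assumes u and t do not overlap and u has at least n*stride + 1 elements. */
void avg_direct(const double *u, double *t, size_t n, size_t stride)
{
    assert(u != NULL && t != NULL);
    for (size_t i = 0; i < n; i++)
        t[i] = 0.5 * (u[i * stride] + u[(i + 1) * stride]);
}

/* Hand-written approximation of the loop after scalar replacement:
   one load per iteration, with the previous load carried in `prev`.
   `prev` plays the role of the PHI result in the dump (illustrative only). */
void avg_carried(const double *u, double *t, size_t n, size_t stride)
{
    assert(u != NULL && t != NULL);
    if (n == 0)
        return;

    double prev = u[0];                 /* load hoisted in front of the loop */
    for (size_t i = 0; i < n; i++) {
        double next = u[(i + 1) * stride];
        t[i] = 0.5 * (prev + next);
        prev = next;                    /* carried into the next iteration */
    }
}
```

Both functions compute the same values. The second one saves a load per iteration, but each iteration now reads a scalar written by the previous one, which is the kind of cross-iteration dependence described in this report.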