From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 8470 invoked by alias); 18 Sep 2013 11:54:48 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 8409 invoked by uid 48); 18 Sep 2013 11:54:45 -0000
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/58453] [4.9 Regression] Revision 202431 results in miscompare for CPU2006 434.zeusmp
Date: Wed, 18 Sep 2013 11:54:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 4.9.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: rguenth at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 4.9.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2013-09/txt/msg01339.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58453

--- Comment #7 from Richard Biener ---
Loop distribution distributes this into

      q2 = ( g2a(i+1) * g31a(i+1) * v1(i+1,j,k)
     1     - g2a(i  ) * g31a(i  ) * v1(i  ,j,k) )
     2                            * dvl1ai(i)
     3   + ( g32a(j+1) * v2(i,j+1,k)
     4     - g32a(j  ) * v2(i,j  ,k) )
     5                 * g2bi(i) * dvl2ai(j)
     6   + ( v3(i,j,k+1) - v3(i,j,k) )
     7                 * g31bi(i) * g32bi(j) * dvl3ai(k)
      q2 = q2 * q1
      e(i,j,k) = ( 1.0 - q2 ) / ( 1.0 + q2 ) * e(i,j,k)

and a memcpy for

      dlo(i,j,k) = d(i,j,k)

and

      eod(i,j,k) = e(i,j,k) / d(i,j,k)

(re-computing e(i,j,k) instead of loading it from the stored value - a known
deficiency).

This doesn't look wrong at first glance (but it's probably slower).

What the revision in question did was remove some very odd code from
rdg_flag_uses:

-  if (gimple_code (stmt) != GIMPLE_PHI)
-    {
-      if ((use_p = gimple_vuse_op (stmt)) != NULL_USE_OPERAND_P)
-        {
-          tree use = USE_FROM_PTR (use_p);
-
-          if (TREE_CODE (use) == SSA_NAME
-              && !SSA_NAME_IS_DEFAULT_DEF (use))
-            {
-              gimple def_stmt = SSA_NAME_DEF_STMT (use);
-              int v = rdg_vertex_for_stmt (rdg, def_stmt);
-
-              if (v >= 0
-                  && !already_processed_vertex_p (processed, v))
-                rdg_flag_vertex_and_dependent (rdg, v, partition, loops,
-                                               processed);
-            }
-        }
-    }

That code just doesn't make sense, but it likely made sure everything ended up
in a single partition.

Does the benchmark fail if you build with -ftree-loop-distribution
-fno-tree-loop-distribute-patterns? (It should emit a loop instead of the
memcpy call.)
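
For readers who want to see the shape of the transformation, here is a minimal C sketch of the distribution described above. It is an illustration only, not the zeusmp source and not what GCC emits. The assumptions: the arrays are flattened to one dimension, q[] stands in for the already computed q2 * q1, the third partition loads e from memory rather than re-computing it, and all function names (run_fused, run_distributed, check_equivalence) are made up for this example.

```c
#include <assert.h>
#include <string.h>

#define N 16

/* Original form: all three statements live in one loop body. */
static void run_fused(const double *q, const double *d, double *e,
                      double *dlo, double *eod)
{
    for (int i = 0; i < N; i++) {
        e[i] = (1.0 - q[i]) / (1.0 + q[i]) * e[i];
        dlo[i] = d[i];
        eod[i] = e[i] / d[i];
    }
}

/* Distributed form: one partition per statement.  The plain copy
   becomes a memcpy, which is what the "distribute patterns" part of
   the pass is for.  Simplification: the last loop reads the stored
   e[] instead of re-computing it. */
static void run_distributed(const double *q, const double *d, double *e,
                            double *dlo, double *eod)
{
    for (int i = 0; i < N; i++)
        e[i] = (1.0 - q[i]) / (1.0 + q[i]) * e[i];

    memcpy(dlo, d, N * sizeof *d);

    for (int i = 0; i < N; i++)
        eod[i] = e[i] / d[i];
}

/* Absolute difference, written out to avoid depending on libm. */
static double abs_diff(double a, double b)
{
    return a > b ? a - b : b - a;
}

/* Returns 1 when both forms agree on e, dlo and eod, else 0. */
int check_equivalence(void)
{
    double q[N], d[N];
    double e1[N], e2[N], dlo1[N], dlo2[N], eod1[N], eod2[N];

    for (int i = 0; i < N; i++) {
        q[i] = 0.01 * (i + 1);   /* arbitrary test data */
        d[i] = 1.0 + 0.5 * i;    /* never zero, so the division is safe */
        assert(d[i] != 0.0);
        e1[i] = e2[i] = 2.0 + i;
    }

    run_fused(q, d, e1, dlo1, eod1);
    run_distributed(q, d, e2, dlo2, eod2);

    for (int i = 0; i < N; i++) {
        if (abs_diff(e1[i], e2[i]) > 1e-12
            || dlo1[i] != dlo2[i]
            || abs_diff(eod1[i], eod2[i]) > 1e-12)
            return 0;
    }
    return 1;
}
```

Calling check_equivalence() from a small main should return 1 as long as the arrays do not overlap. Replacing the memcpy call in run_distributed with an explicit copy loop gives the shape one would expect with -fno-tree-loop-distribute-patterns.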