Date: Sat, 19 Dec 2009 00:13:00 -0000
From: Richard Guenther
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Fix one part of PR42108

This fixes one part of PR42108, the missed discovery of a fully
redundant load.  The issue is that the SSA SCC value-numberer does not
visit loads and stores in a defined order.  The (or rather one) fix is
to properly canonicalize the virtual operand SSA names we record in
the expression hash tables.  The proper canonical virtual operand is
the def of the first dominating may-definition (or a PHI node vdef,
but we can just as well choose a non-may-definition without loss of
precision or generality).

The patch possibly slows down SCCVN a bit for examples like

  # .MEM_2 = VDEF <.MEM_1(D)>
  may-def ....

  # VUSE <.MEM_120>
  ... = X;

  # VUSE <.MEM_120>
  ... = X;

where discovering the redundant load of X now needs to canonicalize
its VUSE SSA name twice (previously we entered the expression into the
hash table with .MEM_120, so it would be found immediately).

Now if we had

  # VUSE <.MEM_60>
  ... = X;

in between the may-def and the other loads, we previously discovered
the full redundancy only if we first visited the load with .MEM_60 and
only after that the loads with .MEM_120.  But nothing guarantees this
order - this is the case the patch fixes.
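(To make the rule concrete, here is a small self-contained sketch -
hypothetical structures, not GCC's internal representation: walking
the virtual use-def chain past non-clobbering definitions yields the
same canonical VUSE for a load no matter which of the loads above is
visited first.)

  #include <stdio.h>

  /* Toy model of a virtual SSA name: the VUSE of its defining
     statement and whether that statement may clobber the reference
     being looked up.  */
  struct vssa_name
  {
    const char *name;            /* for printing only */
    struct vssa_name *def_vuse;  /* VUSE of the defining stmt, NULL at entry */
    int def_clobbers;            /* defining stmt may clobber the reference */
  };

  /* Non-clobbering definitions are transparent for the reference, so
     step to the incoming VUSE of the defining statement until the
     first dominating may-definition (or the function entry).  */
  static struct vssa_name *
  canonical_vuse (struct vssa_name *vuse)
  {
    while (!vuse->def_clobbers && vuse->def_vuse)
      vuse = vuse->def_vuse;
    return vuse;
  }

  int
  main (void)
  {
    struct vssa_name mem_1   = { ".MEM_1(D)", 0, 0 };
    struct vssa_name mem_2   = { ".MEM_2", &mem_1, 1 };    /* the may-def */
    struct vssa_name mem_60  = { ".MEM_60", &mem_2, 0 };
    struct vssa_name mem_120 = { ".MEM_120", &mem_60, 0 };

    /* Both print ".MEM_2": the loads with .MEM_60 and .MEM_120 are
       keyed on the same canonical VUSE, so the second lookup hits the
       entry the first one created, independent of visitation order.  */
    printf ("%s -> %s\n", mem_60.name, canonical_vuse (&mem_60)->name);
    printf ("%s -> %s\n", mem_120.name, canonical_vuse (&mem_120)->name);
    return 0;
  }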
There are about 0.5% more redundant loads discovered in tramp3d with
this patch.  The missed discovery is a regression from the
alias-improvements branch merge, as previously we had different
virtual operands and thus in more cases the canonical VUSEs were
automagically chosen.

Bootstrapped and tested on x86_64-unknown-linux-gnu.  I have patched
one of our SPEC / C++ testers for more testing coverage.

Richard.

2009-12-18  Richard Guenther

	PR tree-optimization/42108
	* tree-ssa-sccvn.c (last_vuse_ptr): New variable.
	(vn_reference_lookup_2): Update last seen VUSE.
	(vn_reference_lookup_3): Avoid updating last seen VUSE after
	translating.
	(visit_reference_op_load): Use last seen VUSE from the first
	lookup when entering into the table.

	* gfortran.dg/pr42108.f90: New testcase.

Index: gcc/testsuite/gfortran.dg/pr42108.f90
===================================================================
*** gcc/testsuite/gfortran.dg/pr42108.f90	(revision 0)
--- gcc/testsuite/gfortran.dg/pr42108.f90	(revision 0)
***************
*** 0 ****
--- 1,27 ----
+ ! { dg-do compile }
+ ! { dg-options "-O2 -fdump-tree-fre" }
+ 
+ subroutine eval(foo1,foo2,foo3,foo4,x,n,nnd)
+   implicit real*8 (a-h,o-z)
+   dimension foo3(n),foo4(n),x(nnd)
+   nw=0
+   foo3(1)=foo2*foo4(1)
+   do i=2,n
+     foo3(i)=foo2*foo4(i)
+     do j=1,i-1
+       temp=0.0d0
+       jmini=j-i
+       do k=i,nnd,n
+         temp=temp+(x(k)-x(k+jmini))**2
+       end do
+       temp = sqrt(temp+foo1)
+       foo3(i)=foo3(i)+temp*foo4(j)
+       foo3(j)=foo3(j)+temp*foo4(i)
+     end do
+   end do
+ end subroutine eval
+ 
+ ! There should be only one load from n left
+ 
+ ! { dg-final { scan-tree-dump-times "\\*n_" 1 "fre" } }
+ ! { dg-final { cleanup-tree-dump "fre" } }
Index: gcc/tree-ssa-sccvn.c
===================================================================
*** gcc/tree-ssa-sccvn.c	(revision 155346)
--- gcc/tree-ssa-sccvn.c	(working copy)
*************** vn_reference_lookup_1 (vn_reference_t vr
*** 984,989 ****
--- 984,991 ----
        return NULL_TREE;
      }
  
+ static tree *last_vuse_ptr;
+ 
  /* Callback for walk_non_aliased_vuses.  Adjusts the vn_reference_t
     VR_ with the current VUSE and performs the expression lookup.  */
*************** vn_reference_lookup_2 (ao_ref *op ATTRIB
*** 994,999 ****
--- 996,1004 ----
    void **slot;
    hashval_t hash;
  
+   if (last_vuse_ptr)
+     *last_vuse_ptr = vuse;
+ 
    /* Fixup vuse and hash.  */
    vr->hashcode = vr->hashcode - iterative_hash_expr (vr->vuse, 0);
    vr->vuse = SSA_VAL (vuse);
*************** vn_reference_lookup_3 (ao_ref *ref, tree
*** 1161,1166 ****
--- 1166,1174 ----
        return (void *)-1;
      *ref = r;
  
+     /* Do not update last seen VUSE after translating.  */
+     last_vuse_ptr = NULL;
+ 
      /* Keep looking for the adjusted *REF / VR pair.  */
      return NULL;
    }
*************** static bool
*** 1961,1967 ****
  visit_reference_op_load (tree lhs, tree op, gimple stmt)
  {
    bool changed = false;
!   tree result = vn_reference_lookup (op, gimple_vuse (stmt), true, NULL);
  
    /* If we have a VCE, try looking up its operand as it might be
       stored in a different type.  */
--- 1969,1981 ----
  visit_reference_op_load (tree lhs, tree op, gimple stmt)
  {
    bool changed = false;
!   tree last_vuse;
!   tree result;
! 
!   last_vuse = gimple_vuse (stmt);
!   last_vuse_ptr = &last_vuse;
!   result = vn_reference_lookup (op, gimple_vuse (stmt), true, NULL);
!   last_vuse_ptr = NULL;
  
    /* If we have a VCE, try looking up its operand as it might be
       stored in a different type.  */
*************** visit_reference_op_load (tree lhs, tree
*** 2045,2051 ****
    else
      {
        changed = set_ssa_val_to (lhs, lhs);
!       vn_reference_insert (op, lhs, gimple_vuse (stmt));
      }
  
    return changed;
--- 2059,2065 ----
    else
      {
        changed = set_ssa_val_to (lhs, lhs);
!       vn_reference_insert (op, lhs, last_vuse);
      }
  
    return changed;
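For reference, the shape of the mechanism the patch adds, condensed
into a hedged stand-alone sketch (stand-in types and helper names; the
real hooks are vn_reference_lookup_2/_3 and visit_reference_op_load in
the diff above):

  typedef const char *vuse_t;   /* stand-in for a virtual SSA name */

  /* Armed only around the initial lookup of a load.  */
  static vuse_t *last_vuse_ptr;

  /* Stand-in for vn_reference_lookup_2: called on every VUSE the
     alias-oracle walk visits; remember the last one seen.  */
  static void
  lookup_at_vuse (vuse_t vuse)
  {
    if (last_vuse_ptr)
      *last_vuse_ptr = vuse;
    /* ... hash-table lookup keyed on the value-numbered VUSE ... */
  }

  /* Stand-in for vn_reference_lookup_3: after translating the
     reference through a may-def the VUSEs no longer belong to the
     original reference, so stop recording them.  */
  static void
  translate_through_maydef (void)
  {
    last_vuse_ptr = NULL;
  }

  /* Stand-in for visit_reference_op_load: a lookup miss enters the
     expression under the last VUSE seen - the canonical one - rather
     than under the load's own VUSE.  */
  static void
  visit_load (vuse_t vuse)
  {
    vuse_t last_vuse = vuse;

    last_vuse_ptr = &last_vuse;
    /* ... walk, calling lookup_at_vuse / translate_through_maydef ... */
    last_vuse_ptr = NULL;

    /* on a miss: insert under last_vuse, not under vuse */
  }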