From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 100085 invoked by alias); 26 Aug 2015 14:04:40 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 100074 invoked by uid 89); 26 Aug 2015 14:04:39 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.9 required=5.0 tests=AWL,BAYES_00,SPF_PASS autolearn=ham version=3.3.2 X-HELO: mx2.suse.de Received: from mx2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (CAMELLIA256-SHA encrypted) ESMTPS; Wed, 26 Aug 2015 14:04:29 +0000 Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 244B4AD93; Wed, 26 Aug 2015 14:04:25 +0000 (UTC) Date: Wed, 26 Aug 2015 14:08:00 -0000 From: Martin Jambor To: Alan Lawrence Cc: gcc-patches@gcc.gnu.org, rguenther@suse.de Subject: Re: [RFC 4/5] Handle constant-pool entries Message-ID: <20150826140424.GJ32341@virgil.suse.cz> Mail-Followup-To: Alan Lawrence , gcc-patches@gcc.gnu.org, rguenther@suse.de References: <1440500777-25966-1-git-send-email-alan.lawrence@arm.com> <1440500777-25966-5-git-send-email-alan.lawrence@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <1440500777-25966-5-git-send-email-alan.lawrence@arm.com> User-Agent: Mutt/1.5.23 (2014-03-12) X-IsSubscribed: yes X-SW-Source: 2015-08/txt/msg01623.txt.bz2 Hi, On Tue, Aug 25, 2015 at 12:06:16PM +0100, Alan Lawrence wrote: > This makes SRA replace loads of records/arrays from constant pool entries, > with elementwise assignments of the constant values, hence, overcoming the > fundamental problem in PR/63679. > > As a first pass, the approach I took was to look for constant-pool loads as > we scanned through other accesses, and add them as candidates there; to build a > constant replacement_decl for any such accesses in completely_scalarize; and to > use any existing replacement_decl rather than creating a variable in > create_access_replacement. (I did try using CONSTANT_CLASS_P in the latter, but > that does not allow addresses of labels, which can still end up in the constant > pool.) > > Feedback as to the approach or how it might be better structured / fitted into > SRA, is solicited ;). I'm not familiar with constant pools very much, but I'll try: > > Bootstrapped + check-gcc on x86-none-linux-gnu, aarch64-none-linux-gnu and > arm-none-linux-gnueabihf, including with the next patch (rfc), which greatly increases the number of testcases in which this code is exercised! > > Have also verified that the ssa-dom-cse-2.c scan-tree-dump test passes (using a stage 1 compiler only, without execution) on alpha, hppa, powerpc, sparc, avr, and sh. > > gcc/ChangeLog: > > * tree-sra.c (create_access): Scan for uses of constant pool and add > to candidates. > (subst_initial): New. > (scalarize_elem): Build replacement_decl using subst_initial. > (create_access_replacement): Use replacement_decl if set. > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/ssa-dom-cse-2.c: Remove xfail, add --param > sra-max-scalarization-size-Ospeed. > --- > gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c | 7 +--- > gcc/tree-sra.c | 56 +++++++++++++++++++++++++-- > 2 files changed, 55 insertions(+), 8 deletions(-) > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c > index 9eccdc9..b13d583 100644 > --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O3 -fno-tree-fre -fno-tree-pre -fdump-tree-optimized" } */ > +/* { dg-options "-O3 -fno-tree-fre -fno-tree-pre -fdump-tree-optimized --param sra-max-scalarization-size-Ospeed=32" } */ > > int > foo () > @@ -17,7 +17,4 @@ foo () > /* After late unrolling the above loop completely DOM should be > able to optimize this to return 28. */ > > -/* See PR63679 and PR64159, if the target forces the initializer to memory then > - DOM is not able to perform this optimization. */ > - > -/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail aarch64*-*-* alpha*-*-* hppa*-*-* powerpc*-*-* sparc*-*-* s390*-*-* } } } */ > +/* { dg-final { scan-tree-dump "return 28;" "optimized" } } */ > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c > index af35fcc..a3ff2df 100644 > --- a/gcc/tree-sra.c > +++ b/gcc/tree-sra.c > @@ -865,6 +865,17 @@ create_access (tree expr, gimple stmt, bool write) > else > ptr = false; > > + /* FORNOW: scan for uses of constant pool as we go along. */ > + if (TREE_CODE (base) == VAR_DECL && DECL_IN_CONSTANT_POOL (base) > + && !bitmap_bit_p (candidate_bitmap, DECL_UID (base))) > + { > + gcc_assert (!write); > + bitmap_set_bit (candidate_bitmap, DECL_UID (base)); > + tree_node **slot = candidates->find_slot_with_hash (base, DECL_UID (base), > + INSERT); > + *slot = base; > + } > + I believe you only want to do this if (sra_mode == SRA_MODE_EARLY_INTRA || sra_mode == SRA_MODE_INTRA). The idea of candidates is that we gather them in find_var_candidates and ten we only eliminate them, this has the benefit of not worrying about disqualifying a candidate and then erroneously re-adding it later. So if you could find a way to structure your code this way, I'd much happier. If it is impossible without traversing the whole function just for that purpose, we may need some mechanism to prevent us from making a disqualified decl a candidate again. Or, if we come to the conclusion that constant pool decls do not ever get disqualified, a gcc_assert making sure it actually does not happen in disqualify_candidate. And of course at find_var_candidates time we check that all candidates pass simple checks in maybe_add_sra_candidate. I suppose many of them do not make sense for constant pool decls but at least please have a look whether that is the case for all of them or not. > if (!DECL_P (base) || !bitmap_bit_p (candidate_bitmap, DECL_UID (base))) > return NULL; > > @@ -1025,6 +1036,37 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref) > } > } > > +static tree > +subst_initial (tree expr, tree var) This needs a comment and a better name. A name that would make it clear this is for constant pool stuff only. > +{ > + if (TREE_CODE (expr) == VAR_DECL) > + { > + gcc_assert (DECL_IN_CONSTANT_POOL (expr)); > + gcc_assert (expr == var); > + return DECL_INITIAL (expr); > + } > + if (TREE_CODE (expr) == COMPONENT_REF) > + { > + gcc_assert (TREE_CODE (TREE_OPERAND (expr, 1)) == FIELD_DECL); > + gcc_assert (TREE_OPERAND (expr, 2) == NULL_TREE); > + return fold_build3 (COMPONENT_REF, TREE_TYPE (expr), > + subst_initial (TREE_OPERAND (expr, 0), var), > + TREE_OPERAND (expr, 1), > + NULL_TREE); > + } > + if (TREE_CODE (expr) == ARRAY_REF) > + { > + gcc_assert (TREE_OPERAND (expr, 2) == NULL_TREE); > + gcc_assert (TREE_OPERAND (expr, 3) == NULL_TREE); > + return fold (build4 (ARRAY_REF, TREE_TYPE (expr), > + subst_initial (TREE_OPERAND (expr, 0), var), > + TREE_OPERAND (expr, 1), > + NULL_TREE, > + NULL_TREE)); > + } > + gcc_unreachable (); > +} > + > static void > scalarize_elem (tree base, HOST_WIDE_INT pos, HOST_WIDE_INT size, > tree ref, tree type) > @@ -1033,6 +1075,9 @@ scalarize_elem (tree base, HOST_WIDE_INT pos, HOST_WIDE_INT size, > { > struct access *access = create_access_1 (base, pos, size); > access->expr = ref; > + if (TREE_CODE (base) == VAR_DECL > + && DECL_IN_CONSTANT_POOL (base)) > + access->replacement_decl = subst_initial (ref, base); So replacement_decl ceases to be a decl and should be renamed to replacement_expr. (Such cleanups can be done as a followup, of course). Thanks, Martin