From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id 968E33858D28 for ; Fri, 1 Apr 2022 11:49:56 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 968E33858D28 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id 5964621A91; Fri, 1 Apr 2022 11:49:55 +0000 (UTC) Received: from murzim.suse.de (murzim.suse.de [10.160.4.192]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 3D57AA3B82; Fri, 1 Apr 2022 11:49:55 +0000 (UTC) Date: Fri, 1 Apr 2022 13:49:55 +0200 (CEST) From: Richard Biener To: Martin Jambor cc: GCC Patches , Jan Hubicka Subject: Re: [RFC] ipa-cp: Feed results of IPA-CP into SCCVN In-Reply-To: Message-ID: <4247o845-r8s1-n167-7n69-24roro37s2s@fhfr.qr> References: MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Apr 2022 11:49:58 -0000 On Fri, 1 Apr 2022, Martin Jambor wrote: > Hi, > > PRs 68930 and 92497 show that when IPA-CP figures out constants in > aggregate parameters or when passed by reference but the loads happen > in an inlined function the information is lost. This happens even > when the inlined function itself was known to have - or even cloned to > have - such constants in incoming parameters because the transform > phase of IPA passes is not run on them. See discussion in the bugs > for reasons why. > > Honza suggested that we can plug the results of IPA-CP analysis into > value numbering, so that FRE can figure out that some loads fetch > known constants. This is what this patch attempts to do. > > Although I spent quite some time reading tree-sccvn.c, it is complex > enough that I am sure I am not aware of various caveats and so I would > not be terribly surprised if there were some issues with my approach > that I am not aware of. Nevertheless, it seems to work well for simple > cases and even passes bootstrap and testing (and LTO bootstrap) on > x86_64-linux. > > I have experimented a little with using this approach instead of the > function walking parts of the IPA-CP transformation phase. This would > mean that the known-constants would not participate in the passes after > IPA but before FRE - which are not many but there is a ccp and fwprop > pass among others. For simple testcases like > gcc/testsuite/gcc.dg/ipa/ipcp-agg-*.c, it makes not assembly difference > at all. > > What do you think? Comments below > Martin > > > gcc/ChangeLog: > > 2022-03-30 Martin Jambor > > PR ipa/68930 > PR ipa/92497 > * ipa-prop.cc (ipcp_get_aggregate_const): New function. > (ipcp_transform_function): Do not deallocate transformation info. > * ipa-prop.h (ipcp_get_aggregate_const): Declare. > * tree-ssa-sccvn.cc: Include alloc-pool.h, symbol-summary.h and > ipa-prop.h. > (vn_reference_lookup_2): When hitting default-def vuse, query > IPA-CP transformation info for any known constants. > > gcc/testsuite/ChangeLog: > > 2022-03-30 Martin Jambor > > PR ipa/68930 > PR ipa/92497 > * gcc.dg/ipa/pr92497-1.c: New test. > * gcc.dg/ipa/pr92497-2.c: Likewise. > --- > gcc/ipa-prop.cc | 43 ++++++++++++++++++++++++---- > gcc/ipa-prop.h | 2 ++ > gcc/testsuite/gcc.dg/ipa/pr92497-1.c | 26 +++++++++++++++++ > gcc/testsuite/gcc.dg/ipa/pr92497-2.c | 26 +++++++++++++++++ > gcc/tree-ssa-sccvn.cc | 35 +++++++++++++++++++++- > 5 files changed, 126 insertions(+), 6 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/ipa/pr92497-1.c > create mode 100644 gcc/testsuite/gcc.dg/ipa/pr92497-2.c > > diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc > index e55fe2776f2..a73a5d9ec1d 100644 > --- a/gcc/ipa-prop.cc > +++ b/gcc/ipa-prop.cc > @@ -5748,6 +5748,44 @@ ipcp_modif_dom_walker::before_dom_children (basic_block bb) > return NULL; > } > > +/* If IPA-CP discovered a constant in parameter PARM at OFFSET of a given SIZE > + - whether passed by reference or not is given by BY_REF - return that > + constant. Otherwise return NULL_TREE. */ > + > +tree > +ipcp_get_aggregate_const (tree parm, bool by_ref, > + HOST_WIDE_INT offset, HOST_WIDE_INT size) I'd prefer to pass in the function decl or struct function or cgraph node. > +{ > + cgraph_node *cnode = cgraph_node::get (current_function_decl); > + > + ipa_agg_replacement_value *aggval = ipa_get_agg_replacements_for_node (cnode); > + if (!aggval) > + return NULL_TREE; > + > + int index = 0; > + for (tree p = DECL_ARGUMENTS (current_function_decl); > + p != parm; p = DECL_CHAIN (p)) > + { > + index++; > + if (!p) > + return NULL_TREE; > + } > + > + ipa_agg_replacement_value *v; > + for (v = aggval; v; v = v->next) > + if (v->index == index > + && v->offset == offset) > + break; > + if (!v > + || v->by_ref != by_ref > + || maybe_ne (tree_to_poly_int64 (TYPE_SIZE (TREE_TYPE (v->value))), > + size)) > + return NULL_TREE; two linear searches here - ugh. I wonder if we should instead pre-fill a hash-map from PARM_DECL to a ipa_agg_replacement_value * vector sorted by offset which we can binary search? That could be done once when starting value-numbering (not on regions). Is there any reason the data structure is as it is? It seems that even ipcp_modif_dom_walker::before_dom_children will do a linear walk and ipa_get_param_decl_index_1 also linearly walks parameters. That looks highly sub-optimal to me, it's also done for each mention of a parameter. > + return v->value; > +} > + > + > /* Return true if we have recorded VALUE and MASK about PARM. > Set VALUE and MASk accordingly. */ > > @@ -6055,11 +6093,6 @@ ipcp_transform_function (struct cgraph_node *node) > free_ipa_bb_info (bi); > fbi.bb_infos.release (); > > - ipcp_transformation *s = ipcp_transformation_sum->get (node); > - s->agg_values = NULL; > - s->bits = NULL; > - s->m_vr = NULL; > - > vec_free (descriptors); > if (cfg_changed) > delete_unreachable_blocks_update_callgraph (node, false); > diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h > index 553adfc9f35..aa6fcb522ac 100644 > --- a/gcc/ipa-prop.h > +++ b/gcc/ipa-prop.h > @@ -1181,6 +1181,8 @@ void ipa_dump_param (FILE *, class ipa_node_params *info, int i); > void ipa_release_body_info (struct ipa_func_body_info *); > tree ipa_get_callee_param_type (struct cgraph_edge *e, int i); > bool ipcp_get_parm_bits (tree, tree *, widest_int *); > +tree ipcp_get_aggregate_const (tree parm, bool by_ref, > + HOST_WIDE_INT offset, HOST_WIDE_INT size); > bool unadjusted_ptr_and_unit_offset (tree op, tree *ret, > poly_int64 *offset_ret); > > diff --git a/gcc/testsuite/gcc.dg/ipa/pr92497-1.c b/gcc/testsuite/gcc.dg/ipa/pr92497-1.c > new file mode 100644 > index 00000000000..eb8f2e75fd0 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/ipa/pr92497-1.c > @@ -0,0 +1,26 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2 -fno-early-inlining" } */ > + > +struct a {int a;}; > +static int //__attribute__ ((noinline)) > +foo (struct a a) > +{ > + if (!__builtin_constant_p (a.a)) > + __builtin_abort (); > + return a.a; > +} > + > +static int __attribute__ ((noinline)) > +bar (struct a a) > +{ > + return foo(a); > +} > + > +volatile int r; > + > +int main() > +{ > + struct a a={1}; > + r = bar (a); > + return 0; > +} > diff --git a/gcc/testsuite/gcc.dg/ipa/pr92497-2.c b/gcc/testsuite/gcc.dg/ipa/pr92497-2.c > new file mode 100644 > index 00000000000..56e611df8f9 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/ipa/pr92497-2.c > @@ -0,0 +1,26 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2 -fno-early-inlining -fno-ipa-sra" } */ > + > +struct a {int a;}; > +static int //__attribute__ ((noinline)) > +foo (struct a *a) > +{ > + if (!__builtin_constant_p (a->a)) > + __builtin_abort (); > + return a->a; > +} > + > +static int __attribute__ ((noinline)) > +bar (struct a *a) > +{ > + return foo(a); > +} > + > +volatile int r; > + > +int main() > +{ > + struct a a={1}; > + r = bar (&a); > + return 0; > +} > diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc > index 66b4fc21f5b..da558055054 100644 > --- a/gcc/tree-ssa-sccvn.cc > +++ b/gcc/tree-ssa-sccvn.cc > @@ -74,6 +74,9 @@ along with GCC; see the file COPYING3. If not see > #include "ipa-modref-tree.h" > #include "ipa-modref.h" > #include "tree-ssa-sccvn.h" > +#include "alloc-pool.h" > +#include "symbol-summary.h" > +#include "ipa-prop.h" > > /* This algorithm is based on the SCC algorithm presented by Keith > Cooper and L. Taylor Simpson in "SCC-Based Value numbering" > @@ -2273,7 +2276,7 @@ vn_walk_cb_data::push_partial_def (pd_data pd, > with the current VUSE and performs the expression lookup. */ > > static void * > -vn_reference_lookup_2 (ao_ref *op ATTRIBUTE_UNUSED, tree vuse, void *data_) > +vn_reference_lookup_2 (ao_ref *op, tree vuse, void *data_) > { > vn_walk_cb_data *data = (vn_walk_cb_data *)data_; > vn_reference_t vr = data->vr; > @@ -2307,6 +2310,36 @@ vn_reference_lookup_2 (ao_ref *op ATTRIBUTE_UNUSED, tree vuse, void *data_) > return *slot; > } > > + if (SSA_NAME_IS_DEFAULT_DEF (vuse)) > + { > + HOST_WIDE_INT offset, size; > + tree v = NULL_TREE; > + if (op->base && TREE_CODE (op->base) == PARM_DECL > + && op->offset.is_constant (&offset) > + && op->size.is_constant (&size)) you probably also want to check op->max_size_known_p () && known_eq (op->size, op->max_size) I see data->partial_defs.is_empty () is already checked in this function, but we won't currently ever call vn_reference_lookup_3 for default defs. That said, ideally we'd handle also partial defs, but we can start without. > + v = ipcp_get_aggregate_const (op->base, false, offset, size); > + else if (op->ref) > + { > + HOST_WIDE_INT offset, size; > + bool reverse; > + tree base = get_ref_base_and_extent_hwi (op->ref, &offset, > + &size, &reverse); > + if (base && TREE_CODE (base) == PARM_DECL) In which case do you arrive here but op->base is not set above? Maybe you need to use ao_ref_base (op->base) since op->base is filled lazily from op->ref. > + v = ipcp_get_aggregate_const (base, false, offset, size); > + else if (base > + && TREE_CODE (base) == MEM_REF > + && integer_zerop (TREE_OPERAND (base, 1)) > + && TREE_CODE (TREE_OPERAND (base, 0)) == SSA_NAME > + && SSA_NAME_IS_DEFAULT_DEF (TREE_OPERAND (base, 0)) > + && (TREE_CODE (SSA_NAME_VAR (TREE_OPERAND (base, 0))) > + == PARM_DECL)) This code should also be in the op->base part. I'd prefer if you pass DECL_BY_REFERENCE rather than second-guessing it (or assert you guessed right). > + v = ipcp_get_aggregate_const (SSA_NAME_VAR (TREE_OPERAND (base, 0)), > + true, offset, size); > + } > + if (v) > + return data->finish (vr->set, vr->base_set, v); > + } > + > return NULL; > } > > -- Richard Biener SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)