From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 14918 invoked by alias); 10 Aug 2012 03:12:50 -0000 Received: (qmail 14909 invoked by uid 22791); 10 Aug 2012 03:12:48 -0000 X-SWARE-Spam-Status: No, hits=-3.8 required=5.0 tests=AWL,BAYES_00,KHOP_RCVD_UNTRUST,RCVD_IN_DNSWL_LOW,RCVD_IN_HOSTKARMA_W,RCVD_IN_HOSTKARMA_WL,TW_TM,T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from nikam.ms.mff.cuni.cz (HELO nikam.ms.mff.cuni.cz) (195.113.20.16) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 10 Aug 2012 03:12:33 +0000 Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id 1C39F888004; Fri, 10 Aug 2012 05:12:31 +0200 (CEST) Date: Fri, 10 Aug 2012 03:12:00 -0000 From: Jan Hubicka To: GCC Patches , Jan Hubicka Subject: Re: [PATCH 2/3] Incorporate aggregate jump functions into inlining analysis Message-ID: <20120810031231.GE12788@kam.mff.cuni.cz> References: <20120802192811.GC20202@virgil.arch.suse.de> <20120803155016.GA29154@virgil.arch.suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20120803155016.GA29154@virgil.arch.suse.de> User-Agent: Mutt/1.5.18 (2008-05-17) Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2012-08/txt/msg00595.txt.bz2 > Hi, > > this patch uses the aggregate jump functions created by the previous > patch in the series to determine benefits of inlining a particular > call graph edge. It has not changed much since the last time I posted > it, except for the presence of by_ref flags and removal of checks > required by TBAA which we now do not use. > > The patch works in fairly straightforward way. It ads two flags to > struct condition to specify it actually refers to an aggregate passed > by value or something passed by reference, in both cases at a > particular offset, also newly stored in the structures. Functions > which build the predicates specifying under which conditions CFG edges > will be taken or individual statements are actually executed then > simply also look whether a value comes from an aggregate passed to us > in a parameter (either by value or reference) and if so, create > appropriate conditions. Later on, predicates are evaluated as before, > we only also look at aggregate contents of the jump functions of the > edge we are considering to inline when evaluating the predicates, and > also remap the offsets of the jump functions when remapping over an > ancestor jump function. > > This patch alone makes us inline the function bar in testcase of PR > 48636 in comment #4. It also passes bootstrap and testing on > x86_64-linux. I successfully LTO-built Firefox with it too. > > Thanks for all comments and suggestions, > > Martin > > > 2012-07-31 Martin Jambor > > PR fortran/48636 > * ipa-inline.h (condition): New fields offset, agg_contents and by_ref. > * ipa-inline-analysis.c (agg_position_info): New type. > (add_condition): New parameter aggpos, also store agg_contents, by_ref > and offset. > (dump_condition): Also dump aggregate conditions. > (evaluate_conditions_for_known_args): Also handle aggregate > conditions. New parameter known_aggs. > (evaluate_properties_for_edge): Gather known aggregate contents. > (inline_node_duplication_hook): Pass NULL known_aggs to > evaluate_conditions_for_known_args. > (unmodified_parm): Split into unmodified_parm and unmodified_parm_1. > (unmodified_parm_or_parm_agg_item): New function. > (set_cond_stmt_execution_predicate): Handle values passed in > aggregates. > (set_switch_stmt_execution_predicate): Likewise. > (will_be_nonconstant_predicate): Likewise. > (estimate_edge_devirt_benefit): Pass new parameter known_aggs to > ipa_get_indirect_edge_target. > (estimate_calls_size_and_time): New parameter known_aggs, pass it > recrsively to itself and to estimate_edge_devirt_benefit. > (estimate_node_size_and_time): New vector known_aggs, pass it o > functions which need it. > (remap_predicate): New parameter offset_map, use it to remap aggregate > conditions. > (remap_edge_summaries): New parameter offset_map, pass it recursively > to itself and to remap_predicate. > (inline_merge_summary): Also create and populate vector offset_map. > (do_estimate_edge_time): New vector of known aggregate contents, > passed to functions which need it. > (inline_read_section): Stream new fields of condition. > (inline_write_summary): Likewise. > * ipa-cp.c (ipa_get_indirect_edge_target): Also examine the aggregate > contents. Let all local callers pass NULL for known_aggs. > > * testsuite/gfortran.dg/pr48636.f90: New test. OK with the following changes. I plan to push out my inline hints code, so it would be nice if you commited soon so I cn resolve conflicts on my side. > Index: src/gcc/ipa-inline.h > =================================================================== > *** src.orig/gcc/ipa-inline.h > --- src/gcc/ipa-inline.h > *************** along with GCC; see the file COPYING3. > *** 28,36 **** > --- 28,45 ---- > > typedef struct GTY(()) condition > { > + /* If agg_contents is set, this is the offset from which the used data was > + loaded. */ > + HOST_WIDE_INT offset; > tree val; > int operand_num; > enum tree_code code; > + /* Set if the used data were loaded from an aggregate parameter or from > + data received by reference. */ > + unsigned agg_contents : 1; > + /* If agg_contents is set, this differentiates between loads from data > + passed by reference and by value. */ > + unsigned by_ref : 1; Do you have any data on memory usage? I was originally concerned about memory use of the whole predicate thingy on WPA level. Eventually we could add simple inheritance on conditions and sort them into mutiple vectors if needed. But I assume it is OK or we will work out on Mozilla builds soonish. One obvious thing is to patch CODE and the bitfields so we fit in 3 64bit words. > *************** dump_condition (FILE *f, conditions cond > *** 519,524 **** > --- 554,561 ---- > c = VEC_index (condition, conditions, > cond - predicate_first_dynamic_condition); > fprintf (f, "op%i", c->operand_num); > + if (c->agg_contents) > + fprintf (f, "[offset: " HOST_WIDE_INT_PRINT_DEC "]", c->offset); Dumping of by_ref? > if (c->code == IS_NOT_CONSTANT) > { > fprintf (f, " not constant"); > --- 718,761 ---- > tree val; > tree res; > > ! /* We allow call stmt to have fewer arguments than the callee function > ! (especially for K&R style programs). So bound check here (we assume > ! known_aggs vector, if non-NULL, has the same length as > ! known_vals). */ > ! gcc_assert (!known_aggs Perhaps checking_assert. This tends to be hot path of WPA inliner. > --- 833,857 ---- > es->param, > i)->change_prob) > VEC_replace (tree, known_vals, i, error_mark_node); > + /* TODO: When IPA-CP starts merging aggregate jump functions, use its > + knowledge of the caller too, just like the scalar case above. */ > + VEC_replace (ipa_agg_jump_function_p, known_aggs, i, &jf->agg); By merging of arggregate functions you meen propagating from caller to callee? > ! /* If OP refers to a value of a function parameter or value loaded from an > ! aggregate passed to a parameter (either by value or reference), return TRUE > ! and store the number of the parameter to *INDEX_P and information whether > ! and how it has been loaded from an aggregate into *AGGPOS. INFO describes > ! the function parameters, STMT is the statement in which OP is used or > ! loaded. */ > ! > ! static bool > ! unmodified_parm_or_parm_agg_item (struct ipa_node_params *info, > ! gimple stmt, tree op, int *index_p, > ! struct agg_position_info *aggpos) I assume this follows the same startegy as the ipa-cp code, right? > *************** set_cond_stmt_execution_predicate (struc > *** 1480,1485 **** > --- 1609,1615 ---- > || gimple_call_num_args (set_stmt) != 1) > return; > op2 = gimple_call_arg (set_stmt, 0); > + /* TODO: Use unmodified_parm_or_parm_agg_item also here. */ Why it is TODO? > base = get_base_address (op2); > parm = unmodified_parm (set_stmt, base ? base : op2); > if (!parm) > --- 1800,1817 ---- > return p; > > is_load = gimple_vuse (stmt) != NULL; > /* Loads can be optimized when the value is known. */ > if (is_load) > { > ! tree op; > gcc_assert (gimple_assign_single_p (stmt)); > ! op = gimple_assign_rhs1 (stmt); > ! if (!unmodified_parm_or_parm_agg_item (info, stmt, op, &base_index, > ! &aggpos)) > return p; > } > + else > + base_index = -1; Don't we miss the actual check that the load is load of known aggregate value on this path? > /* Translate all conditions from callee representation into caller > representation and symbolically evaluate predicate P into new predicate. > > ! INFO is inline_summary of function we are adding predicate into, CALLEE_INFO > ! is summary of function predicate P is from. OPERAND_MAP is array giving > ! callee formal IDs the caller formal IDs. POSSSIBLE_TRUTHS is clausule of all > ! callee conditions that may be true in caller context. TOPLEV_PREDICATE is > ! predicate under which callee is executed. OFFSET_MAP is an array of of > ! offsets that need to be added to conditions, negative offset means that > ! conditions relying on values passed by reference have to be discarded > ! because they might not be preserved (and should be considered offset zero > ! for other purposes). */ Do you handle cases like arg passed by value turning into arg passed by reference? If not, just add an TODO. Where do we verify that by_reference flags match? Thanks, Honza