From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 18834 invoked by alias); 8 Jul 2011 18:24:51 -0000 Received: (qmail 18825 invoked by uid 22791); 8 Jul 2011 18:24:49 -0000 X-SWARE-Spam-Status: No, hits=-1.7 required=5.0 tests=AWL,BAYES_00,TW_JF,T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from nikam.ms.mff.cuni.cz (HELO nikam.ms.mff.cuni.cz) (195.113.20.16) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 08 Jul 2011 18:24:33 +0000 Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id 0933D9AC87B; Fri, 8 Jul 2011 20:24:31 +0200 (CEST) Date: Fri, 08 Jul 2011 19:07:00 -0000 From: Jan Hubicka To: Jan Hubicka , GCC Patches Subject: Re: [PATCH] New IPA-CP with real function cloning Message-ID: <20110708182430.GB21021@kam.mff.cuni.cz> References: <20110623192404.GC2736@virgil.arch.suse.de> <20110707160307.GG26368@atrey.karlin.mff.cuni.cz> <20110708165959.GB19992@virgil.arch.suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20110708165959.GB19992@virgil.arch.suse.de> User-Agent: Mutt/1.5.18 (2008-05-17) Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2011-07/txt/msg00678.txt.bz2 > > > /* Structure holding data required to describe a pass-through jump function. */ > > > > > > struct GTY(()) ipa_pass_through_data > > > { > > > /* If an operation is to be performed on the original parameter, this is the > > > second (constant) operand. */ > > > tree operand; > > > /* Number of the caller's formal parameter being passed. */ > > > int formal_id; > > > > I probably should use this in ipa-inline-analsysi where I call it for some reason operand_num :) > > So far I have resisted the urge to rename this but it pre-dates my > involvement with gcc. I'd like it to be called parm_index but since > we might want to use it also for global variables and we might need > something more complex for also handling parts of aggregates, I left > it for later. parm_index sounds to me good, too. formal_id is the name used in paper so it makes sense, but its meaning is unobvious. We could rename it later, together with ipa-inline-analysis one (I think consistency here is more important). Yep, we will have to see how the jfuncs will look like once they reffer to global vars and parts of agregates.. > > > > > > > > > struct ipcp_value; > > > > I wonder if the jump functions used by several passes and ipcp > > internal types both has o go into the same header? > > Well, I originally wanted ipa-prop to provide services to the outside > world like ipa-inline-analysis.c and to have ipa-cp self-contained. > But I guess the data separation is more important. So if we move > ipa_cst_from_jfunc (and its 3 friends) to ipa-cp and move > ipcp_lattice.decl and ipcp_lattice.used to a special structure, I > might move ipcp_value_source, ipcp_value, and ipcp_lattice altogether > to ipa-cp. I also think that having ipa-prop as a generic module for propagating& looking into args is the goal. Separating datastructures form ipa-cp definitely makes this more obvious. I've done that to the inliner, too. inline_summary is now all about the function body size/time estimates and the inliner heuristics do have their own datastructures in their own space. > > At the moment I'm not sure whether I want to do this as a followup > patch or incorporate it in the changes. I think I'll start with the > latter and revert to the former if it is too invasive to parts which > have not been touched so far by the change. Lets see, I am happy with the patch going in with current organization of datastructures if we move into privatizing it later. > > > { > > > /* Pointer to an array of structures describing individual formal > > > parameters. */ > > > struct ipcp_lattice *lattices; > > > > Hmm, how we get here around the need to mark this GTY(). I.e are we sure that all the known_vals > > must be referneced from elsewhere at ggc time? > > (Scalar) constants that are results of arithmetic jump functions may > not be referenced from elsewhere, everything else is referenced from > the jump functions. If it is a problem it is already present in the > current IPA-CP. ipa_node_params and lattices are not GTYed there > either. Hmm, I guess it is not really problem only because the lattices are used only in ipa-cp so the values do not really live across GGC call. Well, this will be solved by separating out the ipa-cp datastructures, so it is not a problem. > > > I would also slowly switch those things to VECtors.. > > Perhaps, but individual lattices are and always have been accessed > through ipa_get_lattice which checks bounds and so there's no big > reason to do that. Yep, I added the bounds check when I was debugging. Not big deal, just we sort of do have agreement using our VECtor API where it fits.. > > > > > > /* Only for versioned nodes this field would not be NULL, > > > it points to the node that IPA cp cloned from. */ > > > struct cgraph_node *ipcp_orig_node; > > Why not use node->clone_of here? > > That would not work if the node was a clone created by some other > pass. I need to differentiate between clones I create because they do > not have lattices but do have the exact values for individual > parameters in known_vals (which is NULL otherwise). I should probably > use a flag though, the code I ended up only checks it for NULL anyway. Hmm, OK, either flag or keeping the pointer is fine. > > > > > > /* ipa_edge_args stores information related to a callsite and particularly its > > > arguments. It can be accessed by the IPA_EDGE_REF macro. */ > > > typedef struct GTY(()) ipa_edge_args > > > > probably edge_summary would be my preferred name. > > Ugh, this is the current name, we may change it later. In any event > the name should probably tell that the summary is about parameters. Hmm, OK, it is not bad name after all. > > > > > > { > > > /* Next pointer in a linked list of clones of the same function. */ > > > struct cgraph_edge *next_edge_clone; > > > > What this is needed for? > > For get_info_about_necessary_edges and gather_edges_for_value. If a > value comes from a node and that is itself cloned, it now may come > from two nodes and so when I iterate over potential sources, I have to > include all such edges. This is their linked list. OK, I see. > > This field could be put into a separate structure because it is not > really part of the summary, it is needed only when the clones are > being created. I'll do that too. > > > > > > > /* Number of actual arguments in this callsite. When set to 0, > > > this callsite's parameters would not be analyzed by the different > > > stages of IPA CP. */ > > > int argument_count; > > > /* Array of the callsite's jump function of each parameter. */ > > > struct ipa_jump_func GTY ((length ("%h.argument_count"))) *jump_functions; > > > > Would be prettier as VECtor. > > The patch does not touch places which manipulate jump functions and > I'd prefer to keep it that way. Yep, lets do that incrementally. > > > The goal of this transformation is to > > > > > > 1) discover functions which are always invoked with some arguments with the > > > same known constant values and modify the functions so that the > > > subsequent optimizations can take advantage of the knowledge, and > > > > > > 2) create specialized versions of functions transformed in this way if > > > some parameters are known constants only in certain contexts but the > > > estimated tradeoff between speedup and cost size is deemed good. > > > > > > The algorithm also propagates types and attempts to perform type based > > > devirtualization. Types are propagated much like constants. > > > > I would add that the algorithm is not doing only the traditional ipa-cp propagation, > > but also handles partial specialization and propagate known types > > for devirtualization. > > I might call point 2 "partial specialization" but otherwise I believe > the comments above already state exactly that... do they not? Hmm, I guess it does.. > > > > > || !inline_summary (node)->versionable > > > || TYPE_ATTRIBUTES (TREE_TYPE (node->decl)) > > > > This is getting an issue for Fortran that attach the function arguments quite commonly, > > perhaps we should start moving ahead on this and ruling out only the arguments that > > gets can't be preserved. > > Yes, for example in 433.milc basically all the functions are > considered not versionable because of this. I also thought of not > removing parameters for such functions. > > > Also you need to test attributes of DECL itself. > > Really? We are not doing it now, nether for IPA-CP nor for IPA-SRA > (which is good at triggering problems with these). Hmm, all the ipa-inline code checks DECL_ATTRIBUTES only. I believe the type attributes ends up being copied to decl attributes but not vice versa. I guess the testcase should be __attribute__ ((really_bad_attribute)) function (int param) { use param } and then let ipa-cp to invent param is a constant. > > > > > I think this all should be part of can_change_signature_p code, since we can version > > with all attributes I can thunk of when we don't change signature. > > > > > || cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE) > > > res = false; > > > else > > > /* Removing arguments doesn't work if the function takes varargs > > > or use __builtin_apply_args. > > > FIXME: handle this together with can_change_signature flag. */ > > > for (edge = node->callees; edge; edge = edge->next_callee) > > > { > > > tree t = edge->callee->decl; > > > if (DECL_BUILT_IN_CLASS (t) == BUILT_IN_NORMAL > > > && (DECL_FUNCTION_CODE (t) == BUILT_IN_APPLY_ARGS > > > || DECL_FUNCTION_CODE (t) == BUILT_IN_VA_START)) > > > > can_change_sigunature will also handle apply_args. > > Add VA_START code there, too. For the other use of this flag (in i386) VA_START > > rules out the change, too, but so far we explicitely tested that in the backend. > > > > Iguess with these changes, inline_summary (node)->versionable should coincide > > with IPA_NODE_REF (node)->node_versionable making this whole code unnecesary > > (it was supposed to work this way, I am not sure why we now have to versionable > > flags). > > That would be nice. I was wondering why the difference between the > two. Again, I am yet to decide whether this should be done as a > followup or within this change. OK. > > Why you compute count_sum instead of using node->count? Is it because you are > > interested in local calls only? > > I'm only interested in known calls, that I can redirect, i.e. in "local" > direct ones. OK > > If you use for_node_thunk_and_aliases you can remove the recursion you do above > > and it will work for aliases of thunk correctly, too ;) > > But I wouldn't be able to check the edge for hotness. BTW currently the edges from thunks miss any profile info. (i.e. it will be 0). I guess we ought to make ipa-profile pass to estimate those (it is difficult ot measure count of an alias). If you walk only hot edges, then you need to make your function descent into both alias refs and thunks calls, or the aliases of thunks will not be seen then. > > I always wondered if we need this, given that we check for use of > > stdarg. Sure writing vararg function without actually using the > > arguments is sort of stupid, but it seems that the code should just > > work. > > I have seen ICEs because of this when the user passed an incorrect > number of parameters even though the functions was not vararg. And I > think that wasn't even with LTO. But a flag saying "just ignore this > for whatever unspecified reason" might be better, yes. Yep, K&R programs can mismatch parameters. In ipa-inline I however simply bound check and assume nothing on the missing parameter. I think you can easilly do that in the propagation stage, too. > > > > > (at least with one bounds check in propagation for case function is called with > > too few argumetns) > > I don't understand the above. > > > > > Also naturally propagating into stdarg functions is possible. > > > > Sure. But not really important to worry about it now, is it? Yep, not big deal. Honza > > Thanks a lot. I'll try to coma up with the structure reorganization > soon. > > Martin