From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 13033 invoked by alias); 10 Oct 2004 16:52:21 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 13021 invoked from network); 10 Oct 2004 16:52:18 -0000 Received: from unknown (HELO Cantor.suse.de) (195.135.220.2) by sourceware.org with SMTP; 10 Oct 2004 16:52:18 -0000 Received: from extimap.suse.de (extimap.suse.de [195.135.220.6]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by Cantor.suse.de (Postfix) with ESMTP id C6C52D1F3D8; Sun, 10 Oct 2004 18:49:45 +0200 (CEST) Received: from stevenb.home.suse.de (70-90.ipact.nl [82.210.90.70]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (Client did not present a certificate) by extimap.suse.de (Postfix) with ESMTP id ED7B96164A; Sun, 10 Oct 2004 18:49:44 +0200 (CEST) From: Steven Bosscher Organization: SUSE Labs To: Razya Ladelsky , gcc-patches@gcc.gnu.org Subject: Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension (RESUBMISSION) Date: Sun, 10 Oct 2004 17:10:00 -0000 User-Agent: KMail/1.5.4 Cc: hubicka@ucw.cz, Mircea Namolaru , Ayal Zaks References: In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200410101850.14044.stevenb@suse.de> X-SW-Source: 2004-10/txt/msg00858.txt.bz2 On Sunday 10 October 2004 17:20, Razya Ladelsky wrote: > Comments are welcome. [ The first part of this is turned into a bit of a rant, as I'm feeling really bad about where you are implementing these optimizations. IMHO the abstraction level is wrong, we are not taking things in the right direction... Skip to /rant for comments on the actual patch ;-) ] I don't mean to say anything bad about your work, but why do you not try to help implementing *proper* IPA, starting with a CFG inliner and early global (ie. intraprocedural) optimizations *before* IPA, like apparently most compilers do? To quote myself from "http://gcc.gnu.org/projects/tree-profiling.html": ============================================================== (...) plans exist for reorganisation of the compiler passes as follows: 1. For each SCC in the call graph in reverse topological order do simple intraprocedural optimizations in SSA form (basically, run anything that is cheap and can not create overlapping live ranges, but will improve the results of the next step); collect intraprocedural information necessary for interprocedural data flow analyses; store the function's intermediate representation somewhere (*) 2. Perform interprocedural analyses and optimizations (at least constant propagation, side-effects analysis, and function cloning and specialization). 3. For each SCC in the call graph in reverse topological order do remaining intraprocedural optimizations; hand over the optimized function to the RTL expanders for finalization and code generation. ============================================================== (*) Either in memory or in a data base. And yes I *am* willing to open that discussion again if it turns out that we can do great things with IPA but storing in memory is too expennsive (d'oh!). This means that when we start working on this (most likely before the end of this year), we (at SUSE, IBM, or whereever) will have to rewrite for the new IPA framework *everything* that you're now sort-of hackishly implementing in the existing framework that clearly is unsuitable for proper IPA. The reason *why* almost everything would have to be redone is because your implementation does not work on lowered GIMPLE, and also because we'll need to clone function CFGs (ie. not function body trees) or lose the CFG for cloned/inlined functions, and reconstruct them later on, somehow (but that is not really an option IMO). This is wasteful. I would *much* rather have everyone agree on clean IPA framework design, and work together on implementing that. Don't get me wrong, it is really really cool that y'all want to have this implemented in GCC. So do I. It is also great that you're experimenting with these algorithms, demonstrating that it can be done for GCC. But what is happening right now is making things only more difficult in the long run. We already have this lame excuse for IPA with the "Inter Module" C front end hack that will have to be redone if GCC is ever going to support real IPA. And IMHO we do not need more obfuscated pseudo-IPA/IPO implementations like that; not now that we have the opportunity to do it properly with the cgraph module and low-GIMPLE which is relatively easy to serialize and dump to a data base. If only these reasons mattered, I'd really rather not have this patch in CVS. Perhaps other people have a different opinion, of course ;-) Now, about the patch. First off, you should probably just create a new file cgraph-clone.c or something to put most of the new functions. cgraph*.[ch] is already a bit messy and we plan to clean it up. Second, I'm still very curious about numbers. What does this do to the compile time? Have you counted propagated constants and/or cloned functions on SPEC or GCC itself (or some other benchmark)? A few comments on the code itself: > + /* If the parametr's address is taken, > + it could be modified. */ > if( TREE_CODE (TREE_OPERAND (t, 0)) == PARM_DECL ) > { > i = ipcp_method_tree_map (mt, TREE_OPERAND (t, 0)); > ipcp_method_modify_set (mt, i, true); > } > break; This is one of the reasons why I believe we should first work on some framework to do early optimizations. After the tree optimizers, we'll have much better information about what parameters have their address taken. Also, funny you have a tab before "i =" and spaces almost everywhere else. I don't care about spaces vs. tabs much, but I believe the GCC style requires tabs. (I'd say, at least don't do both! ;-) > /* Perform reachability analysis and reclaim all unreachable nodes. > ! If BEFORE_INLINING_P is true this function is called before inlining > ! decisions has been made. If BEFORE_INLINING_P is false this function also > ! removes unneeded bodies of extern inline functions. */ > static bool > ! cgraph_remove_unreachable_nodes (bool before_inlining_p) You should really explain what the difference between BEFORE_INLINING_P and !BEFORE_INLINING_P is: How does this function behave differently with this flag set or not set? > + for (i=0; i < strlen (tmp_name); i++) > + { > + if (tmp_name[i] == '.') > + tmp_name[i] = '_'; > + } So you're going to call strlen for every loop iteration? > --- integrate.c 5 Oct 2004 08:35:55 -0000 > *************** copy_decl_for_inlining (tree decl, tree > *** 176,181 **** > --- 176,248 ---- > > return copy; > } > + > + /* Copy NODE (which must be a DECL). The DECL originally was in the FROM_FN, > + but now it will be in the TO_FN. As apposed for to copy_decl_for_inlining () > + function; we do not give a special treatment to parm_decl and result_decl. */ > + tree > + copy_decl_for_versioning (tree decl, tree from_fn, tree to_fn) This function has no business in integrate.c, this file used to implement the RTL inliner which is long dead and gone. Do not put new code there. This should go into the new file, or otherwise I guess tree-inline is a more appropriate place. Indeed, copy_decl_for_inlining should be moved out of this file also. > + if (TREE_CODE (copy) == LABEL_DECL) > + { > + TREE_ADDRESSABLE (copy) = 0; > + } This is not GNU/GCC code style. > ! { > ! cvalue = ipcp_cval_get_cvalue (ipcp_method_cval (node, i)); > ! if (const_param == 0) > ! { > ! /* Compute how many callers node has. */ > ! node_callers = 0; > ! for (cs = node->callers; cs != NULL; > ! cs = cs->next_caller) > ! node_callers++; The indent of "if" also is not quite GNU/GCC style conforming. And would it be useful to cache the number of callers/callees in the cgraph_node? IIRC I've seen this kind of counting elsewhere, too (but I'm not sure). > + /* ??? Remove the following gaurd. */ > + if (lang_hooks.tree_versioning.cannot_version_tree_fn (&fndecl)) > + return false; Typo: guard ;-) And I actually like the idea that the language can say, "do not clone this", why would you want to remove this? > + /* ??? Is there a way to version a > + constructor? */ Good question. I would hope so, I'd expect (but have no data to support the suspicion, unfortunately) to see constants in constructors a lot, e.g. booleans, number of items/slots/elements/etc... Does anyone know if this has been studied and any data has been published? > + /* The new variable/label has no RTL, yet. */ > + if (!TREE_STATIC (copy) && !DECL_EXTERNAL (copy)) > + SET_DECL_RTL (copy, NULL_RTX); Hmm... This somehow looks odd, looks like a remnant of the RTL inliner. My understanding is that since we inline before expand, we should never have RTL for local labels and variables when we get here. Hence, this check might very well be redundant (you could try replacing it with a "gcc_assert (!DECL_RTL (copy))"). > + /* lang_hooks.tree_versioning.cannot_version_tree_fn is called to > + determine whether there are language-specific reasons for not > + creating a new version to a given function. */ "for a given function". > *************** ipcp_method_compute_modify (ipcp_method > *** 743,749 **** > > ipcp_method_modify_init (mt); > decl = mt->decl; > ! if (!tree_inlinable_function_p (decl)) > { > for (j = 0; j < ipcp_method_formal_count (mt); j++) > { > --- 799,807 ---- > > ipcp_method_modify_init (mt); > decl = mt->decl; > ! /* ??? Handle pending sizes case. Set all parameters > ! of the method to be modified. */ > ! if (DECL_UNINLINABLE (decl)) > { > for (j = 0; j < ipcp_method_formal_count (mt); j++) > { What is the reason for this change? I cannot tell from the ChangeLog, because you don't really *have* a ChangeLog entry for ipa_prop.c: > * ipa_prop.c: Update IPA constant propagation. > * ipa_prop.h: Same. Please add a proper ChangeLog entry for all patches you post. It is required for good reasons, patch review without a ChangeLog entry is very difficult sometimes. It would generally be helpful if you give a short explanation for the changes you make to existing files, especially for the places you've annotated with a FIXME, ???, XXX, or some other marker indicating that the code as-is may not be perfect. Thanks, Gr. Steven