From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 13354 invoked by alias); 26 Jun 2014 18:46:22 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 13317 invoked by uid 89); 26 Jun 2014 18:46:21 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: nikam.ms.mff.cuni.cz Received: from nikam.ms.mff.cuni.cz (HELO nikam.ms.mff.cuni.cz) (195.113.20.16) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Thu, 26 Jun 2014 18:46:18 +0000 Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id 3067C543B09; Thu, 26 Jun 2014 20:46:15 +0200 (CEST) Date: Thu, 26 Jun 2014 18:46:00 -0000 From: Jan Hubicka To: Jeff Law Cc: mliska , gcc-patches@gcc.gnu.org, hubicka@ucw.cz Subject: Re: [PATCH 3/5] IPA ICF pass Message-ID: <20140626184614.GB29056@kam.mff.cuni.cz> References: <53A9E021.8000400@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <53A9E021.8000400@redhat.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-SW-Source: 2014-06/txt/msg02159.txt.bz2 Jeff, thanks for review! I did some passes over the patch before it got to the ML, I am happy to have independent opinion. > >+@item -fipa-icf > >+@opindex fipa-icf > >+Perform Identical Code Folding for functions and read-only variables. > >+Behavior is similar to Gold Linker ICF optimization. Symbols proved > >+as semantically equivalent are redirected to corresponding symbol. The pass > >+sensitively decides for usage of alias, thunk or local redirection. > >+This flag is enabled by default at @option{-O2}. > So you've added this at -O2, what is the general compile-time > impact? Would it make more sense to instead have it be part of -O3, > particularly since ICF is rarely going to improve performance (sans > icache issues). I think code size optimization not sacrifying any (or too much of) performance are generally very welcome at -O2. Compared to LLVM and Microsoft compilers we are on code bloat side at -O2. http://hubicka.blogspot.ca/2014/04/linktime-optimization-in-gcc-2-firefox.html has some numbers for -O2 GGC/LLVM. I believe this is result of tunning for relatively small benchamrks (SPECS) and I hope we could revisit -O2 for code size considerations for 4.10 somewhat. If tuned well, ICF has no reason to be expnesive wrt compile time. So lets shoot for that. The considerable donwside of enabling ICF IMO should be only disturbing effect on debug info. > >+ return true; > >+} > Isn't this really checking for equivalence? "do correspond" seems > awkward here. The function turns the names equivalent on first invocation for a given name and later checks that this tentative equivalence holds. Not sure what is best name for it (originaly it was verify that did not sound right to me either) > > >+ > >+/* Verification function for edges E1 and E2. */ > >+ > >+bool > >+func_checker::compare_edge (edge e1, edge e2) > >+{ > >+ if (e1->flags != e2->flags) > >+ return false; > Presumably there's no flags we can safely ignore. So absolute > equality seems reasonable here. Yep > >+/* Compare two types if are same aliases in case of strict aliasing > >+ is enabled. */ > >+bool > >+sem_item::compare_for_aliasing (tree t1, tree t2) > >+{ > >+ if (flag_strict_aliasing) > >+ { > >+ alias_set_type s1 = get_deref_alias_set (TREE_TYPE (t1)); > >+ alias_set_type s2 = get_deref_alias_set (TREE_TYPE (t2)); > >+ > >+ return s1 == s2; > >+ } > >+ > >+ return true; > >+} > Is returning TRUE really the conservatively correct thing to do in > the absence of aliasing information? Isn't that case really "I > don't know" in which case the proper return value is FALSE? I think with -fno-strict-aliasing the set should be 0 (Richi?) and thus we can compare for equality. We probably can be on agressive side and let 0 alias set prevail the non-0. But that can be done incrementally. We also need to match type inheritance equality for polymorphic types. I will add function for that into ipa-devirt. > >+/* References independent hash function. */ > >+ > >+hashval_t > >+sem_function::get_hash (void) > >+{ > >+ if(!hash) > >+ { > >+ hash = 177454; /* Random number for function type. */ > >+ > >+ hash = iterative_hash_object (arg_count, hash); > >+ hash = iterative_hash_object (bb_count, hash); > >+ hash = iterative_hash_object (edge_count, hash); > >+ hash = iterative_hash_object (cfg_checksum, hash); > Does CFG_CHECKSUM encompass the bb/edge counts? It is one used by profiling code to match profiles, so it should. > >+ SE_EXIT_FALSE(); > >+ > >+ if (!equals_wpa (item)) > >+ return false; > >+ > >+ /* Checking function arguments. */ > >+ tree decl1 = DECL_ATTRIBUTES (decl); > >+ tree decl2 = DECL_ATTRIBUTES (compared_func->decl); > So are there any attributes we can safely ignore? Probably not. > However, we ought to handle the case where the attributes appear in > different orders. There are few, like we can ignore "weak" or "visibility" attribute because we do produce alias with proper visibility anyway. My plan is to start removing those attributes from declarations once they are turned into suitable representation in symbol table (or for attributes like const/noreturn/pure where we have explicit decl flags). This will make our life bit easier later, too. We probably then can whitelist some attributes, but I would deal with this later. > >+/* Returns cgraph_node. */ > >+ > >+struct cgraph_node * > >+sem_function::get_node (void) > >+{ > >+ return cgraph (node); > >+} > >+ > >+/* Initialize semantic item by info reachable during LTO WPA phase. */ > >+ > >+void > >+sem_function::init_wpa (void) > >+{ > >+ parse_tree_args (); > >+} > inline? Worth or not worth the headache? We ought to autoinline simple wrappers even at -Os (for size) (I am not agains explicit inline keywords here tough) > > > >+ > >+bool > >+sem_function::compare_bb (sem_bb_t *bb1, sem_bb_t *bb2, tree func1, tree func2) > So this routine walks down the gimple statements and compares them > for equality. Would it make sense to have the equality testing in > gimple? That way if someone adds a new gimple code the places they > need to check/update are at least somewhat more localized? There are few places that does equality testing (tailmerge) AFAIK. They are all somewhat different - I think we can export this equality machinery with reosnable API and try to turn those to use it, but it may be good incremental project IMO. > > > >+ > >+ for (i = 0; i < size1; ++i) > >+ { > >+ t1 = gimple_phi_arg (phi1, i)->def; > >+ t2 = gimple_phi_arg (phi2, i)->def; > >+ > >+ if (!compare_operand (t1, t2, func1, func2)) > >+ SE_EXIT_FALSE (); > >+ > >+ e1 = gimple_phi_arg_edge (phi1, i); > >+ e2 = gimple_phi_arg_edge (phi2, i); > >+ > >+ if (!checker.compare_edge (e1, e2)) > >+ SE_EXIT_FALSE (); > >+ } > I don't think we guarantee any particular order on the PHI args. > ISTM you'd want to sort them or something so as not to reject a > possible duplicate simply because of ordering issues. > > Related, I'm not sure bb indexes are even guaranteed to have any > stable ordering. So ISTM you'd want to do something like a DFS walk > to set a index for each block, then sort the PHI arguments based on > DFS index to get a stable, consistent check here. Yep, there are no resonable orders on it. If function starts same in source code they ought to end up same here. Plan was to first match for exact equality and then play with smarter tricks here. > > I'm starting to gloss over things.... It feels like we've got too > much stuff in this one file. Breaking things out would help (like > for example the hashing/equality bits). I'd like to see things > broken out a bit and reposted for further reviewing. Yep, the pass has grown up to be rather long. The gimple equality testing is the main body of work, so perhaps doing this in separate file is good idea. Honza > > Jeff