From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 9131 invoked by alias); 18 Jan 2004 22:33:46 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 9124 invoked from network); 18 Jan 2004 22:33:45 -0000
Received: from unknown (HELO atrey.karlin.mff.cuni.cz) (195.113.31.123)
  by sources.redhat.com with SMTP; 18 Jan 2004 22:33:45 -0000
Received: by atrey.karlin.mff.cuni.cz (Postfix, from userid 4018)
  id 115E94C0381; Sun, 18 Jan 2004 23:33:18 +0100 (CET)
Date: Sun, 18 Jan 2004 22:33:00 -0000
From: Jan Hubicka
To: Chris Lattner
Cc: Richard Henderson, "Kaveh R. Ghazi", ian@wasabisystems.com,
  gcc@gcc.gnu.org, "Joseph S. Myers"
Subject: Re: Can we speed up the gcc_target structure?
Message-ID: <20040118223317.GE11268@atrey.karlin.mff.cuni.cz>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.4i
X-SW-Source: 2004-01/txt/msg01201.txt.bz2

>
> Richard Henderson wrote:
> > On Sun, Jan 18, 2004 at 09:14:14PM +0000, Joseph S. Myers wrote:
> > > When --enable-intermodule is used, does (or should) the compiler
> > > manage to detect which parts of the target structure are in fact
> > > constant (even without constifying)?
>
> LLVM is very good at this kind of stuff.

Yeah..

>
> > However, that sort of optimization requires that you see the *entire*
> > program, not just large parts of it, as with the current intermodule
> > code. So I expect this sort of thing is relatively far away.
>
> This is not really true. At some point, the structure needs to be marked
> as having internal linkage. In LLVM, this is accomplished with the
> "internalize" pass, which by default marks all symbols internal if the
> linked program contains a main (ie, this does not happen for libraries).
> This change enables a _lot_ of interprocedural optimizations that would
> not be safe to perform otherwise. Of course the internalize pass can be
> completely disabled, or enabled for a list of symbols as needed.

In my prototype unit-at-a-time code I have either the default mode, where
all external symbols are just external, or -fwhole-program mode, where all
symbols that have definitions are considered internal except for main and
symbols marked with the "used" attribute. I suppose this is mostly
equivalent to what LLVM has, except that the list of symbols with external
linkage is bundled in the source itself instead of being in a separate
file. Do you have any experience with the separate file having
considerable benefits?

>
> Note that it is quite possible that the user would like to run the
> internalize pass _before_ the whole program is available, for example, to
> prune the public symbols exposed by a library.
>
> > One possibility is a switch that says "except for main, nothing
> > outside these files reference any of the symbols herein defined."
> > That might get you the same effect as whole-program optimization
> > without having to have extra info about external runtime libraries.
>
> This is _extremely_ dangerous, and in practice, cannot be done. A
> compiler _very rarely_ has the entire program to analyze, and must
> therefore be able to handle the fact that there is external code that can
> access program structures (e.g., there can be precompiled libraries (ie,
> libc, libm), dynamically loaded libraries (plugins), etc).
>
> The nice thing about LLVM using the internalize pass is that if it is used
> incorrectly, a program will not link. If you use a "whole program"
> compiler incorrectly, it will be silently misoptimized, which IMHO is
> _much_ worse.

My plan is to make all symbols that are considered local static, so it
will eliminate this problem.

Honza

>
> -Chris
>
> --
> http://llvm.cs.uiuc.edu/
> http://www.nondot.org/~sabre/Projects/
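
For illustration only, here is a rough C sketch of what "considered internal" versus "marked by the used attribute" could look like when written out by hand in the source. All identifiers (call_count, bump, plugin_entry, run_demo) are hypothetical and chosen for the example. The `static` keyword and GCC's `__attribute__((used))` are existing features; how the prototype -fwhole-program mode treats them is taken from the description above and is an assumption, not verified behaviour.

```c
/* Hand-written equivalent of the internalized state: symbols that the
 * whole-program mode would treat as local are declared static here. */

/* Internal linkage: no other translation unit can name this variable,
 * so the compiler sees every read and write of it. */
static int call_count = 0;

/* Internal linkage: every caller is visible in this file, which is what
 * makes interprocedural optimization of bump() safe. */
static int bump(void)
{
    return ++call_count;
}

/* Hypothetical entry point that outside code (for example a plugin
 * loader) may reference. The "used" attribute asks GCC to emit it even
 * if no caller is visible; treating it as the "keep external" marker
 * follows the email's description and is an assumption. */
__attribute__((used)) int plugin_entry(int x)
{
    return x + bump();
}

/* Hypothetical driver standing in for code reachable from main. */
int run_demo(void)
{
    bump();
    bump();
    return call_count;
}
```

If another file tried to reference call_count or bump directly, the link would fail with an undefined symbol rather than produce a silently misoptimized program, which is the property both approaches in this thread are after.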