From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 21308 invoked by alias); 28 Feb 2004 01:09:04 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 21270 invoked from network); 28 Feb 2004 01:09:03 -0000 Received: from unknown (HELO mail-out3.apple.com) (17.254.13.22) by sources.redhat.com with SMTP; 28 Feb 2004 01:09:03 -0000 Received: from mailgate1.apple.com (a17-128-100-225.apple.com [17.128.100.225]) by mail-out3.apple.com (8.12.11/8.12.11) with ESMTP id i1S193Pp001652 for ; Fri, 27 Feb 2004 17:09:03 -0800 (PST) Received: from relay3.apple.com (relay3.apple.com) by mailgate1.apple.com (Content Technologies SMTPRS 4.3.6) with ESMTP id ; Fri, 27 Feb 2004 17:09:02 -0800 Received: from [17.201.21.152] (bowdro.apple.com [17.201.21.152]) by relay3.apple.com (8.12.11/8.12.11) with ESMTP id i1S18j5P027616; Sat, 28 Feb 2004 01:08:46 GMT In-Reply-To: <200402280020.i1S0KZ4J006446@scanner2.ics.uci.edu> References: <2B448A8E-6977-11D8-B63F-000A957D89DA@apple.com> <200402280020.i1S0KZ4J006446@scanner2.ics.uci.edu> Mime-Version: 1.0 (Apple Message framework v612) Content-Type: text/plain; charset=US-ASCII; format=flowed Message-Id: Content-Transfer-Encoding: 7bit Cc: gcc@gcc.gnu.org, Robert Bowdidge From: Robert Bowdidge Subject: Re: cutting the size of CONST_DECLs Date: Sat, 28 Feb 2004 02:38:00 -0000 To: Dan Nicolaescu X-SW-Source: 2004-02/txt/msg01473.txt.bz2 On Feb 27, 2004, at 4:20 PM, Dan Nicolaescu wrote: > You might also want to look at doing something about FUNCTION_DECLs. > There are > a few fields in struct tree_decl that are only used for > FUNCTION_DECLs: inlined_fns, saved_tree. Yep, separating FUNCTION_DECLs from everything else would definitely help. One problem is that function declarations aren't the most common node we see when parsing our headers; in my examples, FUNCTION_DECLs were only 14-20% of declarations. PARM_DECLs were usually 33-36% of declarations, so separating them out makes twice the difference. Here's the distributions of declaration kinds I saw on some example compiles: PARAM FN CONST TYPE FIELD VAR FinderFE 30% 29% 24% 16% ? ? gcc 30% 34% 10% 3% 21% 3% rogue 27% 59% 0% 5% 9% 5% AppearanceSample 36% 20% 18% 16% 7% 0% SimpleText 33% 14% 28% 13% 11% 4% Each represents declaration kinds found during compile of a representative file. FinderFE is the C++ front end to the Mac OS Finder. Appearance Sample and SimpleText are two GUI applications from Apple's sample code; AppearanceSample is C++, and SimpleText C. One possibility I've strongly considered is to have three levels of declarations: small: CONST_DECL medium: PARM_DECL, TYPE_DECL,FIELD_DECL large: FN_DECL, VAR_DECL On 3.3, small declarations to 72 bytes, medium declarations 100 bytes, and large declarations 116 bytes. I've done a few measurements with this set of objects; memory use and compile times improve again from the CONST_DECL/everything else scheme, though the memory improvements are still in the <10% ballpark, and compile time improvements are much smaller. I still need to do more measurements on non-Apple source code, and double-check that the changes aren't slowing the compiler. Adding different kinds of declarations adds a bit of overhead, so we don't want to take this too far. At a minimum, the DECL_P() macro needs to do more work to decide if a given node is a declaration. On a stock system, this is a single comparison: is the tree code class 'd'? With three kinds of declarations, I could make the tree code classes 'd', 'D', and '$'. With these characters, the test in DECL_P() can use a bitmask to see if (tree_code & 0x1f) == 0x04 rather than doing three comparisons. Deciding which kinds of declarations behave similarly can be difficult from just the table of access patterns. I ended up using concept analysis to identify likely hierarchies. (Snelting and Tip, "Understanding Class Hierarchies using Concept Analysis", ACM Transactions on Programming Languages and Systems 22(3), May 2000). There's a tool available on Sourceforge ("conexp") to draw an optimized class hierarchy (well, a concept lattice) given a set of objects and properties. The three-level scheme pretty much fell out once I looked at the lattice. Robert