public inbox for gcc@gcc.gnu.org
* cutting the size of CONST_DECLs
From: Robert Bowdidge @ 2004-02-28  0:21 UTC (permalink / raw)
  To: gcc; +Cc: Robert Bowdidge

Hi, all,

At Apple, the compiler's memory use (and its effect on compile speed) 
is our biggest concern.  One specific worry is declarations.  On common 
Mac programs which include headers for our GUI libraries, declarations 
fill a significant amount of garbage-collected memory.  With 
-fmem-report, we usually see about 35% of memory filled with 
declarations.

CONST_DECLs are an obvious offender in our programs; many of the Mac 
headers make heavy use of enums to define symbolic constants.  In some 
programs, CONST_DECLs are the third most common kind of declaration 
after FUNCTION_DECL and PARM_DECL, and can be responsible for 30% of 
declarations and 10% of gc'd memory.

I've tried one change to cut memory use: creating a new, smaller 
structure for CONST_DECLs only, and having all other declarations 
inherit this structure and extend it with the rest of the fields.  On a 
couple of typical applications, this change cuts memory use by 4-6% and 
compilation time by 0-3%.

I'm interested in hearing whether others believe such a data structure 
change would be appropriate in gcc, and whether similar tricks could be 
done in other parts of gcc.

The changes would look something like the following:

struct tree_super_decl {
   struct tree_common common;
   /* Fields needed by CONST_DECL only */
   ...
};

struct tree_decl {
   struct tree_super_decl super_decl;
   /* Fields needed by all other declarations */
   ...
};

Parse tree nodes using tree_super_decl as their structure have a 
different size from other declarations, so the nodes get a new tree 
code class ('D' rather than 'd').  Macros accessing fields in 
tree_super_decl need only a slight change:

#define DECL_NAME(x)  (DECL_CHECK(x)->decl.name)
becomes
#define DECL_NAME(x) (SUPER_DECL_CHECK(x)->super_decl.name)

In addition, DECL_P() would need to be changed to recognize both 'D' 
and 'd' as classes.  Case statements switching on the tree code class 
would need the new class added.
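
To make the scheme concrete, here is a minimal, self-contained sketch 
of the two-class layout.  It is not the actual patch: the tree codes 
are cut down to three, the `name` and `large_only` fields and the 
DECL_LARGE_ONLY macro are hypothetical stand-ins, tree_size() and 
make_node() are simplified re-implementations, and plain casts stand 
in for the `union tree_node` member access used in the real macros.

```c
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-ins for GCC's tree codes and code classes.  */
enum tree_code { CONST_DECL, VAR_DECL, PLUS_EXPR, NUM_TREE_CODES };

/* 'D' = small declaration, 'd' = full declaration, '2' = binary expr.  */
static const char tree_code_class[NUM_TREE_CODES] = { 'D', 'd', '2' };

struct tree_common { enum tree_code code; };
typedef struct tree_common *tree;

struct tree_super_decl {
  struct tree_common common;
  const char *name;                   /* hypothetical shared field */
};

struct tree_decl {
  struct tree_super_decl super_decl;  /* must stay the first member */
  void *large_only;                   /* hypothetical large-only field */
};

#define TREE_CODE(x)       ((x)->code)
#define TREE_CODE_CLASS(c) (tree_code_class[(int) (c)])

/* DECL_P now accepts both declaration classes.  */
#define DECL_P(x) \
  (TREE_CODE_CLASS (TREE_CODE (x)) == 'd' \
   || TREE_CODE_CLASS (TREE_CODE (x)) == 'D')

static int super_decl_ok (tree t) { return DECL_P (t); }
static int decl_ok (tree t) { return TREE_CODE_CLASS (TREE_CODE (t)) == 'd'; }

/* Runtime check in the spirit of --enable-checking: abort on misuse.  */
static tree checked (tree t, int ok, const char *what)
{
  if (!ok)
    {
      fprintf (stderr, "tree check: expected %s\n", what);
      abort ();
    }
  return t;
}

#define SUPER_DECL_CHECK(x) checked ((x), super_decl_ok (x), "any decl")
#define DECL_CHECK(x)       checked ((x), decl_ok (x), "large decl")

#define DECL_NAME(x) \
  (((struct tree_super_decl *) SUPER_DECL_CHECK (x))->name)
#define DECL_LARGE_ONLY(x) \
  (((struct tree_decl *) DECL_CHECK (x))->large_only)

/* Allocation must switch on the class, since 'D' nodes are smaller.  */
static size_t tree_size (enum tree_code code)
{
  switch (TREE_CODE_CLASS (code))
    {
    case 'D': return sizeof (struct tree_super_decl);
    case 'd': return sizeof (struct tree_decl);
    default:  return sizeof (struct tree_common);
    }
}

static tree make_node (enum tree_code code)
{
  tree t = calloc (1, tree_size (code));
  if (t == NULL)
    abort ();
  t->code = code;
  return t;
}
```

With this in place, DECL_NAME works on a node from 
make_node (CONST_DECL) or make_node (VAR_DECL), while 
DECL_LARGE_ONLY on the CONST_DECL node aborts at run time, which is 
the kind of failure --enable-checking is meant to surface.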

Similar approaches have been proposed in the past; the difference is 
that we're focusing on specific kinds of declarations, and we've seen 
measurable differences in real code.  Dan Nicolaescu proposed a similar 
scheme in February 2003 to take fields only used by FUNCTION_DECL out 
of the tree_decl structure:

http://gcc.gnu.org/ml/gcc/2003-02/msg00587.html

Carlo Wood also suggested similar ideas for tree_expr, though he 
proposed extending tree_expr for CALL_EXPR nodes only:
http://gcc.gnu.org/ml/gcc/2003-10/msg00170.html

Advantages:

* Cuts the size of CONST_DECL.  On 3.3, I've been able to get 
CONST_DECL down to 72 bytes as compared to 116 bytes.  More shrinking 
may be possible by figuring out which field accesses on CONST_DECLs are 
accidental or pointless.

* The same technique could be used to shrink other declarations: 
PARM_DECL, TYPE_DECL, and RESULT_DECL all appear to use a subset of the 
tree_decl fields.  All three can be shrunk to 100 bytes with no 
changes.

* The change doesn't alter tree.h much, and appears to require only 
minor fixes elsewhere in the gcc sources: some changes in the tree 
allocation code, additional case statements or if clauses wherever 
the tree code class is checked explicitly for 'd', and a change to 
DECL_P() so that it detects both kinds of declaration structures.
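
As a rough sanity check on the numbers above, the helper below 
estimates the share of gc'd memory saved by shrinking one kind of 
node.  It is a back-of-the-envelope sketch, not a measurement: 
gc_saving() is a hypothetical name, and it assumes that the 10% 
figure and the 72/116-byte sizes describe the same nodes and that 
nothing else changes.

```c
/* Fraction of total gc'd memory saved when nodes that make up
   SHARE_OF_GC of that memory shrink from OLD_BYTES to NEW_BYTES.  */
static double gc_saving (double share_of_gc, double old_bytes,
                         double new_bytes)
{
  return share_of_gc * (old_bytes - new_bytes) / old_bytes;
}
```

Under those assumptions, gc_saving (0.10, 116, 72) is 
0.10 * 44 / 116, or roughly 3.8% of gc'd memory, which is in the same 
range as the 4-6% reduction reported above.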

Disadvantages:

* All the bit field and flag variables should be placed in the 
"tree_super_decl" structure to avoid structure size growth due to 
padding.

* Deciding which fields should be in the "small" declaration requires a 
combination of knowledge of the gcc code and runtime tracing to figure 
out which fields in a declaration are used for constants.  I chose the 
fields by instrumenting the accessor macros to record references to 
the different fields over several runs, then did more compiles with 
"--enable-checking" on to catch the last few stragglers.  There's 
always a chance that some bit of code deep in the compiler accesses a 
small declaration as if it were a large one; the only way to catch 
these is with --enable-checking.

* Similarly, bug fixes made after the change to the data structure 
could add accesses to fields that are not available in a particular 
declaration; again, these will only get caught by runtime checks.  
This may be a 
good thing; once folks are worrying much more about what sort of 
declaration they may reference, it may encourage more care about which 
fields are truly needed for each type of declaration.
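
The padding concern in the first point above can be illustrated with 
a small sketch.  The struct and field names here are hypothetical and 
much smaller than the real tree structures; the point is only that C 
cannot reuse the tail padding of an embedded structure, so flags 
split across the inner and outer structures cost an extra word.

```c
#include <stddef.h>

/* Flags split between the small and the large structure.  */
struct small_split { void *chain; unsigned flag_a : 4; };
struct large_split {
  struct small_split s;
  unsigned flag_b : 4;   /* starts a new storage unit after S */
  void *extra;
};

/* All flags kept together in the small structure.  */
struct small_packed { void *chain; unsigned flag_a : 4; unsigned flag_b : 4; };
struct large_packed {
  struct small_packed s;
  void *extra;
};

/* Bytes lost per large node by splitting the flags.  */
static size_t padding_cost (void)
{
  return sizeof (struct large_split) - sizeof (struct large_packed);
}
```

On a typical LP64 target the split layout is 32 bytes against 24 for 
the packed one; exact sizes depend on the ABI, but the split layout 
should be the larger of the two on common targets.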

I'm not sure there's any good way to statically identify the spots 
where constant declarations may be accessed.  Although we could use 
something like the Stanford MC checker to propagate potential types 
through function calls and if clauses, I'd guess we'd still get bitten 
by cases where we couldn't prove that a declaration could never be a 
constant declaration.  So far, tracking down the last few stragglers by 
turning on --enable-checking and passing lots of code through the 
compiler has seemed pretty quick.  I'd feel even more confident after 
building all of Apple's sources with an instrumented compiler; the only 
remaining issues would be port-specific field access patterns.
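
For anyone wanting to repeat the tracing, the instrumentation can be 
sketched roughly as follows.  This is an illustration rather than the 
code I ran: the struct is a flat, simplified stand-in for a 
declaration node, the field types are simplified, and note_ref(), 
field_refs and field_unused_p() are hypothetical names.

```c
/* Cut-down tree codes and a list of the fields being traced.  */
enum tree_code { CONST_DECL, VAR_DECL, NUM_CODES };
enum decl_field { F_NAME, F_INITIAL, F_SECTION_NAME, NUM_FIELDS };

/* Flat stand-in for a declaration node.  */
struct decl {
  enum tree_code code;
  const char *name;
  int initial;
  const char *section_name;
};

/* Reference counts, indexed by tree code and by field.  */
static unsigned long field_refs[NUM_CODES][NUM_FIELDS];

/* Record one reference to field F of D, then hand D back so the
   accessor macro can go on to read or write the field.  */
static struct decl *note_ref (struct decl *d, enum decl_field f)
{
  field_refs[d->code][f]++;
  return d;
}

/* Accessors routed through the counter.  */
#define DECL_NAME(d)         (note_ref ((d), F_NAME)->name)
#define DECL_INITIAL(d)      (note_ref ((d), F_INITIAL)->initial)
#define DECL_SECTION_NAME(d) (note_ref ((d), F_SECTION_NAME)->section_name)

/* Nonzero if no traced access touched field F on a CODE node, which
   makes F a candidate for leaving out of that node's structure.  */
static int field_unused_p (enum tree_code code, enum decl_field f)
{
  return field_refs[code][f] == 0;
}
```

After a set of compiles, the fields for which 
field_unused_p (CONST_DECL, ...) still holds are only candidates; a 
zero count shows that the traced runs never touched the field, not 
that no code path can.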

Comments?

Robert



Thread overview: 3+ messages
2004-02-28  0:21 cutting the size of CONST_DECLs Robert Bowdidge
2004-02-28  1:09 ` Dan Nicolaescu
2004-02-28  2:38   ` Robert Bowdidge
