From: Robert Bowdidge <bowdidge@apple.com>
To: gcc@gcc.gnu.org
Cc: Robert Bowdidge <bowdidge@apple.com>
Subject: Logging data structure field accesses
Date: Sat, 28 Feb 2004 01:06:00 -0000
Message-ID: <B6944002-6980-11D8-B63F-000A957D89DA@apple.com>

To figure out which fields of struct tree_decl were used by constant 
declarations, I instrumented the compiler to log every access to fields 
of the data structure; I then summarized the data in a table listing 
the number of accesses for each field for each kind of declaration.  
Folks here at Apple found the results interesting, and suggested 
sharing 'em.

To recap, I added this instrumentation because I wanted to find which 
fields were truly used by each kind of declaration -- not just what the 
header files say are used.  The number of accesses to each field also 
gives us some hints about how the fields are used, and which might 
indicate performance bottlenecks:

* If the number of accesses is small relative to the number of objects 
of that kind, then we know the field is only used intermittently, and 
might be worth hiding in a hash table to cut the size of declaration 
data structures.

* If the number of accesses is equal to the number of objects of that 
kind, then we can guess that the values may only be initialized or are 
only touched once, and thus might not actually be needed.

* If the number of accesses is a large multiple of the number of 
objects of that kind, it might indicate that we're inappropriately 
traversing every object multiple times.
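As a rough illustration of the three heuristics above, here is a minimal 
sketch that classifies a field from its access count and the number of 
objects of that kind.  The names (classify_field, FIELD_RARE, and so on) 
and the 0.1 and 10.0 thresholds are assumptions made up for this 
example; they are not part of the instrumentation or of GCC.

```c
#include <stdio.h>

/* Hypothetical categories mirroring the three heuristics. */
enum field_usage
{
  FIELD_RARE,          /* far fewer accesses than objects */
  FIELD_TOUCHED_ONCE,  /* exactly one access per object */
  FIELD_HOT,           /* many accesses per object */
  FIELD_UNREMARKABLE   /* none of the above */
};

/* Illustrative thresholds only; pick values that suit the data. */
#define RARE_RATIO 0.1
#define HOT_RATIO 10.0

/* Classify a field given how often it was accessed and how many
   objects of that kind exist.  */
enum field_usage
classify_field (unsigned long accesses, unsigned long objects)
{
  double ratio;

  if (objects == 0)
    return FIELD_UNREMARKABLE;
  if (accesses == objects)
    return FIELD_TOUCHED_ONCE;

  ratio = (double) accesses / (double) objects;
  if (ratio < RARE_RATIO)
    return FIELD_RARE;
  if (ratio >= HOT_RATIO)
    return FIELD_HOT;
  return FIELD_UNREMARKABLE;
}

/* Human-readable label for a category, for use in a summary table.  */
const char *
field_usage_name (enum field_usage usage)
{
  switch (usage)
    {
    case FIELD_RARE:
      return "rare: candidate for a side hash table";
    case FIELD_TOUCHED_ONCE:
      return "touched once: possibly not needed";
    case FIELD_HOT:
      return "hot: possibly traversed repeatedly";
    default:
      return "unremarkable";
    }
}
```

For example, classify_field (5, 1000) yields FIELD_RARE and 
classify_field (50000, 1000) yields FIELD_HOT.  A real report would 
still want a human to look at the source locations behind each count.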


Adding the logging was pretty quick -- only an hour or two of typing 
and debugging -- on a 3.4 compiler.   I hooked into the DECL_CHECK 
routine, as it's called for every field access, adding a second 
parameter naming the field being accessed in the code immediately after 
the check:

#define DECL_BUILT_IN_CLASS(NODE) \
    (FUNCTION_DECL_CHECK (DECL_CHECK (NODE, "built_in_class"))->decl.built_in_class)

DECL_CHECK passed the field name (or NULL) into TREE_CLASS_CHECK; I 
added additional code to TREE_CLASS_CHECK so if the field name wasn't 
NULL, the routine would print the tree code (kind of object), tree code 
class (data structure used), field name, source file, and source line 
to standard error.  I then used grep and awk to grab all the accesses 
for a given kind of declaration, and sort and count accesses for each 
field name.
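The logging scheme described above can be sketched in a standalone toy 
form.  This is not the actual GCC code: struct toy_node, 
toy_class_check, format_access, and the TOY_* macros are hypothetical 
stand-ins, and the record layout (code, class, field, file:line) is 
only an assumption about what such a log line might look like.

```c
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for a tree node; not GCC's real layout.  */
struct toy_node
{
  const char *code_name;  /* kind of object, e.g. "function_decl" */
  char code_class;        /* data structure used, e.g. 'd' */
  int built_in_class;     /* an example field */
};

/* Format one log record into BUF; returns what snprintf returns.  */
int
format_access (char *buf, size_t size, const struct toy_node *node,
               const char *field, const char *file, int line)
{
  return snprintf (buf, size, "%s %c %s %s:%d\n",
                   node->code_name, node->code_class, field, file, line);
}

/* The check routine: when FIELD is non-NULL, log the access to
   standard error.  Always hands the node back so that the macro
   can go on to read the field.  */
const struct toy_node *
toy_class_check (const struct toy_node *node, const char *field,
                 const char *file, int line)
{
  if (field != NULL)
    {
      char record[256];
      format_access (record, sizeof record, node, field, file, line);
      fputs (record, stderr);
    }
  return node;
}

/* The second parameter names the field accessed right after the check.  */
#define TOY_DECL_CHECK(NODE, FIELD) \
  toy_class_check ((NODE), (FIELD), __FILE__, __LINE__)

#define TOY_BUILT_IN_CLASS(NODE) \
  (TOY_DECL_CHECK (NODE, "built_in_class")->built_in_class)
```

With a node initialized as { "function_decl", 'd', 2 }, evaluating 
TOY_BUILT_IN_CLASS (&node) returns 2 and writes one record to standard 
error, while TOY_DECL_CHECK (&node, NULL) writes nothing.  Records in 
this shape are easy to filter by kind and count per field with the 
usual text tools.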

One problem I didn't bother to address is that logging records get 
spewed when the newly built compiler is used to build portions of the 
compiler at the end of the build; for now, I just make sure to send 
output to /dev/null when building the instrumented compiler.  I tried 
adding similar instrumentation to a 3.3 compiler, but there the node 
checks before the field accesses were often done with 
FUNCTION_DECL_CHECK and other macros created by gencheck; these go 
through TREE_CHECK rather than TREE_CLASS_CHECK, which made the changes 
a bit more involved.


Here are two examples of the resulting data.  Each shows the fields of 
declarations accessed when compiling a program that includes only the 
<Carbon/Carbon.h> header file (which brings in about 100K lines of 
header files).  The first web page shows the results when compiled 
with gcc, the second with g++.  (I'd done these measurements to 
understand why the C compiler took only 1 second to compile the 
headers, but the C++ compiler took 2 seconds.)  The count of accesses 
to the uid field indicates the number of declarations of each kind, as 
the field is only accessed during creation of the declaration node.

http://home.earthlink.net/~bowdidge/c-carbon.html
http://home.earthlink.net/~bowdidge/cpp-carbon.html


Robert
