From: Robert Bowdidge <bowdidge@apple.com>
To: gcc@gcc.gnu.org
Cc: Robert Bowdidge <bowdidge@apple.com>
Subject: Logging data structure field accesses
Date: Sat, 28 Feb 2004 01:06:00 -0000
Message-ID: <B6944002-6980-11D8-B63F-000A957D89DA@apple.com>
To figure out which fields of struct tree_decl were used by constant
declarations, I instrumented the compiler to log every access to fields
of the data structure; I then summarized the data in a table listing
the number of accesses for each field for each kind of declaration.
Folks here at Apple found the results interesting, and suggested
sharing 'em.
To recap, I added this instrumentation because I wanted to find which
fields were truly used by each kind of declaration -- not just what the
header files say are used. The number of accesses to each field also
gives us some hints about how the fields are used, and which fields
might point to performance bottlenecks:
* If the number of accesses is small relative to the number of objects
of that kind, then we know the field is only used intermittently, and
might be worth hiding in a hash table to cut the size of declaration
data structures.
* If the number of accesses is equal to the number of objects of that
kind, then we can guess that the values may only be initialized or are
only touched once, and thus might not actually be needed.
* If the number of accesses is a large multiple of the number of
objects of that kind, it might indicate that we're inappropriately
traversing every object multiple times.
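The three heuristics above can be sketched as a small classifier. This is only an illustration: the enum labels, the function name, and both cutoffs (a tenth of the object count for "small", ten accesses per object for "a large multiple") are assumptions, not anything taken from the compiler or from the measurements.

```c
/* Hypothetical labels for the three heuristics; not GCC names.  */
enum field_usage
{
  FIELD_UNUSED,        /* never accessed */
  FIELD_RARE,          /* few accesses per object: hash-table candidate */
  FIELD_TOUCHED_ONCE,  /* one access per object: maybe only initialized */
  FIELD_NORMAL,        /* nothing stands out */
  FIELD_HOT            /* many accesses per object: repeated traversal? */
};

/* Assumed cutoffs, chosen only for illustration.  */
#define RARE_DIVISOR 10UL  /* "small" = under a tenth of the objects */
#define HOT_MULTIPLE 10UL  /* "large multiple" = ten or more per object */

/* Classify one field from its access count and the number of objects
   of that kind.  */
static enum field_usage
classify_field (unsigned long accesses, unsigned long objects)
{
  if (accesses == 0)
    return FIELD_UNUSED;
  if (objects == 0)
    return FIELD_NORMAL;          /* no baseline to compare against */
  if (accesses == objects)
    return FIELD_TOUCHED_ONCE;
  if (accesses < objects / RARE_DIVISOR)
    return FIELD_RARE;
  if (accesses / objects >= HOT_MULTIPLE)
    return FIELD_HOT;
  return FIELD_NORMAL;
}
```

With 1000 objects, for instance, 5 accesses would be flagged as rare and 50000 as hot under these assumed cutoffs; real thresholds would need tuning against actual data.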
Adding the logging was pretty quick -- only an hour or two of typing
and debugging -- on a 3.4 compiler. I hooked into the DECL_CHECK
routine, as it's called for every field access, adding a second
parameter naming the field being accessed in the code immediately after
the check:
#define DECL_BUILT_IN_CLASS(NODE) \
  (FUNCTION_DECL_CHECK (DECL_CHECK (NODE, "built_in_class"))->decl.built_in_class)
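A self-contained toy version of this kind of hook might look like the sketch below. It does not use GCC's real tree types or macros: struct toy_node, toy_check, TOY_DECL_CHECK, and the record layout are all hypothetical stand-ins, and the check is written as a plain function that writes one record per access to standard error whenever the field name is non-NULL.

```c
#include <stdio.h>

/* Toy stand-ins for GCC's tree node; every name here is hypothetical.  */
enum toy_code { TOY_FUNCTION_DECL, TOY_VAR_DECL };

struct toy_node
{
  enum toy_code code;
  int built_in_class;
};

static const char *const toy_code_name[] = { "function_decl", "var_decl" };

/* Number of records emitted so far.  */
static unsigned long toy_access_count;

/* Format one record: tree code, tree code class, field name, source
   file, and source line.  The class is fixed at 'd' in this toy.  */
static int
toy_format_record (char *buf, size_t size, const struct toy_node *node,
                   const char *field, const char *file, int line)
{
  return snprintf (buf, size, "%s d %s %s:%d",
                   toy_code_name[node->code], field, file, line);
}

/* Check hook: log only when FIELD is non-NULL, then hand the node back
   so the caller can dereference it.  */
static struct toy_node *
toy_check (struct toy_node *node, const char *field,
           const char *file, int line)
{
  if (field != NULL)
    {
      char buf[256];
      toy_format_record (buf, sizeof buf, node, field, file, line);
      fprintf (stderr, "%s\n", buf);
      toy_access_count++;
    }
  return node;
}

/* The accessor names the field it is about to touch.  */
#define TOY_DECL_CHECK(NODE, FIELD) \
  toy_check ((NODE), (FIELD), __FILE__, __LINE__)
#define TOY_BUILT_IN_CLASS(NODE) \
  (TOY_DECL_CHECK (NODE, "built_in_class")->built_in_class)
```

Because the check returns the node, TOY_BUILT_IN_CLASS (&n) still works as both an rvalue and an lvalue, so existing call sites would not need to change.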
DECL_CHECK passed the field name (or NULL) into TREE_CLASS_CHECK; I
added additional code to TREE_CLASS_CHECK so if the field name wasn't
NULL, the routine would print the tree code (kind of object), tree code
class (data structure used), field name, source file, and source line
to standard error. I then used grep and awk to grab all the accesses
for a given kind of declaration, and sort and count accesses for each
field name.
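The grep-and-awk summarizing step could equally be sketched in C. The version below is a swapped-in illustration, not the scripts actually used: it assumes each record has the layout "<tree code> <class> <field> <file>:<line>", and the struct and function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define MAX_FIELDS 64
#define MAX_NAME 48

struct field_count
{
  char name[MAX_NAME];
  unsigned long count;
};

struct tally
{
  struct field_count fields[MAX_FIELDS];
  size_t used;
};

/* Count one access record if it belongs to declaration kind KIND.
   Records that do not match the assumed layout are ignored.  */
static void
tally_record (struct tally *t, const char *record, const char *kind)
{
  char code[MAX_NAME], cls[8], field[MAX_NAME];
  size_t i;

  if (sscanf (record, "%47s %7s %47s", code, cls, field) != 3)
    return;
  if (strcmp (code, kind) != 0)
    return;
  for (i = 0; i < t->used; i++)
    if (strcmp (t->fields[i].name, field) == 0)
      {
        t->fields[i].count++;
        return;
      }
  if (t->used < MAX_FIELDS)
    {
      strcpy (t->fields[t->used].name, field);
      t->fields[t->used].count = 1;
      t->used++;
    }
}

/* Return the number of accesses seen for FIELD, or 0 if none.  */
static unsigned long
tally_lookup (const struct tally *t, const char *field)
{
  size_t i;

  for (i = 0; i < t->used; i++)
    if (strcmp (t->fields[i].name, field) == 0)
      return t->fields[i].count;
  return 0;
}
```

A driver would zero a struct tally, feed it each line of the log with fgets, and then print the fields array for the kind of declaration of interest.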
One problem I didn't bother to address was the logging records spewed
when the current compiler is used to build portions of the compiler at
the end of the build; for now, I just make sure to send output to
/dev/null when building the instrumented compiler. I tried adding
similar instrumentation to a 3.3 compiler, but the node checks before
the field accesses were often done with FUNCTION_DECL_CHECK and other
macros created by gencheck; these don't go through TREE_CLASS_CHECK
but through TREE_CHECK, which made the changes a bit more involved.
Here are two examples of the data produced. Each represents the fields
of declarations accessed when compiling a program that only includes
the <Carbon/Carbon.h> header file (which brings in about 100K lines of
header files). The first web page shows the results when compiled
with gcc, the second with g++. (I'd done these measurements to
understand why the C compiler only took one second to compile the
headers, but the C++ compiler took two seconds.) The count of accesses
to the uid field indicates the number of declarations of each type, as
the field is only accessed during creation of the declaration node.
http://home.earthlink.net/~bowdidge/c-carbon.html
http://home.earthlink.net/~bowdidge/cpp-carbon.html
Robert