public inbox for gcc@gcc.gnu.org
* Logging data structure field accesses
@ 2004-02-28  1:06 Robert Bowdidge
  2004-03-04 18:02 ` law
  0 siblings, 1 reply; 5+ messages in thread
From: Robert Bowdidge @ 2004-02-28  1:06 UTC (permalink / raw)
  To: gcc; +Cc: Robert Bowdidge

To figure out which fields of struct tree_decl were used by constant 
declarations, I instrumented the compiler to log every access to fields 
of the data structure; I then summarized the data in a table listing 
the number of accesses for each field for each kind of declaration.  
Folks here at Apple found the results interesting, and suggested 
sharing 'em.

To recap, I added this instrumentation because I wanted to find which 
fields were truly used by each kind of declaration -- not just what the 
header files say are used.  The number of accesses to each field also 
gives us some hints about how the fields are used, and which might 
indicate performance bottlenecks:

* If the number of accesses is small relative to the number of objects 
of that kind, then we know the field is only used intermittently, and 
might be worth hiding in a hash table to cut the size of declaration 
data structures.

* If the number of accesses is equal to the number of objects of that 
kind, then we can guess that the values may only be initialized or are 
only touched once, and thus might not actually be needed.

* If the number of accesses is a large multiple of the number of 
objects of that kind, it might indicate that we're inappropriately 
traversing every object multiple times.
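
As a rough illustration of the three rules of thumb above, here is a
minimal C sketch that sorts a field into a category from its access
count and the number of objects of that kind.  The enum, the function
names, and both thresholds are assumptions made up for this example;
they were not part of the instrumentation described here.

```c
#include <stdio.h>

/* Hypothetical categories mirroring the cases listed above. */
enum field_usage {
    USAGE_NONE,          /* no accesses, or no objects of this kind */
    USAGE_INTERMITTENT,  /* far fewer accesses than objects */
    USAGE_TOUCHED_ONCE,  /* exactly one access per object */
    USAGE_MODERATE,      /* a few accesses per object */
    USAGE_HEAVY          /* a large multiple of the object count */
};

/* Both thresholds are assumptions chosen only for illustration. */
#define INTERMITTENT_RATIO 0.25
#define HEAVY_RATIO        10.0

/* Classify one field from its access count and the object count. */
enum field_usage classify_field(unsigned long accesses, unsigned long objects)
{
    double ratio;

    if (accesses == 0 || objects == 0)
        return USAGE_NONE;
    if (accesses == objects)
        return USAGE_TOUCHED_ONCE;

    ratio = (double)accesses / (double)objects;
    if (ratio < INTERMITTENT_RATIO)
        return USAGE_INTERMITTENT;
    if (ratio >= HEAVY_RATIO)
        return USAGE_HEAVY;
    return USAGE_MODERATE;
}

/* Human-readable hint for each category. */
const char *usage_hint(enum field_usage usage)
{
    switch (usage) {
    case USAGE_NONE:         return "unused";
    case USAGE_INTERMITTENT: return "candidate for a side hash table";
    case USAGE_TOUCHED_ONCE: return "possibly only initialized";
    case USAGE_MODERATE:     return "ordinary use";
    case USAGE_HEAVY:        return "possibly traversed repeatedly";
    }
    return "unknown";
}
```

Calling classify_field with the count for one field and the count for
the uid field of the same kind of declaration gives a first-pass label;
the label is only a hint and still needs a look at the code doing the
accesses.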


Adding the logging was pretty quick -- only an hour or two of typing 
and debugging -- on a 3.4 compiler.   I hooked into the DECL_CHECK 
routine, as it's called for every field access, adding a second 
parameter naming the field being accessed in the code immediately after 
the check:

#define DECL_BUILT_IN_CLASS(NODE) \
    (FUNCTION_DECL_CHECK (DECL_CHECK (NODE, "built_in_class"))->decl.built_in_class)

DECL_CHECK passed the field name (or NULL) into TREE_CLASS_CHECK; I 
added additional code to TREE_CLASS_CHECK so if the field name wasn't 
NULL, the routine would print the tree code (kind of object), tree code 
class (data structure used), field name, source file, and source line 
to standard error.  I then used grep and awk to grab all the accesses 
for a given kind of declaration, and sort and count accesses for each 
field name.
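
The idea can be sketched outside of GCC with a toy node type.  This is
not the actual patch: struct toy_decl, log_field_access, the TOY_*
macros, and the record layout are all hypothetical, and count_accesses
merely stands in for the grep and awk step.

```c
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a declaration node; not GCC's struct tree_decl. */
struct toy_decl {
    int code;            /* kind of declaration */
    int uid;
    const char *name;
};

/* Destination for log records; NULL means standard error. */
FILE *access_log = NULL;

/* Write one record (code, field, file:line) and return the node
   unchanged, so the call can wrap a field access.  A NULL field
   name means "do not log". */
struct toy_decl *log_field_access(struct toy_decl *node, const char *field,
                                  const char *file, int line)
{
    if (field != NULL)
        fprintf(access_log ? access_log : stderr,
                "%d %s %s:%d\n", node->code, field, file, line);
    return node;
}

#define TOY_DECL_CHECK(NODE, FIELD) \
    log_field_access((NODE), (FIELD), __FILE__, __LINE__)

/* Accessors name the field they touch, as in the macro above. */
#define TOY_DECL_UID(NODE)  (TOY_DECL_CHECK(NODE, "uid")->uid)
#define TOY_DECL_NAME(NODE) (TOY_DECL_CHECK(NODE, "name")->name)

/* Count the records in a seekable log that match one declaration
   kind and one field name. */
unsigned long count_accesses(FILE *log, int code, const char *field)
{
    char line[512], name[64];
    int rec_code;
    unsigned long n = 0;

    rewind(log);
    while (fgets(line, sizeof line, log) != NULL)
        if (sscanf(line, "%d %63s", &rec_code, name) == 2
            && rec_code == code && strcmp(name, field) == 0)
            n++;
    return n;
}
```

Pointing access_log at a temporary file instead of standard error lets
the same program write the records and then count them; with standard
error the counting would be done by a separate tool, as described
above.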

One problem I didn't bother to address is that the logging records also
spew out when the instrumented compiler is used to build portions of the
compiler at the end of the build; for now, I just make sure to send
output to /dev/null when building the instrumented compiler.  I tried
adding similar instrumentation to a 3.3 compiler, but there the node
checks before the field accesses were often done with
FUNCTION_DECL_CHECK and other macros created by gencheck; these go
through TREE_CHECK rather than TREE_CLASS_CHECK, which made the changes
a bit more involved.


Here are two examples of the produced data.  Each represents the fields
of declarations accessed when compiling a program that only includes
the <Carbon/Carbon.h> header file (which brings in about 100K lines of
header files).  The first web page shows the results when compiled
with gcc, the second with g++.  (I'd done these measurements to
understand why the C compiler only took 1 second to compile the
headers, but the C++ compiler took two seconds.)  The count of accesses
to the uid field indicates the number of declarations of each type, as
the field is only accessed during creation of the declaration node.

http://home.earthlink.net/~bowdidge/c-carbon.html
http://home.earthlink.net/~bowdidge/cpp-carbon.html


Robert


* Re: Logging data structure field accesses
  2004-02-28  1:06 Logging data structure field accesses Robert Bowdidge
@ 2004-03-04 18:02 ` law
  2004-03-04 18:07   ` Andrew Pinski
  0 siblings, 1 reply; 5+ messages in thread
From: law @ 2004-03-04 18:02 UTC (permalink / raw)
  To: Robert Bowdidge; +Cc: gcc

In message <B6944002-6980-11D8-B63F-000A957D89DA@apple.com>,
Robert Bowdidge writes:
 >Here's two examples of the produced data.  Each represents the fields 
 >of declarations accessed when compiling a program that only includes 
 >the <Carbon/Carbon.h> header file (which brings in about 100K lines of 
 >header files.)  The first web page shows the results when  compiled 
 >with gcc, the second with g++.  (I'd done these measurements to 
 >understand why the C compiler only took 1 second to compile the 
 >headers, but the C++ compiler took two seconds.)  The count of accesses 
 >to the uid field indicates the number of declarations of each type, as 
 >the field is only accessed during creation of the declaration node.
 >
 >http://home.earthlink.net/~bowdidge/c-carbon.html
 >http://home.earthlink.net/~bowdidge/cpp-carbon.html
Unfortunately the page renders horribly in Mozilla....  I'd be interested in
the data, but with the poor rendering it's impossible to extract any data
from your pages.

jeff



* Re: Logging data structure field accesses
  2004-03-04 18:02 ` law
@ 2004-03-04 18:07   ` Andrew Pinski
  2004-03-04 18:27     ` Joe Buck
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Pinski @ 2004-03-04 18:07 UTC (permalink / raw)
  To: law; +Cc: gcc, Robert Bowdidge, Andrew Pinski

On Mar 4, 2004, at 10:02, law@redhat.com wrote:

> In message <B6944002-6980-11D8-B63F-000A957D89DA@apple.com>,
> Robert Bowdidge writes:
>> Here's two examples of the produced data.  Each represents the fields
>> of declarations accessed when compiling a program that only includes
>> the <Carbon/Carbon.h> header file (which brings in about 100K lines of
>> header files.)  The first web page shows the results when  compiled
>> with gcc, the second with g++.  (I'd done these measurements to
>> understand why the C compiler only took 1 second to compile the
>> headers, but the C++ compiler took two seconds.)  The count of accesses
>> to the uid field indicates the number of declarations of each type, as
>> the field is only accessed during creation of the declaration node.
>>
>> http://home.earthlink.net/~bowdidge/c-carbon.html
>> http://home.earthlink.net/~bowdidge/cpp-carbon.html
> Unfortunately the page renders horribly in Mozilla....  I'd be
> interested in the data, but with the poor rendering it's impossible
> to extract any data from your pages.

It's not that surprising that it renders poorly on anything except IE;
then again, it was made using Excel, so what do you expect.

-Pinski


* Re: Logging data structure field accesses
  2004-03-04 18:07   ` Andrew Pinski
@ 2004-03-04 18:27     ` Joe Buck
  2004-03-04 23:52       ` Robert Bowdidge
  0 siblings, 1 reply; 5+ messages in thread
From: Joe Buck @ 2004-03-04 18:27 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: law, gcc, Robert Bowdidge


On Mar 4, 2004, at 10:02, law@redhat.com wrote:
> > Unfortunately the page renders horribly in Mozilla....  I'd be
> > interested in the data, but with the poor rendering it's impossible
> > to extract any data from your pages.

On Thu, Mar 04, 2004 at 10:07:02AM -0800, Andrew Pinski wrote:
> What is not that interesting it renders poorly on anything except for
> IE but then again it was made using Excel so what do you expect.

You could make the Excel files available; Gnumeric and OpenOffice have
no trouble reading them.


* Re: Logging data structure field accesses
  2004-03-04 18:27     ` Joe Buck
@ 2004-03-04 23:52       ` Robert Bowdidge
  0 siblings, 0 replies; 5+ messages in thread
From: Robert Bowdidge @ 2004-03-04 23:52 UTC (permalink / raw)
  To: gcc; +Cc: Robert Bowdidge

Much thanks to Mike Capp for making the tables more readable.  Next 
time, I'll do the tables by hand.

> Here's two examples of the produced data.  Each represents the fields
> of declarations accessed when compiling a program that only includes
> the <Carbon/Carbon.h> header file (which brings in about 100K lines of
> header files.)  The first web page shows the results when  compiled
> with gcc, the second with g++.  (I'd done these measurements to
> understand why the C compiler only took 1 second to compile the
> headers, but the C++ compiler took two seconds.)  The count of accesses
> to the uid field indicates the number of declarations of each type, as
> the field is only accessed during creation of the declaration node.

BTW, I found the field access information most useful for creating 
smaller versions of the declaration data structure.  The information 
was less useful when trying to identify the differences between the 
compilers.  Just because the number of accesses to a particular field 
was high didn't imply that accessing that field was inefficient.  
However, looking at where the access patterns were different did 
highlight what code was expensive.
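
One way to look for such differences is to compare the per-field
counts from the two runs and pick out the field whose count grew the
most.  The sketch below assumes the counts have already been loaded
into small tables; struct field_count, lookup_count, and
largest_increase are hypothetical names for this example, not part of
the instrumentation.

```c
#include <stddef.h>
#include <string.h>

/* One row of a per-field summary for a single compiler run. */
struct field_count {
    const char *field;
    unsigned long accesses;
};

/* Return the count recorded for a field, or 0 if it is absent. */
unsigned long lookup_count(const struct field_count *table, size_t n,
                           const char *field)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(table[i].field, field) == 0)
            return table[i].accesses;
    return 0;
}

/* Return the field in `other` whose count is the largest multiple of
   its count in `base`.  A field missing from `base` is treated as
   having a count of 1 to avoid dividing by zero.  Returns NULL when
   `other` is empty. */
const char *largest_increase(const struct field_count *base, size_t nbase,
                             const struct field_count *other, size_t nother)
{
    const char *best = NULL;
    double best_ratio = 0.0;
    size_t i;

    for (i = 0; i < nother; i++) {
        unsigned long b = lookup_count(base, nbase, other[i].field);
        double ratio = (double)other[i].accesses / (double)(b ? b : 1);

        if (best == NULL || ratio > best_ratio) {
            best = other[i].field;
            best_ratio = ratio;
        }
    }
    return best;
}
```

The result only says where to look; a large ratio does not by itself
mean the accesses are expensive, so it is a starting point for
profiling rather than a conclusion.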

The biggest difference between the two compilers was in how the name
field was accessed for function, type, and field declarations -- the
name field gets accessed hundreds of thousands of times in the C++
compiler compared with tens of thousands of times in the C compiler.
Most of the unusual field accesses were in functions called from
finish_struct_1; profiling and instrumentation showed that about half
of the time difference between the C and C++ compilers (0.35 - 0.45
seconds) was in this code.  I'd first thought that the compiler was
doing too many separate passes over all the elements of each structure,
but profiling showed that most of that time (0.16 - 0.25 seconds) was
spent creating the implicit member functions needed for each struct.
The headers define 1200 structures, all intended to be used as C
structures rather than classes.  For each, three function declarations
(constructor, copy constructor, and assignment operator) get defined.
Work in grokdeclarator and below is responsible for the majority of
that time.

There are checks in grokdeclarator that are probably irrelevant for the 
implicit functions, or that could be done by examination of the 
class/structure rather than each function. For example, a non-trivial 
amount of time (about 8% of the cost of finish_struct_1) was being 
spent in no_linkage_check deciding whether any of the implicit 
functions referenced anonymous types that couldn't be exported.  
grokfield also looks at names trying to decide if any of the implicit 
function names are "_vptr", and thus will conflict with the virtual 
function table's field name.

I'm not sure we'd want to have separate paths for creating regular 
declarations and those for the implicit functions, but it was 
interesting to see where the time's going.

Robert


