From: Ken Raeburn
To: Colin McCormack
Cc: egcs@cygnus.com, Tristan Gingold
Subject: Re: Code Generation
Date: Sun, 31 Jan 1999 23:58:00 -0000
References: <368D6868.5F1AF98D@field.medicine.adelaide.edu.au>
X-SW-Source: 1999-01n/msg00025.html

Colin McCormack writes:

> There's a program called Checker
> (http://sunsite.unc.edu/pub/Linux/devel/c) which acts like Purify, to
> police references to external storage.  It relies, I believe, upon
> intercepting each such reference, and each modification, to ensure
> that only legal targets may be referenced.
>
> I think it would be a good thing if all such references could be
> (optionally) emitted by the code generator as macros which an
> assembler could expand to support a program like Checker.

Um, why?  If it's done in the assembler, the assembler has to insert
calls or memory references while effectively not changing any register
values the compiler may be depending on.  What's wrong with doing it in
the compiler, as it is now?

> I think it would be a good thing if all basic blocks were bracketed
> by a pair of assembler macros which an assembler could expand into
> calls to read a high-resolution timer and accumulate accurate timings
> of basic block execution times.  Perhaps gcov already does this?

Again with the assembler macros... :-)

In profile.c, output_arc_profiler generates code for incrementing an
execution counter.  You could change that to read a cycle counter or
other fine-grained clock and compute deltas, accumulating totals for
each block.  Or emit a call to some subroutine to do the same thing.

Remember, though: while in some ways this will give more precise data
than statistical sampling, the profiling code itself distorts the
timing data it collects.  The time spent in functions with lots of
small basic blocks will be inflated by a larger factor than the time
spent in functions with a few large basic blocks, even if the original
execution times of the two would have been the same.
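Here's a minimal sketch of what such a probe pair might look like, in
C++ for an x86 target.  The names __bb_enter and __bb_exit, the
per-block numbering, and the fixed-size table are all invented for
illustration -- this is not what profile.c actually emits.

    #include <cstdint>

    // Read the processor's cycle counter (x86 RDTSC leaves the
    // 64-bit count in EDX:EAX).
    static inline uint64_t read_cycles()
    {
        uint32_t lo, hi;
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return (static_cast<uint64_t>(hi) << 32) | lo;
    }

    const int MAX_BLOCKS = 4096;              // assumed table size
    static uint64_t block_total[MAX_BLOCKS];  // accumulated cycles
    static uint64_t block_start[MAX_BLOCKS];  // timestamp at entry

    // The compiler would emit a call to __bb_enter at the head of
    // each basic block and to __bb_exit at its tail, passing an
    // index assigned to the block at compile time.
    extern "C" void __bb_enter(int block)
    {
        block_start[block] = read_cycles();
    }

    extern "C" void __bb_exit(int block)
    {
        block_total[block] += read_cycles() - block_start[block];
    }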
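To put rough numbers on the distortion: suppose each probe pair costs
60 cycles.  A function made of a hundred 10-cycle blocks (1000 cycles
of real work) measures at about 7000 cycles -- inflated sevenfold --
while a function doing the same work in one 1000-cycle block measures
at about 1060, only 6% high.  (The 60-cycle figure is made up; the
ratio is what matters.)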
> In ColdStore (http://field.medicine.adelaide.edu.au/coldstore) we
> have a need to ensure that all constructors call a bit of code to
> change their _vptr.
>
> I think it would be useful to us if all constructors were bracketed
> by a pair of assembler macros indicating the semantic role of the
> call.

So add a -fcoldstore option that alters the constructor behavior.  For
anything non-trivial, emitting a function call is probably the best
way to avoid requiring the compiler to know too much about your
system.

> I would like your comments/thoughts on these proposals, as my past
> experience has been that compiler writers are loath to have their
> code generators spit out macros.

I'm not surprised.  Having worked on both gcc and gas, I agree.  The
compiler expends rather a lot of effort trying to get good
performance, and many of its aspects (scheduling and register
allocation, for example) depend on knowing *exactly* what's going on.
Generating assembler macros whose definitions may vary means the
compiler has less information available to it.  And the macros have to
be written so as not to break any of those assumptions, or the
assumptions have to be adjusted.  For example, the compiler may think
it knows what's in every machine register before and after a memory
access.  If the access has to be done with a macro that *might*
clobber register X, then register X cannot be considered available
across that memory reference, even if you don't happen to use a
version of the macro that actually touches it.

The MIPS back end has to do this sometimes, because of the way older
MIPS assemblers work.  It could undoubtedly generate better code if it
could dictate the literal pool references and relocations directly, as
well as the exact instructions and their order.

I can see only two possible advantages to doing it with assembler
macros: (1) you can process the assembly code through a filter, which
complicates the compilation process quite a bit, or (2) hacking the
assembler is easier than hacking the compiler, but only because the
assembler is a simpler program -- in part because it *doesn't* have
any of this sort of stuff in it.  What is your reason for wanting to
do it here?

Maybe we need an easier way to let the user do generic instrumentation
at certain points of the code generation, expressed in RTL and fully
optimized with the rest of the code....

Per Bothner writes:

> The Checker implementation does not, I believe, violate any patents.
> However, I don't think it is the right approach.  It uses "fat
> pointers" (i.e. each pointer is represented by a two- or three-word
> set of pointers).  This breaks binary compatibility, which makes it
> very inconvenient.
>
> A better approach uses "thin" (normal) pointers in function
> parameters, results, globals, and structure fields, but uses "fat"
> pointers *internally* in a function.  Thin pointers are coerced to
> fat pointers as needed by looking them up in a run-time table.

I think you're remembering this backwards from the last time we
discussed it internally at Cygnus.  The Checker code -- at least, the
last time I looked at it -- does not use fat pointers in any form in
the compiler.  It does as Colin said: each "normal" memory reference
is changed so that it is preceded by a check of and/or update to the
access "permissions" for the storage location in question.

Conceptually, there is a bitmap representing access permissions for
storage locations.  The granularity, and indeed the representation
used in the support code, are not dictated by the implementation in
the compiler, but the function names were chosen based on some
assumptions.  Currently this instrumentation is implemented as
ordinary function calls, though I think for good performance that
should eventually change.  More important, IMHO, and more difficult,
is teaching the compiler to optimize away or combine some of these
calls for multiple accesses to the same location.

Both of the fat-pointer versions did come up in our discussions,
though.  Depending on the details, they may have the disadvantage of
not detecting allocated-but-not-initialized locations.

The Checker package (as opposed to the -fcheck-memory-usage option)
also uses some tricks to rename functions, so that instrumented
functions call instrumented functions, and link-time errors result if
a function (1) isn't compiled with instrumentation and (2) doesn't
have a wrapper function supplied.  I haven't played with that aspect
much, so I can't really address it.

Ken