public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* an experiment with garbage collection
@ 1998-05-09  3:34 Richard Henderson
  1998-05-09 15:57 ` Francois-Rene Rideau
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Richard Henderson @ 1998-05-09  3:34 UTC (permalink / raw)
  To: egcs

Having been annoyed by obstacks one too many times, I decided to
try and do something about it.  There now exists a branch of EGCS
in the CVS tree tagged `egcs_gc_branch'.  On it, you will find a
fledgeling attempt to replace all of the stupid obstack nonsense
with an actual garbage collector.

There are several design constraints that ought to be mentioned.  

  (1) We will only garbage collect rtl and trees.  Everything else 
      will eventually be converted to use plain malloc/free.

  (2) The collector will be given all of the roots that need to stay
      live across calls to the collector.  If the collector is to 
      invoke GC implicitly on a failed allocation request, it must 
      be able to find all roots not given (i.e. a conservative gc).

  (3) The default collector must be strictly conforming -- or at least
      as much as GCC currently is with its allocation and casting.
      I.e. the default collector must run on every system gcc is 
      currently supported on.

  (4) The interface from gcc to the collector must be generic enough
      to drop in different gc implementations.

Current status of the work is that rtl and tree allocation now
happens with the collector, while the rest of the obstack usage is
mostly unchanged.  It is good enough to compile enquire.c with `-O0'
and `-g -O4'.  Time spent in GC seems to be about the same as sched,
though this is with a sample-set of 1.

Note that anything but the C front end will not even build at the moment.

Anyway, I'll keep working on this off-and-on as I find time.  Check
it out and let me know what you think.


r~

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: an experiment with garbage collection
  1998-05-09 15:57 ` Francois-Rene Rideau
@ 1998-05-09 11:03   ` Richard Henderson
  0 siblings, 0 replies; 11+ messages in thread
From: Richard Henderson @ 1998-05-09 11:03 UTC (permalink / raw)
  To: Francois-Rene Rideau; +Cc: Richard Henderson, egcs

On Sat, May 09, 1998 at 07:42:13PM +0200, Francois-Rene Rideau wrote:
> Why not use the Boehm-Weiser conservative collector?
> It might work as a drop-in (on supported platforms),
> and provide state-of-the-art conservative GC to GCC...

I fully intend to provide a module that uses Boehm's collector.
However, that collector does not work on every host GCC runs on,
and anything less is unacceptable as a general solution.


r~

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: an experiment with garbage collection
  1998-05-09  3:34 an experiment with garbage collection Richard Henderson
@ 1998-05-09 15:57 ` Francois-Rene Rideau
  1998-05-09 11:03   ` Richard Henderson
  1998-05-09 15:57 ` Joern Rennecke
  1998-05-09 16:35 ` John Carr
  2 siblings, 1 reply; 11+ messages in thread
From: Francois-Rene Rideau @ 1998-05-09 15:57 UTC (permalink / raw)
  To: Richard Henderson; +Cc: egcs

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 1695 bytes --]

Richard Henderson:
   Having been annoyed by obstacks one too many times, I decided to
   try and do something about it.  There now exists a branch of EGCS
   in the CVS tree tagged `egcs_gc_branch'.  On it, you will find a
   fledgeling attempt to replace all of the stupid obstack nonsense
   with an actual garbage collector.

Why not use the Boehm-Weiser conservative collector?
It might work as a drop-in (on supported platforms),
and provide state-of-the-art conservative GC to GCC...

http://reality.sgi.com/employees/boehm_mti/gc.html

As for obstacks, there's a robust "cord" package written by Boehm that provides
similar functionality (and more), and interacts nicely with the GC.

The advantage of using it rather than writing your own GC,
is that it's already fairly debugged, supports a wide range of platforms,
and offers debugging facilities to track down memory allocation problems.
It can work almost as an malloc drop-in, and is multi-threading compliant
on many platforms (not linuxthreads yet, though Fergus Henderson
of Mercury fame is working on it).

It's free software, though not GPL (kind of BSD license).
It could easily be bundled together with GCC.

## Faré | VN: þ£ng-Vû Bân   | Join the TUNES project!  http://www.tunes.org/ ##
## FR: François-René Rideau |    TUNES is a Useful, Not Expedient System     ##
## Reflection&Cybernethics  | Project for a Free Reflective Computing System ##
As far as natural selection applies to the human world, we don't ever get to
"let nature decide", because we ARE part of that nature that decides. Hence,
any claim to "let the nature decide" is just a fallacy to promote one point
of view against others.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: an experiment with garbage collection
  1998-05-09  3:34 an experiment with garbage collection Richard Henderson
  1998-05-09 15:57 ` Francois-Rene Rideau
@ 1998-05-09 15:57 ` Joern Rennecke
  1998-05-09 21:03   ` Richard Henderson
  1998-05-09 22:13   ` Jeffrey A Law
  1998-05-09 16:35 ` John Carr
  2 siblings, 2 replies; 11+ messages in thread
From: Joern Rennecke @ 1998-05-09 15:57 UTC (permalink / raw)
  To: rth; +Cc: egcs

>   (1) We will only garbage collect rtl and trees.  Everything else 
>       will eventually be converted to use plain malloc/free.

I don't think that makes sense.  In some cases, there are actaully a lot
of small temporary allocations of non-tree non-rtl objects which
are to be released together; these are most easily handled whith obstacks.

Just cut the obstack usage down to such applications where they are
a natural fit.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: an experiment with garbage collection
  1998-05-09  3:34 an experiment with garbage collection Richard Henderson
  1998-05-09 15:57 ` Francois-Rene Rideau
  1998-05-09 15:57 ` Joern Rennecke
@ 1998-05-09 16:35 ` John Carr
  1998-05-09 16:35   ` Richard Henderson
  1998-05-09 21:03   ` David S. Miller
  2 siblings, 2 replies; 11+ messages in thread
From: John Carr @ 1998-05-09 16:35 UTC (permalink / raw)
  To: Richard Henderson; +Cc: egcs

I have speculated, without any real evidence, that a copying garbage
collector might speed up gcc by improving locality of reference.  To be
useful this GC would have to run between optimization phases.  Not much is
live across source functions (except for C++ with a lot of inline
functions, but making inline function rtl probably doesn't generate much
garbage).

Can you guess how much work it would take to be able to run a
not-conservative GC in the middle of compiling a function?

Even if you don't like obstack semantics, I think it is still useful to
have a special allocator for rtl to avoid malloc overhead (both memory and
time).  At least special-case the smallest, most common size: 8 bytes (16
on 64 bit systems).


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: an experiment with garbage collection
  1998-05-09 16:35 ` John Carr
@ 1998-05-09 16:35   ` Richard Henderson
  1998-05-09 21:03   ` David S. Miller
  1 sibling, 0 replies; 11+ messages in thread
From: Richard Henderson @ 1998-05-09 16:35 UTC (permalink / raw)
  To: John Carr; +Cc: Richard Henderson, egcs

On Sat, May 09, 1998 at 06:32:24PM -0400, John Carr wrote:
> Can you guess how much work it would take to be able to run a
> not-conservative GC in the middle of compiling a function?

That is what I am doing, actually.  The only thing extra you would
need from the information I am collecting now is the address of the
pointers.  This could be trivially gotten with

	#define ggc_mark_rtx(x)	  ggc_mark_rtxp (&(x))

or similar.

> Even if you don't like obstack semantics, I think it is still useful to
> have a special allocator for rtl to avoid malloc overhead (both memory and
> time).  At least special-case the smallest, most common size: 8 bytes (16
> on 64 bit systems).

This is something I planned to look into after it works.

Oh, I've been able to do some more testing.  GC is currently eating
about 15% of the run time, which is a bit more than I expected.  No
data on size reduction yet.


r~

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: an experiment with garbage collection
  1998-05-09 15:57 ` Joern Rennecke
@ 1998-05-09 21:03   ` Richard Henderson
  1998-05-09 22:13   ` Jeffrey A Law
  1 sibling, 0 replies; 11+ messages in thread
From: Richard Henderson @ 1998-05-09 21:03 UTC (permalink / raw)
  To: Joern Rennecke; +Cc: rth, egcs

On Sat, May 09, 1998 at 10:49:32PM +0100, Joern Rennecke wrote:
> Just cut the obstack usage down to such applications where they are
> a natural fit.

You are right, of course, that this is most sensible.  The main
thing I object to wrt obstacks is the juggling act that's done
with switching them about, particularly in the parser.


r~

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: an experiment with garbage collection
  1998-05-09 16:35 ` John Carr
  1998-05-09 16:35   ` Richard Henderson
@ 1998-05-09 21:03   ` David S. Miller
  1 sibling, 0 replies; 11+ messages in thread
From: David S. Miller @ 1998-05-09 21:03 UTC (permalink / raw)
  To: jfc; +Cc: rth, egcs

   From: John Carr <jfc@mit.edu>
   Date: Sat, 09 May 1998 18:32:24 -0400

   Even if you don't like obstack semantics, I think it is still
   useful to have a special allocator for rtl to avoid malloc overhead
   (both memory and time).  At least special-case the smallest, most
   common size: 8 bytes (16 on 64 bit systems).

I once thought about how efficient something like SLAB (which is used
in the Linux and Solaris kernel) would be for RTL.  A SLAB pool could
be made for each RTL type at zero cost (sans some initialized data
space in the compiler image).

This special SLAB could know about RTL internals, and have something
which effectively is garbage collection which walks through the SLABS
and frees up RTL, because it knows this is RTL.

SLAB is excellent for performance for three reasons:

1) Low overhead on allocations, something like 40 instructions common
   case.

2) It does virtual cache coloring so references to different RTL
   objects across the compiler tend to not collide in the L1 cache.

3) It has constructors and destructors and object persistence
   capabilities.  (ie. you free an RTL object after you are done
   with it, this leaves some known parts of the RTL structure in
   well defined initial states, so when you grab this during an
   rtl_alloc(), next time you need not init these parts of the state)

Such a SLAB could use host configuration parameters to know if it can
use mmap() or brk() to create the caches as allocation requests come
in, depending upon what is available on a particular host.

It's good that people are looking into this area of GCC, it's probably
the highest contributor to gcc's horrible performance.

Later,
David S. Miller
davem@dm.cobaltmicro.com


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: an experiment with garbage collection
  1998-05-09 15:57 ` Joern Rennecke
  1998-05-09 21:03   ` Richard Henderson
@ 1998-05-09 22:13   ` Jeffrey A Law
  1 sibling, 0 replies; 11+ messages in thread
From: Jeffrey A Law @ 1998-05-09 22:13 UTC (permalink / raw)
  To: Joern Rennecke; +Cc: rth, egcs

  In message < 199805092149.WAA26982@phal.cygnus.co.uk >you write:
  > >   (1) We will only garbage collect rtl and trees.  Everything else 
  > >       will eventually be converted to use plain malloc/free.
  > 
  > I don't think that makes sense.  In some cases, there are actaully a lot
  > of small temporary allocations of non-tree non-rtl objects which
  > are to be released together; these are most easily handled whith obstacks.
  > 
  > Just cut the obstack usage down to such applications where they are
  > a natural fit.
I believe there will always be a few places where obstacks are the
right model.  I think a better statement about the goal of the GC
and related work is that we want the capability of using different
memory management techniques instead of just obstacks.

On hope is that by using GC for RTL & trees we can reduce the overall
memory footprint of the compiler *and* open up the possibility for
some optimizations which are not practical when tree and rtl nodes are
allocated from obstacks.

I also strongly believe that using GC for RTL & trees will reduce
bugs in the compiler.  There's quite a bit of hair in the compiler
to switch back and forth between obstacks when allocating tree and
rtl nodes which has led to numerous bugs over the years.  I look
forward to the day when I don't have to worry that some tree node
got allocated on the expression, momentrary, temporary, function,
etc obstack instead of a longer living obstack by mistake.

jeff


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: an experiment with garbage collection
@ 1998-05-11 15:07 Mike Stump
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Stump @ 1998-05-11 15:07 UTC (permalink / raw)
  To: egcs, rth

I generally like this idea.  It solves some of the copy to permanent
type things, the `we freed it too soon' problem, and can wind up
freeing things that aren't used anymore, but we just got the calls to
obstack wrong.  We will just need to be careful to zero out any stray
pointers that we don't need anymore in order to keep memory usage down.

Let us know what type of results you see in working set size, and max
memory used and compilation speed.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: an experiment with garbage collection
@ 1998-05-10 18:07 Michael Meissner
  0 siblings, 0 replies; 11+ messages in thread
From: Michael Meissner @ 1998-05-10 18:07 UTC (permalink / raw)
  To: rth; +Cc: egcs

One thought that occurs to me, is that most passes start out with a scan over
all of the insns.  You could do GC as part of that scan.  Another more limited
possibility, is a lot of phases call regcan.  Regscan could do the gc and
not have to touch other phases into you get into register allocation.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~1998-05-11 15:07 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1998-05-09  3:34 an experiment with garbage collection Richard Henderson
1998-05-09 15:57 ` Francois-Rene Rideau
1998-05-09 11:03   ` Richard Henderson
1998-05-09 15:57 ` Joern Rennecke
1998-05-09 21:03   ` Richard Henderson
1998-05-09 22:13   ` Jeffrey A Law
1998-05-09 16:35 ` John Carr
1998-05-09 16:35   ` Richard Henderson
1998-05-09 21:03   ` David S. Miller
1998-05-10 18:07 Michael Meissner
1998-05-11 15:07 Mike Stump

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).