public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Locality problems caused by size based allocation?
@ 2002-12-16 11:19 Daniel Berlin
  2002-12-17 15:47 ` Jeff Sturm
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Berlin @ 2002-12-16 11:19 UTC (permalink / raw)
  To: gcc

In search of the real cause of the locality problems, I built a copying 
collector based on ggc-page. It specifically makes sure that we don't 
reuse  pages we've just freed when copying (to avoid non-contiguousness 
as much as possible).

It ends up, subtracting gc time, being just as fast/slow as ggc-page 
normally is. In fact, it's slightly slower if i make it not reuse pages 
than it is if i do.

This leads me to believe that our locality problems might actually be 
caused by the fact that we use size based allocation, which has 
basically no locality whatsoever in gcc.

I tested this theory (by changing the mark and sweep ggc-page to not 
size allocate), and it appears to be correct, since ggc-page with mark 
and sweep became just about as fast as my original copying collector 
tests (for large files, the fragmentation started affecting us 
slightly).

What was the rationale behind size based allocation?


I also, based on a offhand suggestion by zack, made a ggc-boehm.
It works (though it fails if you have it collect whenever ggc_collect 
is called, even if i tell it it's not safe to collect without me 
explicitly calling the collection function. It also fails in a few 
cases. We must be hiding pointers somehow), and is faster than ggc-page 
with size allocation, running the same speed as the copying collector 
and the non-size allocating ggc-page (garbage collection times are much 
faster, too).

It obviously doesn't collect as much garbage as the accurate collectors 
(roughly half as much, in fact).
But the fact that it was as fast lends credence to the theory that it's 
size based allocation that is doing us in.

--Dan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Locality problems caused by size based allocation?
  2002-12-16 11:19 Locality problems caused by size based allocation? Daniel Berlin
@ 2002-12-17 15:47 ` Jeff Sturm
  2002-12-17 16:13   ` Daniel Berlin
  0 siblings, 1 reply; 9+ messages in thread
From: Jeff Sturm @ 2002-12-17 15:47 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gcc


On Mon, 16 Dec 2002, Daniel Berlin wrote:
> This leads me to believe that our locality problems might actually be
> caused by the fact that we use size based allocation, which has
> basically no locality whatsoever in gcc.

Interesting.

What exactly is "size based allocation"?

> I also, based on a offhand suggestion by zack, made a ggc-boehm.

I've done this also...

> It works (though it fails if you have it collect whenever ggc_collect
> is called, even if i tell it it's not safe to collect without me
> explicitly calling the collection function. It also fails in a few
> cases. We must be hiding pointers somehow),

Yes.  IIRC I found two places where the backend hides pointers.  One is in
cse_basic_block:

  qty_table
    = (struct qty_table_elem *) xmalloc ((max_qty - max_reg)
                     * sizeof (struct qty_table_elem));
  qty_table -= max_reg;

> and is faster than ggc-page
> with size allocation, running the same speed as the copying collector
> and the non-size allocating ggc-page (garbage collection times are much
> faster, too).

I didn't see that.  I found no better than a 5% improvement (measured as
wall-clock time) using my ggc-boehm, no matter how I tuned it, with
generational collection enabled or not.

Anyhow I didn't find the results interesting enough  to continue the
experiment.

> It obviously doesn't collect as much garbage as the accurate collectors
> (roughly half as much, in fact).

I very much doubt that.  Pointer misidentification is certainly common
with a conversative collector, but 50% lossage is quite rare.  And if
that were happening you'd see plenty of blacklisted pointers (unless all
messages are suppressed, as with -DSILENT).

How are you measuring free space?  GC_get_total_bytes (for instance) isn't
very accurate; it doesn't count pages waiting to be swept or free lists.

Jeff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Locality problems caused by size based allocation?
  2002-12-17 15:47 ` Jeff Sturm
@ 2002-12-17 16:13   ` Daniel Berlin
  2002-12-17 20:03     ` Zack Weinberg
                       ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Daniel Berlin @ 2002-12-17 16:13 UTC (permalink / raw)
  To: Jeff Sturm; +Cc: gcc



On Tue, 17 Dec 2002, Jeff Sturm wrote:

>
> On Mon, 16 Dec 2002, Daniel Berlin wrote:
> > This leads me to believe that our locality problems might actually be
> > caused by the fact that we use size based allocation, which has
> > basically no locality whatsoever in gcc.
>
> Interesting.
>
> What exactly is "size based allocation"?
What ggc-page does, which is segregate objects based on size rather than
age or object type.

This doesn't work well for RTL, for instance, where you have 3 (or was it
2 or 4, i forget which) sizes of RTL, so size allocation guarantees that
in a linear walk of RTL, you'll end up moving across all different
random pages.


> > > I also, based on a offhand suggestion by zack, made a ggc-boehm.
>
> I've done this also...
>
> > It works (though it fails if you have it collect whenever ggc_collect
> > is called, even if i tell it it's not safe to collect without me
> > explicitly calling the collection function. It also fails in a few
> > cases. We must be hiding pointers somehow),
>
> Yes.  IIRC I found two places where the backend hides pointers.  One is in
> cse_basic_block:
>
>   qty_table
>     = (struct qty_table_elem *) xmalloc ((max_qty - max_reg)
>                      * sizeof (struct qty_table_elem));
>   qty_table -= max_reg;
>
> > and is faster than ggc-page
> > with size allocation, running the same speed as the copying collector
> > and the non-size allocating ggc-page (garbage collection times are much
> > faster, too).
>
> I didn't see that.  I found no better than a 5% improvement (measured as
> wall-clock time) using my ggc-boehm, no matter how I tuned it, with
> generational collection enabled or not.

What version were you using?

Also, I also have a few tests where locality matters quite a bit,
otherwise you'll completely trash the caches.

>
> Anyhow I didn't find the results interesting enough  to continue the
> experiment.
Nor did I. It crashes in some of my testcases (we must be hiding pointers
or something).

>
> > It obviously doesn't collect as much garbage as the accurate collectors
> > (roughly half as much, in fact).
>
> I very much doubt that.
I'm going by.

  Pointer misidentification is certainly common
> with a conversative collector, but 50% lossage is quite rare.

Not in gcc.
  And if
> that were happening you'd see plenty of blacklisted pointers (unless all
> messages are suppressed, as with -DSILENT).

You do.
100's of k worth.

>
> How are you measuring free space?  GC_get_total_bytes (for instance) isn't
> very accurate; it doesn't count pages waiting to be swept or free lists.
I'm measuring *used* space before and after collection.
Not free space.
And I do it with gc_get_heap_size() - gc_get_free_bytes () before and
after a collection.
IE size of heap - number of bytes free in the heap.
I also looked at the stats output to make sure, and it wasn't collecting
much in most collections, which jived with the 50% (it's actually worse in
a lot of collections i'm being nice).

For example, on one test case, we go from
22404k -> 21540k in the first collection, and the stats on stderr say it
only collects ~1 meg of garbage (IE the numbers match).

ggc-page and ggc-copy go ~20meg -> ~4.9 meg (ggc-page is 21142k -> 4973k).





>
> Jeff
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Locality problems caused by size based allocation?
  2002-12-17 16:13   ` Daniel Berlin
@ 2002-12-17 20:03     ` Zack Weinberg
  2002-12-17 21:01       ` Daniel Berlin
  2002-12-17 23:16     ` Jeff Sturm
  2002-12-18  5:44     ` Fergus Henderson
  2 siblings, 1 reply; 9+ messages in thread
From: Zack Weinberg @ 2002-12-17 20:03 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Jeff Sturm, gcc

Daniel Berlin <dberlin@dberlin.org> writes:

> On Tue, 17 Dec 2002, Jeff Sturm wrote:
>> What exactly is "size based allocation"?
> What ggc-page does, which is segregate objects based on size rather than
> age or object type.
>
> This doesn't work well for RTL, for instance, where you have 3 (or was it
> 2 or 4, i forget which) sizes of RTL, so size allocation guarantees that
> in a linear walk of RTL, you'll end up moving across all different
> random pages.

FYI, I plan to address this particular issue after the New Year, by
picking up Dave Miller's suggestion from awhile back to avoid ever
allocating long-lived RTL, and then reverting RTL to obstacks -- but
just one obstack, data on which lives only during calls to expand_body
(or something like that).

zw

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Locality problems caused by size based allocation?
  2002-12-17 20:03     ` Zack Weinberg
@ 2002-12-17 21:01       ` Daniel Berlin
  0 siblings, 0 replies; 9+ messages in thread
From: Daniel Berlin @ 2002-12-17 21:01 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Jeff Sturm, gcc


On Tuesday, December 17, 2002, at 08:12  PM, Zack Weinberg wrote:

> Daniel Berlin <dberlin@dberlin.org> writes:
>
>> On Tue, 17 Dec 2002, Jeff Sturm wrote:
>>> What exactly is "size based allocation"?
>> What ggc-page does, which is segregate objects based on size rather 
>> than
>> age or object type.
>>
>> This doesn't work well for RTL, for instance, where you have 3 (or 
>> was it
>> 2 or 4, i forget which) sizes of RTL, so size allocation guarantees 
>> that
>> in a linear walk of RTL, you'll end up moving across all different
>> random pages.
>
> FYI, I plan to address this particular issue after the New Year, by
> picking up Dave Miller's suggestion from awhile back to avoid ever
> allocating long-lived RTL, and then reverting RTL to obstacks -- but
> just one obstack, data on which lives only during calls to expand_body
> (or something like that).
obstacks aren't something we should revert to.
Implement a pool/subpool system like APR has, or copy the region stuff 
out of libbanshee/libcompat  (it does subregions and whatnot. It's 
actually part of a runtime system for a compiler based partly on gcc 
that does safe regions, so it's rather well implemented).  It will 
compile without problems and without anything outside that directory 
being needed (it has a name conflict in profile.[ch], but renaming 
works fine)
Obstacks just don't have enough flexibility or promise that we should 
go back to them.
It doesn't change the greater locality problem however.
I would think mark and sweep regions/pools (you can have subpools and 
whatnot, for per iteration things) per pass, one for rtl, and one for 
trees, would be a better idea.
>
> zw

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Locality problems caused by size based allocation?
  2002-12-17 16:13   ` Daniel Berlin
  2002-12-17 20:03     ` Zack Weinberg
@ 2002-12-17 23:16     ` Jeff Sturm
  2002-12-18  0:41       ` Zack Weinberg
  2002-12-18  5:44     ` Fergus Henderson
  2 siblings, 1 reply; 9+ messages in thread
From: Jeff Sturm @ 2002-12-17 23:16 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gcc

On Tue, 17 Dec 2002, Daniel Berlin wrote:
> > What exactly is "size based allocation"?
> What ggc-page does, which is segregate objects based on size rather than
> age or object type.

Got it.  But that puzzles me:  the Boehm collector allocates one size
of object per heap block, fitting your desription of size-based
allocation, yet there was a measureable improvement over ggc-page?

> What version were you using?

Whatever was in the mainline GCC tree, probably 6.0.

> > Anyhow I didn't find the results interesting enough  to continue the
> > experiment.
>
> Nor did I. It crashes in some of my testcases (we must be hiding pointers
> or something).

Yes, see my earlier example from cse.c.  But the pointer hiding is easy to
avoid, once identified.

I was able to bootstrap GCC with boehm-gc, at one time.

> > that were happening you'd see plenty of blacklisted pointers (unless all
> > messages are suppressed, as with -DSILENT).
>
> You do.
> 100's of k worth.

Hmm... that's odd.  Either your test cases exhibit very different behavior
than mine (I picked some of the larger sources in /gcc/) or your collector
is configured differently.

You'd see this if the heap blocks are intermixed with other mmap'ed
regions, for instance.

May I ask for your ggc-boehm patches (offline)?

Jeff

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Locality problems caused by size based allocation?
  2002-12-17 23:16     ` Jeff Sturm
@ 2002-12-18  0:41       ` Zack Weinberg
  2002-12-18 12:59         ` Jeff Sturm
  0 siblings, 1 reply; 9+ messages in thread
From: Zack Weinberg @ 2002-12-18  0:41 UTC (permalink / raw)
  To: Jeff Sturm; +Cc: Daniel Berlin, gcc

Jeff Sturm <jsturm@one-point.com> writes:

> On Tue, 17 Dec 2002, Daniel Berlin wrote:
>> > What exactly is "size based allocation"?
>> What ggc-page does, which is segregate objects based on size rather than
>> age or object type.
>
> Got it.  But that puzzles me:  the Boehm collector allocates one size
> of object per heap block, fitting your desription of size-based
> allocation, yet there was a measureable improvement over ggc-page?

The Boehm collector may choose better size buckets.  Our current
choices are crap, and I have statistics to prove it.  (details will
cheerfully be provided, but not now, because I'm not going to sit and
wait for gnumeric to paint its window over this remote X11 session).

Also, I believe it uses a generational algorithm, which should knock
down the large amount of time spent marking live objects.

zw

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Locality problems caused by size based allocation?
  2002-12-17 16:13   ` Daniel Berlin
  2002-12-17 20:03     ` Zack Weinberg
  2002-12-17 23:16     ` Jeff Sturm
@ 2002-12-18  5:44     ` Fergus Henderson
  2 siblings, 0 replies; 9+ messages in thread
From: Fergus Henderson @ 2002-12-18  5:44 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: Jeff Sturm, gcc

On 17-Dec-2002, Daniel Berlin <dberlin@dberlin.org> wrote:
> 
> On Tue, 17 Dec 2002, Jeff Sturm wrote:
> > Pointer misidentification is certainly common
> > with a conversative collector, but 50% lossage is quite rare.
> 
> Not in gcc.
>
> > And if
> > that were happening you'd see plenty of blacklisted pointers (unless all
> > messages are suppressed, as with -DSILENT).
> 
> You do.
> 100's of k worth.

In that case, it may help to provide the collector with a little more
type information, e.g. using GC_malloc_atomic() rather than GC_malloc()
where appropriate (e.g. for things like strings and floating point numbers).
Often a small number of allocation sites are responsible for the allocations
which result in most of the words which are misidentified as pointers.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Locality problems caused by size based allocation?
  2002-12-18  0:41       ` Zack Weinberg
@ 2002-12-18 12:59         ` Jeff Sturm
  0 siblings, 0 replies; 9+ messages in thread
From: Jeff Sturm @ 2002-12-18 12:59 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Daniel Berlin, gcc

On Tue, 17 Dec 2002, Zack Weinberg wrote:
> > Got it.  But that puzzles me:  the Boehm collector allocates one size
> > of object per heap block, fitting your desription of size-based
> > allocation, yet there was a measureable improvement over ggc-page?
>
> The Boehm collector may choose better size buckets.  Our current
> choices are crap, and I have statistics to prove it.  (details will
> cheerfully be provided, but not now, because I'm not going to sit and
> wait for gnumeric to paint its window over this remote X11 session).

That makes sense.  Then it should be an interesting experiment to apply
Boehm's MERGE_SIZES algorithm to ggc-page.

> Also, I believe it uses a generational algorithm, which should knock
> down the large amount of time spent marking live objects.

Dan didn't say whether he had done GC_enable_incremental().  Besides,
Boehm's incremental/generational mode doesn't seem to help that much with
overall collector performance... I studied this quite a bit for Java
programs, and can forward details to anyone interested.

Jeff

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2002-12-18 20:12 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-12-16 11:19 Locality problems caused by size based allocation? Daniel Berlin
2002-12-17 15:47 ` Jeff Sturm
2002-12-17 16:13   ` Daniel Berlin
2002-12-17 20:03     ` Zack Weinberg
2002-12-17 21:01       ` Daniel Berlin
2002-12-17 23:16     ` Jeff Sturm
2002-12-18  0:41       ` Zack Weinberg
2002-12-18 12:59         ` Jeff Sturm
2002-12-18  5:44     ` Fergus Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).