public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Varray memory consumption strikes back
@ 2004-09-04  9:59 Jan Hubicka
  2004-09-04 18:11 ` Jeffrey A Law
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Hubicka @ 2004-09-04  9:59 UTC (permalink / raw)
  To: gcc, law, amacleod

Jeff,
In January I was concerned about memory consumption:

> I am somewhat concerned about use of varrays in GGC memory that produce
> relatively large amount of garbage (a lot of that without good reason as
> we can use malloc scheme instead, just we don't), thus I've implemented
> statistics routines.  Quick checking found that for Geralds testcase
> does about 13MB of varrays, while tree-SSA does 63MB.  The average
> overhead per varray is over 10KB so operands are not to blame here IMO.

and the proposed patch to make varrays allocatable either in GGC and
memory flag has been rejected and you mentioned to have better sollution
in prototype stage (moving away from varrays to sane datastructures and
improving them at the same time).

Today it seems that memory consumption on the same testcase is 122MB (up
from 63MB I reported but with realistic chance that 43MB of immediate
uses varray might go away before the freeze, but these are ggcfreed
anyway so missing in the garbage statistics) counting only the GGC
garbage, not the overall memory allocation like I did in January before
ggc_free has been implemented as can be seen in the -fmem-report:

varray.c:138 (varray_init)                         59901920: 8.5% 23203392: 6.6%      11280: 0.0%   12794096: 7.5%     419188
varray.c:171 (varray_grow)                         68242924: 9.7% 224774832:63.6%     271448: 0.3%  83130484:49.0%     244896
source location                                    Garbage        Freed               Leak          Overhead            Times

Is the better sollution anywhere near to be ready (huge portion of the
arrays is still attributed to tree-ssa-dom code as can be seen in the
report attached)?

If not, I would suggest to fall back for the allocation patch at least
for 3.5 release.  It has been able to cut significant portion of the
memory usage or perhaps more consistenly use ggc_free to explicitely
deallocate varrays but I don't like the overall increased overhead of
this allocation method either.

Honza

VARRAY Kind            Count      Bytes  Resized copied
-------------------------------------------------------
bbs_to_duplicate         208      24656      15      15
reg_base_value           217     432680     717      23
Elimination Constant Copies    334      64128       0       0
immediate_uses        254956   43235680  144954  144954
vrp_variables           1775      85200       0       0
reg_equiv_memory_loc       1     440904     668       4
local_classes              1         96       0       0
processed_ptrs          1002    1358064     949     949
num_references          1002    1444588     523      92
used_temp_slots          334      68768     580     580
aliases                26707    1772144   10223   10223
ttype_data               126      20160       0       0
mangling substitutions      1        544       6       5
ehspec_data              126      12096       0       0
saved_tokens            2402     127064     107     107
ssdf_decls                 1        288       0       0
referenced_vars          334    1580128    1177    1177
block_locators_locs      334      85184     147     147
reg_n_info                 1       7816      14       9
build uses               334      37408       0       0
ssa_names table          334    6489088    1074    1074
work list               1336    1155840     517     517
basic blocks             334     100440       0       0
block_locators_blocks    334     159680     147     147
inlined_fns              312     137216     141     141
line_locators_locs       334      53440       0       0
prologue                   1       5580     505       2
Elimination Edge List    334      37408       0       0
file_locators_locs       334      53440       0       0
ib_boundaries_block      334    1381888     489     489
 Elimination Stack       334      50768       0       0
classes of registers early clobbered in an insn    334      37408       0       0
sibcall_epilogue           1        428      37       4
const_and_copies        1002   10140376      34       2
dest_array               374      41888       0       0
stack                  28523    3214692       0       0
build defs               334      24048       0       0
work_stack              2076     783448       0       0
epilogue                   1       6768     609       2
build v_may_defs         334     103328     357     357
build vuses              334      48368      99      99
vrp records             3213     154224       0       0
build v_must_defs        334      37408       0       0
current_lang_base      10355    1159760       0       0
first_partition          932     470516     583     577
interesting_ssa_edges    334      64608       3       3
basic blocks for the next iter.    334     100440       0       0
varying_ssa_edges        334      80768      91      91
redirection data         191       9168       0       0
cfg_blocks               334      66208      12      12
RTTI decls                 1       1312       4       4
VARS worklist           1002     117504      52      52
COND worklist           1002     223984     645     645
freelist                3746     540320    8594    8594
block_data              3746     540320    8594    8594
strings                 2402     691776       0       0
block_defs             28179   19576768   30032   30032
trees                    932     911208     583     580
vrp_data                1002   10140376      34       2
alias sets                 1      10656    1319       9
tpa nodes                269      22240       0       0
basic_block_info         668     382864    2041     785
tpa to clear             269      62408       0       0
label to block map       334     362008    1007     566
inline_parm_levels         1         48       0       0
block_nonzero_vars      6413     401232    3637    3637
block_avail_exprs       8800    1815520     719     719
part_link                269      46200       0       0
Static declarations        4       5088       1       1
line_locators_lines      334      53440       0       0
block_const_and_copies   7947    5335568   28111   28111
case_labels               21       2528       6       6
local_names                1         96       0       0
action_record_data       126      12096       0       0
file_locators_files      334      96192       0       0
stmts_to_rescan         5213    1094816     524     524
succ                   12409     601680   12653     244
insn_addresses           334     638272       0       0
deferred_fns               1      65568       8       8
fns                    19441     882688       0       0
pending_statics            1       8224       5       5
Elimination Node List    334      90848       0       0
-------------------------------------------------------
Total                 449727  121700084
-------------------------------------------------------

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Varray memory consumption strikes back
  2004-09-04  9:59 Varray memory consumption strikes back Jan Hubicka
@ 2004-09-04 18:11 ` Jeffrey A Law
  2004-09-04 21:33   ` Jan Hubicka
  0 siblings, 1 reply; 14+ messages in thread
From: Jeffrey A Law @ 2004-09-04 18:11 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc, amacleod

On Sat, 2004-09-04 at 03:59, Jan Hubicka wrote:
> and the proposed patch to make varrays allocatable either in GGC and
> memory flag has been rejected and you mentioned to have better sollution
> in prototype stage (moving away from varrays to sane datastructures and
> improving them at the same time).
Well, I haven't had time to pick it up, but it wouldn't be terribly
hard for you to do so.  I would prefer reducing the amount of
varrays we carry around as opposed to bypassing our memory allocation
system design.

The first step is already done (by Diego), specifically we needed a
formal SSA_NAME table.  As I said, this is already done.

What needs to happen now is to expand that simple table of SSA_NAMEs
to be a table of SSA_NAMEs and their global properties (for example
their equivalences and nonzero status indicators).

Once you do that, the const_and_copies global varray disappears. 

The second (closely related change) is that value range propagation
(currently under development by Diego) needs to replace the very
limited form we have in dom, which causes the VRP_DATA varray to
totally disappear, along with the VRP_VARIABLES varray.  Again,
this range information needs to be stored in the SSA_NAME table
so that we don't just punt the problem into CCP.

[ Note that both CONST_AND_COPIES and VRP_DATA are grown as the
  number of SSA_NAMEs grow, which can lead to significant
  over-usage of varray space. ]

There's more that can be done, but kililng those two global varrays
would be a huge step forward and it's not a huge amount of work.

I can't emphasize enough how bad I think it would be to make
varrays allocable in multiple memory regions.

jeff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Varray memory consumption strikes back
  2004-09-04 18:11 ` Jeffrey A Law
@ 2004-09-04 21:33   ` Jan Hubicka
  2004-09-14  5:53     ` Jeffrey A Law
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Hubicka @ 2004-09-04 21:33 UTC (permalink / raw)
  To: Jeffrey A Law; +Cc: Jan Hubicka, gcc, amacleod

> On Sat, 2004-09-04 at 03:59, Jan Hubicka wrote:
> > and the proposed patch to make varrays allocatable either in GGC and
> > memory flag has been rejected and you mentioned to have better sollution
> > in prototype stage (moving away from varrays to sane datastructures and
> > improving them at the same time).
> Well, I haven't had time to pick it up, but it wouldn't be terribly
> hard for you to do so.  I would prefer reducing the amount of
> varrays we carry around as opposed to bypassing our memory allocation
> system design.

Yes, this is my prefference too and I did some work on this already too.
However it is depressing to see that the amount of varrays in the code
is increasing very quickly and consistently.

> 
> The first step is already done (by Diego), specifically we needed a
> formal SSA_NAME table.  As I said, this is already done.

Agreed.
> 
> What needs to happen now is to expand that simple table of SSA_NAMEs
> to be a table of SSA_NAMEs and their global properties (for example
> their equivalences and nonzero status indicators).

How this is different from putting this directly into SSA_NAMEs?
> 
> Once you do that, the const_and_copies global varray disappears. 
> 
> The second (closely related change) is that value range propagation
> (currently under development by Diego) needs to replace the very
> limited form we have in dom, which causes the VRP_DATA varray to
> totally disappear, along with the VRP_VARIABLES varray.  Again,
> this range information needs to be stored in the SSA_NAME table
> so that we don't just punt the problem into CCP.
> 
> [ Note that both CONST_AND_COPIES and VRP_DATA are grown as the
>   number of SSA_NAMEs grow, which can lead to significant
>   over-usage of varray space. ]

This is actually not included in the statistics, at least not in the
VARRAY kind block that just take optimistic estimation that varray takes
as much memory as is the maximal amount of elements in it.
> 
> There's more that can be done, but kililng those two global varrays
> would be a huge step forward and it's not a huge amount of work.

Looking on the top memory consumers (more than 1MB):

VARRAY Kind            Count      Bytes  Resized copied
-------------------------------------------------------
immediate_uses        254956   43235680  144954  144954
block_defs             28179   19576768   30032   30032
const_and_copies        1002   10140376      34       2
vrp_data                1002   10140376      34       2
ssa_names table          334    6489088    1074    1074
block_const_and_copies  7947    5335568   28111   28111
stack                  28523    3214692       0       0
block_avail_exprs       8800    1815520     719     719
aliases                26707    1772144   10223   10223
num_references          1002    1444588     523      92
processed_ptrs          1002    1358064     949     949
ib_boundaries_block      334    1381888     489     489
referenced_vars          334    1580128    1177    1177
work list               1336    1155840     517     517
current_lang_base      10355    1159760       0       0
stmts_to_rescan         5213    1094816     524     524

What you suggest here (and what is definitly good thing to do) will kill
constn_and_copies, vrp_data and replace them with presumably array-alike
ssa_names table.  These two arrays together are rougly 8% of overall
memory consumed by varrays and I would be surprised that the increased
ssa names would be terribly smaller.

I think we need to look the problem from the other side.

At least I do have dificulty to find examples of datastructures where
using dynamically sized arrays actually makes sense and even fewer
examples where using garbage collection help something.  They works as
all kind of stacks, worklists, queues mostly local with well defined
lifetimes.

I would say that we need way to make use of such datastructures cheaper
as current varray overhead is killing us.

As you pointed out in the earlier mails in January, the actual
percentage of places where we do need growing bounds checked garbage
allocated array is tiny.

We probably ought to have convenient way to do stacks - apparently
people (including me) preffer varrays instead of obstacks just because
the API is friendlier.
> 
> I can't emphasize enough how bad I think it would be to make
> varrays allocable in multiple memory regions.

I still think that for local data with obvious lifetimes (the case of
majority varrays in the list above) we should not use GGC.  It is both
faster and safer as we get sanity checking.

If you don't like varrays in the normal memory, would turning the
stack-like datastructures above to obstacks (or newly invented
friendlier stack datastructure perhaps with fixed object size)
and array-like datastructures to simple C arrays (where we would lose
bounds checking) or newly invented bound checked arrays without growing
facility make any sense to you?

If not (and under assumption that there is already plan for
immediate_uses, const_and_copies and vrp_data) what would you suggest
for block_defs, block_const_and_copies, block_avail_exprs, aliases,
num_references and friends?

Honza
> 
> jeff
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Varray memory consumption strikes back
  2004-09-04 21:33   ` Jan Hubicka
@ 2004-09-14  5:53     ` Jeffrey A Law
  2004-09-14 10:17       ` Richard Henderson
                         ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Jeffrey A Law @ 2004-09-14  5:53 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc, amacleod

On Sat, 2004-09-04 at 15:33, Jan Hubicka wrote:
> > On Sat, 2004-09-04 at 03:59, Jan Hubicka wrote:
> > > and the proposed patch to make varrays allocatable either in GGC and
> > > memory flag has been rejected and you mentioned to have better sollution
> > > in prototype stage (moving away from varrays to sane datastructures and
> > > improving them at the same time).
> > Well, I haven't had time to pick it up, but it wouldn't be terribly
> > hard for you to do so.  I would prefer reducing the amount of
> > varrays we carry around as opposed to bypassing our memory allocation
> > system design.
> 
> Yes, this is my prefference too and I did some work on this already too.
> However it is depressing to see that the amount of varrays in the code
> is increasing very quickly and consistently.
The varrays are a natural way of building unwindable state into
the optimizer, which in turn avoids the horrible path following
behaviors found in CSE.  However, there are ways to reduce the
number of varrays we allocate.

One such way is to recycle them as I've already suggested.

ggc_free-ing them may be a solution as well, but I have my
reservations given my experience with this code already.

Another would be to have a single varray for expressions rather
than having a varray per block for expressions.  Each element in
that varray would have the expression and the block associated
with the expression.

In the finalizer, you pop elements off that varray as long as
their block matches the block we are leaving.  Remove each
of the popped elements from the available expression hash
table.

The advantage of that scheme is that we allocate fewer block
local varrays.  However, the one varray we do allocate will
tend to be large and we have to allocate a two word element
for each entry in that varray (pointer for the expression
and a pointer for the block containing the expression).  It's
not clear if that will be a notable win over what we've got
or what has been proposed.

I haven't prototyped that code, but I don't think it would be
terribly hard to do so and run some experiments to see if it's
a worthwhile direction.

I believe you would do something similar with block local
stmts_to_rescan and vrp_variables.  You might even use a 
bitmap for vrp_variables.

block_defs ought to go away in the future.  The only reason we
track the current definition of a variable is for the old style
of jump threading.  It should be possible to rewrite the code
which determines when a block has a threadable COND_EXPR which
does not rely on maintaining the current definition of any variable.

The block local const_and_copies and nonzero_vars are trickier to
make go away.

 
> > What needs to happen now is to expand that simple table of SSA_NAMEs
> > to be a table of SSA_NAMEs and their global properties (for example
> > their equivalences and nonzero status indicators).
> 
> How this is different from putting this directly into SSA_NAMEs?
Not significantly so.  Though in general I don't like the idea of
stuffing more crap into the tree structures.  


> 
> If you don't like varrays in the normal memory, would turning the
> stack-like datastructures above to obstacks (or newly invented
> friendlier stack datastructure perhaps with fixed object size)
> and array-like datastructures to simple C arrays (where we would lose
> bounds checking) or newly invented bound checked arrays without growing
> facility make any sense to you?
Again, the fundamental problem is definable lifetime and the fact
that most objects in GCC do not have a definable lifetime.  Hell,
one could even make the claim that no objects in GCC have a
definable lifetime because of the rats nest of pointers we have.

jeff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Varray memory consumption strikes back
  2004-09-14  5:53     ` Jeffrey A Law
@ 2004-09-14 10:17       ` Richard Henderson
  2004-09-14 15:44         ` Jeffrey A Law
  2004-09-14 12:52       ` Jan Hubicka
  2004-09-14 15:41       ` Jan Hubicka
  2 siblings, 1 reply; 14+ messages in thread
From: Richard Henderson @ 2004-09-14 10:17 UTC (permalink / raw)
  To: Jeffrey A Law; +Cc: Jan Hubicka, gcc, amacleod

On Mon, Sep 13, 2004 at 10:48:15PM -0600, Jeffrey A Law wrote:
> Another would be to have a single varray for expressions rather
> than having a varray per block for expressions.  Each element in
> that varray would have the expression and the block associated
> with the expression.
> 
> In the finalizer, you pop elements off that varray as long as
> their block matches the block we are leaving.  Remove each
> of the popped elements from the available expression hash
> table.

Or: Push a marker expression, e.g. NULL, on entry to any block.
In the finalizer, pop entries until reaching the marker.


r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Varray memory consumption strikes back
  2004-09-14  5:53     ` Jeffrey A Law
  2004-09-14 10:17       ` Richard Henderson
@ 2004-09-14 12:52       ` Jan Hubicka
  2004-09-14 15:41       ` Jan Hubicka
  2 siblings, 0 replies; 14+ messages in thread
From: Jan Hubicka @ 2004-09-14 12:52 UTC (permalink / raw)
  To: Jeffrey A Law; +Cc: Jan Hubicka, gcc, amacleod

> > 
> > If you don't like varrays in the normal memory, would turning the
> > stack-like datastructures above to obstacks (or newly invented
> > friendlier stack datastructure perhaps with fixed object size)
> > and array-like datastructures to simple C arrays (where we would lose
> > bounds checking) or newly invented bound checked arrays without growing
> > facility make any sense to you?
> Again, the fundamental problem is definable lifetime and the fact
> that most objects in GCC do not have a definable lifetime.  Hell,
> one could even make the claim that no objects in GCC have a
> definable lifetime because of the rats nest of pointers we have.

According to my statistics, this is not true at all.  The object with
undefined lifetimes (ie tree/RTL) are together are just minor portion of
GGC pool consumption.  Yet most of these never survive ggc_collect and
are thus local to single BB.

I used to have patch that tracks age of object (ie how many collections
it survives) and it shows pretty clearly what allocations seems to be
local to block.  I can rescuesce it if it looks interesting.

Honza
> 
> jeff
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Varray memory consumption strikes back
  2004-09-14  5:53     ` Jeffrey A Law
  2004-09-14 10:17       ` Richard Henderson
  2004-09-14 12:52       ` Jan Hubicka
@ 2004-09-14 15:41       ` Jan Hubicka
  2004-09-16 16:38         ` Jeffrey A Law
  2 siblings, 1 reply; 14+ messages in thread
From: Jan Hubicka @ 2004-09-14 15:41 UTC (permalink / raw)
  To: Jeffrey A Law; +Cc: Jan Hubicka, gcc, amacleod

> The varrays are a natural way of building unwindable state into
> the optimizer, which in turn avoids the horrible path following
> behaviors found in CSE.  However, there are ways to reduce the
> number of varrays we allocate.

Actually it seems to me that use of varrays for stack holding local
datastructures of the recursive walk of dominance tree is one of these
that makes kind of sense.  What I don't understand at all is why these
needs to be garbage collected and why we didn't developed some stack
implementation that don't need moving data around at reallocation and
don't have more convenient API than obstacks.

What is still behind me why we do have so often big varrays of small
varrays (like "vrp records" array where each varray occupy 24 bytes and
is never resized).
> 
> One such way is to recycle them as I've already suggested.
> 
> ggc_free-ing them may be a solution as well, but I have my
> reservations given my experience with this code already.
> 
> Another would be to have a single varray for expressions rather
> than having a varray per block for expressions.  Each element in
> that varray would have the expression and the block associated
> with the expression.
> 
> In the finalizer, you pop elements off that varray as long as
> their block matches the block we are leaving.  Remove each
> of the popped elements from the available expression hash
> table.
> 
> The advantage of that scheme is that we allocate fewer block
> local varrays.  However, the one varray we do allocate will
> tend to be large and we have to allocate a two word element
> for each entry in that varray (pointer for the expression
> and a pointer for the block containing the expression).  It's
> not clear if that will be a notable win over what we've got
> or what has been proposed.
> 
> I haven't prototyped that code, but I don't think it would be
> terribly hard to do so and run some experiments to see if it's
> a worthwhile direction.

This sounds like _nea_t idea, especially if we later develop sanier
representation of the stack.

It is also easy to avoid the explicit BB values for each entry in the
array for instance by pushing NULL as frame separator.  Would be
possible for you to give this a try?

> block_defs ought to go away in the future.  The only reason we
> track the current definition of a variable is for the old style
> of jump threading.  It should be possible to rewrite the code

What is the status of jump threading?

> which determines when a block has a threadable COND_EXPR which
> does not rely on maintaining the current definition of any variable.
> 
> The block local const_and_copies and nonzero_vars are trickier to
> make go away.

block_const_and_copies is roughly 200Kb of memory overall,
block_nonzero_vars just 33Kb, so these definitly can be dealt with
later.
> > > What needs to happen now is to expand that simple table of SSA_NAMEs
> > > to be a table of SSA_NAMEs and their global properties (for example
> > > their equivalences and nonzero status indicators).
> > 
> > How this is different from putting this directly into SSA_NAMEs?
> Not significantly so.  Though in general I don't like the idea of
> stuffing more crap into the tree structures.  

Well, technically the ssa_names table is also part of our IL, so we put
it there anyway.
I am not really fond of it either, but this is what we got into in the
past discussion.

Honza

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Varray memory consumption strikes back
  2004-09-14 10:17       ` Richard Henderson
@ 2004-09-14 15:44         ` Jeffrey A Law
  0 siblings, 0 replies; 14+ messages in thread
From: Jeffrey A Law @ 2004-09-14 15:44 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Jan Hubicka, gcc, amacleod

On Tue, 2004-09-14 at 02:00, Richard Henderson wrote:
> On Mon, Sep 13, 2004 at 10:48:15PM -0600, Jeffrey A Law wrote:
> > Another would be to have a single varray for expressions rather
> > than having a varray per block for expressions.  Each element in
> > that varray would have the expression and the block associated
> > with the expression.
> > 
> > In the finalizer, you pop elements off that varray as long as
> > their block matches the block we are leaving.  Remove each
> > of the popped elements from the available expression hash
> > table.
> 
> Or: Push a marker expression, e.g. NULL, on entry to any block.
> In the finalizer, pop entries until reaching the marker.
Yup.  That may be the best way to deal with the block local 
avail_exprs varray.

jeff


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Varray memory consumption strikes back
  2004-09-14 15:41       ` Jan Hubicka
@ 2004-09-16 16:38         ` Jeffrey A Law
  2004-09-16 23:12           ` Jan Hubicka
  0 siblings, 1 reply; 14+ messages in thread
From: Jeffrey A Law @ 2004-09-16 16:38 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc, amacleod

On Tue, 2004-09-14 at 09:14, Jan Hubicka wrote:
> > The varrays are a natural way of building unwindable state into
> > the optimizer, which in turn avoids the horrible path following
> > behaviors found in CSE.  However, there are ways to reduce the
> > number of varrays we allocate.
> 
> What I don't understand at all is why these
> needs to be garbage collected

1. If you don't recycle them, then you ended up with hundreds of
thousands of varrays -- enough that we were getting killed allocating
the bloody things and to a lesser extent GC'ing them.

2. Once we started recycling them, they didn't have a nice clear
defined lifetime.  ie, they were natural for the GC system to
manage.




>  and why we didn't developed some stack
> implementation that don't need moving data around at reallocation and
> don't have more convenient API than obstacks.
Well, if you want to go forth and do that, feel free.  It's
nontrivial.

Note that with the new scheme I've started checking in, we don't
have block local varrays anymore, but instead we have one single
toplevel varray with markers within it to denote where data from
the each block ends.

A nice side effect of that is we can look at other allocation schemes
now since we've got a small number of toplevel objects with a well
defined lifetime.  Hell, they could even be implemented with obstacks
(one obstack per toplevel varray).


> 
> What is still behind me why we do have so often big varrays of small
> varrays (like "vrp records" array where each varray occupy 24 bytes and
> is never resized).
Well, the size of the toplevel vrp_records varray is the number of
SSA_NAMES.  ie, any SSA_NAME might have range information associated
with it.  [ non-gimple names can't have range info in practice,
so maybe a hash table is better... Hmmm. ]


For each of the SSA_NAMEs, you can have multiple statements which
restrict the range the SSA_NAME can have at a particular point in
the program.  ie

if (i_1 > 10)
  {
    if (i_1 < 20)
      {
        XXX
      }
  }

At the point XXX, we could have two records for i_1 which would
allow us to  know that 10 < i_1 < 20.

One of the things that will be valuable will be to distinguish
between global range information and local range information.

Global range information comes from the object's type and
its assignment statement and can also be derived from its
usage within loops.

Local range information is deduced from usage contexts such as
what I've shown above.

Distinguishing between the two is one of the long term ways I
want to go with DOM.  DOM served its purpose of being an effective
"cleanup the crap" optimizer; however, a lot of the things it does
can be handled better by CCP, PRE and the like.  As those passes
mature and take over more of DOM's functionality, I expect DOM to
be relegated to optimizations based on information that is local
to branches within the dominance tree.
 
 
> > I haven't prototyped that code, but I don't think it would be
> > terribly hard to do so and run some experiments to see if it's
> > a worthwhile direction.
> 
> This sounds like _nea_t idea, especially if we later develop sanier
> representation of the stack.
> 
> It is also easy to avoid the explicit BB values for each entry in the
> array for instance by pushing NULL as frame separator.  Would be
> possible for you to give this a try?
RTH also suggested using a marker, which I think is ultimately
the better solution than allocating two word structures.

> 
> > block_defs ought to go away in the future.  The only reason we
> > track the current definition of a variable is for the old style
> > of jump threading.  It should be possible to rewrite the code
> 
> What is the status of jump threading?
In what sense? I had to put it on hold to deal with this stuff.

In terms of what's likely to happen next, it would be an improved
method for updating the SSA graph.  That code is written and
bootstraps.  The question for that code now is its performance
characteristics.  Some of that work stands on its own (specifically
not duplicating SSA_NAMEs, but creating new SSA_NAMEs when we
duplicate blocks).

Once that's done I want to return to the algorithm for selecting
which blocks to thread through.  Changes to the updating algorithms
mean that we ought to be able to eliminate the need for currdefs
in the selection phase and allow us to thread through blocks with
side effects that must be preserved.  That's a long standing
limitation that needs to be addressed.

Also on the plate is evaluation of several ideas to avoid iterating
DOM after jump threading.  As I mentioned a few weeks ago, we may
be able to avoid the iteration step by walking through all the
PHI nodes and noting which (if any) create equivalences that didn't
before.  The theory is that any optimization opportunities that
would be found by iterating DOM all start from PHI nodes that
create an equivalency that didn't before.  We're going to need some
new APIs and such, but they're things we needed anyway.  For
example, if we discover an equivlance that didn't exist before,
rather than running all of DOM, we just want to cprop that
equivalence and note what changes in a worklist for later
examination.

Interestingly enough this notion of not iterating DOM, but instead
using new equivalences to catch the cascading effects results in
an optimizer that has a lot of properties that are similar to the
GCM/GVN optimizer from Click.  If things work the way I think they're
going to, we'd be able to do code hoisting to reduce codesize
pretty easily.


> block_const_and_copies is roughly 200Kb of memory overall,
> block_nonzero_vars just 33Kb, so these definitly can be dealt with
> later.
I want to be consistent in how we handle this stuff.  Also note that
if you zap all the block local objects, then you don't need a 
container for them or to manage the containers.  Which saves you
another set of varrays.


Jeff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Varray memory consumption strikes back
  2004-09-16 16:38         ` Jeffrey A Law
@ 2004-09-16 23:12           ` Jan Hubicka
  2004-09-23 11:38             ` Jeffrey A Law
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Hubicka @ 2004-09-16 23:12 UTC (permalink / raw)
  To: Jeffrey A Law; +Cc: Jan Hubicka, gcc, amacleod

> >  and why we didn't developed some stack
> > implementation that don't need moving data around at reallocation and
> > don't have more convenient API than obstacks.
> Well, if you want to go forth and do that, feel free.  It's
> nontrivial.
> 
> Note that with the new scheme I've started checking in, we don't
> have block local varrays anymore, but instead we have one single
> toplevel varray with markers within it to denote where data from
> the each block ends.
> 
> A nice side effect of that is we can look at other allocation schemes
> now since we've got a small number of toplevel objects with a well
> defined lifetime.  Hell, they could even be implemented with obstacks
> (one obstack per toplevel varray).

Yes, your current way of converting the varrays to huge ones makes me
perfectly happy and I think it is finally something that helps us to get
out of the problem.  I realize that I never really understood enought to
that code to figure out myself this sollution. My little mind always
stopped on the scheme of reusing varrays and I was still under belief
that there are nested arrays becuase you need some kind of random access
to the whole stack that is dificult to do with one big array.  So thanks
a lot for comming with this! I believe that it will make also the code
much easier to understand now.

Using obstacks for these is obviously cheaper than varrays so it might
be nice incremental step.  What I was thinking of is to simply wrap
obstack with something like OBSTACK_PUSH_TREE / OBSTACK_POP_TREE so the
api is more symetric to what varrays has that I think is much more
readable.

In the case you like this idea, I can simply do that once you are
finished with the current thread of removing nested arrays.

I am in a hurry now so I will return to rest of your email soon.

Honza

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Varray memory consumption strikes back
  2004-09-16 23:12           ` Jan Hubicka
@ 2004-09-23 11:38             ` Jeffrey A Law
  2004-10-06 20:39               ` Jan Hubicka
  0 siblings, 1 reply; 14+ messages in thread
From: Jeffrey A Law @ 2004-09-23 11:38 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc, amacleod

On Thu, 2004-09-16 at 16:03, Jan Hubicka wrote:

> Using obstacks for these is obviously cheaper than varrays so it might
> be nice incremental step.  What I was thinking of is to simply wrap
> obstack with something like OBSTACK_PUSH_TREE / OBSTACK_POP_TREE so the
> api is more symetric to what varrays has that I think is much more
> readable.
I'm not 100% convinced that obstacks are cheaper than what we're
doing.  But there's a way to be 100% sure, try them.  I've tried
to decipher their behavior to determine if they could be efficiently
used for similar tasks, but frankly got lost in the code.


> In the case you like this idea, I can simply do that once you are
> finished with the current thread of removing nested arrays.
You can certainly try it and see what results you get.  I'd be
most interested in the compile time behavior as well as memory
utilization. 

jeff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Varray memory consumption strikes back
  2004-09-23 11:38             ` Jeffrey A Law
@ 2004-10-06 20:39               ` Jan Hubicka
  2004-10-06 20:52                 ` Daniel Jacobowitz
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Hubicka @ 2004-10-06 20:39 UTC (permalink / raw)
  To: Jeffrey A Law; +Cc: Jan Hubicka, gcc, amacleod

> On Thu, 2004-09-16 at 16:03, Jan Hubicka wrote:
> 
> > Using obstacks for these is obviously cheaper than varrays so it might
> > be nice incremental step.  What I was thinking of is to simply wrap
> > obstack with something like OBSTACK_PUSH_TREE / OBSTACK_POP_TREE so the
> > api is more symetric to what varrays has that I think is much more
> > readable.
> I'm not 100% convinced that obstacks are cheaper than what we're
> doing.  But there's a way to be 100% sure, try them.  I've tried
> to decipher their behavior to determine if they could be efficiently
> used for similar tasks, but frankly got lost in the code.
> 
> 
> > In the case you like this idea, I can simply do that once you are
> > finished with the current thread of removing nested arrays.
> You can certainly try it and see what results you get.  I'd be
> most interested in the compile time behavior as well as memory
> utilization. 

Hi,

unfortunately obstack won't allow you to pop data ; to my surprise ;)
This is stack implementation that don't use resizing and ggc.  I converted the
dom's stacks to it and it seems to be resonably fast.  On Gerald's testcase it
redues compilation time from 1m14.581s to 1m14.025s (consistently) and memory
footprint from 89MB to 80.

While measuring the bootstrap speed, the difference seems to be winthin noise.

2004-10-02  Jan Hubicka  <jh@suse.cz>
	* Makefile.in (ptrstack.o): New.
	(tree-into-ssa.o): Add ptrstack.h dependency.
	* tree-flow.h: Include ptrstack.h
	(register_new_def): Accept ptrstack.
	* tree-into-ssa.c: Include ptrstack.h
	(block_defs_stack): Turn into ptrstack.
	(insert_phi_nodes_for, insert_phi_nodes_1, insert_phi_nodes,
	rewrite_initialize_block, ssa_register_new_def,
	ssa_rewrite_initialize_block, rewrite_finalize_block,
	ssa_rewrite_finalize_block, insert_phi_nodes_for, rewrite_stmt,
	register_new_def, rewrite_into_ssa): Update use of block_defs_stack.
	* tree-ssa-dom.c: Include ptrstack.h
	(avail_exprs_stack, block_defs_stack, stmts_to_rescan,
	const_and_copies_stack, nonzero_vars_stack, vrp_variables_stack): Turn
	into ptrstack.
	(tree_ssa_dominator_optimize, thread_across_edge,
	domp_opt_initialize_block, remove_local_expressions_from_table,
	restore_nonzero_vars_to_original_value,
	restore_currdefs_to_original_value, domp_opt_finalize_block,
	record_equivalence_from_phis, record_var_is_nonzero, record_cond,
	record_const_or_copy_1, optimize_stmt,
	update_rhs_and_lookup_avail_expr, loopup_avail_expr, record_range,
	register_definitions_for_stmt): Update uses of the stacks.
	* ptrstack.h: New file.
	* ptrstack.c: New file.
Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.1406
diff -c -3 -p -r1.1406 Makefile.in
*** Makefile.in	5 Oct 2004 11:56:00 -0000	1.1406
--- Makefile.in	6 Oct 2004 18:01:36 -0000
*************** OBJS-common = \
*** 925,931 ****
   varasm.o varray.o vec.o version.o vmsdbgout.o xcoffout.o alloc-pool.o	   \
   et-forest.o cfghooks.o bt-load.o pretty-print.o $(GGC) web.o passes.o	   \
   rtl-profile.o tree-profile.o rtlhooks.o cfgexpand.o lambda-mat.o          \
!  lambda-trans.o	lambda-code.o tree-loop-linear.o
  
  OBJS-md = $(out_object_file)
  OBJS-archive = $(EXTRA_OBJS) $(host_hook_obj) tree-inline.o		   \
--- 925,931 ----
   varasm.o varray.o vec.o version.o vmsdbgout.o xcoffout.o alloc-pool.o	   \
   et-forest.o cfghooks.o bt-load.o pretty-print.o $(GGC) web.o passes.o	   \
   rtl-profile.o tree-profile.o rtlhooks.o cfgexpand.o lambda-mat.o          \
!  lambda-trans.o	lambda-code.o tree-loop-linear.o ptrstack.o
  
  OBJS-md = $(out_object_file)
  OBJS-archive = $(EXTRA_OBJS) $(host_hook_obj) tree-inline.o		   \
*************** tree-into-ssa.o : tree-into-ssa.c $(TREE
*** 1608,1614 ****
     $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) output.h diagnostic.h \
     errors.h toplev.h function.h $(TIMEVAR_H) \
     $(TM_H) coretypes.h $(TREE_DUMP_H) langhooks.h domwalk.h tree-pass.h \
!    $(GGC_H)
  tree-outof-ssa.o : tree-outof-ssa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
     $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) output.h diagnostic.h \
     errors.h toplev.h function.h $(TIMEVAR_H) \
--- 1608,1614 ----
     $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) output.h diagnostic.h \
     errors.h toplev.h function.h $(TIMEVAR_H) \
     $(TM_H) coretypes.h $(TREE_DUMP_H) langhooks.h domwalk.h tree-pass.h \
!    $(GGC_H) ptrstack.h
  tree-outof-ssa.o : tree-outof-ssa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
     $(RTL_H) $(TREE_H) $(TM_P_H) $(EXPR_H) output.h diagnostic.h \
     errors.h toplev.h function.h $(TIMEVAR_H) \
*************** params.o : params.c $(CONFIG_H) $(SYSTEM
*** 2175,2180 ****
--- 2175,2181 ----
  hooks.o: hooks.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(HOOKS_H)
  pretty-print.o: $(CONFIG_H) $(SYSTEM_H) pretty-print.c $(PRETTY_PRINT_H)
  errors.o : errors.c $(CONFIG_H) $(SYSTEM_H) errors.h
+ ptrstack.o : ptrstack.c $(GGC_H) $(SYSTEM_H) $(CONFIG_H) ptrstack.h
  
  $(out_object_file): $(out_file) $(CONFIG_H) coretypes.h $(TM_H) $(TREE_H) $(GGC_H) \
     $(RTL_H) $(REGS_H) hard-reg-set.h real.h insn-config.h conditions.h \
Index: tree-flow.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-flow.h,v
retrieving revision 2.50
diff -c -3 -p -r2.50 tree-flow.h
*** tree-flow.h	29 Sep 2004 21:23:35 -0000	2.50
--- tree-flow.h	6 Oct 2004 18:01:36 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 29,34 ****
--- 29,35 ----
  #include "tree-gimple.h"
  #include "tree-ssa-operands.h"
  #include "cgraph.h"
+ #include "ptrstack.h"
  
  /* Forward declare structures for the garbage collector GTY markers.  */
  #ifndef GCC_BASIC_BLOCK_H
*************** extern bool tree_ssa_useless_type_conver
*** 582,588 ****
  extern bool tree_ssa_useless_type_conversion_1 (tree, tree);
  extern void verify_ssa (void);
  extern void delete_tree_ssa (void);
! extern void register_new_def (tree, varray_type *);
  extern void walk_use_def_chains (tree, walk_use_def_chains_fn, void *, bool);
  extern void kill_redundant_phi_nodes (void);
  
--- 583,589 ----
  extern bool tree_ssa_useless_type_conversion_1 (tree, tree);
  extern void verify_ssa (void);
  extern void delete_tree_ssa (void);
! extern void register_new_def (tree, ptrstack);
  extern void walk_use_def_chains (tree, walk_use_def_chains_fn, void *, bool);
  extern void kill_redundant_phi_nodes (void);
  
Index: tree-into-ssa.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-into-ssa.c,v
retrieving revision 2.24
diff -c -3 -p -r2.24 tree-into-ssa.c
*** tree-into-ssa.c	29 Sep 2004 21:23:35 -0000	2.24
--- tree-into-ssa.c	6 Oct 2004 18:01:36 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 47,52 ****
--- 47,53 ----
  #include "cfgloop.h"
  #include "domwalk.h"
  #include "ggc.h"
+ #include "ptrstack.h"
  
  /* This file builds the SSA form for a function as described in:
     R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and K. Zadeck. Efficiently
*************** static htab_t def_blocks;
*** 86,92 ****
  /* Stack of trees used to restore the global currdefs to its original
     state after completing rewriting of a block and its dominator children.
  
!    This varray is used in two contexts.  The first is rewriting of _DECL
     nodes into SSA_NAMEs.  In that context it's elements have the
     following properties:
  
--- 87,93 ----
  /* Stack of trees used to restore the global currdefs to its original
     state after completing rewriting of a block and its dominator children.
  
!    This stack is used in two contexts.  The first is rewriting of _DECL
     nodes into SSA_NAMEs.  In that context it's elements have the
     following properties:
  
*************** static htab_t def_blocks;
*** 100,106 ****
       current block. 
  
  
!    This varray is also used when rewriting an SSA_NAME which has multiple
     definition sites into multiple SSA_NAMEs.  In that context entries come
     in pairs.
  
--- 101,107 ----
       current block. 
  
  
!    This stack is also used when rewriting an SSA_NAME which has multiple
     definition sites into multiple SSA_NAMEs.  In that context entries come
     in pairs.
  
*************** static htab_t def_blocks;
*** 109,115 ****
  
       A NULL node at the top entry is used to mark the last node associated
       with the current block.  */
! static varray_type block_defs_stack;
  
  /* Global data to attach to the main dominator walk structure.  */
  struct mark_def_sites_global_data
--- 110,116 ----
  
       A NULL node at the top entry is used to mark the last node associated
       with the current block.  */
! static ptrstack block_defs_stack;
  
  /* Global data to attach to the main dominator walk structure.  */
  struct mark_def_sites_global_data
*************** static void insert_phi_nodes (bitmap *, 
*** 153,159 ****
  static void rewrite_stmt (struct dom_walk_data *, basic_block,
  			  block_stmt_iterator);
  static inline void rewrite_operand (use_operand_p);
! static void insert_phi_nodes_for (tree, bitmap *, varray_type *);
  static tree get_reaching_def (tree);
  static hashval_t def_blocks_hash (const void *);
  static int def_blocks_eq (const void *, const void *);
--- 154,160 ----
  static void rewrite_stmt (struct dom_walk_data *, basic_block,
  			  block_stmt_iterator);
  static inline void rewrite_operand (use_operand_p);
! static void insert_phi_nodes_for (tree, bitmap *, ptrstack);
  static tree get_reaching_def (tree);
  static hashval_t def_blocks_hash (const void *);
  static int def_blocks_eq (const void *, const void *);
*************** prepare_def_operand_for_rename (tree def
*** 586,592 ****
     blocks.  */
  
  static inline
! void insert_phi_nodes_1 (tree var, bitmap *dfs, varray_type *work_stack)
  {
    if (get_phi_state (var) != NEED_PHI_STATE_NO)
      insert_phi_nodes_for (var, dfs, work_stack);
--- 587,593 ----
     blocks.  */
  
  static inline
! void insert_phi_nodes_1 (tree var, bitmap *dfs, ptrstack work_stack)
  {
    if (get_phi_state (var) != NEED_PHI_STATE_NO)
      insert_phi_nodes_for (var, dfs, work_stack);
*************** static void
*** 604,617 ****
  insert_phi_nodes (bitmap *dfs, bitmap names_to_rename)
  {
    size_t i;
!   varray_type work_stack;
    bitmap_iterator bi;
  
    timevar_push (TV_TREE_INSERT_PHI_NODES);
  
!   /* Array WORK_STACK is a stack of CFG blocks.  Each block that contains
       an assignment or PHI node will be pushed to this stack.  */
!   VARRAY_GENERIC_PTR_NOGC_INIT (work_stack, last_basic_block, "work_stack");
  
    /* Iterate over all variables in VARS_TO_RENAME.  For each variable, add
       to the work list all the blocks that have a definition for the
--- 605,618 ----
  insert_phi_nodes (bitmap *dfs, bitmap names_to_rename)
  {
    size_t i;
!   ptrstack work_stack;
    bitmap_iterator bi;
  
    timevar_push (TV_TREE_INSERT_PHI_NODES);
  
!   /* WORK_STACK is a stack of CFG blocks.  Each block that contains
       an assignment or PHI node will be pushed to this stack.  */
!   work_stack = ptrstack_create ();
  
    /* Iterate over all variables in VARS_TO_RENAME.  For each variable, add
       to the work list all the blocks that have a definition for the
*************** insert_phi_nodes (bitmap *dfs, bitmap na
*** 622,640 ****
        EXECUTE_IF_SET_IN_BITMAP (names_to_rename, 0, i, bi)
  	{
  	  if (ssa_name (i))
! 	    insert_phi_nodes_1 (ssa_name (i), dfs, &work_stack);
  	}
      }
    else if (vars_to_rename)
      EXECUTE_IF_SET_IN_BITMAP (vars_to_rename, 0, i, bi)
        {
! 	insert_phi_nodes_1 (referenced_var (i), dfs, &work_stack);
        }
    else
      for (i = 0; i < num_referenced_vars; i++)
!       insert_phi_nodes_1 (referenced_var (i), dfs, &work_stack);
  
!   VARRAY_FREE (work_stack);
  
    timevar_pop (TV_TREE_INSERT_PHI_NODES);
  }
--- 623,641 ----
        EXECUTE_IF_SET_IN_BITMAP (names_to_rename, 0, i, bi)
  	{
  	  if (ssa_name (i))
! 	    insert_phi_nodes_1 (ssa_name (i), dfs, work_stack);
  	}
      }
    else if (vars_to_rename)
      EXECUTE_IF_SET_IN_BITMAP (vars_to_rename, 0, i, bi)
        {
! 	insert_phi_nodes_1 (referenced_var (i), dfs, work_stack);
        }
    else
      for (i = 0; i < num_referenced_vars; i++)
!       insert_phi_nodes_1 (referenced_var (i), dfs, work_stack);
  
!   ptrstack_free (work_stack);
  
    timevar_pop (TV_TREE_INSERT_PHI_NODES);
  }
*************** rewrite_initialize_block (struct dom_wal
*** 678,684 ****
      fprintf (dump_file, "\n\nRenaming block #%d\n\n", bb->index);
  
    /* Mark the unwind point for this block.  */
!   VARRAY_PUSH_TREE (block_defs_stack, NULL_TREE);
  
    /* Step 1.  Register new definitions for every PHI node in the block.
       Conceptually, all the PHI nodes are executed in parallel and each PHI
--- 679,685 ----
      fprintf (dump_file, "\n\nRenaming block #%d\n\n", bb->index);
  
    /* Mark the unwind point for this block.  */
!   ptrstack_push (block_defs_stack, NULL_TREE);
  
    /* Step 1.  Register new definitions for every PHI node in the block.
       Conceptually, all the PHI nodes are executed in parallel and each PHI
*************** rewrite_initialize_block (struct dom_wal
*** 687,693 ****
      {
        tree result = PHI_RESULT (phi);
  
!       register_new_def (result, &block_defs_stack);
      }
  }
  
--- 688,694 ----
      {
        tree result = PHI_RESULT (phi);
  
!       register_new_def (result, block_defs_stack);
      }
  }
  
*************** ssa_register_new_def (tree var, tree def
*** 716,723 ****
       later used by the dominator tree callbacks to restore the reaching
       definitions for all the variables defined in the block after a recursive
       visit to all its immediately dominated blocks.  */
!   VARRAY_PUSH_TREE (block_defs_stack, currdef);
!   VARRAY_PUSH_TREE (block_defs_stack, var);
  
    /* Set the current reaching definition for VAR to be DEF.  */
    set_current_def (var, def);
--- 717,724 ----
       later used by the dominator tree callbacks to restore the reaching
       definitions for all the variables defined in the block after a recursive
       visit to all its immediately dominated blocks.  */
!   ptrstack_push (block_defs_stack, currdef);
!   ptrstack_push (block_defs_stack, var);
  
    /* Set the current reaching definition for VAR to be DEF.  */
    set_current_def (var, def);
*************** ssa_rewrite_initialize_block (struct dom
*** 738,744 ****
      fprintf (dump_file, "\n\nRenaming block #%d\n\n", bb->index);
  
    /* Mark the unwind point for this block.  */
!   VARRAY_PUSH_TREE (block_defs_stack, NULL_TREE);
  
    FOR_EACH_EDGE (e, ei, bb->preds)
      if (e->flags & EDGE_ABNORMAL)
--- 739,745 ----
      fprintf (dump_file, "\n\nRenaming block #%d\n\n", bb->index);
  
    /* Mark the unwind point for this block.  */
!   ptrstack_push (block_defs_stack, NULL_TREE);
  
    FOR_EACH_EDGE (e, ei, bb->preds)
      if (e->flags & EDGE_ABNORMAL)
*************** rewrite_finalize_block (struct dom_walk_
*** 840,852 ****
  			basic_block bb ATTRIBUTE_UNUSED)
  {
    /* Restore CURRDEFS to its original state.  */
!   while (VARRAY_ACTIVE_SIZE (block_defs_stack) > 0)
      {
!       tree tmp = VARRAY_TOP_TREE (block_defs_stack);
        tree saved_def, var;
  
-       VARRAY_POP (block_defs_stack);
- 
        if (tmp == NULL_TREE)
  	break;
  
--- 841,851 ----
  			basic_block bb ATTRIBUTE_UNUSED)
  {
    /* Restore CURRDEFS to its original state.  */
!   while (!ptrstack_empty_p (block_defs_stack))
      {
!       tree tmp = ptrstack_pop (block_defs_stack);
        tree saved_def, var;
  
        if (tmp == NULL_TREE)
  	break;
  
*************** ssa_rewrite_finalize_block (struct dom_w
*** 878,895 ****
  
    /* Step 5.  Restore the current reaching definition for each variable
       referenced in the block (in reverse order).  */
!   while (VARRAY_ACTIVE_SIZE (block_defs_stack) > 0)
      {
!       tree var = VARRAY_TOP_TREE (block_defs_stack);
        tree saved_def;
- 
-       VARRAY_POP (block_defs_stack);
        
        if (var == NULL)
  	break;
  
!       saved_def = VARRAY_TOP_TREE (block_defs_stack);
!       VARRAY_POP (block_defs_stack);
  
        set_current_def (var, saved_def);
      }
--- 877,891 ----
  
    /* Step 5.  Restore the current reaching definition for each variable
       referenced in the block (in reverse order).  */
!   while (!ptrstack_empty_p (block_defs_stack))
      {
!       tree var = ptrstack_pop (block_defs_stack);
        tree saved_def;
        
        if (var == NULL)
  	break;
  
!       saved_def = ptrstack_pop (block_defs_stack);
  
        set_current_def (var, saved_def);
      }
*************** htab_statistics (FILE *file, htab_t htab
*** 965,971 ****
     implement the worklist of basic blocks.  */
  
  static void
! insert_phi_nodes_for (tree var, bitmap *dfs, varray_type *work_stack)
  {
    struct def_blocks_d *def_map;
    bitmap phi_insertion_points;
--- 961,967 ----
     implement the worklist of basic blocks.  */
  
  static void
! insert_phi_nodes_for (tree var, bitmap *dfs, ptrstack work_stack)
  {
    struct def_blocks_d *def_map;
    bitmap phi_insertion_points;
*************** insert_phi_nodes_for (tree var, bitmap *
*** 982,990 ****
    phi_insertion_points = BITMAP_XMALLOC ();
  
    EXECUTE_IF_SET_IN_BITMAP (def_map->def_blocks, 0, bb_index, bi)
!     {
!       VARRAY_PUSH_GENERIC_PTR_NOGC (*work_stack, BASIC_BLOCK (bb_index));
!     }
  
    /* Pop a block off the worklist, add every block that appears in
       the original block's dfs that we have not already processed to
--- 978,984 ----
    phi_insertion_points = BITMAP_XMALLOC ();
  
    EXECUTE_IF_SET_IN_BITMAP (def_map->def_blocks, 0, bb_index, bi)
!     ptrstack_push (work_stack, BASIC_BLOCK (bb_index));
  
    /* Pop a block off the worklist, add every block that appears in
       the original block's dfs that we have not already processed to
*************** insert_phi_nodes_for (tree var, bitmap *
*** 998,1012 ****
       determine if fully pruned or semi pruned SSA form was appropriate.
  
       We now always use fully pruned SSA form.  */
!   while (VARRAY_ACTIVE_SIZE (*work_stack) > 0)
      {
        int dfs_index;
        bitmap_iterator bi;
  
!       bb = VARRAY_TOP_GENERIC_PTR_NOGC (*work_stack);
        bb_index = bb->index;
- 
-       VARRAY_POP (*work_stack);
        
        EXECUTE_IF_AND_COMPL_IN_BITMAP (dfs[bb_index],
  				      phi_insertion_points,
--- 992,1004 ----
       determine if fully pruned or semi pruned SSA form was appropriate.
  
       We now always use fully pruned SSA form.  */
!   while (!ptrstack_empty_p (work_stack))
      {
        int dfs_index;
        bitmap_iterator bi;
  
!       bb = ptrstack_pop (work_stack);
        bb_index = bb->index;
        
        EXECUTE_IF_AND_COMPL_IN_BITMAP (dfs[bb_index],
  				      phi_insertion_points,
*************** insert_phi_nodes_for (tree var, bitmap *
*** 1014,1020 ****
  	{
  	  basic_block bb = BASIC_BLOCK (dfs_index);
  
! 	  VARRAY_PUSH_GENERIC_PTR_NOGC (*work_stack, bb);
  	  bitmap_set_bit (phi_insertion_points, dfs_index);
  	}
      }
--- 1006,1012 ----
  	{
  	  basic_block bb = BASIC_BLOCK (dfs_index);
  
! 	  ptrstack_push (work_stack, bb);
  	  bitmap_set_bit (phi_insertion_points, dfs_index);
  	}
      }
*************** rewrite_stmt (struct dom_walk_data *walk
*** 1088,1094 ****
  
        /* FIXME: We shouldn't be registering new defs if the variable
  	 doesn't need to be renamed.  */
!       register_new_def (DEF_FROM_PTR (def_p), &block_defs_stack);
      }
  }
  
--- 1080,1086 ----
  
        /* FIXME: We shouldn't be registering new defs if the variable
  	 doesn't need to be renamed.  */
!       register_new_def (DEF_FROM_PTR (def_p), block_defs_stack);
      }
  }
  
*************** rewrite_operand (use_operand_p op_p)
*** 1152,1161 ****
  
  /* Register DEF (an SSA_NAME) to be a new definition for its underlying
     variable (SSA_NAME_VAR (DEF)) and push VAR's current reaching definition
!    into the stack pointed by BLOCK_DEFS_P.  */
  
  void
! register_new_def (tree def, varray_type *block_defs_p)
  {
    tree var = SSA_NAME_VAR (def);
    tree currdef;
--- 1144,1153 ----
  
  /* Register DEF (an SSA_NAME) to be a new definition for its underlying
     variable (SSA_NAME_VAR (DEF)) and push VAR's current reaching definition
!    into the stack pointed by BLOCK_DEFS.  */
  
  void
! register_new_def (tree def, ptrstack block_defs)
  {
    tree var = SSA_NAME_VAR (def);
    tree currdef;
*************** register_new_def (tree def, varray_type 
*** 1181,1187 ****
       definitions for all the variables defined in the block after a recursive
       visit to all its immediately dominated blocks.  If there is no current
       reaching definition, then just record the underlying _DECL node.  */
!   VARRAY_PUSH_TREE (*block_defs_p, currdef ? currdef : var);
  
    /* Set the current reaching definition for VAR to be DEF.  */
    set_current_def (var, def);
--- 1173,1179 ----
       definitions for all the variables defined in the block after a recursive
       visit to all its immediately dominated blocks.  If there is no current
       reaching definition, then just record the underlying _DECL node.  */
!   ptrstack_push (block_defs, currdef ? currdef : var);
  
    /* Set the current reaching definition for VAR to be DEF.  */
    set_current_def (var, def);
*************** rewrite_into_ssa (bool all)
*** 1513,1519 ****
    walk_data.global_data = NULL;
    walk_data.block_local_data_size = 0;
  
!   VARRAY_TREE_INIT (block_defs_stack, 10, "Block DEFS Stack");
  
    /* Initialize the dominator walker.  */
    init_walk_dominator_tree (&walk_data);
--- 1505,1511 ----
    walk_data.global_data = NULL;
    walk_data.block_local_data_size = 0;
  
!   block_defs_stack = ptrstack_create ();
  
    /* Initialize the dominator walker.  */
    init_walk_dominator_tree (&walk_data);
*************** rewrite_into_ssa (bool all)
*** 1538,1543 ****
--- 1530,1537 ----
    FOR_EACH_BB (bb)
      BITMAP_XFREE (dfs[bb->index]);
    free (dfs);
+   ptrstack_free (block_defs_stack);
+   block_defs_stack = NULL;
  
    htab_delete (def_blocks);
  
*************** rewrite_ssa_into_ssa (void)
*** 1608,1614 ****
    mark_def_sites_global_data.names_to_rename = snames_to_rename;
    walk_data.global_data = &mark_def_sites_global_data;
  
!   VARRAY_TREE_INIT (block_defs_stack, 10, "Block DEFS Stack");
  
    /* We do not have any local data.  */
    walk_data.block_local_data_size = 0;
--- 1602,1608 ----
    mark_def_sites_global_data.names_to_rename = snames_to_rename;
    walk_data.global_data = &mark_def_sites_global_data;
  
!   block_defs_stack = ptrstack_create ();
  
    /* We do not have any local data.  */
    walk_data.block_local_data_size = 0;
*************** rewrite_ssa_into_ssa (void)
*** 1694,1699 ****
--- 1688,1695 ----
      }
  
    BITMAP_XFREE (to_rename);
+   ptrstack_free (block_defs_stack);
+   block_defs_stack = NULL;
    timevar_pop (TV_TREE_SSA_OTHER);
  }
  
Index: tree-ssa-dom.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-dom.c,v
retrieving revision 2.58
diff -c -3 -p -r2.58 tree-ssa-dom.c
*** tree-ssa-dom.c	4 Oct 2004 13:19:20 -0000	2.58
--- tree-ssa-dom.c	6 Oct 2004 18:01:36 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 42,47 ****
--- 42,48 ----
  #include "tree-pass.h"
  #include "tree-ssa-propagate.h"
  #include "langhooks.h"
+ #include "ptrstack.h"
  
  /* This file implements optimizations on the dominator tree.  */
  
*************** static htab_t avail_exprs;
*** 59,65 ****
     (null).  When we finish processing the block, we pop off entries and
     remove the expressions from the global hash table until we hit the
     marker.  */
! static varray_type avail_exprs_stack;
  
  /* Stack of trees used to restore the global currdefs to its original
     state after completing optimization of a block and its dominator children.
--- 60,66 ----
     (null).  When we finish processing the block, we pop off entries and
     remove the expressions from the global hash table until we hit the
     marker.  */
! static ptrstack avail_exprs_stack;
  
  /* Stack of trees used to restore the global currdefs to its original
     state after completing optimization of a block and its dominator children.
*************** static varray_type avail_exprs_stack;
*** 72,78 ****
  
     A NULL node is used to mark the last node associated with the
     current block.  */
! varray_type block_defs_stack;
  
  /* Stack of statements we need to rescan during finalization for newly
     exposed variables.
--- 73,79 ----
  
     A NULL node is used to mark the last node associated with the
     current block.  */
! static ptrstack block_defs_stack;
  
  /* Stack of statements we need to rescan during finalization for newly
     exposed variables.
*************** varray_type block_defs_stack;
*** 81,87 ****
     expressions are removed from AVAIL_EXPRS.  Else we may change the
     hash code for an expression and be unable to find/remove it from
     AVAIL_EXPRS.  */
! varray_type stmts_to_rescan;
  
  /* Structure for entries in the expression hash table.
  
--- 82,88 ----
     expressions are removed from AVAIL_EXPRS.  Else we may change the
     hash code for an expression and be unable to find/remove it from
     AVAIL_EXPRS.  */
! static ptrstack stmts_to_rescan;
  
  /* Structure for entries in the expression hash table.
  
*************** struct expr_hash_elt
*** 113,119 ****
  
     A NULL entry is used to mark the end of pairs which need to be
     restored during finalization of this block.  */
! static varray_type const_and_copies_stack;
  
  /* Bitmap of SSA_NAMEs known to have a nonzero value, even if we do not
     know their exact value.  */
--- 114,120 ----
  
     A NULL entry is used to mark the end of pairs which need to be
     restored during finalization of this block.  */
! static ptrstack const_and_copies_stack;
  
  /* Bitmap of SSA_NAMEs known to have a nonzero value, even if we do not
     know their exact value.  */
*************** static bitmap nonzero_vars;
*** 124,130 ****
  
     A NULL entry is used to mark the end of names needing their 
     entry in NONZERO_VARS cleared during finalization of this block.  */
! static varray_type nonzero_vars_stack;
  
  /* Track whether or not we have changed the control flow graph.  */
  static bool cfg_altered;
--- 125,131 ----
  
     A NULL entry is used to mark the end of names needing their 
     entry in NONZERO_VARS cleared during finalization of this block.  */
! static ptrstack nonzero_vars_stack;
  
  /* Track whether or not we have changed the control flow graph.  */
  static bool cfg_altered;
*************** struct vrp_hash_elt
*** 218,224 ****
     list to determine which variables need their VRP data updated.
  
     A NULL entry marks the end of the SSA_NAMEs associated with this block.  */
! static varray_type vrp_variables_stack;
  
  struct eq_expr_value
  {
--- 219,225 ----
     list to determine which variables need their VRP data updated.
  
     A NULL entry marks the end of the SSA_NAMEs associated with this block.  */
! static ptrstack vrp_variables_stack;
  
  struct eq_expr_value
  {
*************** tree_ssa_dominator_optimize (void)
*** 307,318 ****
    /* Create our hash tables.  */
    avail_exprs = htab_create (1024, real_avail_expr_hash, avail_expr_eq, free);
    vrp_data = htab_create (ceil_log2 (num_ssa_names), vrp_hash, vrp_eq, free);
!   VARRAY_TREE_INIT (avail_exprs_stack, 20, "Available expression stack");
!   VARRAY_TREE_INIT (block_defs_stack, 20, "Block DEFS stack");
!   VARRAY_TREE_INIT (const_and_copies_stack, 20, "Block const_and_copies stack");
!   VARRAY_TREE_INIT (nonzero_vars_stack, 20, "Block nonzero_vars stack");
!   VARRAY_TREE_INIT (vrp_variables_stack, 20, "Block vrp_variables stack");
!   VARRAY_TREE_INIT (stmts_to_rescan, 20, "Statements to rescan");
    nonzero_vars = BITMAP_XMALLOC ();
    need_eh_cleanup = BITMAP_XMALLOC ();
  
--- 308,319 ----
    /* Create our hash tables.  */
    avail_exprs = htab_create (1024, real_avail_expr_hash, avail_expr_eq, free);
    vrp_data = htab_create (ceil_log2 (num_ssa_names), vrp_hash, vrp_eq, free);
!   avail_exprs_stack = ptrstack_create ();
!   block_defs_stack = ptrstack_create ();
!   const_and_copies_stack = ptrstack_create ();
!   nonzero_vars_stack = ptrstack_create ();
!   vrp_variables_stack = ptrstack_create ();
!   stmts_to_rescan = ptrstack_create ();
    nonzero_vars = BITMAP_XMALLOC ();
    need_eh_cleanup = BITMAP_XMALLOC ();
  
*************** tree_ssa_dominator_optimize (void)
*** 407,412 ****
--- 408,420 ----
    BITMAP_XFREE (nonzero_vars);
    BITMAP_XFREE (need_eh_cleanup);
  
+   ptrstack_free (avail_exprs_stack);
+   ptrstack_free (block_defs_stack);
+   ptrstack_free (const_and_copies_stack);
+   ptrstack_free (nonzero_vars_stack);
+   ptrstack_free (vrp_variables_stack);
+   ptrstack_free (stmts_to_rescan);
+ 
    /* Finally, remove everything except invariants in SSA_NAME_VALUE.
  
       Long term we will be able to let everything in SSA_NAME_VALUE
*************** thread_across_edge (struct dom_walk_data
*** 467,473 ****
        tree src = PHI_ARG_DEF_FROM_EDGE (phi, e);
        tree dst = PHI_RESULT (phi);
        record_const_or_copy (dst, src);
!       register_new_def (dst, &block_defs_stack);
      }
  
    for (bsi = bsi_start (e->dest); ! bsi_end_p (bsi); bsi_next (&bsi))
--- 475,481 ----
        tree src = PHI_ARG_DEF_FROM_EDGE (phi, e);
        tree dst = PHI_RESULT (phi);
        record_const_or_copy (dst, src);
!       register_new_def (dst, block_defs_stack);
      }
  
    for (bsi = bsi_start (e->dest); ! bsi_end_p (bsi); bsi_next (&bsi))
*************** thread_across_edge (struct dom_walk_data
*** 582,588 ****
  	 the result of this statement is used later we can copy propagate
  	 suitably.  */
        record_const_or_copy (lhs, cached_lhs);
!       register_new_def (lhs, &block_defs_stack);
      }
  
    /* If we stopped at a COND_EXPR or SWITCH_EXPR, then see if we know which
--- 590,596 ----
  	 the result of this statement is used later we can copy propagate
  	 suitably.  */
        record_const_or_copy (lhs, cached_lhs);
!       register_new_def (lhs, block_defs_stack);
      }
  
    /* If we stopped at a COND_EXPR or SWITCH_EXPR, then see if we know which
*************** dom_opt_initialize_block (struct dom_wal
*** 718,728 ****
  
    /* Push a marker on the stacks of local information so that we know how
       far to unwind when we finalize this block.  */
!   VARRAY_PUSH_TREE (avail_exprs_stack, NULL_TREE);
!   VARRAY_PUSH_TREE (block_defs_stack, NULL_TREE);
!   VARRAY_PUSH_TREE (const_and_copies_stack, NULL_TREE);
!   VARRAY_PUSH_TREE (nonzero_vars_stack, NULL_TREE);
!   VARRAY_PUSH_TREE (vrp_variables_stack, NULL_TREE);
  
    record_equivalences_from_incoming_edge (walk_data, bb);
  
--- 726,736 ----
  
    /* Push a marker on the stacks of local information so that we know how
       far to unwind when we finalize this block.  */
!   ptrstack_push (avail_exprs_stack, NULL_TREE);
!   ptrstack_push (block_defs_stack, NULL_TREE);
!   ptrstack_push (const_and_copies_stack, NULL_TREE);
!   ptrstack_push (nonzero_vars_stack, NULL_TREE);
!   ptrstack_push (vrp_variables_stack, NULL_TREE);
  
    record_equivalences_from_incoming_edge (walk_data, bb);
  
*************** static void
*** 778,788 ****
  remove_local_expressions_from_table (void)
  {
    /* Remove all the expressions made available in this block.  */
!   while (VARRAY_ACTIVE_SIZE (avail_exprs_stack) > 0)
      {
        struct expr_hash_elt element;
!       tree expr = VARRAY_TOP_TREE (avail_exprs_stack);
!       VARRAY_POP (avail_exprs_stack);
  
        if (expr == NULL_TREE)
  	break;
--- 786,795 ----
  remove_local_expressions_from_table (void)
  {
    /* Remove all the expressions made available in this block.  */
!   while (!ptrstack_empty_p (avail_exprs_stack))
      {
        struct expr_hash_elt element;
!       tree expr = ptrstack_pop (avail_exprs_stack);
  
        if (expr == NULL_TREE)
  	break;
*************** remove_local_expressions_from_table (voi
*** 798,807 ****
  static void
  restore_nonzero_vars_to_original_value (void)
  {
!   while (VARRAY_ACTIVE_SIZE (nonzero_vars_stack) > 0)
      {
!       tree name = VARRAY_TOP_TREE (nonzero_vars_stack);
!       VARRAY_POP (nonzero_vars_stack);
  
        if (name == NULL)
  	break;
--- 805,813 ----
  static void
  restore_nonzero_vars_to_original_value (void)
  {
!   while (!ptrstack_empty_p (nonzero_vars_stack))
      {
!       tree name = ptrstack_pop (nonzero_vars_stack);
  
        if (name == NULL)
  	break;
*************** restore_nonzero_vars_to_original_value (
*** 817,834 ****
  static void
  restore_vars_to_original_value (void)
  {
!   while (VARRAY_ACTIVE_SIZE (const_and_copies_stack) > 0)
      {
        tree prev_value, dest;
  
!       dest = VARRAY_TOP_TREE (const_and_copies_stack);
!       VARRAY_POP (const_and_copies_stack);
  
        if (dest == NULL)
  	break;
  
!       prev_value = VARRAY_TOP_TREE (const_and_copies_stack);
!       VARRAY_POP (const_and_copies_stack);
  
        SSA_NAME_VALUE (dest) =  prev_value;
      }
--- 823,838 ----
  static void
  restore_vars_to_original_value (void)
  {
!   while (!ptrstack_empty_p (const_and_copies_stack))
      {
        tree prev_value, dest;
  
!       dest = ptrstack_pop (const_and_copies_stack);
  
        if (dest == NULL)
  	break;
  
!       prev_value = ptrstack_pop (const_and_copies_stack);
  
        SSA_NAME_VALUE (dest) =  prev_value;
      }
*************** static void
*** 840,852 ****
  restore_currdefs_to_original_value (void)
  {
    /* Restore CURRDEFS to its original state.  */
!   while (VARRAY_ACTIVE_SIZE (block_defs_stack) > 0)
      {
!       tree tmp = VARRAY_TOP_TREE (block_defs_stack);
        tree saved_def, var;
  
-       VARRAY_POP (block_defs_stack);
- 
        if (tmp == NULL_TREE)
  	break;
  
--- 844,854 ----
  restore_currdefs_to_original_value (void)
  {
    /* Restore CURRDEFS to its original state.  */
!   while (!ptrstack_empty_p (block_defs_stack))
      {
!       tree tmp = ptrstack_pop (block_defs_stack);
        tree saved_def, var;
  
        if (tmp == NULL_TREE)
  	break;
  
*************** dom_opt_finalize_block (struct dom_walk_
*** 918,926 ****
  	  /* Push a marker onto the available expression stack so that we
  	     unwind any expressions related to the TRUE arm before processing
  	     the false arm below.  */
! 	  VARRAY_PUSH_TREE (avail_exprs_stack, NULL_TREE);
! 	  VARRAY_PUSH_TREE (block_defs_stack, NULL_TREE);
! 	  VARRAY_PUSH_TREE (const_and_copies_stack, NULL_TREE);
  
  	  /* Record any equivalences created by following this edge.  */
  	  if (TREE_CODE_CLASS (cond_code) == tcc_comparison)
--- 920,928 ----
  	  /* Push a marker onto the available expression stack so that we
  	     unwind any expressions related to the TRUE arm before processing
  	     the false arm below.  */
! 	  ptrstack_push (avail_exprs_stack, NULL_TREE);
! 	  ptrstack_push (block_defs_stack, NULL_TREE);
! 	  ptrstack_push (const_and_copies_stack, NULL_TREE);
  
  	  /* Record any equivalences created by following this edge.  */
  	  if (TREE_CODE_CLASS (cond_code) == tcc_comparison)
*************** dom_opt_finalize_block (struct dom_walk_
*** 975,983 ****
       To be efficient, we note which variables have had their values
       constrained in this block.  So walk over each variable in the
       VRP_VARIABLEs array.  */
!   while (VARRAY_ACTIVE_SIZE (vrp_variables_stack) > 0)
      {
!       tree var = VARRAY_TOP_TREE (vrp_variables_stack);
        struct vrp_hash_elt vrp_hash_elt, *vrp_hash_elt_p;
        void **slot;
  
--- 977,985 ----
       To be efficient, we note which variables have had their values
       constrained in this block.  So walk over each variable in the
       VRP_VARIABLEs array.  */
!   while (!ptrstack_empty_p (vrp_variables_stack))
      {
!       tree var = ptrstack_pop (vrp_variables_stack);
        struct vrp_hash_elt vrp_hash_elt, *vrp_hash_elt_p;
        void **slot;
  
*************** dom_opt_finalize_block (struct dom_walk_
*** 988,995 ****
  	 we are done.  */
        varray_type var_vrp_records;
  
-       VARRAY_POP (vrp_variables_stack);
- 
        if (var == NULL)
  	break;
  
--- 990,995 ----
*************** dom_opt_finalize_block (struct dom_walk_
*** 1015,1029 ****
  
    /* If we queued any statements to rescan in this block, then
       go ahead and rescan them now.  */
!   while (VARRAY_ACTIVE_SIZE (stmts_to_rescan) > 0)
      {
!       tree stmt = VARRAY_TOP_TREE (stmts_to_rescan);
        basic_block stmt_bb = bb_for_stmt (stmt);
  
        if (stmt_bb != bb)
  	break;
  
!       VARRAY_POP (stmts_to_rescan);
        mark_new_vars_to_rename (stmt, vars_to_rename);
      }
  }
--- 1015,1030 ----
  
    /* If we queued any statements to rescan in this block, then
       go ahead and rescan them now.  */
!   while (!ptrstack_empty_p (stmts_to_rescan))
      {
!       tree stmt = ptrstack_top (stmts_to_rescan);
        basic_block stmt_bb = bb_for_stmt (stmt);
  
        if (stmt_bb != bb)
  	break;
  
!       ptrstack_pop (stmts_to_rescan);
! 
        mark_new_vars_to_rename (stmt, vars_to_rename);
      }
  }
*************** record_equivalences_from_phis (struct do
*** 1100,1106 ****
        if (i == PHI_NUM_ARGS (phi))
  	bitmap_set_bit (nonzero_vars, SSA_NAME_VERSION (PHI_RESULT (phi)));
  
!       register_new_def (lhs, &block_defs_stack);
      }
  }
  
--- 1101,1107 ----
        if (i == PHI_NUM_ARGS (phi))
  	bitmap_set_bit (nonzero_vars, SSA_NAME_VERSION (PHI_RESULT (phi)));
  
!       register_new_def (lhs, block_defs_stack);
      }
  }
  
*************** record_var_is_nonzero (tree var)
*** 1307,1313 ****
  
    /* Record this SSA_NAME so that we can reset the global table
       when we leave this block.  */
!   VARRAY_PUSH_TREE (nonzero_vars_stack, var);
  }
  
  /* Enter a statement into the true/false expression hash table indicating
--- 1308,1314 ----
  
    /* Record this SSA_NAME so that we can reset the global table
       when we leave this block.  */
!   ptrstack_push (nonzero_vars_stack, var);
  }
  
  /* Enter a statement into the true/false expression hash table indicating
*************** record_cond (tree cond, tree value)
*** 1326,1332 ****
    if (*slot == NULL)
      {
        *slot = (void *) element;
!       VARRAY_PUSH_TREE (avail_exprs_stack, cond);
      }
    else
      free (element);
--- 1327,1333 ----
    if (*slot == NULL)
      {
        *slot = (void *) element;
!       ptrstack_push (avail_exprs_stack, cond);
      }
    else
      free (element);
*************** record_const_or_copy_1 (tree x, tree y, 
*** 1486,1493 ****
  {
    SSA_NAME_VALUE (x) = y;
  
!   VARRAY_PUSH_TREE (const_and_copies_stack, prev_x);
!   VARRAY_PUSH_TREE (const_and_copies_stack, x);
  }
  
  /* Record that X is equal to Y in const_and_copies.  Record undo
--- 1487,1494 ----
  {
    SSA_NAME_VALUE (x) = y;
  
!   ptrstack_push (const_and_copies_stack, prev_x);
!   ptrstack_push (const_and_copies_stack, x);
  }
  
  /* Record that X is equal to Y in const_and_copies.  Record undo
*************** optimize_stmt (struct dom_walk_data *wal
*** 2772,2778 ****
      }
  
    if (may_have_exposed_new_symbols)
!     VARRAY_PUSH_TREE (stmts_to_rescan, bsi_stmt (si));
  }
  
  /* Replace the RHS of STMT with NEW_RHS.  If RHS can be found in the
--- 2773,2779 ----
      }
  
    if (may_have_exposed_new_symbols)
!     ptrstack_push (stmts_to_rescan, bsi_stmt (si));
  }
  
  /* Replace the RHS of STMT with NEW_RHS.  If RHS can be found in the
*************** update_rhs_and_lookup_avail_expr (tree s
*** 2824,2830 ****
       we found a copy of this statement in the second hash table lookup
       we want _no_ copies of this statement in BLOCK_AVAIL_EXPRs.  */
    if (insert)
!     VARRAY_POP (avail_exprs_stack);
  
    /* And make sure we record the fact that we modified this
       statement.  */
--- 2825,2831 ----
       we found a copy of this statement in the second hash table lookup
       we want _no_ copies of this statement in BLOCK_AVAIL_EXPRs.  */
    if (insert)
!     ptrstack_pop (avail_exprs_stack);
  
    /* And make sure we record the fact that we modified this
       statement.  */
*************** lookup_avail_expr (tree stmt, bool inser
*** 2900,2906 ****
    if (*slot == NULL)
      {
        *slot = (void *) element;
!       VARRAY_PUSH_TREE (avail_exprs_stack, stmt ? stmt : element->rhs);
        return NULL_TREE;
      }
  
--- 2901,2907 ----
    if (*slot == NULL)
      {
        *slot = (void *) element;
!       ptrstack_push (avail_exprs_stack, stmt ? stmt : element->rhs);
        return NULL_TREE;
      }
  
*************** record_range (tree cond, basic_block bb)
*** 3027,3033 ****
  	VARRAY_GENERIC_PTR_INIT (*vrp_records_p, 2, "vrp records");
        
        VARRAY_PUSH_GENERIC_PTR (*vrp_records_p, element);
!       VARRAY_PUSH_TREE (vrp_variables_stack, TREE_OPERAND (cond, 0));
      }
  }
  
--- 3028,3034 ----
  	VARRAY_GENERIC_PTR_INIT (*vrp_records_p, 2, "vrp records");
        
        VARRAY_PUSH_GENERIC_PTR (*vrp_records_p, element);
!       ptrstack_push (vrp_variables_stack, TREE_OPERAND (cond, 0));
      }
  }
  
*************** register_definitions_for_stmt (tree stmt
*** 3291,3297 ****
  
        /* FIXME: We shouldn't be registering new defs if the variable
  	 doesn't need to be renamed.  */
!       register_new_def (def, &block_defs_stack);
      }
  }
  
--- 3292,3298 ----
  
        /* FIXME: We shouldn't be registering new defs if the variable
  	 doesn't need to be renamed.  */
!       register_new_def (def, block_defs_stack);
      }
  }
  

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Varray memory consumption strikes back
  2004-10-06 20:39               ` Jan Hubicka
@ 2004-10-06 20:52                 ` Daniel Jacobowitz
  2004-10-06 20:54                   ` Jan Hubicka
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Jacobowitz @ 2004-10-06 20:52 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Jeffrey A Law, gcc, amacleod

On Wed, Oct 06, 2004 at 09:59:34PM +0200, Jan Hubicka wrote:
> > On Thu, 2004-09-16 at 16:03, Jan Hubicka wrote:
> > 
> > > Using obstacks for these is obviously cheaper than varrays so it might
> > > be nice incremental step.  What I was thinking of is to simply wrap
> > > obstack with something like OBSTACK_PUSH_TREE / OBSTACK_POP_TREE so the
> > > api is more symetric to what varrays has that I think is much more
> > > readable.
> > I'm not 100% convinced that obstacks are cheaper than what we're
> > doing.  But there's a way to be 100% sure, try them.  I've tried
> > to decipher their behavior to determine if they could be efficiently
> > used for similar tasks, but frankly got lost in the code.
> > 
> > 
> > > In the case you like this idea, I can simply do that once you are
> > > finished with the current thread of removing nested arrays.
> > You can certainly try it and see what results you get.  I'd be
> > most interested in the compile time behavior as well as memory
> > utilization. 
> 
> Hi,
> 
> unfortunately obstack won't allow you to pop data ; to my surprise ;)

Eh?  obstack_free is precisely a pop operation.

> This is stack implementation that don't use resizing and ggc.  I converted the
> dom's stacks to it and it seems to be resonably fast.  On Gerald's testcase it
> redues compilation time from 1m14.581s to 1m14.025s (consistently) and memory
> footprint from 89MB to 80.
> 
> While measuring the bootstrap speed, the difference seems to be winthin noise.

You didn't include the new files.

-- 
Daniel Jacobowitz

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Varray memory consumption strikes back
  2004-10-06 20:52                 ` Daniel Jacobowitz
@ 2004-10-06 20:54                   ` Jan Hubicka
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Hubicka @ 2004-10-06 20:54 UTC (permalink / raw)
  To: Jan Hubicka, Jeffrey A Law, gcc, amacleod

[-- Attachment #1: Type: text/plain, Size: 645 bytes --]

> 
> Eh?  obstack_free is precisely a pop operation.

Well, for obstack_free you need to know the address of object you want
to pop off the stack, so it is not exactly what I need ;)
> 
> > This is stack implementation that don't use resizing and ggc.  I converted the
> > dom's stacks to it and it seems to be resonably fast.  On Gerald's testcase it
> > redues compilation time from 1m14.581s to 1m14.025s (consistently) and memory
> > footprint from 89MB to 80.
> > 
> > While measuring the bootstrap speed, the difference seems to be winthin noise.
> 
> You didn't include the new files.
Oops...
Attached.
Honza
> 
> -- 
> Daniel Jacobowitz

[-- Attachment #2: ptrstack.h --]
[-- Type: text/x-chdr, Size: 2173 bytes --]

/* stack of pointers
   Copyright (C) 2004 Free Software Foundation, Inc.

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2, or (at your option) any
later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, 59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA.  */

#ifndef GCC_PTRSTACK_H
#define GCC_PTRSTACK_H

struct ptrstack {
   struct ptrstack *prev;
   unsigned int size, used;
   void *data[1];
};
typedef struct ptrstack *ptrstack;

ptrstack ptrstack_create (void);
void ptrstack_free (ptrstack);
ptrstack ptrstack_grow_internal (ptrstack);

/* Push VALUE to STACK_HEADER. */
static inline void
ptrstack_push (ptrstack stack_header, void *value)
{
  ptrstack stack = stack_header->prev;
  if (stack->size == stack->used)
    stack = ptrstack_grow_internal (stack_header);
  stack->data[stack->used++] = value;
}

/* Pop value out of STACK_HEADER and return it.  */
static inline void *
ptrstack_pop (ptrstack stack_header)
{
  ptrstack stack = stack_header->prev;
  gcc_assert (stack->used || stack->prev != stack);

  if (!stack->used)
    {
      ptrstack oldstack = stack->prev;
      free (stack);
      stack = stack_header->prev = oldstack;
    }
  return stack->data[--stack->used];
}

/* Return top value of STACK_HEADER.  */
static inline void *
ptrstack_top (ptrstack stack_header)
{
  ptrstack stack = stack_header->prev;
  gcc_assert (stack->used || stack->prev != stack);

  if (!stack->used)
    {
      ptrstack oldstack = stack->prev;
      free (stack);
      stack = stack_header->prev = oldstack;
    }
  return stack->data[stack->used - 1];
}

/* Return true IFF the stack is empty.  */
static inline bool
ptrstack_empty_p (ptrstack stack)
{
  return !stack->used;
}
#endif

[-- Attachment #3: ptrstack.c --]
[-- Type: text/x-csrc, Size: 1821 bytes --]

/* stack of pointers
   Copyright (C) 2004 Free Software Foundation, Inc.

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2, or (at your option) any
later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, 59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA.  */

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "ptrstack.h"

#define DEFAULT_PTRSTACK_SIZE 256

/* Create empty stack.  */
ptrstack
ptrstack_create (void)
{
  ptrstack stack =
    xmalloc (sizeof (struct ptrstack) +
	     (DEFAULT_PTRSTACK_SIZE - 1) * sizeof (void *));
  stack->prev = stack;
  stack->size = DEFAULT_PTRSTACK_SIZE;
  stack->used = 0;
  return stack;
}

/* Free empty stack.  */
void
ptrstack_free (ptrstack stack)
{
  gcc_assert (stack->prev && !stack->used && stack->prev == stack);
  stack->prev = NULL;
  free (stack);
}

/* Helper function for ptrstack_push - allocate new memory container.  */
ptrstack
ptrstack_grow_internal (ptrstack stack_header)
{
  ptrstack stack = stack_header->prev;
  ptrstack newstack;
  size_t newsize = (size_t) stack->size + (size_t) stack->size / 2;

  if (newsize > 65536)
    newsize = 65536;
  newstack = xmalloc (sizeof (struct ptrstack) + (newsize - 1) * sizeof (void *));
  newstack->prev = stack;
  newstack->size = newsize;
  newstack->used = 0;
  stack_header->prev = newstack;
  return newstack;
}

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2004-10-06 20:17 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-09-04  9:59 Varray memory consumption strikes back Jan Hubicka
2004-09-04 18:11 ` Jeffrey A Law
2004-09-04 21:33   ` Jan Hubicka
2004-09-14  5:53     ` Jeffrey A Law
2004-09-14 10:17       ` Richard Henderson
2004-09-14 15:44         ` Jeffrey A Law
2004-09-14 12:52       ` Jan Hubicka
2004-09-14 15:41       ` Jan Hubicka
2004-09-16 16:38         ` Jeffrey A Law
2004-09-16 23:12           ` Jan Hubicka
2004-09-23 11:38             ` Jeffrey A Law
2004-10-06 20:39               ` Jan Hubicka
2004-10-06 20:52                 ` Daniel Jacobowitz
2004-10-06 20:54                   ` Jan Hubicka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).