* Improved use of TREE_SYMBOL_REFERENCED in the C++ front end
@ 2002-10-29 19:44 Matt Austern
  2002-10-30 14:27 ` Mark Mitchell
  0 siblings, 1 reply; 11+ messages in thread
From: Matt Austern @ 2002-10-29 19:44 UTC (permalink / raw)
  To: gcc

We've got a finish_file function in cp/decl2.c.  It walks the list of
global declarations, asking, for each one, whether it meets the
criteria for being output.  For some kinds of declarations, such
as inline functions, one of the criteria is whether the identifier
associated with that declaration has been referenced.

This is very inefficient, because in many cases the number of
declarations is much much larger than the number that will be
output.  General rule: work should be proportional to what gets
used, not to what happens to get declared.  I'd like to change
this so that when we know something is referenced it'll get put
on a special list.

The obvious solution is a callback, probably a lang hook, that
gets called when we know that the TREE_SYMBOL_REFERENCED
bit gets set.
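
A minimal sketch of the "special list" idea, not GCC code: the struct
and function names below (decl, mark_referenced, note_referenced,
emit_referenced) are hypothetical stand-ins.  The callback fires only
when the referenced bit goes from 0 to 1, so the final walk is
proportional to what was referenced rather than to what was declared.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a declaration; not GCC's tree node.  */
struct decl {
  const char *name;
  int referenced;                /* plays the role of the referenced bit */
  struct decl *next_referenced;  /* link in the "special list" */
};

/* Head of the list of declarations known to be referenced.  */
static struct decl *referenced_decls = NULL;

/* Hypothetical callback: called once, when the bit is first set.  */
static void note_referenced (struct decl *d)
{
  d->next_referenced = referenced_decls;
  referenced_decls = d;
}

/* Set the bit and fire the callback only on the 0 -> 1 transition.  */
void mark_referenced (struct decl *d)
{
  assert (d != NULL);
  if (!d->referenced)
    {
      d->referenced = 1;
      note_referenced (d);
    }
}

/* Walk only the referenced declarations, calling EMIT (if non-null)
   on each one, and return how many were visited.  */
size_t emit_referenced (void (*emit) (struct decl *))
{
  size_t n = 0;
  for (struct decl *d = referenced_decls; d != NULL; d = d->next_referenced)
    {
      if (emit != NULL)
        emit (d);
      n++;
    }
  return n;
}
```

With this shape, the end-of-file pass would call emit_referenced
instead of testing every global declaration in turn.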

So, two questions.

First, is this sort of change of interest?

Second, any advice on where to put that callback?  There are two
obvious choices, assemble_name and a (not yet existent)
TREE_SET_SYMBOL_REFERENCED macro.  If I take the latter
course, I'll change TREE_SYMBOL_REFERENCED so that it
returns an rvalue and change all lvalue uses of that macro to use
TREE_SET_SYMBOL_REFERENCED instead.  The latter of these
two choices obviously involves more change to the source.  Are
there any reasons why it might be the better choice anyway?

			--Matt


* Re: Improved use of TREE_SYMBOL_REFERENCED in the C++ front end
  2002-10-29 19:44 Improved use of TREE_SYMBOL_REFERENCED in the C++ front end Matt Austern
@ 2002-10-30 14:27 ` Mark Mitchell
  2002-10-30 14:36   ` Matt Austern
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Mitchell @ 2002-10-30 14:27 UTC (permalink / raw)
  To: Matt Austern, gcc



--On Tuesday, October 29, 2002 12:37:05 PM -0800 Matt Austern 
<austern@apple.com> wrote:

> We've got a finish_file function in cp/decl2.c.  It walks the list of
> global declarations, asking, for each one, whether it meets the
> criteria for being output.  For some kinds of declarations, such
> as inline functions, one of the criteria is whether the identifier
> associated with that declaration has been referenced.

FWIW, I've always thought that this feedback from the back end to the
front end was a little weird.  I'd rather have the front end pass the
back end the stuff it thinks might need to go out (perhaps having done
some pruning of its own), and then let the back end pick among those
things.  We have no infrastructure for that, though...

> Second, any advice on where to put that callback?  There are two
> obvious choices, assemble_name and a (not yet existent)
> TREE_SET_SYMBOL_REFERENCED macro.

I'd prefer the latter; there are fewer ways to mess that up.

Bear in mind that the back end (unfortunately) only has a string; you'll
have to figure out how to map that back to the corresponding FUNCTION_DECL,
etc., in the front end.  I'm not sure that's easy.

A better, but more invasive, solution would be to keep the information
about what tree the symbol corresponds to in the back end; it could use
that information for other purposes as well.  (Like when the symbol
corresponds to a constant global array, and the array index is a constant,
after optimization, and we want to just grab the constant out of the
array.)

-- 
Mark Mitchell                mark@codesourcery.com
CodeSourcery, LLC            http://www.codesourcery.com


* Re: Improved use of TREE_SYMBOL_REFERENCED in the C++ front end
  2002-10-30 14:27 ` Mark Mitchell
@ 2002-10-30 14:36   ` Matt Austern
  2002-10-30 14:39     ` David Carlton
  2002-10-31 11:08     ` Mark Mitchell
  0 siblings, 2 replies; 11+ messages in thread
From: Matt Austern @ 2002-10-30 14:36 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc

On Wednesday, October 30, 2002, at 12:02 PM, Mark Mitchell wrote:

> --On Tuesday, October 29, 2002 12:37:05 PM -0800 Matt Austern 
> <austern@apple.com> wrote:
>
>> We've got a finish_file function in cp/decl2.c.  It walks the list of
>> global declarations, asking, for each one, whether it meets the
>> criteria for being output.  For some kinds of declarations, such
>> as inline functions, one of the criteria is whether the identifier
>> associated with that declaration has been referenced.
>
> FWIW, I've always thought that this feedback from the back end to the
> front end was a little weird.  I'd rather have the front end pass the
> back end the stuff it thinks might need to go out (perhaps having done
> some pruning of its own), and then let the back end pick among those
> things.  We have no infrastructure for that, though...
>
>> Second, any advice on where to put that callback?  There are two
>> obvious choices, assemble_name and a (not yet existent)
>> TREE_SET_SYMBOL_REFERENCED macro.
>
> I'd prefer the latter; there are fewer ways to mess that up.
>
> Bear in mind that the back end (unfortunately) only has a string;
> you'll have to figure out how to map that back to the corresponding
> FUNCTION_DECL, etc., in the front end.  I'm not sure that's easy.

I'm not sure it's easy either.  Unless I'm missing something, the mapping
from a mangled name back to a decl isn't stored anywhere.  In principle,
of course, for any identifier node id there should be at most one decl d
such that DECL_ASSEMBLER_NAME(d) == id.  How to find that decl is
another matter...

The easiest solution I've thought of is to set up a hash table on the
side, and, for the C++ front end only, change SET_DECL_ASSEMBLER_NAME
to add an entry to that table.
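
A rough sketch of such a side table, assuming identifiers are unique
nodes so that the pointer itself can serve as the key.  Everything
here (ident, decl, set_assembler_name, decl_for_assembler_name, the
bucket count) is hypothetical and only illustrates the shape; it
ignores renaming a decl and never frees entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for an identifier node and a declaration.  */
struct ident { const char *str; };
struct decl { struct ident *assembler_name; };

#define NBUCKETS 256

/* One chained entry per (identifier, decl) pair.  */
struct entry {
  struct ident *id;
  struct decl *d;
  struct entry *next;
};

static struct entry *buckets[NBUCKETS];

/* Hash on the identifier's address; identifiers are assumed unique.  */
static size_t bucket_of (const struct ident *id)
{
  return (size_t) (((uintptr_t) id >> 4) % NBUCKETS);
}

/* Plays the role of a front-end-specific "set assembler name" step:
   record the name on the decl and add the reverse mapping.
   Returns 0 on success, -1 if memory runs out.  */
int set_assembler_name (struct decl *d, struct ident *id)
{
  assert (d != NULL && id != NULL);
  struct entry *e = malloc (sizeof *e);
  if (e == NULL)
    return -1;
  d->assembler_name = id;
  e->id = id;
  e->d = d;
  size_t b = bucket_of (id);
  e->next = buckets[b];
  buckets[b] = e;
  return 0;
}

/* Reverse lookup: the decl whose assembler name is ID, or NULL.  */
struct decl *decl_for_assembler_name (const struct ident *id)
{
  for (struct entry *e = buckets[bucket_of (id)]; e != NULL; e = e->next)
    if (e->id == id)
      return e->d;
  return NULL;
}
```

The cost is one extra entry per named decl.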

> A better, but more invasive, solution would be to keep the information
> about what tree the symbol corresponds to in the back end; it could use
> that information for other purposes as well.  (Like when the symbol
> corresponds to a constant global array, and the array index is a
> constant, after optimization, and we want to just grab the constant
> out of the array.)

More invasive, indeed!  You're probably right that it would be better,
though.  If you can persuade me that this wouldn't be quite as ghastly
as it looks like it would be, though, I'll go for it.  As you say, it
could well be useful for other reasons anyway.

			--Matt


* Re: Improved use of TREE_SYMBOL_REFERENCED in the C++ front end
  2002-10-30 14:36   ` Matt Austern
@ 2002-10-30 14:39     ` David Carlton
  2002-10-31 11:08     ` Mark Mitchell
  1 sibling, 0 replies; 11+ messages in thread
From: David Carlton @ 2002-10-30 14:39 UTC (permalink / raw)
  To: Matt Austern; +Cc: Mark Mitchell, Daniel Berlin, gcc

On Wed, 30 Oct 2002 12:14:27 -0800, Matt Austern <austern@apple.com> said:
> On Wednesday, October 30, 2002, at 12:02 PM, Mark Mitchell wrote:

>> A better, but more invasive, solution would be to keep the
>> information about what tree the symbol corresponds to in the back
>> end; it could use that information for other purposes as well.
>> (Like when the symbol corresponds to a constant global array, and
>> the array index is a constant, after optimization, and we want to
>> just grab the constant out of the array.)

> More invasive, indeed!  You're probably right that it would be
> better, though.  If you can persuade me that this wouldn't be quite
> as ghastly as it looks like it would be, though, I'll go for it.  As
> you say, it could well be useful for other reasons anyway.

I think Daniel Berlin might be dealing with a similar problem right
now when trying to add proper debugging information for namespaces: my
understanding of what he's told me is that, when dwarf2out.c is trying
to output info for using directives, it only has the name of the
namespace that the using directive is trying to add, rather than the
DECL in question, and he really needs the latter so that he can put
the right info in the DW_TAG_imported_module.  (I'm CC'ing Daniel so
he can correct what I'm saying if I've got it wrong.)

David Carlton
carlton@math.stanford.edu


* Re: Improved use of TREE_SYMBOL_REFERENCED in the C++ front end
  2002-10-30 14:36   ` Matt Austern
  2002-10-30 14:39     ` David Carlton
@ 2002-10-31 11:08     ` Mark Mitchell
  2002-10-31 11:37       ` Matt Austern
  1 sibling, 1 reply; 11+ messages in thread
From: Mark Mitchell @ 2002-10-31 11:08 UTC (permalink / raw)
  To: Matt Austern; +Cc: gcc

> I'm not sure it's easy either.  Unless I'm missing something, the mapping
> from a mangled name back to a decl isn't stored anywhere.   In principle,
> of course, for any identifier node id there should be at most one decl d
> such that DECL_ASSEMBLER_NAME(d) == id.  How to find that decl is
> another matter...

Right.

> The easiest solution I've thought of is to set up a hash table on the
> side, and, for the C++ front end only, change SET_DECL_ASSEMBLER_NAME
> to add an entry to that table.

That will probably work, but will cost memory of course.

>> A better, but more invasive, solution would be to keep the information
>> about what tree the symbol corresponds to in the back end; it could use
>> that information for other purposes as well.  (Like when the symbol
>> corresponds to a constant global array, and the array index is a
>> constant, after optimization, and we want to just grab the constant
>> out of the array.)
>
> More invasive, indeed!  You're probably right that it would be better,
> though.  If you can persuade me that this wouldn't be quite as ghastly
> as it looks like it would be, though, I'll go for it.  As you say, it
> could well be useful for other reasons anyway.

I don't think it would be extraordinarily horrible.

You just need to find the places we call assemble_name and pass in a
DECL, when there is one to be had.  What that really means, is finding
the places where SYMBOL_REFs are created and pass in a DECL, where
appropriate.  This is one of those things that will end up being a big,
but largely mechanical, patch.  You don't have to set the DECL field
in the SYMBOL_REF everywhere possible -- just everywhere that you need
the call back to occur.  That may just be make_decl_rtl.
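
To make the shape of that concrete, here is a toy sketch, not GCC's
actual rtx layout: a symbol reference that optionally carries a
pointer back to its declaration, and a reference step that can only
notify the front end when that pointer was filled in.  All names
(symbol_ref, make_symbol_ref, reference_symbol, count_notification)
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a declaration.  */
struct decl {
  const char *name;
  int notified;              /* how many times the hook saw this decl */
};

/* Toy symbol reference: a name plus an optional back-pointer.  */
struct symbol_ref {
  const char *name;
  struct decl *decl;         /* NULL when no decl was available */
};

typedef void (*referenced_hook) (struct decl *);

/* Create a reference; callers pass DECL where they have one.  */
struct symbol_ref make_symbol_ref (const char *name, struct decl *decl)
{
  assert (name != NULL);
  struct symbol_ref ref = { name, decl };
  return ref;
}

/* Note a use of REF.  Returns 1 if the hook was called, 0 if the
   reference carried no decl (or no hook was installed).  */
int reference_symbol (const struct symbol_ref *ref, referenced_hook hook)
{
  assert (ref != NULL);
  if (ref->decl == NULL || hook == NULL)
    return 0;
  hook (ref->decl);
  return 1;
}

/* Example hook: just count notifications on the decl.  */
void count_notification (struct decl *d)
{
  d->notified++;
}
```

References created without a decl simply skip the callback, which is
why only the creation sites that matter need to pass one in.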

-- 
Mark Mitchell                mark@codesourcery.com
CodeSourcery, LLC            http://www.codesourcery.com


* Re: Improved use of TREE_SYMBOL_REFERENCED in the C++ front end
  2002-10-31 11:08     ` Mark Mitchell
@ 2002-10-31 11:37       ` Matt Austern
  2002-10-31 11:57         ` Richard Henderson
  2002-10-31 13:24         ` Mark Mitchell
  0 siblings, 2 replies; 11+ messages in thread
From: Matt Austern @ 2002-10-31 11:37 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: gcc

On Thursday, October 31, 2002, at 10:03 AM, Mark Mitchell wrote:

>> More invasive, indeed!  You're probably right that it would be better,
>> though.  If you can persuade me that this wouldn't be quite as ghastly
>> as it looks like it would be, though, I'll go for it.  As you say, it
>> could well be useful for other reasons anyway.
>
> I don't think it would be extraordinarily horrible.
>
> You just need to find the places we call assemble_name and pass in a
> DECL, when there is one to be had.  What that really means, is finding
> the places where SYMBOL_REFs are created and pass in a DECL, where
> appropriate.  This is one of those things that will end up being a big,
> but largely mechanical, patch.  You don't have to set the DECL field
> in the SYMBOL_REF everywhere possible -- just everywhere that you need
> the call back to occur.  That may just be make_decl_rtl.

OK, I'll look.  If I can avoid wasting memory, that would obviously
be A Good Thing.  (Speaking of avoiding memory wastage, one
alternative I didn't mention was sticking a backreference to the
decl in one of the identifier slots that's currently unused for
mangled names.  Main reason I didn't mention it is that some
other people at Apple are looking at drastically reducing the
space required to store mangled names.  It's silly to cart around
a huge data structure that you only use one word of.)

Oh, and also speaking of big but mechanical patches...

Something I didn't mention in my previous message is that if
we're going to put a callback in the macro that sets the
referenced bit, the first step is to create that macro!  That is,
the first step is to (1) change TREE_SYMBOL_REFERENCED
so that it returns an rvalue; (2) create a new macro,
TREE_SET_SYMBOL_REFERENCED; and (3) change every
place in the compiler that uses TREE_SYMBOL_REFERENCED
in an lvaluish way so that it uses the new SET macro instead.

I'm tempted to submit this patch first, before doing anything
else, just so that this sort of bookkeeping change doesn't get
confused with the real work later on.  It'll touch code all over the
compiler, but the changes are all trivial.  And my understanding
is that this sort of change is considered to be a good idea on its
own merits, since we're gradually working toward getting rid of
the lvalue-returning macros.

Sound reasonable to you?  If so, where would be the best place
to submit that patch?  I could argue for either TOT or the basic
improvements branch.

			--Matt


* Re: Improved use of TREE_SYMBOL_REFERENCED in the C++ front end
  2002-10-31 11:37       ` Matt Austern
@ 2002-10-31 11:57         ` Richard Henderson
  2002-10-31 12:53           ` Matt Austern
  2002-10-31 13:24         ` Mark Mitchell
  1 sibling, 1 reply; 11+ messages in thread
From: Richard Henderson @ 2002-10-31 11:57 UTC (permalink / raw)
  To: Matt Austern; +Cc: Mark Mitchell, gcc

On Thu, Oct 31, 2002 at 10:18:57AM -0800, Matt Austern wrote:
> On Thursday, October 31, 2002, at 10:03 AM, Mark Mitchell wrote:
> >You just need to find the places we call assemble_name and pass in a
> >DECL, when there is one to be had.  What that really means, is finding
> >the places where SYMBOL_REFs are created and pass in a DECL, where
> >appropriate.  This is one of those things that will end up being a big,
> >but largely mechanical, patch.  You don't have to set the DECL field
> >in the SYMBOL_REF everywhere possible -- just everywhere that you need
> >the call back to occur.  That may just be make_decl_rtl.

I've been thinking for a while that expanding a 
SYMBOL_REF to contain a pointer back to the DECL
would be a Very Good Thing.

> Something I didn't mention in my previous message is that if
> we're going to put a callback in the macro that sets the
> referenced bit, the first step is to create that macro!  That is,
> the first step is to (1) change TREE_SYMBOL_REFERENCED
> so that it returns an rvalue; (2) create a new macro,
> TREE_SET_SYMBOL_REFERENCED; and (3) change every
> place in the compiler that uses TREE_SYMBOL_REFERENCED
> in an lvaluish way so that it uses the new SET macro instead.

I don't think it's worth creating a new macro.  Using a langhook
directly should be sufficient for the set.  Part 1 is still 
valuable though.


r~


* Re: Improved use of TREE_SYMBOL_REFERENCED in the C++ front end
  2002-10-31 11:57         ` Richard Henderson
@ 2002-10-31 12:53           ` Matt Austern
  0 siblings, 0 replies; 11+ messages in thread
From: Matt Austern @ 2002-10-31 12:53 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Mark Mitchell, gcc

On Thursday, October 31, 2002, at 10:37 AM, Richard Henderson wrote:

> On Thu, Oct 31, 2002 at 10:18:57AM -0800, Matt Austern wrote:
>> On Thursday, October 31, 2002, at 10:03 AM, Mark Mitchell wrote:
>>> You just need to find the places we call assemble_name and pass in a
>>> DECL, when there is one to be had.  What that really means, is finding
>>> the places where SYMBOL_REFs are created and pass in a DECL, where
>>> appropriate.  This is one of those things that will end up being a big,
>>> but largely mechanical, patch.  You don't have to set the DECL field
>>> in the SYMBOL_REF everywhere possible -- just everywhere that you need
>>> the call back to occur.  That may just be make_decl_rtl.
>
> I've been thinking for a while that expanding a
> SYMBOL_REF to contain a pointer back to the DECL
> would be a Very Good Thing.
>
>> Something I didn't mention in my previous message is that if
>> we're going to put a callback in the macro that sets the
>> referenced bit, the first step is to create that macro!  That is,
>> the first step is to (1) change TREE_SYMBOL_REFERENCED
>> so that it returns an rvalue; (2) create a new macro,
>> TREE_SET_SYMBOL_REFERENCED; and (3) change every
>> place in the compiler that uses TREE_SYMBOL_REFERENCED
>> in an lvaluish way so that it uses the new SET macro instead.
>
> I don't think it's worth creating a new macro.  Using a langhook
> directly should be sufficient for the set.  Part 1 is still
> valuable though.

One of us isn't being sufficiently clear.  I suspect it's me, so I'll
try to explain more clearly.

Yes, you're certainly right that we need a new langhook, and
I should certainly add that as well.  (I should have mentioned it
as part 4 above).  The question is: when does that langhook
get called?

The answer I would like is that it gets called whenever an
identifier's REFERENCED flag gets changed.  Right now, the
way that flag gets changed is with things like
   TREE_SYMBOL_REFERENCED (id) = 1;
Since TREE_SYMBOL_REFERENCED returns an lvalue,
there's no way to distinguish from within the macro whether
it's being used to query the flag or to change it.   (Besides,
even if I did call the langhook every time the existing macro
is used, I wouldn't be calling the langhook at the right time:
I'd be giving it the old value, not the new one.)

(You might ask why I want to have the langhook called from
within the macro, instead of calling it separately.  Answer:
the referenced flag is set all over the place, not just in (say)
assemble_name, sometimes very deep in platform-specific
code; if you do a grep -r, you'll see what I mean.  My feeling
is that it's safer from a maintenance perspective to have
the langhook called from just one point.  I'm willing to be
persuaded otherwise, of course.)

So I'd like to have two different macros, one to query the flag
and one to set it.  Certainly it would make no sense to change
TREE_SYMBOL_REFERENCED to return an rvalue unless I
also provided a new mechanism for setting that flag.
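
As an illustration only, here is the query/set split on a toy
structure.  The names (ident, SYMBOL_REFERENCED, SET_SYMBOL_REFERENCED,
referenced_changed_hook, count_hook) are hypothetical and are not the
tree.h definitions; the point is that the query macro yields an rvalue,
so the setter becomes the one place where a hook can run, and it runs
after the flag has its new value.

```c
#include <assert.h>
#include <stddef.h>

/* Toy identifier node with a one-bit "referenced" flag.  */
struct ident {
  const char *str;
  unsigned referenced : 1;
};

/* Query macro: the "+ 0" makes the result an rvalue, so writing
   SYMBOL_REFERENCED (id) = 1 no longer compiles.  */
#define SYMBOL_REFERENCED(ID) ((ID)->referenced + 0)

/* Hypothetical hook, called after the flag has changed.  */
void (*referenced_changed_hook) (struct ident *) = NULL;

/* Set macro: the only way to write the flag.  The hook sees the new
   value and runs only when the flag actually changes.  */
#define SET_SYMBOL_REFERENCED(ID)                      \
  do {                                                 \
    struct ident *id_ = (ID);                          \
    assert (id_ != NULL);                              \
    if (!id_->referenced)                              \
      {                                                \
        id_->referenced = 1;                           \
        if (referenced_changed_hook != NULL)           \
          referenced_changed_hook (id_);               \
      }                                                \
  } while (0)

/* Example hook for experimentation: counts how often it is called.  */
int hook_calls = 0;

void count_hook (struct ident *id)
{
  (void) id;
  hook_calls++;
}
```

Installing count_hook and setting the same identifier twice would
leave hook_calls at 1 in this sketch, since the second set changes
nothing.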

			--Matt


* Re: Improved use of TREE_SYMBOL_REFERENCED in the C++ front end
  2002-10-31 11:37       ` Matt Austern
  2002-10-31 11:57         ` Richard Henderson
@ 2002-10-31 13:24         ` Mark Mitchell
  1 sibling, 0 replies; 11+ messages in thread
From: Mark Mitchell @ 2002-10-31 13:24 UTC (permalink / raw)
  To: Matt Austern; +Cc: gcc

> I'm tempted to submit this patch first, before doing anything
> else, just so that this sort of bookkeeping change doesn't get
> confused with the real work later on.

That's fine.

> Sound reasonable to you?  If so, where would be the best place
> to submit that patch?  I could argue for either TOT or the basic
> improvements branch.

BIB -- we're not going to call this a bugfix.

Thanks,

-- 
Mark Mitchell                mark@codesourcery.com
CodeSourcery, LLC            http://www.codesourcery.com


* Re: Improved use of TREE_SYMBOL_REFERENCED in the C++ front end
  2002-11-05 14:28 ` Matt Austern
@ 2002-11-05 15:15   ` Richard Henderson
  0 siblings, 0 replies; 11+ messages in thread
From: Richard Henderson @ 2002-11-05 15:15 UTC (permalink / raw)
  To: Matt Austern; +Cc: Mark Mitchell, gcc

On Tue, Nov 05, 2002 at 02:29:27PM -0800, Matt Austern wrote:
> I'd like to submit a patch for this later today.

Well, let's see the patch and decide then.


r~


* Re: Improved use of TREE_SYMBOL_REFERENCED in the C++ front end
       [not found] <20021031230853.GE5029@redhat.com>
@ 2002-11-05 14:28 ` Matt Austern
  2002-11-05 15:15   ` Richard Henderson
  0 siblings, 1 reply; 11+ messages in thread
From: Matt Austern @ 2002-11-05 14:28 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Mark Mitchell, gcc


On Thursday, October 31, 2002, at 03:08  PM, Richard Henderson wrote:

> On Thu, Oct 31, 2002 at 11:01:34AM -0800, Matt Austern wrote:
>> So I'd like to have two different macros, one to query the flag
>> and one to set it.
>
> Ok, my point is that I'd like to have one macro, which is
> read-only, and one langhook, which is write-only.
>
> As for the variety of places that set TREE_SYMBOL_REFERENCED,
> yes that is probably something that should be cleaned up.

Aha, I finally understand what you were suggesting.  Sorry; I've
been dense.  I think the reason I took so long is that you and I
had slightly different ideas about what the langhook would be.
You were assuming it would be something like:
   (*hooks.set_symbol_referenced) (id, 1)
I, however, was assuming it would be something more like:
   (*hooks.referenced_flag_was_changed) (id)
That is, I was assuming the hook would be called after the flag
had been changed, and that it would be called, for example, from
within the macro that did the setting.
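
The two readings can be put side by side in a toy form.  This is a
sketch under assumptions: the hook table, its default functions, and
the two call-site helpers are all made up for illustration, and only
the two hook member names come from the calls written above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy identifier node.  */
struct ident {
  const char *str;
  int referenced;
};

/* Hypothetical hook table holding both styles.  */
struct lang_hooks_sketch {
  /* Style 1: write-only hook; the hook itself performs the write.  */
  void (*set_symbol_referenced) (struct ident *, int);
  /* Style 2: notification hook; called after someone else wrote.  */
  void (*referenced_flag_was_changed) (struct ident *);
};

/* Counts style-2 notifications, for experimentation.  */
int notifications = 0;

static void default_set_symbol_referenced (struct ident *id, int value)
{
  id->referenced = value;
}

static void default_flag_was_changed (struct ident *id)
{
  (void) id;
  notifications++;
}

struct lang_hooks_sketch hooks = {
  default_set_symbol_referenced,
  default_flag_was_changed
};

/* Style 1 call site: every writer goes through the hook.  */
void mark_via_write_hook (struct ident *id)
{
  assert (id != NULL);
  (*hooks.set_symbol_referenced) (id, 1);
}

/* Style 2 call site: a single setter writes, then notifies.  */
void mark_via_setter (struct ident *id)
{
  assert (id != NULL);
  id->referenced = 1;
  (*hooks.referenced_flag_was_changed) (id);
}
```

In style 1 the hook has to know how to write the flag; in style 2
that knowledge stays in the setter and the hook can be dropped later
by editing one place.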

But now that I understand what you meant, I think I do still
prefer my original scheme.  My rationale:
  (1) It's possible that in the future we won't need that hook.
      At present, I haven't heard anyone suggest that it's needed
      for anything but the C++ front end.  If we do get rid of the
      hook (or if we never need to introduce it in the first place)
      then I'd rather make the change in one place, the macro for
      setting the flag, than in every place that uses the macro.
  (2) If we have no macro for setting the flag, then how would the
      langhook do it?  The only answer to that question that I can
      think of is that it would manipulate tree node data structures
      directly.  But we try very hard to avoid that; we try to make
      sure those structures are manipulated only through macros, so
      that when we have to change the data structures we don't have
      to make changes anywhere but in tree.c or tree.h.

I'd like to submit a patch for this later today.

			--Matt
