public inbox for gdb@sourceware.org
* macros, debug information, and parse_macro_definition
@ 2003-04-22 16:40 David Taylor
  2003-04-22 20:06 ` Kevin Buettner
  2003-04-24  3:08 ` Jim Blandy
  0 siblings, 2 replies; 11+ messages in thread
From: David Taylor @ 2003-04-22 16:40 UTC (permalink / raw)
  To: gdb

One of my projects at work is to do the necessary gcc and gdb work to
allow users of sgdb (our GUI on top of GDB) to do macro expansion.

Now, we use ELF/STABS, not ELF/DWARF...

The encoding of macros that I have chosen is very very similar to the
DWARF-2 encoding.  In particular, the string is the same.  (I see no
reason to invent a new encoding.)

As a result, at some point I will need to call a function which will
either be identical to or 99% identical to parse_macro_definition.
So, I'd like to propose that the function parse_macro_definition be
made non static and that it and its support functions (copy_string,
dwarf2_macro_malformed_definition_complaint, consume_improper_spaces
-- all three of which are called *ONLY* by parse_macro_definition) be
moved to another file -- since they are not DWARF specific anymore.

Any objections?  File name?

My inclination is to move them to macrotab.c since the function
parse_macro_definition calls functions within that file and can be
thought of as a thin veneer above the functions macro_define_object
and macro_define_function.

Comments?

David
--
David Taylor
dtaylor@emc.com

* Re: macros, debug information, and parse_macro_definition
  2003-04-22 16:40 macros, debug information, and parse_macro_definition David Taylor
@ 2003-04-22 20:06 ` Kevin Buettner
  2003-04-24  3:08 ` Jim Blandy
  1 sibling, 0 replies; 11+ messages in thread
From: Kevin Buettner @ 2003-04-22 20:06 UTC (permalink / raw)
  To: David Taylor, gdb

On Apr 22, 12:40pm, David Taylor wrote:

> As a result, at some point I will need to call a function which will
> either be identical to or 99% identical to parse_macro_definition.
> So, I'd like to propose that the function parse_macro_definition be
> made non static and that it and its support functions (copy_string,
> dwarf2_macro_malformed_definition_complaint, consume_improper_spaces
> -- all three of which are called *ONLY* by parse_macro_definition) be
> moved to another file -- since they are not DWARF specific anymore.
> 
> Any objections?  File name?

No objections from me...

> My inclination is to move them to macrotab.c since the function
> parse_macro_definition calls functions within that file and can be
> thought of as a thin veneer above the functions macro_define_object
> and macro_define_function.
> 
> Comments?

Sounds reasonable to me.

Kevin

* Re: macros, debug information, and parse_macro_definition
  2003-04-22 16:40 macros, debug information, and parse_macro_definition David Taylor
  2003-04-22 20:06 ` Kevin Buettner
@ 2003-04-24  3:08 ` Jim Blandy
  2003-04-24 15:12   ` David Taylor
  1 sibling, 1 reply; 11+ messages in thread
From: Jim Blandy @ 2003-04-24  3:08 UTC (permalink / raw)
  To: David Taylor; +Cc: gdb


David Taylor <dtaylor@emc.com> writes:
> One of my projects at work is to do the necessary gcc and gdb work to
> allow users of sgdb (our GUI on top of GDB) to do macro expansion.
> 
> Now, we use ELF/STABS, not ELF/DWARF...
> 
> The encoding of macros that I have chosen is very very similar to the
> DWARF-2 encoding.  In particular, the string is the same.  (I see no
> reason to invent a new encoding.)
> 
> As a result, at some point I will need to call a function which will
> either be identical to or 99% identical to parse_macro_definition.
> So, I'd like to propose that the function parse_macro_definition be
> made non static and that it and its support functions (copy_string,
> dwarf2_macro_malformed_definition_complaint, consume_improper_spaces
> -- all three of which are called *ONLY* by parse_macro_definition) be
> moved to another file -- since they are not DWARF specific anymore.
> 
> Any objections?  File name?
> 
> My inclination is to move them to macrotab.c since the function
> parse_macro_definition calls functions within that file and can be
> thought of as a thin veneer above the functions macro_define_object
> and macro_define_function.

This seems like obviously the right thing to do, but I'm hesitating.

If there are two independent specs for how to encode something, then
GDB should have two independent readers, one for each spec.  Even if
the specs happen to be identical.  Specs change, and code has bugs; we
want to be able to upgrade or fix each reader without worrying that
we're breaking the other one.

If you think the syntax is too trivial to worry about this kind of
thing, then maybe sharing code could be okay.  Or if you can persuade
folks to make stabs.texinfo say, "the format is defined to be the same
as that used in Dwarf .debug_macinfo DW_MACINFO_define records; here's
what it looks like, but the Dwarf spec is authoritative", then that
would be okay.

If that can't be swung, I think duplicating the code would be the
right thing to do.  It's only ~170 lines.

Maybe I'm being silly.  But I'm reminded of the tragic case of
monitor.c:monitor_supply_register, used by lots of different *-rom.c
files to parse register dumps.  It had a nice set of heuristics for
groveling through all the garbage that typical ROM monitors print out
when you ask them to dump the register set, and finding the actual
meaningful hex digits amidst all the noise.  The problem was, once the
function was being used by too many ROM monitors, you couldn't change
its behavior for fear of breaking some other *-rom.c file for some
monitor you'd never be able to get your hands on.  It just wasn't
clear exactly what behaviors people relied upon and which they didn't.
Thus the following change:

    revision 1.6
    date: 2001/09/07 21:27:36;  author: jimb;  state: Exp;  lines: +79 -1
    Correctly parse register values provided by the monitor.
    * rom68k-rom.c: #include "value.h".
    (is_hex_digit, hex_digit_value, is_whitespace,
    rom68k_supply_one_register): New static functions.
    (rom68k_supply_register): Call rom68k_supply_one_register, instead
    of monitor_supply_register; the latter was incorrectly parsing
    the values.
    * Makefile.in (rom68k-rom.o): Note that this now #includes value.h.

I had to give rom68k-rom.c a new parser function of its own, that I
could fix and adjust without having to worry about breaking other roms.

* Re: macros, debug information, and parse_macro_definition
  2003-04-24  3:08 ` Jim Blandy
@ 2003-04-24 15:12   ` David Taylor
  2003-04-29 14:55     ` Andrew Cagney
  0 siblings, 1 reply; 11+ messages in thread
From: David Taylor @ 2003-04-24 15:12 UTC (permalink / raw)
  To: Jim Blandy; +Cc: gdb

> From: Jim Blandy <jimb@redhat.com>
> Date: 23 Apr 2003 22:12:07 -0500

> David Taylor <dtaylor@emc.com> writes:

  [...]

> > My inclination is to move them to macrotab.c since the function
> > parse_macro_definition calls functions within that file and can be
> > thought of as a thin veneer above the functions macro_define_object
> > and macro_define_function.
> 
> This seems like obviously the right thing to do, but I'm hesitating.
> 
> If there are two independent specs for how to encode something, then
> GDB should have two independent readers, one for each spec.  Even if
> the specs happen to be identical.  Specs change, and code has bugs; we
> want to be able to upgrade or fix each reader without worrying that
> we're breaking the other one.
>
> If you think the syntax is too trivial to worry about this kind of
> thing, then maybe sharing code could be okay.  Or if you can persuade
> folks to make stabs.texinfo say, "the format is defined to be the same
> as that used in Dwarf .debug_macinfo DW_MACINFO_define records; here's
> what it looks like, but the Dwarf spec is authoritative", then that
> would be okay.

The DWARF 2.0.0 macinfo types that correspond to the proposed STABS
types N_MAC_UNDEF and N_MAC_DEFINE are DW_MACINFO_undef and
DW_MACINFO_define, respectively.  Each type is defined to take two
operands -- an integer and a string.  The integer is the line number
on which the #undef or #define appears.

Now, .stabs has a place to put the line number (the 'n_desc' field),
so that's not a problem.  I'm proposing that the strings be encoded
the same in the proposed STABS extension as the strings are encoded in
DWARF 2.0.0.

Here's the relevant portions of the DWARF 2.0.0 spec (from section
6.3.1.1 Define and Undefine Entries):

[For N_MAC_UNDEF:]

    In the case of a DW_MACINFO_undef entry, the value of this string
    will be simply the name of the pre-processor symbol which was
    undefined at the indicated source file.

[Seems simple enough, no?  For N_MAC_DEFINE:]

    In the case of a DW_MACINFO_define entry, the value of this string
    will be the name of the pre-processor symbol that was defined at
    the indicated source file, followed immediately by the macro
    formal parameter list including the surrounding parentheses (in
    the case of a function-like macro) followed by the definition
    string for the macro.  If there is no formal parameter list, then
    the name of the defined macro is followed directly by its
    definition string.

    In the case of a function-like macro definition, no whitespace
    characters should appear between the name of the defined macro and
    the following left parenthesis.  Also, no whitespace characters
    should appear between successive formal parameters in the formal
    parameter list.  (Successive formal parameters should, however, be
    separated by commas.)  Also, exactly one space character should
    separate the right parenthesis which terminates the formal
    parameter list and the following definition string.

    In the case of a ``normal'' (i.e. non-function-like) macro
    definition, exactly one space character should separate the name
    of the macro from the following definition text.

[A bit of verbiage about whitespace, but again it seems pretty simple.]
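The layout rules quoted above can be sketched as a small string splitter. This is only an illustration of a strict reading of the quoted spec text, not GDB's parse_macro_definition; the names macro_split and split_macro_definition are hypothetical, and the sketch rejects anything the rules do not explicitly allow (a bare name with no following space, stray whitespace in the parameter list, variadic parameters).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch; not a GDB data structure.  */
enum macro_kind { MACRO_MALFORMED, MACRO_OBJECT, MACRO_FUNCTION };

struct macro_split
{
  enum macro_kind kind;
  size_t name_len;   /* Length of the macro name at the start of the string.  */
  int argc;          /* Number of formal parameters (function-like only).  */
  const char *body;  /* Definition text; points into the input string.  */
};

static int
is_ident_start (char c)
{
  return isalpha ((unsigned char) c) || c == '_';
}

static int
is_ident_char (char c)
{
  return isalnum ((unsigned char) c) || c == '_';
}

/* Split S, a DW_MACINFO_define-style string, into name, parameter
   count, and body.  Fields other than KIND are meaningful only when
   KIND is not MACRO_MALFORMED.  */
struct macro_split
split_macro_definition (const char *s)
{
  struct macro_split r = { MACRO_MALFORMED, 0, 0, NULL };
  const char *p = s;

  if (!is_ident_start (*p))
    return r;
  while (is_ident_char (*p))
    p++;
  r.name_len = (size_t) (p - s);

  /* Object-like: exactly one space separates name and definition.  */
  if (*p == ' ')
    {
      r.kind = MACRO_OBJECT;
      r.body = p + 1;
      return r;
    }

  /* Function-like: '(' must follow the name with no whitespace.  */
  if (*p != '(')
    return r;
  p++;

  if (*p != ')')
    for (;;)
      {
        /* Each parameter is an identifier; no whitespace is allowed.  */
        if (!is_ident_start (*p))
          return r;
        while (is_ident_char (*p))
          p++;
        r.argc++;
        if (*p == ',')
          {
            p++;
            continue;
          }
        if (*p == ')')
          break;
        return r;
      }
  p++;  /* Skip the closing parenthesis.  */

  /* Exactly one space separates ')' from the definition string.  */
  if (*p != ' ')
    return r;
  r.kind = MACRO_FUNCTION;
  r.body = p + 1;
  return r;
}
```

Under these assumptions, "MAX(a,b) ((a)>(b)?(a):(b))" splits into a function-like macro with two parameters, while "MAX (a,b) x" is treated as an object-like macro whose definition text starts with "(a,b)", which matches how the C preprocessor reads `#define MAX (a,b) x`.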

> If that can't be swung, I think duplicating the code would be the
> right thing to do.  It's only ~170 lines.

These encodings are simple enough that I wouldn't expect them to
change over time.

I would think it more likely that there would be a bug, either in
gcc's encoding of them or in gdb's processing of them.  And if the
code is duplicated, then both locations need to be modified to fix the
bug -- with the attendant risk that someone will modify one and forget
to modify the other.  Or will modify it, but accidentally make it
slightly different.

I don't know how people would feel about having STABS have a
description of the encoding combined with a caveat that the DWARF
2.0.0 spec is authoritative.  I would certainly prefer that the STABS
document remain self contained.

I wouldn't mind a statement saying, roughly:

    The above encoding of the string operands to N_MAC_DEFINE and
    N_MAC_UNDEF is meant to be identical to the encoding of the string
    operands to the DWARF 2.0.0 macinfo types DW_MACINFO_define and
    DW_MACINFO_undef.  Any differences between the two are
    unintentional.

So, it stops short of saying that DWARF 2.0.0 is authoritative.

> Maybe I'm being silly.  But I'm reminded of the tragic case of
> monitor.c:monitor_supply_register, used by lots of different *-rom.c
> files to parse register dumps.  It had a nice set of heuristics for
> groveling through all the garbage that typical ROM monitors print out
> when you ask them to dump the register set, and finding the actual
> meaningful hex digits amidst all the noise.  The problem was, once the
> function was being used by too many ROM monitors, you couldn't change
> its behavior for fear of breaking some other *-rom.c file for some
> monitor you'd never be able to get your hands on.  It just wasn't
> clear exactly what behaviors people relied upon and which they didn't.

GDB supports fewer debug format readers than monitors.  And I do not
foresee a lot of gdb's debug format readers using these routines.

> Thus the following change:
> 
>     revision 1.6
>     date: 2001/09/07 21:27:36;  author: jimb;  state: Exp;  lines: +79 -1
>     Correctly parse register values provided by the monitor.
>     * rom68k-rom.c: #include "value.h".
>     (is_hex_digit, hex_digit_value, is_whitespace,
>     rom68k_supply_one_register): New static functions.
>     (rom68k_supply_register): Call rom68k_supply_one_register, instead
>     of monitor_supply_register; the latter was incorrectly parsing
>     the values.
>     * Makefile.in (rom68k-rom.o): Note that this now #includes value.h.
> 
> I had to give rom68k-rom.c a new parser function of its own, that I
> could fix and adjust without having to worry about breaking other roms.

David

* Re: macros, debug information, and parse_macro_definition
  2003-04-24 15:12   ` David Taylor
@ 2003-04-29 14:55     ` Andrew Cagney
  2003-04-29 15:55       ` Daniel Berlin
                         ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Andrew Cagney @ 2003-04-29 14:55 UTC (permalink / raw)
  To: David Taylor; +Cc: gdb


> 
> These encodings are simple enough that I wouldn't expect them to
> change over time.
> 
> I would think it more likely that there would be a bug, either in
> gcc's encoding of them or in gdb's processing of them.  And if the
> code is duplicated, then both locations need to be modified to fix the
> bug -- with the attendant risk that someone will modify one and forget
> to modify the other.  Or will modify it, but accidentally make it
> slightly different.


> I don't know how people would feel about having STABS have a
> description of the encoding combined with a caveat that the DWARF
> 2.0.0 spec is authoritative.  I would certainly prefer that the STABS
> document remain self contained.

I think this is a good strategy.  It's already a ``gcc extension'',  and 
stealing an existing (known to be working) spec is always a good 
strategy :-)

Have you thought about link time information compression?  One of the 
complaints leveled at the macro stuff is the size of the resultant debug 
info.

Andrew


* Re: macros, debug information, and parse_macro_definition
  2003-04-29 14:55     ` Andrew Cagney
@ 2003-04-29 15:55       ` Daniel Berlin
  2003-04-29 16:24         ` Keith Walker
  2003-04-29 16:51       ` David Taylor
  2003-05-02 19:21       ` Jim Blandy
  2 siblings, 1 reply; 11+ messages in thread
From: Daniel Berlin @ 2003-04-29 15:55 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: David Taylor, gdb


On Tuesday, April 29, 2003, at 10:55  AM, Andrew Cagney wrote:

>
>> These encodings are simple enough that I wouldn't expect them to
>> change over time.
>> I would think it more likely that there would be a bug, either in
>> gcc's encoding of them or in gdb's processing of them.  And if the
>> code is duplicated, then both locations need to be modified to fix the
>> bug -- with the attendant risk that someone will modify one and forget
>> to modify the other.  Or will modify it, but accidentally make it
>> slightly different.
>
>
>> I don't know how people would feel about having STABS have a
>> description of the encoding combined with a caveat that the DWARF
>> 2.0.0 spec is authoritative.  I would certainly prefer that the STABS
>> document remain self contained.
>
> I think this is a good strategy.  It's already a ``gcc extension'',  
> and stealing an existing (known to be working) spec is always a good 
> strategy :-)
>
> Have you thought about link time information compression?  One of the 
> complaints leveled at the macro stuff is the size of the resultant 
> debug info.

It's trivial to do macro information compression, unlike normal dwarf2 
info compression, because the macro info has no references. It is what 
it is.  With a smart algorithm (it's a bit tricky to keep the semantics 
the same after merging all the macro infos), you could simply take all 
the macro infos, merge them, make one macro info, and point all the 
debug sections at it.
I think, anyway.

>
> Andrew
>
>

* Re: macros, debug information, and parse_macro_definition
  2003-04-29 15:55       ` Daniel Berlin
@ 2003-04-29 16:24         ` Keith Walker
  2003-04-29 17:13           ` Daniel Berlin
  0 siblings, 1 reply; 11+ messages in thread
From: Keith Walker @ 2003-04-29 16:24 UTC (permalink / raw)
  To: Daniel Berlin; +Cc: gdb

At 11:55 29/04/2003 -0400, Daniel Berlin wrote:
>It's trivial to do macro information compression, unlike normal dwarf2 
>info compression, because the macro info has no references. It is what it 
>is.  With a smart algorithm (it's a bit tricky to keep the semantics the 
>same after merging all the macro infos), you could simply take all the 
>macro infos, merge them, make one macro info, and point all the debug 
>sections at it.
>I think, anyway.

I'm not sure whether your comment is about macro info in general or about 
macro info in DWARF2.

Unfortunately, for DWARF2 debugging information, I don't think it is quite 
so easy in that the macro information can include file start/end entries 
which refer to file entries in the associated line number information - so 
you would also have to do something about merging the line number tables as 
well; and hence update all other entries that refer to the file entries.

Keith
  

* Re: macros, debug information, and parse_macro_definition
  2003-04-29 14:55     ` Andrew Cagney
  2003-04-29 15:55       ` Daniel Berlin
@ 2003-04-29 16:51       ` David Taylor
  2003-05-02 19:21       ` Jim Blandy
  2 siblings, 0 replies; 11+ messages in thread
From: David Taylor @ 2003-04-29 16:51 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: gdb

> Date: Tue, 29 Apr 2003 10:55:58 -0400
> From: Andrew Cagney <ac131313@redhat.com>

> Have you thought about link time information compression?  One of the 
> complaints leveled at the macro stuff is the size of the resultant debug 
> info.
> 
> Andrew

I've thought some about (link time) debug information compression;
but, probably not as much as I should have.

[We already do some compression of STABS w/o macro information because
it's too voluminous as it is.  (And DWARF is worse still.)  I imagine
I will have to solve the problem in-house when macro information is
'turned on', so yes, this is a problem I need to think about.]

Since it's not in a separate section, you can't just say "oh, these
are identical, only keep one copy".

And you certainly (or, I certainly) don't want to teach LD about the
internals of STABS (or any debug format, for that matter) -- to, for
example, compare chunks that are bracketed by N_BINCL / N_EINCL (begin
include, end include), never mind more serious knowledge.

The choices that come immediately to mind are:

. teach LD about the internals of the debug format -- major blech.

. leave the information in the object files and have the executable
  file just 'reference' the object file information in some manner.

  [Sun does/did this with their compilers.  And this has some appeal.
  I believe that it would satisfy needs during development; but, I
  believe that it would not satisfy support needs.  So, I don't see us
  implementing it unless it turns out that it's just a step along the
  way to the real solution.]

. have a separate program do some post processing of the debug symbols
  to compress them.

  [Hmmm, what would it take to get GNU LD to support piping its output
  to a program?  That way, the huge executable need never be stored on
  disk.  And then, if you invoked it via the specs file, it could be
  made pretty transparent to the user...  Just thinking out loud, mind
  you.]

. somehow generate only one copy in the first place.  Possibly by
  doing something with 'pre-compiled' header files?  What, exactly, is
  unclear.

I believe that all of the above would require GDB work to support
them.

* Re: macros, debug information, and parse_macro_definition
  2003-04-29 16:24         ` Keith Walker
@ 2003-04-29 17:13           ` Daniel Berlin
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Berlin @ 2003-04-29 17:13 UTC (permalink / raw)
  To: Keith Walker; +Cc: gdb


On Tuesday, April 29, 2003, at 12:16  PM, Keith Walker wrote:

> At 11:55 29/04/2003 -0400, Daniel Berlin wrote:
>> It's trivial to do macro information compression, unlike normal 
>> dwarf2 info compression, because the macro info has no references. It 
>> is what it is.  With a smart algorithm (it's a bit tricky to keep the 
>> semantics the same after merging all the macro infos), you could 
>> simply take all the macro infos, merge them, make one macro info, and 
>> point all the debug sections at it.
>> I think, anyway.
>
> I'm not sure whether your comment is about macro info in general or 
> about macro info in DWARF2.
>
> Unfortunately, for DWARF2 debugging information, I don't think it is 
> quite so easy in that the macro information can include file start/end 
> entries which refer to file entries in the associated line number 
> information -
I'm aware; I wrote GCC's DWARF2 macro info support.
:)

> so you would also have to do something about merging the line number 
> tables as well;
This is also trivial, since it's just going to require updating an 
attribute on all the DIEs, not moving/removing DIEs (which is the only 
real thing that generates problems in merging DWARF2 info).

>  and hence update all other entries that refer to the file entries.
Still trivial compared to what you have to do for DWARF2 info in 
general.

>
> Keith
>

* Re: macros, debug information, and parse_macro_definition
  2003-04-29 14:55     ` Andrew Cagney
  2003-04-29 15:55       ` Daniel Berlin
  2003-04-29 16:51       ` David Taylor
@ 2003-05-02 19:21       ` Jim Blandy
  2003-05-04 19:41         ` Andrew Cagney
  2 siblings, 1 reply; 11+ messages in thread
From: Jim Blandy @ 2003-05-02 19:21 UTC (permalink / raw)
  To: Andrew Cagney; +Cc: David Taylor, gdb


Andrew Cagney <ac131313@redhat.com> writes:

> > These encodings are simple enough that I wouldn't expect them to
> > change over time.
> > I would think it more likely that there would be a bug, either in
> > gcc's encoding of them or in gdb's processing of them.  And if the
> > code is duplicated, then both locations need to be modified to fix the
> > bug -- with the attendant risk that someone will modify one and forget
> > to modify the other.  Or will modify it, but accidentally make it
> > slightly different.
> 
> 
> > I don't know how people would feel about having STABS have a
> > description of the encoding combined with a caveat that the DWARF
> > 2.0.0 spec is authoritative.  I would certainly prefer that the STABS
> > document remain self contained.
> 
> I think this is a good strategy.  It's already a ``gcc extension'',
> and stealing an existing (known to be working) spec is always a good
> strategy :-)
> 
> Have you thought about link time information compression?  One of the
> complaints leveled at the macro stuff is the size of the resultant
> debug info.

Actually, I think that David's proposed representation will compress
really well, with no new linker work.  The linker's current behavior
will do everything that's needed.

Each entry in the .stab section is a fixed-size record; the textual
portion of the stab is represented as an offset into the .stabstr
section, which contains null-terminated strings.  The .stabstr section
is a SHT_STRTAB type section, which means that the linker will
automatically factor out duplicates.  So if two .stab entries have the
same text, they'll end up pointing to the same bytes in .stabstr in
the final executable.

In David's proposed representation, #including a file into many
different .o files will produce stabs entries with identical strings,
so they'll all get factored out nicely.

All this is completely independent of the BINCL/EINCL -> EXCL
compression the linker also does for STABS, to factor out duplicated
entries from the .stab section itself.
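
The effect described above, where identical stab strings end up sharing the same bytes, can be sketched with a tiny string table that appends a string only if it is not already present. This illustrates the outcome only and makes no claim about how GNU ld implements it; struct strtab, strtab_init, strtab_intern, and the fixed STRTAB_CAP are all hypothetical names chosen for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STRTAB_CAP 4096  /* Arbitrary capacity for this sketch.  */

/* A flat table of null-terminated strings, addressed by byte offset.  */
struct strtab
{
  char data[STRTAB_CAP];
  size_t used;  /* Number of bytes of DATA in use.  */
};

/* Start the table with the empty string at offset 0.  */
void
strtab_init (struct strtab *t)
{
  t->data[0] = '\0';
  t->used = 1;
}

/* Return the offset of S in T, appending S only if no identical
   string is already stored.  Returns (size_t) -1 if T is full.
   The linear scan keeps the sketch short; a real tool would
   presumably hash.  */
size_t
strtab_intern (struct strtab *t, const char *s)
{
  size_t len = strlen (s) + 1;  /* Include the terminating null.  */
  size_t off = 0;

  /* Walk the existing strings looking for an exact match.  */
  while (off < t->used)
    {
      if (strcmp (t->data + off, s) == 0)
        return off;
      off += strlen (t->data + off) + 1;
    }

  if (len > STRTAB_CAP - t->used)
    return (size_t) -1;

  memcpy (t->data + t->used, s, len);
  off = t->used;
  t->used += len;
  return off;
}
```

If two object files each record the string "BUFSIZ 1024" because they include the same header, interning it twice returns the same offset and the table grows only once, which is the space saving at stake here.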

* Re: macros, debug information, and parse_macro_definition
  2003-05-02 19:21       ` Jim Blandy
@ 2003-05-04 19:41         ` Andrew Cagney
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Cagney @ 2003-05-04 19:41 UTC (permalink / raw)
  To: Jim Blandy; +Cc: David Taylor, gdb


> Actually, I think that David's proposed representation will compress
> really well, with no new linker work.  The linker's current behavior
> will do everything that's needed.
> 
> Each entry in the .stab section is a fixed-size record; the textual
> portion of the stab is represented as an offset into the .stabstr
> section, which contains null-terminated strings.  The .stabstr section
> is a SHT_STRTAB type section, which means that the linker will
> automatically factor out duplicates.  So if two .stab entries have the
> same text, they'll end up pointing to the same bytes in .stabstr in
> the final executable.
> 
> In David's proposed representation, #including a file into many
> different .o files will produce stabs entries with identical strings,
> so they'll all get factored out nicely.
> 
> All this is completely independent of the BINCL/EINCL -> EXCL
> compression the linker also does for STABS, to factor out duplicated
> entries from the .stab section itself.

So the stabs mechanism might compress down to something useable.  Ah, 
the irony :-)

Andrew

