public inbox for gdb@sourceware.org
* gcj debug question
@ 2022-10-25 10:48 Andrew Dinn
  2022-11-09 16:10 ` Tom Tromey
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Dinn @ 2022-10-25 10:48 UTC
  To: gdb-list

Hi gdb experts,

I'm hoping there is still enough institutional memory left somewhere in 
this forum to provide info about (the now defunct) DWARF support for 
gcj. Specifically, does anyone have a long enough memory to recall 
whether and, if so, how gcj advertised the presence of Java reflective 
class objects (instances of java.lang.Class) to the debugger?

   - Did it insert linker symbols for e.g. org.my.Foo.class into a 
generated binary?

   - Did it emit DWARF info records with tag DW_TAG_variable and 
associated attributes like name, type, linkage name and location?

   - In the latter case were these records located in the class (tag 
DW_TAG_class) info record or at top level in the same compile unit as 
the class?

Of course, the question assumes that java.lang.Class instances were 
present in the image heap in advance of startup, which may be completely 
unwarranted as they could equally be created on demand during program 
bootstrap or normal execution. Correction of any such erroneous 
presumption would be most welcome.

Answers on a (e-)postcard would be much appreciated. Actually, I'd be 
happy just with the name of someone who knows someone who might know.

regards,


Andrew Dinn
-----------



* Re: gcj debug question
  2022-10-25 10:48 gcj debug question Andrew Dinn
@ 2022-11-09 16:10 ` Tom Tromey
  2022-11-21 14:23   ` Andrew Dinn
  0 siblings, 1 reply; 4+ messages in thread
From: Tom Tromey @ 2022-11-09 16:10 UTC
  To: Andrew Dinn via Gdb; +Cc: Andrew Dinn

>>>>> "Andrew" == Andrew Dinn via Gdb <gdb@sourceware.org> writes:

Andrew> I'm hoping there is still enough institutional memory left somewhere
Andrew> in this forum to provide info about (the now defunct) DWARF support
Andrew> for gcj. Specifically, does anyone have a long enough memory to recall 
Andrew> whether and, if so, how gcj advertised the presence of Java reflective
Andrew> class objects (instances of java.lang.Class) to the debugger?

You can see all the old code in commit 9c37b5ae, which removed it.
Most of what you want is in jv-lang.c.

Andrew>   - Did it insert linker symbols for e.g. org.my.Foo.class into a
Andrew>     generated binary?

I believe gcj did do this, but gdb also knew how to extract the vtable
from an object, use that to find the runtime's class object, and then
decode that object to make a gdb 'struct type'.  See
java_class_from_object and type_from_class.
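
The walk described above (object, then vtable, then runtime class
object) can be sketched with gdb's Python API. This is only an
illustration: the field names `vtable`, `class` and `name` are
assumptions here, and the removed jv-lang.c remains the authority on
what gdb actually read. The helpers accept anything that supports `[]`
and `dereference()`, which `gdb.Value` does.

```python
# Sketch only: "vtable", "class" and "name" are assumed field names,
# chosen for illustration rather than taken from the gcj runtime headers.


def class_from_object(obj):
    """Follow object -> vtable -> runtime class object."""
    vtable = obj["vtable"].dereference()
    return vtable["class"].dereference()


def class_name(obj):
    """Return the name field of the class object that describes obj."""
    return class_from_object(obj)["name"]
```

Inside gdb one would pass in something like the result of
`gdb.parse_and_eval("*this")`. Turning the class object into a full
type description is presumably the part that type_from_class handled.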

It's been a long time but my recollection is that debugging Java didn't
work extremely well.  When I worked on gcj I basically knew nothing
about gdb and so I never tried to fix any of the bugs.

The main issue with this kind of thing is that there has to be a way to
communicate the Class layout from the runtime to gdb.  DWARF could be
used for this, of course, but often these kinds of system libraries are
stripped.  Ada has this problem for task objects, and there we just have
gdb know the object layout... not really ideal.

If I was doing this again I'd probably look into whether enough Python
infrastructure could be added so that the magic could be done in Python
code that was shipped alongside libgcj.  That would break this link
between the runtime and the debugger.  For basic debugging it could
maybe all be done via pretty-printers; though of course that doesn't
work if you want to support 'ptype'.
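
A minimal sketch of the pretty-printer route, assuming a hypothetical
string object with `length` and `data` fields and a type that gdb
names `java::lang::String`. Both the layout and the type name are
assumptions for illustration, not gcj's real ones. Only
`gdb.pretty_printers` and the `to_string` method come from gdb's
documented Python API.

```python
# Sketch only: the "length"/"data" layout and the type name
# "java::lang::String" are illustrative assumptions.
try:
    import gdb  # only importable inside a gdb session
except ImportError:
    gdb = None


class JavaStringPrinter:
    """Render a hypothetical Java string object as a quoted string."""

    def __init__(self, val):
        self.val = val

    def to_string(self):
        length = int(self.val["length"])
        data = self.val["data"]
        chars = [chr(int(data[i])) for i in range(length)]
        return '"' + "".join(chars) + '"'


def lookup_printer(val):
    """Return a printer for types we recognise, otherwise None."""
    type_name = str(val.type.strip_typedefs())
    if type_name == "java::lang::String":
        return JavaStringPrinter(val)
    return None


if gdb is not None:
    # gdb calls each lookup function in this list for every value it prints.
    gdb.pretty_printers.append(lookup_printer)
```

A printer like this changes what 'print' shows but leaves 'ptype'
untouched, which is the limitation noted above.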

Andrew>   - Did it emit DWARF info records with tag DW_TAG_variable and
Andrew>     associated attributes like name, type, linkage name and location?

I am not sure.

One thing to note is that gcj had two ABIs.  One ABI was C++-like and
was used for the core classes.  For example, all of java.lang (IIRC)
would have been built this way.  In this mode, object and vtable layout
was mostly compatible with C++ and so (I assume, I don't recall looking
at this much) ordinary C++-ish DWARF would have been emitted.

There was also the "binary compatibility ABI", which tried to follow the
Java binary compatibility rules.  This mode deferred object and vtable
layout (and other related decisions) until class initialization.
Normally, user code would be compiled in this mode.  I'm not sure what
the DWARF would have looked like here, but it couldn't have been very
ordinary, because things like data member offsets wouldn't be known
until after class initialization, and in those days gdb didn't
understand things like dynamic type layout.

Andrew>   - In the latter case were these records located in the class (tag
Andrew>     DW_TAG_class) info record or at top level in the same compile unit
Andrew>    as the class?

In DWARF these are always nested in the DW_TAG_class.  Top-level is for
things like global variables.  (A Java static member would still be
under the class, Java doesn't have this kind of global.)

Tom


* Re: gcj debug question
  2022-11-09 16:10 ` Tom Tromey
@ 2022-11-21 14:23   ` Andrew Dinn
  2022-11-21 14:35     ` Jan Vrany
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Dinn @ 2022-11-21 14:23 UTC
  To: Tom Tromey, Andrew Dinn via Gdb

Hi Tom,

Thanks very much for your helpful response -- and also very nice to 
hear from you again.

On 09/11/2022 16:10, Tom Tromey wrote:
>>>>>> "Andrew" == Andrew Dinn via Gdb <gdb@sourceware.org> writes:
> 
> Andrew> I'm hoping there is still enough institutional memory left somewhere
> Andrew> in this forum to provide info about (the now defunct) DWARF support
> Andrew> for gcj. Specifically, does anyone have a long enough memory to recall
> Andrew> whether and, if so, how gcj advertised the presence of Java reflective
> Andrew> class objects (instances of java.lang.Class) to the debugger?
> 
> You can see all the old code in commit 9c37b5ae, which removed it.
> Most of what you want is in jv-lang.c.

That's the most useful thing I could have asked for, not just for this 
issue but for any other questions I might need to answer.

> Andrew>   - Did it insert linker symbols for e.g. org.my.Foo.class into a
> Andrew>     generated binary?
> 
> I believe gcj did do this, but gdb also knew how to extract the vtable
> from an object, use that to find the runtime's class object, and then
> decode that object to make a gdb 'struct type'.  See
> java_class_from_object and type_from_class.

Wow, that's nice. Of course I'm able to generate all the required info 
up front as DWARF (because GraalVM has a closed world model).

> It's been a long time but my recollection is that debugging Java didn't
> work extremely well.  When I worked on gcj I basically knew nothing
> about gdb and so I never tried to fix any of the bugs.

Hmm, much like when I started on gdb support for GraalVM ... ;-)

> The main issue with this kind of thing is that there has to be a way to
> communicate the Class layout from the runtime to gdb.  DWARF could be
> used for this, of course, but often these kinds of system libraries are
> stripped.  Ada has this problem for task objects, and there we just have
> gdb know the object layout... not really ideal.

Well, without runtime class loading it's much less of a two-way street. 
Being able to generate all the DWARF info you need up front means you 
are able to provide complete file and line info, frame info (without 
needing to rely on a valid fp), static field and local var locations, 
etc. I even look up and cache sources at build time so they can be pulled 
out of a hat (along with a .dwz file) when you want to debug an image.

> If I was doing this again I'd probably look into whether enough Python
> infrastructure could be added so that the magic could be done in Python
> code that was shipped alongside libgcj.  That would break this link
> between the runtime and the debugger.  For basic debugging it could
> maybe all be done via pretty-printers; though of course that doesn't
> work if you want to support 'ptype'.

I had not thought much about using the Python APIs (mainly because my 
Python skills ... how can I best express this ... sit comfortably in the 
[0, epsilon) interval). I will have a think about what Python might be 
useful for, though.

> One thing to note is that gcj had two ABIs.  One ABI was C++-like and
> was used for the core classes.  For example, all of java.lang (IIRC)
> would have been built this way.  In this mode, object and vtable layout
> was mostly compatible with C++ and so (I assume, I don't recall looking
> at this much) ordinary C++-ish DWARF would have been emitted.

Ok, so this starts to make sense of the way that all those core classes 
are defined as they are using C code. Effectively Java objects, 
including Java Class objects, are implemented as near as possible to the 
way g++ implements C++ objects (i.e. using closely similar conventions 
for data and code layouts).

That's never really been a guiding principle for the operation of the 
compiler I am generating DWARF for. However, I *have* effectively found 
a way to model the Java objects in DWARF that makes it look like they 
originated from a C++ type & class base derivable from the original Java 
class base via a simple source to source mapping. That's enough to fake 
Java debugging.

> In DWARF these are always nested in the DW_TAG_class.  Top-level is for
> things like global variables.  (A Java static member would still be
> under the class, Java doesn't have this kind of global.)

Yes, that makes sense. Thanks for the history lesson and, most of all, 
for a link to the source.

regards,


Andrew Dinn
-----------
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill



* Re: gcj debug question
  2022-11-21 14:23   ` Andrew Dinn
@ 2022-11-21 14:35     ` Jan Vrany
  0 siblings, 0 replies; 4+ messages in thread
From: Jan Vrany @ 2022-11-21 14:35 UTC
  To: Andrew Dinn, Tom Tromey, Andrew Dinn via Gdb

Hi Andrew,


On Mon, 2022-11-21 at 14:23 +0000, Andrew Dinn via Gdb wrote:
> > If I was doing this again I'd probably look into whether enough Python
> > infrastructure could be added so that the magic could be done in Python
> > code that was shipped alongside libgcj.  That would break this link
> > between the runtime and the debugger.  For basic debugging it could
> > maybe all be done via pretty-printers; though of course that doesn't
> > work if you want to support 'ptype'.
>
> I had not thought much about using the Python APIs (mainly because my
> Python skills ... how can I best express this ... sit comfortably in the
> [0, epsilon) interval). I will have a think about what Python might be
> useful for, though.

I'm confident Python is the way to go here.

I'm using Python code to navigate objects on the (garbage-collected) heap
and pretty-print them - the current Python API has so far been sufficient
(for my use cases).

Things get a little more difficult when it comes to frame decorators.

This spring I extended the Python API to allow building line number tables
from Python, and used this to demo how to map jitted code to .java source
code. I plan to start submitting patches next month or in January.
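
For the frame decorator side, a minimal sketch using gdb's documented
frame filter interface might look like the following. The renaming rule
(`::` to `.`), the filter name and the priority are assumptions chosen
for illustration; this is not the code referred to above.

```python
# Sketch only: the "::" -> "." renaming rule, the filter name and the
# priority are illustrative assumptions.
try:
    import gdb
    from gdb.FrameDecorator import FrameDecorator
except ImportError:  # not running inside gdb
    gdb = None
    FrameDecorator = object


def java_style(name):
    """Turn a C++-looking qualified name into Java dotted form."""
    return name.replace("::", ".")


class JavaFrameDecorator(FrameDecorator):
    """Decorate a frame so its function name prints in Java style."""

    def function(self):
        name = super().function()
        # function() may also yield an address or None; leave those alone.
        return java_style(name) if isinstance(name, str) else name


class JavaFrameFilter:
    """Frame filter that wraps every frame in JavaFrameDecorator."""

    def __init__(self):
        self.name = "java-names"
        self.priority = 100
        self.enabled = True

    def filter(self, frame_iter):
        return map(JavaFrameDecorator, frame_iter)


if gdb is not None:
    # Registering in the global dictionary applies the filter to 'bt'.
    flt = JavaFrameFilter()
    gdb.frame_filters[flt.name] = flt
```

Once sourced into gdb, a plain 'bt' would show the rewritten names,
while 'bt -no-filters' would show the undecorated ones.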




