* Dwfl Callbacks, how would I pass a ByteBuffer to find_elf (possibly  through user_data).
From: Nurdin @ 2007-05-17 18:04 UTC
  To: Frysk List

The following depends on the patch in this bug:
<http://sourceware.org/bugzilla/show_bug.cgi?id=4513>
specifically the changes to Dwfl.cxx.

The problem I am trying to solve is how to get the Dwfl_Callbacks find_elf
callback:

dwfl_frysk_proc_find_elf (Dwfl_Module *mod __attribute__ ((unused)),
              void **userdata __attribute__ ((unused)),
              const char *module_name, Dwarf_Addr base,
              char **file_name, Elf **elfp)

to use frysk to read a process's memory, rather than reading
/proc/pid/mem, in the vDSO case.

In order to do this I must somehow pass the memory from frysk to the 
callback.
Currently the callback stores the pid in the Dwfl_Module name (as
"[vdso pid]") and then looks up /proc/pid/mem.  It uses a callback with
the signature:

read_proc_memory (void *arg, void *data, GElf_Addr address,
          size_t minread, size_t maxread)

So if I can change the void *arg to refer to a byte buffer rather than a 
file descriptor and just read from the byte buffer I should be one step 
closer to getting procedure names for a stack trace using elfutils.
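
For illustration, here is a minimal sketch of such a callback in C, assuming
the memory has already been copied into a local buffer.  The struct
mem_buffer type and the name read_buffer_memory are hypothetical, and the
GElf_Addr typedef is only a stand-in so that the sketch compiles without
elfutils.  The return convention follows the read_memory contract of
elf_from_remote_memory: -1 on error, 0 if fewer than minread bytes are
available, otherwise the number of bytes copied.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>          /* ssize_t */

/* Stand-in so the sketch compiles without elfutils; real code would
   include <gelf.h> and use its GElf_Addr instead.  */
typedef uint64_t GElf_Addr;

/* Hypothetical description of a chunk of debuggee memory that has
   already been copied into the debugger.  */
struct mem_buffer
{
  GElf_Addr start;              /* target address of bytes[0] */
  size_t size;                  /* number of valid bytes */
  const unsigned char *bytes;   /* local copy of the target memory */
};

/* Same shape as read_proc_memory, but ARG points to a struct mem_buffer
   rather than to a file descriptor.  */
ssize_t
read_buffer_memory (void *arg, void *data, GElf_Addr address,
                    size_t minread, size_t maxread)
{
  const struct mem_buffer *buf = arg;

  assert (minread <= maxread);  /* caller contract */
  if (buf == NULL || data == NULL)
    {
      errno = EINVAL;
      return -1;
    }

  /* Addresses outside the buffer are treated like reading past EOF.  */
  if (address < buf->start || address - buf->start >= buf->size)
    return 0;

  size_t offset = (size_t) (address - buf->start);
  size_t avail = buf->size - offset;
  size_t n = avail < maxread ? avail : maxread;
  if (n < minread)
    return 0;

  memcpy (data, buf->bytes + offset, n);
  return (ssize_t) n;
}
```

The caller would pass a pointer to a filled-in struct mem_buffer as the
void *arg wherever a pointer to the file descriptor is passed today.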

I think the best way to do this is to pass the byte buffer to the 
callback through the void **userdata field, but I'm unsure as to how to 
set that field.

I have thought about having a member in the Dwfl class to hold the byte 
buffer, but since neither the read_proc_memory nor the 
dwfl_frysk_proc_find_elf callback is linked to a Java method, I can't 
use the "this" keyword.

I've also thought about wrapping the Dwfl_Callbacks in a 
Frysk_Dwfl_Callback struct that also holds a byte buffer, but I can't 
reference the callback structure from read_proc_memory or 
dwfl_frysk_proc_find_elf.


* Dwfl Callbacks
From: Roland McGrath @ 2007-05-18  7:14 UTC
  To: Nurdin; +Cc: Frysk List

There are two independent subjects here, so I'll take one at a time in
separate messages.  First, the general issue of dealing with libdwfl callbacks.

I don't know the Frysk code or the Java wrappers in any detail, so I will
just speak to the general case of using the library in C/C++.

It's true that there is no interface to get the Dwfl_Callbacks pointer from
a Dwfl or Dwfl_Module, or the Dwfl from a Dwfl_Module.  These would be
simple to add.  That would make it possible to embed Dwfl_Callbacks in a
larger structure and recover a pointer to your own data from that (e.g. via
C++ multiple inheritance).  But I always figured applications would set the
module userdata to point at a per-module data structure of their own, and
use that to find anything else of theirs in callbacks.

Callbacks get passed void **userdata, and you can get this pointer from
dwfl_module_info on a Dwfl_Module.  This gives the location of a void *
inside the Dwfl_Module that belongs to the application.  The usage model I
expect is that when populating a Dwfl by calling dwfl_report_module (or
things that call it, like dwfl_report_elf, dwfl_report_offline), an
application sets each module's userdata to point to its own data structure
about that module (if it has one, or some other structure it wants).  To do
this after each dwfl_report_module call, you'd call dwfl_module_info to get
the void **.  After reporting, you could do that on Dwfl_Module pointers
you have, or use dwfl_getmodules and get the void ** in each callback along
with the Dwfl_Module *.  When you have the void **, then store there any
void * that floats your boat, or fetch what you stored before.  The void *
field starts out as NULL, and libdwfl never touches it or looks at it, only
passes its address to you.
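
As a rough sketch of that usage model in C: the helper below takes the
void ** slot and either attaches a new per-module record or returns the one
stored earlier.  The struct my_module_data type, its fields, and the helper
name are hypothetical; only the void ** slot itself corresponds to the
library interface.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-module record owned by the application.  */
struct my_module_data
{
  int pid;                /* which process the module belongs to */
  const void *image;      /* local copy of the module's memory, if any */
  size_t image_size;
};

/* USERDATA is the void ** that libdwfl hands out, either from
   dwfl_module_info after a reporting call or as the second argument
   of a callback.  Store a fresh record in the slot if it is still
   empty, and return whatever the slot holds.  Returns NULL only if
   the allocation fails.  */
struct my_module_data *
get_or_attach_module_data (void **userdata, int pid)
{
  assert (userdata != NULL);

  if (*userdata == NULL)
    {
      struct my_module_data *md = calloc (1, sizeof *md);
      if (md == NULL)
        return NULL;
      md->pid = pid;
      *userdata = md;     /* the library never reads or writes this */
    }
  return *userdata;
}
```

With the real library, the slot for a freshly reported module would come
from something like dwfl_module_info (mod, &userdata, NULL, NULL, NULL,
NULL, NULL, NULL), and the application remains responsible for freeing
whatever it stored.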

As a reminder, the general model of libdwfl is that a reporting pass from
dwfl_report_begin (or dwfl_report_begin_add) to dwfl_report_end sets up the
current set of modules in the address space one Dwfl object describes.
The general-purpose library code itself assigns no meaning to the name of
a module at all.  The only meaning of the name is that a module is
identified by the tuple (name, start-address, end-address), which are the
arguments to dwfl_report_module.  When it's called with all three of those
matching an existing module, that means that module is still there and you
get the same pointer back.  There can be multiple modules with the same
name and different addresses.  The names are up to you and your callback
functions.  You could use the names to drive the work of the callbacks.
dwfl_linux_proc_find_elf is a kludgey callback for simple users; it uses
the module names as meaningful.  But complex applications like Frysk
should have their own data structure associated with each module and use
that to drive their callbacks.

What I would think is the natural binding of this interface to an OO
language is as follows.  The per-module object in the binding (perhaps
called Dwfl_Module, or maybe it's Dwfl::Module) wraps the C Dwfl_Module *,
has various methods corresponding to dwfl_module_* functions, and is
subclassable by users.  When a per-module object is created, it sets the C
Dwfl_Module's userdata to point to that language-binding object.  The
binding glue for the various callback functions taking arguments
(Dwfl_Module *mod, void **userdata, const char *modname, Dwarf_Addr base,
...) fetches the binding object from *userdata and passes just
(Dwfl::Module, ...)  with the meaningful arguments in that particular
callback to the language-level callback.  (In the C interface, those four
arguments are always given for convenience of writing callback functions
in C, but the last three are all things you can get from the Dwfl_Module.)
A per-module object is created by the bindings for the reporting calls,
and/or perhaps on demand in other callback bindings that get arguments
(mod, userdata, ...) with *userdata == NULL.

At the level of Java, you would subclass Dwfl::Module with a frysk.module
or whatnot that defines an interface of methods you call from your
find_elf callback wrapper.  Then you have subclasses of that implementing
those differently for different kinds of modules, like on-disk modules and
memory-only (vDSO) modules.
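
The same idea can be sketched in plain C, with a function pointer standing
in for the virtual method.  Everything named frysk_* or *_find_elf below is
hypothetical, the three typedefs are stand-ins so that the sketch compiles
without elfutils, and the -1 return values are placeholders (what find_elf
should really return is defined by libdwfl.h).  Only the dispatch through
*userdata is the point.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins so the sketch compiles without elfutils; real code would
   include <elfutils/libdwfl.h> instead of these three typedefs.  */
typedef struct Dwfl_Module Dwfl_Module;
typedef struct Elf Elf;
typedef uint64_t Dwarf_Addr;

/* Hypothetical per-module object.  The function pointer plays the role
   of a method that each kind of module implements differently.  */
struct frysk_module
{
  int (*find_elf) (struct frysk_module *self, char **file_name, Elf **elfp);
  const char *path;       /* on-disk modules: where the ELF file lives */
  Elf *image_elf;         /* memory-only modules: handle built elsewhere */
};

/* On-disk module: hand back a malloc'd copy of the file name.  */
int
disk_find_elf (struct frysk_module *self, char **file_name, Elf **elfp)
{
  (void) elfp;
  size_t len = strlen (self->path) + 1;
  char *copy = malloc (len);
  if (copy != NULL)
    memcpy (copy, self->path, len);
  *file_name = copy;
  return -1;              /* placeholder: no descriptor opened here */
}

/* Memory-only (vDSO) module: hand back an already opened Elf handle.  */
int
vdso_find_elf (struct frysk_module *self, char **file_name, Elf **elfp)
{
  (void) file_name;
  *elfp = self->image_elf;
  return -1;              /* placeholder: no descriptor involved */
}

/* Glue with the argument list of a find_elf callback: fetch the object
   from *USERDATA and call its method with the meaningful arguments.  */
int
frysk_find_elf (Dwfl_Module *mod, void **userdata, const char *modname,
                Dwarf_Addr base, char **file_name, Elf **elfp)
{
  (void) mod;
  (void) modname;
  (void) base;
  assert (userdata != NULL);

  struct frysk_module *m = *userdata;
  if (m == NULL)
    return -1;            /* nothing attached to this module yet */
  return m->find_elf (m, file_name, elfp);
}
```

The glue function is the only one that sees the C-level arguments; each
kind of module only needs to fill in its own find_elf member.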

Thanks,
Roland


* getting vDSO image into libdwfl
From: Roland McGrath @ 2007-05-18 16:53 UTC
  To: Nurdin; +Cc: Frysk List

The normal kinds of modules in a process (or a kernel) correspond to ELF
files on disk.  We satisfy the Dwfl_Callbacks.find_elf callback for those
just by opening the file by name.  The vDSO does not exist as a file on
disk, so we need to read the ELF file image directly from memory in the
debuggee address space.

The find_elf callback normally supplies a file name and most often an fd.
But it can also supply an Elf * handle already opened, and then the file
name and fd are not required.  So, a find_elf callback dealing with the
vDSO module can get a copy of the vDSO image from the debuggee address space
into local memory in the debugger and use elf_memory to create an Elf *
handle on it.

The dwfl_linux_proc_find_elf callback does this as a skeletal example.
It uses this function (see libdwfl/elf-from-memory.c):

    /* Reconstruct an ELF file by reading the segments out of remote memory
       based on the ELF file header at EHDR_VMA and the ELF program headers it
       points to.  If not null, *LOADBASEP is filled in with the difference
       between the addresses from which the segments were read, and the
       addresses the file headers put them at.

       The function READ_MEMORY is called to copy at least MINREAD and at most
       MAXREAD bytes from the remote memory at target address ADDRESS into the
       local buffer at DATA; it should return -1 for errors (with code in
       `errno'), 0 if it failed to read at least MINREAD bytes due to EOF, or
       the number of bytes read if >= MINREAD.  ARG is passed through.  */

    Elf *
    elf_from_remote_memory (GElf_Addr ehdr_vma,
			    GElf_Addr *loadbasep,
			    ssize_t (*read_memory) (void *arg, void *data,
						    GElf_Addr address,
						    size_t minread,
						    size_t maxread),
			    void *arg)

This is an internal function not exported in the DSO or declared anywhere.
I wrote it as the necessary and separable component (and analogous to the
BFD function bfd_elf_bfd_from_remote_memory I added for the gdb vDSO
support).  I tested it via dwfl_linux_proc_find_elf (-p option to
eu-addr2line et al).  (Back then, /proc/PID/mem had not been made so
"secure", so it even worked in eu-foo without ptrace.)  I figured the
details of interface to that code and how it integrates with everything
would be ironed out when a real user came along.  Now here we are.

If you read the function, you'll see a lot of libelf hooey to deal with the
format encoding in all cross-permutations, but it's not doing very much.
It just reads the ELF file header and phdrs enough to decide if the whole
file image is really in memory and exactly how big it is.  Then it reads it
all in, and calls elf_memory.

The signature of the read_memory callback is arbitrary.  
It just seemed like the roughly appropriate general thing.
It's easy to change.

This function doesn't really fit with the libelf interfaces.  It could be
exposed by libdwfl in some fashion I suppose.  I hadn't really figured out
what seemed right, which is why it's as it is.  I could just change its
name and signature slightly, and export it from libdwfl.

Alternatively, if you have Java libelf bindings for all the xlate stuff,
it's not very complex to code it up in Java.  Then you might have a cleaner
way to integrate the memory-reading stuff that might do less copying.
Maybe some existing buffer data structure of yours has a raw pointer you
can pass to elf_memory.

Finally, you can instead avoid all this ELF grovelling stuff altogether.
This certainly seems like the simplest thing for Frysk to do to start with,
so naturally I mention it last.  All the ELF header grokking is basically to
determine how big the image is (and, in the general case that does not apply
here, whether the phdrs really say that the memory image matches the whole
file).
You already know how big the mapping is (in practice so far always one page,
two on ia64, might be a few on ppc).  That's the page-rounded upper limit on
the size of the image.  On x86, the mapping is 4k (one page) and the vDSO
ELF file image is actually between 2k and 3k.  Past the end of the actual
image, the rest of the page is zero.  It doesn't hurt to give elf_memory a
buffer including this padding.  So if it's not too costly to read a whole
page of memory instead of 1/2 or 3/4 that much, you can just do that.
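
A minimal sketch of that whole-mapping approach, assuming a reader with the
same signature as the read_memory callback above.  The names
copy_whole_mapping, fake_mapping and fake_read are hypothetical, fake_read
exists only so the sketch can be exercised without a debuggee, and the
GElf_Addr typedef is a stand-in for the one in <gelf.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>          /* ssize_t */

typedef uint64_t GElf_Addr;     /* stand-in; real code: <gelf.h> */

typedef ssize_t read_memory_fn (void *arg, void *data, GElf_Addr address,
                                size_t minread, size_t maxread);

/* Copy the whole mapping [START, END) into a malloc'd buffer, zero
   padding and all.  Returns NULL on failure; on success stores the
   size in *SIZEP and returns the buffer, which the caller owns.  */
char *
copy_whole_mapping (GElf_Addr start, GElf_Addr end,
                    read_memory_fn *read_memory, void *arg, size_t *sizep)
{
  assert (read_memory != NULL && sizep != NULL);
  if (end <= start)
    return NULL;

  size_t size = (size_t) (end - start);
  char *image = malloc (size);
  if (image == NULL)
    return NULL;

  /* Ask for exactly SIZE bytes: minread == maxread.  */
  ssize_t n = read_memory (arg, image, start, size, size);
  if (n < 0 || (size_t) n != size)
    {
      free (image);
      return NULL;
    }

  *sizep = size;
  return image;
}

/* Demonstration reader only: ARG describes a local array that stands
   in for the debuggee's mapping.  */
struct fake_mapping
{
  GElf_Addr start;
  size_t size;
  const char *bytes;
};

ssize_t
fake_read (void *arg, void *data, GElf_Addr address,
           size_t minread, size_t maxread)
{
  const struct fake_mapping *m = arg;
  if (address < m->start || address - m->start >= m->size)
    return 0;
  size_t offset = (size_t) (address - m->start);
  size_t avail = m->size - offset;
  size_t n = avail < maxread ? avail : maxread;
  if (n < minread)
    return 0;
  memcpy (data, m->bytes + offset, n);
  return (ssize_t) n;
}
```

The returned buffer and size are what would then be handed to
elf_memory (image, size); the buffer has to stay allocated for as long as
the resulting Elf handle is in use.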

Thanks,
Roland


* Re: getting vDSO image into libdwfl
From: Nurdin @ 2007-05-25 16:50 UTC
  To: Roland McGrath; +Cc: Frysk List

Roland McGrath wrote:
> This function doesn't really fit with the libelf interfaces.  It could be
> exposed by libdwfl in some fashion I suppose.  I hadn't really figured out
> what seemed right, which is why it's as it is.  I could just change its
> name and signature slightly, and export it from libdwfl.
I just exposed this function in libdwfl.h, and I haven't needed to 
change the read_memory interface at all.
