example for using libdw(fl) for in-process stack unwinding?

public inbox for elfutils@sourceware.org

* example for using libdw(fl) for in-process stack unwinding?
@ 2016-06-09 15:54 Milian Wolff
From: Milian Wolff @ 2016-06-09 15:54 UTC (permalink / raw)
  To: elfutils-devel


Hey all,

from [1] I got that libdw(fl) can be used for unwinding as an alternative to 
libunwind, i.e. `dwfl_getthread_frames()`. Apparently, it is even considerably 
faster in the context of Linux perf. I'd like to try that out and compare it 
to libunwind in the context of my heaptrack tracer.

[1]: https://lwn.net/Articles/579508/

But, so far, I could not find an example for using libdw(fl) for in-process 
stack unwinding. All examples I can find out there use libunwind for the 
unwinding and libdw only for the DWARF debug information interpretation.

I've tried my shot at implementing a trivial example around 
`dwfl_getthread_frames` but struggle with the API a lot. It is quite involved, 
contrary to a simple `unw_backtrace`, or even to the manual stepping with 
libunwind over the `unw_local_addr_space`. The documentation of libdw(fl) 
often refers to terms that I have no clue about as I'm not deeply acquainted 
with the DWARF and ELF specs. Problems I'm facing are:

- Am I correct in assuming that in-process is the opposite of "offline use" 
referred to in the libdwfl API documentation?
  * If so, what should I set `Dwfl_Callbacks::section_address` to?

- How do I attach state in-process? `dwfl_attach_state` sounds like the 
correct choice, as `dwfl_linux_proc_attach` mentions ptrace which I don't 
want/need. So, assuming it's `dwfl_attach_state`:

What is the correct way to get an `Elf *` for my current executable? Do I 
really `open("/proc/self/exe")` and pass that to `elf_begin`? What Elf_Cmd 
should I use? ELF_C_READ?

How should the obligatory callbacks of Dwfl_Thread_Callbacks be implemented?

  * next_thread: I'm only interested in the current thread, so doing something 
similar to perf should be possible here
  * memory_read: just cast the address to a pointer and dereference it?
  * set_initial_registers: no clue, really

Is there an easy-to-grasp example out there somewhere for me to follow on how  
to use libdw(fl) for in-process stack unwinding? For reference, here's my 
current non-functional attempt, which you can compile with

g++ -std=c++11 -g -O0 backtrace.cpp -o backtrace -ldw -lelf

~~~~~~~~~~~~~~~~~~~~~~~
#define PACKAGE
#define PACKAGE_VERSION

#include <elfutils/libdw.h>
#include <elfutils/libdwfl.h>
#include <unistd.h>
#include <cassert>
#include <fcntl.h>

namespace dw {

    static const Dwfl_Callbacks offline_callbacks = {
        dwfl_build_id_find_elf,
        dwfl_standard_find_debuginfo,
        // TODO: we are in-process, not offline, right?
        dwfl_offline_section_address,
        nullptr,
    };

    bool set_initial_registers(Dwfl_Thread *thread, void *arg)
    {
        // TODO: what to do here?
        return true;
    }

    static pid_t next_thread(Dwfl *dwfl, void *arg, void **thread_argp)
    {
        // TODO: what to do here? Code below is copied from perf, probably wrong here
        /* We want only single thread to be processed. */
        if (*thread_argp != NULL)
                return 0;

        *thread_argp = arg;
        return dwfl_pid(dwfl);
    }

    static bool memory_read(Dwfl *dwfl, Dwarf_Addr addr, Dwarf_Word *result,
                            void *arg)
    {
        // TODO: what to do here? We are in-process, so can we just do the following?
        *result = *reinterpret_cast<Dwarf_Word*>(addr);
        return true;
    }

    static const Dwfl_Thread_Callbacks callbacks = {
        next_thread,
        nullptr,
        memory_read,
        set_initial_registers,
        nullptr,
        nullptr
    };

    int frame_callback(Dwfl_Frame* frame, void* data)
    {
        return DWARF_CB_OK;
    }

    void backtrace()
    {
        auto dwfl = dwfl_begin(&offline_callbacks);
        fprintf(stderr, "%d: %d = %s\n", __LINE__, dwfl_errno(), 
dwfl_errmsg(dwfl_errno()));
        assert(dwfl);

        // TODO: thread specific pid?
        auto pid = getpid();

        // TODO: is this the correct way to get the Elf*?
        auto fd = open("/proc/self/exe", O_RDONLY);
        auto elf = elf_begin(fd, ELF_C_READ_MMAP, nullptr);

        bool attached = dwfl_attach_state(dwfl, elf, pid, &callbacks, 
nullptr);
        fprintf(stderr, "%d: %d = %s\n", __LINE__, dwfl_errno(), 
dwfl_errmsg(dwfl_errno()));
        assert(attached);

        auto ret = dwfl_getthread_frames(dwfl, pid, &frame_callback, nullptr);
        fprintf(stderr, "%d: %d = %s\n", __LINE__, dwfl_errno(), 
dwfl_errmsg(dwfl_errno()));
        assert(ret == 0);
        dwfl_end(dwfl);
    }
}

void a()
{
    dw::backtrace();
}

void b()
{
    a();
}

void c()
{
    b();
}

int main()
{
    c();
    return 0;
}
~~~~~~~~~~~~~~~~~~~~

The output I get is:

62: 0 = (null)
73: 0 = (null)
77: 0 = No DWARF information found
backtrace: backtrace.cpp:78: void dw::backtrace(): Assertion `ret == 0' 
failed.

Any help appreciated, thanks.

-- 
Milian Wolff
mail@milianw.de
http://milianw.de


* Re: example for using libdw(fl) for in-process stack unwinding?
@ 2016-06-11 11:50 Milian Wolff
From: Milian Wolff @ 2016-06-11 11:50 UTC (permalink / raw)
  To: elfutils-devel


Hey Mark,

thanks a lot for your input and explanations.

On Friday, 10 June 2016 13:04:14 CEST Mark Wielaard wrote:

<snip>

> > How should the obligatory callbacks of Dwfl_Thread_Callbacks be
> > implemented?
> > 
> >   * next_thread: I'm only interested in the current thread, so doing
> >     something similar to perf should be possible here
> >   * memory_read: just cast the address to a pointer and dereference it?
> >   * set_initial_registers: no clue, really
> 
> This is the big issue. And we don't yet have a standard callback for
> that. Ben Gamari (added to CC) has been working on that. You can find
> his patches at https://github.com/bgamari/elfutils/commits/local-unwind
> with some discussion in the archives:
> https://lists.fedorahosted.org/archives/list/elfutils-devel@lists.fedorahosted.org/thread/VDZY5DA6QEYYXLR4NWUY77NHE43HBSKH/
> https://lists.fedorahosted.org/archives/list/elfutils-devel@lists.fedorahosted.org/thread/VFTKJQ3LS4WN3RVMZES3BOPTHR5IPHU6/
> 
> > Is there an easy-to-grasp example out there somewhere for me to follow on
> > how to use libdw(fl) for in-process stack unwinding?
> 
> Hope the above helps. But feel free to ask more questions.

It does help in the sense that I'll try Ben's patches when I get the time, or 
wait until they get upstreamed before I investigate more.

Ben, have you compared the functionality of libdw in your current patched form 
to libunwind, for process-local unwinding? If not, could you do that? I'd 
mostly be interested in some performance aspects.

Libunwind relies on dl_iterate_phdr, which can add significant overhead when 
unwinding separate threads as it induces synchronization. I hope that with 
explicitly using dwfl_report_elf I could have libdw cache everything that's 
required and only need to update the data when dlopen/dlclose are called.
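
To make the caching idea concrete, here is a minimal sketch (not taken from libunwind or elfutils) that walks the loaded objects once with glibc's `dl_iterate_phdr` and stores the result, so that the list would only have to be rebuilt after a dlopen/dlclose. The names `LoadedObject` and `snapshot_loaded_objects` are made up for illustration, and how such a snapshot would be fed into `dwfl_report_elf` is left open here.

```cpp
#include <link.h>     // dl_iterate_phdr, dl_phdr_info, ElfW (glibc)
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one loaded object (executable or shared library).
struct LoadedObject {
    std::string name;   // empty for the main executable on glibc
    ElfW(Addr) base;    // load bias of the object
};

// Callback invoked by dl_iterate_phdr once per loaded object.
static int collect(dl_phdr_info *info, std::size_t, void *data)
{
    auto *out = static_cast<std::vector<LoadedObject> *>(data);
    out->push_back({info->dlpi_name ? info->dlpi_name : "", info->dlpi_addr});
    return 0; // returning 0 continues the iteration
}

// Take one snapshot of all loaded objects; callers would cache the result
// and refresh it only when the set of loaded libraries changes.
std::vector<LoadedObject> snapshot_loaded_objects()
{
    std::vector<LoadedObject> objects;
    dl_iterate_phdr(collect, &objects);
    return objects;
}
```

Calling `snapshot_loaded_objects()` once at startup and again from a dlopen/dlclose hook would keep the iteration out of the per-unwind path; whether that actually removes the overhead described above would need measuring.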

Cheers

-- 
Milian Wolff
mail@milianw.de
http://milianw.de


* Re: example for using libdw(fl) for in-process stack unwinding?
@ 2016-06-10 11:04 Mark Wielaard
From: Mark Wielaard @ 2016-06-10 11:04 UTC (permalink / raw)
  To: elfutils-devel


On Thu, 2016-06-09 at 17:54 +0200, Milian Wolff wrote:
> from [1] I got that libdw(fl) can be used for unwinding as an alternative to 
> libunwind, i.e. `dwfl_getthread_frames()`. Apparently, it is even considerably 
> faster in the context of Linux perf. I'd like to try that out and compare it 
> to libunwind in the context of my heaptrack tracer.
> 
> [1]: https://lwn.net/Articles/579508/
> 
> But, so far, I could not find an example for using libdw(fl) for in-process 
> stack unwinding. All examples I can find out there use libunwind for the 
> unwinding and libdw only for the DWARF debug information interpretation.

The libdwfl interface was designed for out-of-process/core-file
unwinding. In theory it should also work for in-process unwinding and
there have been some attempts to make that easy. But you need to create
a Dwfl and attach state for your own process, which is not entirely
trivial.

> I've tried my shot at implementing a trivial example around 
> `dwfl_getthread_frames` but struggle with the API a lot. It is quite involved, 
> contrary to a simple `unw_backtrace`, or even to the manual stepping with 
> libunwind over the `unw_local_addr_space`. The documentation of libdw(fl) 
> often refers to terms that I have no clue about as I'm not deeply acquainted 
> with the DWARF and ELF specs. Problems I'm facing are:
> 
> - Am I correct in assuming that in-process is the opposite of "offline use" 
> referred to in the libdwfl API documentation?

Yes. offline means you report whole ELF modules and let libdwfl figure
out the in-memory layout. "online" means you report the modules given a
specific address layout (assume running process).

>   * If so, what should I set `Dwfl_Callbacks::section_address` to?

section_address can usually be NULL. It is used when dealing with ET_REL
files. Normal processes are made up of ET_EXEC and ET_DYN files
(executable and shared libraries). Dwfl can also be used to introspect
the kernel. Modules in the kernel are ET_REL which need special rules
for memory layout. Normal processes don't need it.

> - How do I attach state in-process? `dwfl_attach_state` sounds like the 
> correct choice, as `dwfl_linux_proc_attach` mentions ptrace which I don't 
> want/need. So, assuming it's `dwfl_attach_state`:
> 
> What is the correct way to get an `Elf *` for my current executable? Do I 
> really `open("/proc/self/exe")` and pass that to `elf_begin`? What Elf_Cmd 
> should I use? ELF_C_READ?

That should work. But you should also be able to just use
dwfl_linux_proc_report (dwfl, getpid()) or even dwfl_linux_proc_attach
(dwfl, getpid(), false) which reconstructs the whole process layout
from /proc/pid/maps. The latter is used, for example, in
tests/dwfl-proc-attach.c.

> How should the obligatory callbacks of Dwfl_Thread_Callbacks be implemented?
> 
>   * next_thread: I'm only interested in the current thread, so doing something 
> similar to perf should be possible here
>   * memory_read: just cast the address to a pointer and dereference it?
>   * set_initial_registers: no clue, really

This is the big issue. And we don't yet have a standard callback for
that. Ben Gamari (added to CC) has been working on that. You can find
his patches at https://github.com/bgamari/elfutils/commits/local-unwind
with some discussion in the archives:
https://lists.fedorahosted.org/archives/list/elfutils-devel@lists.fedorahosted.org/thread/VDZY5DA6QEYYXLR4NWUY77NHE43HBSKH/
https://lists.fedorahosted.org/archives/list/elfutils-devel@lists.fedorahosted.org/thread/VFTKJQ3LS4WN3RVMZES3BOPTHR5IPHU6/
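
On the narrower memory_read question, the following is only a sketch of what the body of such a callback might do in-process, assuming a 64-bit target where `Dwarf_Word` is a 64-bit unsigned integer. `read_word_in_process` is a hypothetical stand-in, not a libdwfl function; it uses `memcpy` instead of a pointer cast to avoid alignment and aliasing problems. It does not protect against unmapped addresses, which would still crash the process.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the body of a memory_read callback when the
// target is the calling process itself: copy one 64-bit word from addr.
// Assumption: the address is mapped and readable; nothing here checks that.
bool read_word_in_process(std::uint64_t addr, std::uint64_t *result)
{
    if (addr == 0 || result == nullptr)
        return false; // reject the obviously invalid cases only

    // memcpy tolerates unaligned addresses, unlike a direct dereference.
    std::memcpy(result,
                reinterpret_cast<const void *>(
                    static_cast<std::uintptr_t>(addr)),
                sizeof *result);
    return true;
}
```

A real callback with the `Dwfl_Thread_Callbacks::memory_read` signature could forward its `addr` and `result` arguments to a helper like this one.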

> Is there an easy-to-grasp example out there somewhere for me to follow on how  
> to use libdw(fl) for in-process stack unwinding?

Hope the above helps. But feel free to ask more questions.

Cheers,

Mark

