* Handling pgoff in perf elf mmap/mmap2 elf info
@ 2018-09-19 12:12 Christoph Sterz
2018-09-19 12:24 ` Ulf Hermann
0 siblings, 1 reply; 17+ messages in thread
From: Christoph Sterz @ 2018-09-19 12:12 UTC (permalink / raw)
To: elfutils-devel
[-- Attachment #1: Type: text/plain, Size: 2186 bytes --]
Hi,
I work on Hotspot[1], an open-source Linux perf aggregator and visualizer.
For this we use perfparser[2], which in turn uses libdw for unwinding.
Recently, we found more and more perf trace-files to use the 'pgoff'
field [3].
This happens especially on newer distros (Arch, openSUSE Tumbleweed).
We suspect perf to offset its recording-addresses of mmapped
dsos/executables starting with a specific section, such that they denote
their pointers with this pg_offset parameter. (e.g. skipping a library's
header and setting pgoff to the headersize). Although we are not 100%
sure about this information.
The function I am using here is:
extern Dwfl_Module *dwfl_report_elf (Dwfl *dwfl, const char *name,
const char *file_name, int fd,
GElf_Addr base, bool add_p_vaddr);
and the specific call I am making is:
Dwfl_Module *ret = dwfl_report_elf(
m_dwfl, info.originalFileName.constData(),
info.localFile.absoluteFilePath().toLocal8Bit().constData(), -1,
info.addr,
false);
and I am wondering how to include the pgoff here.
Simply subtracting it from info.addr results in a lot of "address range
overlaps an existing module" errors, where I guess I subtracted too
much. I know pgoff is in bytes.
I also tried adding the offset, which produces overlap errors as well.
Ignoring the offset results in errors where perfparser fails to find ELF
for instruction pointer addresses.
I would be happy to hear if anyone has experience unwinding with these
offsets.
Maybe there is a different function I should use reporting the elf.
Maybe someone has even unwound/parsed perf data before.
Thanks,
Christoph
[1] https://github.com/KDAB/hotspot
[2] http://code.qt.io/cgit/qt-creator/perfparser.git/
[3] sparse info at
http://man7.org/linux/man-pages/man2/perf_event_open.2.html
--
Christoph Sterz | christoph.sterz@kdab.com | Software Engineer
KDAB (Deutschland) GmbH, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt, C++ and OpenGL Experts
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4003 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
2018-09-19 12:12 Handling pgoff in perf elf mmap/mmap2 elf info Christoph Sterz
@ 2018-09-19 12:24 ` Ulf Hermann
2018-09-21 13:07 ` Mark Wielaard
0 siblings, 1 reply; 17+ messages in thread
From: Ulf Hermann @ 2018-09-19 12:24 UTC (permalink / raw)
To: elfutils-devel
> We suspect perf to offset its recording-addresses of mmapped
> dsos/executables starting with a specific section, such that they denote
> their pointers with this pg_offset parameter. (e.g. skipping a library's
> header and setting pgoff to the headersize). Although we are not 100%
> sure about this information.
According to my understanding, the pgoff is not perf's invention. Rather, the library loader for the target application does not mmap() the full ELF file, but only the parts it's interested in. Those partial mappings are then reported through perf. We then try to recreate the memory mapping with perfparser, but run into problems because dwfl_report_elf() doesn't let us do partial mappings. You can only map complete files with that function. There probably is some way to manually map the relevant sections using other functions in libdw and libelf, but I haven't figured out how to do this yet. If there is a simple trick I'm missing, I'd be happy to hear about it.
And, yes, a function that works like dwfl_report_elf, but takes a pgoff and length as additional parameters is sorely missing from the API.
best regards,
Ulf
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
2018-09-19 12:24 ` Ulf Hermann
@ 2018-09-21 13:07 ` Mark Wielaard
2018-09-21 13:35 ` Ulf Hermann
2018-09-26 14:38 ` Milian Wolff
0 siblings, 2 replies; 17+ messages in thread
From: Mark Wielaard @ 2018-09-21 13:07 UTC (permalink / raw)
To: Ulf Hermann, elfutils-devel; +Cc: Christoph Sterz
On Wed, 2018-09-19 at 14:24 +0200, Ulf Hermann wrote:
> > We suspect perf to offset its recording-addresses of mmapped
> > dsos/executables starting with a specific section, such that they
> > denote
> > their pointers with this pg_offset parameter. (e.g. skipping a
> > library's
> > header and setting pgoff to the headersize). Although we are not
> > 100%
> > sure about this information.
>
> According to my understanding, the pgoff is not perf's invention.
> Rather, the library loader for the target application does not mmap()
> the full ELF file, but only the parts it's interested in. Those
> partial mappings are then reported through perf.
OK, so pgoff is like the offset argument of the mmap call?
Is it just recording all user space mmap events that have PROT_EXEC in
their prot argument? What about if the mapping was later changed with
mprotect? Or does PERF_RECORD_MMAP only map to some internal kernel
mmap action?
> We then try to recreate the memory mapping with perfparser, but run
> into problems because dwfl_report_elf() doesn't let us do partial
> mappings. You can only map complete files with that function. There
> probably is some way to manually map the relevant sections using
> other functions in libdw and libelf, but I haven't figured out how to
> do this, yet. If there is a simple trick I'm missing, I'd be happy to
> hear about it.
>
> And, yes, a function that works like dwfl_report_elf, but takes a
> pgoff and length as additional parameters is sorely missing from the
> API.
dwfl_report_elf indeed does assume the whole ELF file is mapped in
according to the PHDRs in the file and the given base address. But what
you are actually seeing (I think, depending on the answers on the
questions above) is the dynamic loader mapping in the file in pieces.
And so you would like an interface where you can report the module
piece wise while it is being mapped in. So what would be most
convenient would be some kind of dwfl_report_elf_mmap function that you
can use to either get a new Dwfl_Module or to extend an existing one.
I have to think how that interacts with mprotect and munmap.
Cheers,
Mark
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
2018-09-21 13:07 ` Mark Wielaard
@ 2018-09-21 13:35 ` Ulf Hermann
2018-09-26 14:38 ` Milian Wolff
1 sibling, 0 replies; 17+ messages in thread
From: Ulf Hermann @ 2018-09-21 13:35 UTC (permalink / raw)
To: Mark Wielaard, elfutils-devel; +Cc: Christoph Sterz
> OK, so pgoff is like the offset argument of the mmap call?
As far as I understand it, yes.
> Is it just recording all user space mmap events that have PROT_EXEC in
> their prot argument?
It just records all mmap events, also the ones without PROT_EXEC.
We check in perfparser if the file in question is supposed to be an elf
file and consider everything that looks like one for later reporting to
dwfl. We then report modules on demand, though. So, if no sample ever
touches a module, the module doesn't get reported. Mmaps without
PROT_EXEC shouldn't show up in any samples, except if the trace data is
bad in some other way. But then, if we reported to dwfl a module that
isn't actually linked in the target application but still mapped to a
place in memory, that wouldn't disturb the unwinding for other modules.
Would it?
We can probably determine the initial PROT_EXEC state for some mmaps,
though. There are two possible variants of mmap events in the perf
trace, one of which has a "prot" field. However, I don't know if we can
rely on that being present in traces from the systems in question.
> What about if the mapping was later changed with mprotect?
We don't get separate events for mprotect, so the "prot" field is
probably worthless.
> Or does PERF_RECORD_MMAP only map to some internal kernel mmap action?
I don't know exactly where the mmap events are generated, but it has to
be somewhere in the kernel. That is how perf_event_open operates, and,
by extension, how perf gets its data. We might get some synthesized mmap
events generated from the current memory map of an application when we
attach while it's already running. I don't quite know how that works.
> dwfl_report_elf indeed does assume the whole ELF file is mapped in
> according to the PHDRs in the file and the given base address. But what
> you are actually seeing (I think, depending on the answers on the
> questions above) is the dynamic loader mapping in the file in pieces.
Yes, that is my interpretation of the (limited) data I have. Maybe
Christoph can add something here.
> And so you would like an interface where you can report the module
> piece wise while it is being mapped in. So what would be most
> convenient would be some kind of dwfl_report_elf_mmap function that you
> can use to either get a new Dwfl_Module or to extend an existing one.
That sounds about right.
> I have to think how that interacts with mprotect and munmap.
We don't get any extra events for mprotect and munmap. If we detect an
address space conflict between different modules, we just throw out the
dwfl state and restart the reporting. (I know, there is a potential for
optimization here: We could use the callback to dwfl_report_end and only
throw out the conflicting modules.)
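
For illustration, here is a minimal sketch of the conflict check described above, assuming each reported module is tracked as a half-open address range [start, end). The `Range` type and the `find_conflict` helper are hypothetical names made up for this example; they are not perfparser or libdwfl API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Half-open address range [start, end) occupied by one reported module.
struct Range {
    std::uint64_t start;
    std::uint64_t end;
};

// Two half-open ranges overlap when each one starts before the other ends.
bool overlaps(const Range &a, const Range &b)
{
    return a.start < b.end && b.start < a.end;
}

// Returns a pointer to the first already-reported range that conflicts
// with `candidate`, or nullptr if the candidate fits without overlap.
const Range *find_conflict(const std::vector<Range> &reported,
                           const Range &candidate)
{
    assert(candidate.start < candidate.end);
    for (const Range &r : reported) {
        if (overlaps(r, candidate))
            return &r;
    }
    return nullptr;
}
```

A caller could use the returned pointer to drop only the conflicting module instead of discarding the whole state, which is the optimization mentioned in the parenthesis.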
cheers,
Ulf
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
2018-09-21 13:07 ` Mark Wielaard
2018-09-21 13:35 ` Ulf Hermann
@ 2018-09-26 14:38 ` Milian Wolff
2018-10-09 20:33 ` Milian Wolff
1 sibling, 1 reply; 17+ messages in thread
From: Milian Wolff @ 2018-09-26 14:38 UTC (permalink / raw)
To: elfutils-devel; +Cc: Mark Wielaard, Ulf Hermann, Christoph Sterz
On Friday, September 21, 2018 3:07:29 PM CEST Mark Wielaard wrote:
> On Wed, 2018-09-19 at 14:24 +0200, Ulf Hermann wrote:
> > > We suspect perf to offset its recording-addresses of mmapped
> > > dsos/executables starting with a specific section, such that they
> > > denote
> > > their pointers with this pg_offset parameter. (e.g. skipping a
> > > library's
> > > header and setting pgoff to the headersize). Although we are not
> > > 100%
> > > sure about this information.
> >
> > According to my understanding, the pgoff is not perf's invention.
> > Rather, the library loader for the target application does not mmap()
> > the full ELF file, but only the parts it's interested in. Those
> > partial mappings are then reported through perf.
>
> OK, so pgoff is like the offset argument of the mmap call?
> Is it just recording all user space mmap events that have PROT_EXEC in
> their prot argument? What about if the mapping was later changed with
> mprotect? Or does PERF_RECORD_MMAP only map to some internal kernel
> mmap action?
>
> > We then try to recreate the memory mapping with perfparser, but run
> >
> > into problems because dwfl_report_elf() doesn't let us do partial
> > mappings. You can only map complete files with that function. There
> > probably is some way to manually map the relevant sections using
> > other functions in libdw and libelf, but I haven't figured out how to
> > do this, yet. If there is a simple trick I'm missing, I'd be happy to
> > hear about it.
> >
> > And, yes, a function that works like dwfl_report_elf, but takes a
> > pgoff and length as additional parameters is sorely missing from the
> > API.
>
> dwfl_report_elf indeed does assume the whole ELF file is mapped in
> according to the PHDRs in the file and the given base address. But what
> you are actually seeing (I think, depending on the answers on the
> questions above) is the dynamic loader mapping in the file in pieces.
> And so you would like an interface where you can report the module
> piece wise while it is being mapped in. So what would be most
> convenient would be some kind of dwfl_report_elf_mmap function that you
> can use to either get a new Dwfl_Module or to extend an existing one.
>
> I have to think how that interacts with mprotect and munmap.
Hey Mark,
I can only second what Christoph and Ulf said so far. I want to add though
that this limitation essentially makes elfutils unusable for usage in perf.
I.e., perf can be built with either libunwind or elfutils for unwinding
callstacks. Using the former works like a charm. The latter runs into the same
problems as perfparser / hotspot. To reproduce, use a modern distribution
with recent userland and kernel, then do something like this:
~~~~~
$ cat test.cpp
#include <cmath>
#include <complex>
#include <iostream>
#include <random>

using namespace std;

int main()
{
    uniform_real_distribution<double> uniform(-1E5, 1E5);
    default_random_engine engine;
    double s = 0;
    for (int i = 0; i < 10000000; ++i) {
        s += norm(complex<double>(uniform(engine), uniform(engine)));
    }
    cout << s << '\n';
    return 0;
}
$ g++ -O2 -g test.cpp -o test
$ perf record --call-graph dwarf ./test
$ perf report --stdio -vv
...
overlapping maps:
55a1cdd87000-55a1cdd89000 1000 /ssd/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/test
55a1cdd85000-55a1cdd89000 0 /ssd/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/test
55a1cdd85000-55a1cdd87000 0 /ssd/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/test
overlapping maps:
7fdda4690000-7fdda46af000 2000 /usr/lib/ld-2.28.so
7fdda468e000-7fdda46ba000 0 /usr/lib/ld-2.28.so
7fdda468e000-7fdda4690000 0 /usr/lib/ld-2.28.so
7fdda46af000-7fdda46ba000 0 /usr/lib/ld-2.28.so
...
57.14% 0.00% test [unknown] [.] 0xe775b17d50ae8cff
|
---0xe775b17d50ae8cff
~~~~~
To build perf against elfutils, do:
~~~~~
git clone --branch=perf/core git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git
cd linux/tools/perf
make NO_LIBUNWIND=1
~~~~~
The perf binary in the last folder can then be used as a drop-in replacement.
Since I consider this issue a serious blocker, I would like to see it fixed
sooner rather than later. Would it maybe be possible for you to create a proof
of concept for the new proposed dwfl_report_elf_mmap? I can then try to take
it from there to fill in the missing bits and pieces and to make it actually
work for our purposes.
Thanks
--
Milian Wolff
mail@milianw.de
http://milianw.de
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
2018-09-26 14:38 ` Milian Wolff
@ 2018-10-09 20:33 ` Milian Wolff
2018-10-11 17:02 ` Ulf Hermann
0 siblings, 1 reply; 17+ messages in thread
From: Milian Wolff @ 2018-10-09 20:33 UTC (permalink / raw)
To: elfutils-devel; +Cc: Mark Wielaard, Ulf Hermann, Christoph Sterz
[-- Attachment #1: Type: text/plain, Size: 6144 bytes --]
Hey Mark,
any news on this? Today, I spent some time reading through dwfl_report_elf.c
and it's far from trivial for me to go from here to a dwfl_report_elf_mmap or
similar.
If you are unable to work on a POC, can you guide us a bit please? Here are
some questions from my side:
- The address mapping seems to be handled via dwfl_report_module, so do we
want to call this once per mmap we encounter?
- If so, then maybe what's actually missing is some API to allow *setting* an
Elf for a Dwfl_Module, i.e. such that dwfl_report_module becomes useful for
public consumption? This would basically boil down to two functions: One to
open an elf file as is done currently in dwfl_report_elf/__libdw_open_file.
Then another function to assign the opened Elf to the Dwfl_Module, cf. the
tail of __libdwfl_report_elf below the call to __libdwfl_elf_address_range.
- But then, in such a per-mmap module, what values would one set for the
following properties?
m->main.vaddr = vaddr;
m->main.address_sync = address_sync;
m->main_bias = bias;
Do we set the same values everywhere, or do these values then depend on which
mmap section we are looking at? Generally, I still don't have a good enough
knowledge of the Elf API nor elfutils to really know what I'm talking about
here...
Thanks
--
Milian Wolff
mail@milianw.de
http://milianw.de
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
2018-10-09 20:33 ` Milian Wolff
@ 2018-10-11 17:02 ` Ulf Hermann
2018-10-11 17:37 ` Mark Wielaard
0 siblings, 1 reply; 17+ messages in thread
From: Ulf Hermann @ 2018-10-11 17:02 UTC (permalink / raw)
To: Milian Wolff, elfutils-devel; +Cc: Mark Wielaard, Christoph Sterz
Hi Milian,
is there any pattern in how the loader maps the ELF sections into
memory? What sections does it actually map and which of those do we need
for unwinding?
I hope that only one of those MMAPs per ELF is actually meaningful and
we can simply add that one's pgoff as an extra member to Dwfl_Module and
use it whenever we poke the underlying file.
br,
Ulf
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
2018-10-11 17:02 ` Ulf Hermann
@ 2018-10-11 17:37 ` Mark Wielaard
2018-10-11 18:14 ` Milian Wolff
0 siblings, 1 reply; 17+ messages in thread
From: Mark Wielaard @ 2018-10-11 17:37 UTC (permalink / raw)
To: Ulf Hermann; +Cc: Milian Wolff, elfutils-devel, Christoph Sterz
Hi,
My apologies for not having looked deeper at this.
It is a bit tricky and I just didn't have enough time to
really sit down and think it all through yet.
On Thu, Oct 11, 2018 at 05:02:18PM +0000, Ulf Hermann wrote:
> is there any pattern in how the loader maps the ELF sections into
> memory? What sections does it actually map and which of those do we need
> for unwinding?
Yes, it would be helpful to have some examples of mmap events plus
the associated segment header (eu-readelf -l) of the ELF file.
Note that the kernel and dynamic loader will use the (PT_LOAD) segments,
not the sections, to map things into memory. Each segment might contain
multiple sections.
libdwfl then tries to associate the correct sections (and address bias)
with how the ELF file was mapped into memory.
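
To make the segment-versus-section point concrete, here is a hedged sketch of how one PT_LOAD segment would typically translate into the address and file offset of an mmap call. It assumes the loader rounds both the virtual address and the file offset down to a page boundary. `LoadSegment`, `ExpectedMmap` and `expected_mmap` are hypothetical names for this example, and the numbers in the usage note are made up.

```cpp
#include <cassert>
#include <cstdint>

// The subset of an ELF program header that matters here.
struct LoadSegment {
    std::uint64_t p_offset; // file offset of the segment
    std::uint64_t p_vaddr;  // link-time virtual address of the segment
};

// What an mmap event for that segment would be expected to carry.
struct ExpectedMmap {
    std::uint64_t addr;  // start address of the mapping
    std::uint64_t pgoff; // file offset passed to mmap()
};

// Round `value` down to a multiple of `page_size` (a power of two).
std::uint64_t page_down(std::uint64_t value, std::uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return value & ~(page_size - 1);
}

// Mapping one PT_LOAD segment at load bias `bias`: both the address and
// the file offset are rounded down to a page boundary (an assumption
// about typical loader behaviour, not something libdwfl guarantees).
ExpectedMmap expected_mmap(const LoadSegment &seg, std::uint64_t bias,
                           std::uint64_t page_size)
{
    return ExpectedMmap{bias + page_down(seg.p_vaddr, page_size),
                        page_down(seg.p_offset, page_size)};
}
```

With a made-up segment at p_offset 0x17e5a0 and p_vaddr 0x17f5a0, a bias of 0x7fac5ec0b000 and 4 KiB pages, this yields addr 0x7fac5ed8a000 and pgoff 0x17e000. Note that addr minus pgoff is then not equal to the bias, because p_vaddr and p_offset differ.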
> I hope that only one of those MMAPs per ELF is actually meaningful and
> we can simply add that one's pgoff as an extra member to Dwfl_Module and
> use it whenever we poke the underlying file.
One "trick" might be to just subtract the pgoff from the load address.
And so report as if the ELF file was being mapped from the start. This
isn't really correct, but it might be interesting to see if that makes
libdwfl able to just associate the whole ELF file with the correct
address map.
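
As a sketch only, the trick amounts to the following arithmetic. `base_from_partial_mapping` is a hypothetical helper name, and the result is only meaningful under the assumption that file offsets and virtual addresses advance in lockstep for the mapping in question.

```cpp
#include <cassert>
#include <cstdint>

// Address at which file offset 0 would sit if the whole file had been
// mapped contiguously, derived from one partial mapping (addr, pgoff).
std::uint64_t base_from_partial_mapping(std::uint64_t addr,
                                        std::uint64_t pgoff)
{
    // A file offset larger than the mapping address would wrap around.
    assert(pgoff <= addr);
    return addr - pgoff;
}
```

The returned value is what would be passed as the base argument of dwfl_report_elf in place of the raw mmap address.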
Cheers,
Mark
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
2018-10-11 17:37 ` Mark Wielaard
@ 2018-10-11 18:14 ` Milian Wolff
2018-10-15 20:39 ` Milian Wolff
2018-10-15 20:48 ` Milian Wolff
0 siblings, 2 replies; 17+ messages in thread
From: Milian Wolff @ 2018-10-11 18:14 UTC (permalink / raw)
To: Mark Wielaard; +Cc: Ulf Hermann, elfutils-devel, Christoph Sterz
[-- Attachment #1: Type: text/plain, Size: 2363 bytes --]
On Thursday, 11 October 2018 19:37:07 CEST Mark Wielaard wrote:
> Hi,
>
> My apologies for not having looked deeper at this.
> It is a bit tricky and I just didn't have enough time to
> really sit down and think it all through yet.
>
> On Thu, Oct 11, 2018 at 05:02:18PM +0000, Ulf Hermann wrote:
> > is there any pattern in how the loader maps the ELF sections into
> > memory? What sections does it actually map and which of those do we need
> > for unwinding?
>
> Yes, it would be helpful to have some examples of mmap events plus
> the associated segment header (eu-readelf -l) of the ELF file.
>
> Note that the kernel and dynamic loader will use the (PT_LOAD) segments,
> not the sections, to map things into memory. Each segment might contain
> multiple sections.
>
> libdwfl then tries to associate the correct sections (and address bias)
> with how the ELF file was mapped into memory.
>
> > I hope that only one of those MMAPs per ELF is actually meaningful and
> > we can simply add that one's pgoff as an extra member to Dwfl_Module and
> > use it whenever we poke the underlying file.
>
> One "trick" might be to just subtract the pgoff from the load address.
> And so report as if the ELF file was being mapped from the start. This
> isn't really correct, but it might be interesting to see if that makes
> libdwfl able to just associate the whole ELF file with the correct
> address map.
I'll try to come up with some minimal code examples we can use to test all of
this. But from what I remember, neither of the above suggestions will be
sufficient as we can still run into overlapping module errors from elfutils
when we always load everything. I.e. I believe we've seen mappings that
eventually become partially obsoleted by a future mmap event. At that point,
we somehow need to be able to only map parts of a file, not all of it. So just
subtracting or honoring pgoff is not enough, I believe we also need to be able
to explicitly say how much of a file to map.
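
A minimal sketch of what "partially obsoleted" means in terms of address ranges, assuming half-open ranges [start, end): when a newer mapping lands on top of an older one, only the uncovered pieces of the older one remain valid. `Range` and `remaining_after` are hypothetical names for this example, not elfutils API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Half-open address range [start, end).
struct Range {
    std::uint64_t start;
    std::uint64_t end;
};

// Returns the parts of `old_map` that are not covered by `new_map`:
// zero, one or two pieces, in ascending address order.
std::vector<Range> remaining_after(const Range &old_map, const Range &new_map)
{
    assert(old_map.start < old_map.end && new_map.start < new_map.end);
    std::vector<Range> pieces;
    // No overlap at all: the old mapping survives unchanged.
    if (new_map.end <= old_map.start || old_map.end <= new_map.start) {
        pieces.push_back(old_map);
        return pieces;
    }
    // Part of the old mapping below the new one.
    if (old_map.start < new_map.start)
        pieces.push_back(Range{old_map.start, new_map.start});
    // Part of the old mapping above the new one.
    if (new_map.end < old_map.end)
        pieces.push_back(Range{new_map.end, old_map.end});
    return pieces;
}
```

Each surviving piece would also need its own file offset (the old pgoff plus the distance of the piece from the old start), which is why an address alone is not enough to describe it.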
But to make this discussion easier to follow for others, I'll create some
standalone cpp code that takes a `perf script --show-mmap-events | grep
PERF_RECORD_MMAP` input file and then runs this through elfutils API to
reproduce the issues we are facing.
I'll get back to you all once this is done.
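
As a rough idea of the input side of such a reproducer, the sketch below parses one line of that output. It assumes the line has the layout "... [ADDR(LEN) @ PGOFF ...]: PROT PATH"; this layout is an assumption about perf's text output, not a documented format, and `MmapEvent` and `parse_mmap_line` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

struct MmapEvent {
    std::uint64_t addr;
    std::uint64_t len;
    std::uint64_t pgoff;
    std::string path;
};

// Parses "... [ADDR(LEN) @ PGOFF ...]: PROT PATH". Returns false and
// leaves `out` untouched if the line does not match that assumed layout.
bool parse_mmap_line(const std::string &line, MmapEvent &out)
{
    const std::string::size_type open = line.find('[');
    if (open == std::string::npos)
        return false;
    const std::string::size_type close = line.find("]: ", open);
    if (close == std::string::npos)
        return false;

    unsigned long long addr = 0, len = 0, pgoff = 0;
    // %llx accepts an optional 0x prefix, so "0" and "0x89000" both parse.
    if (std::sscanf(line.c_str() + open, "[%llx(%llx) @ %llx",
                    &addr, &len, &pgoff) != 3)
        return false;

    // Skip the protection field; the rest of the line is taken as the path.
    const std::string::size_type space = line.find(' ', close + 3);
    if (space == std::string::npos || space + 1 >= line.size())
        return false;

    out.addr = addr;
    out.len = len;
    out.pgoff = pgoff;
    out.path = line.substr(space + 1);
    assert(!out.path.empty()); // guaranteed by the bounds check above
    return true;
}
```

Lines that do not contain the bracketed address block are rejected, so the function can be applied to unfiltered input as well.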
Cheers
--
Milian Wolff
mail@milianw.de
http://milianw.de
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
2018-10-11 18:14 ` Milian Wolff
@ 2018-10-15 20:39 ` Milian Wolff
2018-10-15 21:05 ` Mark Wielaard
` (2 more replies)
2018-10-15 20:48 ` Milian Wolff
1 sibling, 3 replies; 17+ messages in thread
From: Milian Wolff @ 2018-10-15 20:39 UTC (permalink / raw)
To: elfutils-devel; +Cc: Mark Wielaard, Ulf Hermann, Christoph Sterz
[-- Attachment #1: Type: text/plain, Size: 3966 bytes --]
Hey all,
here's one example of mmap events recorded by perf:
0x7fac5ec0b000 to 0x7fac5ed9a000, len = 0x18f000, offset = 0
r--p /usr/lib/libstdc++.so.6.0.25
0x7fac5ec94000 to 0x7fac5ed8a000, len = 0xf6000, offset = 0x89000
---p /usr/lib/libstdc++.so.6.0.25
0x7fac5ec94000 to 0x7fac5ed4c000, len = 0xb8000, offset = 0x89000
r-xp /usr/lib/libstdc++.so.6.0.25
0x7fac5ed4c000 to 0x7fac5ed89000, len = 0x3d000, offset = 0x141000
r--p /usr/lib/libstdc++.so.6.0.25
0x7fac5ed8a000 to 0x7fac5ed97000, len = 0xd000, offset = 0x17e000
rw-p /usr/lib/libstdc++.so.6.0.25
this is noteworthy in multiple ways:
- the first mapping we receive is for pgoff = 0 for the full file size aligned
to the page boundary
- the first mapping isn't executable yet
- the last mappings have a huge offset which actually lies beyond the
initially mmapped region?!
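
To see what these numbers imply, the sketch below subtracts pgoff from the start address of each of the five events listed above and collects the distinct results. The event values are copied from the list; `MmapEvent`, `libstdcxx_events` and `implied_file_starts` are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

struct MmapEvent {
    std::uint64_t addr;
    std::uint64_t len;
    std::uint64_t pgoff;
};

// The five libstdc++ mappings from the list above.
std::vector<MmapEvent> libstdcxx_events()
{
    return {
        {0x7fac5ec0b000ULL, 0x18f000ULL, 0x0ULL},
        {0x7fac5ec94000ULL, 0xf6000ULL, 0x89000ULL},
        {0x7fac5ec94000ULL, 0xb8000ULL, 0x89000ULL},
        {0x7fac5ed4c000ULL, 0x3d000ULL, 0x141000ULL},
        {0x7fac5ed8a000ULL, 0xd000ULL, 0x17e000ULL},
    };
}

// Distinct values of addr - pgoff across all events, i.e. where file
// offset 0 would sit if each mapping were extended back to the file start.
std::set<std::uint64_t> implied_file_starts(const std::vector<MmapEvent> &events)
{
    std::set<std::uint64_t> starts;
    for (const MmapEvent &e : events) {
        assert(e.pgoff <= e.addr);
        starts.insert(e.addr - e.pgoff);
    }
    return starts;
}
```

For these events the set has two members: the first four mappings agree on 0x7fac5ec0b000, while the rw-p mapping implies 0x7fac5ec0c000, one page higher. That would be consistent with a writable segment whose virtual address and file offset differ by a page, so subtracting pgoff does not give one uniform base here.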
And to make things worse, when we report the file at address 0x7fac5ec0b000
via dwfl, we get:
reported module /usr/lib/libstdc++.so.6.0.25
expected: 0x7fac5ec0b000 to 0x7fac5ed9a000 (0x18f000)
actual: 0x7fac5ec0b000 to 0x7fac5ed99640 (0x18e640)
So now dwfl won't ever be able to map any addresses into this module when they
come after 0x7fac5ed99640, but the mmap events above seem to indicate that
this could be possible?
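
One possible reading of the 0x18f000 versus 0x18e640 difference is plain page rounding: the mmap length is a whole number of pages, while the reported module presumably ends where the last segment ends. The sketch below only checks that arithmetic, assuming 4 KiB pages; `page_up` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Round `value` up to a multiple of `page_size` (a power of two).
std::uint64_t page_up(std::uint64_t value, std::uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (value + page_size - 1) & ~(page_size - 1);
}
```

Rounding the actual size 0x18e640 up to a 4 KiB boundary gives 0x18f000, the expected size, so the missing 0x9c0 bytes would be padding in the final page under this assumption.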
I'll now upload my code to enable you all to play around with this yourself.
Bye
--
Milian Wolff
mail@milianw.de
http://milianw.de
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
2018-10-11 18:14 ` Milian Wolff
2018-10-15 20:39 ` Milian Wolff
@ 2018-10-15 20:48 ` Milian Wolff
1 sibling, 0 replies; 17+ messages in thread
From: Milian Wolff @ 2018-10-15 20:48 UTC (permalink / raw)
To: elfutils-devel; +Cc: Mark Wielaard, Ulf Hermann, Christoph Sterz
[-- Attachment #1: Type: text/plain, Size: 2941 bytes --]
On Donnerstag, 11. Oktober 2018 20:14:43 CEST Milian Wolff wrote:
> On Donnerstag, 11. Oktober 2018 19:37:07 CEST Mark Wielaard wrote:
> > Hi,
> >
> > My apologies for not having looked deeper at this.
> > It is a bit tricky and I just didnt have enough time to
> > really sit down and think it all through yet.
> >
> > On Thu, Oct 11, 2018 at 05:02:18PM +0000, Ulf Hermann wrote:
> > > is there any pattern in how the loader maps the ELF sections into
> > > memory? What sections does it actually map and which of those do we need
> > > for unwinding?
> >
> > Yes, it would be helpful to have some examples of mmap events plus
> > the associated segment header (eu-readelf -l) of the ELF file.
> >
> > Note that the kernel and dynamic loader will use the (PT_LOAD) segments,
> > not the sections, to map things into memory. Each segment might contain
> > multiple sections.
> >
> > libdwfl then tries to associate the correct sections (and address bias)
> > with how the ELF file was mapped into memory.
> >
> > > I hope that only one of those MMAPs per ELF is actually meaningful and
> > > we can simply add that one's pgoff as an extra member to Dwfl_Module and
> > > use it whenever we poke the underlying file.
> >
> > One "trick" might be to just subtract the pgoff from the load address.
> > And so report as if the ELF file was being mapped from the start. This
> > isn't really correct, but it might be interesting to see if that makes
> > libdwfl able to just associate the whole ELF file with the correct
> > address map.
>
> I'll try to come up with some minimal code examples we can use to test all
> of this. But from what I remember, neither of the above suggestions will be
> sufficient as we can still run into overlapping module errors from elfutils
> when we always load everything. I.e. I believe we've seen mappings that
> eventually become partially obsoleted by a future mmap event. At that
> point, we somehow need to be able to only map parts of a file, not all of
> it. So just subtracting or honoring pgoff is not enough, I believe we also
> need to be able to explicitly say how much of a file to map.
>
> But to make this discussion easier to follow for others, I'll create some
> standalone cpp code that takes a `perf script --show-mmap-events | grep
> PERF_RECORD_MMAP` input file and then runs this through elfutils API to
> reproduce the issues we are facing.
>
> I'll get back to you all once this is done.
I've pushed a preliminary POC for a reproducer:
https://github.com/milianw/perf_mmaps_to_elfutils
Note that it's not really exhibiting any dwfl errors as-is. We would need to
feed it also all the instruction pointer addresses that perf encounters, then
try to find the matching module via libdwfl. This isn't easily done, and I
hope the current output already exemplifies some of the issues with the
current libdwfl API.
Thanks
--
Milian Wolff
mail@milianw.de
http://milianw.de
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
From: Mark Wielaard @ 2018-10-15 21:05 UTC (permalink / raw)
To: Milian Wolff, elfutils-devel; +Cc: Ulf Hermann, Christoph Sterz
Hi Milian,
On Mon, 2018-10-15 at 22:38 +0200, Milian Wolff wrote:
> here's one example of mmap events recorded by perf:
>
> 0x7fac5ec0b000 to 0x7fac5ed9a000, len = 0x18f000, offset = 0 r--p /usr/lib/libstdc++.so.6.0.25
> 0x7fac5ec94000 to 0x7fac5ed8a000, len = 0xf6000, offset = 0x89000 ---p /usr/lib/libstdc++.so.6.0.25
> 0x7fac5ec94000 to 0x7fac5ed4c000, len = 0xb8000, offset = 0x89000 r-xp /usr/lib/libstdc++.so.6.0.25
> 0x7fac5ed4c000 to 0x7fac5ed89000, len = 0x3d000, offset = 0x141000 r--p /usr/lib/libstdc++.so.6.0.25
> 0x7fac5ed8a000 to 0x7fac5ed97000, len = 0xd000, offset = 0x17e000 rw-p /usr/lib/libstdc++.so.6.0.25
Could you also post the matching phdr output for the file?
eu-readelf -l /usr/lib/libstdc++.so.6.0.25 should show it.
That way we can see how the PT_LOAD segments map to the mmap events.
Thanks,
Mark
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
From: Milian Wolff @ 2018-10-15 21:06 UTC (permalink / raw)
To: Mark Wielaard; +Cc: elfutils-devel, Ulf Hermann, Christoph Sterz
On Monday, 15 October 2018 23:04:52 CEST Mark Wielaard wrote:
> Could you also post the matching phdr output for the file?
> eu-readelf -l /usr/lib/libstdc++.so.6.0.25 should show it.
> That way we can see how the PT_LOAD segments map to the mmap events.
Sure:
$ eu-readelf -l /usr/lib/libstdc++.so.6.0.25
Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x088fa8 0x088fa8 R   0x1000
  LOAD           0x089000 0x0000000000089000 0x0000000000089000 0x0b7ae1 0x0b7ae1 R E 0x1000
  LOAD           0x141000 0x0000000000141000 0x0000000000141000 0x03cfe0 0x03cfe0 R   0x1000
  LOAD           0x17e8e0 0x000000000017f8e0 0x000000000017f8e0 0x00b8b8 0x00ed60 RW  0x1000
  DYNAMIC        0x1873a8 0x00000000001883a8 0x00000000001883a8 0x0001e0 0x0001e0 RW  0x8
  NOTE           0x0002a8 0x00000000000002a8 0x00000000000002a8 0x000024 0x000024 R   0x4
  NOTE           0x17dfc0 0x000000000017dfc0 0x000000000017dfc0 0x000020 0x000020 R   0x8
  TLS            0x17e8e0 0x000000000017f8e0 0x000000000017f8e0 0x000000 0x000020 R   0x8
  GNU_EH_FRAME   0x149558 0x0000000000149558 0x0000000000149558 0x007f04 0x007f04 R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x17e8e0 0x000000000017f8e0 0x000000000017f8e0 0x00b720 0x00b720 R   0x1

Section to Segment mapping:
  Segment Sections...
   00      [RO: .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_d .gnu.version_r .rela.dyn]
   01      [RO: .init .text .fini]
   02      [RO: .rodata .eh_frame_hdr .eh_frame .gcc_except_table .note.gnu.property]
   03      [RELRO: .tbss .init_array .fini_array .data.rel.ro .dynamic .got] .got.plt .data .bss
   04      [RELRO: .dynamic]
   05      [RO: .note.gnu.build-id]
   06      [RO: .note.gnu.property]
   07      [RELRO: .tbss]
   08      [RO: .eh_frame_hdr]
   09
   10      [RELRO: .tbss .init_array .fini_array .data.rel.ro .dynamic .got]
--
Milian Wolff
mail@milianw.de
http://milianw.de
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
From: Milian Wolff @ 2018-10-17 14:52 UTC (permalink / raw)
To: elfutils-devel; +Cc: Mark Wielaard, Ulf Hermann, Christoph Sterz
So, Mark - any chance you could have a look at the above and give us your
feedback?
When I compare the actual mmap events with the LOAD segments, there are some
similarities, but also some discrepancies. Note how the mmap sizes always
differ from the FileSiz header value. And the offsets also sometimes mismatch,
e.g. for the last segment / mmap event we get 0x17f8e0 in the header, but
0x17e000 in the mmap event...:
LOAD 0x17e8e0 0x000000000017f8e0 0x000000000017f8e0 0x00b8b8 0x00ed60 RW 0x1000
0x7fac5ed8a000 to 0x7fac5ed97000, len = 0xd000, offset = 0x17e000 rw-p /usr/lib/libstdc++.so.6.0.25
I'm pretty confused here!
--
Milian Wolff
mail@milianw.de
http://milianw.de
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
From: Mark Wielaard @ 2018-10-17 22:26 UTC (permalink / raw)
To: Milian Wolff; +Cc: elfutils-devel, Ulf Hermann, Christoph Sterz
Hi Milian,
On Wed, Oct 17, 2018 at 04:52:42PM +0200, Milian Wolff wrote:
> So, Mark - any chance you could have a look at the above and give us your
> feedback?
Sorry, I haven't yet looked at this deeply. But some quick comments.
The mmap events do seem to correspond to the PT_LOAD segments. At least
the offsets do. Why the second one is mmapped twice I don't know. The
difference in length for the last 3 seems to be that the mmaps are
aligned up to the page size (0x1000, 4K).
> When I compare the actual mmap events with the LOAD segments, there are some
> similarities, but also some discrepancies. Note how the mmap sizes always
> differ from the FileSiz header value. And the offsets also sometimes mismatch,
> e.g. for the last segment / mmap event we get 0x17f8e0 in the header, but
> 0x17e000 in the mmap event...:
>
> LOAD 0x17e8e0 0x000000000017f8e0 0x000000000017f8e0 0x00b8b8
> 0x00ed60 RW 0x1000
>
> 0x7fac5ed8a000 to 0x7fac5ed97000, len = 0xd000, offset = 0x17e000
> rw-p /usr/lib/libstdc++.so.6.0.25
>
> I'm pretty confused here!
I think the differences can be explained by the fact that mmap will use
aligned offsets and length.
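The alignment explanation can be checked against the numbers in this thread. The sketch below is a minimal Python illustration, assuming a 4 KiB page size and that each mapping covers the file bytes of one PT_LOAD segment, with the offset rounded down and the end rounded up to a page boundary. The helper names are made up for this example. The (p_offset, p_filesz) pairs are the last three LOAD rows from the eu-readelf -l output earlier in this thread.

```python
PAGE = 0x1000  # assumed 4 KiB page size, matching the Align column of the LOAD rows


def page_floor(value, page=PAGE):
    """Round value down to a page boundary."""
    return value & ~(page - 1)


def page_ceil(value, page=PAGE):
    """Round value up to a page boundary."""
    return (value + page - 1) & ~(page - 1)


def expected_mmap(offset, filesz):
    """Page-aligned (pgoff, len) covering the file bytes of one segment."""
    pgoff = page_floor(offset)
    return pgoff, page_ceil(offset + filesz) - pgoff


# (p_offset, p_filesz) of the second, third and fourth PT_LOAD segments.
SEGMENTS = [(0x089000, 0x0b7ae1), (0x141000, 0x03cfe0), (0x17e8e0, 0x00b8b8)]

for offset, filesz in SEGMENTS:
    pgoff, length = expected_mmap(offset, filesz)
    print(hex(pgoff), hex(length))
```

With these inputs the computed (offset, len) pairs are (0x89000, 0xb8000), (0x141000, 0x3d000) and (0x17e000, 0xd000), which are the values recorded in the r-xp, r--p and rw-p mmap events.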
In theory libdwfl just needs to see one mmap event and should then be
able to use the PT_LOAD program headers to see how the whole file is
mmapped into memory. Maybe something goes wrong there. And reporting
multiple events for the same file might confuse things.
Cheers,
Mark
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
From: Ulf Hermann @ 2018-10-18 7:41 UTC (permalink / raw)
To: Milian Wolff, elfutils-devel; +Cc: Mark Wielaard, Christoph Sterz
Consider:
> 0x7fac5ec0b000 to 0x7fac5ed9a000, len = 0x18f000, offset = 0
> r--p /usr/lib/libstdc++.so.6.0.25
> 0x7fac5ec94000 to 0x7fac5ed8a000, len = 0xf6000, offset = 0x89000
> ---p /usr/lib/libstdc++.so.6.0.25
0x7fac5ec94000 - 0x89000 = 0x7fac5ec0b000
This is just taking away the 'r' bit from part of the first mapping. We
can ignore it in perf and perfparser.
> 0x7fac5ec94000 to 0x7fac5ed4c000, len = 0xb8000, offset = 0x89000
> r-xp /usr/lib/libstdc++.so.6.0.25
Same thing, but adding the 'r' and 'x' bits.
> 0x7fac5ed4c000 to 0x7fac5ed89000, len = 0x3d000, offset = 0x141000
> r--p /usr/lib/libstdc++.so.6.0.25
0x7fac5ed4c000 - 0x141000 = 0x7fac5ec0b000
This is re-adding the 'r' bit for a different range, again without
changing the contents.
> 0x7fac5ed8a000 to 0x7fac5ed97000, len = 0xd000, offset = 0x17e000
> rw-p /usr/lib/libstdc++.so.6.0.25
0x7fac5ed8a000 - 0x17e000 = 0x7fac5ec0c000
Strange, what is this? Note the 'rw', though: it probably doesn't
contain any executable code. In effect, perfparser and perf should then
truncate the original mapping and never report this one to libdw, just
like we do in perfparser now (for different reasons).
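The 0x1000 difference looks consistent with the last PT_LOAD row of the eu-readelf -l output earlier in this thread, where p_offset (0x17e8e0) and p_vaddr (0x17f8e0) differ by exactly one page. A minimal Python sketch of that observation follows. It assumes 4 KiB pages and that each mmap event covers the PT_LOAD segment whose page-aligned file offset equals pgoff. The helper name load_bias is hypothetical and not part of perfparser or elfutils.

```python
PAGE = 0x1000  # assumed 4 KiB page size

# (p_offset, p_vaddr) of the four PT_LOAD segments, copied from the
# eu-readelf -l output earlier in this thread.
LOAD_SEGMENTS = [
    (0x000000, 0x000000),
    (0x089000, 0x089000),
    (0x141000, 0x141000),
    (0x17e8e0, 0x17f8e0),
]

# (start, pgoff) of the mmap events that carry a non-zero pgoff.
MMAPS = [
    (0x7fac5ec94000, 0x89000),
    (0x7fac5ed4c000, 0x141000),
    (0x7fac5ed8a000, 0x17e000),
]


def page_floor(value):
    """Round value down to a page boundary."""
    return value & ~(PAGE - 1)


def load_bias(start, pgoff):
    """Subtract the page-aligned p_vaddr (not pgoff) of the matching segment."""
    for offset, vaddr in LOAD_SEGMENTS:
        if page_floor(offset) == pgoff:
            return start - page_floor(vaddr)
    return None  # no PT_LOAD segment starts at this file offset


biases = [load_bias(start, pgoff) for start, pgoff in MMAPS]
print([hex(b) for b in biases])
```

Computed this way, all three events resolve to the same base, 0x7fac5ec0b000, so the rw-p mapping belongs to the same load of the library and only start - pgoff is off by one page.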
So, what about the following algorithm, which can be done entirely
outside of elfutils:
If we get an mmap with a pgoff, check if the same file is already mapped
at the "virtual" start address of the file (start - pgoff), and if it
is, ignore the mmap. That should deal with the first 3 overlaps here.
The last one is really mapping a different part of the file, but as the
mapping is not executable and read-write we should really never get a
sample from it.
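The proposed filtering could be sketched roughly as follows in Python. The Mmap record and the filter_mmaps function are hypothetical names for illustration and do not reflect the actual perfparser data structures. The sketch keys only on the file path, whereas a real implementation would presumably also key on the process.

```python
from typing import NamedTuple


class Mmap(NamedTuple):
    start: int
    length: int
    pgoff: int
    path: str


def filter_mmaps(events):
    """Drop mmaps whose file is already mapped at start - pgoff."""
    seen = set()  # (path, virtual start address of the file)
    kept = []
    for ev in events:
        key = (ev.path, ev.start - ev.pgoff)
        if ev.pgoff != 0 and key in seen:
            continue  # same file, same virtual base: ignore this event
        seen.add(key)
        kept.append(ev)
    return kept


LIB = "/usr/lib/libstdc++.so.6.0.25"
EVENTS = [
    Mmap(0x7fac5ec0b000, 0x18f000, 0x0, LIB),
    Mmap(0x7fac5ec94000, 0xf6000, 0x89000, LIB),
    Mmap(0x7fac5ec94000, 0xb8000, 0x89000, LIB),
    Mmap(0x7fac5ed4c000, 0x3d000, 0x141000, LIB),
    Mmap(0x7fac5ed8a000, 0xd000, 0x17e000, LIB),
]

kept = filter_mmaps(EVENTS)
print([hex(ev.start) for ev in kept])
```

For the five events quoted above, only the first mapping and the final rw-p mapping survive the filter. The rw-p one would still need the separate handling described for non-executable mappings.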
We might actually check the permission bits before reporting things to
dwfl so that a few broken samples don't destroy the memory map. However,
that partially contradicts the above. We'd need to OR up all the
permissions from different mmaps covering the same file at the same base
address to get an approximation for it, or we need a different data
structure.
best,
Ulf
* Re: Handling pgoff in perf elf mmap/mmap2 elf info
From: Milian Wolff @ 2018-10-20 10:49 UTC (permalink / raw)
To: elfutils-devel; +Cc: Mark Wielaard, Ulf Hermann, Christoph Sterz
Spending more time on this issue, I've come up with a seemingly viable
approach to work around the libdwfl API limitations. Most notably, one must not
take the raw mmap events and try to report them. Instead, we now associate the
"base mmap", i.e. the first one with pgoff = 0, with the following mmaps for
this file. Then, when we'd otherwise try to report one of the following mmaps,
we look up the base mmap addr and use that in our interaction with libdwfl.
Now I'm not seeing any "overlapping address" errors anymore, and unwinding
seems to work fine again for perf files with mmap events like in the above.
There are still quite a few broken backtraces, but so far I can't say what the
reason for this is.
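A rough Python sketch of this bookkeeping is shown below. The class and method names are invented for illustration and are not the actual perfparser code. Falling back to the raw start address when no base mmap has been seen is an assumption, and per-process tracking is left out for brevity.

```python
class BaseMmapTracker:
    """Remember the pgoff == 0 mapping of each file and reuse its address."""

    def __init__(self):
        self._base = {}  # path -> start address of the latest pgoff == 0 mmap

    def address_to_report(self, start, pgoff, path):
        """Return the address to hand to libdwfl for this mmap event."""
        if pgoff == 0:
            self._base[path] = start
            return start
        # Later mmaps of the same file are reported at the base mmap address.
        # Assumption: without a known base mmap, fall back to the raw start.
        return self._base.get(path, start)


LIB = "/usr/lib/libstdc++.so.6.0.25"
EVENTS = [
    (0x7fac5ec0b000, 0x0),
    (0x7fac5ec94000, 0x89000),
    (0x7fac5ec94000, 0x89000),
    (0x7fac5ed4c000, 0x141000),
    (0x7fac5ed8a000, 0x17e000),
]

tracker = BaseMmapTracker()
reported = [tracker.address_to_report(start, pgoff, LIB) for start, pgoff in EVENTS]
print(sorted({hex(addr) for addr in reported}))
```

For the libstdc++ events from this thread, every event resolves to the single address 0x7fac5ec0b000, so the library would be reported to libdwfl at one consistent base.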
So, from my POV, I'm fine - I have a viable workaround. But from the POV of
future users of libdwfl, I still believe it would be useful to have more
control over what gets reported and where. I.e. instead of having libdwfl
analyze the PT_LOAD segments, offer an API that would allow us to feed the
mmapped regions directly with pgoff etc.
Thanks for the input everyone
--
Milian Wolff
mail@milianw.de
http://milianw.de
Thread overview: 17+ messages
2018-09-19 12:12 Handling pgoff in perf elf mmap/mmap2 elf info Christoph Sterz
2018-09-19 12:24 ` Ulf Hermann
2018-09-21 13:07 ` Mark Wielaard
2018-09-21 13:35 ` Ulf Hermann
2018-09-26 14:38 ` Milian Wolff
2018-10-09 20:33 ` Milian Wolff
2018-10-11 17:02 ` Ulf Hermann
2018-10-11 17:37 ` Mark Wielaard
2018-10-11 18:14 ` Milian Wolff
2018-10-15 20:39 ` Milian Wolff
2018-10-15 21:05 ` Mark Wielaard
2018-10-15 21:06 ` Milian Wolff
2018-10-17 14:52 ` Milian Wolff
2018-10-17 22:26 ` Mark Wielaard
2018-10-18 7:41 ` Ulf Hermann
2018-10-20 10:49 ` Milian Wolff
2018-10-15 20:48 ` Milian Wolff