dwfl_module_addrinfo and @plt entries

public inbox for elfutils@sourceware.org

* dwfl_module_addrinfo and @plt entries
From: Milian Wolff @ 2017-01-04  0:41 UTC
  To: elfutils-devel


Hello,

how do I get symbol information for @plt entries? Consider the following case:

~~~~~~~~~~~~~~~
$ objdump -j .plt -S lab_mandelbrot | head

lab_mandelbrot:     file format elf64-x86-64

Disassembly of section .plt:

0000000000002aa0 <_ZN7QWidget4showEv@plt-0x10>:
    2aa0:       ff 35 62 35 20 00       pushq  0x203562(%rip)        # 206008 <_GLOBAL_OFFSET_TABLE_+0x8>
    2aa6:       ff 25 64 35 20 00       jmpq   *0x203564(%rip)        # 206010 <_GLOBAL_OFFSET_TABLE_+0x10>
    2aac:       0f 1f 40 00             nopl   0x0(%rax)
~~~~~~~~~~~~~~~

Now I report the above binary to dwfl at address 0x56360eaff000. Then I try to 
get information about the address 0x56360eb01aa0 (i.e. at offset 0x2aa0, 
corresponding to the @plt entry above). dwfl_module_addrinfo returns a NULL 
string, and the returned offset equals the input address.

So, how do I use the dwfl API to also get sym names for @plt entries like in 
the case above?
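
For reference, the address arithmetic in the question can be checked with a 
minimal sketch. `module_offset` is a hypothetical helper introduced only for 
illustration; it is not part of libdwfl.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a runtime address into an offset relative to the address at
   which the module was reported. Hypothetical helper, not a libdwfl call. */
static uint64_t module_offset(uint64_t runtime_addr, uint64_t load_base)
{
    /* The address must not lie below the module's load base. */
    assert(runtime_addr >= load_base);
    return runtime_addr - load_base;
}
```

With the values above, `module_offset(0x56360eb01aa0, 0x56360eaff000)` yields 
0x2aa0, the start of the .plt section in the objdump listing.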

Thanks

-- 
Milian Wolff
mail@milianw.de
http://milianw.de



* Re: dwfl_module_addrinfo and @plt entries
From: Mark Wielaard @ 2017-01-04 13:42 UTC
  To: Milian Wolff; +Cc: elfutils-devel

On Wed, Jan 04, 2017 at 01:41:26AM +0100, Milian Wolff wrote:
> how do I get symbol information for @plt entries?

Short answer. You cannot with the dwfl_module_getsym/addr*
functions. And we don't have accessors for "fake" symbols
(yet). Sorry.

Longer answer. An address pointing into the PLT does
really point to an ELF symbol. The PLT/GOT contains entries
that are architecture specific "jump targets" that contain
(self-modifying) code/data on first access. A PLT entry
is code that sets up the correct (absolute) address of
a function on first access. And on second access it fetches
that function address and jumps to it. Since this is
architecture specific we would need a backend function
that translates an address pointing into the PLT into
an actual function address. You would then be able to
fetch the actual ELF symbol that address is associated
with.
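
The lazy-binding behaviour described above can be modelled in plain C. This 
is only a simplified sketch: a function pointer stands in for the GOT slot, 
and all names (`got_slot`, `lazy_stub`, `call_via_plt`, `real_function`) are 
hypothetical. A real PLT entry is architecture-specific machine code, and the 
resolution is done by the dynamic linker.

```c
#include <assert.h>

typedef int (*func_t)(int);

static int resolver_calls;              /* how often the slow path ran */

/* The function the caller actually wants to reach. */
static int real_function(int x) { return x + 1; }

static int lazy_stub(int x);            /* forward declaration */

/* Stand-in for a GOT slot: initially it points at the lazy stub. */
static func_t got_slot = lazy_stub;

/* Stand-in for the resolver: patch the slot with the real address,
   then forward the call so the first access also reaches the target. */
static int lazy_stub(int x)
{
    resolver_calls++;
    got_slot = real_function;
    return got_slot(x);
}

/* Stand-in for the PLT entry: always jumps through the GOT slot. */
static int call_via_plt(int x)
{
    return got_slot(x);
}
```

The first call through `call_via_plt` runs the resolver once; later calls go 
straight to `real_function`. Note that no symbol lives at the address of the 
stub itself, which is why an address-based symbol lookup finds nothing there.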

If we have such a backend function then we could even
do what BFD apparently does. Which is to then create a
"fake" symbol with the name real_function@plt. But I am
not sure such fake symbols are very useful (and will
quickly become confusing since they aren't real ELF
symbols).

Hope that helps. And maybe inspires someone (you?) to
write up such a backend function and corresponding
dwfl frontend function.

Cheers,

Mark


* Re: dwfl_module_addrinfo and @plt entries
From: Milian Wolff @ 2017-01-06 10:28 UTC
  To: Mark Wielaard; +Cc: elfutils-devel

On Wednesday, January 4, 2017 2:42:23 PM CET Mark Wielaard wrote:
> On Wed, Jan 04, 2017 at 01:41:26AM +0100, Milian Wolff wrote:
> > how do I get symbol information for @plt entries?
> 
> Short answer. You cannot with the dwfl_module_getsym/addr*
> functions. And we don't have accessors for "fake" symbols
> (yet). Sorry.

Thanks for the clarification Mark.

> Longer answer. An address pointing into the PLT does
> really point to an ELF symbol.

You mean: does _not_
Right?

> The PLT/GOT contains entries
> that are architecture specific "jump targets" that contain
> (self-modifying) code/data on first access. A PLT entry
> is code that sets up the correct (absolute) address of
> a function on first access. And on second access it fetches
> that function address and jumps to it. Since this is
> architecture specific we would need a backend function
> that translates an address pointing into the PLT into
> an actual function address. You would then be able to
> fetch the actual ELF symbol that address is associated
> with.
> 
> If we have such a backend function then we could even
> do what BFD apparently does. Which is to then create a
> "fake" symbol with as name real_function@plt. But I am
> not sure such fake symbols are very useful (and will
> quickly become confusing since they aren't real ELF
> symbols).

So the objdump command I used is leveraging BFD internally to give me the @plt 
names? I noticed that I also see @plt in perf, which is also probably using 
BFD internally. That at least clarifies why it works in some tools but not 
when using dwfl.

> Hope that helps. And maybe inspires someone (you?) to
> write up such a backend function and corresponding
> dwfl frontend function.

It does help, thanks. I'm interested in contributing such functionality, but, 
sadly, I'm not sure when I'll get the time to actually do it.

Cheers

-- 
Milian Wolff
mail@milianw.de
http://milianw.de


* Re: dwfl_module_addrinfo and @plt entries
From: Mark Wielaard @ 2017-01-06 19:17 UTC
  To: Milian Wolff; +Cc: elfutils-devel

On Fri, Jan 06, 2017 at 11:28:25AM +0100, Milian Wolff wrote:
> On Wednesday, January 4, 2017 2:42:23 PM CET Mark Wielaard wrote:
> > Longer answer. An address pointing into the PLT does
> > really point to an ELF symbol.
> 
> You mean: does _not_
> Right?

Yes, I meant "does not point".

> > If we have such a backend function then we could even
> > do what BFD apparently does. Which is to then create a
> > "fake" symbol with as name real_function@plt. But I am
> > not sure such fake symbols are very useful (and will
> > quickly become confusing since they aren't real ELF
> > symbols).
> 
> So the objdump command I used is leveraging BFD internally to give me the @plt 
> names? I noticed that I also see @plt in perf, which is also probably using 
> BFD internally. That at least clarifies why it works in some tools but not 
> when using dwfl.

binutils objdump certainly does.

> > Hope that helps. And maybe inspires someone (you?) to
> > write up such a backend function and corresponding
> > dwfl frontend function.
> 
> It does help, thanks. I'm interested in contributing such functionality, but, 
> sadly, I'm not sure when I'll get the time to actually do it.

Thanks, wish I had spare time myself :)

Cheers,

Mark


* Re: dwfl_module_addrinfo and @plt entries
From: Milian Wolff @ 2017-07-05 13:34 UTC
  To: elfutils-devel; +Cc: Mark Wielaard

On Friday, January 6, 2017 8:17:53 PM CEST Mark Wielaard wrote:
> On Fri, Jan 06, 2017 at 11:28:25AM +0100, Milian Wolff wrote:
> > On Wednesday, January 4, 2017 2:42:23 PM CET Mark Wielaard wrote:
> > > Longer answer. An address pointing into the PLT does
> > > really point to an ELF symbol.
> > 
> > You mean: does _not_
> > Right?
> 
> Yes, I meant "does not point".
> 
> > > If we have such a backend function then we could even
> > > do what BFD apparently does. Which is to then create a
> > > "fake" symbol with as name real_function@plt. But I am
> > > not sure such fake symbols are very useful (and will
> > > quickly become confusing since they aren't real ELF
> > > symbols).
> > 
> > So the objdump command I used is leveraging BFD internally to give me the
> > @plt names? I noticed that I also see @plt in perf, which is also
> > probably using BFD internally. That at least clarifies why it works in
> > some tools but not when using dwfl.
> 
> binutils objdump certainly does.
> 
> > > Hope that helps. And maybe inspires someone (you?) to
> > > write up such a backend function and corresponding
> > > dwfl frontend function.
> > 
> > It does help, thanks. I'm interested in contributing such functionality,
> > but, sadly, I'm not sure when I'll get the time to actually do it.
> 
> Thanks, wish I had spare time myself :)

I have now looked into this issue again and have found a way to work around 
this limitation outside of elfutils, by manually resolving the address in a 
.plt section to a symbol. See:

https://github.com/KDAB/perfparser/commit/885f88f3d66904cd94af65f802232f6c6dc339f4

This seems to work in my limited tests (only on X86_64). Besides the 
32bit/64bit difference, it isn't really platform dependent, is it? Or was this what 
you had in mind when you said the elfutils code would be "architecture 
specific [and] we would need a backend function that translates an address 
pointing into the PLT into an actual function address"?
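
To make the question concrete, here is a rough sketch of the index 
computation such a mapping relies on. It is not the code from the commit 
above. It assumes the x86-64 layout visible in the objdump listing earlier in 
the thread (16-byte entries, with the first entry being a shared header, so 
the first named stub sits at .plt + 0x10), and it assumes that stub n 
corresponds to the n-th PLT relocation. The helper name, the constants, and 
the section size used in the usage note are hypothetical; other 
architectures use different entry sizes and header lengths.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 layout; other ABIs differ. */
static const uint64_t PLT_ENTRY_SIZE = 16;       /* bytes per PLT entry */
static const uint64_t PLT_RESERVED_ENTRIES = 1;  /* leading header entry */

/* Return the 0-based PLT relocation index for an address inside .plt,
   or -1 if the address is outside the section or in the header entry. */
static long plt_reloc_index(uint64_t addr, uint64_t plt_start,
                            uint64_t plt_size)
{
    if (addr < plt_start || addr - plt_start >= plt_size)
        return -1;
    uint64_t slot = (addr - plt_start) / PLT_ENTRY_SIZE;
    if (slot < PLT_RESERVED_ENTRIES)
        return -1;
    return (long)(slot - PLT_RESERVED_ENTRIES);
}
```

For a .plt starting at 0x2aa0 with a made-up size of 0x40, address 0x2ab0 
maps to relocation index 0, while 0x2aa0 itself (the header) maps to nothing. 
The entry size and the number of reserved entries are exactly the parts that 
vary between architectures.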

If my code is roughly OK, then I'll try to put it into a patch for elfutils 
and submit it there. If it's fundamentally broken, please tell me. I still 
plan to get this functionality upstream into elfutils.

Cheers

-- 
Milian Wolff
mail@milianw.de
http://milianw.de


* Re: dwfl_module_addrinfo and @plt entries
From: Mark Wielaard @ 2017-07-07 11:03 UTC
  To: Milian Wolff; +Cc: elfutils-devel

Hi Milian,

First congrats on https://www.kdab.com/hotspot-gui-linux-perf-profiler/
Very cool.

On Wed, 2017-07-05 at 15:34 +0200, Milian Wolff wrote:
> On Friday, January 6, 2017 8:17:53 PM CEST Mark Wielaard wrote:
> I have now looked into this issue again and have found a way to work around 
> this limitation outside of elfutils, by manually resolving the address in a 
> .plt section to a symbol. See:
> 
> https://github.com/KDAB/perfparser/commit/885f88f3d66904cd94af65f802232f6c6dc339f4
> 
> This seems to work in my limited tests (only on X86_64). Besides the 
> 32bit/64bit difference, it isn't really platform dependent, is it? Or was this what 
> you had in mind when you said the elfutils code would be "architecture 
> specific [and] we would need a backend function that translates an address 
> pointing into the PLT into an actual function address"?
> 
> If my code is roughly OK, then I'll try to put it into a patch for elfutils 
> and submit it there. If it's fundamentally broken, please tell me. I still 
> plan to get this functionality upstream into elfutils.

Thanks for the research. I don't know if the PLT/GOT resolving works
identically for all architectures. But yes, it does look like what you
came up with is in general architecture independent.

In general it would be nice if we could avoid any name based section
lookups (or only do them as fallbacks) since we might not have section
headers (for example if you got the ELF image from memory).

I wonder if we can get all the information needed from the dynamic
segment. For example it seems we have a DT_JMPREL that points directly
at the .plt table, DT_PLTREL gives you what kind of relocation entries
REL or RELA it contains and DT_PLTRELSZ gives the size of the plt
table. 

In your code you get the GOT address through DT_PLTGOT, but then use
that address to lookup the .got.plt section and use its sh_addr to index
into the table. Why is that? Isn't that address equal to what you
already got through DT_PLTGOT? 
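
As a rough illustration of collecting these values from the dynamic entries 
alone, here is a self-contained sketch. The struct and function names are 
hypothetical; the tag numbers are the ones assigned by the ELF gABI (the same 
values as the DT_* constants in <elf.h>), and the entry sizes of 24 and 16 
bytes assume 64-bit RELA and REL records. One caveat worth double-checking: 
the gABI describes DT_JMPREL as the address of the relocation entries 
associated with the PLT, not of the PLT code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Tag values from the ELF gABI; prefixed TAG_ to avoid clashing with <elf.h>. */
enum { TAG_NULL = 0, TAG_PLTRELSZ = 2, TAG_PLTGOT = 3, TAG_RELA = 7,
       TAG_REL = 17, TAG_PLTREL = 20, TAG_JMPREL = 23 };

/* Simplified 64-bit dynamic entry (tag plus value or address). */
struct dyn_entry { int64_t tag; uint64_t val; };

struct plt_info {
    uint64_t pltgot;    /* DT_PLTGOT: address of the GOT used by the PLT */
    uint64_t jmprel;    /* DT_JMPREL: address of the PLT relocations */
    uint64_t pltrelsz;  /* DT_PLTRELSZ: total size of those relocations */
    int64_t  pltrel;    /* DT_PLTREL: TAG_REL or TAG_RELA */
};

/* Walk a DT_NULL-terminated array and pick out the PLT-related entries. */
static struct plt_info scan_dynamic(const struct dyn_entry *dyn)
{
    struct plt_info info = {0, 0, 0, 0};
    for (; dyn->tag != TAG_NULL; dyn++) {
        switch (dyn->tag) {
        case TAG_PLTGOT:   info.pltgot = dyn->val; break;
        case TAG_JMPREL:   info.jmprel = dyn->val; break;
        case TAG_PLTRELSZ: info.pltrelsz = dyn->val; break;
        case TAG_PLTREL:   info.pltrel = (int64_t)dyn->val; break;
        default: break;
        }
    }
    return info;
}

/* Number of PLT relocations, assuming 64-bit record sizes. */
static uint64_t plt_reloc_count(const struct plt_info *info)
{
    uint64_t entsize = (info->pltrel == TAG_RELA) ? 24 : 16;
    return info->pltrelsz / entsize;
}
```

None of this needs section headers, which is the point of going through the 
dynamic segment.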

Thanks,

Mark


* Re: dwfl_module_addrinfo and @plt entries
From: Milian Wolff @ 2017-07-10 11:06 UTC
  To: Mark Wielaard; +Cc: elfutils-devel


On Friday, July 7, 2017 1:03:32 PM CEST Mark Wielaard wrote:
> Hi Milian,
> 
> First congrats on https://www.kdab.com/hotspot-gui-linux-perf-profiler/
> Very cool.
> 
> On Wed, 2017-07-05 at 15:34 +0200, Milian Wolff wrote:
> > On Friday, January 6, 2017 8:17:53 PM CEST Mark Wielaard wrote:
> > I have now looked into this issue again and have found a way to work
> > around this limitation outside of elfutils, by manually resolving the
> > address in a .plt section to a symbol. See:
> > 
> > https://github.com/KDAB/perfparser/commit/885f88f3d66904cd94af65f802232f6c6dc339f4
> > 
> > This seems to work in my limited tests (only on X86_64). Besides the
> > 32bit/64bit difference, it isn't really platform dependent, is it? Or was this
> > what you had in mind when you said the elfutils code would be
> > "architecture specific [and] we would need a backend function that
> > translates an address pointing into the PLT into an actual function
> > address"?
> > 
> > If my code is roughly OK, then I'll try to put it into a patch for
> > elfutils and submit it there. If it's fundamentally broken, please tell
> > me. I still plan to get this functionality upstream into elfutils.
> 
> Thanks for the research. I don't know if the PLT/GOT resolving works
> identical for all architectures. But yes, it does look like what you
> came up with is in general architecture independent.
> 
> In general it would be nice if we could avoid any name based section
> lookups (or only do them as fallbacks) since we might not have section
> headers (for example if you got the ELF image from memory).

Yes, the name comparison is ugly but I don't know any alternative. The sh_type 
is just SHT_PROGBITS afair and I couldn't find anything else to use. From what 
I gathered online, one could even (theoretically) change the name of the 
section and it would still work fine but my mapping would break. That said, at 
least this works for the common case.

> I wonder if we can get all the information needed from the dynamic
> segment. For example it seems we have a DT_JMPREL that points directly
> at the .plt table, DT_PLTREL gives you what kind of relocation entries
> REL or RELA it contains and DT_PLTRELSZ gives the size of the plt
> table.
> 
> In your code you get the GOT address through DT_PLTGOT, but then use
> that address to lookup the .got.plt section and use its sh_addr to index
> into the table. Why is that? Isn't that address equal to what you
> already got through DT_PLTGOT?

Indeed, that is convoluted. I tried to reverse-engineer the code from 
elf-dissector, which does this mapping in reverse (no pun intended). Maybe I 
overcomplicated it. I'll research this when I'm back from vacation in two weeks.

Thanks

-- 
Milian Wolff
mail@milianw.de
http://milianw.de



* Re: dwfl_module_addrinfo and @plt entries
From: Milian Wolff @ 2017-08-28 14:28 UTC
  To: elfutils-devel; +Cc: Mark Wielaard

On Monday, July 10, 2017 1:06:27 PM CEST Milian Wolff wrote:
> On Friday, July 7, 2017 1:03:32 PM CEST Mark Wielaard wrote:
> > Hi Milian,
> > 
> > First congrats on https://www.kdab.com/hotspot-gui-linux-perf-profiler/
> > Very cool.
> > 
> > On Wed, 2017-07-05 at 15:34 +0200, Milian Wolff wrote:
> > > On Friday, January 6, 2017 8:17:53 PM CEST Mark Wielaard wrote:
> > > I have now looked into this issue again and have found a way to work
> > > around this limitation outside of elfutils, by manually resolving the
> > > address in a .plt section to a symbol. See:
> > > 
> > > https://github.com/KDAB/perfparser/commit/885f88f3d66904cd94af65f802232f6c6dc339f4
> > > 
> > > This seems to work in my limited tests (only on X86_64). Besides the
> > > 32bit/64bit difference, it isn't really platform dependent, is it? Or was this
> > > what you had in mind when you said the elfutils code would be
> > > "architecture specific [and] we would need a backend function that
> > > translates an address pointing into the PLT into an actual function
> > > address"?
> > > 
> > > If my code is roughly OK, then I'll try to put it into a patch for
> > > elfutils and submit it there. If it's fundamentally broken, please tell
> > > me. I still plan to get this functionality upstream into elfutils.
> > 
> > Thanks for the research. I don't know if the PLT/GOT resolving works
> > identical for all architectures. But yes, it does look like what you
> > came up with is in general architecture independent.
> > 
> > In general it would be nice if we could avoid any name based section
> > lookups (or only do them as fallbacks) since we might not have section
> > headers (for example if you got the ELF image from memory).
> 
> Yes, the name comparison is ugly but I don't know any alternative. The
> sh_type is just SHT_PROGBITS afair and I couldn't find anything else to
> use. From what I gathered online, one could even (theoretically) change the
> name of the section and it would still work fine but my mapping would
> break. That said, at least this works for the common case.
> 
> > I wonder if we can get all the information needed from the dynamic
> > segment. For example it seems we have a DT_JMPREL that points directly
> > at the .plt table, DT_PLTREL gives you what kind of relocation entries
> > REL or RELA it contains and DT_PLTRELSZ gives the size of the plt
> > table.
> > 
> > In your code you get the GOT address through DT_PLTGOT, but then use
> > that address to lookup the .got.plt section and use its sh_addr to index
> > into the table. Why is that? Isn't that address equal to what you
> > already got through DT_PLTGOT?
> 
> Indeed, that is convoluted. I tried to reverse-engineer the code from
> elf-dissector, which does this mapping in reverse (no pun intended). Maybe I
> overcomplicated it. I'll research this when I'm back from vacation in two
> weeks.

Hey Mark,

more than two weeks have passed, but I finally had some time to investigate the 
above. I have a hard time justifying what I wrote; I can only explain what I'm 
seeing. Can you maybe comment on the steps below? I think I'm just missing 
something that you have in mind to shortcut my code:

- find the dynamic section via SHT_DYNAMIC (1st loop)
- in there, find the address of the PLTGOT via DT_PLTGOT
- find the section corresponding to that PLTGOT address (2nd loop)
- in there, find the address for the requested symbol index, offset by two
- ...

I mean, searching the address in the first loop is not a goal per se. What I'm 
looking for is the Scn/Shdr that contains the PLTGOT. The first loop allows me 
to identify the PLTGOT via its address, but to actually get my hands on the 
corresponding Scn/Shdr I still need the second loop, no? Or can I somehow 
translate the PLTGOT address to a Scn/Shdr directly?
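
To spell out what the second loop amounts to, here is a minimal sketch of an 
address-to-section lookup over a plain array. `struct section_range` and 
`find_section_by_addr` are hypothetical stand-ins; with libelf the same range 
check would presumably be applied to the `sh_addr` and `sh_size` fields 
obtained via `gelf_getshdr` while iterating with `elf_nextscn`. The section 
sizes in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant parts of a section header. */
struct section_range {
    const char *name;   /* only for readability, not used for matching */
    uint64_t addr;      /* corresponds to sh_addr */
    uint64_t size;      /* corresponds to sh_size */
};

/* Return the index of the section whose [addr, addr + size) range
   contains target, or -1 if no section covers that address. */
static int find_section_by_addr(const struct section_range *secs,
                                size_t count, uint64_t target)
{
    for (size_t i = 0; i < count; i++) {
        if (target >= secs[i].addr && target - secs[i].addr < secs[i].size)
            return (int)i;
    }
    return -1;
}
```

Given a table containing `.plt` at 0x2aa0 and `.got.plt` at 0x206000, looking 
up 0x206000 returns the `.got.plt` entry without comparing any section names. 
It is still a loop over the section headers, so it does not help when those 
headers are missing.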

Thanks

-- 
Milian Wolff
mail@milianw.de
http://milianw.de

