public inbox for gdb@sourceware.org
* GDB shared library tracking with stap probes x _dl_debug_state
@ 2021-05-07 19:42 Luis Machado
  2021-05-07 20:44 ` Florian Weimer
  0 siblings, 1 reply; 7+ messages in thread
From: Luis Machado @ 2021-05-07 19:42 UTC (permalink / raw)
  To: libc-alpha, gdb; +Cc: doko, Adhemerval Zanella

Hi,

I'm cc-ing the GDB ML as well, as this might be an issue for other 
architectures that, like armhf, store flags in ELF symbols.

Matthias (cc-ed) reported the following ticket on GDB's bugzilla: 
https://sourceware.org/bugzilla/show_bug.cgi?id=27826

This is related to how GDB tracks shared library loads/unloads in 
dynamically-linked executables. GDB is aided by some hooks provided by 
the dynamic linker.

There are two ways GDB will track these shared library events:

* _dl_debug_state mechanism

This is a dummy function that gets called by the dynamic linker's 
dl_main (...) function so debuggers can breakpoint it and track shared 
library events.

_dl_debug_state is a real ELF symbol that lives in .dynsym. This is a 
fallback mechanism in GDB these days.

* stap probes

This is a more recent approach where some probe points are provided by 
the ELF file and GDB breakpoints a list of known probes instead.

There are no real ELF symbols here, just probe names and addresses that 
debuggers should use to put breakpoints into. This is the preferred way 
to track shared library events in GDB nowadays.
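
To illustrate "just probe names and addresses": each probe is recorded
as an ELF note with owner name "stapsdt". The following is a minimal
sketch of decoding one note descriptor, assuming the usual SDT layout
(three address-sized fields followed by three NUL-terminated strings).
The function name and the example values are hypothetical, and this is
not GDB's actual parser.

```python
import struct

def parse_stapsdt_desc(desc: bytes, addr_size: int = 4,
                       little_endian: bool = True) -> dict:
    """Decode the descriptor of one stapsdt note (assumed SDT v3 layout)."""
    fmt = ("<" if little_endian else ">") + {4: "III", 8: "QQQ"}[addr_size]
    # Probe address, link-time base address, semaphore address.
    pc, base, semaphore = struct.unpack_from(fmt, desc, 0)
    # Three NUL-terminated strings follow: provider, probe name, arguments.
    strings = desc[3 * addr_size:].split(b"\0")
    provider, name, args = (s.decode("ascii") for s in strings[:3])
    return {"pc": pc, "base": base, "semaphore": semaphore,
            "provider": provider, "name": name, "args": args}

# Hypothetical descriptor for a 32-bit little-endian target; the addresses
# are made up and the argument string is left empty for brevity.
example = struct.pack("<III", 0x1B5A, 0x20000, 0) + b"rtld\0init_start\0\0"
probe = parse_stapsdt_desc(example)
print(probe["provider"], probe["name"], hex(probe["pc"]))
```

Note that nothing in the descriptor says whether the code at "pc" is
arm or thumb.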

Going back to bz27826, up until Ubuntu 18.04 (glibc 2.27) on armhf, GDB 
used the _dl_debug_state mechanism to track shared library events. This 
is due to a bug in stap that made GDB fail a check, thus falling back to 
using the _dl_debug_state mechanism.

With Ubuntu 20.04 (glibc 2.31), this check no longer fails and GDB 
decides to use the new stap mechanism instead.

That's all fine, but there is one small detail that doesn't work for 
armhf, and that is discovering if we're dealing with a PC that is arm 
mode or thumb mode.

armhf's GDB uses a few strategies to figure out the mode: mapping 
symbols, LSB of the PC and ELF symbol flags.

Given distros usually strip binaries (ld.so is also stripped), only a 
few symbols are left in the executable file itself, and _dl_debug_state 
is one of them.

GDB can still peek at the _dl_debug_state ELF symbol and retrieve the 
flag that indicates whether we have an arm or thumb mode function. That 
way GDB can place the proper arm/thumb breakpoint at the address pointed 
to by r_brk.
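
For reference, a minimal sketch of the symbol-based check, assuming the
convention from the ELF for the Arm Architecture ABI: bit 0 of st_value
is set for a Thumb function, and the legacy STT_ARM_TFUNC type also
marks Thumb code. The helper names are hypothetical and this is not how
GDB stores the flag internally.

```python
STT_FUNC = 2        # generic ELF function symbol type
STT_ARM_TFUNC = 13  # legacy Arm-specific "Thumb function" type (STT_LOPROC)

def symbol_is_thumb(st_info: int, st_value: int) -> bool:
    """Return True if an Arm ELF symbol describes Thumb code."""
    sym_type = st_info & 0xF  # low nibble of st_info is the symbol type
    if sym_type == STT_ARM_TFUNC:
        return True
    # For STT_FUNC symbols, bit 0 of the value is the Thumb bit.
    return sym_type == STT_FUNC and bool(st_value & 1)

def breakpoint_address(st_value: int) -> int:
    """Strip the Thumb bit to get the real instruction address."""
    return st_value & ~1

# Hypothetical GLOBAL FUNC symbol (st_info 0x12) with the Thumb bit set.
print(symbol_is_thumb(0x12, 0x1B5B), hex(breakpoint_address(0x1B5B)))
```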

With the stap probes approach, this is not possible. As was said before, 
the probe points are not real ELF symbols. They're just metadata with a 
name and an address.

Of course we could look up what symbol contains a particular probe 
address, but those symbols are not available in stripped binaries 
(dl_main, for example).

So GDB is left with no useful information to insert the right kind of 
breakpoint for the specified address. It defaults to arm mode, and, 
since dl_main is thumb mode, things just break.

I believe this may also be a problem for MIPS, since it has to determine 
the ISA bit for some operations.

Now, three possible solutions exist:

1 - Force GDB to fall back to using _dl_debug_state for armhf (and 
possibly other architectures). This is considered bad because the 
affected architectures can't take advantage of a more advanced mechanism 
for tracking shared library events.

2 - Not stripping ld.so/glibc. I can't determine the impact of this 
choice, but distros strip binaries for a reason. Having to carry all 
symbols for a particular library may not be desirable.

It is also not desirable to force users to install a dbg package for 
ld.so/glibc just to be able to use a debugger.

3 - Strip symbols from ld.so/glibc, but keep a few select critical 
symbols that debuggers will want to use. I've been told this may be a 
bit undesirable from glibc's perspective.

I noticed the probe points fall into the following functions: dl_main, 
_dl_map_object_from_fd, lose, dl_open_worker and _dl_close_worker.

If we keep those symbols, GDB will be able to figure out what mode we 
have and the proper breakpoint to use for each of those symbols.

Before making a decision, it sounds best to discuss this and come up 
with the best solution for both projects and the distros.

Thoughts?


* Re: GDB shared library tracking with stap probes x _dl_debug_state
  2021-05-07 19:42 GDB shared library tracking with stap probes x _dl_debug_state Luis Machado
@ 2021-05-07 20:44 ` Florian Weimer
  2021-05-07 21:44   ` Luis Machado
  0 siblings, 1 reply; 7+ messages in thread
From: Florian Weimer @ 2021-05-07 20:44 UTC (permalink / raw)
  To: Luis Machado via Libc-alpha; +Cc: gdb, Luis Machado, doko

* Luis Machado via Libc-alpha:

> That's all fine, but there is one small detail that doesn't work for
> armhf, and that is discovering if we're dealing with a PC that is arm 
> mode or thumb mode.

Is it possible to recognize Arm mode vs thumb mode based on the NOP
encoding at the probe address?

> 2 - Not stripping ld.so/glibc. I can't determine the impact of this
> choice, but distros strip binaries for a reason. Having to carry all 
> symbols for a particular library may not be desirable.

We are switching Fedora to not strip ld.so, primarily for introspection
purposes in Systemtap.

(In Fedora, we've preserved the symbol table for ages, to make valgrind
work.)

Thanks,
Florian



* Re: GDB shared library tracking with stap probes x _dl_debug_state
  2021-05-07 20:44 ` Florian Weimer
@ 2021-05-07 21:44   ` Luis Machado
  2021-05-07 21:56     ` Sergio Durigan Junior
  0 siblings, 1 reply; 7+ messages in thread
From: Luis Machado @ 2021-05-07 21:44 UTC (permalink / raw)
  To: Florian Weimer, Luis Machado via Libc-alpha; +Cc: gdb, doko

On 5/7/21 5:44 PM, Florian Weimer wrote:
> * Luis Machado via Libc-alpha:
> 
>> That's all fine, but there is one small detail that doesn't work for
>> armhf, and that is discovering if we're dealing with a PC that is arm
>> mode or thumb mode.
> 
> Is it possible to recognize Arm mode vs thumb mode based on the NOP
> encoding at the probe address?
> 

If we know the instruction is a NOP, it might be possible. But the 
function that checks this, arm_pc_is_thumb (...), is generic and gets 
called to determine if arbitrary PCs are arm or thumb.

It would be somewhat hacky to do it this way. It would be more natural 
to let arm_pc_is_thumb figure out symbols on its own without corner cases.

(gdb) maint info br
Num     Type           Disp Enb Address    What
-1      shlib events   keep n   0xb6fd7b5a  inf 1

(gdb) show arm force-mode
The current execution mode assumed (even when symbols are available) is 
"auto".
(gdb) x/i 0xb6fd7b5a
    0xb6fd7b5a:                  ; <UNDEFINED> instruction: 0xf8dfbf00
(gdb) set arm force-mode thumb
(gdb) x/i 0xb6fd7b5a
    0xb6fd7b5a:  nop
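
The two disassemblies above come from the same bytes. A small worked
example, assuming a little-endian target (as armhf is): the 32-bit word
GDB shows in arm mode has the 16-bit Thumb NOP encoding, 0xbf00, as its
first halfword.

```python
import struct

word = 0xF8DFBF00               # value shown by x/i in arm mode
raw = struct.pack("<I", word)   # the four bytes as laid out in memory
# In thumb mode only the first two bytes are decoded as an instruction.
(first_halfword,) = struct.unpack_from("<H", raw, 0)
print(raw.hex(), hex(first_halfword))
```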

>> 2 - Not stripping ld.so/glibc. I can't determine the impact of this
>> choice, but distros strip binaries for a reason. Having to carry all
>> symbols for a particular library may not be desirable.
> 
> We are switching Fedora to not strip ld.so, primarily for introspection
> purposes in Systemtap.
> 
> (In Fedora, we've preserved the symbol table for ages, to make valgrind
> work.)

That's good information, and a more reasonable approach to solve this 
problem from GDB's point of view.

> 
> Thanks,
> Florian
> 


* Re: GDB shared library tracking with stap probes x _dl_debug_state
  2021-05-07 21:44   ` Luis Machado
@ 2021-05-07 21:56     ` Sergio Durigan Junior
  2021-05-08 10:55       ` Florian Weimer
  0 siblings, 1 reply; 7+ messages in thread
From: Sergio Durigan Junior @ 2021-05-07 21:56 UTC (permalink / raw)
  To: Luis Machado via Gdb
  Cc: Florian Weimer, Luis Machado via Libc-alpha, Luis Machado, doko

On Friday, May 07 2021, Luis Machado via Gdb wrote:

> On 5/7/21 5:44 PM, Florian Weimer wrote:
>> * Luis Machado via Libc-alpha:
>> 
>>> That's all fine, but there is one small detail that doesn't work for
>>> armhf, and that is discovering if we're dealing with a PC that is arm
>>> mode or thumb mode.
>> Is it possible to recognize Arm mode vs thumb mode based on the NOP
>> encoding at the probe address?
>> 
>
> If we know the instruction is a NOP, it might be possible.

I think it's guaranteed that the instruction is always going to be a
NOP.

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
https://sergiodj.net/


* Re: GDB shared library tracking with stap probes x _dl_debug_state
  2021-05-07 21:56     ` Sergio Durigan Junior
@ 2021-05-08 10:55       ` Florian Weimer
  2021-05-10 14:16         ` Luis Machado
  0 siblings, 1 reply; 7+ messages in thread
From: Florian Weimer @ 2021-05-08 10:55 UTC (permalink / raw)
  To: systemtap
  Cc: Sergio Durigan Junior, Luis Machado via Gdb,
	Luis Machado via Libc-alpha, Luis Machado, doko

* Sergio Durigan Junior:

> On Friday, May 07 2021, Luis Machado via Gdb wrote:
>
>> On 5/7/21 5:44 PM, Florian Weimer wrote:
>>> * Luis Machado via Libc-alpha:
>>> 
>>>> That's all fine, but there is one small detail that doesn't work for
>>>> armhf, and that is discovering if we're dealing with a PC that is arm
>>>> mode or thumb mode.
>>> Is it possible to recognize Arm mode vs thumb mode based on the NOP
>>> encoding at the probe address?
>>> 
>>
>> If we know the instruction is a NOP, it might be possible.
>
> I think it's guaranteed that the instruction is always going to be a
> NOP.

Maybe we can add a comment to that effect to the Systemtap sources?

Start of the thread is here:

  <https://sourceware.org/pipermail/gdb/2021-May/049421.html>

I think there are four distinct two-byte patterns at the probe
address, depending on endianness and thumb/non-thumb mode.  Looking at
the instruction has the clear advantage that it works with today's
binaries.
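
A minimal sketch of that idea, under stated assumptions: the probe site
holds the Armv6K+ NOP hint (0xe320f000) in arm mode or the 16-bit Thumb
NOP (0xbf00) in thumb mode, and instruction bytes follow the named
endianness. Older encodings such as "mov r0, r0" (0xe1a00000) or the
Thumb-1 "mov r8, r8" (0x46c0) are not covered, and BE8 targets, where
instructions stay little-endian, would need care. The names below are
hypothetical.

```python
import struct

ARM_NOP = 0xE320F000   # assumed arm-mode NOP hint encoding
THUMB_NOP = 0xBF00     # assumed 16-bit thumb-mode NOP encoding

# First two bytes in memory for each (mode, instruction endianness) pair.
PATTERNS = {
    struct.pack("<I", ARM_NOP)[:2]: ("arm", "little"),   # 00 f0
    struct.pack(">I", ARM_NOP)[:2]: ("arm", "big"),      # e3 20
    struct.pack("<H", THUMB_NOP): ("thumb", "little"),   # 00 bf
    struct.pack(">H", THUMB_NOP): ("thumb", "big"),      # bf 00
}

def classify_probe_nop(memory: bytes):
    """Guess (mode, endianness) from the bytes at a probe address.

    Returns None when the first two bytes match none of the patterns.
    """
    return PATTERNS.get(bytes(memory[:2]))

# Bytes of the word 0xf8dfbf00 as stored on a little-endian target.
print(classify_probe_nop(bytes.fromhex("00bfdff8")))
```

Returning None for unknown bytes leaves room for a caller to fall back
to the symbol-based checks.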

Thanks,
Florian



* Re: GDB shared library tracking with stap probes x _dl_debug_state
  2021-05-08 10:55       ` Florian Weimer
@ 2021-05-10 14:16         ` Luis Machado
  2021-05-24 14:22           ` Luis Machado
  0 siblings, 1 reply; 7+ messages in thread
From: Luis Machado @ 2021-05-10 14:16 UTC (permalink / raw)
  To: Florian Weimer, systemtap
  Cc: Sergio Durigan Junior, Luis Machado via Gdb,
	Luis Machado via Libc-alpha, doko, Maciej W. Rozycki,
	Ulrich Weigand

cc-ing Maciej and Ulrich for feedback about MIPS / rs6000. I see both 
architectures rely on marking some symbols as special, for different 
purposes.

On 5/8/21 7:55 AM, Florian Weimer wrote:
> * Sergio Durigan Junior:
> 
>> On Friday, May 07 2021, Luis Machado via Gdb wrote:
>>
>>> On 5/7/21 5:44 PM, Florian Weimer wrote:
>>>> * Luis Machado via Libc-alpha:
>>>>
>>>>> That's all fine, but there is one small detail that doesn't work for
>>>>> armhf, and that is discovering if we're dealing with a PC that is arm
>>>>> mode or thumb mode.
>>>> Is it possible to recognize Arm mode vs thumb mode based on the NOP
>>>> encoding at the probe address?
>>>>
>>>
>>> If we know the instruction is a NOP, it might be possible.
>>
>> I think it's guaranteed that the instruction is always going to be a
>> NOP.

That's good, but ...

> 
> Maybe we can add a comment to that effect to the Systemtap sources?
> 
> Start of the thread is here:
> 
>    <https://sourceware.org/pipermail/gdb/2021-May/049421.html>
> 
> I think there are four distinct two-byte patterns at the probe
> address, depending on endianness and thumb/non-thumb mode.  Looking at
> the instruction has the clear advantage that it works with today's
> binaries.
... the way the breakpoint selection works doesn't take into account 
additional input like this. It would be a non-trivial change. Not too 
complex, but still not trivial.

If this is to be implemented, it would be nice to make sure there are 
other architectures affected by this problem and that this sort of 
solution also works for them.

Maciej, do you think MIPS will run into the same issue? If so, is this 
an acceptable solution?

I still think having the symbol information is a cleaner solution.


* Re: GDB shared library tracking with stap probes x _dl_debug_state
  2021-05-10 14:16         ` Luis Machado
@ 2021-05-24 14:22           ` Luis Machado
  0 siblings, 0 replies; 7+ messages in thread
From: Luis Machado @ 2021-05-24 14:22 UTC (permalink / raw)
  To: Florian Weimer, systemtap
  Cc: Sergio Durigan Junior, Luis Machado via Gdb,
	Luis Machado via Libc-alpha, doko, Maciej W. Rozycki,
	Ulrich Weigand

Maciej,

Do you have any feedback on this for MIPS?

On 5/10/21 11:16 AM, Luis Machado wrote:
> cc-ing Maciej and Ulrich for feedback about MIPS / rs6000. I see both 
> architectures rely on marking some symbols as special, for different 
> purposes.
> 
> On 5/8/21 7:55 AM, Florian Weimer wrote:
>> * Sergio Durigan Junior:
>>
>>> On Friday, May 07 2021, Luis Machado via Gdb wrote:
>>>
>>>> On 5/7/21 5:44 PM, Florian Weimer wrote:
>>>>> * Luis Machado via Libc-alpha:
>>>>>
>>>>>> That's all fine, but there is one small detail that doesn't work for
>>>>>> armhf, and that is discovering if we're dealing with a PC that is arm
>>>>>> mode or thumb mode.
>>>>> Is it possible to recognize Arm mode vs thumb mode based on the NOP
>>>>> encoding at the probe address?
>>>>>
>>>>
>>>> If we know the instruction is a NOP, it might be possible.
>>>
>>> I think it's guaranteed that the instruction is always going to be a
>>> NOP.
> 
> That's good, but ...
> 
>>
>> Maybe we can add a comment to that effect to the Systemtap sources?
>>
>> Start of the thread is here:
>>
>>    <https://sourceware.org/pipermail/gdb/2021-May/049421.html>
>>
>> I think there are four distinct two-byte patterns at the probe
>> address, depending on endianness and thumb/non-thumb mode.  Looking at
>> the instruction has the clear advantage that it works with today's
>> binaries.
> ... the way the breakpoint selection works doesn't take into account 
> additional input like this. It would be a non-trivial change. Not too 
> complex, but still not trivial.
> 
> If this is to be implemented, it would be nice to make sure there are 
> other architectures affected by this problem and that this sort of 
> solution also works for them.
> 
> Maciej, do you think MIPS will run into the same issue? If so, is this 
> an acceptable solution?
> 
> I still think having the symbol information is a cleaner solution.
