public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Luis Machado <luis.machado@arm.com>
To: Andrew Burgess <aburgess@redhat.com>,
	Simon Marchi <simark@simark.ca>,
	gdb-patches@sourceware.org
Subject: Re: [PATCH] gdb: relax requirement for the map_failed stap probe to be present
Date: Mon, 5 Dec 2022 10:27:45 +0000	[thread overview]
Message-ID: <9135877b-0e37-6fd2-8c69-e9d0376b75e7@arm.com> (raw)
In-Reply-To: <87mt8240w8.fsf@redhat.com>

On 12/5/22 10:09, Andrew Burgess wrote:
> Luis Machado <luis.machado@arm.com> writes:
> 
>> On 11/29/22 08:27, Luis Machado wrote:
>>> Hi Andrew,
>>>
>>> This seems to have broken armhf on Ubuntu 22.04.
>>>
>>> https://builder.sourceware.org/buildbot/#/builders/169/builds/1163
>>>
>>
>> I haven't investigated this. I only spotted it in the sourceware buildbot page. But I'd guess it is something to do
>> with thumb mode detection early in the startup.
>>
>> Ubuntu has moved to not stripping ld.so for armhf because the probe mechanism is not capable of conveying the thumb mode
>> information, so gdb has to rely on symbols instead. If the changes have touched this fragile area, it may have broken the
>> delicate balance.
> 
> I somehow missed this last week.  I'll take a look today.
> 
> Sorry for the breakage.

No problem. For the record, I haven't managed to reproduce this on my Ubuntu 20.04/22.04 setup. I'll have to give it a try on the
buildbot machine.

Just a heads-up in case this doesn't reproduce for you either.

> 
> Andrew
> 
> 
>>
>>
>>> On 11/28/22 17:18, Andrew Burgess via Gdb-patches wrote:
>>>>
>>>> Thanks you all for your review feedback.
>>>>
>>>> I've now pushed this version of the patch (inc Pedro's suggested typo
>>>> fix).
>>>>
>>>> Thanks,
>>>> Andrew
>>>>
>>>>
>>>> Andrew Burgess <aburgess@redhat.com> writes:
>>>>
>>>>> Simon Marchi <simark@simark.ca> writes:
>>>>>
>>>>>> On 11/22/22 10:09, Andrew Burgess via Gdb-patches wrote:
>>>>>>>   From glibc 2.35 and later, the "map_failed" stap probe is no longer
>>>>>>> included in glibc.  The removal of the probe looks like an accident,
>>>>>>> but it was caused by a glibc commit which meant that the "map_failed"
>>>>>>> probe could no longer be reached; the compiler than helpfully
>>>>>>> optimised out the probe.
>>>>>>>
>>>>>>> In GDB, in solib-svr4.c, we have a list of probes that we look for
>>>>>>> related to the shared library loading detection.  If any of these
>>>>>>> probes are missing then GDB will fall back to the non-probe based
>>>>>>> mechanism for detecting shared library loading.  The "map_failed"
>>>>>>> probe is include in the list of required probes.
>>>>>>>
>>>>>>> This means that on glibc 2.35 (or later) systems, GDB is going to
>>>>>>> always fall back to the non-probes based mechanism for detecting
>>>>>>> shared library loading.
>>>>>>>
>>>>>>> I raised a glibc bug to discuss this issue:
>>>>>>>
>>>>>>>     https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>>>>>>>
>>>>>>> But, whatever the ultimate decision from the glibc team, given there
>>>>>>> are version of glibc in the wild without the "map_failed" probe, we
>>>>>>> probably should update GDB to handle this situation.
>>>>>>>
>>>>>>> The "map_failed" probe is already a little strange, very early
>>>>>>> versions of glibc didn't include this probe, so, in some cases, if
>>>>>>> this probe is missing GDB is happy to ignore it.  In this commit I
>>>>>>> just expand this logic to make the "map_failed" probe fully optional.
>>>>>>>
>>>>>>> With this commit in place, then, when using a glibc 2.35 or later
>>>>>>> system, GDB will once again use the stap probes for shared library
>>>>>>> detection.
>>>>>>> ---
>>>>>>>    gdb/solib-svr4.c | 13 +++++++++----
>>>>>>>    1 file changed, 9 insertions(+), 4 deletions(-)
>>>>>>>
>>>>>>> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
>>>>>>> index 6acaf87960b..87cd06f251a 100644
>>>>>>> --- a/gdb/solib-svr4.c
>>>>>>> +++ b/gdb/solib-svr4.c
>>>>>>> @@ -2205,10 +2205,15 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>>>>>>>          probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>>>>>>> -      /* The "map_failed" probe did not exist in early
>>>>>>> -     versions of the probes code in which the probes'
>>>>>>> -     names were prefixed with "rtld_".  */
>>>>>>> -      if (with_prefix && streq (name, "rtld_map_failed"))
>>>>>>> +      /* The "map_failed" probe did not exist in early versions of the
>>>>>>> +     probes code in which the probes' names were prefixed with
>>>>>>> +     "rtld_".
>>>>>>> +
>>>>>>> +     Additionally, the "map_failed" probe was accidentally removed
>>>>>>> +     from glibc 2.35 and later, when changes in glibc meant the probe
>>>>>>> +     could no longer be reached.  In this case the probe name doesn't
>>>>>>> +     have the "rtld_" prefix.  */
>>>>>>> +      if (streq (probe_info[i].name, "map_failed"))
>>>>>>>        continue;
>>>>>>>          /* Ensure at least one probe for the current name was found.  */
>>>>>>>
>>>>>>> base-commit: 84f9fbe90e5429adb9dee68f04f44c92fa9e2345
>>>>>>> -- 
>>>>>>> 2.25.4
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I looked at this separately, and this was one of the fixes I considered.
>>>>>>
>>>>>> Another option was to make GDB not give up on the probes interface if
>>>>>> failing to look up a probe whose action is DO_NOTHING.  Probes with that
>>>>>> action are not used by GDB for solib bookkeeping, but can be used to
>>>>>> stop on solib events, with "set stop-on-solib-events".  I was just
>>>>>> worried if there was some cases where a probe would be missing, but the
>>>>>> corresponding event could be caught if using the original interface.  In
>>>>>> that case, using the probes interface would be a regression.  But it's
>>>>>> probably not worth wondering about.  If that happens it's just a bug
>>>>>> that needs to be fixed.  In the case we are looking at, if the
>>>>>> map_failed probe gets optimized out, then surely the corresponding call
>>>>>> to the r_brk function would also be optimized out.
>>>>>
>>>>> I also considered just ignoring any probe was (a) missing, and (b) had a
>>>>> DO_NOTHING action.  The reason I didn't post this patch was because, at
>>>>> the time, my thinking was: if we don't care about any probe with a
>>>>> DO_NOTHING action, why even look for those probes, why not just remove
>>>>> them from the list?
>>>>>
>>>>> I think you've (partially) convinced me that the user might be
>>>>> interested in seeing a stop at these probes even if GDB's action is
>>>>> DO_NOTHING.
>>>>>
>>>>> I say partially above because GDB doesn't really do anything to tell the
>>>>> user which probe we stopped at, e.g. was is "init_start", "map_start",
>>>>> "map_failed", etc.  The user might be able to figure it out from the
>>>>> backtrace, but I still think it's not going to be trivial in all cases,
>>>>> e.g. "map_start" and "map_failed" are both located in the same function,
>>>>> so I think the user would need to lookup the probe address in the ELF,
>>>>> then compare that to the stop address.  Not impossible, but, I suspect,
>>>>> the complexity is an indication that users are not doing this much.
>>>>> Thus, I suspect, in reality, nobody really cares about the DO_NOTHING
>>>>> probes.
>>>>>
>>>>> However, I think there is enough of a justification there for keeping
>>>>> the probes in the list, and just skipping any DO_NOTHING probes that
>>>>> don't exist.
>>>>>
>>>>> Below then, is an alternative patch.  I don't have a strong preference
>>>>> between this one, and the original[1], but I thought I'd post this for
>>>>> discussion.  If this is preferred then I can just merge this.
>>>>>
>>>>> [1] I'll also post an update to the original patch shortly that
>>>>> addresses Lancelot's feedback.
>>>>>
>>>>> Thanks,
>>>>> Andrew
>>>>>
>>>>> ---
>>>>>
>>>>> commit 11be1f25f446e68c23d0709cde46e32ff24b7eb9
>>>>> Author: Andrew Burgess <aburgess@redhat.com>
>>>>> Date:   Tue Nov 22 12:45:56 2022 +0000
>>>>>
>>>>>       gdb: relax requirement for the map_failed stap probe to be present
>>>>>       From glibc 2.35 and later, the "map_failed" stap probe is no longer
>>>>>       included in glibc.  The removal of the probe looks like an accident,
>>>>>       but it was caused by a glibc commit which meant that the "map_failed"
>>>>>       probe could no longer be reached; the compiler than helpfully
>>>>>       optimised out the probe.
>>>>>       In GDB, in solib-svr4.c, we have a list of probes that we look for
>>>>>       related to the shared library loading detection.  If any of these
>>>>>       probes are missing then GDB will fall back to the non-probe based
>>>>>       mechanism for detecting shared library loading.  The "map_failed"
>>>>>       probe is include in the list of required probes.
>>>>>       This means that on glibc 2.35 (or later) systems, GDB is going to
>>>>>       always fall back to the non-probes based mechanism for detecting
>>>>>       shared library loading.
>>>>>       I raised a glibc bug to discuss this issue:
>>>>>         https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>>>>>       But, whatever the ultimate decision from the glibc team, given there
>>>>>       are version of glibc in the wild without the "map_failed" probe, we
>>>>>       probably should update GDB to handle this situation.
>>>>>       The "map_failed" probe is already a little strange, very early
>>>>>       versions of glibc didn't include this probe, so, in some cases, if
>>>>>       this probe is missing GDB is happy to ignore it.  This is fine, the
>>>>>       action associated with this probe inside GDB is DO_NOTHING, this means
>>>>>       the probe isn't actually required in order for GDB to correctly detect
>>>>>       the loading of shared libraries.
>>>>>       In this commit I propose changing the rules so that any probe whose
>>>>>       action is DO_NOTHING, is optional.
>>>>>       There is one possible downside to this change, and that concerns 'set
>>>>>       stop-on-solib-events on'.  If a probe is removed from glibc, but the
>>>>>       old style breakpoint based mechanism is still in place within glibc
>>>>>       for that same event, then GDB will stop when using the old style
>>>>>       non-probe based mechanism, but not when using the probes based
>>>>>       mechanism.
>>>>>       For the map_failed case this is not a problem, both the map_failed
>>>>>       probe, and the call to the old style breakpoint location were
>>>>>       optimised out, and so neither event (probes based, or breakpoint
>>>>>       based) will trigger.  This would only become an issue if glibc removed
>>>>>       a probe, but left the breakpoint in place (this would almost certainly
>>>>>       be a bug in glibc).
>>>>>       For now, I'm proposing that we just don't worry about this.  Because
>>>>>       some probes have actions that are not DO_NOTHING, then we know the
>>>>>       user will always seem _some_ stops when a shared library is
>>>>>       loaded/unloaded, and (I'm guessing), in most cases, that's all they
>>>>>       care about.  I figure when someone complains then we can figure out
>>>>>       what the right solution is then.
>>>>>       With this commit in place, then, when using a glibc 2.35 or later
>>>>>       system, GDB will once again use the stap probes for shared library
>>>>>       detection.
>>>>>
>>>>> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
>>>>> index 6acaf87960b..10e446af908 100644
>>>>> --- a/gdb/solib-svr4.c
>>>>> +++ b/gdb/solib-svr4.c
>>>>> @@ -2205,15 +2205,34 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>>>>>          probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>>>>> -      /* The "map_failed" probe did not exist in early
>>>>> -     versions of the probes code in which the probes'
>>>>> -     names were prefixed with "rtld_".  */
>>>>> -      if (with_prefix && streq (name, "rtld_map_failed"))
>>>>> -    continue;
>>>>> -
>>>>>          /* Ensure at least one probe for the current name was found.  */
>>>>>          if (probes[i].empty ())
>>>>> -    return false;
>>>>> +    {
>>>>> +      /* The "map_failed" probe did not exist in early versions of the
>>>>> +         probes code in which the probes' names were prefixed with
>>>>> +         "rtld_".
>>>>> +
>>>>> +         Additionally, the "map_failed" probe was accidentally removed
>>>>> +         from glibc 2.35 and later, when changes in glibc meant the
>>>>> +         probe could no longer be reached, and the compiler optimized
>>>>> +         the probe away.  In this case the probe name doesn't have the
>>>>> +         "rtld_" prefix.
>>>>> +
>>>>> +         To handle this, and give GDB as much flexibility as possible,
>>>>> +         we make the rule that, if a probe isn't required for the
>>>>> +         correct operation of GDB (i.e. it's action is DO_NOTHING),
>>>>> +         then we will still use the probes interface, even if that
>>>>> +         probe is missing.
>>>>> +
>>>>> +         The only (possible) downside of this is that, if the user has
>>>>> +         'set stop-on-solib-events on' in effect, then they might get
>>>>> +         fewer events using the probes interface than with the classic
>>>>> +         non-probes interface.  */
>>>>> +      if (prove_info[i].action == DO_NOTHING)
>>>>> +        continue;
>>>>> +      else
>>>>> +        return false;
>>>>> +    }
>>>>>          /* Ensure probe arguments can be evaluated.  */
>>>>>          for (probe *p : probes[i])
>>>>
>>>
> 


  reply	other threads:[~2022-12-05 10:27 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-22 15:09 Andrew Burgess
2022-11-22 15:31 ` Simon Marchi
2022-11-24 10:46   ` Andrew Burgess
2022-11-24 15:10     ` Simon Marchi
2022-11-24 16:13     ` Lancelot SIX
2022-11-28 15:47     ` Pedro Alves
2022-11-28 17:18     ` Andrew Burgess
2022-11-29  8:27       ` Luis Machado
2022-11-29  8:38         ` Luis Machado
2022-12-05 10:09           ` Andrew Burgess
2022-12-05 10:27             ` Luis Machado [this message]
2022-12-05 12:04               ` Andrew Burgess
2022-12-05 12:55                 ` Luis Machado
2022-11-22 15:33 ` Lancelot SIX
2022-11-24 11:39   ` Andrew Burgess

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9135877b-0e37-6fd2-8c69-e9d0376b75e7@arm.com \
    --to=luis.machado@arm.com \
    --cc=aburgess@redhat.com \
    --cc=gdb-patches@sourceware.org \
    --cc=simark@simark.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).