From: Simon Marchi <simark@simark.ca>
To: Florian Weimer <fweimer@redhat.com>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: gdb@sourceware.org, Simon Marchi <simon.marchi@efficios.com>,
libc-alpha <libc-alpha@sourceware.org>
Subject: Re: rseq and gdb
Date: Fri, 17 Dec 2021 10:51:34 -0500
Message-ID: <ce8e87f6-8b8c-d29f-1f77-afdf25f615bb@simark.ca>
In-Reply-To: <87wnk4fe3t.fsf@oldenburg.str.redhat.com>
On 2021-12-16 4:12 p.m., Florian Weimer via Gdb wrote:
> * Mathieu Desnoyers:
>
>> I suspect that gdb should ideally do something to allow it to
>> single-step through rseq critical sections. In librseq [1], we emit
>> the __rseq_cs_ptr_array and __rseq_exit_point_array sections to allow
>> gdb to know about rseq critical sections and skip over those critical
>> sections as needed. Otherwise single-stepping over each instruction of
>> a rseq critical section will loop forever.
>
> That's probably something that should be expressed in the DWARF data.

I thought about this a little bit, and I don't think it belongs in
DWARF:

 - If it's in DWARF, it means that by default (without installing the
   corresponding -dbg/-debug package), the information is not
   available. The user might have debug info for their program /
   libraries, but not for libc.so, so GDB won't know about the critical
   sections in libc.so.

   GDB should ideally know about critical sections even if there's no
   debug info. Let's say the user does a "continue" while a software
   watchpoint is set: GDB will be in "single step all instructions"
   mode. If execution happens to cross an rseq critical section,
   execution will hang. So it seems better to have this information in
   the main binary (not the separate debug info). It should be very
   small, just a few addresses, so I don't think size will be an issue.

 - Currently, the ELF sections are emitted by macros that add some
   assembly directives and labels, which is pretty straightforward. If
   it's in DWARF, I suppose it needs to go through the compiler. The
   rseq library needs to communicate the information to the compiler,
   which will then put it somewhere in a DWARF section. Nothing
   impossible, I suppose, but is there a precedent for this way of
   doing things?

 - A philosophical argument more than a practical one: the goal of
   DWARF is to express the mapping between the source language and the
   machine code, whereas rseq critical sections are lower-level
   platform details.
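
To give an idea of how small the table and the lookup on the debugger
side could be, here is a minimal sketch in C. It assumes the debugger
has already read the descriptors into an array; the type cs_desc and
the functions find_cs and plan_step are hypothetical names, with
fields borrowed from the kernel's struct rseq_cs. It only checks the
current pc and is not how GDB or librseq actually do it.

```c
#include <stddef.h>
#include <stdint.h>

/* Debugger-side copy of one critical-section descriptor.  The field
   names follow the kernel's struct rseq_cs; the type itself is
   hypothetical.  */
struct cs_desc {
    uint64_t start_ip;            /* first instruction of the section */
    uint64_t post_commit_offset;  /* length of the section in bytes */
    uint64_t abort_ip;            /* where the kernel resumes on abort */
};

/* Return the descriptor whose range contains pc, or NULL if pc is not
   inside any known critical section.  */
static const struct cs_desc *
find_cs(const struct cs_desc *table, size_t count, uint64_t pc)
{
    for (size_t i = 0; i < count; i++) {
        /* Unsigned subtraction folds the lower-bound and upper-bound
           checks into a single compare.  */
        if (pc - table[i].start_ip < table[i].post_commit_offset)
            return &table[i];
    }
    return NULL;
}

/* Decide how to step from pc.  Returns 0 when a plain single-step is
   fine.  Otherwise returns 1 and stores the two addresses where
   temporary breakpoints would go before resuming the thread.  */
static int
plan_step(const struct cs_desc *table, size_t count, uint64_t pc,
          uint64_t *post_commit_ip, uint64_t *abort_ip)
{
    const struct cs_desc *cs = find_cs(table, count, pc);

    if (cs == NULL)
        return 0;
    *post_commit_ip = cs->start_ip + cs->post_commit_offset;
    *abort_ip = cs->abort_ip;
    return 1;
}
```

A real implementation would also have to handle branches that leave
the section early and the step that lands on the first instruction of
a section; both are left out here.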

A bit of a corner case I am wondering about is GDB attaching to a
process that uses an rseq-using library (libc or another) whose .so
doesn't exist on disk anymore (perhaps because the package has been
upgraded in the meantime). With the rseq sections information in the
binary (regardless of whether it is in a simple ELF section or in
DWARF), GDB wouldn't be able to load it then. The current simple ELF
section is marked as allocated (at least, that's what Mathieu told
me :)), which means the information is somewhere in memory, but to
find it GDB would need to get its address from a symbol, which is not
possible without the binary. The only solution I could see for this
is if the dynamic linker structures gave us that information (either
the address of the allocated section or the tables themselves).
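
To illustrate the gap: the dynamic linker already exposes each loaded
object's program headers at run time, for example through
dl_iterate_phdr() in glibc, but program headers describe segments, not
named sections. The sketch below runs in-process (GDB would instead
read the equivalent structures from the inferior's memory) and only
shows what is reachable today. The walk_stats type and the helper
names are made up for the illustration.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>

/* Counters filled in while walking the loaded objects.  */
struct walk_stats {
    size_t objects;        /* main program + shared objects seen */
    size_t load_segments;  /* PT_LOAD program headers seen */
};

/* Callback invoked by dl_iterate_phdr() once per loaded object.
   Only program headers are visible here; section names such as
   __rseq_cs_ptr_array are not part of this information.  */
static int
visit_object(struct dl_phdr_info *info, size_t size, void *data)
{
    struct walk_stats *st = data;

    (void) size;
    st->objects++;
    for (int i = 0; i < info->dlpi_phnum; i++) {
        if (info->dlpi_phdr[i].p_type == PT_LOAD)
            st->load_segments++;
    }
    return 0;  /* returning 0 continues the iteration */
}

/* Walk every object known to the dynamic linker and return the
   counters.  */
static struct walk_stats
walk_loaded_objects(void)
{
    struct walk_stats st = { 0, 0 };

    dl_iterate_phdr(visit_object, &st);
    return st;
}
```

Nothing in there says where a particular allocated section lives
inside a segment, which is why some new piece of information from the
dynamic linker would be needed.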
>> Now that glibc plans to enable rseq by default starting with glibc 2.35,
>> it appears to be a good timing to raise this topic with the gdb community.
>
> It's less of an issue than the CRIU problem that we discussed because it
> will only affect attempts to debug actually rseq-using applications.
> (The CRIU problem affects everything.) Not saying that debugging
> support isn't important, just trying to put it into perspective. 8-)

Indeed. Although now is the time to think about it, so that we don't
make choices we regret later.

Simon