public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH][gdb/tdep] Use pid to choose x86_64 process 64/32-bitness
@ 2021-05-07  8:44 Tom de Vries
  2021-05-07 19:27 ` Simon Marchi
  0 siblings, 1 reply; 10+ messages in thread
From: Tom de Vries @ 2021-05-07  8:44 UTC (permalink / raw)
  To: gdb-patches

Hi,

In a linux kernel mailing list discussion, it was mentioned that "gdb has
this odd thing where it takes the 64-bit vs 32-bit data for the whole process
from one thread, and picks the worst possible thread to do it (ie explicitly
not even the main thread, ...)" [1].

The picking of the thread is done here in
x86_linux_nat_target::read_description:
...
  /* GNU/Linux LWP ID's are process ID's.  */
  tid = inferior_ptid.lwp ();
  if (tid == 0)
    tid = inferior_ptid.pid (); /* Not a threaded program.  */
...

To understand what this code does, let's investigate a scenario in which
inferior_ptid.lwp () != inferior_ptid.pid ().

Say we start exec jit-attach-pie, identified with pid x.  The main thread
starts another thread that sleeps, and then the main thread waits for the
sleeping thread.  So we have two threads, identified with LWP IDs x and x+1:
...
PID  LWP  CMD
x    x    ./jit-attach-pie
x    x+1  ./jit-attach-pie
...
[ The thread with LWP x is known as the thread group leader. ]

When attaching to this exec using the pid, gdb does a stop_all_threads which
iterates over all the threads, first LWP x, and then LWP x+1.

So the state we arrive with at x86_linux_nat_target::read_description is:
...
(gdb) p inferior_ptid
$1 = {m_pid = x, m_lwp = x+1, m_tid = 0}
...
and consequently we probe 64/32-bitness from thread LWP x+1.

[ Note that this is different from when gdb doesn't attach but instead
launches the exec itself, in which case there's no stop_all_threads needed,
and the probed thread is LWP x. ]

According to aforementioned remark, a better choice would have been the main
thread, that is, LWP x.

This patch implements that choice, by simply doing:
...
  tid = inferior_ptid.pid ();
...

The fact that gdb makes a per-process choice for 64/32-bitness is a problem in
itself: each thread can be in either 64 or 32 bit mode.  That is a problem
that this patch doesn't fix.

Tested on x86_64-linux.

[1] https://lore.kernel.org/io-uring/\
  CAHk-=wh0KoEZXPYMGkfkeVEerSCEF1AiCZSvz9TRrx=Kj74D+Q@mail.gmail.com/

Any comments?

Thanks,
- Tom

[gdb/tdep] Use pid to choose x86_64 process 64/32-bitness

gdb/ChangeLog:

2021-05-07  Tom de Vries  <tdevries@suse.de>

	PR tdep/27822
	* x86-linux-nat.c (x86_linux_nat_target::read_description): Use
	pid to determine if process is 64-bit or 32-bit.

---
 gdb/x86-linux-nat.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/gdb/x86-linux-nat.c b/gdb/x86-linux-nat.c
index 85c7f0ddc94..adea1ad0092 100644
--- a/gdb/x86-linux-nat.c
+++ b/gdb/x86-linux-nat.c
@@ -113,10 +113,7 @@ x86_linux_nat_target::read_description ()
   static uint64_t xcr0;
   uint64_t xcr0_features_bits;
 
-  /* GNU/Linux LWP ID's are process ID's.  */
-  tid = inferior_ptid.lwp ();
-  if (tid == 0)
-    tid = inferior_ptid.pid (); /* Not a threaded program.  */
+  tid = inferior_ptid.pid ();
 
 #ifdef __x86_64__
   {

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH][gdb/tdep] Use pid to choose x86_64 process 64/32-bitness
  2021-05-07  8:44 [PATCH][gdb/tdep] Use pid to choose x86_64 process 64/32-bitness Tom de Vries
@ 2021-05-07 19:27 ` Simon Marchi
  2021-05-07 20:06   ` Luis Machado
  2021-05-19 16:32   ` Tom de Vries
  0 siblings, 2 replies; 10+ messages in thread
From: Simon Marchi @ 2021-05-07 19:27 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

On 2021-05-07 4:44 a.m., Tom de Vries wrote:
> Hi,
> 
> In a linux kernel mailing list discussion, it was mentioned that "gdb has
> this odd thing where it takes the 64-bit vs 32-bit data for the whole process
> from one thread, and picks the worst possible thread to do it (ie explicitly
> not even the main thread, ...)" [1].
> 
> The picking of the thread is done here in
> x86_linux_nat_target::read_description:
> ...
>   /* GNU/Linux LWP ID's are process ID's.  */
>   tid = inferior_ptid.lwp ();
>   if (tid == 0)
>     tid = inferior_ptid.pid (); /* Not a threaded program.  */
> ...
> 
> To understand what this code does, let's investigate a scenario in which
> inferior_ptid.lwp () != inferior_ptid.pid ().
> 
> Say we start exec jit-attach-pie, identified with pid x.  The main thread
> starts another thread that sleeps, and then the main thread waits for the
> sleeping thread.  So we have two threads, identified with LWP IDs x and x+1:
> ...
> PID  LWP  CMD
> x    x    ./jit-attach-pie
> x    x+1  ./jit-attach-pie
> ...
> [ The thread with LWP x is known as the thread group leader. ]
> 
> When attaching to this exec using the pid, gdb does a stop_all_threads which
> iterates over all the threads, first LWP x, and then LWP x+1.
> 
> So the state we arrive with at x86_linux_nat_target::read_description is:
> ...
> (gdb) p inferior_ptid
> $1 = {m_pid = x, m_lwp = x+1, m_tid = 0}
> ...
> and consequently we probe 64/32-bitness from thread LWP x+1.
> 
> [ Note that this is different from when gdb doesn't attach but instead
> launches the exec itself, in which case there's no stop_all_threads needed,
> and the probed thread is LWP x. ]

Well, it's mostly because when running, there's only one thread to begin
with, and the io_uring threads are likely not there yet.

> According to aforementioned remark, a better choice would have been the main
> thread, that is, LWP x.

That doesn't tell the whole story though.  Up to now, probing the
non-main thread was maybe a slightly strange thing to do, but it worked
because all threads were set up the same way for that matter.  It worked
under the assumption that all threads had the same CS/DS contents (the
registers we use to detect architecture).  The failure happens when
there are io_uring threads present.  These threads are of a strange kind
because they are like any userspace threads, except that they never
return to userspace.  Because of that, they don't have a meaningful
userspace register state, reading their registers returns garbage.  So
if we choose one of these threads for probing the architecture of the
process (which, as you mentioned below, is bogus, but that's how it
works now), then we get bad results.

From what I read (from the lkml thread you linked), the kernel's plan is
to mitigate it by filling the register state of these threads to
something GDB is happy with, even though that's unnecessary otherwise.
I don't know what's the state of that though.  But I agree it would be
good to fix this on our side as well.

> 
> This patch implements that choice, by simply doing:
> ...
>   tid = inferior_ptid.pid ();

I think that's ok.

> ...
> 
> The fact that gdb makes a per-process choice for 64/32-bitness is a problem in
> itself: each thread can be in either 64 or 32 bit mode.  That is a problem
> that this patch doesn't fix.

Not only that, but from what I understood, it's not right to decide once
and for all whether a thread is 32 or 64.  It could switch between the
two modes.  So all we can really know is what mode a thread is
currently, while it is stopped.  But who knows what mode it's going to
be in 5 instructions from now.

This actually is a problem to people who debug early boot code with
QEMU, because there the "thread" goes from 32 to 64 bit mode at some
point.  To handle this, it sounds like GDB should re-evaluate the
architecture of each thread every time it stops.  That sounds expensive,
but maybe this mode could exist behind a knob that is disabled by
default, for these special cases.

There might also be features that can only work if we assume the
architecture of a thread never changes, I'm not sure.

Also, we already have kind of partial machinery for having threads with
different architectures, that's why we have
target_ops::thread_architecture.  But read_description is used to set
inferior::gdbarch, which is kind of the "default" arch of the process,
used for many things.  And the default implementation of
thread_architecture, process_stratum_target::thread_architecture, just
returns inf->gdbarch.  So in practice, today, for Linux it's true that
we have one single architecture for the whole inferior.  The exception
is aarch64-linux, which can return different architectures for different
SVE configurations.

> Tested on x86_64-linux.
> 
> [1] https://lore.kernel.org/io-uring/\
>   CAHk-=wh0KoEZXPYMGkfkeVEerSCEF1AiCZSvz9TRrx=Kj74D+Q@mail.gmail.com/

Hmm, the link is not clickable / does not lead to the right place,
because it is split in two.

> 
> Any comments?

1. I think it would be good to have a test for this.  I wanted to actually
try making (or finding) a program that uses io_uring and see the problem
with my own eyes, but I didn't have time yet.  But there could be such a
test in our testsuite (we would check that the kernel actually supports
io_uring, of course).  We could validate that when we select one of the
io threads and do things like backtrace, the failure is handled
gracefully.

2. There are other architectures than x86-64 where it looks like we would
also probe a non-main thread, like ARM:

    https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/arm-linux-nat.c;h=662dade0a122f8adf25da91577b1de375df7785b;hb=HEAD

It might not fail like it does on x86-64, it depends on the kernel
implementation.  But we should probably change them the same way.

3. Have you tested attaching to a program where the main thread has
exited?  It might still work (and there might be a test for it in our
testsuite) because the thread group leader stays around as a zombie as
long as other threads exist.  But it would be good to test it.

Simon


* Re: [PATCH][gdb/tdep] Use pid to choose x86_64 process 64/32-bitness
  2021-05-07 19:27 ` Simon Marchi
@ 2021-05-07 20:06   ` Luis Machado
  2021-05-19 16:32   ` Tom de Vries
  1 sibling, 0 replies; 10+ messages in thread
From: Luis Machado @ 2021-05-07 20:06 UTC (permalink / raw)
  To: Simon Marchi, Tom de Vries, gdb-patches

On 5/7/21 4:27 PM, Simon Marchi via Gdb-patches wrote:
> On 2021-05-07 4:44 a.m., Tom de Vries wrote:
>> Hi,
>>
>> In a linux kernel mailing list discussion, it was mentioned that "gdb has
>> this odd thing where it takes the 64-bit vs 32-bit data for the whole process
>> from one thread, and picks the worst possible thread to do it (ie explicitly
>> not even the main thread, ...)" [1].
>>
>> The picking of the thread is done here in
>> x86_linux_nat_target::read_description:
>> ...
>>    /* GNU/Linux LWP ID's are process ID's.  */
>>    tid = inferior_ptid.lwp ();
>>    if (tid == 0)
>>      tid = inferior_ptid.pid (); /* Not a threaded program.  */
>> ...
>>
>> To understand what this code does, let's investigate a scenario in which
>> inferior_ptid.lwp () != inferior_ptid.pid ().
>>
>> Say we start exec jit-attach-pie, identified with pid x.  The main thread
>> starts another thread that sleeps, and then the main thread waits for the
>> sleeping thread.  So we have two threads, identified with LWP IDs x and x+1:
>> ...
>> PID  LWP  CMD
>> x    x    ./jit-attach-pie
>> x    x+1  ./jit-attach-pie
>> ...
>> [ The thread with LWP x is known as the thread group leader. ]
>>
>> When attaching to this exec using the pid, gdb does a stop_all_threads which
>> iterates over all the threads, first LWP x, and then LWP x+1.
>>
>> So the state we arrive with at x86_linux_nat_target::read_description is:
>> ...
>> (gdb) p inferior_ptid
>> $1 = {m_pid = x, m_lwp = x+1, m_tid = 0}
>> ...
>> and consequently we probe 64/32-bitness from thread LWP x+1.
>>
>> [ Note that this is different from when gdb doesn't attach but instead
>> launches the exec itself, in which case there's no stop_all_threads needed,
>> and the probed thread is LWP x. ]
> 
> Well, it's mostly because when running, there's only one thread to begin
> with, and the io_uring threads are likely not there yet.
> 
>> According to aforementioned remark, a better choice would have been the main
>> thread, that is, LWP x.
> 
> That doesn't tell the whole story though.  Up to now, probing the
> non-main thread was maybe a slightly strange thing to do, but it worked
> because all threads were set up the same way for that matter.  It worked
> under the assumption that all threads had the same CS/DS contents (the
> registers we use to detect architecture).  The failure happens when
> there are io_uring threads present.  These threads are of a strange kind
> because they are like any userspace threads, except that they never
> return to userspace.  Because of that, they don't have a meaningful
> userspace register state, reading their registers returns garbage.  So
> if we choose one of these threads for probing the architecture of the
> process (which, as you mentioned below, is bogus, but that's how it
> works now), then we get bad results.
> 
> From what I read (from the lkml thread you linked), the kernel's plan is
> to mitigate it by filling the register state of these threads to
> something GDB is happy with, even though that's unnecessary otherwise.
> I don't know what's the state of that though.  But I agree it would be
> good to fix this on our side as well.
> 
>>
>> This patch implements that choice, by simply doing:
>> ...
>>    tid = inferior_ptid.pid ();
> 
> I think that's ok.
> 
>> ...
>>
>> The fact that gdb makes a per-process choice for 64/32-bitness is a problem in
>> itself: each thread can be in either 64 or 32 bit mode.  That is a problem
>> that this patch doesn't fix.
> 
> Not only that, but from what I understood, it's not right to decide once
> and for all whether a thread is 32 or 64.  It could switch between the
> two modes.  So all we can really know is what mode a thread is
> currently, while it is stopped.  But who knows what mode it's going to
> be in 5 instructions from now.
> 
> This actually is a problem to people who debug early boot code with
> QEMU, because there the "thread" goes from 32 to 64 bit mode at some
> point.  To handle this, it sounds like GDB should re-evaluate the
> architecture of each thread every time it stops.  That sounds expensive,
> but maybe this mode could exist behind a knob that is disabled by
> default, for these special cases.
> 
> There might also be features that can only work if we assume the
> architecture of a thread never changes, I'm not sure.
> 
> Also, we already have kind of partial machinery for having threads with
> different architectures, that's why we have
> target_ops::thread_architecture.  But read_description is used to set
> inferior::gdbarch, which is kind of the "default" arch of the process,
> used for many things.  And the default implementation of
> thread_architecture, process_stratum_target::thread_architecture, just
> returns inf->gdbarch.  So in practice, today, for Linux it's true that
> we have one single architecture for the whole inferior.  The exception
> is aarch64-linux, which can return different architectures for different
> SVE configurations.

A while ago I suggested having a better mechanism to update the 
architecture of each thread mid-execution (in order to support SVE 
dynamic vector lengths for gdbserver), including remote packet changes. 
But the feedback was that it would be too much overhead.

In summary, assuming a per-process architecture is not right for SVE and 
it sounds like we should be moving towards per-thread architectures in 
the future.

We'll need to dump per-thread GDB_TDESC notes for a core file as well.


* Re: [PATCH][gdb/tdep] Use pid to choose x86_64 process 64/32-bitness
  2021-05-07 19:27 ` Simon Marchi
  2021-05-07 20:06   ` Luis Machado
@ 2021-05-19 16:32   ` Tom de Vries
  2021-05-20 10:42     ` Alan Hayward
  1 sibling, 1 reply; 10+ messages in thread
From: Tom de Vries @ 2021-05-19 16:32 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches
  Cc: Kevin Buettner, Andreas Arnez, Alan Hayward, Luis Machado

[-- Attachment #1: Type: text/plain, Size: 7567 bytes --]

On 5/7/21 9:27 PM, Simon Marchi wrote:
> On 2021-05-07 4:44 a.m., Tom de Vries wrote:
>> Hi,
>>
>> In a linux kernel mailing list discussion, it was mentioned that "gdb has
>> this odd thing where it takes the 64-bit vs 32-bit data for the whole process
>> from one thread, and picks the worst possible thread to do it (ie explicitly
>> not even the main thread, ...)" [1].
>>
>> The picking of the thread is done here in
>> x86_linux_nat_target::read_description:
>> ...
>>   /* GNU/Linux LWP ID's are process ID's.  */
>>   tid = inferior_ptid.lwp ();
>>   if (tid == 0)
>>     tid = inferior_ptid.pid (); /* Not a threaded program.  */
>> ...
>>
>> To understand what this code does, let's investigate a scenario in which
>> inferior_ptid.lwp () != inferior_ptid.pid ().
>>
>> Say we start exec jit-attach-pie, identified with pid x.  The main thread
>> starts another thread that sleeps, and then the main thread waits for the
>> sleeping thread.  So we have two threads, identified with LWP IDs x and x+1:
>> ...
>> PID  LWP  CMD
>> x    x    ./jit-attach-pie
>> x    x+1  ./jit-attach-pie
>> ...
>> [ The thread with LWP x is known as the thread group leader. ]
>>
>> When attaching to this exec using the pid, gdb does a stop_all_threads which
>> iterates over all the threads, first LWP x, and then LWP x+1.
>>
>> So the state we arrive with at x86_linux_nat_target::read_description is:
>> ...
>> (gdb) p inferior_ptid
>> $1 = {m_pid = x, m_lwp = x+1, m_tid = 0}
>> ...
>> and consequently we probe 64/32-bitness from thread LWP x+1.
>>
>> [ Note that this is different from when gdb doesn't attach but instead
>> launches the exec itself, in which case there's no stop_all_threads needed,
>> and the probed thread is LWP x. ]
> 
> Well, it's mostly because when running, there's only one thread to begin
> with,

Correct, I've updated the comment.

> and the io_uring threads are likely not there yet.
> 

I'm trying to discuss things here first relative to the current
situation, so without the kernel patch that makes io_uring threads
user-visible.

>> According to aforementioned remark, a better choice would have been the main
>> thread, that is, LWP x.
> 
> That doesn't tell the whole story though.

Ack, updated log message.

> Up to now, probing the
> non-main thread was maybe a slightly strange thing to do, but it worked
> because all threads were set up the same way for that matter.  It worked
> under the assumption that all threads had the same CS/DS contents (the
> registers we use to detect architecture).  The failure happens when
> there are io_uring threads present.  These threads are of a strange kind
> because they are like any userspace threads, except that they never
> return to userspace.  Because of that, they don't have a meaningful
> userspace register state, reading their registers returns garbage.  So
> if we choose one of these threads for probing the architecture of the
> process (which, as you mentioned below, is bogus, but that's how it
> works now), then we get bad results.
> 
> From what I read (from the lkml thread you linked), the kernel's plan is
> to mitigate it by filling the register state of these threads to
> something GDB is happy with, even though that's unnecessary otherwise.
> I don't know what's the state of that though.  But I agree it would be
> good to fix this on our side as well.
> 
>>
>> This patch implements that choice, by simply doing:
>> ...
>>   tid = inferior_ptid.pid ();
> 
> I think that's ok.
> 
>> ...
>>
>> The fact that gdb makes a per-process choice for 64/32-bitness is a problem in
>> itself: each thread can be in either 64 or 32 bit mode.  That is a problem
>> that this patch doesn't fix.
> 
> Not only that, but from what I understood, it's not right to decide once
> and for all whether a thread is 32 or 64.  It could switch between the
> two modes.  So all we can really know is what mode a thread is
> currently, while it is stopped.  But who knows what mode it's going to
> be in 5 instructions from now.
> 

Ack, I've updated the message to make that more clear.

> This actually is a problem to people who debug early boot code with
> QEMU, because there the "thread" goes from 32 to 64 bit mode at some
> point.  To handle this, it sounds like GDB should re-evaluate the
> architecture of each thread every time it stops.  That sounds expensive,
> but maybe this mode could exist behind a knob that is disabled by
> default, for these special cases.
> 
> There might also be features that can only work if we assume the
> architecture of a thread never changes, I'm not sure.
> 

It would already be good if users could tell gdb that things have
switched, currently not even that is possible, see PR27838.

> Also, we already have kind of partial machinery for having threads with
> different architectures, that's why we have
> target_ops::thread_architecture.  But read_description is used to set
> inferior::gdbarch, which is kind of the "default" arch of the process,
> used for many things.  And the default implementation of
> thread_architecture, process_stratum_target::thread_architecture, just
> returns inf->gdbarch.  So in practice, today, for Linux it's true that
> we have one single architecture for the whole inferior.  The exception
> is aarch64-linux, which can return different architectures for different
> SVE configurations.
> 
>> Tested on x86_64-linux.
>>
>> [1] https://lore.kernel.org/io-uring/\
>>   CAHk-=wh0KoEZXPYMGkfkeVEerSCEF1AiCZSvz9TRrx=Kj74D+Q@mail.gmail.com/
> 
> Hmm, the link is not clickable / does not lead to the right place,
> because it is split in two.
> 

Yeah, I don't like lines overrunning, but OK, fixed.

>>
>> Any comments?
> 
> 1. I think it would be good to have a test for this.  I wanted to actually
> try making (or finding) a program that uses io_uring and see the problem
> with my own eyes, but I didn't have time yet.  But there could be such a
> test in our testsuite (we would check that the kernel actually supports
> io_uring, of course).  We could validate that when we select one of the
> io threads and do things like backtrace, the failure is handled
> gracefully.
> 

Yeah, that sounds like something I could try once the kernel exposing
io_uring threads to user-space lands in tumbleweed.

> 2. There are other architectures than x86-64 where it looks like we would
> also probe a non-main thread, like ARM:
> 
>     https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/arm-linux-nat.c;h=662dade0a122f8adf25da91577b1de375df7785b;hb=HEAD
> 
> It might not fail like it does on x86-64, it depends on the kernel
> implementation.  But we should probably change them the same way.
> 

Ack, I've scanned the sources of other ports, and added a few more
(cc-ing port maintainers).

> 3. Have you tested attaching to a program where the main thread has
> exited?  It might still work (and there might be a test for it in our
> testsuite) because the thread group leader stays around as a zombie as
> long as other threads exist.  But it would be good to test it.

Good question.  I've tried and got both with system gdb and patched
upstream gdb the same result:
...
$ gdb -q -p $(pidof a.out)
Attaching to process 28988
warning: process 28988 is a zombie - the process has already terminated
ptrace: Operation not permitted.
(gdb)
$ ~/gdb_versions/devel/gdb -q -p $(pidof a.out)
Attaching to process 28988
warning: process 28988 is a zombie - the process has already terminated
ptrace: Operation not permitted.
(gdb) q
...

Updated patch attached.  Any further comments?

Thanks,
- Tom

[-- Attachment #2: 0005-gdb-tdep-Use-pid-to-choose-process-64-32-bitness.patch --]
[-- Type: text/x-patch, Size: 6025 bytes --]

[gdb/tdep] Use pid to choose process 64/32-bitness

In a linux kernel mailing list discussion, it was mentioned that "gdb has
this odd thing where it takes the 64-bit vs 32-bit data for the whole process
from one thread, and picks the worst possible thread to do it (ie explicitly
not even the main thread, ...)" [1].

The picking of the thread is done here in
x86_linux_nat_target::read_description:
...
  /* GNU/Linux LWP ID's are process ID's.  */
  tid = inferior_ptid.lwp ();
  if (tid == 0)
    tid = inferior_ptid.pid (); /* Not a threaded program.  */
...

To understand what this code does, let's investigate a scenario in which
inferior_ptid.lwp () != inferior_ptid.pid ().

Say we start exec jit-attach-pie, identified with pid x.  The main thread
starts another thread that sleeps, and then the main thread waits for the
sleeping thread.  So we have two threads, identified with LWP IDs x and x+1:
...
PID  LWP  CMD
x    x    ./jit-attach-pie
x    x+1  ./jit-attach-pie
...
[ The thread with LWP x is known as the thread group leader. ]

When attaching to this exec using the pid, gdb does a stop_all_threads which
iterates over all the threads, first LWP x, and then LWP x+1.

So the state we arrive with at x86_linux_nat_target::read_description is:
...
(gdb) p inferior_ptid
$1 = {m_pid = x, m_lwp = x+1, m_tid = 0}
...
and consequently we probe 64/32-bitness from thread LWP x+1.

[ Note that this is different from when gdb doesn't attach but instead
launches the exec itself, in which case there's just one thread to begin with,
and consequently the probed thread is LWP x. ]

According to aforementioned remark, a better choice would have been the main
thread, that is, LWP x.

This patch implements that choice, by simply doing:
...
  tid = inferior_ptid.pid ();
...

The fact that gdb makes a permanent per-process choice for 64/32-bitness is a
problem in itself: each thread can be in either 64-bit or 32-bit mode, and can
switch back and forth.  That is a problem that this patch doesn't fix.

Now finally: why does this matter in the context of the linux kernel
discussion?  The discussion was related to a patch that exposed io_uring
threads to user-space.  This made it possible that one of those threads would
be picked out to select 64/32-bitness.  Such threads are atypical user-space
threads in the sense that they never return to user-space and don't have a
meaningful user-space register state, so reading their registers returns
garbage.  It could for instance occur that in a 64-bit process with all normal
user-space threads in 64-bit mode, the probing would return 32-bit.

It may be that this is worked around on the kernel side by providing
user-space register state in those threads such that current gdb is happy.
Nevertheless, it seems prudent to fix this on the gdb side as well.

Tested on x86_64-linux.

[1] https://lore.kernel.org/io-uring/CAHk-=wh0KoEZXPYMGkfkeVEerSCEF1AiCZSvz9TRrx=Kj74D+Q@mail.gmail.com/

gdb/ChangeLog:

2021-05-07  Tom de Vries  <tdevries@suse.de>

	PR tdep/27822
	* x86-linux-nat.c (x86_linux_nat_target::read_description): Use
	pid to determine if process is 64-bit or 32-bit.
	* aarch64-linux-nat.c (aarch64_linux_nat_target::read_description):
	Same.
	* arm-linux-nat.c (arm_linux_nat_target::read_description): Same.
	* ppc-linux-nat.c (ppc_linux_nat_target::read_description): Same.
	* s390-linux-nat.c (s390_linux_nat_target::read_description): Same.

---
 gdb/aarch64-linux-nat.c | 2 +-
 gdb/arm-linux-nat.c     | 2 +-
 gdb/ppc-linux-nat.c     | 4 +---
 gdb/s390-linux-nat.c    | 2 +-
 gdb/x86-linux-nat.c     | 5 +----
 5 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
index ae8db2988c2..61224022f6a 100644
--- a/gdb/aarch64-linux-nat.c
+++ b/gdb/aarch64-linux-nat.c
@@ -723,7 +723,7 @@ aarch64_linux_nat_target::read_description ()
   gdb_byte regbuf[ARM_VFP3_REGS_SIZE];
   struct iovec iovec;
 
-  tid = inferior_ptid.lwp ();
+  tid = inferior_ptid.pid ();
 
   iovec.iov_base = regbuf;
   iovec.iov_len = ARM_VFP3_REGS_SIZE;
diff --git a/gdb/arm-linux-nat.c b/gdb/arm-linux-nat.c
index 662dade0a12..b5b470b6876 100644
--- a/gdb/arm-linux-nat.c
+++ b/gdb/arm-linux-nat.c
@@ -537,7 +537,7 @@ arm_linux_nat_target::read_description ()
     {
       elf_gregset_t gpregs;
       struct iovec iov;
-      int tid = inferior_ptid.lwp ();
+      int tid = inferior_ptid.pid ();
 
       iov.iov_base = &gpregs;
       iov.iov_len = sizeof (gpregs);
diff --git a/gdb/ppc-linux-nat.c b/gdb/ppc-linux-nat.c
index 171f5b386fa..06a30efeaef 100644
--- a/gdb/ppc-linux-nat.c
+++ b/gdb/ppc-linux-nat.c
@@ -1946,9 +1946,7 @@ ppc_linux_nat_target::auxv_parse (gdb_byte **readptr,
 const struct target_desc *
 ppc_linux_nat_target::read_description ()
 {
-  int tid = inferior_ptid.lwp ();
-  if (tid == 0)
-    tid = inferior_ptid.pid ();
+  int tid = inferior_ptid.pid ();
 
   if (have_ptrace_getsetevrregs)
     {
diff --git a/gdb/s390-linux-nat.c b/gdb/s390-linux-nat.c
index 41b50ce4800..8f6eb61505b 100644
--- a/gdb/s390-linux-nat.c
+++ b/gdb/s390-linux-nat.c
@@ -988,7 +988,7 @@ s390_linux_nat_target::auxv_parse (gdb_byte **readptr,
 const struct target_desc *
 s390_linux_nat_target::read_description ()
 {
-  int tid = s390_inferior_tid ();
+  int tid = inferior_ptid.pid ();
 
   have_regset_last_break
     = check_regset (tid, NT_S390_LAST_BREAK, 8);
diff --git a/gdb/x86-linux-nat.c b/gdb/x86-linux-nat.c
index 85c7f0ddc94..adea1ad0092 100644
--- a/gdb/x86-linux-nat.c
+++ b/gdb/x86-linux-nat.c
@@ -113,10 +113,7 @@ x86_linux_nat_target::read_description ()
   static uint64_t xcr0;
   uint64_t xcr0_features_bits;
 
-  /* GNU/Linux LWP ID's are process ID's.  */
-  tid = inferior_ptid.lwp ();
-  if (tid == 0)
-    tid = inferior_ptid.pid (); /* Not a threaded program.  */
+  tid = inferior_ptid.pid ();
 
 #ifdef __x86_64__
   {


* Re: [PATCH][gdb/tdep] Use pid to choose x86_64 process 64/32-bitness
  2021-05-19 16:32   ` Tom de Vries
@ 2021-05-20 10:42     ` Alan Hayward
  2021-05-20 16:07       ` Simon Marchi
  0 siblings, 1 reply; 10+ messages in thread
From: Alan Hayward @ 2021-05-20 10:42 UTC (permalink / raw)
  To: Tom de Vries
  Cc: Simon Marchi, gdb-patches@sourceware.org, Kevin Buettner,
	Andreas Arnez, Luis Machado, nd


On 19 May 2021, at 17:32, Tom de Vries <tdevries@suse.de> wrote:




---
gdb/aarch64-linux-nat.c | 2 +-
gdb/arm-linux-nat.c     | 2 +-
gdb/ppc-linux-nat.c     | 4 +---
gdb/s390-linux-nat.c    | 2 +-
gdb/x86-linux-nat.c     | 5 +----
5 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
index ae8db2988c2..61224022f6a 100644
--- a/gdb/aarch64-linux-nat.c
+++ b/gdb/aarch64-linux-nat.c
@@ -723,7 +723,7 @@ aarch64_linux_nat_target::read_description ()
  gdb_byte regbuf[ARM_VFP3_REGS_SIZE];
  struct iovec iovec;

-  tid = inferior_ptid.lwp ();
+  tid = inferior_ptid.pid ();

A user-level process is either AArch64 or AArch32; that can only change via execve().
All threads in a single process will be the same architecture.

As was mentioned above somewhere, SVE can have different vector lengths per thread.

Therefore, this needs to stay as lwp.

Maybe this needs a comment, something like:
// Use lwp as sve vector length can change per thread.


  iovec.iov_base = regbuf;
  iovec.iov_len = ARM_VFP3_REGS_SIZE;
diff --git a/gdb/arm-linux-nat.c b/gdb/arm-linux-nat.c
index 662dade0a12..b5b470b6876 100644
--- a/gdb/arm-linux-nat.c
+++ b/gdb/arm-linux-nat.c
@@ -537,7 +537,7 @@ arm_linux_nat_target::read_description ()
    {
      elf_gregset_t gpregs;
      struct iovec iov;
-      int tid = inferior_ptid.lwp ();
+      int tid = inferior_ptid.pid ();


The Arm port is only ever going to be 32-bit, so this change is probably not that useful.
Fine with the change for consistency reasons across the ports.

If making this change, could you also change the other inferior_ptid.lwp call in the
same function (line 559).



      iov.iov_base = &gpregs;
      iov.iov_len = sizeof (gpregs);
diff --git a/gdb/ppc-linux-nat.c b/gdb/ppc-linux-nat.c
index 171f5b386fa..06a30efeaef 100644
--- a/gdb/ppc-linux-nat.c
+++ b/gdb/ppc-linux-nat.c
@@ -1946,9 +1946,7 @@ ppc_linux_nat_target::auxv_parse (gdb_byte **readptr,
const struct target_desc *
ppc_linux_nat_target::read_description ()
{
-  int tid = inferior_ptid.lwp ();
-  if (tid == 0)
-    tid = inferior_ptid.pid ();
+  int tid = inferior_ptid.pid ();

  if (have_ptrace_getsetevrregs)
    {
diff --git a/gdb/s390-linux-nat.c b/gdb/s390-linux-nat.c
index 41b50ce4800..8f6eb61505b 100644
--- a/gdb/s390-linux-nat.c
+++ b/gdb/s390-linux-nat.c
@@ -988,7 +988,7 @@ s390_linux_nat_target::auxv_parse (gdb_byte **readptr,
const struct target_desc *
s390_linux_nat_target::read_description ()
{
-  int tid = s390_inferior_tid ();
+  int tid = inferior_ptid.pid ();

  have_regset_last_break
    = check_regset (tid, NT_S390_LAST_BREAK, 8);
diff --git a/gdb/x86-linux-nat.c b/gdb/x86-linux-nat.c
index 85c7f0ddc94..adea1ad0092 100644
--- a/gdb/x86-linux-nat.c
+++ b/gdb/x86-linux-nat.c
@@ -113,10 +113,7 @@ x86_linux_nat_target::read_description ()
  static uint64_t xcr0;
  uint64_t xcr0_features_bits;

-  /* GNU/Linux LWP ID's are process ID's.  */
-  tid = inferior_ptid.lwp ();
-  if (tid == 0)
-    tid = inferior_ptid.pid (); /* Not a threaded program.  */
+  tid = inferior_ptid.pid ();

#ifdef __x86_64__
  {


Thanks,
Alan.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH][gdb/tdep] Use pid to choose x86_64 process 64/32-bitness
  2021-05-20 10:42     ` Alan Hayward
@ 2021-05-20 16:07       ` Simon Marchi
  2021-05-21 10:50         ` Alan Hayward
  0 siblings, 1 reply; 10+ messages in thread
From: Simon Marchi @ 2021-05-20 16:07 UTC (permalink / raw)
  To: Alan Hayward, Tom de Vries
  Cc: gdb-patches@sourceware.org, Kevin Buettner, Andreas Arnez,
	Luis Machado, nd

On 2021-05-20 6:42 a.m., Alan Hayward wrote:
> 
>> On 19 May 2021, at 17:32, Tom de Vries <tdevries@suse.de> wrote:
>>
> 
> 
> 
>> ---
>> gdb/aarch64-linux-nat.c | 2 +-
>> gdb/arm-linux-nat.c     | 2 +-
>> gdb/ppc-linux-nat.c     | 4 +---
>> gdb/s390-linux-nat.c    | 2 +-
>> gdb/x86-linux-nat.c     | 5 +----
>> 5 files changed, 5 insertions(+), 10 deletions(-)
>>
>> diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
>> index ae8db2988c2..61224022f6a 100644
>> --- a/gdb/aarch64-linux-nat.c
>> +++ b/gdb/aarch64-linux-nat.c
>> @@ -723,7 +723,7 @@ aarch64_linux_nat_target::read_description ()
>>   gdb_byte regbuf[ARM_VFP3_REGS_SIZE];
>>   struct iovec iovec;
>>
>> -  tid = inferior_ptid.lwp ();
>> +  tid = inferior_ptid.pid ();
> 
> A user level process is either AArch64 or AArch32, it can only change by execve().
> All threads in a single process will be the same architecture.

But the question is: if the last thread happens to be an io_uring
thread, will that test work as intended?

  ret = ptrace (PTRACE_GETREGSET, tid, NT_ARM_VFP, &iovec);
  if (ret == 0)
    return aarch32_read_description ();

In other words, what happens when doing a PTRACE_GETREGSET with
NT_ARM_VFP on a io_uring thread.  Does that return the same answer as
querying the main thread?  If so, the code doesn't _need_ to be changed.

> 
> As was mentioned above somewhere, SVE can have different vector lengths per thread.
> 
> Therefore, this needs to stay as lwp.

The thing is that target_ops::read_description is used ultimately to
find one "process-wide" gdbarch, saved in inferior::gdbarch.  It's
called only once at process start / attach.  And indeed, because threads
can have different SVE vector lengths, AArch64 implements
target_ops::thread_architecture, to allow obtaining a thread-specific
gdbarch for a given thread, that will contain the right SVE vector
length for that thread.

So the target_desc (and ultimately the gdbarch) returned by
aarch64_linux_nat_target::read_description will not be able to
accurately describe all threads.  It is expected that this gdbarch will
contain some arbitrary value for the SVE vector length (the one matching
the thread we chose to probe in read_description).  And that if you need
to know a thread's SVE vector length, you don't use that gdbarch, but
call target_thread_architecture instead.

Maybe that having a process-wide gdbarch and be able to fetch
thread-specific gdbarches doesn't make sense.  That sounds like a design
issue in GDB.  Maybe we could try to transition to only having
thread-specific gdbarches obtained through thread_architecture.  Or
maybe the gdbarch object should be split in two: one for values that are
process-wide, still saved in inferior::gdbarch, and one for values that
are per-thread.  Or maybe we could keep the single process-wide gdbarch
for AArch64, and the vq value could be saved in some other per-thread
data structure, instead of in gdbarch_tdep.  That would remove the need
to have thread-specific gdbarches.  In any case, that's long-term work.

What could be done short term is to clarify (document) what is expected
from target_ops::read_description.  Clarify that it should return a
gdbarch that describes the whole process.  And if there are things in
the gdbarch that are thread-specific, then that will not be
representative of all threads, target_thread_architecture will be used
for that.  I would also explore the possibility of passing an "int pid"
to read_description to make it clear which process/thread to read a
description from.

So in conclusion, I don't think that Tom's patch will cause a problem,
because SVE length is not obtained from that description.  But it may
not be needed either.

> Maybe this needs a comment, something like:
> // Use lwp as sve vector length can change per thread.
> 
>>
>>   iovec.iov_base = regbuf;
>>   iovec.iov_len = ARM_VFP3_REGS_SIZE;
>> diff --git a/gdb/arm-linux-nat.c b/gdb/arm-linux-nat.c
>> index 662dade0a12..b5b470b6876 100644
>> --- a/gdb/arm-linux-nat.c
>> +++ b/gdb/arm-linux-nat.c
>> @@ -537,7 +537,7 @@ arm_linux_nat_target::read_description ()
>>     {
>>       elf_gregset_t gpregs;
>>       struct iovec iov;
>> -      int tid = inferior_ptid.lwp ();
>> +      int tid = inferior_ptid.pid ();
> 
> 
> Arm port is only ever going to be 32bits, so this change is probably not that useful.
> Fine with the change for consistency reasons across the ports.
> 
> If making this change, could you also change the other inferior_ptid.lwp call in the
> same function (line 559).

The important question is: would this test work if it was done on an
io_uring thread:

      /* Check if PTRACE_GETREGSET works.  */
      if (ptrace (PTRACE_GETREGSET, tid, NT_PRSTATUS, &iov) < 0)
	have_ptrace_getregset = TRIBOOL_FALSE;
      else
	have_ptrace_getregset = TRIBOOL_TRUE;

and:

      if (ptrace (PTRACE_GETVFPREGS, pid, 0, buf) < 0 && errno == EIO)
	return nullptr;

I guess they would work, because here we just check whether we are able
to read the registers.  The kernel doesn't prevent reading the io_uring
thread registers, it just returns unreliable (for our purposes) values.
But here we don't check the values, we just check if the read works.  In
the case of x86, the issue is that we check what is the value.

I think it makes sense in any case to probe the main thread, we know
it's always ok.

Simon

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH][gdb/tdep] Use pid to choose x86_64 process 64/32-bitness
  2021-05-20 16:07       ` Simon Marchi
@ 2021-05-21 10:50         ` Alan Hayward
  2021-05-22  8:32           ` Tom de Vries
  0 siblings, 1 reply; 10+ messages in thread
From: Alan Hayward @ 2021-05-21 10:50 UTC (permalink / raw)
  To: Simon Marchi
  Cc: Tom de Vries, gdb-patches@sourceware.org, Kevin Buettner,
	Andreas Arnez, Luis Machado, nd



> On 20 May 2021, at 17:07, Simon Marchi <simon.marchi@polymtl.ca> wrote:
> 
> On 2021-05-20 6:42 a.m., Alan Hayward wrote:
>> 
>>> On 19 May 2021, at 17:32, Tom de Vries <tdevries@suse.de> wrote:
>>> 
>> 
>> 
>> 
>>> ---
>>> gdb/aarch64-linux-nat.c | 2 +-
>>> gdb/arm-linux-nat.c     | 2 +-
>>> gdb/ppc-linux-nat.c     | 4 +---
>>> gdb/s390-linux-nat.c    | 2 +-
>>> gdb/x86-linux-nat.c     | 5 +----
>>> 5 files changed, 5 insertions(+), 10 deletions(-)
>>> 
>>> diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
>>> index ae8db2988c2..61224022f6a 100644
>>> --- a/gdb/aarch64-linux-nat.c
>>> +++ b/gdb/aarch64-linux-nat.c
>>> @@ -723,7 +723,7 @@ aarch64_linux_nat_target::read_description ()
>>>  gdb_byte regbuf[ARM_VFP3_REGS_SIZE];
>>>  struct iovec iovec;
>>> 
>>> -  tid = inferior_ptid.lwp ();
>>> +  tid = inferior_ptid.pid ();
>> 
>> A user level process is either AArch64 or AArch32, it can only change by execve().
>> All threads in a single process will be the same architecture.
> 
> But the question is: if the last thread happens to be an io_uring
> thread, will that test work as intended?
> 
>  ret = ptrace (PTRACE_GETREGSET, tid, NT_ARM_VFP, &iovec);
>  if (ret == 0)
>    return aarch32_read_description ();
> 
> In other words, what happens when doing a PTRACE_GETREGSET with
> NT_ARM_VFP on a io_uring thread.  Does that return the same answer as
> querying the main thread?  If so, the code doesn't _need_ to be changed.

…I’m not sure (I’ll try to find out, but ultimately it probably doesn't matter
too much, given everything below).

> 
>> 
>> As was mentioned above somewhere, SVE can have different vector lengths per thread.
>> 
>> Therefore, this needs to stay as lwp.
> 
> The thing is that target_ops::read_description is used ultimately to
> find one "process-wide" gdbarch, saved in inferior::gdbarch.  It's
> called only once at process start / attach.  And indeed, because threads
> can have different SVE vector lengths, AArch64 implements
> target_ops::thread_architecture, to allow obtaining a thread-specific
> gdbarch for a given thread, that will contain the right SVE vector
> length for that thread.
> 
> So the target_desc (and ultimately the gdbarch) returned by
> aarch64_linux_nat_target::read_description will not be able to
> accurately describe all threads.  It is expected that this gdbarch will
> contain some arbitrary value for the SVE vector length (the one matching
> the thread we chose to probe in read_description).  And that if you need
> to know a thread's SVE vector length, you don't use that gdbarch, but
> call target_thread_architecture instead.

I had forgotten there were the two different routes. Ok, I’m happy that this
is only called at process startup and therefore it’ll be fine.

> 
> Maybe that having a process-wide gdbarch and be able to fetch
> thread-specific gdbarches doesn't make sense.  That sounds like a design
> issue in GDB.  Maybe we could try to transition to only having
> thread-specific gdbarches obtained through thread_architecture.  Or
> maybe the gdbarch object should be split in two: one for values that are
> process-wide, still saved in inferior::gdbarch, and one for values that
> are per-thread.  Or maybe we could keep the single process-wide gdbarch
> for AArch64, and the vq value could be saved in some other per-thread
> data structure, instead of in gdbarch_tdep.  That would remove the need
> to have thread-specific gdbarches.  In any case, that's long-term work.

The thread_architecture() function always felt a bit of a hack, but
it does work for getting SVE working in GDB. But there’s still an issue for
gdbserver, as implementing thread_architecture for that causes the tdesc
to get refetched across the connection constantly.

It’s not just the vq needed in the thread structure, as the size of the registers
in tdesc rely on it, and therefore regcache etc. So, quite a lot ends up per
thread.

I did spend some time trying to make gdbarch per thread, but never got it
working. Bit fuzzy on what I changed, but it’s quite a large piece of work to
get it all done properly.

> 
> What could be done short term is to clarify (document) what is expected
> from target_ops::read_description.  Clarify that it should return a
> gdbarch that describes the whole process.  And if there are things in
> the gdbarch that are thread-specific, then that will not be
> representative of all threads, target_thread_architecture will be used
> for that.  I would also explore the possibility of passing an "int pid"
> to read_description to make it clear which process/thread to read a
> description from.

Agreed. It needs explaining better.

> 
> So in conclusion, I don't think that Tom's patch will cause a problem,
> because SVE length is not obtained from that description.  But it may
> not be needed either.

Agreed. I’m thinking Tom’s changes should be done for readability.

> 
>> Maybe this needs a comment, something like:
>> // Use lwp as sve vector length can change per thread.
>> 
>>> 
>>>  iovec.iov_base = regbuf;
>>>  iovec.iov_len = ARM_VFP3_REGS_SIZE;
>>> diff --git a/gdb/arm-linux-nat.c b/gdb/arm-linux-nat.c
>>> index 662dade0a12..b5b470b6876 100644
>>> --- a/gdb/arm-linux-nat.c
>>> +++ b/gdb/arm-linux-nat.c
>>> @@ -537,7 +537,7 @@ arm_linux_nat_target::read_description ()
>>>    {
>>>      elf_gregset_t gpregs;
>>>      struct iovec iov;
>>> -      int tid = inferior_ptid.lwp ();
>>> +      int tid = inferior_ptid.pid ();
>> 
>> 
>> Arm port is only ever going to be 32bits, so this change is probably not that useful.
>> Fine with the change for consistency reasons across the ports.
>> 
>> If making this change, could you also change the other inferior_ptid.lwp call in the
>> same function (line 559).
> 
> The important question is: would this test work if it was done on an
> io_uring thread:
> 
>      /* Check if PTRACE_GETREGSET works.  */
>      if (ptrace (PTRACE_GETREGSET, tid, NT_PRSTATUS, &iov) < 0)
> 	have_ptrace_getregset = TRIBOOL_FALSE;
>      else
> 	have_ptrace_getregset = TRIBOOL_TRUE;
> 
> and:
> 
>      if (ptrace (PTRACE_GETVFPREGS, pid, 0, buf) < 0 && errno == EIO)
> 	return nullptr;
> 
> I guess they would work, because here we just check whether we are able
> to read the registers.  The kernel doesn't prevent reading the io_uring
> thread registers, it just returns unreliable (for our purposes) values.
> But here we don't check the values, we just check if the read works.  In
> the case of x86, the issue is that we check what is the value.
> 

Again, I’m not sure…

> I think it makes sense in any case to probe the main thread, we know
> it's always ok.
> 

 ...But agreed this is the sensible option. Even if io_uring is ok, something else
could get added down the line which isn’t ok. Using pid should keep it safe.


Alan.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH][gdb/tdep] Use pid to choose x86_64 process 64/32-bitness
  2021-05-21 10:50         ` Alan Hayward
@ 2021-05-22  8:32           ` Tom de Vries
  2021-05-22  9:56             ` Tom de Vries
  0 siblings, 1 reply; 10+ messages in thread
From: Tom de Vries @ 2021-05-22  8:32 UTC (permalink / raw)
  To: Alan Hayward, Simon Marchi
  Cc: gdb-patches@sourceware.org, Kevin Buettner, Andreas Arnez,
	Luis Machado, nd, Andrew Burgess, palmer

[-- Attachment #1: Type: text/plain, Size: 495 bytes --]

On 5/21/21 12:50 PM, Alan Hayward wrote:
>> I think it makes sense in any case to probe the main thread, we know
>> it's always ok.
>>
> 
>  ...But agreed this is the sensible option. Even if io_uring is ok, something else
> could get added down the line which isn’t ok. Using pid should keep it safe.
> 

Changes since last post:
- added riscv64 (adding maintainers in cc)
- added suggested change for arm
- added comment at read_description in target.h

Any further comments?

Thanks,
- Tom

[-- Attachment #2: 0001-gdb-tdep-Use-pid-to-choose-process-64-32-bitness.patch --]
[-- Type: text/x-patch, Size: 7987 bytes --]

[gdb/tdep] Use pid to choose process 64/32-bitness

In a linux kernel mailing list discussion, it was mentioned that "gdb has
this odd thing where it takes the 64-bit vs 32-bit data for the whole process
from one thread, and picks the worst possible thread to do it (ie explicitly
not even the main thread, ...)" [1].

The picking of the thread is done here in
x86_linux_nat_target::read_description:
...
  /* GNU/Linux LWP ID's are process ID's.  */
  tid = inferior_ptid.lwp ();
  if (tid == 0)
    tid = inferior_ptid.pid (); /* Not a threaded program.  */
...

To understand what this code does, let's investigate a scenario in which
inferior_ptid.lwp () != inferior_ptid.pid ().

Say we start exec jit-attach-pie, identified with pid x.  The main thread
starts another thread that sleeps, and then the main thread waits for the
sleeping thread.  So we have two threads, identified with LWP IDs x and x+1:
...
PID  LWP  CMD
x    x    ./jit-attach-pie
x    x+1  ./jit-attach-pie
...
[ The thread with LWP x is known as the thread group leader. ]

When attaching to this exec using the pid, gdb does a stop_all_threads which
iterates over all the threads, first LWP x, and then LWP x+1.

So the state we arrive with at x86_linux_nat_target::read_description is:
...
(gdb) p inferior_ptid
$1 = {m_pid = x, m_lwp = x+1, m_tid = 0}
...
and consequently we probe 64/32-bitness from thread LWP x+1.

[ Note that this is different from when gdb doesn't attach but instead
launches the exec itself, in which case there's just one thread to begin with,
and consequently the probed thread is LWP x. ]

According to aforementioned remark, a better choice would have been the main
thread, that is, LWP x.

This patch implements that choice, by simply doing:
...
  tid = inferior_ptid.pid ();
...

The fact that gdb makes a per-process permanent choice for 64/32-bitness is a
problem in itself: each thread can be in either 64 or 32 bit mode, and change
forth and back.  That is a problem that this patch doesn't fix.

Now finally: why does this matter in the context of the linux kernel
discussion?  The discussion was related to a patch that exposed io_uring
threads to user-space.  This made it possible that one of those threads would
be picked out to select 64/32-bitness.  Given that such threads are atypical
user-space threads in the sense that they don't return to user-space and don't
have a userspace register state, reading their registers returns garbage, and
so it could f.i. occur that in a 64-bit process with all normal user-space
threads in 64-bit mode, the probing would return 32-bit.

It may be that this is worked-around on the kernel side by providing userspace
register state in those threads such that current gdb is happy.  Nevertheless,
it seems prudent to fix this on the gdb side as well.

Tested on x86_64-linux.

[1] https://lore.kernel.org/io-uring/CAHk-=wh0KoEZXPYMGkfkeVEerSCEF1AiCZSvz9TRrx=Kj74D+Q@mail.gmail.com/

gdb/ChangeLog:

2021-05-07  Tom de Vries  <tdevries@suse.de>

	PR tdep/27822
	* target.h (struct target_ops): Mention target_thread_architecture in
	read_description comment.
	* x86-linux-nat.c (x86_linux_nat_target::read_description): Use
	pid to determine if process is 64-bit or 32-bit.
	* aarch64-linux-nat.c (aarch64_linux_nat_target::read_description):
	Same.
	* ppc-linux-nat.c (ppc_linux_nat_target::read_description): Same.
	* riscv-linux-nat.c (riscv_linux_nat_target::read_description): Same.
	* s390-linux-nat.c (s390_linux_nat_target::read_description): Same.
	* arm-linux-nat.c (arm_linux_nat_target::read_description): Same.
	Likewise, use pid to determine if kernel supports reading VFP
	registers.

---
 gdb/aarch64-linux-nat.c | 2 +-
 gdb/arm-linux-nat.c     | 4 ++--
 gdb/ppc-linux-nat.c     | 4 +---
 gdb/riscv-linux-nat.c   | 2 +-
 gdb/s390-linux-nat.c    | 2 +-
 gdb/target.h            | 5 ++++-
 gdb/x86-linux-nat.c     | 5 +----
 7 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
index ae8db2988c2..61224022f6a 100644
--- a/gdb/aarch64-linux-nat.c
+++ b/gdb/aarch64-linux-nat.c
@@ -723,7 +723,7 @@ aarch64_linux_nat_target::read_description ()
   gdb_byte regbuf[ARM_VFP3_REGS_SIZE];
   struct iovec iovec;
 
-  tid = inferior_ptid.lwp ();
+  tid = inferior_ptid.pid ();
 
   iovec.iov_base = regbuf;
   iovec.iov_len = ARM_VFP3_REGS_SIZE;
diff --git a/gdb/arm-linux-nat.c b/gdb/arm-linux-nat.c
index 662dade0a12..880ac0da044 100644
--- a/gdb/arm-linux-nat.c
+++ b/gdb/arm-linux-nat.c
@@ -537,7 +537,7 @@ arm_linux_nat_target::read_description ()
     {
       elf_gregset_t gpregs;
       struct iovec iov;
-      int tid = inferior_ptid.lwp ();
+      int tid = inferior_ptid.pid ();
 
       iov.iov_base = &gpregs;
       iov.iov_len = sizeof (gpregs);
@@ -556,7 +556,7 @@ arm_linux_nat_target::read_description ()
     {
       /* Make sure that the kernel supports reading VFP registers.  Support was
 	 added in 2.6.30.  */
-      int pid = inferior_ptid.lwp ();
+      int pid = inferior_ptid.pid ();
       errno = 0;
       char *buf = (char *) alloca (ARM_VFP3_REGS_SIZE);
       if (ptrace (PTRACE_GETVFPREGS, pid, 0, buf) < 0 && errno == EIO)
diff --git a/gdb/ppc-linux-nat.c b/gdb/ppc-linux-nat.c
index 171f5b386fa..06a30efeaef 100644
--- a/gdb/ppc-linux-nat.c
+++ b/gdb/ppc-linux-nat.c
@@ -1946,9 +1946,7 @@ ppc_linux_nat_target::auxv_parse (gdb_byte **readptr,
 const struct target_desc *
 ppc_linux_nat_target::read_description ()
 {
-  int tid = inferior_ptid.lwp ();
-  if (tid == 0)
-    tid = inferior_ptid.pid ();
+  int tid = inferior_ptid.pid ();
 
   if (have_ptrace_getsetevrregs)
     {
diff --git a/gdb/riscv-linux-nat.c b/gdb/riscv-linux-nat.c
index 04bf46b3bb1..c0f5a27a37e 100644
--- a/gdb/riscv-linux-nat.c
+++ b/gdb/riscv-linux-nat.c
@@ -202,7 +202,7 @@ const struct target_desc *
 riscv_linux_nat_target::read_description ()
 {
   const struct riscv_gdbarch_features features
-    = riscv_linux_read_features (inferior_ptid.lwp ());
+    = riscv_linux_read_features (inferior_ptid.pid ());
   return riscv_lookup_target_description (features);
 }
 
diff --git a/gdb/s390-linux-nat.c b/gdb/s390-linux-nat.c
index 41b50ce4800..8f6eb61505b 100644
--- a/gdb/s390-linux-nat.c
+++ b/gdb/s390-linux-nat.c
@@ -988,7 +988,7 @@ s390_linux_nat_target::auxv_parse (gdb_byte **readptr,
 const struct target_desc *
 s390_linux_nat_target::read_description ()
 {
-  int tid = s390_inferior_tid ();
+  int tid = inferior_ptid.pid ();
 
   have_regset_last_break
     = check_regset (tid, NT_S390_LAST_BREAK, 8);
diff --git a/gdb/target.h b/gdb/target.h
index d867a58e2a8..51139042691 100644
--- a/gdb/target.h
+++ b/gdb/target.h
@@ -841,7 +841,10 @@ struct target_ops
     /* Describe the architecture-specific features of this target.  If
        OPS doesn't have a description, this should delegate to the
        "beneath" target.  Returns the description found, or NULL if no
-       description was available.  */
+       description was available.  This should return something that
+       describes a whole process, and if there are things that are
+       thread-specific, target_thread_architecture should be used for
+       that.  */
     virtual const struct target_desc *read_description ()
 	 TARGET_DEFAULT_RETURN (NULL);
 
diff --git a/gdb/x86-linux-nat.c b/gdb/x86-linux-nat.c
index 85c7f0ddc94..adea1ad0092 100644
--- a/gdb/x86-linux-nat.c
+++ b/gdb/x86-linux-nat.c
@@ -113,10 +113,7 @@ x86_linux_nat_target::read_description ()
   static uint64_t xcr0;
   uint64_t xcr0_features_bits;
 
-  /* GNU/Linux LWP ID's are process ID's.  */
-  tid = inferior_ptid.lwp ();
-  if (tid == 0)
-    tid = inferior_ptid.pid (); /* Not a threaded program.  */
+  tid = inferior_ptid.pid ();
 
 #ifdef __x86_64__
   {

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH][gdb/tdep] Use pid to choose x86_64 process 64/32-bitness
  2021-05-22  8:32           ` Tom de Vries
@ 2021-05-22  9:56             ` Tom de Vries
  2021-05-23  1:27               ` Simon Marchi
  0 siblings, 1 reply; 10+ messages in thread
From: Tom de Vries @ 2021-05-22  9:56 UTC (permalink / raw)
  To: Alan Hayward, Simon Marchi
  Cc: gdb-patches@sourceware.org, Kevin Buettner, Andreas Arnez,
	Luis Machado, nd, Andrew Burgess, palmer

[ correcting mistyped email address ]

On 5/22/21 10:32 AM, Tom de Vries wrote:
> On 5/21/21 12:50 PM, Alan Hayward wrote:
>>> I think it makes sense in any case to probe the main thread, we know
>>> it's always ok.
>>>
>>  ...But agreed this is the sensible option. Even if io_uring is ok, something else
>> could get added down the line which isn’t ok. Using pid should keep it safe.
>>
> Changes since last post:
> - added riscv64 (adding maintainers in cc)
> - added suggested change for arm
> - added comment at read_description in target.h
> 
> Any further comments?
> 
> Thanks,
> - Tom
> 
> 
> 0001-gdb-tdep-Use-pid-to-choose-process-64-32-bitness.patch
> 
> [gdb/tdep] Use pid to choose process 64/32-bitness
> 
> In a linux kernel mailing list discussion, it was mentioned that "gdb has
> this odd thing where it takes the 64-bit vs 32-bit data for the whole process
> from one thread, and picks the worst possible thread to do it (ie explicitly
> not even the main thread, ...)" [1].
> 
> The picking of the thread is done here in
> x86_linux_nat_target::read_description:
> ...
>   /* GNU/Linux LWP ID's are process ID's.  */
>   tid = inferior_ptid.lwp ();
>   if (tid == 0)
>     tid = inferior_ptid.pid (); /* Not a threaded program.  */
> ...
> 
> To understand what this code does, let's investigate a scenario in which
> inferior_ptid.lwp () != inferior_ptid.pid ().
> 
> Say we start exec jit-attach-pie, identified with pid x.  The main thread
> starts another thread that sleeps, and then the main thread waits for the
> sleeping thread.  So we have two threads, identified with LWP IDs x and x+1:
> ...
> PID  LWP  CMD
> x    x    ./jit-attach-pie
> x    x+1  ./jit-attach-pie
> ...
> [ The thread with LWP x is known as the thread group leader. ]
> 
> When attaching to this exec using the pid, gdb does a stop_all_threads which
> iterates over all the threads, first LWP x, and then LWP x+1.
> 
> So the state we arrive with at x86_linux_nat_target::read_description is:
> ...
> (gdb) p inferior_ptid
> $1 = {m_pid = x, m_lwp = x+1, m_tid = 0}
> ...
> and consequently we probe 64/32-bitness from thread LWP x+1.
> 
> [ Note that this is different from when gdb doesn't attach but instead
> launches the exec itself, in which case there's just one thread to begin with,
> and consequently the probed thread is LWP x. ]
> 
> According to aforementioned remark, a better choice would have been the main
> thread, that is, LWP x.
> 
> This patch implements that choice, by simply doing:
> ...
>   tid = inferior_ptid.pid ();
> ...
> 
> The fact that gdb makes a per-process permanent choice for 64/32-bitness is a
> problem in itself: each thread can be in either 64 or 32 bit mode, and change
> forth and back.  That is a problem that this patch doesn't fix.
> 
> Now finally: why does this matter in the context of the linux kernel
> discussion?  The discussion was related to a patch that exposed io_uring
> threads to user-space.  This made it possible that one of those threads would
> be picked out to select 64/32-bitness.  Given that such threads are atypical
> user-space threads in the sense that they don't return to user-space and don't
> have a userspace register state, reading their registers returns garbage, and
> so it could f.i. occur that in a 64-bit process with all normal user-space
> threads in 64-bit mode, the probing would return 32-bit.
> 
> It may be that this is worked-around on the kernel side by providing userspace
> register state in those threads such that current gdb is happy.  Nevertheless,
> it seems prudent to fix this on the gdb side as well.
> 
> Tested on x86_64-linux.
> 
> [1] https://lore.kernel.org/io-uring/CAHk-=wh0KoEZXPYMGkfkeVEerSCEF1AiCZSvz9TRrx=Kj74D+Q@mail.gmail.com/
> 
> gdb/ChangeLog:
> 
> 2021-05-07  Tom de Vries  <tdevries@suse.de>
> 
> 	PR tdep/27822
> 	* target.h (struct target_ops): Mention target_thread_architecture in
> 	read_description comment.
> 	* x86-linux-nat.c (x86_linux_nat_target::read_description): Use
> 	pid to determine if process is 64-bit or 32-bit.
> 	* aarch64-linux-nat.c (aarch64_linux_nat_target::read_description):
> 	Same.
> 	* ppc-linux-nat.c (ppc_linux_nat_target::read_description): Same.
> 	* riscv-linux-nat.c (riscv_linux_nat_target::read_description): Same.
> 	* s390-linux-nat.c (s390_linux_nat_target::read_description): Same.
> 	* arm-linux-nat.c (arm_linux_nat_target::read_description): Same.
> 	Likewise, use pid to determine if kernel supports reading VFP
> 	registers.
> 
> ---
>  gdb/aarch64-linux-nat.c | 2 +-
>  gdb/arm-linux-nat.c     | 4 ++--
>  gdb/ppc-linux-nat.c     | 4 +---
>  gdb/riscv-linux-nat.c   | 2 +-
>  gdb/s390-linux-nat.c    | 2 +-
>  gdb/target.h            | 5 ++++-
>  gdb/x86-linux-nat.c     | 5 +----
>  7 files changed, 11 insertions(+), 13 deletions(-)
> 
> diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
> index ae8db2988c2..61224022f6a 100644
> --- a/gdb/aarch64-linux-nat.c
> +++ b/gdb/aarch64-linux-nat.c
> @@ -723,7 +723,7 @@ aarch64_linux_nat_target::read_description ()
>    gdb_byte regbuf[ARM_VFP3_REGS_SIZE];
>    struct iovec iovec;
>  
> -  tid = inferior_ptid.lwp ();
> +  tid = inferior_ptid.pid ();
>  
>    iovec.iov_base = regbuf;
>    iovec.iov_len = ARM_VFP3_REGS_SIZE;
> diff --git a/gdb/arm-linux-nat.c b/gdb/arm-linux-nat.c
> index 662dade0a12..880ac0da044 100644
> --- a/gdb/arm-linux-nat.c
> +++ b/gdb/arm-linux-nat.c
> @@ -537,7 +537,7 @@ arm_linux_nat_target::read_description ()
>      {
>        elf_gregset_t gpregs;
>        struct iovec iov;
> -      int tid = inferior_ptid.lwp ();
> +      int tid = inferior_ptid.pid ();
>  
>        iov.iov_base = &gpregs;
>        iov.iov_len = sizeof (gpregs);
> @@ -556,7 +556,7 @@ arm_linux_nat_target::read_description ()
>      {
>        /* Make sure that the kernel supports reading VFP registers.  Support was
>  	 added in 2.6.30.  */
> -      int pid = inferior_ptid.lwp ();
> +      int pid = inferior_ptid.pid ();
>        errno = 0;
>        char *buf = (char *) alloca (ARM_VFP3_REGS_SIZE);
>        if (ptrace (PTRACE_GETVFPREGS, pid, 0, buf) < 0 && errno == EIO)
> diff --git a/gdb/ppc-linux-nat.c b/gdb/ppc-linux-nat.c
> index 171f5b386fa..06a30efeaef 100644
> --- a/gdb/ppc-linux-nat.c
> +++ b/gdb/ppc-linux-nat.c
> @@ -1946,9 +1946,7 @@ ppc_linux_nat_target::auxv_parse (gdb_byte **readptr,
>  const struct target_desc *
>  ppc_linux_nat_target::read_description ()
>  {
> -  int tid = inferior_ptid.lwp ();
> -  if (tid == 0)
> -    tid = inferior_ptid.pid ();
> +  int tid = inferior_ptid.pid ();
>  
>    if (have_ptrace_getsetevrregs)
>      {
> diff --git a/gdb/riscv-linux-nat.c b/gdb/riscv-linux-nat.c
> index 04bf46b3bb1..c0f5a27a37e 100644
> --- a/gdb/riscv-linux-nat.c
> +++ b/gdb/riscv-linux-nat.c
> @@ -202,7 +202,7 @@ const struct target_desc *
>  riscv_linux_nat_target::read_description ()
>  {
>    const struct riscv_gdbarch_features features
> -    = riscv_linux_read_features (inferior_ptid.lwp ());
> +    = riscv_linux_read_features (inferior_ptid.pid ());
>    return riscv_lookup_target_description (features);
>  }
>  
> diff --git a/gdb/s390-linux-nat.c b/gdb/s390-linux-nat.c
> index 41b50ce4800..8f6eb61505b 100644
> --- a/gdb/s390-linux-nat.c
> +++ b/gdb/s390-linux-nat.c
> @@ -988,7 +988,7 @@ s390_linux_nat_target::auxv_parse (gdb_byte **readptr,
>  const struct target_desc *
>  s390_linux_nat_target::read_description ()
>  {
> -  int tid = s390_inferior_tid ();
> +  int tid = inferior_ptid.pid ();
>  
>    have_regset_last_break
>      = check_regset (tid, NT_S390_LAST_BREAK, 8);
> diff --git a/gdb/target.h b/gdb/target.h
> index d867a58e2a8..51139042691 100644
> --- a/gdb/target.h
> +++ b/gdb/target.h
> @@ -841,7 +841,10 @@ struct target_ops
>      /* Describe the architecture-specific features of this target.  If
>         OPS doesn't have a description, this should delegate to the
>         "beneath" target.  Returns the description found, or NULL if no
> -       description was available.  */
> +       description was available.  This should return something that
> +       describes a whole process, and if there are things that are
> +       thread-specific, target_thread_architecture should be used for
> +       that.  */
>      virtual const struct target_desc *read_description ()
>  	 TARGET_DEFAULT_RETURN (NULL);
>  
> diff --git a/gdb/x86-linux-nat.c b/gdb/x86-linux-nat.c
> index 85c7f0ddc94..adea1ad0092 100644
> --- a/gdb/x86-linux-nat.c
> +++ b/gdb/x86-linux-nat.c
> @@ -113,10 +113,7 @@ x86_linux_nat_target::read_description ()
>    static uint64_t xcr0;
>    uint64_t xcr0_features_bits;
>  
> -  /* GNU/Linux LWP ID's are process ID's.  */
> -  tid = inferior_ptid.lwp ();
> -  if (tid == 0)
> -    tid = inferior_ptid.pid (); /* Not a threaded program.  */
> +  tid = inferior_ptid.pid ();
>  
>  #ifdef __x86_64__
>    {
> 
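As an aside, the invariant this patch relies on -- that on Linux the
thread group leader's LWP ID equals the process ID, while every other
thread gets a distinct LWP ID -- can be observed with a short
standalone Python sketch (not part of the patch; requires Python 3.8+,
whose threading.get_native_id maps to gettid(2) on Linux):

```python
import os
import threading

def worker():
    # A non-leader thread: its kernel LWP ID differs from the pid.
    print("worker lwp == pid:", threading.get_native_id() == os.getpid())

# The main thread is the thread group leader: its LWP ID is the pid.
print("main lwp == pid:", threading.get_native_id() == os.getpid())

t = threading.Thread(target=worker)
t.start()
t.join()
```

On Linux this prints True for the main thread and False for the
worker, which is why probing the pid always probes the group leader.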

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH][gdb/tdep] Use pid to choose x86_64 process 64/32-bitness
  2021-05-22  9:56             ` Tom de Vries
@ 2021-05-23  1:27               ` Simon Marchi
  0 siblings, 0 replies; 10+ messages in thread
From: Simon Marchi @ 2021-05-23  1:27 UTC (permalink / raw)
  To: Tom de Vries, Alan Hayward
  Cc: gdb-patches@sourceware.org, Kevin Buettner, Andreas Arnez,
	Luis Machado, nd, Andrew Burgess, palmer


>> diff --git a/gdb/target.h b/gdb/target.h
>> index d867a58e2a8..51139042691 100644
>> --- a/gdb/target.h
>> +++ b/gdb/target.h
>> @@ -841,7 +841,10 @@ struct target_ops
>>      /* Describe the architecture-specific features of this target.  If
>>         OPS doesn't have a description, this should delegate to the
>>         "beneath" target.  Returns the description found, or NULL if no
>> -       description was available.  */
>> +       description was available.  This should return something that
>> +       describes a whole process, and if there are things that are
>> +       thread-specific, target_thread_architecture should be used for
>> +       that.  */

Haha, that sounds like the way I said it, which is a bit too informal
for a comment I think.  Here's what I suggest:

    /* Describe the architecture-specific features of the current
       inferior.

       Returns the description found, or nullptr if no description was
       available.

       If some target features differ between threads, the description
       returned by read_description (and the resulting gdbarch) won't
       accurately describe all threads.  In this case, the
       thread_architecture method can be used to obtain gdbarches that
       accurately describe each thread.  */

The patch LGTM with that.

Simon


end of thread, other threads:[~2021-05-23  1:28 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-07  8:44 [PATCH][gdb/tdep] Use pid to choose x86_64 process 64/32-bitness Tom de Vries
2021-05-07 19:27 ` Simon Marchi
2021-05-07 20:06   ` Luis Machado
2021-05-19 16:32   ` Tom de Vries
2021-05-20 10:42     ` Alan Hayward
2021-05-20 16:07       ` Simon Marchi
2021-05-21 10:50         ` Alan Hayward
2021-05-22  8:32           ` Tom de Vries
2021-05-22  9:56             ` Tom de Vries
2021-05-23  1:27               ` Simon Marchi
