public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] Limit the switch_to_thread() calls in startup_inferior()
@ 2020-10-14  8:49 Kamil Rytarowski
  2020-10-25  2:58 ` Simon Marchi
  2020-11-02  0:34 ` Simon Marchi
  0 siblings, 2 replies; 5+ messages in thread
From: Kamil Rytarowski @ 2020-10-14  8:49 UTC (permalink / raw)
  To: gdb-patches

Do not jump over the threads during the startup unless we encounter
TARGET_WAITKIND_STOPPED with SIGTRAP or TARGET_WAITKIND_EXECD.

Otherwise whenever a startup-with-shell processes signals on the
startup stage, it might indicate to switch to a non-existing
thread or a special-thread number (target lwp=0 on NetBSD means
that a signal was directed to all threads within a process).

This caused a crash with tcsh on NetBSD, where the tcsh shell
runs startup detection of the hostname. This action involves
spwaning a new process through fork.

GDB crashes this way:
$ SHELL=tcsh /usr/bin/gdb echo
(gdb) r
Starting program: /bin/echo
/usr/src/external/gpl3/gdb/lib/libgdb/../../dist/gdb/thread.c:1309: internal-error: void switch_to_thread(thread_info*): Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.

And the core-dump is as follows:

(top-gdb) bt
During symbol reading: incomplete CFI data; unspecified registers (e.g., rax) at 0x7f7ff5b698a4
    file=0x135314f "../../gdb/thread.c", line=1309, fmt=0x1353069 "%s: Assertion `%s' failed.", ap=0x7f7fffffdc08) at ../../gdb/utils.c:414
    at ../../gdb/inf-ptrace.c:117
args=0x0, from_tty=1, run_how=RUN_NORMAL) at ../../gdb/infcmd.c:493
--Type <RET> for more, q to quit, c to continue without paging--
---
 gdb/nat/fork-inferior.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gdb/nat/fork-inferior.c b/gdb/nat/fork-inferior.c
index 7ba0126871d..b6c20a8fac1 100644
--- a/gdb/nat/fork-inferior.c
+++ b/gdb/nat/fork-inferior.c
@@ -503,7 +503,6 @@ startup_inferior (process_stratum_target *proc_target, pid_t pid, int ntraps,
 	  case TARGET_WAITKIND_SYSCALL_ENTRY:
 	  case TARGET_WAITKIND_SYSCALL_RETURN:
 	    /* Ignore gracefully during startup of the inferior.  */
-	    switch_to_thread (proc_target, event_ptid);
 	    break;

 	  case TARGET_WAITKIND_SIGNALLED:
@@ -536,7 +535,9 @@ startup_inferior (process_stratum_target *proc_target, pid_t pid, int ntraps,

 	  case TARGET_WAITKIND_STOPPED:
 	    resume_signal = ws.value.sig;
-	    switch_to_thread (proc_target, event_ptid);
+	    /* Ignore gracefully the !TRAP signals intercepted from the shell.  */
+	    if (resume_signal == GDB_SIGNAL_TRAP)
+	      switch_to_thread (proc_target, event_ptid);
 	    break;
 	}

--
2.28.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Limit the switch_to_thread() calls in startup_inferior()
  2020-10-14  8:49 [PATCH] Limit the switch_to_thread() calls in startup_inferior() Kamil Rytarowski
@ 2020-10-25  2:58 ` Simon Marchi
  2020-11-01 18:33   ` Samuel Thibault
  2020-11-02  0:34 ` Simon Marchi
  1 sibling, 1 reply; 5+ messages in thread
From: Simon Marchi @ 2020-10-25  2:58 UTC (permalink / raw)
  To: Kamil Rytarowski, gdb-patches, Samuel Thibault

On 2020-10-14 4:49 a.m., Kamil Rytarowski wrote:
> Do not jump over the threads during the startup unless we encounter
> TARGET_WAITKIND_STOPPED with SIGTRAP or TARGET_WAITKIND_EXECD.
>
> Otherwise whenever a startup-with-shell processes signals on the
> startup stage, it might indicate to switch to a non-existing
> thread or a special-thread number (target lwp=0 on NetBSD means
> that a signal was directed to all threads within a process).
>
> This caused a crash with tcsh on NetBSD, where the tcsh shell
> runs startup detection of the hostname. This action involves
> spwaning a new process through fork.
>
> GDB crashes this way:
> $ SHELL=tcsh /usr/bin/gdb echo
> (gdb) r
> Starting program: /bin/echo
> /usr/src/external/gpl3/gdb/lib/libgdb/../../dist/gdb/thread.c:1309: internal-error: void switch_to_thread(thread_info*): Assertion `thr != NULL' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
>
> And the core-dump is as follows:
>
> (top-gdb) bt
> During symbol reading: incomplete CFI data; unspecified registers (e.g., rax) at 0x7f7ff5b698a4
>     file=0x135314f "../../gdb/thread.c", line=1309, fmt=0x1353069 "%s: Assertion `%s' failed.", ap=0x7f7fffffdc08) at ../../gdb/utils.c:414
>     at ../../gdb/inf-ptrace.c:117
> args=0x0, from_tty=1, run_how=RUN_NORMAL) at ../../gdb/infcmd.c:493
> --Type <RET> for more, q to quit, c to continue without paging--

I tried to wrap my head around this and understand the need for the
switch_to_thread, what could be the impacts of your change, and I just
can't give anything close to a definitive answer.

I did a bit of archeology, and found that the switch_to_thread were
added by this patch:

  https://sourceware.org/pipermail/gdb-patches/2008-October/060950.html

Following this thread:

  https://sourceware.org/legacy-ml/gdb/2008-10/msg00034.html

Sooo... it sounds like the tid in the ptid made by gnu-nat.c is a bogus
integer (looks like it still is today, but it's in lwp instead of tid).
And the problem was because the tid increased at every exec.

So my theory is that these switch_to_thread were needed to keep the
inferior_ptid global sync'ed with the ptid gnu-nat.c knows about.

The failure reported in the thread above is this one:

  https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/i386gnu-nat.c;h=3d71047eab7dd4184dd978bb7b351661f09beec5;hb=2e979b9404414113dc61b32eebcda8a53632605c#l124

You can see that back then, gnu_fetch_registers relied on inferior_ptid,
so it would make sense for this function to fail if inferior_ptid is
stale.  I just don't know where/when during the startup_inferior
function did gnu_fetch_registers get called (it would have been useful
to have a backtrace for that).  Today,
i386_gnu_nat_target::fetch_registers relies on the ptid of the regcache,
not inferior_ptid.  Perhaps it's not necessary to call switch_to_thread
at all today?  It's hard to say.

Samuel, as the last person who touched gnu-nat.c, do you have a bit of
time to test Kamil's patch on Hurd, see if it breaks starting inferiors?
And what if you remove all the switch_to_thread from that function, does
it still work?

Simon

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Limit the switch_to_thread() calls in startup_inferior()
  2020-10-25  2:58 ` Simon Marchi
@ 2020-11-01 18:33   ` Samuel Thibault
  2020-11-02  0:47     ` Simon Marchi
  0 siblings, 1 reply; 5+ messages in thread
From: Samuel Thibault @ 2020-11-01 18:33 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Kamil Rytarowski, gdb-patches

Hello,

(Note that I have never really hacked very much into gdb's inferior
management, so I don't know much how things are supposed to be working.)

Simon Marchi, le sam. 24 oct. 2020 22:58:05 -0400, a ecrit:
> I tried to wrap my head around this and understand the need for the
> switch_to_thread, what could be the impacts of your change, and I just
> can't give anything close to a definitive answer.
> 
[...]
> Samuel, as the last person who touched gnu-nat.c, do you have a bit of
> time to test Kamil's patch on Hurd, see if it breaks starting inferiors?

I am not really sure what I am supposed to test. I tried a program that
starts a second thread which just pause()s. With Kamil's patch I can
switch & backtrace between the two threads.

(gdb) r
Starting program: /home/samy/test
^C[New Thread 13012.6]
[New Thread 13012.7]

Thread 5 received signal SIGINT, Interrupt.
0x0107951c in ?? () from /lib/i386-gnu/libc.so.0.3
(gdb) info thread
  Id   Target Id         Frame
* 5    Thread 13012.5    0x0107951c in ?? () from /lib/i386-gnu/libc.so.0.3
  6    Thread 13012.6    0x0109dc74 in _hurd_intr_rpc_mach_msg ()
   from /lib/i386-gnu/libc.so.0.3
  7    Thread 13012.7    0x0107951c in ?? () from /lib/i386-gnu/libc.so.0.3
(gdb) bt
#0  0x0107951c in ?? () from /lib/i386-gnu/libc.so.0.3
#1  0x01079cc6 in mach_msg () from /lib/i386-gnu/libc.so.0.3
[...]
#5  0x0104cdca in pthread_join () from /lib/i386-gnu/libpthread.so.0.3
#6  0x0803254b in main () at test.c:13
(gdb) thread 7
[Switching to thread 7 (Thread 13012.7)]
#0  0x0107951c in ?? () from /lib/i386-gnu/libc.so.0.3
(gdb) bt
#0  0x0107951c in ?? () from /lib/i386-gnu/libc.so.0.3
[...]
#3  0x0114e4ce in pause () from /lib/i386-gnu/libc.so.0.3
#4  0x080324fa in f (foo=0x0) at test.c:6
#5  0x0104c3ff in entry_point () from /lib/i386-gnu/libpthread.so.0.3
#6  0x00000000 in ?? ()

> And what if you remove all the switch_to_thread from that function, does
> it still work?

If I remove all calls from that fonction I am getting

(gdb) r
Starting program: /home/samy/test
Can't fetch registers from thread bogus thread id 1: No such thread
(gdb) info thread
  Id   Target Id         Frame
  5    Thread 18821.5    0x000021e0 in ?? ()

The current thread <Thread ID 1> has terminated.  See `help thread'.

Since I now have a built tree I can easily experiment more.

Samuel

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Limit the switch_to_thread() calls in startup_inferior()
  2020-10-14  8:49 [PATCH] Limit the switch_to_thread() calls in startup_inferior() Kamil Rytarowski
  2020-10-25  2:58 ` Simon Marchi
@ 2020-11-02  0:34 ` Simon Marchi
  1 sibling, 0 replies; 5+ messages in thread
From: Simon Marchi @ 2020-11-02  0:34 UTC (permalink / raw)
  To: Kamil Rytarowski, gdb-patches

On 2020-10-14 4:49 a.m., Kamil Rytarowski wrote:
> Do not jump over the threads during the startup unless we encounter
> TARGET_WAITKIND_STOPPED with SIGTRAP or TARGET_WAITKIND_EXECD.
>
> Otherwise whenever a startup-with-shell processes signals on the
> startup stage, it might indicate to switch to a non-existing
> thread or a special-thread number (target lwp=0 on NetBSD means
> that a signal was directed to all threads within a process).
>
> This caused a crash with tcsh on NetBSD, where the tcsh shell
> runs startup detection of the hostname. This action involves
> spwaning a new process through fork.

Kamil, one last question: out of curiosity, could you please share
what's the list of target events reported by your target_wait, while in
startup_inferior, in the problematic case?

You can for apply this hack that will print them:

diff --git a/gdb/nat/fork-inferior.c b/gdb/nat/fork-inferior.c
index 7ba0126871dd..b8fc14ab83ac 100644
--- a/gdb/nat/fork-inferior.c
+++ b/gdb/nat/fork-inferior.c
@@ -485,6 +485,10 @@ startup_inferior (process_stratum_target *proc_target, pid_t pid, int ntraps,
       memset (&ws, 0, sizeof (ws));
       event_ptid = target_wait (resume_ptid, &ws, 0);

+      void print_target_wait_results (ptid_t waiton_ptid, ptid_t result_ptid,
+                                     const struct target_waitstatus *ws);
+      print_target_wait_results (resume_ptid, event_ptid, &ws);
+
       if (last_waitstatus != NULL)
        *last_waitstatus = ws;
       if (last_ptid != NULL)


Simon

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Limit the switch_to_thread() calls in startup_inferior()
  2020-11-01 18:33   ` Samuel Thibault
@ 2020-11-02  0:47     ` Simon Marchi
  0 siblings, 0 replies; 5+ messages in thread
From: Simon Marchi @ 2020-11-02  0:47 UTC (permalink / raw)
  To: Samuel Thibault; +Cc: Kamil Rytarowski, gdb-patches

On 2020-11-01 1:33 p.m., Samuel Thibault wrote:
> Hello,
>
> (Note that I have never really hacked very much into gdb's inferior
> management, so I don't know much how things are supposed to be working.)
>
> Simon Marchi, le sam. 24 oct. 2020 22:58:05 -0400, a ecrit:
>> I tried to wrap my head around this and understand the need for the
>> switch_to_thread, what could be the impacts of your change, and I just
>> can't give anything close to a definitive answer.
>>
> [...]
>> Samuel, as the last person who touched gnu-nat.c, do you have a bit of
>> time to test Kamil's patch on Hurd, see if it breaks starting inferiors?
>
> I am not really sure what I am supposed to test. I tried a program that
> starts a second thread which just pause()s. With Kamil's patch I can
> switch & backtrace between the two threads.

Just testing a "run" would have been sufficient, but you've tested even
more, which doesn't hurt.

>
> (gdb) r
> Starting program: /home/samy/test
> ^C[New Thread 13012.6]
> [New Thread 13012.7]
>
> Thread 5 received signal SIGINT, Interrupt.
> 0x0107951c in ?? () from /lib/i386-gnu/libc.so.0.3
> (gdb) info thread
>   Id   Target Id         Frame
> * 5    Thread 13012.5    0x0107951c in ?? () from /lib/i386-gnu/libc.so.0.3
>   6    Thread 13012.6    0x0109dc74 in _hurd_intr_rpc_mach_msg ()
>    from /lib/i386-gnu/libc.so.0.3
>   7    Thread 13012.7    0x0107951c in ?? () from /lib/i386-gnu/libc.so.0.3
> (gdb) bt
> #0  0x0107951c in ?? () from /lib/i386-gnu/libc.so.0.3
> #1  0x01079cc6 in mach_msg () from /lib/i386-gnu/libc.so.0.3
> [...]
> #5  0x0104cdca in pthread_join () from /lib/i386-gnu/libpthread.so.0.3
> #6  0x0803254b in main () at test.c:13
> (gdb) thread 7
> [Switching to thread 7 (Thread 13012.7)]
> #0  0x0107951c in ?? () from /lib/i386-gnu/libc.so.0.3
> (gdb) bt
> #0  0x0107951c in ?? () from /lib/i386-gnu/libc.so.0.3
> [...]
> #3  0x0114e4ce in pause () from /lib/i386-gnu/libc.so.0.3
> #4  0x080324fa in f (foo=0x0) at test.c:6
> #5  0x0104c3ff in entry_point () from /lib/i386-gnu/libpthread.so.0.3
> #6  0x00000000 in ?? ()

Ok, so it all seems good.

>> And what if you remove all the switch_to_thread from that function, does
>> it still work?
>
> If I remove all calls from that fonction I am getting
>
> (gdb) r
> Starting program: /home/samy/test
> Can't fetch registers from thread bogus thread id 1: No such thread
> (gdb) info thread
>   Id   Target Id         Frame
>   5    Thread 18821.5    0x000021e0 in ?? ()

> The current thread <Thread ID 1> has terminated.  See `help thread'.

Ok, so if we remove these switch_to_thread calls, we get back to the
initial problem reported in

    https://sourceware.org/legacy-ml/gdb/2008-10/msg00034.html

So we know they are still necessary, thanks for your help.

> Since I now have a built tree I can easily experiment more.

I think we have all we need for now, thanks!

Simon

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-11-02  0:47 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-14  8:49 [PATCH] Limit the switch_to_thread() calls in startup_inferior() Kamil Rytarowski
2020-10-25  2:58 ` Simon Marchi
2020-11-01 18:33   ` Samuel Thibault
2020-11-02  0:47     ` Simon Marchi
2020-11-02  0:34 ` Simon Marchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).