From: Aditya Kamath1 <Aditya.Kamath1@ibm.com>
To: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>,
	"gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] Fix AIX thread NULL assertion failure during fork
Date: Tue, 21 Nov 2023 07:30:50 +0000
Message-ID: <CH2PR15MB35441471C82C5C2455447A59D6BBA@CH2PR15MB3544.namprd15.prod.outlook.com>
In-Reply-To: <101f32d12e4e058fa0199bbf0b38ecfe43fcd551.camel@de.ibm.com>


Hi Ulrich and community,

>So here's the piece I do not fully understand: Yes, wait () inside rs6000-aix-nat.c
>will set the ourstatus child_ptid to a non-threaded ptid.  But that's for a newly
>created child process that should not *be* threaded at this time, right?

Yes, up to this point it is correct and we are on the same page. The child ptid is ptid_t (pid, 0, 0), and the child should not be threaded here. So we have told the GDB core that the parent has a non-threaded child.

>So how is it possible that in between wait () setting the child_ptid and infrun.c
>using it to switch to the child, the child is becoming multi-threaded?  Where is
>the sync_threadlists () call that makes this happen?

>I think we should understand better how this could have happened.

I'm sorry, I left out one piece of information. The parent process is loaded and is multi-threaded. The child is then created, and through wait () we have informed the GDB core that a fork () event has happened and given it the information it needs.

The child's object file is loaded soon after. So new_objfile () is called, which in turn calls pd_enable (), which calls pd_activate (), then pd_update (), then sync_threadlists (). Once in sync_threadlists (), the child's ptid gets synced to ptid_t (pid, 0, utid): cmp_result is positive because pbuf has a user thread ID while gbuf only has the ptid of a non-threaded process. This is where the mess happens, and we end up changing the ptid via thread_change_ptid (). After this, we know the child by a threaded ptid, but the GDB core is still using ptid_t (pid, 0, 0).
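To make the mismatch concrete, here is a toy model of the sequence. It is only a sketch: toy_ptid, toy_thread_list and fork_ptid_still_found are hypothetical names made up for illustration and are not GDB's real ptid_t, thread list or follow-fork code. It only shows that renaming the single entry leaves the ptid remembered from the fork event pointing at nothing.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for GDB's ptid_t (pid, lwp, tid).  Hypothetical, not the
// real class.
struct toy_ptid
{
  int pid;
  long lwp;
  unsigned long tid;

  bool operator== (const toy_ptid &other) const
  {
    return pid == other.pid && lwp == other.lwp && tid == other.tid;
  }
};

// Toy thread list, playing the role of the GDB core's thread table.
struct toy_thread_list
{
  std::vector<toy_ptid> threads;

  // Return the matching entry, or nullptr if the ptid is unknown.
  const toy_ptid *find (const toy_ptid &ptid) const
  {
    for (const toy_ptid &t : threads)
      if (t == ptid)
        return &t;
    return nullptr;
  }

  // Rename an entry in place, which is conceptually what
  // thread_change_ptid () does.
  void change_ptid (const toy_ptid &old_ptid, const toy_ptid &new_ptid)
  {
    for (toy_ptid &t : threads)
      if (t == old_ptid)
        t = new_ptid;
  }
};

// Replay the sequence.  Returns true if the ptid remembered from the fork
// event can still be found after the sync step renamed the entry.
bool fork_ptid_still_found (int child_pid, unsigned long user_tid)
{
  toy_thread_list core;
  toy_ptid from_wait = { child_pid, 0, 0 };   // As reported by wait ().
  core.threads.push_back (from_wait);
  assert (core.find (from_wait) != nullptr);  // Known before the sync.

  // Sync step: the only entry becomes a threaded ptid.
  core.change_ptid (from_wait, toy_ptid { child_pid, 0, user_tid });

  // The follow-fork side still holds the old, non-threaded ptid.
  return core.find (from_wait) != nullptr;
}
```

In this model, fork_ptid_still_found (13763052, 1) returns false. That null lookup is the toy equivalent of the `thr != NULL' assertion in switch_to_thread.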

Perhaps the GDB core will update this ptid later; I am not sure. But we need to stop pd_activate () from syncing the thread lists when the call is made for a child process whose object file has just been loaded and to which the GDB core has yet to switch after detaching the parent, as the user's debugging settings request. If you recall, we already check inf->in_initial_library_scan, but in this case that flag does not stop the bug from happening.

That is why the patch I sent in the previous email checks whether the process has only one thread and, if so, does not change the ptid to a threaded one.
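As a rough illustration of that check, and not the actual patch, the decision can be written as a small predicate. The function name and its parameters are hypothetical; in aix-thread.c the inputs would have to come from the gbuf and pbuf data in sync_threadlists ().

```cpp
#include <cstddef>

// Hypothetical helper: decide whether a GDB thread entry should be renamed
// to a threaded ptid during a sync.  This is an assumption about the shape
// of the check, not code from the patch.
bool should_switch_to_threaded_ptid (bool gdb_entry_is_threaded,
                                     std::size_t pthread_count)
{
  if (gdb_entry_is_threaded)
    return false;               // Already threaded, nothing to rename.
  return pthread_count > 1;     // A lone thread keeps ptid_t (pid, 0, 0).
}
```

With this predicate, a freshly forked child with a single thread keeps the ptid that wait () reported, so the GDB core can still find it when it switches to the child.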

So that is the thought process. Let me know what you think. I am pasting the output with prints added in pd_update () and pd_enable (). We can clearly see why this is happening. Hope it helps.

Have a nice day ahead.

Thanks and regards,
Aditya.

-------------------------------------------------------
Reading symbols from //gdb_tests/multi-thread-fork...
(gdb) set follow-fork-mode child
(gdb) r
Starting program: /gdb_tests/multi-thread-fork
pd_update pid = 9044280
pid in sync_threadlists () is 9044280
pd_update pid = 9044280
pid in sync_threadlists () is 9044280
pd_update pid = 9044280
pid in sync_threadlists () is 9044280
[New Thread 258]
[New Thread 515]
[Attaching after Thread 515 fork to child process 13763052]
[New inferior 2 (process 13763052)]
[Detaching after fork from parent process 9044280]
Hello from Parent!
[Inferior 1 (process 9044280) detached]
Hello from Child!
Hello from Parent!
In pd_enable with pid 13763052
pd_update pid = 13763052
pid in sync_threadlists () is 13763052
thread.c:1385: internal-error: switch_to_thread: Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.



From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Date: Monday, 20 November 2023 at 4:57 PM
To: gdb-patches@sourceware.org <gdb-patches@sourceware.org>, Aditya Kamath1 <Aditya.Kamath1@ibm.com>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] Fix AIX thread NULL assertion failure during fork
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>Assume we have set detach_on_fork = on and set follow-fork-mode child.
>In AIX, on a fork () event we set our status and return our parent ptid from rs6000-aix-nat.c.
>Once the object file of the new_inferior or child process is loaded we call pd_enable () to
>set our thread target and sync our threadlists. In our sync_threadlists we have pbuf having
>our pthread library threads and gbuf having our GDB threads known to GDB core that have been
>registered.

>While I cannot say with 100% certainty where GDB core got this ptid from and why it did not
>update to ptid_t (pid, 0, tid), my observation after debugging is that GDB core would have got
>ptid_t (pid, 0, 0) from the rs6000-aix-nat.c file, inside the wait () where we informed GDB,
>by setting a status, that this is a child process belonging to a parent process on a fork event.
>GDB could not change this ptid it got during the fork event, even though we changed it later
>via sync_threadlists () from aix-thread.c for the threaded event.

So here's the piece I do not fully understand: Yes, wait () inside rs6000-aix-nat.c
will set the ourstatus child_ptid to a non-threaded ptid.  But that's for a newly
created child process that should not *be* threaded at this time, right?

So how is it possible that in between wait () setting the child_ptid and infrun.c
using it to switch to the child, the child is becoming multi-threaded?  Where is
the sync_threadlists () call that makes this happen?

I'm aware that the wait () in aix-thread.c (which is the caller of the rs6000-aix-nat.c
one) does perform a pd_enable / sync_threadlists, but only on the *parent*, not
on the child.  That should happen only later.

I think we should understand better how this could have happened.

If there is a good reason why the child can already be multi-threaded, then
one option to fix this would be to switch ourstatus->child_ptid to the multi-
threaded version in the *aix-thread.c* version of wait (), just like it
switches the returned ptid.

Bye,
Ulrich
