public inbox for gdb-patches@sourceware.org
* [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
@ 2022-10-25  6:47 Aditya Kamath1
  2022-10-28  9:49 ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2022-10-25  6:47 UTC (permalink / raw)
  To: Ulrich Weigand, simark, Aditya Kamath1 via Gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 2574 bytes --]

Hi all,

In the latest GDB, AIX users are unable to debug multi-threaded programs.

I have fixed it. Please find attached the patch. [See 0001-Fix-multi-thread-debug-bug-in-AIX.patch].

I am also pasting the output of the failure and the output after applying the patch.

Kindly let me know if any changes are required; if not, kindly push this patch so that AIX folks can debug multi-threaded programs again.

Have a nice day ahead.

Thanks and regards,
Aditya.

---------------------------------------------------------

The code:- [Program Credits:- GDB TESTSUITE gdb_threads/continue-pending-status.c ]


#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  pthread_barrier_wait (&barrier);

  while (1);
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    sleep (1);

  return 0;
}

OUTPUT BEFORE PATCH:-

(gdb) r
Starting program: /home/gdb_tests/continue-pending-status
[New Thread 1]
./../gdbsupport/gdb-checked-static-cast.h:58: internal-error: checked_static_cast: Assertion `result != nullptr' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0x1010fb657 ???
0x1010fb81f ???

OUTPUT AFTER PATCH:-

(gdb) r
Starting program: /home/gdb_tests/continue-pending-status
[New Thread 1]
^C[New Thread 258]
[New Thread 515]

Thread 1 received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info threads
  Id   Target Id                          Frame
* 1    process 7602548                    0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  2    Thread 1 (tid 27984319, running)   0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  3    Thread 258 (tid 37093859, running) thread_function (arg=0x0)
    at continue-pending-status.c:36
  4    Thread 515 (tid 35062111, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at continue-pending-status.c:36
(gdb) q



[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 1369 bytes --]

From 15797e8153fd90e6f457cc3bcd0237fbfd4491d4 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Tue, 25 Oct 2022 01:35:33 -0500
Subject: [PATCH] Fix multi thread debug bug in AIX

The recent commit 98ed24fb35d89eb20179edf6c12f599c7a9e228e, made by Mr. Tom, changed aix-thread.c to use

 gdb::checked_static_cast <aix_thread_info *> instead of static_cast <aix_thread_info *>.

As a result, AIX users on the latest version are not able to debug multi-threaded programs.

The error on AIX is as follows:

internal-error: checked_static_cast: Assertion 'result != nullptr' failed.

The reason is that the first thread in a multi-threaded program does not have priv set.

Hence we need to add this check.

Any later threads go through gdb::checked_static_cast without trouble once this condition has been checked.
---
 gdb/aix-thread.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..b2dabd242fc 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -90,6 +90,9 @@ struct aix_thread_info : public private_thread_info
 static aix_thread_info *
 get_aix_thread_info (thread_info *thread)
 {
+  if (thread->priv == NULL)
+    return NULL;
+
   return gdb::checked_static_cast<aix_thread_info *> (thread->priv.get ());
 }
 
-- 
2.31.1



* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-10-25  6:47 [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch Aditya Kamath1
@ 2022-10-28  9:49 ` Ulrich Weigand
  2022-11-08 12:00   ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2022-10-28  9:49 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

> static aix_thread_info *
> get_aix_thread_info (thread_info *thread)
> {
>+  if (thread->priv == NULL)
>+    return NULL;

This doesn't look right.  Note that all users of
get_aix_thread_info assume the pointer returned
from there is never NULL.

You should find out why the "priv" field isn't
set up correctly, and fix whatever was going
wrong there.  (I believe this should have been
done in sync_threadlists.)
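
To illustrate the concern, here is a minimal sketch of an accessor whose callers rely on a non-null result. It uses hypothetical stand-in types (`private_info`, `aix_info`, `thread_rec`), not GDB's real `thread_info` and `aix_thread_info`, and `register_thread` is likewise an invented helper. Under these assumptions, returning NULL from the accessor would only move the failure into every caller, whereas attaching the private data when the thread is registered keeps the contract intact.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for GDB's private_thread_info / aix_thread_info.
struct private_info
{
  virtual ~private_info () = default;
};

struct aix_info : private_info
{
  long tid = 0;
};

// Hypothetical stand-in for GDB's thread_info.
struct thread_rec
{
  std::unique_ptr<private_info> priv;
};

// True if the record carries AIX-specific private data.
static bool
has_aix_info (const thread_rec &t)
{
  return dynamic_cast<const aix_info *> (t.priv.get ()) != nullptr;
}

// Accessor with a non-null contract: callers use the result directly.
static aix_info &
get_aix_info (thread_rec &t)
{
  aix_info *info = dynamic_cast<aix_info *> (t.priv.get ());
  assert (info != nullptr);   // contract: priv was set at registration
  return *info;
}

// Registration is the place where the contract is established.
static void
register_thread (thread_rec &t, long tid)
{
  auto info = std::make_unique<aix_info> ();
  info->tid = tid;
  t.priv = std::move (info);
}
```

A record that was never passed through `register_thread` trips the assertion in `get_aix_info`, which is the analogue of the internal error reported in this thread.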

Bye,
Ulrich



* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-10-28  9:49 ` Ulrich Weigand
@ 2022-11-08 12:00   ` Aditya Kamath1
  2022-11-08 12:17     ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2022-11-08 12:00 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 3999 bytes --]

Hi Ulrich,


>You should find out why the "priv" field isn't
>set up correctly, and fix whatever was going
>wrong there.  (I believe this should have been
>done in sync_threadlists.)

You were right about this. The libpthread library treats the main process and the thread representing it as two separate threads. The main process had no private data set, whereas the thread representing it did. Both of them should have private data, and it must be the same.

For example, consider the program below [program credits: GDB test case continue-pending-status.c]:


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  pthread_barrier_wait (&barrier);

  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    sleep (1);

  return 0;
}

Here is the GDB output for the above code. When I switched to GDB thread 2 (shown as "Thread 1", which is the same thread as the main process), continued, and interrupted, "Thread 1" received the signal and GDB switched back to the process entry. So when we added private data in sync_threadlists (), we added it for GDB thread 2 but not for GDB thread 1, which is the main process and represents the same thread. This is why we got the assertion failure: GDB thread 1 did not have private data.


Reading symbols from /home/XYZ/gdb_tests/continue-pending-status...
(gdb) r
Starting program: /home/XYZ/gdb_tests/continue-pending-status
[New Thread 1]
^C[New Thread 258]
[New Thread 515]

Thread 1 received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info threads
  Id   Target Id                          Frame
* 1    process 12059046                   0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
  2    Thread 1 (tid 39125487, running)   0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
  3    Thread 258 (tid 23396809, running) thread_function (arg=0x0) at continue-pending-status.c:36
  4    Thread 515 (tid 36503883, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0) at continue-pending-status.c:36
(gdb) thread 2
[Switching to thread 2 (Thread 1)]
#0  0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) c
Continuing.
^C
Thread 1 received signal SIGINT, Interrupt.
[Switching to process 12059046]
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb)

I have written my comments in the patch. I hope this works; if it looks right, kindly push it to git, otherwise let me know what you think.

Have a nice day ahead.

Thanks and regards,
Aditya.

[-- Attachment #2: 0001-Fix-Multi-thread-debug-bug-fix-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 2022 bytes --]

From 32fc5ec2fa9c5431e1b5718ff8ab36467fb0e1cb Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Tue, 8 Nov 2022 05:27:44 -0600
Subject: [PATCH] Fix Multi thread debug bug fix in AIX

The recent commit 98ed24fb35d89eb20179edf6c12f599c7a9e228e, made by Mr. Tom, changed aix-thread.c to use

gdb::checked_static_cast <aix_thread_info *> instead of static_cast <aix_thread_info *>.

As a result, AIX users on the latest version are not able to debug multi-threaded programs.

The error on AIX is as follows:

internal-error: checked_static_cast: Assertion 'result != nullptr' failed.

The reason is that once the threads are synchronised with sync_threadlists () and threads are added with priv,

we iterate over the threads to find the thread that caused the event and return its ptid.

However, the libpthread library treats the main process and its thread as separate threads, though they are one thread.

This patch fixes that.
---
 gdb/aix-thread.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..07190543973 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -816,6 +816,19 @@ sync_threadlists (int pid)
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
 
+	  /* When the new thread is added and the pthread library is 
+             initialised, the process is threaded but in the 
+             libpthread library it will be counted as two threads
+             one with the main process and second one with the thread
+             that is added.  The main process thread needs to have a
+             private data.  The thread we added will have but main 
+             process will not. Hence the below chunk code does this.  */
+
+          inferior *inf = find_inferior_pid (proc_target, pid);
+          for (thread_info *tp : inf->threads ())
+            if (tp->priv == NULL)
+              tp->priv.reset (priv); 
+
 	  pi++;
 	}
       else
-- 
2.31.1



* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-11-08 12:00   ` Aditya Kamath1
@ 2022-11-08 12:17     ` Ulrich Weigand
  2022-11-13 18:15       ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2022-11-08 12:17 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>>You should find out why the "priv" field isn't
>>set up correctly, and fix whatever was going
>>wrong there.  (I believe this should have been
>>done in sync_threadlists.)
>
>You were right about this. What is happening is the main process 
>and the thread representing it are treated as two separate threads
>by the libpthread library. Main process had no private data set
>whereas the thread representing it had. Usually, both of them
>should have it and their private data must be the same. 

I see.  I agree this is the root cause of the issue, but the fix
doesn't look quite right to me.

You should not even *have* the duplicate GDB thread in the first place.

>(gdb) info threads
>  Id   Target Id                          Frame 
>* 1    process 12059046                   0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
>  2    Thread 1 (tid 39125487, running)   0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
>  3    Thread 258 (tid 23396809, running) thread_function (arg=0x0) at continue-pending-status.c:36
>  4    Thread 515 (tid 36503883, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

This is not how GDB threads are handled on any other platform.  While
you do have a single dummy thread representing the process if it is
*non-threaded*, as soon as the process is recognized as multi-threaded,
you will only see a single GDB thread per target thread, and no
separate "thread" for the whole process.

So I think instead of adding a "priv" struct to that GDB thread
identifying the main process, the sync_threadlists routine should
actually just delete it (or replace it with the actual first thread,
whatever is easier).

Looking at the code, there already seems to be a place where
sync_threadlists deletes GDB threads that do not match onto
any of the threads reported by the AIX thread library - can
you verify why this doesn't trigger here?

Bye,
Ulrich



* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-11-08 12:17     ` Ulrich Weigand
@ 2022-11-13 18:15       ` Aditya Kamath1
  2022-11-15 18:16         ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2022-11-13 18:15 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 6413 bytes --]

Hi Ulrich,

Please find attached the new patch; see 0001-Fix-Multi-thread-debug-bug-fix-in-AIX.patch.

>So I think instead of adding a "priv" struct to that GDB thread
>identifying the main process, the sync_threadlists routine should
>actually just delete it (or replace it with the actual first thread,
>whatever is easier).

I have chosen not to add the main thread as a new thread. Instead, we carry on with the main process thread itself and add private data to it; kindly see the first if condition. I observed that on Linux, as you mentioned, no new thread is added for the main process the first time the debuggee is recognised as multi-threaded.

>Looking at the code, there already seems to be a place where
>sync_threadlists deletes GDB threads that do not match onto
>any of the threads reported by the AIX thread library - can
>you verify why this doesn't trigger here?

That code handles the case where a thread exits: the exit is reflected in the pthread library's thread buffer but not in GDB's thread buffer. To keep the two in sync, we delete the extra entries from GDB's thread buffer. It is not going to trigger for extra threads that represent the same thread, whether for the main process or anything else. pi == pcount happens only in the thread-exit case: the pthread buffer reflects the exit immediately, but in the GDB thread buffer we have to delete the exited threads ourselves.

A couple of things I want to point out: from here on, the way the second for loop executes is no longer correct for syncing the two buffer lists [pthread and GDB thread]. Since we no longer add two threads for the main thread of the process (one representing the GDB thread and the other from the pthread library), those conditions and indices like pi and gi will fail. There are now always pcount - 1 threads in the GDB thread buffer. Conditions 2 and 3 in the patch take care of adding and deleting threads.

Attaching a sample output with source code.

Let me know what you think and whether there is any case I may have missed. If it is okay, then kindly push this patch.

Have a nice day ahead.

Thanks,
Regards,
Aditya.

------------------------------------------------------------------
Consider the program below:- [ Program Credits:-  GDB test case continue-pending-status.c]


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  pthread_barrier_wait (&barrier);

  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    sleep (1);

  return 0;
}

Output after patch:-


Reading symbols from /home/XYZ/gdb_tests/continue-pending-status...
(gdb) r
Starting program: /home/XYZ/gdb_tests/continue-pending-status
^C[New Thread 258]
[New Thread 515]

Thread 1 received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info threads
  Id   Target Id                          Frame
* 1    process 26149278                   0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  2    Thread 258 (tid 24445361, running) thread_function (arg=0x0)
    at continue-pending-status.c:36
  3    Thread 515 (tid 16187681, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at continue-pending-status.c:36
(gdb)



[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 4134 bytes --]

From b9af3062ad2037b8acb463a0bba0b8cfc6a1c405 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Sun, 13 Nov 2022 11:48:52 -0600
Subject: [PATCH] Fix multi thread debug bug in AIX

When a process becomes multi-threaded, GDB on AIX was adding a new thread with priv set.

This thread is the same as the main process thread, which has no private data set.

Hence the assertion failure "checked_static_cast: Assertion result != nullptr failed" is seen.

This patch fixes this: only newly created threads are added, and once a program

 is multi-threaded, the main process's private data is set instead of adding a new thread for it.
---
 gdb/aix-thread.c | 104 +++++++++++++++++++++--------------------------
 1 file changed, 47 insertions(+), 57 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..6f910fcf847 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -795,69 +795,59 @@ sync_threadlists (int pid)
   g = gbuf = XNEWVEC (struct thread_info *, gcount);
   iterate_over_threads (giter_accum, &g);
   qsort (gbuf, gcount, sizeof *gbuf, gcmp);
+  
+  /* If this is the first time the process is pthreaded then
+     the main process running as one thread must have a private
+     data.  */
+ 
+  if (pcount == 1 && gcount == 0)
+  {
+    pi = 0;
+    aix_thread_info *priv = new aix_thread_info;
+    priv->pdtid = pbuf[pi].pdtid;
+    priv->tid = pbuf[pi].tid;
+    process_stratum_target *proc_target
+                = current_inferior ()->process_target ();
+    thread_info *tp = find_thread_ptid (proc_target, ptid_t (pid));
+      if (tp->priv == NULL)
+        tp->priv.reset (priv);
+  }
 
-  /* Apply differences between the two arrays to GDB's thread list.  */
-  for (pi = gi = 0; pi < pcount || gi < gcount;)
-    {
-      if (pi == pcount)
-	{
-	  delete_thread (gbuf[gi]);
-	  gi++;
-	}
-      else if (gi == gcount)
-	{
-	  aix_thread_info *priv = new aix_thread_info;
-	  priv->pdtid = pbuf[pi].pdtid;
-	  priv->tid = pbuf[pi].tid;
-
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
-	  thread = add_thread_with_info (proc_target,
-					 ptid_t (pid, 0, pbuf[pi].pthid),
-					 priv);
-
-	  pi++;
-	}
-      else
-	{
-	  ptid_t pptid, gptid;
-	  int cmp_result;
+  /* This means that new threads have been created and
+     they need to be added in the GDB threads list.  */
 
-	  pptid = ptid_t (pid, 0, pbuf[pi].pthid);
-	  gptid = gbuf[gi]->ptid;
-	  pdtid = pbuf[pi].pdtid;
-	  tid = pbuf[pi].tid;
+  else if (pcount > 1 && gcount < pcount)
+  {
+    pi = 1;
+    while (pcount - 1 != gcount)
+    {
+      aix_thread_info *priv = new aix_thread_info;
+      priv->pdtid = pbuf[pi].pdtid;
+      priv->tid = pbuf[pi].tid;
 
-	  cmp_result = ptid_cmp (pptid, gptid);
+      process_stratum_target *proc_target
+        = current_inferior ()->process_target ();
+      thread = add_thread_with_info (proc_target,
+                                     ptid_t (pid, 0, pbuf[pi].pthid),
+                                     priv);
+      pi++;
+      gcount++; 
+    }    
+  }
 
-	  if (cmp_result == 0)
-	    {
-	      aix_thread_info *priv = get_aix_thread_info (gbuf[gi]);
+  /* This condition implies the threads have exited or died.  Hence
+     we delete those exited threads.  Since we don't add a thread
+     for main process pcount must be equal to gcount + 1.  */
 
-	      priv->pdtid = pdtid;
-	      priv->tid = tid;
-	      pi++;
-	      gi++;
-	    }
-	  else if (cmp_result > 0)
-	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
-	    }
-	  else
-	    {
-	      process_stratum_target *proc_target
-		= current_inferior ()->process_target ();
-	      thread = add_thread (proc_target, pptid);
-
-	      aix_thread_info *priv = new aix_thread_info;
-	      thread->priv.reset (priv);
-	      priv->pdtid = pdtid;
-	      priv->tid = tid;
-	      pi++;
-	    }
-	}
+  else if (pcount >= 1 && gcount > pcount)
+  {
+    gi  = 0;
+    while (gcount != pcount - 1)
+    {
+      delete_thread (gbuf[gi++]);
+      gcount--;
     }
+  }
 
   xfree (pbuf);
   xfree (gbuf);
-- 
2.31.1



* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-11-13 18:15       ` Aditya Kamath1
@ 2022-11-15 18:16         ` Ulrich Weigand
  2022-11-21  8:27           ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2022-11-15 18:16 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>>So I think instead of adding a "priv" struct to that GDB thread
>>identifying the main process, the sync_threadlists routine should
>>actually just delete it (or replace it with the actual first thread,
>>whatever is easier).
>
>I have chosen not to add the first main thread as new thread. Instead,
>we carry on with main process thread itself adding private data to it.
>Kindly see the first if condition. I observed this with the linux folks
>where in their output as you mentioned do not add any new threads the
>first time on recognition of multi thread debugee for the main process.  

OK, but this is still weird:
>* 1    process 26149278                   0xd0595fb0 in _p_nsleep ()
>  2    Thread 258 (tid 24445361, running) thread_function (arg=0x0)
>  3    Thread 515 (tid 16187681, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

Why does the first thread look so different?  That's not the
case with Linux threads.  I believe even if you re-use the
thread structure, you'll still need to switch the ptid to one
that indicates a thread instead of a non-threaded process.


>A couple of things I want to inform you is that the way the second
>for loop is executing is not correct from here on to sync both the
>buffer lists [pthread and GDB thread]. Since we are now not adding
>multiple threads for the same process main thread one representing
>the GDB thread and the other by the pthread those conditions and
>indices like pi and gi will fail. Now there has not pcount - 1
>threads in the GDB thread buffer always. Condition 2 and 3 in the
>patch take care of them for addition and deletion of threads. 

The new logic doesn't look correct to me - note that it never
even looks at thread IDs any more, just the raw number of threads.
So for example if *any* thread exits, the code will always delete
the *last* thread from the GDB list - whether this is actually
the one that exited or not.
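
A minimal sketch of this failure mode, using plain integers as hypothetical stand-ins for thread IDs (this is not the code from the patch, and `prune_by_count` is an invented name): a reconciliation step that only compares counts has no way of knowing which entry to drop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count-only reconciliation: shrink GDB's list until its size matches
// the number of live threads, without checking which IDs are alive.
static std::vector<int>
prune_by_count (std::vector<int> gdb_ids, std::size_t live_count)
{
  while (gdb_ids.size () > live_count)
    gdb_ids.pop_back ();
  return gdb_ids;
}
```

If GDB's list is {1, 258, 515} and thread 258 exits, the library reports two live threads, so `prune_by_count` returns {1, 258}: the exited thread stays and the live thread 515 is dropped.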

I do think it is necessary to compare thread IDs - you need to
map the thread IDs retrieved by libpthdebug against the thread
IDs already present in GDB's thread list.  If a matching thread
ID is present in both lists, it should not be touched.  If a
thread ID occurs only in the libpthdebug list, it needs to be
added to GDB's list.  If a thread ID occurs only in GDB's list,
it needs to be removed from there.
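
The matching described above can be sketched as a two-index walk over two sorted ID lists. This is an illustration under simplifying assumptions, not the aix-thread.c implementation: thread IDs are plain integers, both lists are sorted and free of duplicates, and `thread_diff` and `diff_thread_ids` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct thread_diff
{
  std::vector<int> to_add;      // IDs only in the library's list
  std::vector<int> to_delete;   // IDs only in GDB's list
};

// Compare two sorted, duplicate-free ID lists.
static thread_diff
diff_thread_ids (const std::vector<int> &lib_ids,
                 const std::vector<int> &gdb_ids)
{
  assert (std::is_sorted (lib_ids.begin (), lib_ids.end ()));
  assert (std::is_sorted (gdb_ids.begin (), gdb_ids.end ()));

  thread_diff d;
  std::size_t pi = 0, gi = 0;
  while (pi < lib_ids.size () || gi < gdb_ids.size ())
    {
      if (pi == lib_ids.size ())
        d.to_delete.push_back (gdb_ids[gi++]);   // GDB-only tail
      else if (gi == gdb_ids.size ())
        d.to_add.push_back (lib_ids[pi++]);      // library-only tail
      else if (lib_ids[pi] == gdb_ids[gi])
        {
          // Present in both lists: leave it untouched.
          pi++;
          gi++;
        }
      else if (lib_ids[pi] > gdb_ids[gi])
        d.to_delete.push_back (gdb_ids[gi++]);   // missing from library
      else
        d.to_add.push_back (lib_ids[pi++]);      // missing from GDB
    }
  return d;
}
```

With the library reporting {1, 515, 772} and GDB holding {1, 258, 515}, the walk marks 772 for addition and 258 for deletion, and leaves 1 and 515 alone.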

That's what the old code attempted to do as far as I can see;
if it got it wrong in certain corner cases, they need to be fixed;
but completely removing that logic seems just wrong.

Bye,
Ulrich



* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-11-15 18:16         ` Ulrich Weigand
@ 2022-11-21  8:27           ` Aditya Kamath1
  2022-11-23 14:15             ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2022-11-21  8:27 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 8326 bytes --]

Hi Ulrich,

Please find attached the new patch. See [0001-Fix-Multi-thread-debug-bug-fix-in-AIX.patch].


>Why does the first thread look so different?  That's not the
>case with Linux threads.  I believe even if you re-use the
>thread structure, you'll still need to switch the ptid to one
>that indicates a thread instead of a non-threaded process.

>That's what the old code attempted to do as far as I can see.
>if it got it wrong in certain corner cases, they need to be fixed.
>but completely removing that logic seems just wrong.

I misunderstood the code and what it was trying to do. You were right. In an attempt to balance pcount and gcount I messed the code up. I went back one version and corrected it.

I underestimated the power of the ptid class. As I explored parts of GDB, I realised what we can do with it. We need not delete the main thread. Instead, we can change its ptid from the one that represented the main process to the main thread's ptid. So, if my main thread has a thread_info but no private data, I know for a fact that this is the first time my debuggee is becoming multi-threaded (pthreaded). In that case, I change the ptid that represented the main process and set the private data. That is what the first branch below does.


+         if (tp != NULL && tp->priv == NULL)
+          {
+           thread_change_ptid (proc_target, tp->ptid,
+                               ptid_t (pid, 0, pbuf[pi].pthid));
+           tp->priv.reset (priv);
+          }
+         else
+           thread = add_thread_with_info (proc_target,
+                                          ptid_t (pid, 0, pbuf[pi].pthid),
+                                          priv);

This makes our main thread look the way threads look on Linux when using GDB. Kindly see the output at the end of this email, after the code.
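
The idea of re-using the existing entry can be sketched in isolation. The types here are hypothetical stand-ins (`id3` for a (pid, lwp, tid) triple like ptid_t, `entry` for a thread record), and `add_or_promote` is an invented helper, not a GDB function: if the process-wide entry is still present and has no private data, it is rewritten in place; otherwise a new entry is appended.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a (pid, lwp, tid) triple.
struct id3
{
  int pid;
  long lwp;
  long tid;
};

// Hypothetical stand-in for a thread record.
struct entry
{
  id3 id;
  bool has_priv;
};

// If the list still holds the process-wide entry (tid == 0, no private
// data), rewrite it in place; otherwise append a new entry.
// Returns true when an existing entry was promoted.
static bool
add_or_promote (std::vector<entry> &threads, int pid, long pthid)
{
  assert (pthid != 0);   // 0 is reserved for the process-wide entry

  for (entry &e : threads)
    if (e.id.pid == pid && e.id.tid == 0 && !e.has_priv)
      {
        e.id.tid = pthid;
        e.has_priv = true;
        return true;
      }

  entry fresh{{pid, 0, pthid}, true};
  threads.push_back (fresh);
  return false;
}
```

The first call for a process promotes its single process-wide entry, so the list never holds two records for the main thread; later calls append.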

-      switch_to_thread (current_inferior ()->process_target (),
-                 ptid_t (user_current_pid));
+     inferior *inf = find_inferior_ptid (current_inferior ()-> process_target (),
+                                 ptid_t (user_current_pid));
+        for (thread_info *tp: inf->threads ())
+       if (tp != NULL)
+          {
+            switch_to_thread (tp);
+            break;
+          }

If you recall, we made this change a few months back to ensure we are in the right context while reading memory. So far, the main process thread, with ptid_t (user_current_pid), was always there to switch to. But now, if the debuggee is multi-threaded, the ptid will be of the form ptid_t (pid, 0, pbuf[pi].pthid); that is, there is no longer a ptid representing the process as a whole. Hence this change.

Let me know whether my reasoning is correct and what you think of my analysis. If this is good, kindly push it; otherwise let me know what I need to change.
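
A small sketch of why the lookup has to change, again with a hypothetical `id3` triple in place of ptid_t and invented helper names: once every entry carries a thread ID, an exact lookup of the process-wide ID finds nothing, so the code falls back to any thread belonging to the process.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a (pid, lwp, tid) triple.
struct id3
{
  int pid;
  long lwp;
  long tid;
};

// Exact lookup: succeeds only while an entry with exactly this
// triple, e.g. the process-wide (pid, 0, 0), is still in the list.
static const id3 *
find_exact (const std::vector<id3> &threads, const id3 &want)
{
  for (const id3 &t : threads)
    if (t.pid == want.pid && t.lwp == want.lwp && t.tid == want.tid)
      return &t;
  return nullptr;
}

// Fallback once every entry carries a thread ID: take the first
// thread that belongs to the process.
static const id3 *
find_any_of_process (const std::vector<id3> &threads, int pid)
{
  for (const id3 &t : threads)
    if (t.pid == pid)
      return &t;
  return nullptr;
}
```

For a list holding (100, 0, 1) and (100, 0, 258), `find_exact` with (100, 0, 0) returns a null pointer, while `find_any_of_process` with pid 100 returns the first entry.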

Have a nice day ahead.

Thanks and regards,
Aditya.

-----------------------------------------------------------------------

PROGRAM:- [ Credits -- GDB test case continue-pending-status under gdb.threads ]

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 3

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */
  pthread_barrier_wait (&barrier);

  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    sleep (1);

  return 0;
}

---------------------------------------------------------------
Output:-



Reading symbols from /home/xyz/gdb_tests/continue-pending-status...
(gdb) r
Starting program: /home/xyz/gdb_tests/continue-pending-status
^C[New Thread 258]
[New Thread 515]
[New Thread 772]

Thread 1 received signal SIGINT, Interrupt.
[Switching to Thread 1]
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info threads
  Id   Target Id                          Frame
* 1    Thread 1 (tid 32309733, running)   0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
  2    Thread 258 (tid 31850777, running) thread_function (arg=0x0) at /home/xyz/gdb_tests/continue-pending-status.c:36
  3    Thread 515 (tid 30474663, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0) at /home/xyz/gdb_tests/continue-pending-status.c:36
  4    Thread 772 (tid 33423627, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0) at /home/xyz/gdb_tests/continue-pending-status.c:36
(gdb)

________________________________
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Sent: 15 November 2022 23:46
To: simark@simark.ca <simark@simark.ca>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>>So I think instead of adding a "priv" struct to that GDB thread
>>identifying the main process, the sync_threadlists routine should
>>actually just delete it (or replace it with the actual first thread,
>>whatever is easier).
>
>I have chosen not to add the first main thread as new thread. Instead,
>we carry on with main process thread itself adding private data to it.
>Kindly see the first if condition. I observed this with the linux folks
>where in their output as you mentioned do not add any new threads the
>first time on recognition of multi thread debugee for the main process.

OK, but this is still weird:
>* 1    process 26149278                   0xd0595fb0 in _p_nsleep ()
>  2    Thread 258 (tid 24445361, running) thread_function (arg=0x0)
>  3    Thread 515 (tid 16187681, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

Why does the first thread look so different?  That's not the
case with Linux threads.  I believe even if you re-use the
thread structure, you'll still need to switch the ptid to one
that indicates a thread instead of a non-threaded process.


>A couple of things I want to inform you is that the way the second
>for loop is executing is not correct from here on to sync both the
>buffer lists [pthread and GDB thread]. Since we are now not adding
>multiple threads for the same process main thread one representing
>the GDB thread and the other by the pthread those conditions and
>indices like pi and gi will fail. Now there has not pcount - 1
>threads in the GDB thread buffer always. Condition 2 and 3 in the
>patch take care of them for addition and deletion of threads.

The new logic doesn't look correct to me - note that it never
even looks at thread IDs any more, just the raw number of threads.
So for example if *any* thread exits, the code will always delete
the *last* thread from the GDB list - whether this is actually
the one that exited or not.

I do think it is necessary to compare thread IDs - you need to
map the thread IDs retrieved by libpthdebug against the thread
IDs already present in GDB's thread list.  If a matching thread
ID is present in both lists, it should not be touched.  If a
thread ID occurs only in the libpthdebug list, it needs to be
added to GDB's list.  If a thread ID occurs only in GDB's list,
it needs to be removed from there.

That's what the old code attempted to do as far as I can see;
if it got it wrong in certain corner cases, they need to be fixed;
but completely removing that logic seems just wrong.
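
As a rough illustration of the ID-based comparison described above, here is a minimal sketch. It is not the aix-thread.c code: plain int IDs stand in for ptids, both lists are assumed to be sorted already, and the names sync_plan and plan_sync are hypothetical.

```cpp
#include <cassert>
#include <algorithm>
#include <cstddef>
#include <vector>

// Result of comparing the two thread-ID lists.
struct sync_plan
{
  std::vector<int> to_add;     // IDs only the thread library reports
  std::vector<int> to_delete;  // IDs only the debugger still tracks
};

// Walk two sorted ID lists in lockstep and classify every ID.
// IDs present in both lists are left untouched.
sync_plan
plan_sync (const std::vector<int> &lib_ids, const std::vector<int> &gdb_ids)
{
  assert (std::is_sorted (lib_ids.begin (), lib_ids.end ()));
  assert (std::is_sorted (gdb_ids.begin (), gdb_ids.end ()));

  sync_plan plan;
  std::size_t pi = 0, gi = 0;
  while (pi < lib_ids.size () || gi < gdb_ids.size ())
    {
      if (gi == gdb_ids.size ()
          || (pi < lib_ids.size () && lib_ids[pi] < gdb_ids[gi]))
        plan.to_add.push_back (lib_ids[pi++]);      // library only
      else if (pi == lib_ids.size () || gdb_ids[gi] < lib_ids[pi])
        plan.to_delete.push_back (gdb_ids[gi++]);   // debugger only
      else
        {
          // Same ID on both sides: keep the existing thread.
          pi++;
          gi++;
        }
    }
  return plan;
}
```

The point of the sketch is that the decision for each entry depends on the ID itself, so an exiting thread in the middle of the list is the one removed, not whichever entry happens to be last.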

Bye,
Ulrich


[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 2929 bytes --]

From a1ffc59b21cf75475719f070da80a45e70ba55ee Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Mon, 21 Nov 2022 02:05:08 -0600
Subject: [PATCH] Fix multi thread debug bug in AIX

When a process is multi threaded then AIX was adding a new thread with a priv set.

This thread is the same as main process thread without a private data set.

Hence an assertion failure checked_static_cast: Assertion result != nullptr failed is seen

This patch is a fix for the same where only new threads created are added and once a program

is multi threaded then the main process private data is set instead of adding a new thread for itself.
---
 gdb/aix-thread.c | 37 +++++++++++++++++++++++++++++++------
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..5792ec872fd 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -514,8 +514,16 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
     if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+    {
+	inferior *inf = find_inferior_ptid (current_inferior ()-> process_target (),
+					    ptid_t (user_current_pid));
+        for (thread_info *tp: inf->threads ()) 
+	  if (tp != NULL)
+          {
+            switch_to_thread (tp);
+            break;
+          }
+    }
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -719,7 +727,11 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      /* In a multi threaded application user can interrupt the main
+	 thread as well. This function should return the tid in this
+         case apart from threads that can trap or be interrupted.  */
+
+      if (thrinf.ti_cursig == SIGTRAP || thrinf.ti_cursig == SIGINT)
 	return thrinf.ti_tid;
     }
 
@@ -812,9 +824,22 @@ sync_threadlists (int pid)
 
 	  process_stratum_target *proc_target
 	    = current_inferior ()->process_target ();
-	  thread = add_thread_with_info (proc_target,
-					 ptid_t (pid, 0, pbuf[pi].pthid),
-					 priv);
+	  
+	  thread_info *tp = find_thread_ptid (proc_target, ptid_t (pid));
+
+	  /* If the pthread library is used then we replace the main
+	     with the thread having the main thread ID and process ID.
+	     Otherwise this is a new thread and we need to add it.  */
+	  if (tp != NULL && tp->priv == NULL)
+          {
+	    thread_change_ptid (proc_target, tp->ptid,
+				ptid_t (pid, 0, pbuf[pi].pthid));
+	    tp->priv.reset (priv);
+          }
+	  else	
+	    thread = add_thread_with_info (proc_target,
+					   ptid_t (pid, 0, pbuf[pi].pthid),
+					   priv);
 
 	  pi++;
 	}
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-11-21  8:27           ` Aditya Kamath1
@ 2022-11-23 14:15             ` Ulrich Weigand
  2022-11-23 16:03               ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2022-11-23 14:15 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>@@ -514,8 +514,16 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
>        during first initialisation.  In the rest of the callbacks the
>        current thread needs to be correct.  */
>     if (user_current_pid != 0)
>-      switch_to_thread (current_inferior ()->process_target (),
>-			ptid_t (user_current_pid));
>+    {
>+	inferior *inf = find_inferior_ptid (current_inferior ()-> process_target (),
>+					    ptid_t (user_current_pid));
This would be simpler using find_inferior_pid.
>+        for (thread_info *tp: inf->threads ()) 
>+	  if (tp != NULL)
This would be simpler using any_thread_of_inferior.
>+          {
>+            switch_to_thread (tp);
>+            break;
>+          }
>+    }
>     status = target_read_memory (addr, (gdb_byte *) buf, len);

However, switching to just any random thread of the process seems odd.

Looking at sol-thread.c, they don't switch to a thread at all
in the equivalent place, but rather do this:

  scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);

  if (inferior_ptid.tid_p () || !target_thread_alive (inferior_ptid))
    {
      /* It's either a thread or an LWP that isn't alive.  Any live
         LWP will do so use the first available.

         NOTE: We don't need to call switch_to_thread; we're just
         reading memory.  */
      inferior_ptid = procfs_first_available ();
    }

Since your xfer_partial routine only ever looks at the PID
component of the ptid, I'm wondering if we couldn't similarly
just switch inferior_ptid, without actually switching threads.
Something along the lines of

  scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
  if (user_current_pid != 0)
    inferior_ptid = ptid_t (user_current_pid);

Does this work for you?
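
For readers unfamiliar with the save-and-restore idiom used in the suggestion above, here is a minimal stand-alone sketch. It is an illustration only and not GDB's scoped_restore: the class value_restorer, the global current_id, and the function read_with_id are all hypothetical stand-ins.

```cpp
#include <cassert>

// Hypothetical stand-in for a global such as inferior_ptid.
static int current_id = 7;

// Minimal RAII guard: remembers a variable's value and puts it back
// when the guard goes out of scope.
template<typename T>
class value_restorer
{
public:
  explicit value_restorer (T &var) : m_var (var), m_saved (var) {}
  ~value_restorer () { m_var = m_saved; }
  value_restorer (const value_restorer &) = delete;
  value_restorer &operator= (const value_restorer &) = delete;

private:
  T &m_var;
  T m_saved;
};

// Temporarily override current_id for the duration of a "read" and
// return the ID that was in effect during it.  An OVERRIDE_ID of 0
// means "leave the current ID alone".
int
read_with_id (int override_id)
{
  assert (override_id >= 0);
  value_restorer<int> restore (current_id);
  if (override_id != 0)
    current_id = override_id;
  return current_id;   // the guard restores current_id on return
}
```

Because the restore happens in the destructor, the previous value comes back on every exit path, including early returns.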

>-      if (thrinf.ti_cursig == SIGTRAP)
>+      /* In a multi threaded application user can interrupt the main
>+	 thread as well. This function should return the tid in this
>+         case apart from threads that can trap or be interrupted.  */
Whitespace problem.
>+
>+      if (thrinf.ti_cursig == SIGTRAP || thrinf.ti_cursig == SIGINT)
> 	return thrinf.ti_tid;

This seems an unrelated change?  If this is actually necessary,
then all the comments (e.g. at the top of this function, or at
the call site) likewise need to be updated - they only refer to
trap signals currently.

> 	  process_stratum_target *proc_target
> 	    = current_inferior ()->process_target ();
>-	  thread = add_thread_with_info (proc_target,
>-					 ptid_t (pid, 0, pbuf[pi].pthid),
>-					 priv);
>+	  
>+	  thread_info *tp = find_thread_ptid (proc_target, ptid_t (pid));
>+
>+	  /* If the pthread library is used then we replace the main
>+	     with the thread having the main thread ID and process ID.
>+	     Otherwise this is a new thread and we need to add it.  */
>+	  if (tp != NULL && tp->priv == NULL)
>+          {
>+	    thread_change_ptid (proc_target, tp->ptid,
>+				ptid_t (pid, 0, pbuf[pi].pthid));
>+	    tp->priv.reset (priv);
>+          }
>+	  else	
>+	    thread = add_thread_with_info (proc_target,
>+					   ptid_t (pid, 0, pbuf[pi].pthid),
>+					   priv);

I'm confused why this is the correct place.  From what I can see,
in this scenario, we have:

- libpthdebug reports some threads using a thread ID, i.e. pbuf has
   ptid_t (pid, 0, pthid1)
    ..
   ptid_t (pid, 0, pthidN)
with pcount >= 1.

- GDB only has one single thread in unthreaded mode, i.e. gbuf has
   ptid_t (pid, 0, 0)
with gcount == 1.

So when running the loop, during the first iteration, we should compare
   ptid_cmp (ptid_t (pid, 0, pthid1), ptid_t (pid, 0, 0))
which should be > 0 since pthid1 > 0.  Right?

This means we'll get into the branch that just does:
              delete_thread (gbuf[gi]);
thereby deleting the original thread.  Does this not happen for you?
What is going on instead?

[ Note that this is a simplified case with only a single process; in the
multi-process scenario, this may be more complex. ]
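
The expected first iteration can be modelled with a small sketch. It assumes an ordering by pid and then by thread ID, covers a single process only, and uses hypothetical names (simple_ptid, simple_ptid_cmp, classify); it is not the ptid_cmp in aix-thread.c.

```cpp
#include <cassert>

// Simplified stand-in for a ptid: (pid, lwp, tid).
struct simple_ptid { int pid; long lwp; unsigned long tid; };

// Three-way compare, ordering by pid first and thread ID second.
int
simple_ptid_cmp (const simple_ptid &a, const simple_ptid &b)
{
  if (a.pid != b.pid)
    return a.pid < b.pid ? -1 : 1;
  if (a.tid != b.tid)
    return a.tid < b.tid ? -1 : 1;
  return 0;
}

enum class action { keep, delete_gdb_thread, add_new_thread };

// Decide what one loop iteration does for the library-side entry
// PPTID and the debugger-side entry GPTID.
action
classify (const simple_ptid &pptid, const simple_ptid &gptid)
{
  // This sketch only covers the single-process case.
  assert (pptid.pid == gptid.pid);

  int cmp = simple_ptid_cmp (pptid, gptid);
  if (cmp == 0)
    return action::keep;
  return cmp > 0 ? action::delete_gdb_thread : action::add_new_thread;
}
```

With pptid = (pid, 0, 1) and gptid = (pid, 0, 0) the compare is positive, which is the delete branch described above.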
 

Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-11-23 14:15             ` Ulrich Weigand
@ 2022-11-23 16:03               ` Aditya Kamath1
  2022-11-23 17:09                 ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2022-11-23 16:03 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya

[-- Attachment #1: Type: text/plain, Size: 6507 bytes --]

Hi Ulrich,

>I'm confused why this is the correct place.  From what I can see,
>in this scenario, we have:
>
>- libpthdebug reports some threads using a thread ID, i.e. pbuf has
>   ptid_t (pid, 0, pthid1)
>    ..
>   ptid_t (pid, 0, pthidN)
>with pcount >= 1.
>
>- GDB only has one single thread in unthreaded mode, i.e. gbuf has
>   ptid_t (pid, 0, 0)
>with gcount == 1.
>
>So when running the loop, during the first iteration, we should compare
>   ptid_cmp (ptid_t (pid, 0, pthid1), ptid_t (pid, 0, 0))
>which should be > 0 since pthid1 > 0.  Right?
>
>This means we'll get into the branch that just does:
>              delete_thread (gbuf[gi]);
>thereby deleting the original thread.  Does this not happen for you?
>What is going on instead?
>
>[ Note that this is a simplified case with only a single process; in the
>multi-process scenario, this may be more complex. ]



>What is going on instead?
The part I highlighted in bold does not happen. In our first iteration pcount is 1 but gcount is 0. That is why, with gi = 0, the check gi == gcount is true and control enters that block instead of the else block.

>[ Note that this is a simplified case with only a single process; in the
>multi-process scenario, this may be more complex. ]

I agree with this. I just checked now.

Hmm. So, something is going wrong here:

gcount = 0;

  iterate_over_threads (giter_count, &gcount);

  g = gbuf = XNEWVEC (struct thread_info *, gcount);

  iterate_over_threads (giter_accum, &g);

  qsort (gbuf, gcount, sizeof *gbuf, gcmp);

Let me check these lines. I hope this answers why I was doing it.

I will make the rest of the changes you mentioned.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-11-23 16:03               ` Aditya Kamath1
@ 2022-11-23 17:09                 ` Ulrich Weigand
  2022-11-23 18:45                   ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2022-11-23 17:09 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>Hmm. So, something is going wrong here..
>gcount = 0;
>  iterate_over_threads (giter_count, &gcount);
>  g = gbuf = XNEWVEC (struct thread_info *, gcount);
>  iterate_over_threads (giter_accum, &g);
>  qsort (gbuf, gcount, sizeof *gbuf, gcmp);

Looks like this is deliberate:

/* iterate_over_threads() callback for counting GDB threads.

   Do not count the main thread (whose tid is zero).  This matches
   the list of threads provided by the pthreaddebug library, which
   does not include that main thread either, and thus allows us
   to compare the two lists.  */

static int
giter_count (struct thread_info *thread, void *countp)
{
  if (PD_TID (thread->ptid))
    (*(int *) countp)++;
  return 0;
}

Maybe that comment is wrong about pthreaddebug not including
the main thread?   Or maybe that changed between AIX versions?

In any case, something needs to be fixed here.
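
To make the effect of that filter concrete, here is a small model. It ignores the pd_active part of PD_TID and uses hypothetical names (simple_ptid, count_if_threaded, count_threads), so treat it as a sketch of the counting behaviour only.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for a ptid: (pid, lwp, tid).
struct simple_ptid { int pid; long lwp; unsigned long tid; };

// Callback in the style of the quoted giter_count (): bump *COUNTP
// only for entries that carry a nonzero thread ID.
void
count_if_threaded (const simple_ptid &ptid, int *countp)
{
  assert (countp != nullptr);
  if (ptid.tid != 0)
    (*countp)++;
}

// Count the entries of LIST, optionally skipping the tid == 0 entry.
int
count_threads (const std::vector<simple_ptid> &list, bool skip_main)
{
  int count = 0;
  for (const simple_ptid &p : list)
    {
      if (skip_main)
        count_if_threaded (p, &count);
      else
        count++;
    }
  return count;
}
```

For a list holding only (pid, 0, 0), the filtered count is 0 while the unfiltered count is 1, which matches the gcount == 0 observation earlier in the thread.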

Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-11-23 17:09                 ` Ulrich Weigand
@ 2022-11-23 18:45                   ` Aditya Kamath1
  2022-11-29  8:18                     ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2022-11-23 18:45 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya

[-- Attachment #1: Type: text/plain, Size: 2775 bytes --]

Hi Ulrich,

>static int
>giter_count (struct thread_info *thread, void *countp)
>{
>  if (PD_TID (thread->ptid))
>    (*(int *) countp)++;
>  return 0;
>}
>Maybe that comment is wrong about pthreaddebug not including
>the main thread?   Or maybe that changed between AIX versions?

>In any case, something needs to be fixed here.

Even if we fix it here [assuming we are successful], delete_thread_1 () in thread.c will not find thread->deletable () to be true when we attempt delete_thread (gbuf[gi]), because refcount will not be 0 when we attempt to delete the main thread with ptid (pid, 0, 0). {See the function below.}


bool
thread_info::deletable () const
{
  /* If this is the current thread, or there's code out there that
     relies on it existing (refcount > 0) we can't delete yet.  */

  return refcount () == 0 && !is_current_thread (this);
}


Instead, I will try to replace the main thread using thread_change_ptid (proc_target, gptid, pptid), subject to the check gptid.tid () == 0. Otherwise, if it is not the main thread [gptid.tid () != 0], we can delete gbuf[gi]. We can apply this elsewhere in sync_threadlists () as well.

Let me know if there is a better alternative that can delete this thread, or why the solution in the paragraph above could fail.

I will handle the rest. No problem.
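
A rough model of the rule proposed above, for illustration only. The types simple_ptid and simple_thread and the function promote_or_delete are hypothetical; only the deletable () condition is taken from the function quoted in this mail.

```cpp
#include <cassert>

// Simplified stand-in for a ptid: (pid, lwp, tid).
struct simple_ptid { int pid; long lwp; unsigned long tid; };

// Simplified stand-in for a debugger-side thread entry.
struct simple_thread
{
  simple_ptid ptid;
  int refcount;
  bool is_current;

  // Same condition as the quoted thread_info::deletable ().
  bool deletable () const { return refcount == 0 && !is_current; }
};

enum class sync_action { change_ptid, delete_thread };

// Decide how to handle the stale entry GTHREAD when the thread
// library reports PPTID in its place.
sync_action
promote_or_delete (simple_thread &gthread, const simple_ptid &pptid)
{
  assert (gthread.ptid.pid == pptid.pid);

  if (gthread.ptid.tid == 0)
    {
      // Main process entry: rewrite its ptid in place instead of
      // deleting it, so it never has to pass deletable ().
      gthread.ptid = pptid;
      return sync_action::change_ptid;
    }

  // Any other entry is an ordinary thread and can be deleted.
  return sync_action::delete_thread;
}
```

The design point is that a current or referenced main entry fails deletable (), whereas changing its ptid does not depend on that check.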

Thanks and regards,
Aditya..




^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-11-23 18:45                   ` Aditya Kamath1
@ 2022-11-29  8:18                     ` Aditya Kamath1
  2022-11-30 14:57                       ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2022-11-29  8:18 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 14844 bytes --]

Hi all,

Please find attached the new patch. [Kindly see: 0001-Fix-multi-thread-bug-fix-in-AIX.patch].

This patch works for a single process with multiple threads and for multiple processes with detach-on-fork on. Kindly find programs 1 and 2 below, with their outputs labelled Output 1 and Output 2 respectively.

The issue:-

When we ran the unit test cases, we realised this patch fails in the multi process detach on fork on case. The issue is that when an interrupt is pressed while a thread is running [threads that have called fork () and are stuck in while (1);], GDB somehow switches the process, as highlighted in Output 3 below in this email. As a result, what should be shown as threads is shown as a process.

A detailed explanation of the solution proposed for a single process:-

In AIX we were adding an extra main thread, say 'Thread 1', when we already had a main process thread, say 'process 100'. This created a mess while syncing, and with the recent change in get_aix_thread_info (), which does not allow any thread without private data, it led to a crash.

So, the proposed patch [attached to this email] modifies sync_threadlists () to check whether a process has a tid. If it does, we know the program is multi-threaded, and we need to change the main process thread's ptid from 'process 100' to 'Thread 1', that is, from ptid (pid, 0, 0) to ptid (pid, 0, tid or pthread ID), as Ulrich mentioned earlier in this thread. The patch does this with thread_change_ptid ().

Even with this done, we still need to be in the right context whenever we read target memory. Hence, the proposed patch changes inferior_ptid in pdc_read_data () as suggested by Ulrich.
+      inferior_ptid = ptid_t (user_current_pid);

We were also not counting our main thread in the GDB thread list [gcount]. Now we do: the proposed patch removes this condition.
-  if (PD_TID (thread->ptid))

Before, we had an extra thread 'process 100' apart from 'Thread 1'. So if someone interrupted a thread with ctrl+c, it was fine for pd_update () to return ptid_t (pid) even when no thread was found to have received the signal. But now, with no 'process 100' and only 'Thread 1', we need to handle the interrupt as well; otherwise GDB core will treat ptid_t (100) as a new process. Hence the change:
+      if (thrinf.ti_cursig == SIGTRAP || thrinf.ti_cursig == SIGINT)

These changes solve the single process case.
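
The SIGTRAP-or-SIGINT check can be pictured with a small stand-alone sketch. The struct kthread_info and the function find_signaled_thread are hypothetical stand-ins for the kernel thread records and for get_signaled_thread (), and returning 0 for "no thread found" is an assumption of this sketch.

```cpp
#include <cassert>
#include <csignal>
#include <vector>

// Hypothetical per-kernel-thread record: thread ID and current signal.
struct kthread_info { long tid; int cursig; };

// Return the ID of the first thread stopped by a trap or an
// interrupt, or 0 if no such thread exists.
long
find_signaled_thread (const std::vector<kthread_info> &threads)
{
  for (const kthread_info &t : threads)
    {
      // 0 is reserved for "not found", so it cannot be a valid ID.
      assert (t.tid != 0);
      if (t.cursig == SIGTRAP || t.cursig == SIGINT)
        return t.tid;
    }
  return 0;
}
```

With only the SIGTRAP test, a list in which one thread has cursig == SIGINT would yield 0, and the caller would fall back to a pid-only ptid.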

A detailed explanation of the solution proposed for the multi-process, multi-thread case:-

Consider a multi-threaded main program in which each thread can fork (). After we record all events, there are two things we need to take care of in rs6000-aix-nat.c.

1:- There can be a child event whose parent is pthreaded, i.e. multi-threaded. The opposite case is fine, since there we can only return ptid_t (pid) anyway.

In this case, if we return ptid_t (pid), GDB core treats it as a new process: since the parent is pthreaded, GDB core only knows threads of the form ptid_t (pid, tid, pthid), not ptid_t (pid). To avoid this, the proposed patch calls find_the_return_ptid () to work out the right ptid. Changes like the one below are for this reason.

ourstatus->set_forked (ptid_t (pid));
-             return ptid_t (parent_pid);
+             return find_the_return_ptid (parent_pid);

Kindly note that control does not always come through aix-thread.c for such events. Hence, we cannot take care of this there, though it would be a relief if we could.
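
The idea behind the lookup can be sketched as follows. This is not the find_the_return_ptid () from the patch: the name pick_return_ptid, the simple_ptid type, and the choice of returning the first threaded entry are assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for a ptid: (pid, lwp, tid).
struct simple_ptid { int pid; long lwp; unsigned long tid; };

// Given the ptids the debugger already knows, pick what to report
// for process PID: a threaded ptid if the process has one, otherwise
// the pid-only form.
simple_ptid
pick_return_ptid (const std::vector<simple_ptid> &known, int pid)
{
  assert (pid > 0);
  for (const simple_ptid &p : known)
    if (p.pid == pid && p.tid != 0)
      return p;                      // parent is threaded
  return simple_ptid {pid, 0, 0};    // parent is not threaded
}
```

Reporting a ptid the core already knows avoids the event being taken for a brand-new process.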

That said, the proposed patch uses the lwp field to store the kernel thread ID so that we can fetch and use it here.

2:- There are cases where we need to pass ptid_t (pid, lwp, tid) to aix-thread.c instead of ptid_t (pid).


If we fix 1 this way, then the assertion below causes trouble when we go through aix-thread.c for beneath->wait. Hence the proposed patch removes it.

-  /* The target beneath does not deal with threads, so it should only return
-     pid-only ptids.  */
-  gdb_assert (ptid.is_pid ());


We do not know why Output 3 [in the mail below] switches to a process.

We are at one of the threads of process 1. On an interrupt, we need to switch to the thread that caused it. But GDB core says we switched process. It should not, as we are still in the same process. I also checked that we return the correct ptid for the thread that raised the interrupt, and we do.

I have tried for the last couple of days to debug what is going on in GDB core, but could not work it out.

Without understanding this, I will not be able to solve the problem for AIX and GDB. Hence, after this effort, I am reaching out to you all for a solution.

Since you all have vast experience and knowledge of GDB, kindly let me know what is going on in the Output 3 case [pasted below in this email], or whether anything in the patch is done the wrong way. If so, let me know how we can do it correctly and efficiently from here.

I hope to hear back. Have a nice day ahead.

Thanks and regards,
Aditya



----------------------------------------------------------------
Program1:- [Credits gdb.threads/continue-pending-status.c]


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 3

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */
  pthread_barrier_wait (&barrier);

  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    sleep (1);

  return 0;
}



----------------------------------------------------------------
Output1:- Single process


Reading symbols from /home/aditya/gdb_tests/continue-pending-status...
(gdb) r
Starting program: /home/aditya/gdb_tests/continue-pending-status
^C[New Thread 258]
[New Thread 515]
[New Thread 772]

Thread 3 received signal SIGINT, Interrupt.
[Switching to Thread 515]
thread_function (arg=0x0)
    at /home/aditya/gdb_tests/continue-pending-status.c:36
36        while (1); /* break here */
(gdb) info threads
  Id   Target Id                          Frame
  1    Thread 1 (tid 24838585, running)   warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

  2    Thread 258 (tid 23134635, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at /home/aditya/gdb_tests/continue-pending-status.c:36
* 3    Thread 515 (tid 30146867, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at /home/aditya/gdb_tests/continue-pending-status.c:36
  4    Thread 772 (tid 27853165, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at /home/aditya/gdb_tests/continue-pending-status.c:36

---------------------------------------------------------------------------------

Program 2:- Multi process Code


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else{
    printf (" Iam child \n");
    child = fork ();
    if (child > 0)
      printf ("From child I became a parent \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
    break;
  }

  return 0;
}


-------------------------------------------------------------------------

Output 2:- With detach-on-fork on


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 258]

[New Thread 515]

[Detaching after fork from child process 8323572]

 Iam child

I am grandchild

From child I became a parent

I am parent

[Detaching after fork from child process 11665884]

 Iam child

I am grandchild

From child I became a parent

I am parent

^C

Thread 2 received signal SIGINT, Interrupt.

[Switching to Thread 258]

thread_function (arg=0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

32        while (1); /* break here */

(gdb) info threads

  Id   Target Id                          Frame

  1    Thread 1 (tid 27263269, running)   warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


* 2    Thread 258 (tid 28705075, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

  3    Thread 515 (tid 27853169, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32


-------------------------------------------------------------------------

Output 3:- With detach-on-fork off


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) set detach-on-fork off

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 258]

[New Thread 515]

[New inferior 2 (process 8323570)]

I am parent

[New inferior 3 (process 17957172)]

I am parent

^C

Thread 1.1 received signal SIGINT, Interrupt.

[Switching to process 16777620]

0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) info threads

  Id   Target Id         Frame

* 1.1  process 16777620  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.2  process 16777620  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.3  process 16777620  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  2.1  process 8323570   0xd0594fc8 in ?? ()

  3.1  process 17957172  0xd0594fc8 in ?? ()


________________________________
From: Gdb-patches <gdb-patches-bounces+aditya.kamath1=ibm.com@sourceware.org> on behalf of Aditya Kamath1 via Gdb-patches <gdb-patches@sourceware.org>
Sent: 24 November 2022 00:15
To: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>; simark@simark.ca <simark@simark.ca>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: [EXTERNAL] Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Hi Ulrich,

>static int
>giter_count (struct thread_info *thread, void *countp)
>{
>  if (PD_TID (thread->ptid))
>    (*(int *) countp)++;
>  return 0;
>}
>Maybe that comment is wrong about pthreaddebug not including
>the main thread?   Or maybe that changed between AIX versions?

>In any case, something needs to be fixed here.

Even if we fix it here [assuming we are successful], delete_thread_1 () in thread.c will not find thread->deletable () to be true when we attempt delete_thread (gbuf[gi]), because refcount will not be 0 when we try to delete the main thread with ptid (pid, 0, 0). {See the function below.}


bool
thread_info::deletable () const
{
  /* If this is the current thread, or there's code out there that
     relies on it existing (refcount > 0) we can't delete yet.  */
  return refcount () == 0 && !is_current_thread (this);
}


I will try to replace the main thread's ptid instead, using thread_change_ptid (proc_target, gptid, pptid), subject to the condition gptid.tid () == 0. Otherwise, if it is not the main thread [gptid.tid () != 0], we can delete gbuf[gi]. We can apply this elsewhere in sync_threadlists () as well.

Let me know if there is a better alternative that can delete this thread, or why the solution in the above paragraph could fail.
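
To make the proposal concrete, here is a minimal sketch of the "rename the main thread, delete anything else" decision. It is only a model: toy_ptid, sync_action, resolve_unmatched and apply_unmatched are hypothetical names invented for illustration, not GDB code, and the real logic would go through thread_change_ptid () and delete_thread () inside sync_threadlists ().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for GDB's ptid_t (pid, lwp, tid).  Hypothetical, for
// illustration only.
struct toy_ptid
{
  int pid;
  long lwp;
  unsigned long tid;

  // True for a pid-only entry, i.e. the main thread before it is
  // known to the pthread library.
  bool is_pid () const { return lwp == 0 && tid == 0; }

  bool operator== (const toy_ptid &o) const
  { return pid == o.pid && lwp == o.lwp && tid == o.tid; }
};

enum class sync_action { rename_in_place, delete_entry };

// Decide what to do with a GDB-side entry that has no match in the
// pthread library's list.
static sync_action
resolve_unmatched (const toy_ptid &gptid)
{
  return gptid.is_pid () ? sync_action::rename_in_place
			 : sync_action::delete_entry;
}

// Apply the decision to a toy thread list.  PPTID is the full ptid
// that replaces a pid-only entry.
static void
apply_unmatched (std::vector<toy_ptid> &threads, std::size_t gi,
		 const toy_ptid &pptid)
{
  if (resolve_unmatched (threads[gi]) == sync_action::rename_in_place)
    {
      // The rename must never move the entry to a different process.
      assert (threads[gi].pid == pptid.pid);
      threads[gi] = pptid;
    }
  else
    threads.erase (threads.begin () + gi);
}
```

With a list holding (100, 0, 0) and (100, 5, 258), calling apply_unmatched on index 0 rewrites the first entry in place, while calling it on index 1 removes that entry. The entry for the main thread is never deleted, so the refcount check in deletable () does not come into play for it.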

Rest of the things I will handle. No problem.

Thanks and regards,
Aditya..


________________________________
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Sent: 23 November 2022 22:39
To: simark@simark.ca <simark@simark.ca>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>Hmm. So, something is going wrong here..
>gcount = 0;
>  iterate_over_threads (giter_count, &gcount);
>  g = gbuf = XNEWVEC (struct thread_info *, gcount);
>  iterate_over_threads (giter_accum, &g);
>  qsort (gbuf, gcount, sizeof *gbuf, gcmp);

Looks like this is deliberate:

/* iterate_over_threads() callback for counting GDB threads.

   Do not count the main thread (whose tid is zero).  This matches
   the list of threads provided by the pthreaddebug library, which
   does not include that main thread either, and thus allows us
   to compare the two lists.  */

static int
giter_count (struct thread_info *thread, void *countp)
{
  if (PD_TID (thread->ptid))
    (*(int *) countp)++;
  return 0;
}

Maybe that comment is wrong about pthreaddebug not including
the main thread?   Or maybe that changed between AIX versions?

In any case, something needs to be fixed here.

Bye,
Ulrich


[-- Attachment #2: 0001-Fix-multi-thread-bug-fix-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 7814 bytes --]

From 5521677adbd492472de047a9d31597addb9502f4 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Tue, 29 Nov 2022 00:20:19 -0600
Subject: [PATCH] Fix multi thread bug in AIX

---
 gdb/aix-thread.c     | 66 +++++++++++++++++++++-----------------------
 gdb/rs6000-aix-nat.c | 48 ++++++++++++++++++++++++++++++--
 2 files changed, 77 insertions(+), 37 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..6783aa4999f 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -508,14 +508,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
     if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+      inferior_ptid = ptid_t (user_current_pid);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -639,36 +638,24 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
+/* iterate_over_threads() callback for counting GDB threads.  */
 
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
 
 static int
 giter_count (struct thread_info *thread, void *countp)
 {
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
+  (*(int *) countp)++;
   return 0;
 }
 
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* iterate_over_threads() callback for accumulating GDB thread pids.  */
 
 static int
 giter_accum (struct thread_info *thread, void *bufp)
 {
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
+  **(struct thread_info ***) bufp = thread;
+  (*(struct thread_info ***) bufp)++;
+    
   return 0;
 }
 
@@ -704,8 +691,8 @@ gcmp (const void *t1v, const void *t2v)
 }
 
 /* Search through the list of all kernel threads for the thread
-   that has stopped on a SIGTRAP signal, and return its TID.
-   Return 0 if none found.  */
+   that has stopped on a SIGTRAP or SIGINT signal, and return
+   its TID.  Return 0 if none found.  */
 
 static pthdb_tid_t
 get_signaled_thread (int pid)
@@ -719,7 +706,7 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      if (thrinf.ti_cursig == SIGTRAP || thrinf.ti_cursig == SIGINT)
 	return thrinf.ti_tid;
     }
 
@@ -750,6 +737,9 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+	= current_inferior ()->process_target ();
+  thread_info *tp;
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -810,10 +800,8 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
-					 ptid_t (pid, 0, pbuf[pi].pthid),
+					 ptid_t (pid, pbuf[pi].tid, pbuf[pi].pthid),
 					 priv);
 
 	  pi++;
@@ -823,7 +811,7 @@ sync_threadlists (int pid)
 	  ptid_t pptid, gptid;
 	  int cmp_result;
 
-	  pptid = ptid_t (pid, 0, pbuf[pi].pthid);
+	  pptid = ptid_t (pid, pbuf[pi].tid, pbuf[pi].pthid);
 	  gptid = gbuf[gi]->ptid;
 	  pdtid = pbuf[pi].pdtid;
 	  tid = pbuf[pi].tid;
@@ -841,8 +829,22 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      if (gptid.is_pid ())
+		{
+		  thread_change_ptid (proc_target, gptid, pptid);
+		  aix_thread_info *priv = new aix_thread_info;
+                  priv->pdtid = pbuf[pi].pdtid;
+                  priv->tid = pbuf[pi].tid;
+		  tp = find_thread_ptid (proc_target, pptid);
+		  tp->priv.reset (priv);
+		  pi++;
+		  gi++;
+		}
+	      else
+		{
+		  delete_thread (gbuf[gi]);
+		  gi++;
+		}
 	    }
 	  else
 	    {
@@ -1091,10 +1093,6 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
   if (ptid.pid () == -1)
     return ptid_t (-1);
 
-  /* The target beneath does not deal with threads, so it should only return
-     pid-only ptids.  */
-  gdb_assert (ptid.is_pid ());
-
   /* Check whether libpthdebug might be ready to be initialized.  */
   if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
diff --git a/gdb/rs6000-aix-nat.c b/gdb/rs6000-aix-nat.c
index 2ac1f6e70b6..db45a95952a 100644
--- a/gdb/rs6000-aix-nat.c
+++ b/gdb/rs6000-aix-nat.c
@@ -619,6 +619,48 @@ rs6000_nat_target::xfer_partial (enum target_object object,
     }
 }
 
+/* Search through the list of all kernel threads for the thread
+   that has stopped on a SIGTRAP or SIGINT signal, and return
+   its TID.  Return 0 if none found.  */
+
+static tid_t
+get_signaled_thread_rs6000 (int pid)
+{
+  struct thrdsinfo64 thrinf;
+  tid_t ktid = 0;
+
+  while (1)
+    {
+      if (getthrds (pid, &thrinf,
+                    sizeof (thrinf), &ktid, 1) != 1)
+        break;
+
+      if (thrinf.ti_cursig == SIGTRAP || thrinf.ti_cursig == SIGINT)
+        return thrinf.ti_tid;
+    }
+
+  /* Didn't find any thread stopped on a SIGTRAP signal.  */
+  return 0;
+}
+
+/* If my process is pthreaded I need to return that ptid else ptid_t
+   (pid).  */
+
+static ptid_t
+find_the_return_ptid (pid_t pid)
+{
+  ptid_t ptid = ptid_t (pid);
+  process_stratum_target *proc_target
+        = current_inferior ()->process_target ();
+  inferior *inf = find_inferior_pid (proc_target, pid);
+  thread_info *tp = find_thread_ptid (inf, ptid_t (pid));
+  if (tp == nullptr)
+    for (thread_info *tp1 : inf->threads ())
+       if (tp1->ptid.lwp () == get_signaled_thread_rs6000 (pid))
+         return tp1->ptid;
+  return ptid;
+}
+
 /* Wait for the child specified by PTID to do something.  Return the
    process ID of the child, or MINUS_ONE_PTID in case of error; store
    the status in *OURSTATUS.  */
@@ -672,7 +714,7 @@ rs6000_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
 	      if (parent_pid > 0)
 		{
 		  ourstatus->set_forked (ptid_t (pid));
-		  return ptid_t (parent_pid);
+		  return find_the_return_ptid (parent_pid);
 		}
 	      aix_remember_child (pid);
 	    }
@@ -687,7 +729,7 @@ rs6000_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
 	      if (child_pid > 0)
 		{
 		  ourstatus->set_forked (ptid_t (child_pid));
-		  return ptid_t (pid);
+		  return find_the_return_ptid (pid);
 		}
 	      aix_remember_parent (pid);
 	    }
@@ -712,7 +754,7 @@ rs6000_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
   else
     *ourstatus = host_status_to_waitstatus (status);
 
-  return ptid_t (pid);
+  return find_the_return_ptid (pid);
 }
 \f
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-11-29  8:18                     ` Aditya Kamath1
@ 2022-11-30 14:57                       ` Ulrich Weigand
  2022-12-02  7:50                         ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2022-11-30 14:57 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>Before we had an extra thread 'process 100' apart from 'thread 1'.
>So, in case someone interrupted a thread with ctrl+c.. In the
>pd_update () even if we don't have thread who signalled this interrupt
>when we return ptid_t (pid) it was fine. But now with no 'process 100'
>and only 'thread 1', we need to take care of interrupt as well,
>otherwise GDB core will take ptid_t(100) as a new process.
>So the change
>+      if (thrinf.ti_cursig == SIGTRAP || thrinf.ti_cursig == SIGINT)

Hmm.  So when "wait" returns, it needs to determine which thread
triggered the event that caused ptrace to stop.  On Linux, "wait"
will actually return the LWP of that thread, so it can be directly
used.  It seems on AIX, "wait" only returns a PID, and you do not
immediately know which thread caused the event?

In that case, I can see why you'd have to consider SIGINT as well
as SIGTRAP. However, it seems to me that even those two are not the
*only* cases that can cause "wait" to return - doesn't *any* signal
(potentially) trigger a ptrace intercept (causing wait to return)?

But that's probably a more general problem, and wouldn't occur in
this simple test case.
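
As a rough illustration of the more general case, the sketch below picks the event thread by looking for any nonzero current signal rather than only SIGTRAP or SIGINT. It is a toy model: toy_thrinfo and find_signaled_tid are hypothetical names, the records would really come from getthrds (), and it assumes ti_cursig is nonzero only for the thread that took the signal, which I have not verified on AIX.

```cpp
#include <cassert>
#include <csignal>
#include <vector>

// Hypothetical, simplified view of one kernel-thread record.  On AIX
// the real data would be a struct thrdsinfo64 filled in by getthrds ().
struct toy_thrinfo
{
  long ti_tid;    // kernel thread id
  int ti_cursig;  // current signal, 0 if none (assumption)
};

// Return the kernel tid of the first thread that has any current
// signal, or 0 if no thread has one.
static long
find_signaled_tid (const std::vector<toy_thrinfo> &threads)
{
  for (const toy_thrinfo &t : threads)
    {
      // 0 is reserved as the "not found" result.
      assert (t.ti_tid != 0);
      if (t.ti_cursig != 0)
	return t.ti_tid;
    }
  return 0;
}
```

A list where only tid 22 carries SIGSEGV yields 22, and a list with no current signals yields 0, in which case the caller would still have to fall back to a pid-only ptid.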

>In this case if we return a ptid_t (pid) then gdb core treats it
>as a new process since our parent thread is pthreaded GDB core
>only knows threads like ptid_t (pid, tid, pthid) and not
>ptid_t (pid). In order to avoid this, the proposed patch uses a
>function call find_the_return_ptid () to figure out the same.
>The changes like below are for this reason. 
>
>ourstatus->set_forked (ptid_t (pid));
>-             return ptid_t (parent_pid);
>+             return find_the_return_ptid (parent_pid);
>
>Kindly note that control will not always come from aix-thread.c
>for such events. Hence, we cannot take care of the same there,
>though it will be a relief if we can do that. 

I'm not sure why it is necessary to handle this in the process layer
(rs6000-aix-nat.c) instead of the thread layer (aix-thread.c).
What specifically breaks if you do not have these rs6000-aix-nat.c
changes?

If you *do* need to handle LWPs (kernel thread IDs) in the process
layer (this can be a reasonable choice, and it is done by several other
native targets), then it should be *consistent*, and *all* LWP handling
should be in the process layer. In particular, under no circumstances
does it make sense to duplicate the "find current/signalled thread"
code in *both* the process and thread layers.

>[Switching to process 16777620]

This outputs inferior_ptid ...

>0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
>(gdb) info threads
>  Id   Target Id         Frame 
>* 1.1  process 16777620  0xd0595fb0 in _p_nsleep ()
>   from /usr/lib/libpthread.a(shr_xpg5.o)
>  1.2  process 16777620  0xd0595fb0 in _p_nsleep ()
>   from /usr/lib/libpthread.a(shr_xpg5.o)
>  1.3  process 16777620  0xd0595fb0 in _p_nsleep ()
>   from /usr/lib/libpthread.a(shr_xpg5.o)
>  2.1  process 8323570   0xd0594fc8 in ?? ()
>  3.1  process 17957172  0xd0594fc8 in ?? ()

... and this outputs the ptid values for those threads.

If it says "process ...", then those ptid values have not
properly been switched over to the (pid, lwp, tid) format.

You should verify that the sync_threadlists code handles
all multi-process cases correctly.  I haven't looked at
this in detail, but are you sure that here:

@@ -841,8 +829,22 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      if (gptid.is_pid ())
+		{
+		  thread_change_ptid (proc_target, gptid, pptid);

you never accidentally switch the *pid* part (if "gptid"
belongs to a different pid than "pptid")?

Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-11-30 14:57                       ` Ulrich Weigand
@ 2022-12-02  7:50                         ` Aditya Kamath1
  2022-12-05 18:33                           ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2022-12-02  7:50 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 18159 bytes --]

Hi Ulrich and community,

Please find attached the new patch [See:- 0001-Fix-multi-thread-debug-bug-in-AIX.patch ]

The proposed patch attached to this email now works fine for single-process and multi-process code, with detach-on-fork both on and off. Kindly see program 1 and output 1 for the single-process case, and program 2 and outputs 2, 3 and 4 for the multi-process case [they show that if a process is a thread, it is reported as a thread. It is highlighted in bold]. The unit test cases run well. Output 4 follows the child; it is attached to be sure things work in all cases. The proposed patch has a pid_to_str () function that helps us. When, in a multi-process case, a switch between processes happens due to any event, control comes from thread_target_id_str () in thread.c to our rs6000-aix-nat.c for the thread or process information.
This part is now solved.

There are some optimisations for which I need your advice since you are the expert.

So here are my reasons..

>>ourstatus->set_forked (ptid_t (pid));
>>-             return ptid_t (parent_pid);
>>+             return find_the_return_ptid (parent_pid);
>>
>>Kindly note that control will not always come from aix-thread.c
>>for such events. Hence, we cannot take care of the same there,
>>though it will be a relief if we can do that.

>I'm not sure why it is necessary to handle this in the process layer
>(rs6000-aix-nat.c) instead of the thread layer (aix-thread.c).
>What specifically breaks if you do not have these rs6000-aix-nat.c
>changes?

So, if you observe output 3 or 4, the program first becomes multi-threaded, I mean thread events are handled first, and then the threads fork. When this happens, I cannot return ptid_t (parent_pid). If I do so, the GDB core will treat it as a new process and add it to my thread list as, say, 'process 100', despite the existence of 'thread 1' representing the same thing. So I need to report correctly which thread did the fork (), that is, which thread of the multi-threaded process gave birth to the new inferior process [say 2 or 3 in output 3 below]. This has to be handled here, as control from target.c comes directly to rs6000_nat_target::wait and not through aix_thread_target::wait, since fork () is a process event.
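
A minimal sketch of the lookup I mean, modelled on find_the_return_ptid () in the patch. The types and names here (toy_event_ptid, pick_return_ptid) are hypothetical stand-ins, and the signalled kernel thread id is passed in as a plain argument instead of being read with getthrds ().

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for GDB's ptid_t (pid, lwp, tid); hypothetical.
struct toy_event_ptid
{
  int pid;
  long lwp;
  unsigned long tid;
};

// Map the bare PID reported by wait to the ptid that should be
// returned to the GDB core.  KNOWN is the list of threads GDB already
// has; SIGNALED_LWP is the kernel thread that took the event.
static toy_event_ptid
pick_return_ptid (const std::vector<toy_event_ptid> &known, int pid,
		  long signaled_lwp)
{
  assert (pid > 0);
  toy_event_ptid plain {pid, 0, 0};

  // If the process is still known as pid-only, keep reporting it so.
  for (const toy_event_ptid &p : known)
    if (p.pid == pid && p.lwp == 0 && p.tid == 0)
      return plain;

  // Otherwise the process is pthreaded: report the matching thread.
  for (const toy_event_ptid &p : known)
    if (p.pid == pid && p.lwp == signaled_lwp)
      return p;

  // No match found: fall back to the pid-only ptid.
  return plain;
}
```

For a process 100 known as (100, 7, 1) and (100, 8, 258), an event on kernel thread 8 is reported as (100, 8, 258) and not as a new 'process 100'.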

>If you *do* need to handle LWPs (kernel thread IDs) in the process
>layer (this can be a reasonable choice, and it is done by several other
>native targets), then it should be *consistent*, and *all* LWP handling
>should be in the process layer. In particular, under no circumstances
>does it make sense to duplicate the "find current/signalled thread"
>code in *both* the process and thread layers.

This is not straightforward to do. The reason is that, if our application is pthreaded, we need our sync_threadlists () code to detect the multiple threads and sync them. We cannot handle this in rs6000-aix-nat.c with the current design of the code. If the child process is also multi-threaded, things can get complex. It would require us to move the whole GDB list and pthread list sync code into rs6000-aix-nat.c. The essence of aix-thread.c, its USP [Unique Selling Proposition], would be lost.

Let me know what you think of this, and I will modify this patch as you suggest. This is where I wanted to reach out to all of you in the community, as I think a code design change might be needed.


>>[Switching to process 16777620]

>This outputs inferior_ptid ...

Yes, you were right

>>* 1.1  process 16777620  0xd0595fb0 in _p_nsleep ()
>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>  1.2  process 16777620  0xd0595fb0 in _p_nsleep ()
>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>  1.3  process 16777620  0xd0595fb0 in _p_nsleep ()
>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>  2.1  process 8323570   0xd0594fc8 in ?? ()
>>  3.1  process 17957172  0xd0594fc8 in ?? ()

>... and this outputs the ptid values for those threads.

>If it says "process ...", then those ptid values have not
>properly been switched over to the (pid, lwp, tid) format.

>You should verify that the sync_threadlists code handles
>all multi-process cases correctly.  I haven't looked at
>this in detail, but are you sure that here:

>>@@ -841,8 +829,22 @@ sync_threadlists (int pid)
 >>            }
 >>          else if (cmp_result > 0)
 >>            {
>>-             delete_thread (gbuf[gi]);


>you never accidentally switch the *pid* part (if "gptid"
>belongs to a different pid than "pptid")?

So, this is not the reason. I have added an assertion here just to be sure. I get what you are thinking. While debugging in depth over the last two days I realised that our pid_to_str is needed in rs6000-aix-nat.c, as control comes here in search of it. Without it, GDB treats all threads as processes. It is in the patch. Kindly see it and the outputs [3 and 4 below in this email] as well.
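
Roughly, the naming rule behind outputs 3 and 4 can be sketched like this. This is a toy model and not the function from the patch: toy_pid_to_str is a hypothetical name, and the real pid_to_str () takes a ptid_t.

```cpp
#include <cassert>
#include <string>

// Toy version of the naming rule: a pid-only ptid is shown as a
// process, anything carrying thread information is shown as a thread.
// Hypothetical helper, for illustration only.
static std::string
toy_pid_to_str (int pid, long lwp, unsigned long tid)
{
  assert (pid > 0);
  if (lwp == 0 && tid == 0)
    return "Process " + std::to_string (pid);
  return "Thread " + std::to_string (tid);
}
```

So (15466928, 0, 0) is printed as "Process 15466928", while a ptid with pthread id 258 is printed as "Thread 258".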

>Hmm.  So when "wait" returns, it needs to determine which thread
>triggered the event that caused ptrace to stop.  On Linux, "wait"
>will actually return the LWP of that thread, so it can be directly
>used.  It seems on AIX, "wait" only returns a PID, and you do not
>immediately know which thread caused the event?

>In that case, I can see why you'd have to consider SIGINT as well
>as SIGTRAP. However, it seems to me that even those two are not the
>*only* cases that can cause "wait" to return - doesn't *any* signal
>(potentially) trigger a ptrace intercept (causing wait to return)?

>But that's probably a more general problem, and wouldn't occur in
>this simple test case.

Exactly. So I tried debugging a few examples that raise some of the other signals mentioned in this document [https://www.ibm.com/docs/en/sdk-java-technology/8?topic=reference-signal-handling]. AIX has most of the signals mentioned in the link. They do not block us from doing things, and nothing crashes in case of a segmentation fault signal [from our debugger code]. Abort also works fine. Let me know what you think.



----------------------------------------------------------------
Program 1:- [Credits: gdb.threads/continue-pending-status.c]


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 3

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */
  pthread_barrier_wait (&barrier);

  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    sleep (1);

  return 0;
}



----------------------------------------------------------------
Output1:- Single process


Reading symbols from /home/aditya/gdb_tests/continue-pending-status...

(gdb) r

Starting program: /home/aditya/gdb_tests/continue-pending-status

^C[New Thread 258]

[New Thread 515]

[New Thread 772]


Thread 3 received signal SIGINT, Interrupt.

[Switching to Thread 515]

thread_function (arg=0x0)

    at /home/aditya/gdb_tests/continue-pending-status.c:36

36        while (1); /* break here */

(gdb) info threads

  Id   Target Id                          Frame

  1    Thread 1 (tid 24838585, running)   warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


  2    Thread 258 (tid 23134635, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/continue-pending-status.c:36

* 3    Thread 515 (tid 30146867, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/continue-pending-status.c:36

  4    Thread 772 (tid 27853165, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/continue-pending-status.c:36

---------------------------------------------------------------------------------

Program 2:- Multi process Code


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else{
    printf (" Iam child \n");
    child = fork ();
    if (child > 0)
      printf ("From child I became a parent \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
    break;
  }

  return 0;
}


-------------------------------------------------------------------------

Output 2:- With detach-on-fork on


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 258]

[New Thread 515]

[Detaching after fork from child process 8323572]

 Iam child

I am grandchild

From child I became a parent

I am parent

[Detaching after fork from child process 11665884]

 Iam child

I am grandchild

From child I became a parent

I am parent

^C

Thread 2 received signal SIGINT, Interrupt.

[Switching to Thread 258]

thread_function (arg=0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

32        while (1); /* break here */

(gdb) info threads

  Id   Target Id                          Frame

  1    Thread 1 (tid 27263269, running)   warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


* 2    Thread 258 (tid 28705075, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

  3    Thread 515 (tid 27853169, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32


-------------------------------------------------------------------------

Output 3:- With detach-on-fork off


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) set detach-on-fork off

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 258]

[New Thread 515]

[New inferior 2 (Process 15466928)]

[New inferior 3 (Process 13894048)]

I am parent

I am parent

^C

Thread 1.1 received signal SIGINT, Interrupt.

[Switching to Thread 1]

0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) info threads

  Id   Target Id         Frame

* 1.1  Thread 1          0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.2  Thread 258        0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.3  Thread 515        0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  2.1  Process 15466928  0xd0594fc8 in ?? ()

  3.1  Process 13894048  0xd0594fc8 in ?? ()

--------------------------------------------------

Output 4:- detach fork off and following child


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) set follow-fork-mode child
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New Thread 515]
[Attaching after Thread 515 fork to child Process 13894050]
[New inferior 2 (Process 13894050)]
 Iam child
[Attaching after Process 13894050 fork to child Process 11010474]
[New inferior 3 (Process 11010474)]
I am grandchild
^CReading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

Thread 3.1 received signal SIGINT, Interrupt.
[Switching to Process 11010474]
thread_function (arg=0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
32        while (1); /* break here */
(gdb) info threads
  Id   Target Id         Frame
  1.1  Thread 1          0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.2  Thread 258        0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.3  Thread 515        0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  2.1  Process 13894050  0xd0594fc8 in ?? ()
* 3.1  Process 11010474  thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32


________________________________
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Sent: 30 November 2022 20:27
To: simark@simark.ca <simark@simark.ca>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>Before, we had an extra thread 'process 100' apart from 'thread 1'.
>So, if someone interrupted a thread with ctrl+c, then in pd_update (),
>even if we did not find the thread that signalled this interrupt,
>returning ptid_t (pid) was fine. But now, with no 'process 100'
>and only 'thread 1', we need to take care of the interrupt as well;
>otherwise GDB core will take ptid_t (100) as a new process.
>So the change
>+      if (thrinf.ti_cursig == SIGTRAP || thrinf.ti_cursig == SIGINT)

Hmm.  So when "wait" returns, it needs to determine which thread
triggered the event that caused ptrace to stop.  On Linux, "wait"
will actually return the LWP of that thread, so it can be directly
used.  It seems on AIX, "wait" only returns a PID, and you do not
immediately know which thread caused the event?

In that case, I can see why you'd have to consider SIGINT as well
as SIGTRAP. However, it seems to me that even those two are not the
*only* cases that can cause "wait" to return - doesn't *any* signal
(potentially) trigger a ptrace intercept (causing wait to return)?

But that's probably a more general problem, and wouldn't occur in
this simple test case.

>In this case, if we return a ptid_t (pid), then GDB core treats it
>as a new process: since our parent is pthreaded, GDB core
>only knows threads like ptid_t (pid, tid, pthid) and not
>ptid_t (pid). To avoid this, the proposed patch uses a
>function call find_the_return_ptid () to figure out the right ptid.
>The changes like the one below are for this reason.
>
>ourstatus->set_forked (ptid_t (pid));
>-             return ptid_t (parent_pid);
>+             return find_the_return_ptid (parent_pid);
>
>Kindly note that control will not always come from aix-thread.c
>for such events. Hence, we cannot take care of the same there,
>though it will be a relief if we can do that.

I'm not sure why it is necessary to handle this in the process layer
(rs6000-aix-nat.c) instead of the thread layer (aix-thread.c).
What specifically breaks if you do not have these rs6000-aix-nat.c
changes?

If you *do* need to handle LWPs (kernel thread IDs) in the process
layer (this can be a reasonable choice, and it is done by several other
native targets), then it should be *consistent*, and *all* LWP handling
should be in the process layer. In particular, under no circumstances
does it make sense to duplicate the "find current/signalled thread"
code in *both* the process and thread layers.

>[Switching to process 16777620]

This outputs inferior_ptid ...

>0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
>(gdb) info threads
>  Id   Target Id         Frame
>* 1.1  process 16777620  0xd0595fb0 in _p_nsleep ()
>   from /usr/lib/libpthread.a(shr_xpg5.o)
>  1.2  process 16777620  0xd0595fb0 in _p_nsleep ()
>   from /usr/lib/libpthread.a(shr_xpg5.o)
>  1.3  process 16777620  0xd0595fb0 in _p_nsleep ()
>   from /usr/lib/libpthread.a(shr_xpg5.o)
>  2.1  process 8323570   0xd0594fc8 in ?? ()
>  3.1  process 17957172  0xd0594fc8 in ?? ()

... and this outputs the ptid values for those threads.

If it says "process ...", then those ptid values have not
properly been switched over to the (pid, lwp, tid) format.

You should verify that the sync_threadlists code handles
all multi-process cases correctly.  I haven't looked at
this in detail, but are you sure that here:

@@ -841,8 +829,22 @@ sync_threadlists (int pid)
             }
           else if (cmp_result > 0)
             {
-             delete_thread (gbuf[gi]);
-             gi++;
+             if (gptid.is_pid ())
+               {
+                 thread_change_ptid (proc_target, gptid, pptid);

you never accidentally switch the *pid* part (if "gptid"
belongs to a different pid than "pptid")?

Bye,
Ulrich


[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 8565 bytes --]

From f2ae16f2f392919811400ac1f1c83a1e76ec02df Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Fri, 2 Dec 2022 00:59:55 -0600
Subject: [PATCH] Fix multi-thread debug bug in AIX

---
 gdb/aix-thread.c     | 65 +++++++++++++++++++++----------------------
 gdb/rs6000-aix-nat.c | 66 ++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 95 insertions(+), 36 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..45e0d9c7ae8 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -508,14 +508,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
     if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+      inferior_ptid = ptid_t (user_current_pid);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -639,36 +638,24 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
+/* iterate_over_threads() callback for counting GDB threads.  */
 
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
 
 static int
 giter_count (struct thread_info *thread, void *countp)
 {
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
+  (*(int *) countp)++;
   return 0;
 }
 
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* iterate_over_threads() callback for accumulating GDB thread pids.  */
 
 static int
 giter_accum (struct thread_info *thread, void *bufp)
 {
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
+  **(struct thread_info ***) bufp = thread;
+  (*(struct thread_info ***) bufp)++;
+    
   return 0;
 }
 
@@ -704,7 +691,7 @@ gcmp (const void *t1v, const void *t2v)
 }
 
 /* Search through the list of all kernel threads for the thread
-   that has stopped on a SIGTRAP signal, and return its TID.
+   that has stopped on a SIGTRAP or SIGINT signal, and return its TID.
    Return 0 if none found.  */
 
 static pthdb_tid_t
@@ -719,7 +706,7 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      if (thrinf.ti_cursig == SIGTRAP || thrinf.ti_cursig == SIGINT)
 	return thrinf.ti_tid;
     }
 
@@ -750,6 +737,9 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+	= current_inferior ()->process_target ();
+  thread_info *tp;
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -810,10 +800,8 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
-					 ptid_t (pid, 0, pbuf[pi].pthid),
+					 ptid_t (pid, pbuf[pi].tid, pbuf[pi].pthid),
 					 priv);
 
 	  pi++;
@@ -823,7 +811,7 @@ sync_threadlists (int pid)
 	  ptid_t pptid, gptid;
 	  int cmp_result;
 
-	  pptid = ptid_t (pid, 0, pbuf[pi].pthid);
+	  pptid = ptid_t (pid, pbuf[pi].tid, pbuf[pi].pthid);
 	  gptid = gbuf[gi]->ptid;
 	  pdtid = pbuf[pi].pdtid;
 	  tid = pbuf[pi].tid;
@@ -841,8 +829,23 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      if (gptid.is_pid ())
+		{
+		  gdb_assert (gptid.pid () == pptid.pid ());
+		  thread_change_ptid (proc_target, gptid, pptid);
+		  aix_thread_info *priv = new aix_thread_info;
+		  priv->pdtid = pbuf[pi].pdtid;
+		  priv->tid = pbuf[pi].tid;
+		  tp = find_thread_ptid (proc_target, pptid);
+		  tp->priv.reset (priv);
+		  pi++;
+		  gi++;
+		}
+	      else
+		{
+		  delete_thread (gbuf[gi]);
+		  gi++;
+		}
 	    }
 	  else
 	    {
@@ -1091,10 +1094,6 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
   if (ptid.pid () == -1)
     return ptid_t (-1);
 
-  /* The target beneath does not deal with threads, so it should only return
-     pid-only ptids.  */
-  gdb_assert (ptid.is_pid ());
-
   /* Check whether libpthdebug might be ready to be initialized.  */
   if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
diff --git a/gdb/rs6000-aix-nat.c b/gdb/rs6000-aix-nat.c
index 2ac1f6e70b6..e48580f1b96 100644
--- a/gdb/rs6000-aix-nat.c
+++ b/gdb/rs6000-aix-nat.c
@@ -99,6 +99,8 @@ class rs6000_nat_target final : public inf_ptrace_target
      support.  */
   void follow_fork (inferior *, ptid_t, target_waitkind, bool, bool) override;
 
+  std::string pid_to_str (ptid_t) override;
+
 protected:
 
   void post_startup_inferior (ptid_t ptid) override;
@@ -619,6 +621,64 @@ rs6000_nat_target::xfer_partial (enum target_object object,
     }
 }
 
+/* Search through the list of all kernel threads for the thread
+   that has stopped on a SIGTRAP or SIGINT signal, and return
+   its TID.  Return 0 if none found.  */
+
+static tid_t
+get_signaled_thread_rs6000 (int pid)
+{
+  struct thrdsinfo64 thrinf;
+  tid_t ktid = 0;
+
+  while (1)
+    {
+      if (getthrds (pid, &thrinf,
+                    sizeof (thrinf), &ktid, 1) != 1)
+        break;
+
+      if (thrinf.ti_cursig == SIGTRAP || thrinf.ti_cursig == SIGINT)
+        return thrinf.ti_tid;
+    }
+
+  /* Didn't find any thread stopped on a SIGTRAP signal.  */
+  return 0;
+}
+
+/* If my process is pthreaded I need to return that ptid else ptid_t
+   (pid).  */
+
+static ptid_t
+find_the_return_ptid (pid_t pid)
+{
+  ptid_t ptid = ptid_t (pid);
+  process_stratum_target *proc_target
+        = current_inferior ()->process_target ();
+  inferior *inf = find_inferior_pid (proc_target, pid);
+  thread_info *tp = find_thread_ptid (inf, ptid_t (pid));
+  if (tp == nullptr)
+    for (thread_info *tp1 : inf->threads ())
+       if (tp1->ptid.lwp () == get_signaled_thread_rs6000 (pid))
+         return tp1->ptid;
+  return ptid;
+}
+
+/* Returning "thread" or "process" info as control comes here 
+   during a process switch in multi process debugging.  This 
+   is needed for "info threads" command as a process can be
+   threaded or non threaded in multi process case.  */
+
+std::string
+rs6000_nat_target::pid_to_str (ptid_t ptid)
+{
+  if (ptid.tid () != 0)
+    return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
+
+  else
+    return string_printf (_("Process %s"), pulongest (ptid.pid ()));
+}
+
+
 /* Wait for the child specified by PTID to do something.  Return the
    process ID of the child, or MINUS_ONE_PTID in case of error; store
    the status in *OURSTATUS.  */
@@ -672,7 +732,7 @@ rs6000_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
 	      if (parent_pid > 0)
 		{
 		  ourstatus->set_forked (ptid_t (pid));
-		  return ptid_t (parent_pid);
+		  return find_the_return_ptid (parent_pid);
 		}
 	      aix_remember_child (pid);
 	    }
@@ -687,7 +747,7 @@ rs6000_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
 	      if (child_pid > 0)
 		{
 		  ourstatus->set_forked (ptid_t (child_pid));
-		  return ptid_t (pid);
+		  return find_the_return_ptid (pid);
 		}
 	      aix_remember_parent (pid);
 	    }
@@ -712,7 +772,7 @@ rs6000_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
   else
     *ourstatus = host_status_to_waitstatus (status);
 
-  return ptid_t (pid);
+  return find_the_return_ptid (pid);
 }
 \f
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-12-02  7:50                         ` Aditya Kamath1
@ 2022-12-05 18:33                           ` Ulrich Weigand
  2022-12-08 10:28                             ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2022-12-05 18:33 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>>I'm not sure why it is necessary to handle this in the process layer
>>(rs6000-aix-nat.c) instead of the thread layer (aix-thread.c).
>>What specifically breaks if you do not have these rs6000-aix-nat.c
>>changes?
>
>So, if you observe output 3 or 4, the program first multi threads,
>I mean thread events are handled first and then the threads fork.
>So, when this happens, I cannot return ptid_t (parent_pid). If I do
>so, the GDB core will treat it as a new process and add it in my
>threadlist as say process 100 despite existence of 'thread 1'
>representing the same. So, I need to correctly send which thread
>did the fork () event or which thread of the process is the one who
>gave birth to a new inferior process [say 2 or 3 in output 3 below],
>I mean which thread caused the multi-process event when the process
>is multi-threaded. This has to be handled here, as control from target.c
>comes directly to rs6000-aix-nat::wait and not through
>aix-thread.c::wait, since fork () is a process event.

So this last bit seems to be the problem.  Could you elaborate on
what the exact call stack is?  I thought once the thread layer is
initialized, calls to ::wait should always go through it ...

>>If you *do* need to handle LWPs (kernel thread IDs) in the process
>>layer (this can be a reasonable choice, and it is done by several other
>>native targets), then it should be *consistent*, and *all* LWP handling
>>should be in the process layer. In particular, under no circumstances
>>does it make sense to duplicate the "find current/signalled thread"
>>code in *both* the process and thread layers.
>
>This is not straightforward to do. The reason is that, say, our application is pthreaded.
>We need our sync_threadlists() code to detect multiple threads and sync.
>We cannot handle this in rs6000-aix-nat.c with the current design of the code.
>If, say, the child process is multi-threaded, things can get complex.
>It will require us to move the whole GDB list and pthread list sync code to
>rs6000-aix-nat.c. The essence, or the USP [Unique Selling Proposition],
>of the aix-thread.c code will be lost.

So the way this works e.g. on Linux is that the process layer handles
both processes and the *kernel* aspect of threads, while the thread
layer handles the *user-space* (libc/libpthread) aspect of threads.

In terms of the GDB ptid_t, this means that both the "pid" and "lwp"
field are "owned" by the process layer (which would be rs6000-aix-nat.c
in your case), while only the "tid" field is owned by the thread
layer (which would be aix-thread.c).  

Linux does that because it allows correctly debugging programs that
only use the kernel threading capabilities without using libpthread,
e.g. by directly calling the "clone" system call and not "pthread_create".
Such threads won't be in the thread list managed by the user space
library, but are still handled by the process layer in GDB, tracked
as lwp without associated tid.
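
The field ownership described above can be sketched as follows. This is
only a simplified model, assuming a three-field identifier; sketch_ptid,
process_layer_event and thread_layer_annotate are hypothetical names
for illustration and are not GDB's actual ptid_t or target API.

```cpp
#include <cassert>

// Simplified stand-in for GDB's ptid_t (hypothetical, not the real class).
struct sketch_ptid
{
  int pid;            // process ID, owned by the process layer
  long lwp;           // kernel thread ID, owned by the process layer
  unsigned long tid;  // user-space thread ID, owned by the thread layer

  // True when only the pid field is set, i.e. this names a whole process.
  bool is_pid () const { return lwp == 0 && tid == 0; }
};

// What a process layer could report from wait: the pid plus the kernel
// thread that caused the event.  It never fills in the tid field.
sketch_ptid
process_layer_event (int pid, long lwp)
{
  return sketch_ptid {pid, lwp, 0};
}

// What a thread layer could add on top: the user-space thread ID.
// The pid and lwp fields are passed through unchanged.
sketch_ptid
thread_layer_annotate (sketch_ptid from_below, unsigned long user_tid)
{
  from_below.tid = user_tid;
  return from_below;
}
```

A kernel thread that the user-space library does not know about would
simply keep tid == 0 while still carrying a non-zero lwp.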

Not sure if something like that is even possible in AIX.  If it does
make sense to handle things similarly in AIX (one other reason would
be ptrace commands that require LWPs, e.g. like the VSX register
access you had in another thread), some code would indeed need
to move, e.g. everything related to accessing *kernel* threads
(fetch_regs_kernel_thread etc.), while code that accesses *user*
threads via the libpthread accessors (fetch_regs_user_thread etc.)
would still remain in aix-thread.c.


>>>[Switching to process 16777620]
>
>>This outputs inferior_ptid ...
>
>Yes, you were right
>
>>>* 1.1  process 16777620  0xd0595fb0 in _p_nsleep ()
>>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>>  1.2  process 16777620  0xd0595fb0 in _p_nsleep ()
>>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>>  1.3  process 16777620  0xd0595fb0 in _p_nsleep ()
>>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>>  2.1  process 8323570   0xd0594fc8 in ?? ()
>>>  3.1  process 17957172  0xd0594fc8 in ?? ()
>
>>... and this outputs the ptid values for those threads.
>
>>If it says "process ...", then those ptid values have not
>>properly been switched over to the (pid, lwp, tid) format.

>While debugging in depth over the last two days, I realised our pid_to_str
>is needed in rs6000-aix-nat.c, as control comes here in search of it.
>If it is missing, GDB treats all threads as processes.

This is again very suspicious.  We obviously already have
threads, so the thread layer should be initialized.  This
means that any "pid_to_str" call should go through the
*thread* layer (implementation in aix-thread.c).  If that
doesn't happen, we should understand why.  (This may be the
same problem that causes "wait" to be called from the
wrong layer, as seen above.)


>>You should verify that the sync_threadlists code handles
>>all multi-process cases correctly.  I haven't looked at
>>this in detail, but are you sure that here:
>
>>>@@ -841,8 +829,22 @@ sync_threadlists (int pid)
> >>            }
> >>          else if (cmp_result > 0)
> >>            {
>>>-             delete_thread (gbuf[gi]);
>
>
>>you never accidentally switch the *pid* part (if "gptid"
>>belongs to a different pid than "pptid")?
>
>So, this is not the reason. I have added an assertion here just
>to be sure. I get what you are thinking.

Having an assertion is of course good, but it isn't obvious to
me that this never can be hit.
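
One way to make such an assertion hold by construction, sketched here
purely as an assumption and not as what the patch does, is to collect
only the threads of the process being synchronised before comparing the
two lists. thread_id_model and threads_of_process are hypothetical
names, not GDB functions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified thread identifier (not GDB's ptid_t).
struct thread_id_model
{
  int pid;
  long lwp;
  unsigned long tid;
};

// Return only the entries of ALL that belong to process PID, so that a
// later merge step can never pair threads from two different processes.
std::vector<thread_id_model>
threads_of_process (const std::vector<thread_id_model> &all, int pid)
{
  std::vector<thread_id_model> out;
  for (const thread_id_model &t : all)
    if (t.pid == pid)
      out.push_back (t);
  return out;
}
```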


>>Hmm.  So when "wait" returns, it needs to determine which thread
>>triggered the event that caused ptrace to stop.  On Linux, "wait"
>>will actually return the LWP of that thread, so it can be directly
>>used.  It seems on AIX, "wait" only returns a PID, and you do not
>>immediately know which thread caused the event?
>
>>In that case, I can see why you'd have to consider SIGINT as well
>>as SIGTRAP. However, it seems to me that even those two are not the
>>*only* cases that can cause "wait" to return - doesn't *any* signal
>>(potentially) trigger a ptrace intercept (causing wait to return)?
>
>>But that's probably a more general problem, and wouldn't occur in
>>this simple test case.
>
>Exactly. So I tried debugging a few examples that raise a few other signals
>mentioned in this document [https://www.ibm.com/docs/en/sdk-java-technology/8?topic=reference-signal-handling].
>In AIX we have most of the signals mentioned in the link. It does not block
>us from doing things or crash in case of a segmentation fault signal
>[from our debugger code]. Abort also works fine. Let me know what you think.

The point is if GDB stops because the target received a signal, it
should automatically switch to the particular thread where the signal
was in fact received.  I don't think this will actually happen in all
cases with the current code.

Shouldn't you instead check for *any* signal in get_signaled_thread?
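
A minimal sketch of that "any signal" check, using a mock thread table
instead of the AIX getthrds () call so that it is self-contained.
sketch_thread_info and find_signaled_thread are hypothetical names, and
it is an assumption here that a thread without a pending signal reports
a current signal of 0.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one per-thread record from the kernel.
struct sketch_thread_info
{
  unsigned long tid;  // kernel thread ID
  int cursig;         // current signal; assumed to be 0 when none
};

// Return the kernel thread ID of the first thread that has any current
// signal, rather than testing for SIGTRAP or SIGINT specifically.
// Return 0 if no thread has one.
unsigned long
find_signaled_thread (const std::vector<sketch_thread_info> &threads)
{
  for (const sketch_thread_info &t : threads)
    if (t.cursig != 0)
      return t.tid;
  return 0;
}
```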


Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-12-05 18:33                           ` Ulrich Weigand
@ 2022-12-08 10:28                             ` Aditya Kamath1
  2022-12-08 10:46                               ` Aditya Kamath1
  2022-12-08 16:29                               ` Ulrich Weigand
  0 siblings, 2 replies; 49+ messages in thread
From: Aditya Kamath1 @ 2022-12-08 10:28 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 28014 bytes --]

Hi Ulrich and community,

Please find the new patch [See:- 0001-Fix-multi-thread-bug-in-AIX.patch ]

I have moved the get_signaled_thread () function, which tells us which kernel thread caused the event the debugger was waiting for, to rs6000-aix-nat.c.

So, we figure out which kernel thread might have caused an event in the rs6000-aix-nat.c code itself. If, say, the main thread is pthreaded, or a new user thread is created, or a user thread is deleted, then I have taken care of what to return in the sync_threadlists () code itself.

>>So, if you observe output 3 or 4, the program first multi threads,
>>I mean thread events are handled first and then the threads fork.
>>So, when this happens, I cannot return ptid_t (parent_pid). If I do
>>so, the GDB core will treat it as a new process and add it in my
>>threadlist as say process 100 despite existence of 'thread 1'
>>representing the same. So, I need to correctly send which thread
>>did the fork () event or which thread of the process is the one who
>>gave birth to a new inferior process [say 2 or 3 in output 3 below],
>>I mean which thread caused the multi-process event when the process
>>is multi-threaded. This has to be handled here, as control from target.c
>>comes directly to rs6000-aix-nat::wait and not through
>>aix-thread.c::wait, since fork () is a process event.

>So this last bit seems to be the problem.  Could you elaborate on
>what the exact call stack is?  I thought once the thread layer is
>initialized, calls to ::wait should always go through it ...

Kindly see the backtrace sections pasted below in this email:

  *   BT:- Thread_wait [hit on a thread event, such as a new thread being born or the main process becoming pthreaded],
  *   BT:- Post thread wait in rs6000-aix-nat::wait  [which is the beneath ()->wait () call in aix_thread_target::wait],
  *   BT:- If direct rs6000-aix-nat::wait [as outputs 3 and 4 {below in this email} show, control comes directly to rs6000-aix-nat.c if the main process forks after creating threads].

>So the way this works e.g. on Linux is that the process layer handles
>both processes and the *kernel* aspect of threads, while the thread
>layer handles the *user-space* (libc/libpthread) aspect of threads.

>In terms of the GDB ptid_t, this means that both the "pid" and "lwp"
>field are "owned" by the process layer (which would be rs6000-aix-nat.c
>in your case), while only the "tid" field is owned by the thread
>layer (which would be aix-thread.c).

>Linux does that because it allows correctly debugging programs that
>only use the kernel threading capabilities without using libpthread,
>e.g. by directly calling the "clone" system call and not "pthread_create".
>Such threads won't be in the thread list managed by the user space
>library, but are still handled by the process layer in GDB, tracked
>as lwp without associated tid.

>Not sure if something like that is even possible in AIX.  If it does
>make sense to handle things similarly in AIX (one other reason would
>be ptrace commands that require LWPs, e.g. like the VSX register
>access you had in another thread), some code would indeed need
>to move, e.g. everything related to accessing *kernel* threads
>(fetch_regs_kernel_thread etc.), while code that accesses *user*
>threads via the libpthread accessors (fetch_regs_user_thread etc.)
>would still remain in aix-thread.c.

With this patch I have moved all my lwp checks into rs6000-aix-nat.c and kept the user thread handling in aix-thread.c. Yes, this will help us in the vector patch.

>>While debugging in depth over the last two days, I realised our pid_to_str
>>is needed in rs6000-aix-nat.c, as control comes here in search of it.
>>If it is missing, GDB treats all threads as processes.

>This is again very suspicious.  We obviously already have
>threads, so the thread layer should be initialized.  This
>means that any "pid_to_str" call should go through the
>*thread* layer (implementation in aix-thread.c).  If that
>doesn't happen, we should understand why.  (This may be the
>same problem that causes "wait" to be called from the
>wrong layer, as seen above.)

Kindly check the backtrace section [ BT:- pid_to_str stack ] below in this email. What is happening is that a thread event comes through the thread layer and a process event comes through the process layer. For example, when I press the interrupt key [Ctrl+c] in a multi-process scenario, the GDB core needs to know which process is involved. Looking at the stack, it is built on the assumption that the target figures out, in the process layer, the kernel thread that eventually caused this event.

Secondly, kindly look at aix-thread.c:pid_to_str. We have a beneath ()->pid_to_str () call there in case the process is not threaded. So, we need one in rs6000-aix-nat.c.

std::string
aix_thread_target::pid_to_str (ptid_t ptid)
{
  if (!PD_TID (ptid))
    return beneath ()->pid_to_str (ptid);

  return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
}
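
The delegation in that function can be modelled in isolation as below.
This is a sketch under simplifying assumptions: ptid_model,
process_layer_model and thread_layer_model are hypothetical names, and
the real GDB target stack is more involved than a single pointer to the
layer beneath.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified thread identifier (not GDB's ptid_t).
struct ptid_model
{
  int pid;
  long lwp;
  unsigned long tid;
};

// Stand-in for the process layer: it only knows how to name a process.
struct process_layer_model
{
  std::string pid_to_str (const ptid_model &p) const
  {
    return "Process " + std::to_string (p.pid);
  }
};

// Stand-in for the thread layer: it names user threads itself and
// defers to the layer beneath when the tid field is not set.
struct thread_layer_model
{
  const process_layer_model *beneath;

  std::string pid_to_str (const ptid_model &p) const
  {
    if (p.tid == 0)
      return beneath->pid_to_str (p);
    return "Thread " + std::to_string (p.tid);
  }
};
```

In this model, a call that enters at the thread layer still reaches the
process layer for any identifier without a tid, which is why the layer
beneath needs a usable pid_to_str of its own.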

>>I have added an assertion here just
>>to be sure. I get what you are thinking.
>Having an assertion is of course good, but it isn't obvious to
>me that this never can be hit.

So, while running a few unit tests I did not find any case where we might end up swapping the pid. I added the assertion so that if anyone hits this in the future, we are aware and can change the code accordingly.

>The point is if GDB stops because the target received a signal, it
>should automatically switch to the particular thread where the signal
>was in fact received.  I don't think this will actually happen in all
>cases with the current code.

>Shouldn't you instead check for *any* signal in get_signaled_thread?

Yes, kindly check the get_signaled_thread_rs6000 ().

----------------------------------------------------

These are the changes I made thinking about how we can handle get_signaled_thread () in one place. I have also attached the outputs and programs below.

Also, some functions in aix-thread.c now take a ptid instead of a pid.

Kindly let me know what you think.

Have a nice day ahead.

Thanks and regards,
Aditya.



---------------------------------------------------------------------------------------------

BT:- Thread_wait


Thread 1 hit Breakpoint 1, aix_thread_target::wait (this=0x11001f758 <_aixthread.rw_+24>, ptid=...,
    status=0xffffffffffff360, options=...) at aix-thread.c:1051
1051        pid_to_prc (&ptid);
(gdb) bt
#0  aix_thread_target::wait (this=0x11001f758 <_aixthread.rw_+24>, ptid=..., status=0xffffffffffff360,
    options=...) at aix-thread.c:1051
#1  0x0000000100340778 in target_wait (ptid=..., status=0xffffffffffff360, options=...) at target.c:2598
#2  0x000000010037f158 in do_target_wait_1 (inf=0x1101713f0, ptid=..., status=0xffffffffffff360,
    options=...) at infrun.c:3763
#3  0x000000010037f41c in <lambda(inferior*)>::operator()(inferior *) const (
    __closure=0xffffffffffff130, inf=0x1101713f0) at infrun.c:3822
#4  0x000000010037f85c in do_target_wait (ecs=0xffffffffffff338, options=...) at infrun.c:3841
#5  0x0000000100380cc8 in fetch_inferior_event () at infrun.c:4201
#6  0x0000000100a1e354 in inferior_event_handler (event_type=INF_REG_EVENT) at inf-loop.c:41
#7  0x0000000100392700 in infrun_async_inferior_event_handler (data=0x0) at infrun.c:9555
#8  0x0000000100677d88 in check_async_event_handlers () at async-event.c:337
#9  0x000000010067439c in gdb_do_one_event (mstimeout=-1) at event-loop.cc:221
#10 0x0000000100001dd0 in start_event_loop () at main.c:411
#11 0x0000000100001fd8 in captured_command_loop () at main.c:471
#12 0x0000000100004150 in captured_main (data=0xffffffffffff9f0) at main.c:1330
#13 0x0000000100004224 in gdb_main (args=0xffffffffffff9f0) at main.c:1345
#14 0x0000000100000aa0 in main (argc=2, argv=0xffffffffffffa90) at gdb.c:32

------------------------------------------------------------

BT:- Post thread wait in rs6000-aix-nat::wait


(gdb) c
Continuing.

Thread 1 hit Breakpoint 2, rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...,
    ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
695           set_sigint_trap ();
(gdb) bt
#0  rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...,
    ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
#1  0x0000000100599d68 in aix_thread_target::wait (this=0x11001f758 <_aixthread.rw_+24>, ptid=...,
    status=0xffffffffffff360, options=...) at aix-thread.c:1053
#2  0x0000000100340778 in target_wait (ptid=..., status=0xffffffffffff360, options=...) at target.c:2598
#3  0x000000010037f158 in do_target_wait_1 (inf=0x1101713f0, ptid=..., status=0xffffffffffff360,
    options=...) at infrun.c:3763
#4  0x000000010037f41c in <lambda(inferior*)>::operator()(inferior *) const (
    __closure=0xffffffffffff130, inf=0x1101713f0) at infrun.c:3822
#5  0x000000010037f85c in do_target_wait (ecs=0xffffffffffff338, options=...) at infrun.c:3841
#6  0x0000000100380cc8 in fetch_inferior_event () at infrun.c:4201
#7  0x0000000100a1e354 in inferior_event_handler (event_type=INF_REG_EVENT) at inf-loop.c:41
#8  0x0000000100392700 in infrun_async_inferior_event_handler (data=0x0) at infrun.c:9555
#9  0x0000000100677d88 in check_async_event_handlers () at async-event.c:337
#10 0x000000010067439c in gdb_do_one_event (mstimeout=-1) at event-loop.cc:221
#11 0x0000000100001dd0 in start_event_loop () at main.c:411
#12 0x0000000100001fd8 in captured_command_loop () at main.c:471
#13 0x0000000100004150 in captured_main (data=0xffffffffffff9f0) at main.c:1330
#14 0x0000000100004224 in gdb_main (args=0xffffffffffff9f0) at main.c:1345
#15 0x0000000100000aa0 in main (argc=2, argv=0xffffffffffffa90) at gdb.c:32

-----------------------------------------------------------------------------------------------------
BT:- If direct rs6000-aix-nat::wait


Thread 1 hit Breakpoint 2, rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...,
    ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
695           set_sigint_trap ();
(gdb) bt
#0  rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...,
    ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
#1  0x0000000100340778 in target_wait (ptid=..., status=0xffffffffffff360, options=...) at target.c:2598
#2  0x000000010037f158 in do_target_wait_1 (inf=0x1105f4430, ptid=..., status=0xffffffffffff360,
    options=...) at infrun.c:3763
#3  0x000000010037f41c in <lambda(inferior*)>::operator()(inferior *) const (
    __closure=0xffffffffffff130, inf=0x1105f4430) at infrun.c:3822
#4  0x000000010037f85c in do_target_wait (ecs=0xffffffffffff338, options=...) at infrun.c:3841
#5  0x0000000100380cc8 in fetch_inferior_event () at infrun.c:4201
#6  0x0000000100a1e354 in inferior_event_handler (event_type=INF_REG_EVENT) at inf-loop.c:41
#7  0x0000000100392700 in infrun_async_inferior_event_handler (data=0x0) at infrun.c:9555
#8  0x0000000100677d88 in check_async_event_handlers () at async-event.c:337
#9  0x000000010067439c in gdb_do_one_event (mstimeout=-1) at event-loop.cc:221
#10 0x0000000100001dd0 in start_event_loop () at main.c:411
#11 0x0000000100001fd8 in captured_command_loop () at main.c:471
#12 0x0000000100004150 in captured_main (data=0xffffffffffff9f0) at main.c:1330
#13 0x0000000100004224 in gdb_main (args=0xffffffffffff9f0) at main.c:1345
#14 0x0000000100000aa0 in main (argc=2, argv=0xffffffffffffa90) at gdb.c:32

---------------------------------------------------------

BT:- pid_to_str stack


(gdb) bt

#0  rs6000_nat_target::pid_to_str[abi:cxx11](ptid_t) (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...)

    at rs6000-aix-nat.c:674

#1  0x00000001003409ec in target_pid_to_str[abi:cxx11](ptid_t) (ptid=...) at target.c:2623

#2  0x000000010038fc08 in normal_stop () at infrun.c:8697

#3  0x0000000100380ff4 in fetch_inferior_event () at infrun.c:4266

#4  0x0000000100a1e354 in inferior_event_handler (event_type=INF_REG_EVENT) at inf-loop.c:41

#5  0x0000000100392700 in infrun_async_inferior_event_handler (data=0x0) at infrun.c:9555

#6  0x0000000100677d88 in check_async_event_handlers () at async-event.c:337

#7  0x000000010067439c in gdb_do_one_event (mstimeout=-1) at event-loop.cc:221

#8  0x0000000100001dd0 in start_event_loop () at main.c:411

#9  0x0000000100001fd8 in captured_command_loop () at main.c:471

#10 0x0000000100004150 in captured_main (data=0xffffffffffff9f0) at main.c:1330

#11 0x0000000100004224 in gdb_main (args=0xffffffffffff9f0) at main.c:1345

#12 0x0000000100000aa0 in main (argc=2, argv=0xffffffffffffa90) at gdb.c:32


---------------------------------------------------------------------


Program 1:- [Credits gdb.threads/continue-pending-status.c]


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 3

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */
  pthread_barrier_wait (&barrier);

  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    sleep (1);

  return 0;
}



----------------------------------------------------------------
Output1:- Single process


Reading symbols from /home/aditya/gdb_tests/continue-pending-status...

(gdb) r

Starting program: /home/aditya/gdb_tests/continue-pending-status

^C[New Thread 258]

[New Thread 515]

[New Thread 772]


Thread 3 received signal SIGINT, Interrupt.

[Switching to Thread 515]

thread_function (arg=0x0)

    at /home/aditya/gdb_tests/continue-pending-status.c:36

36        while (1); /* break here */

(gdb) info threads

  Id   Target Id                          Frame

  1    Thread 1 (tid 24838585, running)   warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


  2    Thread 258 (tid 23134635, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/continue-pending-status.c:36

* 3    Thread 515 (tid 30146867, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/continue-pending-status.c:36

  4    Thread 772 (tid 27853165, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/continue-pending-status.c:36

---------------------------------------------------------------------------------

Program 2:- Multi process Code


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else{
    printf (" Iam child \n");
    child = fork ();
    if (child > 0)
      printf ("From child I became a parent \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
    break;
  }

  return 0;
}


-------------------------------------------------------------------------

Output 2:- With detach-on-fork on


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 258]

[New Thread 515]

[Detaching after fork from child process 8323572]

 Iam child

I am grandchild

From child I became a parent

I am parent

[Detaching after fork from child process 11665884]

 Iam child

I am grandchild

From child I became a parent

I am parent

^C

Thread 2 received signal SIGINT, Interrupt.

[Switching to Thread 258]

thread_function (arg=0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

32        while (1); /* break here */

(gdb) info threads

  Id   Target Id                          Frame

  1    Thread 1 (tid 27263269, running)   warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


* 2    Thread 258 (tid 28705075, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

  3    Thread 515 (tid 27853169, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32


-------------------------------------------------------------------------

Output 3:- With detach-on-fork off


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) set detach-on-fork off

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 258]

[New Thread 515]

[New inferior 2 (Process 15466928)]

[New inferior 3 (Process 13894048)]

I am parent

I am parent

^C

Thread 1.1 received signal SIGINT, Interrupt.

[Switching to Thread 1]

0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) info threads

  Id   Target Id         Frame

* 1.1  Thread 1          0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.2  Thread 258        0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.3  Thread 515        0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  2.1  Process 15466928  0xd0594fc8 in ?? ()

  3.1  Process 13894048  0xd0594fc8 in ?? ()

--------------------------------------------------

Output 4:- With detach-on-fork off and follow-fork-mode child


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) set detach-on-fork off

(gdb) set follow-fork-mode child

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 258]

[New Thread 515]

[Attaching after Thread 515 fork to child Process 13894050]

[New inferior 2 (Process 13894050)]

 Iam child

[Attaching after Process 13894050 fork to child Process 11010474]

[New inferior 3 (Process 11010474)]

I am grandchild

^CReading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...


Thread 3.1 received signal SIGINT, Interrupt.

[Switching to Process 11010474]

thread_function (arg=0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

32        while (1); /* break here */

(gdb) info threads

  Id   Target Id         Frame

  1.1  Thread 1          0xd0594fc8 in _sigsetmask ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.2  Thread 258        0xd0594fc8 in _sigsetmask ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.3  Thread 515        0xd0594fc8 in _sigsetmask ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  2.1  Process 13894050  0xd0594fc8 in ?? ()

* 3.1  Process 11010474  thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32




________________________________
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Sent: 06 December 2022 00:03
To: simark@simark.ca <simark@simark.ca>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>>I'm not sure why it is necessary to handle this in the process layer
>>(rs6000-aix-nat.c) instead of the thread layer (aix-thread.c).
>>What specifically breaks if you do not have these rs6000-aix-nat.c
>>changes?
>
>So, if you observe output 3 or 4, the program first becomes multi-threaded,
>I mean thread events are handled first and then the threads fork.
>So, when this happens, I cannot return ptid_t (parent_pid). If I do
>so, the GDB core will treat it as a new process and add it to my
>thread list as, say, process 100 despite the existence of 'thread 1'
>representing the same. So, I need to correctly report which thread
>caused the fork () event, that is, which thread of the process
>gave birth to a new inferior process [say 2 or 3 in output 3 below],
>I mean which thread caused the multi-process event when the process
>is multi-threaded. This has to be handled here as control from target.c
>comes directly to rs6000-aix-nat::wait and not through
>aix-thread.c::wait since fork () is a process event.

So this last bit seems to be the problem.  Could you elaborate on
what the exact call stack is?  I thought once the thread layer is
initialized, calls to ::wait should always go through it ...

>>If you *do* need to handle LWPs (kernel thread IDs) in the process
>>layer (this can be a reasonable choice, and it done by several other
>>native targets), then it should be *consistent*, and *all* LWP handling
>>should be in the process layer. In particular, under no circumstances
>>does it make sense to duplicate the "find current/signalled thread"
>>code in *both* the process and thread layers.
>
>This is not straightforward to do. The reason is that, say, our application is pthreaded.
>We need our sync_threadlists() code to detect multiple threads and sync.
>We cannot handle this in rs6000-aix-nat.c with the current design of the code.
>If, say, the child process is multi-threaded, things can get complex.
>It would require us to move the whole GDB list and pthread list sync code to
>rs6000-aix-nat.c. The essence, or the USP
>[Unique Selling Proposition], of the aix-thread.c code would be lost.

So the way this works e.g. on Linux is that the process layer handles
both processes and the *kernel* aspect of threads, while the thread
layer handles the *user-space* (libc/libpthread) aspect of threads.

In terms of the GDB ptid_t, this means that both the "pid" and "lwp"
fields are "owned" by the process layer (which would be rs6000-aix-nat.c
in your case), while only the "tid" field is owned by the thread
layer (which would be aix-thread.c).
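
As a rough illustration of that ownership split, here is a minimal sketch.
It is not GDB code: sketch_ptid, process_layer_wait and thread_layer_wait
are hypothetical names that only mimic the three ptid_t fields, under the
assumption that the layer beneath reports pid and lwp and the thread layer
adds nothing but tid.

```cpp
// Illustrative sketch only: sketch_ptid is a hypothetical stand-in for
// GDB's ptid_t, not the real class.
#include <cassert>

struct sketch_ptid
{
  int pid;            // process id: owned by the process layer
  long lwp;           // kernel thread id: owned by the process layer
  unsigned long tid;  // user-space (libpthread) id: owned by the thread layer
};

// Process layer: reports which process and which kernel thread saw the
// event.  It never sets tid.
inline sketch_ptid
process_layer_wait (int pid, long lwp)
{
  return sketch_ptid {pid, lwp, 0};
}

// Thread layer: takes what the layer beneath reported and fills in only
// the user-space thread id, leaving pid and lwp untouched.
inline sketch_ptid
thread_layer_wait (sketch_ptid from_beneath, unsigned long pthread_id)
{
  // The layer beneath must not have claimed the tid field.
  assert (from_beneath.tid == 0);
  sketch_ptid result = from_beneath;
  result.tid = pthread_id;
  return result;
}
```

With this split, a kernel-only thread would simply keep tid == 0 after
passing through the thread layer's lookup, which is the "lwp without
associated tid" case described for Linux.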

Linux does that because it allows correctly debugging programs that
only use the kernel threading capabilities without using libpthread,
e.g. by directly calling the "clone" system call and not "pthread_create".
Such threads won't be in the thread list managed by the user space
library, but are still handled by the process layer in GDB, tracked
as lwp without associated tid.

Not sure if something like that is even possible in AIX.  If it does
make sense to handle things similarly in AIX (one other reason would
be ptrace commands that require LWPs, e.g. like the VSX register
access you had in another thread), some code would indeed need
to move, e.g. everything related to accessing *kernel* threads
(fetch_regs_kernel_thread etc.), while code that accesses *user*
threads via the libpthread accessors (fetch_regs_user_thread etc.)
would still remain in aix-thread.c.


>>>[Switching to process 16777620]
>
>>This outputs inferior_ptid ...
>
>Yes, you were right
>
>>>* 1.1  process 16777620  0xd0595fb0 in _p_nsleep ()
>>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>>  1.2  process 16777620  0xd0595fb0 in _p_nsleep ()
>>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>>  1.3  process 16777620  0xd0595fb0 in _p_nsleep ()
>>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>>  2.1  process 8323570   0xd0594fc8 in ?? ()
>>>  3.1  process 17957172  0xd0594fc8 in ?? ()
>
>>... and this outputs the ptid values for those threads.
>
>>If it says "process ...", then those ptid values have not
>>properly been switched over to the (pid, lwp, tid) format.

>While debugging in depth over the last two days I realised our pid_to_str
>is needed in rs6000-aix-nat.c as control comes here in search of it.
>If it is missing, GDB treats all threads as processes.

This is again very suspicious.  We obviously already have
threads, so the thread layer should be initialized.  This
means that any "pid_to_str" call should go through the
*thread* layer (implementation in aix-thread.c).  If that
doesn't happen, we should understand why.  (This may be the
same problem that causes "wait" to be called from the
wrong layer, as seen above.)


>>You should verify that the sync_threadlists code handles
>>all multi-process cases correctly.  I haven't looked at
>>this in detail, but are you sure that here:
>
>>>@@ -841,8 +829,22 @@ sync_threadlists (int pid)
> >>            }
> >>          else if (cmp_result > 0)
> >>            {
>>>-             delete_thread (gbuf[gi]);
>
>
>>you never accidentally switch the *pid* part (if "gptid"
>>belongs to a different pid than "pptid")?
>
>So, this is not the reason. I have added an assertion here just
>to be sure. I get what you are thinking.

Having an assertion is of course good, but it isn't obvious to
me that this never can be hit.


>>Hmm.  So when "wait" returns, it needs to determine which thread
>>triggered the event that caused ptrace to stop.  On Linux, "wait"
>>will actually return the LWP of that thread, so it can be directly
>>used.  It seems on AIX, "wait" only returns a PID, and you do not
>>immediately know which thread caused the event?
>
>>In that case, I can see why you'd have to consider SIGINT as well
>>as SIGTRAP. However, it seems to me that even those two are not the
>>*only* cases that can cause "wait" to return - doesn't *any* signal
>>(potentially) trigger a ptrace intercept (causing wait to return)?
>
>>But that's probably a more general problem, and wouldn't occur in
>>this simple test case.
>
>Exactly. So I tried debugging a few examples that raise a few other signals
>mentioned in this document [https://www.ibm.com/docs/en/sdk-java-technology/8?topic=reference-signal-handling].
>In AIX we have most of them mentioned in the link. It does not block
>us from doing things or crash in case of a segmentation fault signal
>[from our debugger code]. Abort also works fine. Let me know what you think.

The point is if GDB stops because the target received a signal, it
should automatically switch to the particular thread where the signal
was in fact received.  I don't think this will actually happen in all
cases with the current code.

Shouldn't you instead check for *any* signal in get_signaled_thread?
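
To make the suggestion concrete, here is a minimal sketch of an
"any signal" check. It is not the patch's code: sketch_thread_rec and
first_signaled_thread are hypothetical names, and a plain vector stands in
for the getthrds () iteration so the example compiles on any host. The two
fields are modelled on the ti_tid and ti_cursig members that the existing
get_signaled_thread reads.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record standing in for the two thrdsinfo64 fields used
// by get_signaled_thread.
struct sketch_thread_rec
{
  long tid;    // kernel thread id
  int cursig;  // signal recorded for this thread, 0 if none
};

// Return the id of the first kernel thread that has any signal recorded,
// or 0 if no thread has one.  The only change from a SIGTRAP-only search
// is the comparison: "!= 0" instead of "== SIGTRAP".
inline long
first_signaled_thread (const std::vector<sketch_thread_rec> &threads)
{
  for (const sketch_thread_rec &t : threads)
    if (t.cursig != 0)
      return t.tid;

  // No thread is stopped on a signal.
  return 0;
}
```

Note that if more than one thread has a signal recorded, this picks the
first one in list order, so it is a sketch of the comparison only, not of
how to choose between several candidates.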


Bye,
Ulrich


[-- Attachment #2: 0001-Fix-multi-thread-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 12381 bytes --]

From bba91f717b7779f39d71282835cceaaeda7ef588 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Thu, 8 Dec 2022 01:03:58 -0600
Subject: [PATCH] Fix multi thread bug in AIX

---
 gdb/aix-thread.c     | 149 ++++++++++++++++---------------------------
 gdb/rs6000-aix-nat.c |  66 ++++++++++++++++++-
 2 files changed, 118 insertions(+), 97 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..6e6f8619b64 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -508,14 +508,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
     if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+      inferior_ptid = ptid_t (user_current_pid);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -639,36 +638,24 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
+/* iterate_over_threads() callback for counting GDB threads.  */
 
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
 
 static int
 giter_count (struct thread_info *thread, void *countp)
 {
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
+  (*(int *) countp)++;
   return 0;
 }
 
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* iterate_over_threads() callback for accumulating GDB thread pids.  */
 
 static int
 giter_accum (struct thread_info *thread, void *bufp)
 {
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
+  **(struct thread_info ***) bufp = thread;
+  (*(struct thread_info ***) bufp)++;
+    
   return 0;
 }
 
@@ -703,30 +690,6 @@ gcmp (const void *t1v, const void *t2v)
   return ptid_cmp (t1->ptid, t2->ptid);
 }
 
-/* Search through the list of all kernel threads for the thread
-   that has stopped on a SIGTRAP signal, and return its TID.
-   Return 0 if none found.  */
-
-static pthdb_tid_t
-get_signaled_thread (int pid)
-{
-  struct thrdsinfo64 thrinf;
-  tid_t ktid = 0;
-
-  while (1)
-    {
-      if (getthrds (pid, &thrinf,
-		    sizeof (thrinf), &ktid, 1) != 1)
-	break;
-
-      if (thrinf.ti_cursig == SIGTRAP)
-	return thrinf.ti_tid;
-    }
-
-  /* Didn't find any thread stopped on a SIGTRAP signal.  */
-  return 0;
-}
-
 /* Synchronize GDB's thread list with libpthdebug's.
 
    There are some benefits of doing this every time the inferior stops:
@@ -740,8 +703,8 @@ get_signaled_thread (int pid)
      - simplifies the demands placed on libpthdebug, which seems to
        have difficulty with certain call patterns */
 
-static void
-sync_threadlists (int pid)
+static ptid_t 
+sync_threadlists (ptid_t ptid)
 {
   int cmd, status;
   int pcount, psize, pi, gcount, gi;
@@ -750,6 +713,15 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+	= current_inferior ()->process_target ();
+  thread_info *tp;
+  pid_t pid = ptid.pid ();
+
+  /* This ptid should hold the ptid of a new thread
+     or return the incoming ptid in case of a deleted thread.  */
+
+  ptid_t post_sync_ptid = ptid;
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -810,12 +782,10 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
-					 ptid_t (pid, 0, pbuf[pi].pthid),
+					 ptid_t (pid, pbuf[pi].tid, pbuf[pi].pthid),
 					 priv);
-
+	  post_sync_ptid = thread->ptid;
 	  pi++;
 	}
       else
@@ -823,7 +793,7 @@ sync_threadlists (int pid)
 	  ptid_t pptid, gptid;
 	  int cmp_result;
 
-	  pptid = ptid_t (pid, 0, pbuf[pi].pthid);
+	  pptid = ptid_t (pid, pbuf[pi].tid, pbuf[pi].pthid);
 	  gptid = gbuf[gi]->ptid;
 	  pdtid = pbuf[pi].pdtid;
 	  tid = pbuf[pi].tid;
@@ -841,15 +811,31 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      if (gptid.is_pid ())
+		{
+		  gdb_assert (gptid.pid () == pptid.pid ());
+		  thread_change_ptid (proc_target, gptid, pptid);
+		  aix_thread_info *priv = new aix_thread_info;
+		  priv->pdtid = pbuf[pi].pdtid;
+		  priv->tid = pbuf[pi].tid;
+		  tp = find_thread_ptid (proc_target, pptid);
+		  tp->priv.reset (priv);
+		  pi++;
+		  gi++;
+		  post_sync_ptid = pptid;
+		}
+	      else
+		{
+		  delete_thread (gbuf[gi]);
+		  gi++;
+		}
 	    }
 	  else
 	    {
 	      process_stratum_target *proc_target
 		= current_inferior ()->process_target ();
 	      thread = add_thread (proc_target, pptid);
-
+	      post_sync_ptid = pptid;	
 	      aix_thread_info *priv = new aix_thread_info;
 	      thread->priv.reset (priv);
 	      priv->pdtid = pdtid;
@@ -861,18 +847,7 @@ sync_threadlists (int pid)
 
   xfree (pbuf);
   xfree (gbuf);
-}
-
-/* Iterate_over_threads() callback for locating a thread, using
-   the TID of its associated kernel thread.  */
-
-static int
-iter_tid (struct thread_info *thread, void *tidp)
-{
-  const pthdb_tid_t tid = *(pthdb_tid_t *)tidp;
-  aix_thread_info *priv = get_aix_thread_info (thread);
-
-  return priv->tid == tid;
+  return post_sync_ptid;
 }
 
 /* Synchronize libpthdebug's state with the inferior and with GDB,
@@ -881,33 +856,23 @@ iter_tid (struct thread_info *thread, void *tidp)
    return a pid-only ptid with PID.  */
 
 static ptid_t
-pd_update (int pid)
+pd_update (ptid_t ptid)
 {
   int status;
-  ptid_t ptid;
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
+  ptid_t post_sync_ptid;
 
   if (!pd_active)
-    return ptid_t (pid);
+    return ptid;
 
   status = pthdb_session_update (pd_session);
   if (status != PTHDB_SUCCESS)
-    return ptid_t (pid);
+    return ptid;
 
-  sync_threadlists (pid);
+  post_sync_ptid = sync_threadlists (ptid);
 
-  /* Define "current thread" as one that just received a trap signal.  */
-
-  tid = get_signaled_thread (pid);
-  if (tid != 0)
-    thread = iterate_over_threads (iter_tid, &tid);
-  if (!thread)
-    ptid = ptid_t (pid);
-  else
-    ptid = thread->ptid;
-
-  return ptid;
+  return post_sync_ptid;
 }
 
 /* Try to start debugging threads in the current process.
@@ -915,19 +880,19 @@ pd_update (int pid)
    for that thread.  Otherwise, return a ptid-only ptid using PID.  */
 
 static ptid_t
-pd_activate (int pid)
+pd_activate (ptid_t ptid)
 {
   int status;
 		
-  status = pthdb_session_init (pid, arch64 ? PEM_64BIT : PEM_32BIT,
+  status = pthdb_session_init (ptid.pid (), arch64 ? PEM_64BIT : PEM_32BIT,
 			       PTHDB_FLAG_REGS, &pd_callbacks, 
 			       &pd_session);
   if (status != PTHDB_SUCCESS)
     {
-      return ptid_t (pid);
+      return ptid;
     }
   pd_active = 1;
-  return pd_update (pid);
+  return pd_update (ptid);
 }
 
 /* Undo the effects of pd_activate().  */
@@ -983,7 +948,7 @@ pd_enable (void)
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
      activate thread debugging.  */
-  pd_activate (inferior_ptid.pid ());
+  pd_activate (inferior_ptid);
 }
 
 /* Undo the effects of pd_enable().  */
@@ -1091,10 +1056,6 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
   if (ptid.pid () == -1)
     return ptid_t (-1);
 
-  /* The target beneath does not deal with threads, so it should only return
-     pid-only ptids.  */
-  gdb_assert (ptid.is_pid ());
-
   /* Check whether libpthdebug might be ready to be initialized.  */
   if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
@@ -1106,10 +1067,10 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
 
       if (regcache_read_pc (regcache)
 	  - gdbarch_decr_pc_after_break (gdbarch) == pd_brk_addr)
-	return pd_activate (ptid.pid ());
+	return pd_activate (ptid);
     }
 
-  return pd_update (ptid.pid ());
+  return pd_update (ptid);
 }
 
 /* Record that the 64-bit general-purpose registers contain VALS.  */
diff --git a/gdb/rs6000-aix-nat.c b/gdb/rs6000-aix-nat.c
index 2ac1f6e70b6..fe7ca4f3f2a 100644
--- a/gdb/rs6000-aix-nat.c
+++ b/gdb/rs6000-aix-nat.c
@@ -99,6 +99,8 @@ class rs6000_nat_target final : public inf_ptrace_target
      support.  */
   void follow_fork (inferior *, ptid_t, target_waitkind, bool, bool) override;
 
+  std::string pid_to_str (ptid_t) override;
+
 protected:
 
   void post_startup_inferior (ptid_t ptid) override;
@@ -619,6 +621,64 @@ rs6000_nat_target::xfer_partial (enum target_object object,
     }
 }
 
+/* Search through the list of all kernel threads for the thread
+   that has stopped on a signal (any signal, not only SIGTRAP or
+   SIGINT), and return its TID.  Return 0 if none found.  */
+
+static tid_t
+get_signaled_thread_rs6000 (int pid)
+{
+  struct thrdsinfo64 thrinf;
+  tid_t ktid = 0;
+
+  while (1)
+    {
+      if (getthrds (pid, &thrinf,
+                    sizeof (thrinf), &ktid, 1) != 1)
+        break;
+
+      if (thrinf.ti_cursig != 0)
+        return thrinf.ti_tid;
+    }
+
+  /* Didn't find any thread stopped on a signal.  */
+  return 0;
+}
+
+/* If my process is pthreaded I need to return that ptid else ptid_t
+   (pid).  */
+
+static ptid_t
+find_the_return_ptid (pid_t pid)
+{
+  ptid_t ptid = ptid_t (pid);
+  process_stratum_target *proc_target
+        = current_inferior ()->process_target ();
+  inferior *inf = find_inferior_pid (proc_target, pid);
+  thread_info *tp = find_thread_ptid (inf, ptid_t (pid));
+  if (tp == nullptr)
+    for (thread_info *tp1 : inf->threads ())
+       if (tp1->ptid.lwp () == get_signaled_thread_rs6000 (pid))
+         return tp1->ptid;
+  return ptid;
+}
+
+/* Returning "thread" or "process" info as control comes here 
+   during a process switch in multi process debugging.  This 
+   is needed for "info threads" command as a process can be
+   threaded or non threaded in multi process case.  */
+
+std::string
+rs6000_nat_target::pid_to_str (ptid_t ptid)
+{
+  if (ptid.tid () != 0)
+    return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
+
+  else
+    return string_printf (_("Process %s"), pulongest (ptid.pid ()));
+}
+
+
 /* Wait for the child specified by PTID to do something.  Return the
    process ID of the child, or MINUS_ONE_PTID in case of error; store
    the status in *OURSTATUS.  */
@@ -672,7 +732,7 @@ rs6000_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
 	      if (parent_pid > 0)
 		{
 		  ourstatus->set_forked (ptid_t (pid));
-		  return ptid_t (parent_pid);
+		  return find_the_return_ptid (parent_pid);
 		}
 	      aix_remember_child (pid);
 	    }
@@ -687,7 +747,7 @@ rs6000_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
 	      if (child_pid > 0)
 		{
 		  ourstatus->set_forked (ptid_t (child_pid));
-		  return ptid_t (pid);
+		  return find_the_return_ptid (pid);
 		}
 	      aix_remember_parent (pid);
 	    }
@@ -712,7 +772,7 @@ rs6000_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
   else
     *ourstatus = host_status_to_waitstatus (status);
 
-  return ptid_t (pid);
+  return find_the_return_ptid (pid);
 }
 \f
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-12-08 10:28                             ` Aditya Kamath1
@ 2022-12-08 10:46                               ` Aditya Kamath1
  2022-12-08 16:29                               ` Ulrich Weigand
  1 sibling, 0 replies; 49+ messages in thread
From: Aditya Kamath1 @ 2022-12-08 10:46 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 41485 bytes --]

Hi Ulrich and community,

Please find the new patch [See:- 0001-Fix-multi-thread-bug-in-AIX.patch ]. The last message was too long, so I am sending it again so that it can be part of the mailing list as well.

I have moved the get_signaled_thread () function, which tells us which kernel thread caused the event that the debugger was waiting for, to rs6000-aix-nat.c.

So, we figure out which kernel thread might have caused an event in the rs6000-aix-nat.c code itself. If, say, the main thread is pthreaded, or a new user thread is being created, or a user thread is deleted, then I have taken care of what to return in the sync_threadlists () code itself.

>>So, if you observe output 3 or 4, the program first becomes multi-threaded,
>>I mean thread events are handled first and then the threads fork.
>>So, when this happens, I cannot return ptid_t (parent_pid). If I do
>>so, the GDB core will treat it as a new process and add it to my
>>thread list as, say, process 100 despite the existence of 'thread 1'
>>representing the same. So, I need to correctly report which thread
>>caused the fork () event, that is, which thread of the process
>>gave birth to a new inferior process [say 2 or 3 in output 3 below],
>>I mean which thread caused the multi-process event when the process
>>is multi-threaded. This has to be handled here as control from target.c
>>comes directly to rs6000-aix-nat::wait and not through
>>aix-thread.c::wait since fork () is a process event.

>So this last bit seems to be the problem.  Could you elaborate on
>what the exact call stack is?  I thought once the thread layer is
>initialized, calls to ::wait should always go through it ...

Kindly see the backtrace sections pasted below in this email:

  *   BT:- Thread_wait [which is on a thread event, like a new thread being born or the main process becoming pthreaded],
  *   BT:- Post thread wait in rs6000-aix-nat::wait  [which is the beneath ()->wait () in aix_thread_target::wait],
  *   BT:- If direct rs6000-aix-nat::wait [where, in outputs 3 and 4 {in the previous email}, you can see that control comes directly to rs6000-aix-nat.c if the main process forks after it already has threads].

>So the way this works e.g. on Linux is that the process layer handles
>both processes and the *kernel* aspect of threads, while the thread
>layer handles the *user-space* (libc/libpthread) aspect of threads.

>In terms of the GDB ptid_t, this means that both the "pid" and "lwp"
>fields are "owned" by the process layer (which would be rs6000-aix-nat.c
>in your case), while only the "tid" field is owned by the thread
>layer (which would be aix-thread.c).

>Linux does that because it allows correctly debugging programs that
>only use the kernel threading capabilities without using libpthread,
>e.g. by directly calling the "clone" system call and not "pthread_create".
>Such threads won't be in the thread list managed by the user space
>library, but are still handled by the process layer in GDB, tracked
>as lwp without associated tid.

>Not sure if something like that is even possible in AIX.  If it does
>make sense to handle things similarly in AIX (one other reason would
>be ptrace commands that require LWPs, e.g. like the VSX register
>access you had in another thread), some code would indeed need
>to move, e.g. everything related to accessing *kernel* threads
>(fetch_regs_kernel_thread etc.), while code that accesses *user*
>threads via the libpthread accessors (fetch_regs_user_thread etc.)
>would still remain in aix-thread.c.

With this patch I have moved all my lwp checks into rs6000-aix-nat.c and kept the user thread handling in aix-thread.c. Yes, this will help us in the vector patch.
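
To make the split of ownership concrete, here is a minimal sketch in C++. It is not GDB code: ptid_sketch, process_layer_event, thread_layer_event and describe are hypothetical names, and the three fields only mirror the pid/lwp/tid idea described above. It assumes the process layer fills in pid and lwp, and the thread layer only adds tid on top.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the three fields of a ptid (hypothetical type).
struct ptid_sketch
{
  int pid;            // process id, owned by the process layer
  long lwp;           // kernel thread id, owned by the process layer
  unsigned long tid;  // user-space thread id, owned by the thread layer
};

// Process layer: reports an event using pid and lwp only.
ptid_sketch process_layer_event (int pid, long kernel_tid)
{
  return ptid_sketch {pid, kernel_tid, 0};
}

// Thread layer: keeps pid and lwp from beneath and only adds the tid.
ptid_sketch thread_layer_event (ptid_sketch from_beneath, unsigned long user_tid)
{
  from_beneath.tid = user_tid;
  return from_beneath;
}

// Pick the most specific description that the filled-in fields allow.
std::string describe (const ptid_sketch &p)
{
  if (p.tid != 0)
    return "Thread " + std::to_string (p.tid);
  if (p.lwp != 0)
    return "LWP " + std::to_string (p.lwp);
  return "Process " + std::to_string (p.pid);
}
```

With this split, a kernel thread that the user-space library does not know about would still be reported, just without a tid.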

>>While debugged in depth last two days I realised our pid_to_str
>>is needed in rs6000-aix-nat.c as control comes here in search of it.
>>If it doesn't GDB treats all threads as process.

>This is again very suspicious.  We obviously already have
>threads, so the thread layer should be initialized.  This
>means that any "pid_to_str" call should go through the
>*thread* layer (implementation in aix-thread.c).  If that
>doesn't happen, we should understand why.  (This may be the
>same problem that causes "wait" to be called from the
>wrong layer, as seen above.)

Kindly check the backtrace section [ BT:- pid_to_str stack ] below in this email. What is happening is that a thread event comes through the thread layer and a process event comes through the process layer. For example, when I press the interrupt key [Ctrl+c] in a multi-process scenario, the GDB core needs to know which process is involved. Looking at the stack, it is built on the assumption that the target will figure out, in the process layer, the kernel thread that eventually caused this event.

Secondly, kindly look at aix-thread.c:pid_to_str. It calls beneath ()->pid_to_str () in case the process is not threaded. So, we need an implementation in rs6000-aix-nat.c.

std::string
aix_thread_target::pid_to_str (ptid_t ptid)
{
  if (!PD_TID (ptid))
    return beneath ()->pid_to_str (ptid);

  return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
}
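
The fallback in pid_to_str above can be modelled outside GDB with a small sketch. Everything here is hypothetical (target_sketch, process_layer_sketch, thread_layer_sketch, and a two-field ptid_sketch); the real PD_TID check is reduced to "tid is non-zero". It only illustrates why the layer beneath must provide its own pid_to_str when the thread layer delegates.

```cpp
#include <cassert>
#include <string>

// Reduced ptid with just the fields this sketch needs (hypothetical type).
struct ptid_sketch
{
  int pid;
  unsigned long tid;
};

// Hypothetical base class for one layer in a stack of targets.
struct target_sketch
{
  explicit target_sketch (target_sketch *below) : m_beneath (below) {}
  virtual ~target_sketch () = default;
  virtual std::string pid_to_str (ptid_sketch ptid) = 0;
  target_sketch *beneath () const { return m_beneath; }

private:
  target_sketch *m_beneath;
};

// Bottom layer: knows only about processes.
struct process_layer_sketch : target_sketch
{
  process_layer_sketch () : target_sketch (nullptr) {}

  std::string pid_to_str (ptid_sketch ptid) override
  {
    return "Process " + std::to_string (ptid.pid);
  }
};

// Upper layer: handles ptids that carry a tid, delegates everything else.
struct thread_layer_sketch : target_sketch
{
  using target_sketch::target_sketch;

  std::string pid_to_str (ptid_sketch ptid) override
  {
    if (ptid.tid == 0)
      return beneath ()->pid_to_str (ptid);
    return "Thread " + std::to_string (ptid.tid);
  }
};
```

A ptid without a tid asked of the upper layer ends up in the bottom layer, which is the path the pid_to_str backtrace shows.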

>>I have added an assertion here just
>>to be sure. I get what you are thinking.
>Having an assertion is of course good, but it isn't obvious to
>me that this never can be hit.

When I ran a few unit tests I did not find any case where we might end up swapping the pid. So, I added the assertion so that if anyone hits this in the future, we are aware and can change the code accordingly.

>The point is if GDB stops because the target received a signal, it
>should automatically switch to the particular thread where the signal
>was in fact received.  I don't think this will actually happen in all
>cases with the current code.

>Shouldn't you instead check for *any* signal in get_signaled_thread?

Yes, kindly check the get_signaled_thread_rs6000 ().
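
As a rough illustration of checking for any signal rather than one specific signal, here is a sketch. It does not call any AIX interface: kthread_sketch and find_signaled_thread are hypothetical, and the per-thread records are assumed to have been collected from the kernel already.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-kernel-thread record; the fields are illustrative only.
struct kthread_sketch
{
  long tid;    // kernel thread id
  int cursig;  // signal currently reported for this thread, 0 if none
};

// Return the id of the first kernel thread that has any signal reported,
// or 0 when no thread has one.
long find_signaled_thread (const std::vector<kthread_sketch> &threads)
{
  for (const kthread_sketch &t : threads)
    if (t.cursig != 0)
      return t.tid;
  return 0;
}
```

The caller can then switch to the returned thread, or fall back to the main thread when 0 comes back.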

----------------------------------------------------

These are the changes I made while thinking about how we can handle get_signaled_thread in one place. I have also attached the outputs and programs below.

Also, now we pass ptid in some functions instead of pid in aix-thread.c.

Kindly let me know what you think.

Have a nice day ahead.

Thanks and regards,
Aditya.

The outputs below are the ones obtained by running ./gdb ./gdb

---------------------------------------------------------------------------------------------

BT:- Thread_wait


Thread 1 hit Breakpoint 1, aix_thread_target::wait (this=0x11001f758 <_aixthread.rw_+24>, ptid=...,
    status=0xffffffffffff360, options=...) at aix-thread.c:1051
1051        pid_to_prc (&ptid);
(gdb) bt
#0  aix_thread_target::wait (this=0x11001f758 <_aixthread.rw_+24>, ptid=..., status=0xffffffffffff360,
    options=...) at aix-thread.c:1051
#1  0x0000000100340778 in target_wait (ptid=..., status=0xffffffffffff360, options=...) at target.c:2598
#2  0x000000010037f158 in do_target_wait_1 (inf=0x1101713f0, ptid=..., status=0xffffffffffff360,
    options=...) at infrun.c:3763
#3  0x000000010037f41c in <lambda(inferior*)>::operator()(inferior *) const (
    __closure=0xffffffffffff130, inf=0x1101713f0) at infrun.c:3822
#4  0x000000010037f85c in do_target_wait (ecs=0xffffffffffff338, options=...) at infrun.c:3841
#5  0x0000000100380cc8 in fetch_inferior_event () at infrun.c:4201
#6  0x0000000100a1e354 in inferior_event_handler (event_type=INF_REG_EVENT) at inf-loop.c:41
#7  0x0000000100392700 in infrun_async_inferior_event_handler (data=0x0) at infrun.c:9555
#8  0x0000000100677d88 in check_async_event_handlers () at async-event.c:337
#9  0x000000010067439c in gdb_do_one_event (mstimeout=-1) at event-loop.cc:221
#10 0x0000000100001dd0 in start_event_loop () at main.c:411
#11 0x0000000100001fd8 in captured_command_loop () at main.c:471
#12 0x0000000100004150 in captured_main (data=0xffffffffffff9f0) at main.c:1330
#13 0x0000000100004224 in gdb_main (args=0xffffffffffff9f0) at main.c:1345
#14 0x0000000100000aa0 in main (argc=2, argv=0xffffffffffffa90) at gdb.c:32

------------------------------------------------------------

BT:- Post thread wait in rs6000-aix-nat::wait


(gdb) c
Continuing.

Thread 1 hit Breakpoint 2, rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...,
    ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
695           set_sigint_trap ();
(gdb) bt
#0  rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...,
    ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
#1  0x0000000100599d68 in aix_thread_target::wait (this=0x11001f758 <_aixthread.rw_+24>, ptid=...,
    status=0xffffffffffff360, options=...) at aix-thread.c:1053
#2  0x0000000100340778 in target_wait (ptid=..., status=0xffffffffffff360, options=...) at target.c:2598
#3  0x000000010037f158 in do_target_wait_1 (inf=0x1101713f0, ptid=..., status=0xffffffffffff360,
    options=...) at infrun.c:3763
#4  0x000000010037f41c in <lambda(inferior*)>::operator()(inferior *) const (
    __closure=0xffffffffffff130, inf=0x1101713f0) at infrun.c:3822
#5  0x000000010037f85c in do_target_wait (ecs=0xffffffffffff338, options=...) at infrun.c:3841
#6  0x0000000100380cc8 in fetch_inferior_event () at infrun.c:4201
#7  0x0000000100a1e354 in inferior_event_handler (event_type=INF_REG_EVENT) at inf-loop.c:41
#8  0x0000000100392700 in infrun_async_inferior_event_handler (data=0x0) at infrun.c:9555
#9  0x0000000100677d88 in check_async_event_handlers () at async-event.c:337
#10 0x000000010067439c in gdb_do_one_event (mstimeout=-1) at event-loop.cc:221
#11 0x0000000100001dd0 in start_event_loop () at main.c:411
#12 0x0000000100001fd8 in captured_command_loop () at main.c:471
#13 0x0000000100004150 in captured_main (data=0xffffffffffff9f0) at main.c:1330
#14 0x0000000100004224 in gdb_main (args=0xffffffffffff9f0) at main.c:1345
#15 0x0000000100000aa0 in main (argc=2, argv=0xffffffffffffa90) at gdb.c:32

-----------------------------------------------------------------------------------------------------
BT:- If direct rs6000-aix-nat::wait


Thread 1 hit Breakpoint 2, rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...,
    ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
695           set_sigint_trap ();
(gdb) bt
#0  rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...,
    ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
#1  0x0000000100340778 in target_wait (ptid=..., status=0xffffffffffff360, options=...) at target.c:2598
#2  0x000000010037f158 in do_target_wait_1 (inf=0x1105f4430, ptid=..., status=0xffffffffffff360,
    options=...) at infrun.c:3763
#3  0x000000010037f41c in <lambda(inferior*)>::operator()(inferior *) const (
    __closure=0xffffffffffff130, inf=0x1105f4430) at infrun.c:3822
#4  0x000000010037f85c in do_target_wait (ecs=0xffffffffffff338, options=...) at infrun.c:3841
#5  0x0000000100380cc8 in fetch_inferior_event () at infrun.c:4201
#6  0x0000000100a1e354 in inferior_event_handler (event_type=INF_REG_EVENT) at inf-loop.c:41
#7  0x0000000100392700 in infrun_async_inferior_event_handler (data=0x0) at infrun.c:9555
#8  0x0000000100677d88 in check_async_event_handlers () at async-event.c:337
#9  0x000000010067439c in gdb_do_one_event (mstimeout=-1) at event-loop.cc:221
#10 0x0000000100001dd0 in start_event_loop () at main.c:411
#11 0x0000000100001fd8 in captured_command_loop () at main.c:471
#12 0x0000000100004150 in captured_main (data=0xffffffffffff9f0) at main.c:1330
#13 0x0000000100004224 in gdb_main (args=0xffffffffffff9f0) at main.c:1345
#14 0x0000000100000aa0 in main (argc=2, argv=0xffffffffffffa90) at gdb.c:32

---------------------------------------------------------

BT:- pid_to_str stack


(gdb) bt
#0  rs6000_nat_target::pid_to_str[abi:cxx11](ptid_t) (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...)
    at rs6000-aix-nat.c:674
#1  0x00000001003409ec in target_pid_to_str[abi:cxx11](ptid_t) (ptid=...) at target.c:2623
#2  0x000000010038fc08 in normal_stop () at infrun.c:8697
#3  0x0000000100380ff4 in fetch_inferior_event () at infrun.c:4266
#4  0x0000000100a1e354 in inferior_event_handler (event_type=INF_REG_EVENT) at inf-loop.c:41
#5  0x0000000100392700 in infrun_async_inferior_event_handler (data=0x0) at infrun.c:9555
#6  0x0000000100677d88 in check_async_event_handlers () at async-event.c:337
#7  0x000000010067439c in gdb_do_one_event (mstimeout=-1) at event-loop.cc:221
#8  0x0000000100001dd0 in start_event_loop () at main.c:411
#9  0x0000000100001fd8 in captured_command_loop () at main.c:471
#10 0x0000000100004150 in captured_main (data=0xffffffffffff9f0) at main.c:1330
#11 0x0000000100004224 in gdb_main (args=0xffffffffffff9f0) at main.c:1345
#12 0x0000000100000aa0 in main (argc=2, argv=0xffffffffffffa90) at gdb.c:32



________________________________
From: Aditya Kamath1 <Aditya.Kamath1@ibm.com>
Sent: 08 December 2022 15:58
To: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>; simark@simark.ca <simark@simark.ca>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Hi Ulrich and community,

Please find the new patch [See:- 0001-Fix-multi-thread-bug-in-AIX.patch ]

I have moved the get_signaled_thread () function, which tells us which kernel thread caused the event that the debugger was waiting for, to rs6000-aix-nat.c.

So, we figure out which kernel thread might have caused an event in the rs6000-aix-nat.c code itself. If, say, the main thread becomes pthreaded, or a new user thread is created, or a user thread is deleted, then I have taken care of what to return in the sync_threadlists () code itself.

>>So, if you observe output 3 or 4, the program first multi threads,
>>I mean thread events are handled first and then the threads fork.
>>So, when this happens, I cannot return ptid_t (parent_pid). If I do
>>so, the GDB core will treat it as a new process and add it in my
>>threadlist as say process 100 despite existence of 'thread 1'
>>representing the same. So, I need to correctly send which thread
>>did the fork () event or which thread of the process is the one who
>>gave birth to a new inferior process [say 2 or 3 in output 3 below],
>>I mean which thread caused the mult process event when the process
>>is mutli threaded. This has to handled here as control from target.c
>>comes directly to rs6000-aix-nat::wait and not through
>>aix-thread.c::wait since fork () is a process event..

>So this last bit seems to be the problem.  Could you elaborate on
>what the exact call stack is?  I thought once the thread layer is
>initialized, calls to ::wait should always go through it ...

Kindly see the backtrace sections pasted below in this email:

  *   BT:- Thread_wait [taken on a thread event, such as a new thread being created or the main process becoming pthreaded],
  *   BT:- Post thread wait in rs6000-aix-nat::wait [reached through the beneath ()->wait () call in aix_thread_target::wait],
  *   BT:- If direct rs6000-aix-nat::wait [as outputs 3 and 4 {below in this email} show, control comes directly to rs6000-aix-nat.c when the main process forks after it has created threads].

>So the way this works e.g. on Linux is that the process layer handles
>both processes and the *kernel* aspect of threads, while the thread
>layer handles the *user-space* (libc/libpthread) aspect of threads.

>In terms of the GDB ptid_t, this means that both the "pid" and "lwp"
>field are "owned" by the process layer (which would be rs6000-aix-nat.c
>in your case), while only the "tid" field is owned by the thread
>layer (which would be aix-thread.c).

>Linux does that because it allows correctly debugging programs that
>only use the kernel threading capabilities without using libpthread,
>e.g. by directly calling the "clone" system call and not "pthread_create".
>Such threads won't be in the thread list managed by the user space
>library, but are still handled by the process layer in GDB, tracked
>as lwp without associated tid.

>Not sure if something like that is even possible in AIX.  If it does
>make sense to handle things similarly in AIX (one other reason would
>be ptrace commands that require LWPs, e.g. like the VSX register
>access you had in another thread), some code would indeed need
>to move, e.g. everything related to accessing *kernel* threads
>(fetch_regs_kernel_thread etc.), while code that accesses *user*
>threads via the libpthread accessors (fetch_regs_user_thread etc.)
>would still remain in aix-thread.c.

With this patch I have moved all my lwp checks into rs6000-aix-nat.c and kept the user thread handling in aix-thread.c. Yes, this will help us in the vector patch.

>>While debugged in depth last two days I realised our pid_to_str
>>is needed in rs6000-aix-nat.c as control comes here in search of it.
>>If it doesn't GDB treats all threads as process.

>This is again very suspicious.  We obviously already have
>threads, so the thread layer should be initialized.  This
>means that any "pid_to_str" call should go through the
>*thread* layer (implementation in aix-thread.c).  If that
>doesn't happen, we should understand why.  (This may be the
>same problem that causes "wait" to be called from the
>wrong layer, as seen above.)

Kindly check the backtrace section [ BT:- pid_to_str stack ] below in this email. What is happening is that a thread event comes through the thread layer and a process event comes through the process layer. For example, when I press the interrupt key [Ctrl+c] in a multi-process scenario, the GDB core needs to know which process is involved. Looking at the stack, it is built on the assumption that the target will figure out, in the process layer, the kernel thread that eventually caused this event.

Secondly, kindly look at aix-thread.c:pid_to_str. It calls beneath ()->pid_to_str () in case the process is not threaded. So, we need an implementation in rs6000-aix-nat.c.

std::string
aix_thread_target::pid_to_str (ptid_t ptid)
{
  if (!PD_TID (ptid))
    return beneath ()->pid_to_str (ptid);

  return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
}

>>I have added an assertion here just
>>to be sure. I get what you are thinking.
>Having an assertion is of course good, but it isn't obvious to
>me that this never can be hit.

When I ran a few unit tests I did not find any case where we might end up swapping the pid. So, I added the assertion so that if anyone hits this in the future, we are aware and can change the code accordingly.

>The point is if GDB stops because the target received a signal, it
>should automatically switch to the particular thread where the signal
>was in fact received.  I don't think this will actually happen in all
>cases with the current code.

>Shouldn't you instead check for *any* signal in get_signaled_thread?

Yes, kindly check the get_signaled_thread_rs6000 ().

----------------------------------------------------

These are the changes I made while thinking about how we can handle get_signaled_thread in one place. I have also attached the outputs and programs below.

Also, now we pass ptid in some functions instead of pid in aix-thread.c.

Kindly let me know what you think.

Have a nice day ahead.

Thanks and regards,
Aditya.



---------------------------------------------------------------------------------------------

BT:- Thread_wait


Thread 1 hit Breakpoint 1, aix_thread_target::wait (this=0x11001f758 <_aixthread.rw_+24>, ptid=...,
    status=0xffffffffffff360, options=...) at aix-thread.c:1051
1051        pid_to_prc (&ptid);
(gdb) bt
#0  aix_thread_target::wait (this=0x11001f758 <_aixthread.rw_+24>, ptid=..., status=0xffffffffffff360,
    options=...) at aix-thread.c:1051
#1  0x0000000100340778 in target_wait (ptid=..., status=0xffffffffffff360, options=...) at target.c:2598
#2  0x000000010037f158 in do_target_wait_1 (inf=0x1101713f0, ptid=..., status=0xffffffffffff360,
    options=...) at infrun.c:3763
#3  0x000000010037f41c in <lambda(inferior*)>::operator()(inferior *) const (
    __closure=0xffffffffffff130, inf=0x1101713f0) at infrun.c:3822
#4  0x000000010037f85c in do_target_wait (ecs=0xffffffffffff338, options=...) at infrun.c:3841
#5  0x0000000100380cc8 in fetch_inferior_event () at infrun.c:4201
#6  0x0000000100a1e354 in inferior_event_handler (event_type=INF_REG_EVENT) at inf-loop.c:41
#7  0x0000000100392700 in infrun_async_inferior_event_handler (data=0x0) at infrun.c:9555
#8  0x0000000100677d88 in check_async_event_handlers () at async-event.c:337
#9  0x000000010067439c in gdb_do_one_event (mstimeout=-1) at event-loop.cc:221
#10 0x0000000100001dd0 in start_event_loop () at main.c:411
#11 0x0000000100001fd8 in captured_command_loop () at main.c:471
#12 0x0000000100004150 in captured_main (data=0xffffffffffff9f0) at main.c:1330
#13 0x0000000100004224 in gdb_main (args=0xffffffffffff9f0) at main.c:1345
#14 0x0000000100000aa0 in main (argc=2, argv=0xffffffffffffa90) at gdb.c:32

------------------------------------------------------------

BT:- Post thread wait in rs6000-aix-nat::wait


(gdb) c
Continuing.

Thread 1 hit Breakpoint 2, rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...,
    ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
695           set_sigint_trap ();
(gdb) bt
#0  rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...,
    ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
#1  0x0000000100599d68 in aix_thread_target::wait (this=0x11001f758 <_aixthread.rw_+24>, ptid=...,
    status=0xffffffffffff360, options=...) at aix-thread.c:1053
#2  0x0000000100340778 in target_wait (ptid=..., status=0xffffffffffff360, options=...) at target.c:2598
#3  0x000000010037f158 in do_target_wait_1 (inf=0x1101713f0, ptid=..., status=0xffffffffffff360,
    options=...) at infrun.c:3763
#4  0x000000010037f41c in <lambda(inferior*)>::operator()(inferior *) const (
    __closure=0xffffffffffff130, inf=0x1101713f0) at infrun.c:3822
#5  0x000000010037f85c in do_target_wait (ecs=0xffffffffffff338, options=...) at infrun.c:3841
#6  0x0000000100380cc8 in fetch_inferior_event () at infrun.c:4201
#7  0x0000000100a1e354 in inferior_event_handler (event_type=INF_REG_EVENT) at inf-loop.c:41
#8  0x0000000100392700 in infrun_async_inferior_event_handler (data=0x0) at infrun.c:9555
#9  0x0000000100677d88 in check_async_event_handlers () at async-event.c:337
#10 0x000000010067439c in gdb_do_one_event (mstimeout=-1) at event-loop.cc:221
#11 0x0000000100001dd0 in start_event_loop () at main.c:411
#12 0x0000000100001fd8 in captured_command_loop () at main.c:471
#13 0x0000000100004150 in captured_main (data=0xffffffffffff9f0) at main.c:1330
#14 0x0000000100004224 in gdb_main (args=0xffffffffffff9f0) at main.c:1345
#15 0x0000000100000aa0 in main (argc=2, argv=0xffffffffffffa90) at gdb.c:32

-----------------------------------------------------------------------------------------------------
BT:- If direct rs6000-aix-nat::wait


Thread 1 hit Breakpoint 2, rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...,
    ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
695           set_sigint_trap ();
(gdb) bt
#0  rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...,
    ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
#1  0x0000000100340778 in target_wait (ptid=..., status=0xffffffffffff360, options=...) at target.c:2598
#2  0x000000010037f158 in do_target_wait_1 (inf=0x1105f4430, ptid=..., status=0xffffffffffff360,
    options=...) at infrun.c:3763
#3  0x000000010037f41c in <lambda(inferior*)>::operator()(inferior *) const (
    __closure=0xffffffffffff130, inf=0x1105f4430) at infrun.c:3822
#4  0x000000010037f85c in do_target_wait (ecs=0xffffffffffff338, options=...) at infrun.c:3841
#5  0x0000000100380cc8 in fetch_inferior_event () at infrun.c:4201
#6  0x0000000100a1e354 in inferior_event_handler (event_type=INF_REG_EVENT) at inf-loop.c:41
#7  0x0000000100392700 in infrun_async_inferior_event_handler (data=0x0) at infrun.c:9555
#8  0x0000000100677d88 in check_async_event_handlers () at async-event.c:337
#9  0x000000010067439c in gdb_do_one_event (mstimeout=-1) at event-loop.cc:221
#10 0x0000000100001dd0 in start_event_loop () at main.c:411
#11 0x0000000100001fd8 in captured_command_loop () at main.c:471
#12 0x0000000100004150 in captured_main (data=0xffffffffffff9f0) at main.c:1330
#13 0x0000000100004224 in gdb_main (args=0xffffffffffff9f0) at main.c:1345
#14 0x0000000100000aa0 in main (argc=2, argv=0xffffffffffffa90) at gdb.c:32

---------------------------------------------------------

BT:- pid_to_str stack


(gdb) bt
#0  rs6000_nat_target::pid_to_str[abi:cxx11](ptid_t) (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...)
    at rs6000-aix-nat.c:674
#1  0x00000001003409ec in target_pid_to_str[abi:cxx11](ptid_t) (ptid=...) at target.c:2623
#2  0x000000010038fc08 in normal_stop () at infrun.c:8697
#3  0x0000000100380ff4 in fetch_inferior_event () at infrun.c:4266
#4  0x0000000100a1e354 in inferior_event_handler (event_type=INF_REG_EVENT) at inf-loop.c:41
#5  0x0000000100392700 in infrun_async_inferior_event_handler (data=0x0) at infrun.c:9555
#6  0x0000000100677d88 in check_async_event_handlers () at async-event.c:337
#7  0x000000010067439c in gdb_do_one_event (mstimeout=-1) at event-loop.cc:221
#8  0x0000000100001dd0 in start_event_loop () at main.c:411
#9  0x0000000100001fd8 in captured_command_loop () at main.c:471
#10 0x0000000100004150 in captured_main (data=0xffffffffffff9f0) at main.c:1330
#11 0x0000000100004224 in gdb_main (args=0xffffffffffff9f0) at main.c:1345
#12 0x0000000100000aa0 in main (argc=2, argv=0xffffffffffffa90) at gdb.c:32


---------------------------------------------------------------------


Program 1:- [Credits gdb.threads/continue-pending-status.c]


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 3

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */
  pthread_barrier_wait (&barrier);

  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    sleep (1);

  return 0;
}



----------------------------------------------------------------
Output 1:- Single process


Reading symbols from /home/aditya/gdb_tests/continue-pending-status...
(gdb) r
Starting program: /home/aditya/gdb_tests/continue-pending-status
^C[New Thread 258]
[New Thread 515]
[New Thread 772]

Thread 3 received signal SIGINT, Interrupt.
[Switching to Thread 515]
thread_function (arg=0x0)
    at /home/aditya/gdb_tests/continue-pending-status.c:36
36        while (1); /* break here */
(gdb) info threads
  Id   Target Id                          Frame
  1    Thread 1 (tid 24838585, running)   warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

  2    Thread 258 (tid 23134635, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at /home/aditya/gdb_tests/continue-pending-status.c:36
* 3    Thread 515 (tid 30146867, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at /home/aditya/gdb_tests/continue-pending-status.c:36
  4    Thread 772 (tid 27853165, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at /home/aditya/gdb_tests/continue-pending-status.c:36

---------------------------------------------------------------------------------

Program 2:- Multi process Code


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else{
    printf (" Iam child \n");
    child = fork ();
    if (child > 0)
      printf ("From child I became a parent \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
    break;
  }

  return 0;
}


-------------------------------------------------------------------------

Output 2:- With detach-on-fork on


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New Thread 515]
[Detaching after fork from child process 8323572]
 Iam child
I am grandchild
From child I became a parent
I am parent
[Detaching after fork from child process 11665884]
 Iam child
I am grandchild
From child I became a parent
I am parent
^C
Thread 2 received signal SIGINT, Interrupt.
[Switching to Thread 258]
thread_function (arg=0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
32        while (1); /* break here */
(gdb) info threads
  Id   Target Id                          Frame
  1    Thread 1 (tid 27263269, running)   warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

* 2    Thread 258 (tid 28705075, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
  3    Thread 515 (tid 27853169, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32


-------------------------------------------------------------------------

Output 3:- With detach-on-fork off


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New Thread 515]
[New inferior 2 (Process 15466928)]
[New inferior 3 (Process 13894048)]
I am parent
I am parent
^C
Thread 1.1 received signal SIGINT, Interrupt.
[Switching to Thread 1]
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info threads
  Id   Target Id         Frame
* 1.1  Thread 1          0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.2  Thread 258        0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.3  Thread 515        0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  2.1  Process 15466928  0xd0594fc8 in ?? ()
  3.1  Process 13894048  0xd0594fc8 in ?? ()

--------------------------------------------------

Output 4:- With detach-on-fork off and follow-fork-mode child


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) set follow-fork-mode child
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New Thread 515]
[Attaching after Thread 515 fork to child Process 13894050]
[New inferior 2 (Process 13894050)]
 Iam child
[Attaching after Process 13894050 fork to child Process 11010474]
[New inferior 3 (Process 11010474)]
I am grandchild
^CReading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

Thread 3.1 received signal SIGINT, Interrupt.
[Switching to Process 11010474]
thread_function (arg=0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
32        while (1); /* break here */
(gdb) info threads
  Id   Target Id         Frame
  1.1  Thread 1          0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.2  Thread 258        0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.3  Thread 515        0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  2.1  Process 13894050  0xd0594fc8 in ?? ()
* 3.1  Process 11010474  thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32




________________________________
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Sent: 06 December 2022 00:03
To: simark@simark.ca <simark@simark.ca>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>>I'm not sure why it is necessary to handle this in the process layer
>>(rs6000-aix-nat.c) instead of the thread layer (aix-thread.c).
>>What specifically breaks if you do not have these rs6000-aix-nat.c
>>changes?
>
>So, if you observe output 3 or 4, the program first multi threads,
>I mean thread events are handled first and then the threads fork.
>So, when this happens, I cannot return ptid_t (parent_pid). If I do
>so, the GDB core will treat it as a new process and add it in my
>threadlist as say process 100 despite existence of 'thread 1'
>representing the same. So, I need to correctly send which thread
>did the fork () event or which thread of the process is the one who
>gave birth to a new inferior process [say 2 or 3 in output 3 below],
>I mean which thread caused the mult process event when the process
>is mutli threaded. This has to handled here as control from target.c
>comes directly to rs6000-aix-nat::wait and not through
>aix-thread.c::wait since fork () is a process event..

So this last bit seems to be the problem.  Could you elaborate on
what the exact call stack is?  I thought once the thread layer is
initialized, calls to ::wait should always go through it ...

>>If you *do* need to handle LWPs (kernel thread IDs) in the process
>>layer (this can be a reasonable choice, and it is done by several other
>>native targets), then it should be *consistent*, and *all* LWP handling
>>should be in the process layer. In particular, under no circumstances
>>does it make sense to duplicate the "find current/signalled thread"
>>code in *both* the process and thread layers.
>
>This is not straightforward to do. The reason is that, if our application is pthreaded,
>we need our sync_threadlists() code to detect multiple threads and sync them.
>We cannot handle this in rs6000-aix-nat.c with the current design of the code.
>If, say, the child process is multi-threaded, things can get complex.
>It would require us to move the whole GDB list and pthread list sync code to
>rs6000-aix-nat.c. The essence, or USP [Unique Selling Proposition],
>of the aix-thread.c code would be lost.

So the way this works e.g. on Linux is that the process layer handles
both processes and the *kernel* aspect of threads, while the thread
layer handles the *user-space* (libc/libpthread) aspect of threads.

In terms of the GDB ptid_t, this means that both the "pid" and "lwp"
field are "owned" by the process layer (which would be rs6000-aix-nat.c
in your case), while only the "tid" field is owned by the thread
layer (which would be aix-thread.c).

Linux does that because it allows correctly debugging programs that
only use the kernel threading capabilities without using libpthread,
e.g. by directly calling the "clone" system call and not "pthread_create".
Such threads won't be in the thread list managed by the user space
library, but are still handled by the process layer in GDB, tracked
as lwp without associated tid.
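
The field ownership described above can be illustrated with a minimal
sketch. It assumes a simplified stand-in type, sketch_ptid, and helper
names (process_layer_event, thread_layer_annotate, describe) that are
hypothetical and not part of GDB; GDB's real ptid_t has a different
definition.

```cpp
#include <string>

// Hypothetical stand-in for GDB's ptid_t; field names follow the discussion.
struct sketch_ptid
{
  int pid;            // owned by the process layer
  long lwp;           // kernel thread id, owned by the process layer
  unsigned long tid;  // user-space (pthread) id, owned by the thread layer

  // True when only the pid part is set.
  bool is_pid () const { return lwp == 0 && tid == 0; }
};

// Process layer: reports the kernel thread that triggered the event.
inline sketch_ptid
process_layer_event (int pid, long lwp)
{
  return sketch_ptid {pid, lwp, 0};
}

// Thread layer: only fills in the user-space id on top of what it was given.
inline sketch_ptid
thread_layer_annotate (sketch_ptid from_below, unsigned long pthread_id)
{
  from_below.tid = pthread_id;
  return from_below;
}

// Describe a ptid by the most specific field that is set.
inline std::string
describe (const sketch_ptid &p)
{
  if (p.tid != 0)
    return "Thread " + std::to_string (p.tid);
  if (p.lwp != 0)
    return "LWP " + std::to_string (p.lwp);
  return "process " + std::to_string (p.pid);
}
```

In this sketch a kernel-only thread (created without libpthread) would
simply keep tid == 0 and still be tracked by its lwp.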

Not sure if something like that is even possible in AIX.  If it does
make sense to handle things similarly in AIX (one other reason would
be ptrace commands that require LWPs, e.g. like the VSX register
access you had in another thread), some code would indeed need
to move, e.g. everything related to accessing *kernel* threads
(fetch_regs_kernel_thread etc.), while code that accesses *user*
threads via the libpthread accessors (fetch_regs_user_thread etc.)
would still remain in aix-thread.c.


>>>[Switching to process 16777620]
>
>>This outputs inferior_ptid ...
>
>Yes, you were right
>
>>>* 1.1  process 16777620  0xd0595fb0 in _p_nsleep ()
>>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>>  1.2  process 16777620  0xd0595fb0 in _p_nsleep ()
>>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>>  1.3  process 16777620  0xd0595fb0 in _p_nsleep ()
>>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>>  2.1  process 8323570   0xd0594fc8 in ?? ()
>>>  3.1  process 17957172  0xd0594fc8 in ?? ()
>
>>... and this outputs the ptid values for those threads.
>
>>If it says "process ...", then those ptid values have not
>>properly been switched over to the (pid, lwp, tid) format.

>While debugging in depth over the last two days I realised that our pid_to_str
>is needed in rs6000-aix-nat.c, as control comes here in search of it.
>If it isn't there, GDB treats all threads as processes.

This is again very suspicious.  We obviously already have
threads, so the thread layer should be initialized.  This
means that any "pid_to_str" call should go through the
*thread* layer (implementation in aix-thread.c).  If that
doesn't happen, we should understand why.  (This may be the
same problem that causes "wait" to be called from the
wrong layer, as seen above.)


>>You should verify that the sync_threadlists code handles
>>all multi-process cases correctly.  I haven't looked at
>>this in detail, but are you sure that here:
>
>>>@@ -841,8 +829,22 @@ sync_threadlists (int pid)
> >>            }
> >>          else if (cmp_result > 0)
> >>            {
>>>-             delete_thread (gbuf[gi]);
>
>
>>you never accidentally switch the *pid* part (if "gptid"
>>belongs to a different pid than "pptid")?
>
>So, this is not the reason. I have added an assertion here just
>to be sure. I get what you are thinking.

Having an assertion is of course good, but it isn't obvious to
me that this never can be hit.


>>Hmm.  So when "wait" returns, it needs to determine which thread
>>triggered the event that caused ptrace to stop.  On Linux, "wait"
>>will actually return the LWP of that thread, so it can be directly
>>used.  It seems on AIX, "wait" only returns a PID, and you do not
>>immediately know which thread caused the event?
>
>>In that case, I can see why you'd have to consider SIGINT as well
>>as SIGTRAP. However, it seems to me that even those two are not the
>>*only* cases that can cause "wait" to return - doesn't *any* signal
>>(potentially) trigger a ptrace intercept (causing wait to return)?
>
>>But that's probably a more general problem, and wouldn't occur in
>>this simple test case.
>
>Exactly. So I tried debugging a few examples that raise some of the other signals
>mentioned in this document [https://www.ibm.com/docs/en/sdk-java-technology/8?topic=reference-signal-handling].
>On AIX we have most of the signals mentioned in the link. Our debugger code
>is not blocked and does not crash in case of a segmentation fault signal.
>Abort also works fine. Let me know what you think.

The point is if GDB stops because the target received a signal, it
should automatically switch to the particular thread where the signal
was in fact received.  I don't think this will actually happen in all
cases with the current code.

Shouldn't you instead check for *any* signal in get_signaled_thread?


Bye,
Ulrich


[-- Attachment #2: 0001-Fix-multi-thread-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 12381 bytes --]

From bba91f717b7779f39d71282835cceaaeda7ef588 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Thu, 8 Dec 2022 01:03:58 -0600
Subject: [PATCH] Fix multi thread bug in AIX

---
 gdb/aix-thread.c     | 149 ++++++++++++++++---------------------------
 gdb/rs6000-aix-nat.c |  66 ++++++++++++++++++-
 2 files changed, 118 insertions(+), 97 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..6e6f8619b64 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -508,14 +508,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
     if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+      inferior_ptid = ptid_t (user_current_pid);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -639,36 +638,24 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
+/* iterate_over_threads() callback for counting GDB threads.  */
 
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
 
 static int
 giter_count (struct thread_info *thread, void *countp)
 {
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
+  (*(int *) countp)++;
   return 0;
 }
 
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* iterate_over_threads() callback for accumulating GDB thread pids.  */
 
 static int
 giter_accum (struct thread_info *thread, void *bufp)
 {
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
+  **(struct thread_info ***) bufp = thread;
+  (*(struct thread_info ***) bufp)++;
+    
   return 0;
 }
 
@@ -703,30 +690,6 @@ gcmp (const void *t1v, const void *t2v)
   return ptid_cmp (t1->ptid, t2->ptid);
 }
 
-/* Search through the list of all kernel threads for the thread
-   that has stopped on a SIGTRAP signal, and return its TID.
-   Return 0 if none found.  */
-
-static pthdb_tid_t
-get_signaled_thread (int pid)
-{
-  struct thrdsinfo64 thrinf;
-  tid_t ktid = 0;
-
-  while (1)
-    {
-      if (getthrds (pid, &thrinf,
-		    sizeof (thrinf), &ktid, 1) != 1)
-	break;
-
-      if (thrinf.ti_cursig == SIGTRAP)
-	return thrinf.ti_tid;
-    }
-
-  /* Didn't find any thread stopped on a SIGTRAP signal.  */
-  return 0;
-}
-
 /* Synchronize GDB's thread list with libpthdebug's.
 
    There are some benefits of doing this every time the inferior stops:
@@ -740,8 +703,8 @@ get_signaled_thread (int pid)
      - simplifies the demands placed on libpthdebug, which seems to
        have difficulty with certain call patterns */
 
-static void
-sync_threadlists (int pid)
+static ptid_t 
+sync_threadlists (ptid_t ptid)
 {
   int cmd, status;
   int pcount, psize, pi, gcount, gi;
@@ -750,6 +713,15 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+	= current_inferior ()->process_target ();
+  thread_info *tp;
+  pid_t pid = ptid.pid ();
+
+  /* This ptid should hold the ptid of a new thread 
+     or return the incoming ptid incase of delete thread.  */
+
+  ptid_t post_sync_ptid = ptid;
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -810,12 +782,10 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
-					 ptid_t (pid, 0, pbuf[pi].pthid),
+					 ptid_t (pid, pbuf[pi].tid, pbuf[pi].pthid),
 					 priv);
-
+	  post_sync_ptid = thread->ptid;
 	  pi++;
 	}
       else
@@ -823,7 +793,7 @@ sync_threadlists (int pid)
 	  ptid_t pptid, gptid;
 	  int cmp_result;
 
-	  pptid = ptid_t (pid, 0, pbuf[pi].pthid);
+	  pptid = ptid_t (pid, pbuf[pi].tid, pbuf[pi].pthid);
 	  gptid = gbuf[gi]->ptid;
 	  pdtid = pbuf[pi].pdtid;
 	  tid = pbuf[pi].tid;
@@ -841,15 +811,31 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      if (gptid.is_pid ())
+		{
+		  gdb_assert (gptid.pid () == pptid.pid ());
+		  thread_change_ptid (proc_target, gptid, pptid);
+		  aix_thread_info *priv = new aix_thread_info;
+		  priv->pdtid = pbuf[pi].pdtid;
+		  priv->tid = pbuf[pi].tid;
+		  tp = find_thread_ptid (proc_target, pptid);
+		  tp->priv.reset (priv);
+		  pi++;
+		  gi++;
+		  post_sync_ptid = pptid;
+		}
+	      else
+		{
+		  delete_thread (gbuf[gi]);
+		  gi++;
+		}
 	    }
 	  else
 	    {
 	      process_stratum_target *proc_target
 		= current_inferior ()->process_target ();
 	      thread = add_thread (proc_target, pptid);
-
+	      post_sync_ptid = pptid;	
 	      aix_thread_info *priv = new aix_thread_info;
 	      thread->priv.reset (priv);
 	      priv->pdtid = pdtid;
@@ -861,18 +847,7 @@ sync_threadlists (int pid)
 
   xfree (pbuf);
   xfree (gbuf);
-}
-
-/* Iterate_over_threads() callback for locating a thread, using
-   the TID of its associated kernel thread.  */
-
-static int
-iter_tid (struct thread_info *thread, void *tidp)
-{
-  const pthdb_tid_t tid = *(pthdb_tid_t *)tidp;
-  aix_thread_info *priv = get_aix_thread_info (thread);
-
-  return priv->tid == tid;
+  return post_sync_ptid;
 }
 
 /* Synchronize libpthdebug's state with the inferior and with GDB,
@@ -881,33 +856,23 @@ iter_tid (struct thread_info *thread, void *tidp)
    return a pid-only ptid with PID.  */
 
 static ptid_t
-pd_update (int pid)
+pd_update (ptid_t ptid)
 {
   int status;
-  ptid_t ptid;
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
+  ptid_t post_sync_ptid;
 
   if (!pd_active)
-    return ptid_t (pid);
+    return ptid;
 
   status = pthdb_session_update (pd_session);
   if (status != PTHDB_SUCCESS)
-    return ptid_t (pid);
+    return ptid;
 
-  sync_threadlists (pid);
+  post_sync_ptid = sync_threadlists (ptid);
 
-  /* Define "current thread" as one that just received a trap signal.  */
-
-  tid = get_signaled_thread (pid);
-  if (tid != 0)
-    thread = iterate_over_threads (iter_tid, &tid);
-  if (!thread)
-    ptid = ptid_t (pid);
-  else
-    ptid = thread->ptid;
-
-  return ptid;
+  return post_sync_ptid;
 }
 
 /* Try to start debugging threads in the current process.
@@ -915,19 +880,19 @@ pd_update (int pid)
    for that thread.  Otherwise, return a ptid-only ptid using PID.  */
 
 static ptid_t
-pd_activate (int pid)
+pd_activate (ptid_t ptid)
 {
   int status;
 		
-  status = pthdb_session_init (pid, arch64 ? PEM_64BIT : PEM_32BIT,
+  status = pthdb_session_init (ptid.pid (), arch64 ? PEM_64BIT : PEM_32BIT,
 			       PTHDB_FLAG_REGS, &pd_callbacks, 
 			       &pd_session);
   if (status != PTHDB_SUCCESS)
     {
-      return ptid_t (pid);
+      return ptid;
     }
   pd_active = 1;
-  return pd_update (pid);
+  return pd_update (ptid);
 }
 
 /* Undo the effects of pd_activate().  */
@@ -983,7 +948,7 @@ pd_enable (void)
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
      activate thread debugging.  */
-  pd_activate (inferior_ptid.pid ());
+  pd_activate (inferior_ptid);
 }
 
 /* Undo the effects of pd_enable().  */
@@ -1091,10 +1056,6 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
   if (ptid.pid () == -1)
     return ptid_t (-1);
 
-  /* The target beneath does not deal with threads, so it should only return
-     pid-only ptids.  */
-  gdb_assert (ptid.is_pid ());
-
   /* Check whether libpthdebug might be ready to be initialized.  */
   if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
@@ -1106,10 +1067,10 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
 
       if (regcache_read_pc (regcache)
 	  - gdbarch_decr_pc_after_break (gdbarch) == pd_brk_addr)
-	return pd_activate (ptid.pid ());
+	return pd_activate (ptid);
     }
 
-  return pd_update (ptid.pid ());
+  return pd_update (ptid);
 }
 
 /* Record that the 64-bit general-purpose registers contain VALS.  */
diff --git a/gdb/rs6000-aix-nat.c b/gdb/rs6000-aix-nat.c
index 2ac1f6e70b6..fe7ca4f3f2a 100644
--- a/gdb/rs6000-aix-nat.c
+++ b/gdb/rs6000-aix-nat.c
@@ -99,6 +99,8 @@ class rs6000_nat_target final : public inf_ptrace_target
      support.  */
   void follow_fork (inferior *, ptid_t, target_waitkind, bool, bool) override;
 
+  std::string pid_to_str (ptid_t) override;
+
 protected:
 
   void post_startup_inferior (ptid_t ptid) override;
@@ -619,6 +621,64 @@ rs6000_nat_target::xfer_partial (enum target_object object,
     }
 }
 
+/* Search through the list of all kernel threads for the thread
+   that has stopped on a signal (any signal, not only SIGTRAP or
+   SIGINT), and return its TID.  Return 0 if none found.  */
+
+static tid_t
+get_signaled_thread_rs6000 (int pid)
+{
+  struct thrdsinfo64 thrinf;
+  tid_t ktid = 0;
+
+  while (1)
+    {
+      if (getthrds (pid, &thrinf,
+                    sizeof (thrinf), &ktid, 1) != 1)
+        break;
+
+      if (thrinf.ti_cursig != 0)
+        return thrinf.ti_tid;
+    }
+
+  /* Didn't find any thread stopped on a signal.  */
+  return 0;
+}
+
+/* If my process is pthreaded I need to return that ptid else ptid_t
+   (pid).  */
+
+static ptid_t
+find_the_return_ptid (pid_t pid)
+{
+  ptid_t ptid = ptid_t (pid);
+  process_stratum_target *proc_target
+        = current_inferior ()->process_target ();
+  inferior *inf = find_inferior_pid (proc_target, pid);
+  thread_info *tp = find_thread_ptid (inf, ptid_t (pid));
+  if (tp == nullptr)
+    for (thread_info *tp1 : inf->threads ())
+       if (tp1->ptid.lwp () == get_signaled_thread_rs6000 (pid))
+         return tp1->ptid;
+  return ptid;
+}
+
+/* Returning "thread" or "process" info as control comes here 
+   during a process switch in multi process debugging.  This 
+   is needed for "info threads" command as a process can be
+   threaded or non threaded in multi process case.  */
+
+std::string
+rs6000_nat_target::pid_to_str (ptid_t ptid)
+{
+  if (ptid.tid () != 0)
+    return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
+
+  else
+    return string_printf (_("Process %s"), pulongest (ptid.pid ()));
+}
+
+
 /* Wait for the child specified by PTID to do something.  Return the
    process ID of the child, or MINUS_ONE_PTID in case of error; store
    the status in *OURSTATUS.  */
@@ -672,7 +732,7 @@ rs6000_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
 	      if (parent_pid > 0)
 		{
 		  ourstatus->set_forked (ptid_t (pid));
-		  return ptid_t (parent_pid);
+		  return find_the_return_ptid (parent_pid);
 		}
 	      aix_remember_child (pid);
 	    }
@@ -687,7 +747,7 @@ rs6000_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
 	      if (child_pid > 0)
 		{
 		  ourstatus->set_forked (ptid_t (child_pid));
-		  return ptid_t (pid);
+		  return find_the_return_ptid (pid);
 		}
 	      aix_remember_parent (pid);
 	    }
@@ -712,7 +772,7 @@ rs6000_nat_target::wait (ptid_t ptid, struct target_waitstatus *ourstatus,
   else
     *ourstatus = host_status_to_waitstatus (status);
 
-  return ptid_t (pid);
+  return find_the_return_ptid (pid);
 }
 \f
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-12-08 10:28                             ` Aditya Kamath1
  2022-12-08 10:46                               ` Aditya Kamath1
@ 2022-12-08 16:29                               ` Ulrich Weigand
  2022-12-15 12:58                                 ` Aditya Kamath1
  1 sibling, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2022-12-08 16:29 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>>So this last bit seems to be the problem.  Could you elaborate on
>>what the exact call stack is?  I thought once the thread layer is
>>initialized, calls to ::wait should always go through it ...
>
>Kindly see the backtrace sections 
>BT:- Thread_wait [which is on a thread event like new thread born or main process is pthreaded],  
>BT:- Post thread wait in rs6000-aix-nat::wait  [which is the beneath ()->wait () in aix_thread_target::wait], 
>BT:- If direct rs6000-aix-nat::wait [ where in output 3 and 4 {below in this email} you can see it will directly come to rs6000-aix-nat.c if the main process after having threads forks or uses a fork () call ] pasted below in this email. 

I'm only replying to this right now, because that seems to
be the fundamental problem that ultimately causes a lot of the
other issues you're seeing.

It seems the core problem is that you're not initializing the
thread layer correctly for any but the first inferior!  So all
other inferiors started with fork are assumed to be single-
threaded ...

If you look at a backtrace like this:

>BT:- If direct rs6000-aix-nat::wait
>
>Thread 1 hit Breakpoint 2, rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=..., 
>    ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
>695           set_sigint_trap ();
>(gdb) bt
>#0  rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=..., 
>    ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
>#1  0x0000000100340778 in target_wait (ptid=..., status=0xffffffffffff360, options=...) at target.c:2598

you see that the target.c code uses the current inferior's
"top_target" to find the appropriate target routines:

  target_ops *target = current_inferior ()->top_target ();
[...]
      ptid_t event_ptid = target->wait (ptid, status, options);

For a multi-threaded process "top_target" *should* point to
aix_thread_ops, which is achieved by this call in pd_enable:
  current_inferior ()->push_target (&aix_thread_ops);

However, note that this is applied only to *one* inferior.
You actually need to do this for *all* new inferiors as soon
as they are detected to become multi-threaded.
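
A minimal sketch of why the push has to happen per inferior. All names
here (sketch_target, process_layer, thread_layer, sketch_inferior,
sketch_target_wait) are hypothetical miniatures, not GDB's real classes,
and the thread layer does not forward to the layer beneath as the real
one does.

```cpp
#include <string>
#include <vector>

// Hypothetical miniature of a target; not GDB's target_ops.
struct sketch_target
{
  virtual ~sketch_target () = default;
  virtual std::string wait () = 0;
};

struct process_layer : sketch_target
{
  std::string wait () override { return "process-layer wait"; }
};

struct thread_layer : sketch_target
{
  std::string wait () override { return "thread-layer wait"; }
};

// Each inferior owns its own target stack; back () is the top target.
struct sketch_inferior
{
  std::vector<sketch_target *> stack;

  void push_target (sketch_target *t) { stack.push_back (t); }
  sketch_target *top_target () const { return stack.back (); }
};

// Same shape as the target.c snippet above: dispatch through the
// given inferior's own top target.
inline std::string
sketch_target_wait (const sketch_inferior &inf)
{
  return inf.top_target ()->wait ();
}
```

An inferior that never had the thread layer pushed keeps dispatching to
the process layer, which matches the backtrace quoted above.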

This does not happen because aix-thread.c currently has a static
global pd_able variable that applies to GDB as a whole.  Back in
the days where this was introduced, that was probably correct
since a single GDB session could only debug one single inferior
back then.  But for multiple inferiors, any of which can be
multi-threaded, this does not work.
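
One possible shape for per-inferior state, as a minimal sketch only. The
names sketch_pd_state and sketch_pd_registry are hypothetical, and keying
by pid in a map is an assumption made for illustration; it is not a
statement of how aix-thread.c should finally be organized.

```cpp
#include <unordered_map>

// Hypothetical per-inferior record replacing the file-wide flags.
struct sketch_pd_state
{
  bool pd_able = false;    // this inferior is debuggable by pthdb
  bool pd_active = false;  // thread debugging is active for this inferior
};

class sketch_pd_registry
{
public:
  // Look up the state for PID, creating a default entry on first use.
  sketch_pd_state &get (int pid) { return m_states[pid]; }

  // Drop the state for PID, e.g. when its inferior exits.
  void forget (int pid) { m_states.erase (pid); }

  // Query without creating an entry as a side effect.
  bool active (int pid) const
  {
    auto it = m_states.find (pid);
    return it != m_states.end () && it->second.pd_active;
  }

private:
  std::unordered_map<int, sketch_pd_state> m_states;
};
```

The point of the sketch is only that enabling thread debugging for one
pid leaves every other pid's state untouched.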


I think you should first of all work on fixing this, and then
go back to validating your test scenarios without any of the
other changes - many of those likely will no longer be
necessary then.


Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-12-08 16:29                               ` Ulrich Weigand
@ 2022-12-15 12:58                                 ` Aditya Kamath1
  2022-12-15 15:53                                   ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2022-12-15 12:58 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 9278 bytes --]

Hi Ulrich and community,

Please find the new patch [See:- 0001-Fix-multi-thread-bug-in-AIX.patch ].

I understood your previous email, and what you are saying is correct. If we fix this top target, we can leave the process layer undisturbed.

Having said that, I am facing a few obstacles in achieving this. Kindly note that all outputs I paste in this mail were generated with the "set debug aix-thread" and "set detach-on-fork off" commands.

We first look up the symbol where we need to place a trap, so that the debugger is notified when there is an event it needs to catch. In our case this is a thread-create event, which is caught by a condition in the aix-thread.c wait layer; we then call pd_update (), which calls sync_threadlists (), to pick it up. For this to happen, the main requirement is that the symbol "__n_pthreads" has an address in the symbol table. This symbol is checked when a new object file is loaded, via pd_enable (), where we use pthdb_session_pthreaded () to do the check. If that succeeds, we get into pd_update () and do our work, and we push aix-thread.c as the top target.

So, all this happens correctly for program 1's parent {code attached below} as shown in the output below.

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
pdc_symbol_addrs (user_current_pid = 17957132, symbols = 0xfffffffffffdbc8, count = 1)
  symbols[0].name = "__n_pthreads"
 returning PDC_FAILURE
pdc_symbol_addrs (user_current_pid = 17957132, symbols = 0xfffffffffffdbc8, count = 1)
  symbols[0].name = "__n_pthreads"
  symbols[0].addr = 0xf0807334
 returning PDC_SUCCESS
pdc_read_data (user_current_pid = 17957132, buf = 0xfffffffffffdbc0, addr = 0xf0807334, len = 4)
  status=0, returning SUCCESS
pdc_symbol_addrs (user_current_pid = 17957132, symbols = 0xfffffffffff88c8, count = 1)
  symbols[0].name = "__n_pthreads"
  symbols[0].addr = 0xf0807334

So, after this the first inferior works fine. When the second or the third inferior comes into the picture, we go from the new objfile () handler to pd_enable () and then to pthdb_session_pthreaded (). Here we fail for the new inferior, as shown in the output below.


[New Thread 258]
[New Thread 515]
fetch_regs_kernel_thread tid=225018d regno=64 arch64=0
[New inferior 2 (process 8061286)]
pdc_free (user_current_pid = 17957132, buf = 0x11016f370)
pdc_free (user_current_pid = 17957132, buf = 0x11016f3b0)
pdc_free (user_current_pid = 17957132, buf = 0x11016f4f0)
pdc_free (user_current_pid = 17957132, buf = 0x1104e3a70)
pdc_free (user_current_pid = 17957132, buf = 0x1108af0d0)
pdc_symbol_addrs (user_current_pid = 17957132, symbols = 0xfffffffffffdef8, count = 1)
  symbols[0].name = "__n_pthreads"
 returning PDC_FAILURE
pdc_symbol_addrs (user_current_pid = 8061286, symbols = 0xfffffffffffe248, count = 1)
  symbols[0].name = "__n_pthreads"
 returning PDC_FAILURE
I am parent
[New process 17957132]
[New inferior 3 (process 17433000)]
pdc_symbol_addrs (user_current_pid = 17957132, symbols = 0xfffffffffffdef8, count = 1)
  symbols[0].name = "__n_pthreads"
 returning PDC_FAILURE

Since it could not read the symbol "__n_pthreads", it failed, and we could not set the thread layer as the top target for the new process. I could not find out why this happens. If the parent is pthreaded, the child should be too, since everything in the parent must be copied to the child. So I should see my child as pthreaded as well, with the "__n_pthreads" symbol resolved in the child process.

Thus, our top target remained the process layer. In target.c, when our event goes to wait, the current inferior is the child, and its top target is the process layer. The process layer recognised the process correctly, but since our parent is threaded, we do not have a ptid_t (pid) entry for it. Hence the line [New process 17957132] appeared in the output.

I did try searching in xcoffread.c, but I felt I was looking in the wrong place for things which the pthread debug library should define for us.

This is where I need guidance. Your help would be valuable in solving this problem for AIX and the GDB community. Kindly guide me with your expertise and let me know what you think. I have given all the information I can from my understanding so far. Let me know if you need more information to guide me.

Waiting for a reply soon.

Have a nice day ahead.

Regards,
Aditya.

-----------------------------------
PROGRAM 1


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
dummy_thread_function (void *arg)
{
  printf ("Bye from dummy thread \n");
  return NULL;
}

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
    {
      printf ("I am child \n");
      child = fork ();
      if (child > 0)
        printf ("From child I became a parent \n");
      else
        {
          printf ("I am grandchild \n");
          pthread_t thread;
          pthread_create (&thread, NULL,
                          dummy_thread_function, NULL);
        }
    }
  while (1); /* break here */
}

int
main (void)
{
  int i;
  pthread_t thread[NUM_THREADS];

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      int res;

      res = pthread_create (&thread[i], NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    {
      sleep (15);
    }

  return 0;
}












[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 7954 bytes --]

From aef6803b770f0555d16fa5bfbdda3be8c1901ad5 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Thu, 15 Dec 2022 06:25:28 -0600
Subject: [PATCH] Fix multi thread debug bug in AIX

This is a temporary fix. It is work in progress
---
 gdb/aix-thread.c | 90 +++++++++++++++++++++++++++---------------------
 1 file changed, 50 insertions(+), 40 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..0e4d43fb488 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -55,6 +55,7 @@
 #include <sys/reg.h>
 #include <sched.h>
 #include <sys/pthdebug.h>
+#include <vector>
 
 #if !HAVE_DECL_GETTHRDS
 extern int getthrds (pid_t, struct thrdsinfo64 *, int, tid_t *, int);
@@ -70,7 +71,7 @@ static bool debug_aix_thread;
 
 /* Return whether to treat PID as a debuggable thread id.  */
 
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
+#define PD_TID(ptid)	(std::find (pd_active.begin (), pd_active.end (), ptid.pid ()) != pd_active.end () && ptid.tid () != 0)
 
 /* Success and failure values returned by pthdb callbacks.  */
 
@@ -151,11 +152,11 @@ static CORE_ADDR pd_brk_addr;
 
 /* Whether the current application is debuggable by pthdb.  */
 
-static int pd_able = 0;
+static std::vector<pid_t> pd_able;
 
 /* Whether a threaded application is being debugged.  */
 
-static int pd_active = 0;
+static std::vector<pid_t> pd_active;
 
 /* Whether the current architecture is 64-bit.  
    Only valid when pd_able is true.  */
@@ -508,14 +509,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
     if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+      inferior_ptid = ptid_t (user_current_pid);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -639,36 +639,23 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* iterate_over_threads() callback for counting GDB threads.  */
 
 static int
 giter_count (struct thread_info *thread, void *countp)
 {
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
+  (*(int *) countp)++;
   return 0;
 }
 
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* iterate_over_threads() callback for accumulating GDB thread pids.  */
 
 static int
 giter_accum (struct thread_info *thread, void *bufp)
 {
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
+  **(struct thread_info ***) bufp = thread;
+  (*(struct thread_info ***) bufp)++;
+
   return 0;
 }
 
@@ -719,7 +706,7 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -750,6 +737,9 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -810,8 +800,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -841,8 +829,23 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+              if (gptid.is_pid ())
+                {
+                  gdb_assert (gptid.pid () == pptid.pid ());
+                  thread_change_ptid (proc_target, gptid, pptid);
+                  aix_thread_info *priv = new aix_thread_info;
+                  priv->pdtid = pbuf[pi].pdtid;
+                  priv->tid = pbuf[pi].tid;
+                  tp = find_thread_ptid (proc_target, pptid);
+                  tp->priv.reset (priv);
+                  pi++;
+                  gi++;
+                }
+              else
+                {
+                  delete_thread (gbuf[gi]);
+                  gi++;
+                }
 	    }
 	  else
 	    {
@@ -888,7 +891,8 @@ pd_update (int pid)
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
 
-  if (!pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (), pid)
+    == pd_active.end ())
     return ptid_t (pid);
 
   status = pthdb_session_update (pd_session);
@@ -926,7 +930,7 @@ pd_activate (int pid)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
+  pd_active.push_back (pid);
   return pd_update (pid);
 }
 
@@ -935,12 +939,13 @@ pd_activate (int pid)
 static void
 pd_deactivate (void)
 {
-  if (!pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (), inferior_ptid.pid ())
+    == pd_active.end ())
     return;
   pthdb_session_destroy (pd_session);
   
   pid_to_prc (&inferior_ptid);
-  pd_active = 0;
+  pd_active.erase (std::find (pd_active.begin (), pd_active.end (), inferior_ptid.pid ()));
 }
 
 /* An object file has just been loaded.  Check whether the current
@@ -954,7 +959,8 @@ pd_enable (void)
   struct bound_minimal_symbol ms;
 
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (std::find (pd_able.begin (), pd_able.end (), inferior_ptid.pid ()) 
+	!= pd_able.end ())
     return;
 
   /* Check application word size.  */
@@ -978,7 +984,7 @@ pd_enable (void)
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  pd_able.push_back (inferior_ptid.pid ());
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
@@ -991,11 +997,14 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  if (std::find (pd_able.begin (), pd_able.end (), 
+	inferior_ptid.pid ()) == pd_able.end ())
     return;
-  if (pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (),
+        inferior_ptid.pid ()) != pd_active.end ())
     pd_deactivate ();
-  pd_able = 0;
+  pd_able.erase (std::find (pd_able.begin (), pd_able.end (), 
+	inferior_ptid.pid ()));
   current_inferior ()->unpush_target (&aix_thread_ops);
 }
 
@@ -1096,7 +1105,8 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
   gdb_assert (ptid.is_pid ());
 
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (std::find (pd_active.begin (), pd_active.end (), ptid.pid ()) == pd_active.end ()
+      && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-12-15 12:58                                 ` Aditya Kamath1
@ 2022-12-15 15:53                                   ` Ulrich Weigand
  2022-12-19  6:30                                     ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2022-12-15 15:53 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>[New Thread 258]
>[New Thread 515]
>fetch_regs_kernel_thread tid=225018d regno=64 arch64=0
>[New inferior 2 (process 8061286)]
>pdc_free (user_current_pid = 17957132, buf = 0x11016f370)
>pdc_free (user_current_pid = 17957132, buf = 0x11016f3b0)
>pdc_free (user_current_pid = 17957132, buf = 0x11016f4f0)
>pdc_free (user_current_pid = 17957132, buf = 0x1104e3a70)
>pdc_free (user_current_pid = 17957132, buf = 0x1108af0d0)
>pdc_symbol_addrs (user_current_pid = 17957132, symbols = 0xfffffffffffdef8, count = 1)
>  symbols[0].name = "__n_pthreads"
> returning PDC_FAILURE
>pdc_symbol_addrs (user_current_pid = 8061286, symbols = 0xfffffffffffe248, count = 1)
>  symbols[0].name = "__n_pthreads"
> returning PDC_FAILURE
>I am parent 
>[New process 17957132]
>[New inferior 3 (process 17433000)]
>pdc_symbol_addrs (user_current_pid = 17957132, symbols = 0xfffffffffffdef8, count = 1)
>  symbols[0].name = "__n_pthreads"
> returning PDC_FAILURE

I think the problem here may be that the lookup_minimal_symbol
call in pdc_symbol_addrs has to be performed in the correct
process address space.  This wasn't an issue before since the
routine was only called for the first process anyway.

Look at the equivalent routine on Linux, which is
ps_pglobal_lookup in proc-service.c.  This does:

  inferior *inf = ph->thread->inf;
  scoped_restore_current_program_space restore_pspace;
  set_current_program_space (inf->pspace);

You'll need to do the equivalent (set the program space
to the one appropriate for the inferior referenced by
user_current_pid).
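
To make the pattern concrete, here is a self-contained sketch of the
save / switch / restore idea.  The types are toy stand-ins, not GDB's:
program_space, inferior, current_pspace, inferiors_by_pid and
lookup_in_pspace_of are all assumptions made for the illustration.

```cpp
#include <cassert>
#include <map>

/* Toy stand-ins for GDB's types; only what the sketch needs.  */
struct program_space { int num; };
struct inferior { int pid; program_space *pspace; };

static program_space *current_pspace = nullptr;
static std::map<int, inferior *> inferiors_by_pid;

/* RAII guard: remember the current program space and put it back
   when the guard goes out of scope.  */
class scoped_restore_pspace
{
public:
  scoped_restore_pspace () : m_saved (current_pspace) {}
  ~scoped_restore_pspace () { current_pspace = m_saved; }
  scoped_restore_pspace (const scoped_restore_pspace &) = delete;
  scoped_restore_pspace &operator= (const scoped_restore_pspace &) = delete;

private:
  program_space *m_saved;
};

/* Hypothetical callback body: perform a lookup in the program space
   of the inferior whose pid is PID.  Returns the number of the
   program space used, or -1 if no such inferior is known.  */
static int
lookup_in_pspace_of (int pid)
{
  auto it = inferiors_by_pid.find (pid);
  if (it == inferiors_by_pid.end ())
    return -1;   /* Unknown pid: do not dereference a null inferior.  */

  scoped_restore_pspace restore;
  current_pspace = it->second->pspace;
  return current_pspace->num;   /* The symbol lookup would happen here.  */
}
```

The explicit "not found" branch is deliberate: a callback that looks
an inferior up by pid has to cope with the lookup failing before it
touches the result.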

Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-12-15 15:53                                   ` Ulrich Weigand
@ 2022-12-19  6:30                                     ` Aditya Kamath1
  2022-12-22 12:50                                       ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2022-12-19  6:30 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 8324 bytes --]

Hi Ulrich and community,

Please find attached the patch [See 0001-Fix-multi-thread-debug-bug-in-AIX.patch].

>I think the problem here may be that the lookup_minimal_symbol
>call in pdc_symbol_addrs has to be performed in the correct
>process address space.  This wasn't an issue before since the
>routine was only called for the first process anyway.
>
>Look at the equivalent routine on Linux, which is
>ps_pglobal_lookup in proc-service.c.  This does:
>
>  inferior *inf = ph->thread->inf;
>  scoped_restore_current_program_space restore_pspace;
>  set_current_program_space (inf->pspace);
>
>You'll need to do the equivalent (set the program space
>to the one appropriate for the inferior referenced by
>user_current_pid).

I tried to set the program space this way in the attached patch, but it did not work. I also tried setting inferior_ptid and the current inferior to match user_current_pid, and that did not help either.

For the same program as in the previous mail, the output is still as follows:


[New Thread 258]
[New Thread 515]
fetch_regs_kernel_thread tid=1da0189 regno=64 arch64=0
[New inferior 2 (process 6553888)]
pdc_free (user_current_pid = 11272598, buf = 0x11016ee70)
pdc_free (user_current_pid = 11272598, buf = 0x11016eeb0)
pdc_free (user_current_pid = 11272598, buf = 0x11016eff0)
pdc_free (user_current_pid = 11272598, buf = 0x1104e2530)
pdc_free (user_current_pid = 11272598, buf = 0x1108af2d0)
pdc_symbol_addrs (user_current_pid = 11272598, symbols = 0xfffffffffffdf08, count = 1)
  symbols[0].name = "__n_pthreads"
  symbols[0].addr = 0xf0807334
 returning PDC_SUCCESS
pdc_read_data (user_current_pid = 11272598, buf = 0xfffffffffffdf00, addr = 0xf0807334, len = 4)
  status=0, returning SUCCESS
pdc_symbol_addrs (user_current_pid = 6553888, symbols = 0xfffffffffffe258, count = 1)
  symbols[0].name = "__n_pthreads"
 returning PDC_FAILURE
[New process 11272598]
[New inferior 3 (process 8323536)]
pdc_symbol_addrs (user_current_pid = 11272598, symbols = 0xfffffffffffdf08, count = 1)
  symbols[0].name = "__n_pthreads"
  symbols[0].addr = 0xf0807334
 returning PDC_SUCCESS
pdc_read_data (user_current_pid = 11272598, buf = 0xfffffffffffdf00, addr = 0xf0807334, len = 4)
  status=0, returning SUCCESS
pdc_symbol_addrs (user_current_pid = 8323536, symbols = 0xfffffffffffe258, count = 1)
  symbols[0].name = "__n_pthreads"
 returning PDC_FAILURE
I am parent
I am parent

So we can clearly see that GDB did try to check whether the new inferiors are threaded, and that the check failed.

While trying to work out how Linux handles this, I observed that once it detects a new inferior, it appears to call ps_pglobal_lookup in proc-service.c repeatedly, via an observer, until it succeeds in reading the symbol. Kindly see the Linux output below for the same program. {Output Credits:- Linux GDB}


[New Thread 0x7ffff7cff170 (LWP 259785)]
[New Thread 0x7ffff74ef170 (LWP 259786)]
[New inferior 2 (process 259787)]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New inferior 3 (process 259788)]


The backtrace might be as follows:-


#11 0x0000000010063fe8 in std::_Function_handler<void (inferior*), void (*)(inferior*)>::_M_invoke(std::_Any_data const&, inferior*&&)
    at /usr/include/c++/8/bits/std_function.h:297
#12 0x00000000102c77b8 in std::function<void (inferior*)>::operator()(inferior*) const (
    __args#0=0x112cd920, this=<optimized out>) at /usr/include/c++/8/bits/std_function.h:687
#13 gdb::observers::observable<inferior*>::notify (args#0=0x112cd920, this=<optimized out>)
    at ./../gdbsupport/observable.h:166
#14 post_create_inferior  at infcmd.c:315
#15 0x00000000102e25a4 in follow_fork_inferior (detach_fork=false, follow_child=false) at infrun.c:683
#16 follow_fork () at infrun.c:79
#17 0x00000000102ec7b8 in handle_inferior_event  at infrun.c:5728
#18 0x00000000102ed9d8 in fetch_inferior_event () at infrun.c:4233
#19 0x00000000102c05a4 in inferior_event_handler  at inf-loop.c:41
#20 0x0000000010317e7c in handle_target_event
    at -nat.c:4216
#21 0x0000000010957e68 in handle_file_event
    at event-loop.cc:549
#22 0x00000000109588c4 in gdb_wait_for_event
    at event-loop.cc:670
#23 0x0000000010958cac in gdb_wait_for_event (block=0) at event-loop.cc:569
#24 gdb_do_one_event () at event-loop.cc:210
#25 0x000000001034d684 in start_event_loop () at main.c:411
#26 captured_command_loop () at main.c:471
#27 0x000000001034f8b0 in captured_main
    at main.c:1329
#28 gdb_main (args=<optimized out>) at main.c:1344
#29 0x000000001001a188 in main at gdb.c:32


After frame #11, they presumably reach ps_pglobal_lookup () every time, until thread debugging can be enabled for the new inferior. AIX, on the other hand, calls pdc_symbol_addrs () only once for each new inferior after the first. Even for the first inferior, the lookup succeeds only on the fourth attempt, as shown below for a 32-bit program.


pdc_symbol_addrs (user_current_pid = 11272598, symbols = 0xfffffffffffdbd8, count = 1)
  symbols[0].name = "__n_pthreads"
 returning PDC_FAILURE
pdc_symbol_addrs (user_current_pid = 11272598, symbols = 0xfffffffffffdbd8, count = 1)
  symbols[0].name = "__n_pthreads"
 returning PDC_FAILURE
pdc_symbol_addrs (user_current_pid = 11272598, symbols = 0xfffffffffffdbd8, count = 1)
  symbols[0].name = "__n_pthreads"
 returning PDC_FAILURE
pdc_symbol_addrs (user_current_pid = 11272598, symbols = 0xfffffffffffdbd8, count = 1)
  symbols[0].name = "__n_pthreads"
  symbols[0].addr = 0xf0807334
 returning PDC_SUCCESS

So I guess that if we manage to do for new inferiors what already happens for the first inferior, we will get to the solution. However, I do not understand how Linux reads the symbol again and again for a new inferior, or how AIX does so for the first inferior. Kindly let me know how we can do something similar, or whether I am missing something in this attempt to solve the problem for AIX and the GDB community.

Let me know what you think. Waiting for a reply soon.

Have a nice day.

Regards,
Aditya.

[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 9371 bytes --]

From 8b1f34b7d62c9e6a0f216c0e3b36fce752952eee Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Sun, 18 Dec 2022 23:19:04 -0600
Subject: [PATCH] Fix multi thread debug bug in AIX

---
 gdb/aix-thread.c | 103 ++++++++++++++++++++++++++---------------------
 1 file changed, 58 insertions(+), 45 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..3a2afad25a1 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -55,6 +55,7 @@
 #include <sys/reg.h>
 #include <sched.h>
 #include <sys/pthdebug.h>
+#include <vector>
 
 #if !HAVE_DECL_GETTHRDS
 extern int getthrds (pid_t, struct thrdsinfo64 *, int, tid_t *, int);
@@ -70,7 +71,7 @@ static bool debug_aix_thread;
 
 /* Return whether to treat PID as a debuggable thread id.  */
 
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
+#define PD_TID(ptid)	(std::find (pd_active.begin (), pd_active.end (), ptid.pid ()) != pd_active.end () && ptid.tid () != 0)
 
 /* Success and failure values returned by pthdb callbacks.  */
 
@@ -151,11 +152,11 @@ static CORE_ADDR pd_brk_addr;
 
 /* Whether the current application is debuggable by pthdb.  */
 
-static int pd_able = 0;
+static std::vector<pid_t> pd_able;
 
 /* Whether a threaded application is being debugged.  */
 
-static int pd_active = 0;
+static std::vector<pid_t> pd_active;
 
 /* Whether the current architecture is 64-bit.  
    Only valid when pd_able is true.  */
@@ -331,6 +332,9 @@ pdc_symbol_addrs (pthdb_user_t user_current_pid, pthdb_symbol_t *symbols, int co
   struct bound_minimal_symbol ms;
   int i;
   char *name;
+  scoped_restore_current_program_space restore_pspace; 
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (), user_current_pid);
+  set_current_program_space (inf->pspace);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
@@ -508,14 +512,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
     if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+      inferior_ptid = ptid_t (user_current_pid);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -639,36 +642,23 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* iterate_over_threads() callback for counting GDB threads.  */
 
 static int
 giter_count (struct thread_info *thread, void *countp)
 {
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
+  (*(int *) countp)++;
   return 0;
 }
 
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* iterate_over_threads() callback for accumulating GDB thread pids.  */
 
 static int
 giter_accum (struct thread_info *thread, void *bufp)
 {
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
+  **(struct thread_info ***) bufp = thread;
+  (*(struct thread_info ***) bufp)++;
+
   return 0;
 }
 
@@ -719,7 +709,7 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -750,6 +740,9 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -810,8 +803,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -841,8 +832,22 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+              if (gptid.is_pid () && gptid.pid () == pptid.pid ())
+                {
+                  thread_change_ptid (proc_target, gptid, pptid);
+                  aix_thread_info *priv = new aix_thread_info;
+                  priv->pdtid = pbuf[pi].pdtid;
+                  priv->tid = pbuf[pi].tid;
+                  tp = find_thread_ptid (proc_target, pptid);
+                  tp->priv.reset (priv);
+                  pi++;
+                  gi++;
+                }
+              else
+                {
+                  delete_thread (gbuf[gi]);
+                  gi++;
+                }
 	    }
 	  else
 	    {
@@ -888,7 +893,8 @@ pd_update (int pid)
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
 
-  if (!pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (), pid)
+    == pd_active.end ())
     return ptid_t (pid);
 
   status = pthdb_session_update (pd_session);
@@ -926,7 +932,7 @@ pd_activate (int pid)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
+  pd_active.push_back (pid);
   return pd_update (pid);
 }
 
@@ -935,26 +941,29 @@ pd_activate (int pid)
 static void
 pd_deactivate (void)
 {
-  if (!pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (), inferior_ptid.pid ())
+    == pd_active.end ())
     return;
   pthdb_session_destroy (pd_session);
   
   pid_to_prc (&inferior_ptid);
-  pd_active = 0;
+  pd_active.erase (std::find (pd_active.begin (), pd_active.end (), inferior_ptid.pid ()));
 }
 
 /* An object file has just been loaded.  Check whether the current
    application is pthreaded, and if so, prepare for thread debugging.  */
 
 static void
-pd_enable (void)
+pd_enable (inferior *inf)
 {
   int status;
   char *stub_name;
   struct bound_minimal_symbol ms;
+  pid_t pid = (inf == NULL?inferior_ptid.pid ():inf->pid);
 
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (std::find (pd_able.begin (), pd_able.end (), pid) 
+	!= pd_able.end ())
     return;
 
   /* Check application word size.  */
@@ -962,7 +971,7 @@ pd_enable (void)
 
   /* Check whether the application is pthreaded.  */
   stub_name = NULL;
-  status = pthdb_session_pthreaded (inferior_ptid.pid (), PTHDB_FLAG_REGS,
+  status = pthdb_session_pthreaded (pid, PTHDB_FLAG_REGS,
 				    &pd_callbacks, &stub_name);
   if ((status != PTHDB_SUCCESS
        && status != PTHDB_NOT_PTHREADED) || !stub_name)
@@ -978,12 +987,12 @@ pd_enable (void)
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  pd_able.push_back (pid);
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
      activate thread debugging.  */
-  pd_activate (inferior_ptid.pid ());
+  pd_activate (pid);
 }
 
 /* Undo the effects of pd_enable().  */
@@ -991,11 +1000,14 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  if (std::find (pd_able.begin (), pd_able.end (), 
+	inferior_ptid.pid ()) == pd_able.end ())
     return;
-  if (pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (),
+        inferior_ptid.pid ()) != pd_active.end ())
     pd_deactivate ();
-  pd_able = 0;
+  pd_able.erase (std::find (pd_able.begin (), pd_able.end (), 
+	inferior_ptid.pid ()));
   current_inferior ()->unpush_target (&aix_thread_ops);
 }
 
@@ -1010,7 +1022,7 @@ static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
-    pd_enable ();
+    pd_enable (NULL);
   else
     pd_disable ();
 }
@@ -1020,7 +1032,7 @@ new_objfile (struct objfile *objfile)
 static void
 aix_thread_inferior_created (inferior *inf)
 {
-  pd_enable ();
+  pd_enable (inf);
 }
 
 /* Detach from the process attached to by aix_thread_attach().  */
@@ -1096,7 +1108,8 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
   gdb_assert (ptid.is_pid ());
 
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (std::find (pd_active.begin (), pd_active.end (), ptid.pid ()) == pd_active.end ()
+      && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-12-19  6:30                                     ` Aditya Kamath1
@ 2022-12-22 12:50                                       ` Ulrich Weigand
  2022-12-26 13:18                                         ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2022-12-22 12:50 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>So, I guess if we manage to do something similar just like for the
>first inferior, we will get to the solution, but I did not understand
>how Linux might be reading the symbol again and again for a new
>inferior or AIX for that matter for the first inferior.  Kindly let
>me know how we can do something similar or are we missing something
>here that I have not kept in mind in our attempt to solve this for
>AIX and GDB community. 

So the way it works on Linux is that linux-thread-db.c registers
both an inferior_created observer and a new_objfile observer.

The inferior_created observer gets called immediately after the
new inferior is noticed - at this point, only the main executable
is detected by GDB, not any of the shared libraries.  So this will
only successfully detect a libpthread-linked executable if it was
*statically* linked.  For dynamically linked executables, the
new_objfile observer will later get called as well, once for each
shared library.  As soon as this happens for the libpthread shared
library, the fact that the inferior is multithreaded is detected.
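
The sequence above can be sketched with a toy observer setup.  This is
only an illustration of the ordering, not the GDB implementation:
observable is a minimal stand-in for the idea behind
gdb::observers::observable, and toy_inferior, try_enable_threads and
on_new_objfile are hypothetical names.  Detection is faked by looking
for "pthread" in the objfile name.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

/* Minimal observable: attach callbacks, then notify them all.  */
template <typename... Args>
class observable
{
public:
  void attach (std::function<void (Args...)> f)
  { m_observers.push_back (std::move (f)); }

  void notify (Args... args) const
  {
    for (const auto &f : m_observers)
      f (args...);
  }

private:
  std::vector<std::function<void (Args...)>> m_observers;
};

/* Toy inferior: the objfiles seen so far plus detection state.  */
struct toy_inferior
{
  std::vector<std::string> objfiles;
  bool threaded = false;
  int attempts = 0;
};

/* Hypothetical detection: succeeds once any loaded objfile name
   contains "pthread".  Each unsuccessful call is simply retried on
   the next notification.  */
static void
try_enable_threads (toy_inferior &inf)
{
  if (inf.threaded)
    return;
  inf.attempts++;
  for (const std::string &name : inf.objfiles)
    if (name.find ("pthread") != std::string::npos)
      inf.threaded = true;
}

/* new_objfile handler: record the objfile, then retry detection.  */
static void
on_new_objfile (toy_inferior &inf, const std::string &name)
{
  inf.objfiles.push_back (name);
  try_enable_threads (inf);
}
```

Attaching try_enable_threads to an inferior_created observable and
on_new_objfile to a new_objfile observable reproduces the pattern:
the first attempt fails while only the main executable is known, and
a later attempt succeeds once the thread library has been loaded.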

At first glance, this should actually work the same on AIX - the
aix-thread.c file also registers both inferior_created and
new_objfile observers.  Not sure why this doesn't work ... 
Do you see the new_objfile observer being called for all the
shared libraries in the second inferior?

If not, are shared libraries actually detected correctly in the
second inferior (e.g. does "info sharedlibrary" show them correctly
in the second inferior)?  If not, maybe solib-aix.c also needs to
be reviewed whether it handles multiple inferiors correctly ...

Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-12-22 12:50                                       ` Ulrich Weigand
@ 2022-12-26 13:18                                         ` Aditya Kamath1
  2023-01-09 14:04                                           ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2022-12-26 13:18 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 4788 bytes --]

Hi Ulrich,

>If not, are shared libraries actually detected correctly in the
>second inferior (e.g. does "info sharedlibrary" show them correctly
>in the second inferior)?  If not, maybe solib-aix.c also needs to
>be reviewed whether it handles multiple inferiors correctly

You were right about this in the previous email. When a new inferior is created, we try to access four shared libraries, each of which is a member of an archive file. They are:

/usr/lib/libpthreads.a(shr_comm.o)
/usr/lib/libcrypt.a(shr.o)
/usr/lib/libpthread.a(shr_xpg5.o)
/usr/lib/libc.a(shr.o)

Consider the following debugging patch, which I used for the analysis:


--- solib-aix.c_orig    2022-12-23 12:05:39 +0000
+++ solib-aix.c 2022-12-26 06:12:06 +0000
@@ -615,6 +615,7 @@
     (gdb_bfd_openr_next_archived_file (archive_bfd.get (), NULL));
   while (object_bfd != NULL)
     {
+      printf ("object_bfd is %s --- compared with --- member_name is %s in path %s \n", bfd_get_filename (object_bfd.get ()), member_name.c_str (), pathname);
       if (member_name == bfd_get_filename (object_bfd.get ()))
        break;

Here I have added a print statement to check whether we are able to find the member in the archive.

What's interesting is that for the first inferior this works fine for all shared libraries. For the second inferior and every inferior thereafter, the output is as follows:


object_bfd is shr.o --- compared with --- member_name is shr_comm.o in path /usr/lib/libpthreads.a(shr_comm.o)

object_bfd is /usr/lib/libpthreads.a(shr_comm.o) --- compared with --- member_name is shr_comm.o in path /usr/lib/libpthreads.a(shr_comm.o)

I was surprised that bfd_get_filename (object_bfd.get ()) returns the full pathname instead of the archive member name. Everything up to this point in solib_aix_bfd_open () seems to be correct, which makes it hard for me to understand what is going on. Even if I accept a pathname match against member_name, we end up losing all the thread information for the first process, though we still have the process information.
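
To make the two name forms in the output above concrete, here is a small self-contained sketch that splits an "ARCHIVE(MEMBER)" path and compares member names. It is not taken from solib-aix.c: archive_member, split_archive_path and member_matches are hypothetical helpers, and, as noted above, accepting the pathname form did not by itself fix the problem.

```cpp
#include <cassert>
#include <string>

/* The two halves of an "ARCHIVE(MEMBER)" path.  */
struct archive_member
{
  std::string archive;
  std::string member;
};

/* Split PATH of the form "ARCHIVE(MEMBER)" into OUT.  Returns false,
   leaving OUT untouched, if PATH does not have that form.  */
static bool
split_archive_path (const std::string &path, archive_member &out)
{
  if (path.empty () || path.back () != ')')
    return false;

  std::string::size_type open = path.rfind ('(');
  if (open == std::string::npos || open == 0)
    return false;

  out.archive = path.substr (0, open);
  out.member = path.substr (open + 1, path.size () - open - 2);
  return true;
}

/* Hypothetical tolerant comparison: accept either the bare member
   name or the full "ARCHIVE(MEMBER)" form as BFD_NAME.  */
static bool
member_matches (const std::string &bfd_name, const std::string &member_name)
{
  if (bfd_name == member_name)
    return true;

  archive_member am;
  return split_archive_path (bfd_name, am) && am.member == member_name;
}
```

With these helpers, both "shr_comm.o" and "/usr/lib/libpthreads.a(shr_comm.o)" match the member name shr_comm.o, while "shr.o" does not.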

One more observation: the inferior is set correctly wherever current_inferior () is used to get the inferior's shared library list or information.

So right now, I do not have answers to why the pathname is getting returned and if I need to correct the same how I can. As a consequence, the Shared libraries are missing for new inferior.  Kindly guide me on how this can be solved.

The rest of the patch I am attempting to solve this remains unchanged.

Have a nice day ahead.

Thanks and regards,
Aditya.


________________________________
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Sent: 22 December 2022 18:20
To: simark@simark.ca <simark@simark.ca>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>So, I guess if we manage to do something similar just like for the
>first inferior, we will get to the solution, but I did not understand
>how Linux might be reading the symbol again and again for a new
>inferior or AIX for that matter for the first inferior.  Kindly let
>me know how we can do something similar or are we missing something
>here that I have not kept in mind in our attempt to solve this for
>AIX and GDB community.

So the way it works on Linux is that linux-thread-db.c registers
both an inferior_created observer and a new_objfile observer.

The inferior_created observer gets called immediately after the
new inferior is noticed - at this point, only the main executable
is detected by GDB, not any of the shared libraries.  So this will
only successfully detect a libpthread-linked executable if it was
*statically* linked.  For dynamically linked executables, the
new_objfile observer will later get called as well, once for each
shared library.  As soon as this happens for the libpthread shared
library, the fact that the inferior is multithreaded is detected.

At first glance, this should actually work the same on AIX - the
aix-thread.c file also registers both inferior_created and
new_objfile observers.  Not sure why this doesn't work ...
Do you see the new_objfile observer being called for all the
shared libraries in the second inferior?

If not, are shared libraries actually detected correctly in the
second inferior (e.g. does "info sharedlibrary" show them correctly
in the second inferior)?  If not, maybe solib-aix.c also needs to
be reviewed whether it handles multiple inferiors correctly ...
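
The ordering described above can be illustrated with a small stand-alone
sketch.  Everything in it is hypothetical: the observable class is a toy
stand-in and not GDB's actual observer interface, and objfile_stub,
inferior_stub, maybe_enable and replay_startup are made-up names.  It only
shows why a dynamically linked program is recognised as threaded at a
new_objfile event rather than at inferior_created.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature observer list; GDB's real observer API differs.
template <typename... Args>
class observable
{
public:
  void attach (std::function<void (Args...)> f)
  { m_observers.push_back (std::move (f)); }

  void notify (Args... args) const
  {
    for (const auto &f : m_observers)
      f (args...);
  }

private:
  std::vector<std::function<void (Args...)>> m_observers;
};

// Hypothetical stand-ins for an object file and an inferior.
struct objfile_stub
{
  std::string name;
  bool has_pthread_symbols;
};

struct inferior_stub
{
  std::vector<objfile_stub> objfiles;	// Object files known so far.
  bool thread_debugging = false;
};

// Enable thread debugging once a known object file provides the pthread
// symbols.  Safe to call repeatedly.
static void
maybe_enable (inferior_stub &inf)
{
  if (inf.thread_debugging)
    return;
  for (const objfile_stub &o : inf.objfiles)
    if (o.has_pthread_symbols)
      inf.thread_debugging = true;
}

// Replay the startup of one inferior: inferior_created fires while only the
// main executable is known, then new_objfile fires once per shared library.
// Returns how many events had fired when thread debugging became enabled,
// or 0 if it never did.
static int
replay_startup (const objfile_stub &exe, const std::vector<objfile_stub> &libs)
{
  observable<inferior_stub &> inferior_created;
  observable<inferior_stub &, const objfile_stub &> new_objfile;

  inferior_created.attach ([] (inferior_stub &inf) { maybe_enable (inf); });
  new_objfile.attach ([] (inferior_stub &inf, const objfile_stub &)
    { maybe_enable (inf); });

  inferior_stub inf;
  inf.objfiles.push_back (exe);
  int events = 0;
  int enabled_at = 0;

  inferior_created.notify (inf);
  events++;
  if (inf.thread_debugging)
    enabled_at = events;

  for (const objfile_stub &lib : libs)
    {
      inf.objfiles.push_back (lib);
      new_objfile.notify (inf, lib);
      events++;
      if (inf.thread_debugging && enabled_at == 0)
	enabled_at = events;
    }
  return enabled_at;
}
```

With an executable that lacks the pthread symbols followed by libc and
libpthread, replay_startup reports detection at the third event (the second
new_objfile call).  A statically linked executable is detected at the first
event, inferior_created.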

Bye,
Ulrich


[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 9371 bytes --]

From 8b1f34b7d62c9e6a0f216c0e3b36fce752952eee Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Sun, 18 Dec 2022 23:19:04 -0600
Subject: [PATCH] Fix multi thread debug bug in AIX

---
 gdb/aix-thread.c | 103 ++++++++++++++++++++++++++---------------------
 1 file changed, 58 insertions(+), 45 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..3a2afad25a1 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -55,6 +55,7 @@
 #include <sys/reg.h>
 #include <sched.h>
 #include <sys/pthdebug.h>
+#include <vector>
 
 #if !HAVE_DECL_GETTHRDS
 extern int getthrds (pid_t, struct thrdsinfo64 *, int, tid_t *, int);
@@ -70,7 +71,7 @@ static bool debug_aix_thread;
 
 /* Return whether to treat PID as a debuggable thread id.  */
 
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
+#define PD_TID(ptid)	(std::find (pd_active.begin (), pd_active.end (), ptid.pid ()) != pd_active.end () && ptid.tid () != 0)
 
 /* Success and failure values returned by pthdb callbacks.  */
 
@@ -151,11 +152,11 @@ static CORE_ADDR pd_brk_addr;
 
 /* Whether the current application is debuggable by pthdb.  */
 
-static int pd_able = 0;
+static std::vector<pid_t> pd_able;
 
 /* Whether a threaded application is being debugged.  */
 
-static int pd_active = 0;
+static std::vector<pid_t> pd_active;
 
 /* Whether the current architecture is 64-bit.  
    Only valid when pd_able is true.  */
@@ -331,6 +332,9 @@ pdc_symbol_addrs (pthdb_user_t user_current_pid, pthdb_symbol_t *symbols, int co
   struct bound_minimal_symbol ms;
   int i;
   char *name;
+  scoped_restore_current_program_space restore_pspace; 
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (), user_current_pid);
+  set_current_program_space (inf->pspace);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
@@ -508,14 +512,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
     if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+      inferior_ptid = ptid_t (user_current_pid);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -639,36 +642,23 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* iterate_over_threads() callback for counting GDB threads.  */
 
 static int
 giter_count (struct thread_info *thread, void *countp)
 {
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
+  (*(int *) countp)++;
   return 0;
 }
 
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* iterate_over_threads() callback for accumulating GDB thread pids.  */
 
 static int
 giter_accum (struct thread_info *thread, void *bufp)
 {
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
+  **(struct thread_info ***) bufp = thread;
+  (*(struct thread_info ***) bufp)++;
+
   return 0;
 }
 
@@ -719,7 +709,7 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -750,6 +740,9 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -810,8 +803,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -841,8 +832,22 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+              if (gptid.is_pid () && gptid.pid () == pptid.pid ())
+                {
+                  thread_change_ptid (proc_target, gptid, pptid);
+                  aix_thread_info *priv = new aix_thread_info;
+                  priv->pdtid = pbuf[pi].pdtid;
+                  priv->tid = pbuf[pi].tid;
+                  tp = find_thread_ptid (proc_target, pptid);
+                  tp->priv.reset (priv);
+                  pi++;
+                  gi++;
+                }
+              else
+                {
+                  delete_thread (gbuf[gi]);
+                  gi++;
+                }
 	    }
 	  else
 	    {
@@ -888,7 +893,8 @@ pd_update (int pid)
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
 
-  if (!pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (), pid)
+    == pd_active.end ())
     return ptid_t (pid);
 
   status = pthdb_session_update (pd_session);
@@ -926,7 +932,7 @@ pd_activate (int pid)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
+  pd_active.push_back (pid);
   return pd_update (pid);
 }
 
@@ -935,26 +941,29 @@ pd_activate (int pid)
 static void
 pd_deactivate (void)
 {
-  if (!pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (), inferior_ptid.pid ())
+    == pd_active.end ())
     return;
   pthdb_session_destroy (pd_session);
   
   pid_to_prc (&inferior_ptid);
-  pd_active = 0;
+  pd_active.erase (std::find (pd_active.begin (), pd_active.end (), inferior_ptid.pid ()));
 }
 
 /* An object file has just been loaded.  Check whether the current
    application is pthreaded, and if so, prepare for thread debugging.  */
 
 static void
-pd_enable (void)
+pd_enable (inferior *inf)
 {
   int status;
   char *stub_name;
   struct bound_minimal_symbol ms;
+  pid_t pid = (inf == NULL?inferior_ptid.pid ():inf->pid);
 
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (std::find (pd_able.begin (), pd_able.end (), pid) 
+	!= pd_able.end ())
     return;
 
   /* Check application word size.  */
@@ -962,7 +971,7 @@ pd_enable (void)
 
   /* Check whether the application is pthreaded.  */
   stub_name = NULL;
-  status = pthdb_session_pthreaded (inferior_ptid.pid (), PTHDB_FLAG_REGS,
+  status = pthdb_session_pthreaded (pid, PTHDB_FLAG_REGS,
 				    &pd_callbacks, &stub_name);
   if ((status != PTHDB_SUCCESS
        && status != PTHDB_NOT_PTHREADED) || !stub_name)
@@ -978,12 +987,12 @@ pd_enable (void)
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  pd_able.push_back (pid);
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
      activate thread debugging.  */
-  pd_activate (inferior_ptid.pid ());
+  pd_activate (pid);
 }
 
 /* Undo the effects of pd_enable().  */
@@ -991,11 +1000,14 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  if (std::find (pd_able.begin (), pd_able.end (), 
+	inferior_ptid.pid ()) == pd_able.end ())
     return;
-  if (pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (),
+        inferior_ptid.pid ()) != pd_active.end ())
     pd_deactivate ();
-  pd_able = 0;
+  pd_able.erase (std::find (pd_able.begin (), pd_able.end (), 
+	inferior_ptid.pid ()));
   current_inferior ()->unpush_target (&aix_thread_ops);
 }
 
@@ -1010,7 +1022,7 @@ static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
-    pd_enable ();
+    pd_enable (NULL);
   else
     pd_disable ();
 }
@@ -1020,7 +1032,7 @@ new_objfile (struct objfile *objfile)
 static void
 aix_thread_inferior_created (inferior *inf)
 {
-  pd_enable ();
+  pd_enable (inf);
 }
 
 /* Detach from the process attached to by aix_thread_attach().  */
@@ -1096,7 +1108,8 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
   gdb_assert (ptid.is_pid ());
 
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (std::find (pd_active.begin (), pd_active.end (), ptid.pid ()) == pd_active.end ()
+      && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2022-12-26 13:18                                         ` Aditya Kamath1
@ 2023-01-09 14:04                                           ` Ulrich Weigand
  2023-01-10 12:23                                             ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2023-01-09 14:04 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>Here I have added a print statement to ensure we are able to find the member in the archive. 
>
>What's interesting is for the first inferior this works fine for all shared libraries.
>For the second one and every inferior thereafter the output is as shown below in the next paragraph, 

>object_bfd is shr.o --- compared with --- member_name is shr_comm.o in path /usr/lib/libpthreads.a(shr_comm.o) 
>object_bfd is /usr/lib/libpthreads.a(shr_comm.o) --- compared with --- member_name is shr_comm.o
>in path /usr/lib/libpthreads.a(shr_comm.o)

>I was surprised that the bfd_get_filename (object_bfd.get ()) is returning the pathname
>instead of the object file descriptor. Everything until here seems to correct in the
>solib_aix_bfd_open () function and this makes it hard for me to understand what is going on.

Looks like this is because solib_aix_bfd_open *changes* the BFD filename here:

  /* Override the returned bfd's name with the name returned from solib_find
     along with appended parenthesized member name in order to allow commands
     listing all shared libraries to display.  Otherwise, we would only be
     displaying the name of the archive member object.  */
  std::string fname = string_printf ("%s%s",
                                     bfd_get_filename (archive_bfd.get ()),
                                     sep);
  bfd_set_filename (object_bfd.get (), fname.c_str ());

so when the same BFD gets checked a second time, you'll now see the changed
filename instead of the original one.

I think you'll have to allow for that modified form of the name as well.
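
A minimal sketch of such a check, assuming the rewritten name always has the
form "archive-path(member)" as seen in the debug output quoted above.  The
helper name member_name_matches is made up for illustration and does not
exist in solib-aix.c.

```cpp
#include <cassert>
#include <string>

// Return true if BFD_FILENAME names archive member MEMBER_NAME, either in
// its original form ("shr_comm.o") or in the rewritten form
// ("/usr/lib/libpthreads.a(shr_comm.o)").  Hypothetical helper.
static bool
member_name_matches (const std::string &bfd_filename,
		     const std::string &member_name)
{
  /* Original form: the BFD is still named after the bare member.  */
  if (bfd_filename == member_name)
    return true;

  /* Rewritten form: the name must end in "(member)".  */
  const std::string suffix = "(" + member_name + ")";
  return (bfd_filename.size () > suffix.size ()
	  && bfd_filename.compare (bfd_filename.size () - suffix.size (),
				   suffix.size (), suffix) == 0);
}
```

Requiring the parenthesized suffix is stricter than a plain substring search:
"r.o" is a substring of "/usr/lib/libc.a(shr.o)", but it is not accepted
here because the name does not end in "(r.o)".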


>Even if I allow a pathname match to the member_name we end up losing all the
>information of our threads in the first process though we still have the
>process information.

This needs further debugging to understand what's going on once you allow
that match.  That original problem should be fixed by that change, so 
there's probably something else as well ...

Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-01-09 14:04                                           ` Ulrich Weigand
@ 2023-01-10 12:23                                             ` Aditya Kamath1
  2023-01-11 13:31                                               ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2023-01-10 12:23 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 8129 bytes --]

Hi Ulrich and community,

Please find attached the patch. [See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch]

>I think you'll have to allow for that modified form of the name as well.

I have allowed for it. Please see the solib-aix.c change. With this we are able to read all the symbols in any inferior successfully. One can verify this by using "set debug aix-thread".

If one executes the "info sharedlibrary" command, one can see the 4 libraries for any inferior. Kindly check output 1, pasted below in this email, for program 1.


>>Even if I allow a pathname match to the member_name we end up losing all the
>>information of our threads in the first process though we still have the
>>process information.

>This needs further debugging to understand what's going on once you allow
>that match.  That original problem should be fixed by that change, so
>there's probably something else as well ...

Yeah. As mentioned in my previous mail, we are losing our thread information. Although the output does show a new thread, my attempt to find the root cause in the code has failed. Thread 1 belonging to process 1 is shown as 2.1 in output 2 pasted below. What's worse, the top target is also not set properly now that the shared library name matches. I do have the correct program space while reading the symbols. And since the top target is wrong, the new process appears as a plain process in the output even though it is a threaded one.

So, looking at this, I have missed something, minor or major, in the code base that is causing the bug. I have tried debugging aix-thread.c, but things look properly aligned there at least. Single-inferior examples with multiple threads pass, but multi-inferior examples with multiple threads fail.

Kindly guide me on what I am missing here. It is surely something which I have not explored and am unaware of. Your expertise can help us resolve this bug.

Thank you for the guidance so far in this bug.

Waiting for a reply soon.

Have a nice day ahead.

Thanks and regards,
Aditya.


---------------------------------------------------------------------------
Program 1:-


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 1

void *
thread_function (void *arg)
{
  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
  {
    child = fork ();
    if (child > 0)
      printf ("I am child \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;
  pthread_t thread[NUM_THREADS];

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      int res;

      res = pthread_create (&thread[i], NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
  }

  return 0;
}

---------------------------------------------------------------------------------------------------------
Output 1:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New inferior 2 (process 15925744)]
I am parent
^C[New process 11665696]

Thread 1.3 received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)
0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)
0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)
0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)
(*): Shared library is missing debugging information.
(gdb) inferior 2
[Switching to inferior 2 [process 15925744] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]
[Switching to thread 2.1 (Thread 258)]
#0  0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)
0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)
0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)
0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)
(*): Shared library is missing debugging information.
(gdb)

---------------------------------------------------------------------------------------------------------
Output 2:-

Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New inferior 2 (process 16122342)]
I am parent
^C[New process 11665700]

Thread 1.3 received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info threads
  Id   Target Id                          Frame
* 1.3  process 11665700                   0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  2.1  Thread 258 (tid 28115287, running) 0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb)

________________________________
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Sent: 09 January 2023 19:34
To: simark@simark.ca <simark@simark.ca>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>Here I have added a print statement to ensure we are able to find the member in the archive.
>
>What's interesting is for the first inferior this works fine for all shared libraries.
>For the second one and every inferior thereafter the output is as shown below in the next paragraph,

>object_bfd is shr.o --- compared with --- member_name is shr_comm.o in path /usr/lib/libpthreads.a(shr_comm.o)
>object_bfd is /usr/lib/libpthreads.a(shr_comm.o) --- compared with --- member_name is shr_comm.o
>in path /usr/lib/libpthreads.a(shr_comm.o)

>I was surprised that the bfd_get_filename (object_bfd.get ()) is returning the pathname
>instead of the object file descriptor. Everything until here seems to correct in the
>solib_aix_bfd_open () function and this makes it hard for me to understand what is going on.

Looks like this is because solib_aix_bfd_open *changes* the BFD filename here:

  /* Override the returned bfd's name with the name returned from solib_find
     along with appended parenthesized member name in order to allow commands
     listing all shared libraries to display.  Otherwise, we would only be
     displaying the name of the archive member object.  */
  std::string fname = string_printf ("%s%s",
                                     bfd_get_filename (archive_bfd.get ()),
                                     sep);
  bfd_set_filename (object_bfd.get (), fname.c_str ());

so when the same BFD gets checked a second time, you'll now see the changed
filename instead of the original one.

I think you'll have to allow for that modified form of the name as well.


>Even if I allow a pathname match to the member_name we end up losing all the
>information of our threads in the first process though we still have the
>process information.

This needs further debugging to understand what's going on once you allow
that match.  That original problem should be fixed by that change, so
there's probably something else as well ...

Bye,
Ulrich


[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 9936 bytes --]

From fb8d24df7c1362204ca0045361acd37439f46746 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Tue, 10 Jan 2023 05:58:28 -0600
Subject: [PATCH] Fix multi thread debug bug in AIX

---
 gdb/aix-thread.c | 103 ++++++++++++++++++++++++++---------------------
 gdb/solib-aix.c  |   6 +++
 2 files changed, 64 insertions(+), 45 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..3a2afad25a1 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -55,6 +55,7 @@
 #include <sys/reg.h>
 #include <sched.h>
 #include <sys/pthdebug.h>
+#include <vector>
 
 #if !HAVE_DECL_GETTHRDS
 extern int getthrds (pid_t, struct thrdsinfo64 *, int, tid_t *, int);
@@ -70,7 +71,7 @@ static bool debug_aix_thread;
 
 /* Return whether to treat PID as a debuggable thread id.  */
 
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
+#define PD_TID(ptid)	(std::find (pd_active.begin (), pd_active.end (), ptid.pid ()) != pd_active.end () && ptid.tid () != 0)
 
 /* Success and failure values returned by pthdb callbacks.  */
 
@@ -151,11 +152,11 @@ static CORE_ADDR pd_brk_addr;
 
 /* Whether the current application is debuggable by pthdb.  */
 
-static int pd_able = 0;
+static std::vector<pid_t> pd_able;
 
 /* Whether a threaded application is being debugged.  */
 
-static int pd_active = 0;
+static std::vector<pid_t> pd_active;
 
 /* Whether the current architecture is 64-bit.  
    Only valid when pd_able is true.  */
@@ -331,6 +332,9 @@ pdc_symbol_addrs (pthdb_user_t user_current_pid, pthdb_symbol_t *symbols, int co
   struct bound_minimal_symbol ms;
   int i;
   char *name;
+  scoped_restore_current_program_space restore_pspace; 
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (), user_current_pid);
+  set_current_program_space (inf->pspace);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
@@ -508,14 +512,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
     if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+      inferior_ptid = ptid_t (user_current_pid);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -639,36 +642,23 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* iterate_over_threads() callback for counting GDB threads.  */
 
 static int
 giter_count (struct thread_info *thread, void *countp)
 {
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
+  (*(int *) countp)++;
   return 0;
 }
 
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* iterate_over_threads() callback for accumulating GDB thread pids.  */
 
 static int
 giter_accum (struct thread_info *thread, void *bufp)
 {
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
+  **(struct thread_info ***) bufp = thread;
+  (*(struct thread_info ***) bufp)++;
+
   return 0;
 }
 
@@ -719,7 +709,7 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -750,6 +740,9 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -810,8 +803,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -841,8 +832,22 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+              if (gptid.is_pid () && gptid.pid () == pptid.pid ())
+                {
+                  thread_change_ptid (proc_target, gptid, pptid);
+                  aix_thread_info *priv = new aix_thread_info;
+                  priv->pdtid = pbuf[pi].pdtid;
+                  priv->tid = pbuf[pi].tid;
+                  tp = find_thread_ptid (proc_target, pptid);
+                  tp->priv.reset (priv);
+                  pi++;
+                  gi++;
+                }
+              else
+                {
+                  delete_thread (gbuf[gi]);
+                  gi++;
+                }
 	    }
 	  else
 	    {
@@ -888,7 +893,8 @@ pd_update (int pid)
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
 
-  if (!pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (), pid)
+    == pd_active.end ())
     return ptid_t (pid);
 
   status = pthdb_session_update (pd_session);
@@ -926,7 +932,7 @@ pd_activate (int pid)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
+  pd_active.push_back (pid);
   return pd_update (pid);
 }
 
@@ -935,26 +941,29 @@ pd_activate (int pid)
 static void
 pd_deactivate (void)
 {
-  if (!pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (), inferior_ptid.pid ())
+    == pd_active.end ())
     return;
   pthdb_session_destroy (pd_session);
   
   pid_to_prc (&inferior_ptid);
-  pd_active = 0;
+  pd_active.erase (std::find (pd_active.begin (), pd_active.end (), inferior_ptid.pid ()));
 }
 
 /* An object file has just been loaded.  Check whether the current
    application is pthreaded, and if so, prepare for thread debugging.  */
 
 static void
-pd_enable (void)
+pd_enable (inferior *inf)
 {
   int status;
   char *stub_name;
   struct bound_minimal_symbol ms;
+  pid_t pid = (inf == NULL?inferior_ptid.pid ():inf->pid);
 
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (std::find (pd_able.begin (), pd_able.end (), pid) 
+	!= pd_able.end ())
     return;
 
   /* Check application word size.  */
@@ -962,7 +971,7 @@ pd_enable (void)
 
   /* Check whether the application is pthreaded.  */
   stub_name = NULL;
-  status = pthdb_session_pthreaded (inferior_ptid.pid (), PTHDB_FLAG_REGS,
+  status = pthdb_session_pthreaded (pid, PTHDB_FLAG_REGS,
 				    &pd_callbacks, &stub_name);
   if ((status != PTHDB_SUCCESS
        && status != PTHDB_NOT_PTHREADED) || !stub_name)
@@ -978,12 +987,12 @@ pd_enable (void)
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  pd_able.push_back (pid);
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
      activate thread debugging.  */
-  pd_activate (inferior_ptid.pid ());
+  pd_activate (pid);
 }
 
 /* Undo the effects of pd_enable().  */
@@ -991,11 +1000,14 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  if (std::find (pd_able.begin (), pd_able.end (), 
+	inferior_ptid.pid ()) == pd_able.end ())
     return;
-  if (pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (),
+        inferior_ptid.pid ()) != pd_active.end ())
     pd_deactivate ();
-  pd_able = 0;
+  pd_able.erase (std::find (pd_able.begin (), pd_able.end (), 
+	inferior_ptid.pid ()));
   current_inferior ()->unpush_target (&aix_thread_ops);
 }
 
@@ -1010,7 +1022,7 @@ static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
-    pd_enable ();
+    pd_enable (NULL);
   else
     pd_disable ();
 }
@@ -1020,7 +1032,7 @@ new_objfile (struct objfile *objfile)
 static void
 aix_thread_inferior_created (inferior *inf)
 {
-  pd_enable ();
+  pd_enable (inf);
 }
 
 /* Detach from the process attached to by aix_thread_attach().  */
@@ -1096,7 +1108,8 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
   gdb_assert (ptid.is_pid ());
 
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (std::find (pd_active.begin (), pd_active.end (), ptid.pid ()) == pd_active.end ()
+      && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index f483f54de13..181140b3345 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -618,6 +618,12 @@ solib_aix_bfd_open (const char *pathname)
       if (member_name == bfd_get_filename (object_bfd.get ()))
 	break;
 
+      std::string s = bfd_get_filename (object_bfd.get ());
+      if (s.find (member_name) != std::string::npos)
+      {
+	return object_bfd;
+      }
+
       object_bfd = gdb_bfd_openr_next_archived_file (archive_bfd.get (),
 						     object_bfd.get ());
     }
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-01-10 12:23                                             ` Aditya Kamath1
@ 2023-01-11 13:31                                               ` Ulrich Weigand
  2023-01-13 14:06                                                 ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2023-01-11 13:31 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>So, looking at this, I have missed out something may be minor or
>major causing the bug which I am unaware of in the code base.

The one problem I see is that there are still global variables
that seem to get clobbered if there are multiple inferiors,
in particular pd_session and pd_brk_addr.  Those will get
overwritten by each pd_activate call for each inferior.

Note that pthdb_session_init registers the pid, so the resulting
pd_session should be different for each inferior.  If you just
overwrite it, then the pid values passed to all callbacks will
always reflect only the latest inferior.

Similarly, pd_brk_addr is potentially different in each inferior,
if libpthread is loaded at different addresses.

You should create a struct containing all per-inferior thread-
related data members, and then allocate one such struct per
inferior.
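
A minimal sketch of that suggestion, for illustration only. The type
names, the struct name aix_thread_variables and both helper functions
are hypothetical and not taken from the GDB sources; the typedefs only
stand in for CORE_ADDR and pthdb_session_t so that the fragment
compiles on its own.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

/* Hypothetical stand-ins for the real GDB / libpthdebug types.  */
typedef std::uint64_t core_addr_t;   /* Plays the role of CORE_ADDR.  */
typedef int session_handle_t;        /* Plays the role of pthdb_session_t.  */

/* All thread-debugging state that belongs to one inferior.  */
struct aix_thread_variables
{
  bool pd_able = false;
  bool pd_active = false;
  session_handle_t pd_session = 0;
  core_addr_t pd_brk_addr = 0;
};

/* One entry per inferior, keyed by process ID.  */
static std::map<int, aix_thread_variables> per_inferior_data;

/* Return the data for PID, creating a default-initialised entry on
   first use.  */
static aix_thread_variables &
get_thread_data (int pid)
{
  assert (pid >= 0);
  return per_inferior_data[pid];
}

/* Drop the data for PID, for example when that inferior exits.  */
static void
forget_thread_data (int pid)
{
  per_inferior_data.erase (pid);
}
```

With such a layout, writing the session or breakpoint address of one
inferior cannot clobber the values of another, because each lookup is
keyed by pid.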

Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-01-11 13:31                                               ` Ulrich Weigand
@ 2023-01-13 14:06                                                 ` Aditya Kamath1
  2023-01-20 14:44                                                   ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2023-01-13 14:06 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 4241 bytes --]

Hi Ulrich and community,

Please find attached the patch {See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch}


So, I took into account the information you gave me the last time. Having said that, it turned out to be a different issue that is causing such a swap of threads as shown in the outputs of the previous email {pasted again below this email}.

Our top target is getting set correctly and symbols are being read in the right way on any new-born inferior. I am thankful to you all for pointing that shared Library issue to me.

While setting the top target for a new inferior we reach sync_threadlists () through this path: pd_enable () -> pthdb_session_pthreaded () -> push the top target -> pd_activate () -> pthdb_session_init () -> pd_update () -> sync_threadlists (). sync_threadlists () uses a variable called gcount, which is the number of threads known to GDB, and pcount, which is the number of pthreads. gcount is obtained through iterate_over_threads ().

From my understanding, we do not take the process ID into account while we iterate over threads. So this works in the single-inferior case, whereas things go wrong with multiple multi-threaded inferiors. Here is how: if we have two threads in our first inferior {say the main thread and a task thread} and one main thread in the second inferior, and we are syncing the second inferior, then pcount will correctly be 1 but gcount will be 3. By our current logic we then end up deleting two threads {the first two, which belong to the first inferior}. This is exactly what happened in the output of my previous mail, if you check. After this the threads get swapped, as that is how the code flows.

This is now one of the causes of the bug.
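
The mismatch described above can be reproduced with a small standalone
sketch. simple_thread and both counting functions are hypothetical
simplifications and not GDB code; they only model "count every thread"
versus "count the threads of one process".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* Hypothetical, simplified stand-in for GDB's thread_info.  */
struct simple_thread
{
  int pid;   /* Process the thread belongs to.  */
  long tid;  /* Thread identifier within that process.  */
};

/* Count every known thread, regardless of process.  This models a
   gcount computed without looking at the process ID.  */
static std::size_t
count_all_threads (const std::vector<simple_thread> &threads)
{
  return threads.size ();
}

/* Count only the threads that belong to PID.  */
static std::size_t
count_threads_of (const std::vector<simple_thread> &threads, int pid)
{
  std::size_t n = 0;
  for (const simple_thread &t : threads)
    if (t.pid == pid)
      n++;
  return n;
}
```

With two threads in a first process and one thread in a second, the
unfiltered count is 3 while the count for the second process is 1, so
comparing the unfiltered count against that process's pcount of 1
makes two threads of the other process look stale.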

In order to resolve this I need one piece of information: how can we iterate over the threads of a particular process? Is there a built-in function for that? Kindly let me know, as that should solve this issue.

Also kindly give me feedback on this patch if I need to change anything. I thank you for the guidance so far.

Hoping for a reply soon.

Have a nice day ahead.

Thanks and regards,
Aditya.

--------------------------------------------
Output 2:-

Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New inferior 2 (process 16122342)]
I am parent
^C[New process 11665700]

Thread 1.3 received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info threads
  Id   Target Id                          Frame
* 1.3  process 11665700                   0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  2.1  Thread 258 (tid 28115287, running) 0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb)



[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 15474 bytes --]

From 7dae4fa7e37323b355308edb8b13c118b6a2ae8a Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Fri, 13 Jan 2023 06:43:14 -0600
Subject: [PATCH] Fix multi thread debug bug in AIX

---
 gdb/aix-thread.c | 148 ++++++++++++++++++++++++++---------------------
 gdb/solib-aix.c  |   6 ++
 2 files changed, 89 insertions(+), 65 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..5894f3ecc7d 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -55,6 +55,8 @@
 #include <sys/reg.h>
 #include <sched.h>
 #include <sys/pthdebug.h>
+#include <vector>
+#include <map>
 
 #if !HAVE_DECL_GETTHRDS
 extern int getthrds (pid_t, struct thrdsinfo64 *, int, tid_t *, int);
@@ -70,7 +72,7 @@ static bool debug_aix_thread;
 
 /* Return whether to treat PID as a debuggable thread id.  */
 
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
+#define PD_TID(ptid)	(std::find (pd_active.begin (), pd_active.end (), ptid.pid ()) != pd_active.end () && ptid.tid () != 0)
 
 /* Success and failure values returned by pthdb callbacks.  */
 
@@ -142,20 +144,20 @@ class aix_thread_target final : public target_ops
   ptid_t get_ada_task_ptid (long lwp, ULONGEST thread) override;
 };
 
-static aix_thread_target aix_thread_ops;
+  static aix_thread_target aix_thread_ops;
 
 /* Address of the function that libpthread will call when libpthdebug
    is ready to be initialized.  */
 
-static CORE_ADDR pd_brk_addr;
+ static std::map<pid_t, CORE_ADDR> pd_brk_addr;
 
 /* Whether the current application is debuggable by pthdb.  */
 
-static int pd_able = 0;
+static std::vector<pid_t> pd_able;
 
 /* Whether a threaded application is being debugged.  */
 
-static int pd_active = 0;
+static std::vector<pid_t> pd_active;
 
 /* Whether the current architecture is 64-bit.  
    Only valid when pd_able is true.  */
@@ -193,7 +195,7 @@ static pthdb_callbacks_t pd_callbacks = {
 
 /* Current pthdb session.  */
 
-static pthdb_session_t pd_session;
+static std::map<pid_t, pthdb_session_t> pd_session;
 
 /* Return a printable representation of pthdebug function return
    STATUS.  */
@@ -331,6 +333,9 @@ pdc_symbol_addrs (pthdb_user_t user_current_pid, pthdb_symbol_t *symbols, int co
   struct bound_minimal_symbol ms;
   int i;
   char *name;
+  scoped_restore_current_program_space restore_pspace; 
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (), user_current_pid);
+  set_current_program_space (inf->pspace);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
@@ -508,14 +513,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
     if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+      inferior_ptid = ptid_t (user_current_pid);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -639,36 +643,23 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* iterate_over_threads() callback for counting GDB threads.  */
 
 static int
 giter_count (struct thread_info *thread, void *countp)
 {
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
+  (*(int *) countp)++;
   return 0;
 }
 
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* iterate_over_threads() callback for accumulating GDB thread pids.  */
 
 static int
 giter_accum (struct thread_info *thread, void *bufp)
 {
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
+  **(struct thread_info ***) bufp = thread;
+  (*(struct thread_info ***) bufp)++;
+
   return 0;
 }
 
@@ -719,7 +710,7 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -750,6 +741,9 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -759,11 +753,11 @@ sync_threadlists (int pid)
 
   for (cmd = PTHDB_LIST_FIRST;; cmd = PTHDB_LIST_NEXT)
     {
-      status = pthdb_pthread (pd_session, &pdtid, cmd);
+      status = pthdb_pthread (pd_session[pid], &pdtid, cmd);
       if (status != PTHDB_SUCCESS || pdtid == PTHDB_INVALID_PTHREAD)
 	break;
 
-      status = pthdb_pthread_ptid (pd_session, pdtid, &pthid);
+      status = pthdb_pthread_ptid (pd_session[pid], pdtid, &pthid);
       if (status != PTHDB_SUCCESS || pthid == PTHDB_INVALID_PTID)
 	continue;
 
@@ -780,7 +774,7 @@ sync_threadlists (int pid)
 
   for (pi = 0; pi < pcount; pi++)
     {
-      status = pthdb_pthread_tid (pd_session, pbuf[pi].pdtid, &tid);
+      status = pthdb_pthread_tid (pd_session[pid], pbuf[pi].pdtid, &tid);
       if (status != PTHDB_SUCCESS)
 	tid = PTHDB_INVALID_TID;
       pbuf[pi].tid = tid;
@@ -810,8 +804,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -841,8 +833,22 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+              if (gptid.is_pid () && gptid.pid () == pptid.pid ())
+                {
+                  thread_change_ptid (proc_target, gptid, pptid);
+                  aix_thread_info *priv = new aix_thread_info;
+                  priv->pdtid = pbuf[pi].pdtid;
+                  priv->tid = pbuf[pi].tid;
+                  tp = find_thread_ptid (proc_target, pptid);
+                  tp->priv.reset (priv);
+                  pi++;
+                  gi++;
+                }
+              else
+                {
+                  delete_thread (gbuf[gi]);
+                  gi++;
+                }
 	    }
 	  else
 	    {
@@ -888,10 +894,11 @@ pd_update (int pid)
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
 
-  if (!pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (), pid)
+    == pd_active.end ())
     return ptid_t (pid);
 
-  status = pthdb_session_update (pd_session);
+  status = pthdb_session_update (pd_session[pid]);
   if (status != PTHDB_SUCCESS)
     return ptid_t (pid);
 
@@ -918,15 +925,17 @@ static ptid_t
 pd_activate (int pid)
 {
   int status;
-		
+	
   status = pthdb_session_init (pid, arch64 ? PEM_64BIT : PEM_32BIT,
 			       PTHDB_FLAG_REGS, &pd_callbacks, 
-			       &pd_session);
+			       &pd_session[pid]);
   if (status != PTHDB_SUCCESS)
     {
+      pd_session.erase (pid);
       return ptid_t (pid);
     }
-  pd_active = 1;
+  printf ("pd_activated successfully for pid %d\n", pid);
+  pd_active.push_back (pid);
   return pd_update (pid);
 }
 
@@ -935,26 +944,29 @@ pd_activate (int pid)
 static void
 pd_deactivate (void)
 {
-  if (!pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (), inferior_ptid.pid ())
+    == pd_active.end ())
     return;
-  pthdb_session_destroy (pd_session);
+  pthdb_session_destroy (pd_session.erase (inferior_ptid.pid ()));
   
   pid_to_prc (&inferior_ptid);
-  pd_active = 0;
+  pd_active.erase (std::find (pd_active.begin (), pd_active.end (), inferior_ptid.pid ()));
 }
 
 /* An object file has just been loaded.  Check whether the current
    application is pthreaded, and if so, prepare for thread debugging.  */
 
 static void
-pd_enable (void)
+pd_enable (inferior *inf)
 {
   int status;
   char *stub_name;
   struct bound_minimal_symbol ms;
+  pid_t pid = (inf == NULL?inferior_ptid.pid ():inf->pid);
 
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (std::find (pd_able.begin (), pd_able.end (), pid) 
+	!= pd_able.end ())
     return;
 
   /* Check application word size.  */
@@ -962,7 +974,7 @@ pd_enable (void)
 
   /* Check whether the application is pthreaded.  */
   stub_name = NULL;
-  status = pthdb_session_pthreaded (inferior_ptid.pid (), PTHDB_FLAG_REGS,
+  status = pthdb_session_pthreaded (pid, PTHDB_FLAG_REGS,
 				    &pd_callbacks, &stub_name);
   if ((status != PTHDB_SUCCESS
        && status != PTHDB_NOT_PTHREADED) || !stub_name)
@@ -972,18 +984,20 @@ pd_enable (void)
   ms = lookup_minimal_symbol (stub_name, NULL, NULL);
   if (ms.minsym == NULL)
     return;
-  pd_brk_addr = ms.value_address ();
-  if (!create_thread_event_breakpoint (target_gdbarch (), pd_brk_addr))
+  pd_brk_addr.insert ({pid, ms.value_address ()});
+
+  if (!create_thread_event_breakpoint (target_gdbarch (), pd_brk_addr[pid]))
     return;
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+
+  pd_able.push_back (pid);
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
      activate thread debugging.  */
-  pd_activate (inferior_ptid.pid ());
+  pd_activate (pid);
 }
 
 /* Undo the effects of pd_enable().  */
@@ -991,11 +1005,14 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  if (std::find (pd_able.begin (), pd_able.end (), 
+	inferior_ptid.pid ()) == pd_able.end ())
     return;
-  if (pd_active)
+  if (std::find (pd_active.begin (), pd_active.end (),
+        inferior_ptid.pid ()) != pd_active.end ())
     pd_deactivate ();
-  pd_able = 0;
+  pd_able.erase (std::find (pd_able.begin (), pd_able.end (), 
+	inferior_ptid.pid ()));
   current_inferior ()->unpush_target (&aix_thread_ops);
 }
 
@@ -1010,7 +1027,7 @@ static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
-    pd_enable ();
+    pd_enable (NULL);
   else
     pd_disable ();
 }
@@ -1020,7 +1037,7 @@ new_objfile (struct objfile *objfile)
 static void
 aix_thread_inferior_created (inferior *inf)
 {
-  pd_enable ();
+  pd_enable (inf);
 }
 
 /* Detach from the process attached to by aix_thread_attach().  */
@@ -1096,7 +1113,8 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
   gdb_assert (ptid.is_pid ());
 
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (std::find (pd_active.begin (), pd_active.end (), ptid.pid ()) == pd_active.end ()
+      && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
@@ -1105,7 +1123,7 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
       struct gdbarch *gdbarch = regcache->arch ();
 
       if (regcache_read_pc (regcache)
-	  - gdbarch_decr_pc_after_break (gdbarch) == pd_brk_addr)
+	  - gdbarch_decr_pc_after_break (gdbarch) == pd_brk_addr[ptid.pid ()])
 	return pd_activate (ptid.pid ());
     }
 
@@ -1233,7 +1251,7 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
 		"fetch_regs_user_thread %lx\n", (long) pdtid);
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (pd_session [inferior_ptid.pid ()], pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: fetch_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1518,7 +1536,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Retrieve the thread's current context for its non-register
      values.  */
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (pd_session [inferior_ptid.pid ()], pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1576,7 +1594,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 	ctx.fpscr = tmp_fpscr;
     }
 
-  status = pthdb_pthread_setcontext (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_setcontext (pd_session[inferior_ptid.pid ()], pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: "
 	     "pthdb_pthread_setcontext returned %s"),
@@ -1800,24 +1818,24 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
     /* i18n: Like "thread-identifier %d, [state] running, suspended" */
     buf.printf (_("tid %d"), (int)tid);
 
-  status = pthdb_pthread_state (pd_session, pdtid, &state);
+  status = pthdb_pthread_state (pd_session[inferior_ptid.pid ()], pdtid, &state);
   if (status != PTHDB_SUCCESS)
     state = PST_NOTSUP;
   buf.printf (", %s", state2str (state));
 
-  status = pthdb_pthread_suspendstate (pd_session, pdtid, 
+  status = pthdb_pthread_suspendstate (pd_session[inferior_ptid.pid ()], pdtid, 
 				       &suspendstate);
   if (status == PTHDB_SUCCESS && suspendstate == PSS_SUSPENDED)
     /* i18n: Like "Thread-Id %d, [state] running, suspended" */
     buf.printf (_(", suspended"));
 
-  status = pthdb_pthread_detachstate (pd_session, pdtid, 
+  status = pthdb_pthread_detachstate (pd_session[inferior_ptid.pid ()], pdtid, 
 				      &detachstate);
   if (status == PTHDB_SUCCESS && detachstate == PDS_DETACHED)
     /* i18n: Like "Thread-Id %d, [state] running, detached" */
     buf.printf (_(", detached"));
 
-  pthdb_pthread_cancelpend (pd_session, pdtid, &cancelpend);
+  pthdb_pthread_cancelpend (pd_session[inferior_ptid.pid ()], pdtid, &cancelpend);
   if (status == PTHDB_SUCCESS && cancelpend)
     /* i18n: Like "Thread-Id %d, [state] running, cancel pending" */
     buf.printf (_(", cancel pending"));
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index f483f54de13..181140b3345 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -618,6 +618,12 @@ solib_aix_bfd_open (const char *pathname)
       if (member_name == bfd_get_filename (object_bfd.get ()))
 	break;
 
+      std::string s = bfd_get_filename (object_bfd.get ());
+      if (s.find (member_name) != std::string::npos)
+      {
+	return object_bfd;
+      }
+
       object_bfd = gdb_bfd_openr_next_archived_file (archive_bfd.get (),
 						     object_bfd.get ());
     }
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-01-13 14:06                                                 ` Aditya Kamath1
@ 2023-01-20 14:44                                                   ` Ulrich Weigand
  2023-01-27 14:40                                                     ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2023-01-20 14:44 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>Inorder to resolve the same I request for one information. How can we iterate_over_threads
>of a particular process. What is that function. Is there any built-in available??
>Kindly let me know and that should solve this issue. 

Instead of iterate_over_threads you could use the all_threads() iterator directly;
this can be specialized to only return threads of one inferior like this:

       for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
       {
            ...
       }

>Also kindly give me feedback on this patch if I need to change anything.

I think this change in solib-aix.c is not quite correct:
+      std::string s = bfd_get_filename (object_bfd.get ());
+      if (s.find (member_name) != std::string::npos)
+      {
+       return object_bfd;
+      }

This matches the member name *anywhere* in the full filename,
which could lead to spurious matches, I think.  The test
should be more specific.
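
One way such a more specific test could look is sketched below. It
assumes, purely for illustration, that the filename is either the bare
member name or has the form "archive(member)", as seen in paths like
/usr/lib/libc.a(shr.o). The helper name member_name_matches is
hypothetical and is not part of solib-aix.c.

```cpp
#include <cassert>
#include <string>

/* Return true if FILENAME names archive member MEMBER_NAME, either
   exactly or in the form "archive(member)".  The second form is an
   assumption made for this sketch.  */
static bool
member_name_matches (const std::string &filename,
                     const std::string &member_name)
{
  if (filename == member_name)
    return true;

  std::string::size_type open = filename.rfind ('(');
  if (open == std::string::npos
      || filename.empty ()
      || filename.back () != ')')
    return false;

  /* Text strictly between the last '(' and the trailing ')'.  */
  std::string inner = filename.substr (open + 1,
                                       filename.size () - open - 2);
  return inner == member_name;
}
```

Unlike a plain substring search, this rejects a name such as
"/usr/lib/libc.a(myshr.o)" when looking for "shr.o", and also a path
that merely contains "shr.o" in one of its directory components.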

Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-01-20 14:44                                                   ` Ulrich Weigand
@ 2023-01-27 14:40                                                     ` Aditya Kamath1
  2023-01-30 19:54                                                       ` Tom Tromey
  2023-02-02  6:24                                                       ` Aditya Kamath1
  0 siblings, 2 replies; 49+ messages in thread
From: Aditya Kamath1 @ 2023-01-27 14:40 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 8745 bytes --]

Hi Ulrich and community,

Thank you for the feedback for the fix of this bug. Please find attached the patch. {See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch}

So, I have fixed the bug and it works alright. Please find the test program, output with patch and without patch pasted below this email.

>+      if (s.find (member_name) != std::string::npos)
>+      {
>+       return object_bfd;
>+      }

>This matches the member name *anywhere* in the full filename,
>which could lead to spurious matches, I think.  The test
>should be more specific.

I have taken care of this in the patch.

There are a few changes for which I want to explain below.

All the variables {pd_able, pd_active and pd_session} now live in a structure, and a map associates each process ID with its structure. This makes it easy for the AIX GDB code in aix-thread.c to manage them per process.

Secondly, the function pid_to_str () contains a beneath () call, which is why I had to put this function in the rs6000-aix-nat.c file.

Thirdly, if there was no object file, we previously used pd_disable () to disable thread debugging. This is incorrect now that we support multiple inferiors. Since the new_objfile function still relies on inferior_ptid up to a point, we must disable thread debugging only when we mourn the inferior, that is, when a process dies. Otherwise there is every chance that we disable thread debugging for the wrong inferior, namely whichever one inferior_ptid currently refers to. It would also wrongly clear pd_active for that inferior in cases where a new inferior is born whose object file is being loaded. This change can be seen in the patch.
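
The hazard can be shown with a small standalone sketch. current_pid,
enabled_pids, disable_current and disable_for are hypothetical names
used only here; current_pid plays the role of inferior_ptid.pid ().

```cpp
#include <cassert>
#include <set>

/* Hypothetical global that plays the role of inferior_ptid.pid ().  */
static int current_pid = 0;

/* Process IDs for which thread debugging is enabled.  */
static std::set<int> enabled_pids;

/* Old style: disable for whichever process happens to be current,
   even if the triggering event was about another process.  */
static void
disable_current ()
{
  enabled_pids.erase (current_pid);
}

/* Per-process style: the caller names the process that died.  */
static void
disable_for (int pid)
{
  enabled_pids.erase (pid);
}
```

If the current process is 100 while the event concerns process 200,
disable_current () removes 100 and leaves 200 enabled, which is the
wrong way round; disable_for (200) touches only the process it was
asked about.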

I have written comments for the remaining changes in the patch.

Kindly give me feedback if anything can be done better or is incorrect. If not, kindly push this patch so that AIX folks can have a better debugging experience.

Have a nice day ahead.

Thanks and regards,
Aditya.

-------------------------------------------
Code:-


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
  {
    child = fork ();
    if (child > 0)
      printf ("I am child \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;
  pthread_t thread[NUM_THREADS];

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      int res;

      res = pthread_create (&thread[i], NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
  }

  return 0;
}


---------------------------------------------------
Output with patch applied:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New Thread 515]
[New inferior 2 (Process 17498448)]
I am parent
[New inferior 3 (Process 11731454)]
I am parent
^C
Thread 1.1 received signal SIGINT, Interrupt.
[Switching to Thread 1]
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) inferior 2
[Switching to inferior 2 [Process 17498448] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]
[Switching to thread 2.1 (Process 17498448)]
#0  0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info threads
  Id   Target Id                          Frame
  1.1  Thread 1 (tid 25231849, running)   0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.2  Thread 258 (tid 33227061, running) thread_function (arg=0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
  1.3  Thread 515 (tid 23069149, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
* 2.1  Process 17498448                   0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  3.1  Process 11731454                   0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)
0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)
0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)
0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)
(*): Shared library is missing debugging information.


--------------------------------------------------------

Output without patch:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 1]
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 11731200)]
I am parent
[New inferior 3 (process 16843200)]
I am parent
^C
Thread 1.1 received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) inferior 2
[Switching to inferior 2 [process 11731200] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]
[Switching to thread 2.1 (process 11731200)]
#0  0xd0594fc8 in ?? ()
(gdb) info threads
  Id   Target Id         Frame
  1.1  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.2  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.3  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.4  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
* 2.1  process 11731200  0xd0594fc8 in ?? ()
  3.1  process 16843200  0xd0594fc8 in ?? ()
(gdb) info sharedlibrary
warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing.
warning: "/usr/lib/libcrypt.a": member "shr.o" missing.
warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing.
warning: "/usr/lib/libc.a": member "shr.o" missing.
warning: Could not load shared library symbols for 4 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o).
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
From        To          Syms Read   Shared Object Library
                        No          /usr/lib/libpthreads.a(shr_comm.o)
                        No          /usr/lib/libcrypt.a(shr.o)
                        No          /usr/lib/libpthread.a(shr_xpg5.o)
                        No          /usr/lib/libc.a(shr.o)
(gdb)



________________________________
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Sent: 20 January 2023 20:14
To: simark@simark.ca <simark@simark.ca>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>In order to resolve this, I need one piece of information: how can we iterate over the
>threads of a particular process? Is there a built-in function for that?
>Kindly let me know; that should solve this issue.

Instead of iterate_over_threads you could use the all_threads() iterator directly;
this can be specialized to only return threads of one inferior like this:

       for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
       {
            ...
       }

>Also kindly give me feedback on this patch if I need to change anything.

I think this change in solib-aix.c is not quite correct:
+      std::string s = bfd_get_filename (object_bfd.get ());
+      if (s.find (member_name) != std::string::npos)
+      {
+       return object_bfd;
+      }

This matches the member name *anywhere* in the full filename,
which could lead to spurious matches, I think.  The test
should be more specific.
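
To illustrate the concern, here is a minimal standalone sketch, not GDB code. It assumes the renamed object file name has the form "ARCHIVE(MEMBER)", as seen in the "info sharedlibrary" listings in this thread. The helper names loose_match and exact_match, and the member "myshr.o", are hypothetical.

```cpp
#include <string>

// Loose test, as in the patch under review: true if MEMBER occurs
// anywhere inside FILENAME.
static bool
loose_match (const std::string &filename, const std::string &member)
{
  return filename.find (member) != std::string::npos;
}

// Stricter test (an assumption about what "more specific" could mean):
// FILENAME must be exactly "ARCHIVE(MEMBER)".
static bool
exact_match (const std::string &filename, const std::string &archive,
	     const std::string &member)
{
  return filename == archive + "(" + member + ")";
}
```

With archive "/usr/lib/libc.a" and member "shr.o", both tests accept "/usr/lib/libc.a(shr.o)", but only the loose test also accepts the hypothetical "/usr/lib/libc.a(myshr.o)".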

Bye,
Ulrich


[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 17098 bytes --]

From 1783be1c9161479a158a4e52dd0663f45e6b92e9 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Fri, 27 Jan 2023 08:19:56 -0600
Subject: [PATCH] Fix multi-thread debug bug in AIX

---
 gdb/aix-thread.c     | 189 ++++++++++++++++++++++++++-----------------
 gdb/rs6000-aix-nat.c |  10 +++
 gdb/solib-aix.c      |  15 ++++
 3 files changed, 141 insertions(+), 73 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..c6231e92f91 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -55,6 +55,7 @@
 #include <sys/reg.h>
 #include <sched.h>
 #include <sys/pthdebug.h>
+#include <map>
 
 #if !HAVE_DECL_GETTHRDS
 extern int getthrds (pid_t, struct thrdsinfo64 *, int, tid_t *, int);
@@ -70,7 +71,7 @@ static bool debug_aix_thread;
 
 /* Return whether to treat PID as a debuggable thread id.  */
 
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
+#define PD_TID(ptid)	(tmap[ptid.pid ()].pd_active && ptid.tid () != 0)
 
 /* Success and failure values returned by pthdb callbacks.  */
 
@@ -149,14 +150,6 @@ static aix_thread_target aix_thread_ops;
 
 static CORE_ADDR pd_brk_addr;
 
-/* Whether the current application is debuggable by pthdb.  */
-
-static int pd_able = 0;
-
-/* Whether a threaded application is being debugged.  */
-
-static int pd_active = 0;
-
 /* Whether the current architecture is 64-bit.  
    Only valid when pd_able is true.  */
 
@@ -191,9 +184,21 @@ static pthdb_callbacks_t pd_callbacks = {
   NULL
 };
 
-/* Current pthdb session.  */
+/* Aix variable structure.  */
+struct aix_thread_variables 
+{
+  /* Whether the current application is debuggable by pthdb.  */
+  int pd_able;
 
-static pthdb_session_t pd_session;
+  /* Whether a threaded application is being debugged.  */
+  int pd_active;
+
+  /* Current pthdb session.  */
+  pthdb_session_t pd_session;
+};
+
+/* Collection of Aix variables per inferior.  */
+static std::map<pid_t, aix_thread_variables> tmap;
 
 /* Return a printable representation of pthdebug function return
    STATUS.  */
@@ -508,14 +513,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
     if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+      inferior_ptid = ptid_t (user_current_pid);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -639,36 +643,32 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* Callback for counting GDB threads for process pid.  */
 
 static int
-giter_count (struct thread_info *thread, void *countp)
+giter_count (pid_t pid)
 {
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
-  return 0;
+  int gcount = 0;
+  process_stratum_target *proc_target
+    = current_inferior ()->process_target ();
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    gcount++;
+  return gcount;
 }
 
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* Callback for accumulating GDB thread pids.  */
 
 static int
-giter_accum (struct thread_info *thread, void *bufp)
+giter_accum (void *bufp, pid_t pid)
 {
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
+  process_stratum_target *proc_target
+    = current_inferior ()->process_target ();
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+  {
+    **(struct thread_info ***) bufp = tp;
+    (*(struct thread_info ***) bufp)++;
+  }
+
   return 0;
 }
 
@@ -719,7 +719,10 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      /* We also need to keep in mind Trap and interrupt or any
+         signal that needs to be handled in pd_update ().  */
+
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -750,6 +753,9 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -759,11 +765,11 @@ sync_threadlists (int pid)
 
   for (cmd = PTHDB_LIST_FIRST;; cmd = PTHDB_LIST_NEXT)
     {
-      status = pthdb_pthread (pd_session, &pdtid, cmd);
+      status = pthdb_pthread (tmap[pid].pd_session, &pdtid, cmd);
       if (status != PTHDB_SUCCESS || pdtid == PTHDB_INVALID_PTHREAD)
 	break;
 
-      status = pthdb_pthread_ptid (pd_session, pdtid, &pthid);
+      status = pthdb_pthread_ptid (tmap[pid].pd_session, pdtid, &pthid);
       if (status != PTHDB_SUCCESS || pthid == PTHDB_INVALID_PTID)
 	continue;
 
@@ -780,7 +786,7 @@ sync_threadlists (int pid)
 
   for (pi = 0; pi < pcount; pi++)
     {
-      status = pthdb_pthread_tid (pd_session, pbuf[pi].pdtid, &tid);
+      status = pthdb_pthread_tid (tmap[pid].pd_session, pbuf[pi].pdtid, &tid);
       if (status != PTHDB_SUCCESS)
 	tid = PTHDB_INVALID_TID;
       pbuf[pi].tid = tid;
@@ -790,10 +796,11 @@ sync_threadlists (int pid)
 
   /* Accumulate an array of GDB threads sorted by pid.  */
 
-  gcount = 0;
-  iterate_over_threads (giter_count, &gcount);
+  /* gcount is GDB thread count and pcount is pthreadlib thread count.  */
+
+  gcount = giter_count (pid);
   g = gbuf = XNEWVEC (struct thread_info *, gcount);
-  iterate_over_threads (giter_accum, &g);
+  giter_accum (&g, pid);
   qsort (gbuf, gcount, sizeof *gbuf, gcmp);
 
   /* Apply differences between the two arrays to GDB's thread list.  */
@@ -810,8 +817,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -830,6 +835,23 @@ sync_threadlists (int pid)
 
 	  cmp_result = ptid_cmp (pptid, gptid);
 
+	  /* If there is only one thread then we need not make the main 
+	     thread look like a thread.  It can stay as a process. This
+	     is useful when we have multiple inferiors, but only one is
+	     threaded.  So we need not make the other inferiors with only
+	     main thread, look like a threaded one.  For example, Thread
+	     1.1, 1.2, 2.1, 3.1 exists then it is useful to skip this for
+	     loop for 2.1 and 3.1 leaving them as main process thread with
+	     a dummy priv set.  */
+
+	  if (pcount == 1 && gcount == 1)
+	  {
+	    aix_thread_info *priv = new aix_thread_info;
+	    tp = find_thread_ptid (proc_target, gptid);
+	    tp->priv.reset (priv);
+	    break;
+	  }
+
 	  if (cmp_result == 0)
 	    {
 	      aix_thread_info *priv = get_aix_thread_info (gbuf[gi]);
@@ -841,8 +863,25 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      /* This is to make the main process thread now look
+		 like a thread.  */
+
+	      if (gptid.is_pid () && gptid.pid () == pptid.pid ())
+	      {
+		thread_change_ptid (proc_target, gptid, pptid);
+		aix_thread_info *priv = new aix_thread_info;
+		priv->pdtid = pbuf[pi].pdtid;
+		priv->tid = pbuf[pi].tid;
+		tp = find_thread_ptid (proc_target, pptid);
+		tp->priv.reset (priv);
+		pi++;
+		gi++;
+	      }
+	      else
+	      {
+		delete_thread (gbuf[gi]);
+		gi++;
+	      }
 	    }
 	  else
 	    {
@@ -888,10 +927,10 @@ pd_update (int pid)
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
 
-  if (!pd_active)
+  if (!tmap[pid].pd_active)
     return ptid_t (pid);
 
-  status = pthdb_session_update (pd_session);
+  status = pthdb_session_update (tmap[pid].pd_session);
   if (status != PTHDB_SUCCESS)
     return ptid_t (pid);
 
@@ -921,12 +960,12 @@ pd_activate (int pid)
 		
   status = pthdb_session_init (pid, arch64 ? PEM_64BIT : PEM_32BIT,
 			       PTHDB_FLAG_REGS, &pd_callbacks, 
-			       &pd_session);
+			       &tmap[pid].pd_session);
   if (status != PTHDB_SUCCESS)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
+  tmap[pid].pd_active = 1;
   return pd_update (pid);
 }
 
@@ -935,12 +974,12 @@ pd_activate (int pid)
 static void
 pd_deactivate (void)
 {
-  if (!pd_active)
+  if (!tmap[inferior_ptid.pid ()].pd_active)
     return;
-  pthdb_session_destroy (pd_session);
+  pthdb_session_destroy (tmap[inferior_ptid.pid ()].pd_session);
   
   pid_to_prc (&inferior_ptid);
-  pd_active = 0;
+  tmap[inferior_ptid.pid ()].pd_active = 0;
 }
 
 /* An object file has just been loaded.  Check whether the current
@@ -953,8 +992,15 @@ pd_enable (void)
   char *stub_name;
   struct bound_minimal_symbol ms;
 
+  /* Create set of variables for this inferior.  */
+  if (tmap.find (inferior_ptid.pid ()) == tmap.end ())
+  {
+    struct aix_thread_variables z = {0, 0};
+    tmap.insert ({inferior_ptid.pid (), z});
+  }
+
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (tmap[inferior_ptid.pid ()].pd_able)
     return;
 
   /* Check application word size.  */
@@ -978,7 +1024,7 @@ pd_enable (void)
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  tmap[inferior_ptid.pid ()].pd_able = 1; 
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
@@ -991,28 +1037,25 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  if (!tmap[inferior_ptid.pid ()].pd_able)
     return;
-  if (pd_active)
+  if (tmap[inferior_ptid.pid ()].pd_active)
     pd_deactivate ();
-  pd_able = 0;
+  tmap[inferior_ptid.pid ()].pd_able = 0;
   current_inferior ()->unpush_target (&aix_thread_ops);
+  tmap.erase (inferior_ptid.pid ());
 }
 
 /* new_objfile observer callback.
 
    If OBJFILE is non-null, check whether a threaded application is
-   being debugged, and if so, prepare for thread debugging.
-
-   If OBJFILE is null, stop debugging threads.  */
+   being debugged, and if so, prepare for thread debugging.  */
 
 static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
     pd_enable ();
-  else
-    pd_disable ();
 }
 
 /* Attach to process specified by ARGS.  */
@@ -1096,7 +1139,7 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
   gdb_assert (ptid.is_pid ());
 
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (!tmap[ptid.pid ()].pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
@@ -1233,7 +1276,7 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
 		"fetch_regs_user_thread %lx\n", (long) pdtid);
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (tmap[inferior_ptid.pid ()].pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: fetch_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1518,7 +1561,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Retrieve the thread's current context for its non-register
      values.  */
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (tmap[inferior_ptid.pid ()].pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1576,7 +1619,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 	ctx.fpscr = tmp_fpscr;
     }
 
-  status = pthdb_pthread_setcontext (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_setcontext (tmap[inferior_ptid.pid ()].pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: "
 	     "pthdb_pthread_setcontext returned %s"),
@@ -1741,7 +1784,7 @@ aix_thread_target::mourn_inferior ()
 {
   target_ops *beneath = this->beneath ();
 
-  pd_deactivate ();
+  pd_disable ();
   beneath->mourn_inferior ();
 }
 
@@ -1800,24 +1843,24 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
     /* i18n: Like "thread-identifier %d, [state] running, suspended" */
     buf.printf (_("tid %d"), (int)tid);
 
-  status = pthdb_pthread_state (pd_session, pdtid, &state);
+  status = pthdb_pthread_state (tmap[inferior_ptid.pid ()].pd_session, pdtid, &state);
   if (status != PTHDB_SUCCESS)
     state = PST_NOTSUP;
   buf.printf (", %s", state2str (state));
 
-  status = pthdb_pthread_suspendstate (pd_session, pdtid, 
+  status = pthdb_pthread_suspendstate (tmap[inferior_ptid.pid ()].pd_session, pdtid, 
 				       &suspendstate);
   if (status == PTHDB_SUCCESS && suspendstate == PSS_SUSPENDED)
     /* i18n: Like "Thread-Id %d, [state] running, suspended" */
     buf.printf (_(", suspended"));
 
-  status = pthdb_pthread_detachstate (pd_session, pdtid, 
+  status = pthdb_pthread_detachstate (tmap[inferior_ptid.pid ()].pd_session, pdtid, 
 				      &detachstate);
   if (status == PTHDB_SUCCESS && detachstate == PDS_DETACHED)
     /* i18n: Like "Thread-Id %d, [state] running, detached" */
     buf.printf (_(", detached"));
 
-  pthdb_pthread_cancelpend (pd_session, pdtid, &cancelpend);
+  pthdb_pthread_cancelpend (tmap[inferior_ptid.pid ()].pd_session, pdtid, &cancelpend);
   if (status == PTHDB_SUCCESS && cancelpend)
     /* i18n: Like "Thread-Id %d, [state] running, cancel pending" */
     buf.printf (_(", cancel pending"));
diff --git a/gdb/rs6000-aix-nat.c b/gdb/rs6000-aix-nat.c
index 2ac1f6e70b6..d1fece6faa7 100644
--- a/gdb/rs6000-aix-nat.c
+++ b/gdb/rs6000-aix-nat.c
@@ -95,6 +95,8 @@ class rs6000_nat_target final : public inf_ptrace_target
 
   ptid_t wait (ptid_t, struct target_waitstatus *, target_wait_flags) override;
 
+  std::string pid_to_str (ptid_t) override;
+
   /* Fork detection related functions, For adding multi process debugging
      support.  */
   void follow_fork (inferior *, ptid_t, target_waitkind, bool, bool) override;
@@ -619,6 +621,14 @@ rs6000_nat_target::xfer_partial (enum target_object object,
     }
 }
 
+std::string 
+rs6000_nat_target::pid_to_str (ptid_t ptid)
+{
+  if (!ptid.tid ())
+    return string_printf (_("Process %s"), pulongest (ptid.pid ())); 
+
+  return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
+}
 /* Wait for the child specified by PTID to do something.  Return the
    process ID of the child, or MINUS_ONE_PTID in case of error; store
    the status in *OURSTATUS.  */
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index f483f54de13..74676cbfa50 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -550,6 +550,10 @@ solib_aix_in_dynsym_resolve_code (CORE_ADDR pc)
   return 0;
 }
 
+/* For multi inferiors, post object file name change
+   we store the new names in this vector.  */
+std::vector<std::string> aix_slib_name;
+
 /* Implement the "bfd_open" target_so_ops method.  */
 
 static gdb_bfd_ref_ptr
@@ -618,6 +622,16 @@ solib_aix_bfd_open (const char *pathname)
       if (member_name == bfd_get_filename (object_bfd.get ()))
 	break;
 
+      std::string s = bfd_get_filename (object_bfd.get ());
+      auto it = aix_slib_name.begin ();
+      while (it != aix_slib_name.end ())
+      {
+	std::string s1 = *it;
+	if (s1.compare(s) == 0)
+	  return object_bfd;
+	it++;
+      }
+
       object_bfd = gdb_bfd_openr_next_archived_file (archive_bfd.get (),
 						     object_bfd.get ());
     }
@@ -644,6 +658,7 @@ solib_aix_bfd_open (const char *pathname)
   std::string fname = string_printf ("%s%s",
 				     bfd_get_filename (archive_bfd.get ()),
 				     sep);
+  aix_slib_name.push_back (fname);
   bfd_set_filename (object_bfd.get (), fname.c_str ());
 
   return object_bfd;
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-01-27 14:40                                                     ` Aditya Kamath1
@ 2023-01-30 19:54                                                       ` Tom Tromey
  2023-02-02  6:24                                                       ` Aditya Kamath1
  1 sibling, 0 replies; 49+ messages in thread
From: Tom Tromey @ 2023-01-30 19:54 UTC (permalink / raw)
  To: Aditya Kamath1 via Gdb-patches
  Cc: Ulrich Weigand, simark, Aditya Kamath1, Sangamesh Mallayya

>>>>> Aditya Kamath1 via Gdb-patches <gdb-patches@sourceware.org> writes:

> We now have all variables {pd_able, pd_active and pd_session} in a
> map from process ID to a structure. This makes it easier for the AIX GDB
> code in aix-thread.c to manage them per process.

I don't really know what this is about, but it's probably better to
attach the data directly to the inferior using the registry system.
(You can't use private_inferior as apparently that's reserved for the
process stratum.)

Search for registry<inferior> for some examples.


> Secondly, in the function pid_to_str () there is a beneath () call,
> which is why I had to put this function in rs6000-aix-nat.c file.

I wonder why it's necessary, as it seems to me that
aix_thread_target::pid_to_str should have already handled the 'thread'
case, so the inherited method ought to be good enough.

> ---------------------------------------------------
> Output with patch applied:-

Is there an existing gdb test case that exercises this code?
If not then it seems like a new test is warranted.

>  static void
>  pd_disable (void)
>  {
> -  if (!pd_able)
> +  if (!tmap[inferior_ptid.pid ()].pd_able)
>      return;
> -  if (pd_active)
> +  if (tmap[inferior_ptid.pid ()].pd_active)
>      pd_deactivate ();
> -  pd_able = 0;
> +  tmap[inferior_ptid.pid ()].pd_able = 0;
>    current_inferior ()->unpush_target (&aix_thread_ops);
> +  tmap.erase (inferior_ptid.pid ());
>  }

It's better to pass in a ptid or even the aix_thread_variables object
itself than to rely on globals in low-level functions like this.
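
As a rough standalone illustration of that shape (not GDB code, and not the registry API): the caller resolves the per-process state once and hands it to the low-level helpers, which never consult a global "current process". The names thread_state, state_by_pid, find_state, deactivate and disable are hypothetical stand-ins for the patch's aix_thread_variables, tmap, pd_deactivate and pd_disable.

```cpp
#include <map>

// Hypothetical stand-in for the patch's aix_thread_variables.
struct thread_state
{
  bool able = false;	/* Debuggable by the thread library.  */
  bool active = false;	/* Thread debugging currently active.  */
};

// Per-process state, keyed by process id.
static std::map<int, thread_state> state_by_pid;

// Look up the state for PID without creating it.  std::map::operator[]
// would default-insert an entry for every pid that is merely queried.
static thread_state *
find_state (int pid)
{
  auto it = state_by_pid.find (pid);
  return it == state_by_pid.end () ? nullptr : &it->second;
}

// Low-level helpers act only on the state they are handed.
static void
deactivate (thread_state &st)
{
  st.active = false;
}

static void
disable (thread_state &st)
{
  if (!st.able)
    return;
  if (st.active)
    deactivate (st);
  st.able = false;
}
```

A caller would do something like "if (thread_state *st = find_state (pid)) disable (*st);", so which process is affected is decided in one visible place.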

 
> diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
> index f483f54de13..74676cbfa50 100644
> --- a/gdb/solib-aix.c
> +++ b/gdb/solib-aix.c
> @@ -550,6 +550,10 @@ solib_aix_in_dynsym_resolve_code (CORE_ADDR pc)
>    return 0;
>  }
 
> +/* For multi inferiors, post object file name change
> +   we store the new names in this vector.  */
> +std::vector<std::string> aix_slib_name;

> +      std::string s = bfd_get_filename (object_bfd.get ());
> +      auto it = aix_slib_name.begin ();
> +      while (it != aix_slib_name.end ())
> +      {
> +	std::string s1 = *it;
> +	if (s1.compare(s) == 0)
> +	  return object_bfd;
> +	it++;

This doesn't look right to me at all.  Using a global means that BFDs
from one inferior might "leak" to another, based solely on whether a
certain name was ever seen.  Also nothing ever cleans out the global
vector.

It's better to attach this data to the relevant BFD using the registry
system, and not use a global at all.
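
A small standalone sketch of the difference, under the assumption that a flag stored on the object can stand in for registry-attached data. This is not GDB's BFD type or registry API; fake_bfd, seen_names, seen_globally and rename_member are hypothetical names.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a BFD.
struct fake_bfd
{
  std::string filename;
  bool renamed_by_solib = false;	/* Data attached to this object only.  */
};

// Global approach: names accumulate and are never removed.
static std::vector<std::string> seen_names;

// True if some object with this file name was ever renamed.
static bool
seen_globally (const fake_bfd &b)
{
  for (const std::string &name : seen_names)
    if (name == b.filename)
      return true;
  return false;
}

// Rename B and record the fact on B itself; optionally also record the
// new name in the global list to show how the two approaches differ.
static void
rename_member (fake_bfd &b, const std::string &new_name, bool record_globally)
{
  b.filename = new_name;
  b.renamed_by_solib = true;
  if (record_globally)
    seen_names.push_back (new_name);
}
```

After one object is renamed with record_globally set, any unrelated object that happens to carry the same file name is reported by seen_globally, while its own renamed_by_solib flag stays false.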

Tom

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-01-27 14:40                                                     ` Aditya Kamath1
  2023-01-30 19:54                                                       ` Tom Tromey
@ 2023-02-02  6:24                                                       ` Aditya Kamath1
  2023-02-02  6:35                                                         ` Aditya Kamath1
  1 sibling, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2023-02-02  6:24 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 17722 bytes --]

Hi Tom, Ulrich and community,

Thank you for the feedback for the fix of this bug. Please find attached the patch. {See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch}.

So, I have fixed the bug and it works alright. Please find the test program, output with patch and without patch pasted below this email.

>> We now have all variables {pd_able, pd_active and pd_session} in a
>> map from process ID to a structure. This makes it easier for the AIX GDB
>> code in aix-thread.c to manage them per process.

>I don't really know what this is about, but it's probably better to
>attach the data directly to the inferior using the registry system.
>(You can't use private_inferior as apparently that's reserved for the
>process stratum.)

>Search for registry<inferior> for some examples.

>It's better to pass in a ptid or even the aix_thread_variables object
>itself than to rely on globals in low-level functions like this.

I have taken care of this; we now use the registry. Thank you for the suggestion. I did not know about it, and it is a very nice feature.

>> Secondly, in the function pid_to_str () there is a beneath () call,
>> which is why I had to put this function in rs6000-aix-nat.c file.

>I wonder why it's necessary, as it seems to me that
>aix_thread_target::pid_to_str should have already handled the 'thread'
>case, so the inherited method ought to be good enough.

I have removed this. I made a mistake while analysing this solution; thank you for pointing it out. It works without that function. Kindly check the output below.

>Is there an existing gdb test case that exercises this code?
>If not then it seems like a new test is warranted.

I am not aware of one; I could not find any when I looked. What we need is a test case that checks whether the shared libraries are loaded for every new inferior and whether the top target is set correctly when debugging threads.
If such a test exists, I would like to know.

>> +       return object_bfd;
>> +     it++;

>This doesn't look right to me at all.  Using a global means that BFDs
>from one inferior might "leak" to another, based solely on whether a
>certain name was ever seen.  Also nothing ever cleans out the global
>vector.

>It's better to attach this data to the relevant BFD using the registry
>system, and not use a global at all.

We already attach this data through these lines in the same function:

  std::string fname = string_printf ("%s%s",
                                     bfd_get_filename (archive_bfd.get ()),
                                     sep);
  bfd_set_filename (object_bfd.get (), fname.c_str ());

All we need is the right match for the name of the shared library. We already have the pathname variable, so I used it and removed the vector. Kindly see it in the patch. You were right: nothing would ever have cleaned that vector.

Kindly let me know if anything can be done better or is incorrect. If not, kindly push this patch so that AIX folks can have a better debugging experience.

Have a nice day ahead.

Thanks and regards,
Aditya.

-------------------------------------------
Code:-


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
  {
    child = fork ();
    if (child > 0)
      printf ("I am child \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;
  pthread_t thread[NUM_THREADS];

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      int res;

      res = pthread_create (&thread[i], NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
  }

  return 0;
}

------------------------

Output with patch:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 15728962)]
I am parent
[New inferior 3 (process 20382144)]
I am parent
^C
Thread 1.1 received signal SIGINT, Interrupt.
[Switching to Thread 1]
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info threads
  Id   Target Id                          Frame
* 1.1  Thread 1 (tid 34144675, running)   0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.2  Thread 258 (tid 30146951, running) thread_function (arg=0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
  1.3  Thread 515 (tid 37159321, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
  2.1  process 15728962                   0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  3.1  process 20382144                   0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)
0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)
0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)
0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)
(*): Shared library is missing debugging information.
(gdb) inferior 2
[Switching to inferior 2 [process 15728962] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]
[Switching to thread 2.1 (process 15728962)]
#0  0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)

----------------------------

Output without patch:-
Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 1]
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 11731200)]
I am parent
[New inferior 3 (process 16843200)]
I am parent
^C
Thread 1.1 received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) inferior 2
[Switching to inferior 2 [process 11731200] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]
[Switching to thread 2.1 (process 11731200)]
#0  0xd0594fc8 in ?? ()
(gdb) info threads
  Id   Target Id         Frame
  1.1  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.2  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.3  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.4  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
* 2.1  process 11731200  0xd0594fc8 in ?? ()
  3.1  process 16843200  0xd0594fc8 in ?? ()
(gdb) info sharedlibrary
warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing.
warning: "/usr/lib/libcrypt.a": member "shr.o" missing.
warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing.
warning: "/usr/lib/libc.a": member "shr.o" missing.
warning: Could not load shared library symbols for 4 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o).
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
From        To          Syms Read   Shared Object Library
                        No          /usr/lib/libpthreads.a(shr_comm.o)
                        No          /usr/lib/libcrypt.a(shr.o)
                        No          /usr/lib/libpthread.a(shr_xpg5.o)
                        No          /usr/lib/libc.a(shr.o)
(gdb)



________________________________
From: Gdb-patches <gdb-patches-bounces+aditya.kamath1=ibm.com@sourceware.org> on behalf of Aditya Kamath1 via Gdb-patches <gdb-patches@sourceware.org>
Sent: 27 January 2023 20:10
To: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>; simark@simark.ca <simark@simark.ca>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: [EXTERNAL] Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Hi Ulrich and community,

Thank you for the feedback for the fix of this bug. Please find attached the patch. {See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch}

So, I have fixed the bug and it works alright. Please find the test program, output with patch and without patch pasted below this email.

>+      if (s.find (member_name) != std::string::npos)
>+      {
>+       return object_bfd;
>+      }

>This matches the member name *anywhere* in the full filename,
>which could lead to spurious matches, I think.  The test
>should be more specific.

I have taken care of this in the patch.

There are a few changes that I want to explain below.

We now have all variables {pd_able, pd_active and pd_session} in a map from process ID to a structure. This makes it easier for the AIX GDB code in aix-thread.c to manage them per process.

Secondly, the function pid_to_str () contains a beneath () call, which is why I had to put this function in the rs6000-aix-nat.c file.

Third, if there was no object file, we previously called pd_disable () to disable thread debugging. This is incorrect now that we support multiple inferiors. Since the new-object-file function relies on inferior_ptid up to a point, we must disable thread debugging only when we mourn the inferior or the process dies. Otherwise, there is every chance that we disable thread debugging for the wrong inferior, namely whichever one inferior_ptid currently refers to. It would also clear pd_active for the wrong inferior when a new inferior is born and its object file is being loaded. This change can be seen in the patch.

I have written comments for the remaining changes in the patch.

Kindly let me know if anything can be done better or is incorrect. If not, kindly push this patch so that AIX folks can have a better debugging experience.

Have a nice day ahead.

Thanks and regards,
Aditya.

-------------------------------------------
Code:-


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
  {
    child = fork ();
    if (child > 0)
      printf ("I am child \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;
  pthread_t thread[NUM_THREADS];

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      int res;

      res = pthread_create (&thread[i], NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
  }

  return 0;
}


---------------------------------------------------
Output with patch applied:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) set detach-on-fork off

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 258]

[New Thread 515]

[New inferior 2 (Process 17498448)]

I am parent

[New inferior 3 (Process 11731454)]

I am parent

^C

Thread 1.1 received signal SIGINT, Interrupt.

[Switching to Thread 1]

0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) inferior 2

[Switching to inferior 2 [Process 17498448] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]

[Switching to thread 2.1 (Process 17498448)]

#0  0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) info threads

  Id   Target Id                          Frame

  1.1  Thread 1 (tid 25231849, running)   0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.2  Thread 258 (tid 33227061, running) thread_function (arg=0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

  1.3  Thread 515 (tid 23069149, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

* 2.1  Process 17498448                   0xd0594fc8 in _sigsetmask ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  3.1  Process 11731454                   0xd0594fc8 in _sigsetmask ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) info sharedlibrary

From        To          Syms Read   Shared Object Library

0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)

0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)

0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)

0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)

(*): Shared library is missing debugging information.


--------------------------------------------------------

Output without patch:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) set detach-on-fork off

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 1]

[New Thread 258]

[New Thread 515]

[New inferior 2 (process 11731200)]

I am parent

[New inferior 3 (process 16843200)]

I am parent

^C

Thread 1.1 received signal SIGINT, Interrupt.

0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) inferior 2

[Switching to inferior 2 [process 11731200] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]

[Switching to thread 2.1 (process 11731200)]

#0  0xd0594fc8 in ?? ()

(gdb) info threads

  Id   Target Id         Frame

  1.1  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.2  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.3  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.4  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

* 2.1  process 11731200  0xd0594fc8 in ?? ()

  3.1  process 16843200  0xd0594fc8 in ?? ()

(gdb) info sharedlibrary

warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing.

warning: "/usr/lib/libcrypt.a": member "shr.o" missing.

warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing.

warning: "/usr/lib/libc.a": member "shr.o" missing.

warning: Could not load shared library symbols for 4 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o).

Use the "info sharedlibrary" command to see the complete listing.

Do you need "set solib-search-path" or "set sysroot"?

From        To          Syms Read   Shared Object Library

                        No          /usr/lib/libpthreads.a(shr_comm.o)

                        No          /usr/lib/libcrypt.a(shr.o)

                        No          /usr/lib/libpthread.a(shr_xpg5.o)

                        No          /usr/lib/libc.a(shr.o)

(gdb)



________________________________
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Sent: 20 January 2023 20:14
To: simark@simark.ca <simark@simark.ca>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>Inorder to resolve the same I request for one information. How can we iterate_over_threads
>of a particular process. What is that function. Is there any built-in available??
>Kindly let me know and that should solve this issue.

Instead of iterate_over_threads you could use the all_threads() iterator directly;
this can be specialized to only return threads of one inferior like this:

       for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
       {
            ...
       }

>Also kindly give me feedback on this patch if I need to change anything.

I think this change in solib-aix.c is not quite correct:
+      std::string s = bfd_get_filename (object_bfd.get ());
+      if (s.find (member_name) != std::string::npos)
+      {
+       return object_bfd;
+      }

This matches the member name *anywhere* in the full filename,
which could lead to spurious matches, I think.  The test
should be more specific.
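
To illustrate the concern, here is a small self-contained sketch. The helper names matches_anywhere and matches_exactly and the member name "myshr.o" are made up for illustration and are not taken from solib-aix.c. A substring search accepts any filename that merely contains the member name, while an exact comparison does not:

```cpp
#include <cassert>
#include <string>

/* Substring test, as in the quoted hunk: true if MEMBER_NAME occurs
   anywhere inside FILENAME.  */
bool
matches_anywhere (const std::string &filename,
		  const std::string &member_name)
{
  /* An empty needle would be found in every filename.  */
  assert (!member_name.empty ());
  return filename.find (member_name) != std::string::npos;
}

/* Stricter test: true only if FILENAME is exactly EXPECTED.  */
bool
matches_exactly (const std::string &filename,
		 const std::string &expected)
{
  return filename == expected;
}
```

With the member name "shr.o", matches_anywhere accepts both "/usr/lib/libc.a(shr.o)" and a different member such as "/usr/lib/libc.a(myshr.o)", whereas matches_exactly against "/usr/lib/libc.a(shr.o)" accepts only the first.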

Bye,
Ulrich


[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 20504 bytes --]

From 414eef8a95e6002f7c96a43f9b84aaa5bde8570c Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Wed, 1 Feb 2023 23:54:11 -0600
Subject: [PATCH] Fix Multi thread debug bug fix in AIX

 The recent commit 98ed24fb35d89eb20179edf6c12f599c7a9e228e made by Tom changed the aix-thread.c file to use

 gdb::checked_static_cast <aix_thread_info *> instead of static_cast <aix_thread_info *>.

 As a result, AIX users on the latest version are not able to debug multi-threaded programs.

The error on AIX is as follows:

internal-error: checked_static_cast: Assertion `result != nullptr' failed.

The cause is that AIX shared libraries were not being loaded for a new inferior and the top target was not set properly.

This patch fixes both problems.
---
 gdb/aix-thread.c | 276 +++++++++++++++++++++++++++++++++--------------
 gdb/solib-aix.c  |  10 ++
 2 files changed, 206 insertions(+), 80 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..adc17c86d66 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -70,7 +70,7 @@ static bool debug_aix_thread;
 
 /* Return whether to treat PID as a debuggable thread id.  */
 
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
+#define PD_TID(ptid, data)	(data->pd_active && ptid.tid () != 0)
 
 /* Success and failure values returned by pthdb callbacks.  */
 
@@ -149,14 +149,6 @@ static aix_thread_target aix_thread_ops;
 
 static CORE_ADDR pd_brk_addr;
 
-/* Whether the current application is debuggable by pthdb.  */
-
-static int pd_able = 0;
-
-/* Whether a threaded application is being debugged.  */
-
-static int pd_active = 0;
-
 /* Whether the current architecture is 64-bit.  
    Only valid when pd_able is true.  */
 
@@ -191,9 +183,60 @@ static pthdb_callbacks_t pd_callbacks = {
   NULL
 };
 
-/* Current pthdb session.  */
+/* Aix variable structure.  */
+struct aix_thread_variables 
+{
+  /* Whether the current application is debuggable by pthdb.  */
+  int pd_able;
+
+  /* Whether a threaded application is being debugged.  */
+  int pd_active;
+
+  /* Current pthdb session.  */
+  pthdb_session_t pd_session;
+};
+
+/* Key to our per-inferior data.  */
+static const registry<inferior>::key<aix_thread_variables>
+  aix_thread_variables_handle;
+
+/* Function to Get aix_thread_variables data.  */
+static struct aix_thread_variables*
+get_aix_thread_variables_data (struct inferior *inf)
+{
+  if (inf == NULL)
+    return NULL;
+
+  struct aix_thread_variables* data;
+
+  data = aix_thread_variables_handle.get (inf);
+  if (data == NULL)
+    data = aix_thread_variables_handle.emplace (inf);
 
-static pthdb_session_t pd_session;
+  return data;
+}
+
+/* Helper to get data for ptid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_ptid (ptid_t ptid)
+{
+  struct aix_thread_variables *data;
+  inferior *inf = find_inferior_ptid (current_inferior ()->process_target (),
+					ptid);
+  return get_aix_thread_variables_data (inf);
+}
+
+/* Helper to get data for pid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_pid (pid_t pid)
+{
+  struct aix_thread_variables *data;
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (),
+                                        pid);
+  return get_aix_thread_variables_data (inf);
+}
 
 /* Return a printable representation of pthdebug function return
    STATUS.  */
@@ -316,9 +359,11 @@ static void
 pid_to_prc (ptid_t *ptidp)
 {
   ptid_t ptid;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (*ptidp);
 
   ptid = *ptidp;
-  if (PD_TID (ptid))
+  if (PD_TID (ptid, data))
     *ptidp = ptid_t (ptid.pid ());
 }
 
@@ -508,14 +553,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
     if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+      inferior_ptid = ptid_t (user_current_pid);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -639,36 +683,32 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* Callback for counting GDB threads for process pid.  */
 
 static int
-giter_count (struct thread_info *thread, void *countp)
+giter_count (pid_t pid)
 {
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
-  return 0;
+  int gcount = 0;
+  process_stratum_target *proc_target
+    = current_inferior ()->process_target ();
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    gcount++;
+  return gcount;
 }
 
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* Callback for accumulating GDB thread pids.  */
 
 static int
-giter_accum (struct thread_info *thread, void *bufp)
+giter_accum (void *bufp, pid_t pid)
 {
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
+  process_stratum_target *proc_target
+    = current_inferior ()->process_target ();
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+  {
+    **(struct thread_info ***) bufp = tp;
+    (*(struct thread_info ***) bufp)++;
+  }
+
   return 0;
 }
 
@@ -719,7 +759,10 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      /* We also need to keep in mind Trap and interrupt or any
+         signal that needs to be handled in pd_update ().  */
+
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -750,6 +793,11 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -759,11 +807,11 @@ sync_threadlists (int pid)
 
   for (cmd = PTHDB_LIST_FIRST;; cmd = PTHDB_LIST_NEXT)
     {
-      status = pthdb_pthread (pd_session, &pdtid, cmd);
+      status = pthdb_pthread (data->pd_session, &pdtid, cmd);
       if (status != PTHDB_SUCCESS || pdtid == PTHDB_INVALID_PTHREAD)
 	break;
 
-      status = pthdb_pthread_ptid (pd_session, pdtid, &pthid);
+      status = pthdb_pthread_ptid (data->pd_session, pdtid, &pthid);
       if (status != PTHDB_SUCCESS || pthid == PTHDB_INVALID_PTID)
 	continue;
 
@@ -780,7 +828,7 @@ sync_threadlists (int pid)
 
   for (pi = 0; pi < pcount; pi++)
     {
-      status = pthdb_pthread_tid (pd_session, pbuf[pi].pdtid, &tid);
+      status = pthdb_pthread_tid (data->pd_session, pbuf[pi].pdtid, &tid);
       if (status != PTHDB_SUCCESS)
 	tid = PTHDB_INVALID_TID;
       pbuf[pi].tid = tid;
@@ -790,10 +838,11 @@ sync_threadlists (int pid)
 
   /* Accumulate an array of GDB threads sorted by pid.  */
 
-  gcount = 0;
-  iterate_over_threads (giter_count, &gcount);
+  /* gcount is GDB thread count and pcount is pthreadlib thread count.  */
+
+  gcount = giter_count (pid);
   g = gbuf = XNEWVEC (struct thread_info *, gcount);
-  iterate_over_threads (giter_accum, &g);
+  giter_accum (&g, pid);
   qsort (gbuf, gcount, sizeof *gbuf, gcmp);
 
   /* Apply differences between the two arrays to GDB's thread list.  */
@@ -810,8 +859,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -830,6 +877,23 @@ sync_threadlists (int pid)
 
 	  cmp_result = ptid_cmp (pptid, gptid);
 
+	  /* If there is only one thread then we need not make the main 
+	     thread look like a thread.  It can stay as a process. This
+	     is useful when we have multiple inferiors, but only one is
+	     threaded.  So we need not make the other inferiors with only
+	     main thread, look like a threaded one.  For example, Thread
+	     1.1, 1.2, 2.1, 3.1 exists then it is useful to skip this for
+	     loop for 2.1 and 3.1 leaving them as main process thread with
+	     a dummy priv set.  */
+
+	  if (pcount == 1 && gcount == 1)
+	  {
+	    aix_thread_info *priv = new aix_thread_info;
+	    tp = find_thread_ptid (proc_target, gptid);
+	    tp->priv.reset (priv);
+	    break;
+	  }
+
 	  if (cmp_result == 0)
 	    {
 	      aix_thread_info *priv = get_aix_thread_info (gbuf[gi]);
@@ -841,8 +905,25 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      /* This is to make the main process thread now look
+		 like a thread.  */
+
+	      if (gptid.is_pid () && gptid.pid () == pptid.pid ())
+	      {
+		thread_change_ptid (proc_target, gptid, pptid);
+		aix_thread_info *priv = new aix_thread_info;
+		priv->pdtid = pbuf[pi].pdtid;
+		priv->tid = pbuf[pi].tid;
+		tp = find_thread_ptid (proc_target, pptid);
+		tp->priv.reset (priv);
+		pi++;
+		gi++;
+	      }
+	      else
+	      {
+		delete_thread (gbuf[gi]);
+		gi++;
+	      }
 	    }
 	  else
 	    {
@@ -888,10 +969,13 @@ pd_update (int pid)
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
 
-  if (!pd_active)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
+
+  if (!data->pd_active)
     return ptid_t (pid);
 
-  status = pthdb_session_update (pd_session);
+  status = pthdb_session_update (data->pd_session);
   if (status != PTHDB_SUCCESS)
     return ptid_t (pid);
 
@@ -918,15 +1002,17 @@ static ptid_t
 pd_activate (int pid)
 {
   int status;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
 		
   status = pthdb_session_init (pid, arch64 ? PEM_64BIT : PEM_32BIT,
 			       PTHDB_FLAG_REGS, &pd_callbacks, 
-			       &pd_session);
+			       &data->pd_session);
   if (status != PTHDB_SUCCESS)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
+  data->pd_active = 1;
   return pd_update (pid);
 }
 
@@ -935,12 +1021,15 @@ pd_activate (int pid)
 static void
 pd_deactivate (void)
 {
-  if (!pd_active)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!data->pd_active)
     return;
-  pthdb_session_destroy (pd_session);
+  pthdb_session_destroy (data->pd_session);
   
   pid_to_prc (&inferior_ptid);
-  pd_active = 0;
+  data->pd_active = 0;
 }
 
 /* An object file has just been loaded.  Check whether the current
@@ -953,8 +1042,14 @@ pd_enable (void)
   char *stub_name;
   struct bound_minimal_symbol ms;
 
+  if (!inferior_ptid.pid ())
+    return;
+  
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (data->pd_able)
     return;
 
   /* Check application word size.  */
@@ -978,7 +1073,7 @@ pd_enable (void)
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  data->pd_able = 1; 
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
@@ -991,28 +1086,28 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!data->pd_able)
     return;
-  if (pd_active)
+  if (data->pd_active)
     pd_deactivate ();
-  pd_able = 0;
+  data->pd_able = 0;
   current_inferior ()->unpush_target (&aix_thread_ops);
+  //tmap.erase (inferior_ptid.pid ());
 }
 
 /* new_objfile observer callback.
 
    If OBJFILE is non-null, check whether a threaded application is
-   being debugged, and if so, prepare for thread debugging.
-
-   If OBJFILE is null, stop debugging threads.  */
+   being debugged, and if so, prepare for thread debugging.  */
 
 static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
     pd_enable ();
-  else
-    pd_disable ();
 }
 
 /* Attach to process specified by ARGS.  */
@@ -1042,8 +1137,10 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 {
   struct thread_info *thread;
   pthdb_tid_t tid[2];
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (ptid);
 
-  if (!PD_TID (ptid))
+  if (!PD_TID (ptid, data))
     {
       scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
       
@@ -1095,8 +1192,11 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
      pid-only ptids.  */
   gdb_assert (ptid.is_pid ());
 
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (ptid);
+
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
@@ -1229,11 +1329,13 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
   ppc_gdbarch_tdep *tdep = gdbarch_tdep<ppc_gdbarch_tdep> (gdbarch);
   int status, i;
   pthdb_context_t ctx;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
 		"fetch_regs_user_thread %lx\n", (long) pdtid);
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: fetch_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1362,8 +1464,10 @@ aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
 {
   struct thread_info *thread;
   pthdb_tid_t tid;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
-  if (!PD_TID (regcache->ptid ()))
+  if (!PD_TID (regcache->ptid (), data))
     beneath ()->fetch_registers (regcache, regno);
   else
     {
@@ -1511,6 +1615,8 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   pthdb_context_t ctx;
   uint32_t int32;
   uint64_t int64;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1518,7 +1624,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Retrieve the thread's current context for its non-register
      values.  */
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1576,7 +1682,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 	ctx.fpscr = tmp_fpscr;
     }
 
-  status = pthdb_pthread_setcontext (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_setcontext (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: "
 	     "pthdb_pthread_setcontext returned %s"),
@@ -1702,8 +1808,10 @@ aix_thread_target::store_registers (struct regcache *regcache, int regno)
 {
   struct thread_info *thread;
   pthdb_tid_t tid;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
-  if (!PD_TID (regcache->ptid ()))
+  if (!PD_TID (regcache->ptid (), data))
     beneath ()->store_registers (regcache, regno);
   else
     {
@@ -1741,7 +1849,7 @@ aix_thread_target::mourn_inferior ()
 {
   target_ops *beneath = this->beneath ();
 
-  pd_deactivate ();
+  pd_disable ();
   beneath->mourn_inferior ();
 }
 
@@ -1750,7 +1858,10 @@ aix_thread_target::mourn_inferior ()
 bool
 aix_thread_target::thread_alive (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!PD_TID (ptid, data))
     return beneath ()->thread_alive (ptid);
 
   /* We update the thread list every time the child stops, so all
@@ -1766,7 +1877,10 @@ aix_thread_target::thread_alive (ptid_t ptid)
 std::string
 aix_thread_target::pid_to_str (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (ptid);
+
+  if (!PD_TID (ptid, data))
     return beneath ()->pid_to_str (ptid);
 
   return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
@@ -1786,8 +1900,10 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
   pthdb_detachstate_t detachstate;
   int cancelpend;
   static char *ret = NULL;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (thread->ptid);
 
-  if (!PD_TID (thread->ptid))
+  if (!PD_TID (thread->ptid, data))
     return NULL;
 
   string_file buf;
@@ -1800,24 +1916,24 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
     /* i18n: Like "thread-identifier %d, [state] running, suspended" */
     buf.printf (_("tid %d"), (int)tid);
 
-  status = pthdb_pthread_state (pd_session, pdtid, &state);
+  status = pthdb_pthread_state (data->pd_session, pdtid, &state);
   if (status != PTHDB_SUCCESS)
     state = PST_NOTSUP;
   buf.printf (", %s", state2str (state));
 
-  status = pthdb_pthread_suspendstate (pd_session, pdtid, 
+  status = pthdb_pthread_suspendstate (data->pd_session, pdtid, 
 				       &suspendstate);
   if (status == PTHDB_SUCCESS && suspendstate == PSS_SUSPENDED)
     /* i18n: Like "Thread-Id %d, [state] running, suspended" */
     buf.printf (_(", suspended"));
 
-  status = pthdb_pthread_detachstate (pd_session, pdtid, 
+  status = pthdb_pthread_detachstate (data->pd_session, pdtid, 
 				      &detachstate);
   if (status == PTHDB_SUCCESS && detachstate == PDS_DETACHED)
     /* i18n: Like "Thread-Id %d, [state] running, detached" */
     buf.printf (_(", detached"));
 
-  pthdb_pthread_cancelpend (pd_session, pdtid, &cancelpend);
+  pthdb_pthread_cancelpend (data->pd_session, pdtid, &cancelpend);
   if (status == PTHDB_SUCCESS && cancelpend)
     /* i18n: Like "Thread-Id %d, [state] running, cancel pending" */
     buf.printf (_(", cancel pending"));
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index f483f54de13..09d033ef473 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -565,6 +565,7 @@ solib_aix_bfd_open (const char *pathname)
   const char *sep;
   int filename_len;
   int found_file;
+  std::string string_path = pathname;
 
   if (pathname[path_len - 1] != ')')
     return solib_bfd_open (pathname);
@@ -618,6 +619,15 @@ solib_aix_bfd_open (const char *pathname)
       if (member_name == bfd_get_filename (object_bfd.get ()))
 	break;
 
+      std::string s = bfd_get_filename (object_bfd.get ());
+
+      /* For every inferior after first int bfd system we 
+	 will have the pathname instead of the member name
+	 registered. Hence the below condition exists.  */
+
+      if (string_path.compare (s) == 0)
+	return object_bfd;
+
       object_bfd = gdb_bfd_openr_next_archived_file (archive_bfd.get (),
 						     object_bfd.get ());
     }
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-02  6:24                                                       ` Aditya Kamath1
@ 2023-02-02  6:35                                                         ` Aditya Kamath1
  2023-02-02 17:43                                                           ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2023-02-02  6:35 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches, Aditya Kamath1; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 26818 bytes --]

Hi Tom, Ulrich and community,

I made a mistake by not removing a commented-out line in the last patch. I apologise for that. Kindly disregard that patch.

Thank you for the feedback on the fix for this bug. Please find the patch attached. {See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch}.

I have fixed the bug and it works correctly. The test program and the output with and without the patch are pasted below this email.

>> We now have all variables {pd_able, pd_active and pd_session} now in a
>> map of process ID and structure. This will help us make AIX GDB code
>> easy to manage them per process in the aix-thread.c file.

>I don't really know what this is about, but it's probably better to
>attach the data directly to the inferior using the registry system.
>(You can't use private_inferior as apparently that's reserved for the
>process stratum.)

>Search for registry<inferior> for some examples.

>It's better to pass in a ptid or even the aix_thread_variables object
>itself than to rely on globals in low-level functions like this.

I have taken care of this; we now use the registry. Thank you for the suggestion. I was not aware of it. It is a very nice feature.
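
As a rough illustration of why attaching the data to the inferior helps, here is a toy model of the get-or-create lookup. It is not GDB's registry API: inferior_model, thread_vars and get_thread_vars are hypothetical names, and a std::unique_ptr member stands in for the registry slot:

```cpp
#include <cassert>
#include <memory>

/* Hypothetical per-inferior thread-debugging state.  */
struct thread_vars
{
  int pd_able = 0;
  int pd_active = 0;
};

/* Toy inferior that owns its attached data, so the data is destroyed
   together with the inferior and no global container needs cleanup.  */
struct inferior_model
{
  int pid;
  std::unique_ptr<thread_vars> thread_data;
};

/* Return the data attached to INF, creating it on first use.
   Returns nullptr if INF is nullptr.  */
thread_vars *
get_thread_vars (inferior_model *inf)
{
  if (inf == nullptr)
    return nullptr;

  if (inf->thread_data == nullptr)
    inf->thread_data.reset (new thread_vars ());

  assert (inf->thread_data != nullptr);
  return inf->thread_data.get ();
}
```

Compared with a global map keyed by process ID, the state here cannot outlive or leak across inferiors, because each inferior carries its own copy.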

>> Secondly, in the function pid_to_str () there is a beneath () call,
>> which is why I had to put this function in rs6000-aix-nat.c file.

>I wonder why it's necessary, as it seems to me that
>aix_thread_target::pid_to_str should have already handled the 'thread'
>case, so the inherited method ought to be good enough.

I have removed this. I made a mistake while analysing this solution. Thank you for pointing it out. It works without it. Kindly check the output below.

>Is there an existing gdb test case that exercises this code?
>If not then it seems like a new test is warranted.

I am not aware of one; at least I could not find one when I looked. What we need is a test case that checks whether the shared library is loaded for every new inferior and whether the top target is set correctly when debugging threads.
If something exists, I would like to know.

>> +       return object_bfd;
>> +     it++;

>This doesn't look right to me at all.  Using a global means that BFDs
>from one inferior might "leak" to another, based solely on whether a
>certain name was ever seen.  Also nothing ever cleans out the global
>vector.

>It's better to attach this data to the relevant BFD using the registry
>system, and not use a global at all.

We already attach this data with these lines in the same function:

  std::string fname = string_printf ("%s%s",
                                     bfd_get_filename (archive_bfd.get ()),
                                     sep);
  bfd_set_filename (object_bfd.get (), fname.c_str ());


All we need is the right match for the name of the shared library. We already have the pathname variable, so I used it and removed the vector. Kindly see this in the patch. You were right: nothing would ever have cleaned out that vector.
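
For context, the pathname has the AIX form archive(member), for example /usr/lib/libc.a(shr.o) in the output below. A minimal sketch of splitting such a name into its two parts follows. split_archive_member is a hypothetical helper written for this illustration, not the code in solib-aix.c:

```cpp
#include <cassert>
#include <string>
#include <utility>

/* Split an AIX-style "archive(member)" PATHNAME into its two parts.
   Returns {PATHNAME, ""} if PATHNAME does not have that shape.  */
std::pair<std::string, std::string>
split_archive_member (const std::string &pathname)
{
  if (pathname.empty () || pathname.back () != ')')
    return { pathname, std::string () };

  std::string::size_type open = pathname.rfind ('(');
  if (open == std::string::npos)
    return { pathname, std::string () };

  /* The closing parenthesis is the last character, so the opening
     one must come strictly before it.  */
  assert (open + 1 < pathname.size ());

  /* Everything before '(' is the archive; the text between the
     parentheses is the member.  */
  std::string archive = pathname.substr (0, open);
  std::string member
    = pathname.substr (open + 1, pathname.size () - open - 2);
  return { archive, member };
}
```

Comparing the complete pathname string, rather than only the member part, keeps a member such as shr.o in one archive from being confused with a member of the same name in another archive.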

Kindly let me know if anything can be done better or is incorrect. If not, kindly push this patch so that AIX folks can have a better debugging experience.

Have a nice day ahead.

Thanks and regards,
Aditya.

-------------------------------------------
Code:-


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
  {
    child = fork ();
    if (child > 0)
      printf ("I am child \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;
  pthread_t thread[NUM_THREADS];

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      int res;

      res = pthread_create (&thread[i], NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
  }

  return 0;
}

------------------------

Output with patch:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) set detach-on-fork off

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 258]

[New Thread 515]

[New inferior 2 (process 15728962)]

I am parent

[New inferior 3 (process 20382144)]

I am parent

^C

Thread 1.1 received signal SIGINT, Interrupt.

[Switching to Thread 1]

0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) info threads

  Id   Target Id                          Frame

* 1.1  Thread 1 (tid 34144675, running)   0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.2  Thread 258 (tid 30146951, running) thread_function (arg=0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

  1.3  Thread 515 (tid 37159321, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

  2.1  process 15728962                   0xd0594fc8 in _sigsetmask ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  3.1  process 20382144                   0xd0594fc8 in _sigsetmask ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) info sharedlibrary

From        To          Syms Read   Shared Object Library

0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)

0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)

0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)

0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)

(*): Shared library is missing debugging information.

(gdb) inferior 2

[Switching to inferior 2 [process 15728962] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]

[Switching to thread 2.1 (process 15728962)]

#0  0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)

----------------------------

Output without patch:-
Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) set detach-on-fork off

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 1]

[New Thread 258]

[New Thread 515]

[New inferior 2 (process 11731200)]

I am parent

[New inferior 3 (process 16843200)]

I am parent

^C

Thread 1.1 received signal SIGINT, Interrupt.

0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) inferior 2

[Switching to inferior 2 [process 11731200] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]

[Switching to thread 2.1 (process 11731200)]

#0  0xd0594fc8 in ?? ()

(gdb) info threads

  Id   Target Id         Frame

  1.1  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.2  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.3  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.4  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

* 2.1  process 11731200  0xd0594fc8 in ?? ()

  3.1  process 16843200  0xd0594fc8 in ?? ()

(gdb) info sharedlibrary

warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing.

warning: "/usr/lib/libcrypt.a": member "shr.o" missing.

warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing.

warning: "/usr/lib/libc.a": member "shr.o" missing.

warning: Could not load shared library symbols for 4 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o).

Use the "info sharedlibrary" command to see the complete listing.

Do you need "set solib-search-path" or "set sysroot"?

From        To          Syms Read   Shared Object Library

                        No          /usr/lib/libpthreads.a(shr_comm.o)

                        No          /usr/lib/libcrypt.a(shr.o)

                        No          /usr/lib/libpthread.a(shr_xpg5.o)

                        No          /usr/lib/libc.a(shr.o)

(gdb)

________________________________
From: Gdb-patches <gdb-patches-bounces+aditya.kamath1=ibm.com@sourceware.org> on behalf of Aditya Kamath1 via Gdb-patches <gdb-patches@sourceware.org>
Sent: 02 February 2023 11:54
To: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>; simark@simark.ca <simark@simark.ca>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: [EXTERNAL] Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Hi Tom, Ulrich and community,

Thank you for the feedback for the fix of this bug. Please find attached the patch. {See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch}.

So, I have fixed the bug and it works alright. Please find the test program, output with patch and without patch pasted below this email.

>> We now have all variables {pd_able, pd_active and pd_session} now in a
>> map of process ID and structure. This will help us make AIX GDB code
>> easy to manage them per process in the aix-thread.c file.

>I don't really know what this is about, but it's probably better to
>attach the data directly to the inferior using the registry system.
>(You can't use private_inferior as apparently that's reserved for the
>process stratum.)

>Search for registry<inferior> for some examples.

>It's better to pass in a ptid or even the aix_thread_variables object
>itself than to rely on globals in low-level functions like this.

So, I have taken care of this. Now we use the registry. Thank you for this suggestion. I did not know about it; it is a very nice feature.

>> Secondly, in the function pid_to_str () there is a beneath () call,
>> which is why I had to put this function in rs6000-aix-nat.c file.

>I wonder why it's necessary, as it seems to me that
>aix_thread_target::pid_to_str should have already handled the 'thread'
>case, so the inherited method ought to be good enough.

This I have removed. I made a mistake while analysing this solution. Thank you for pointing it out. It works without it. Kindly check the output below.

>Is there an existing gdb test case that exercises this code?
>If not then it seems like a new test is warranted.

I am not aware of one; I could not find any when I looked. What we need is a test case that checks that the shared libraries are loaded for every newly created inferior and that the top target is set correctly when threads are being debugged.
If something like that exists, I would like to know.

>> +       return object_bfd;
>> +     it++;

>This doesn't look right to me at all.  Using a global means that BFDs
>from one inferior might "leak" to another, based solely on whether a
>certain name was ever seen.  Also nothing ever cleans out the global
>vector.

>It's better to attach this data to the relevant BFD using the registry
>system, and not use a global at all.

So we already attach this data using the lines here in the same function.

  std::string fname = string_printf ("%s%s",
                                     bfd_get_filename (archive_bfd.get ()),
                                     sep);
  bfd_set_filename (object_bfd.get (), fname.c_str ());

All we need is the right match for the name of the shared library. We already have the pathname variable, so I used it and removed the vector. Kindly see it in the patch. You were right: nothing would ever have cleaned out that vector.

Kindly give me feedback if anything is incorrect or could be done better. If not, kindly push this patch so that AIX folks can have a better debugging experience.

Have a nice day ahead.

Thanks and regards,
Aditya.

-------------------------------------------
Code:-


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
  {
    child = fork ();
    if (child > 0)
      printf ("I am child \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;
  pthread_t thread[NUM_THREADS];

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      int res;

      res = pthread_create (&thread[i], NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
  }

  return 0;
}

------------------------

Output with patch:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) set detach-on-fork off

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 258]

[New Thread 515]

[New inferior 2 (process 15728962)]

I am parent

[New inferior 3 (process 20382144)]

I am parent

^C

Thread 1.1 received signal SIGINT, Interrupt.

[Switching to Thread 1]

0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) info threads

  Id   Target Id                          Frame

* 1.1  Thread 1 (tid 34144675, running)   0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.2  Thread 258 (tid 30146951, running) thread_function (arg=0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

  1.3  Thread 515 (tid 37159321, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

  2.1  process 15728962                   0xd0594fc8 in _sigsetmask ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  3.1  process 20382144                   0xd0594fc8 in _sigsetmask ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) info sharedlibrary

From        To          Syms Read   Shared Object Library

0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)

0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)

0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)

0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)

(*): Shared library is missing debugging information.

(gdb) inferior 2

[Switching to inferior 2 [process 15728962] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]

[Switching to thread 2.1 (process 15728962)]

#0  0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)

----------------------------

Output without patch:-
Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) set detach-on-fork off

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 1]

[New Thread 258]

[New Thread 515]

[New inferior 2 (process 11731200)]

I am parent

[New inferior 3 (process 16843200)]

I am parent

^C

Thread 1.1 received signal SIGINT, Interrupt.

0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) inferior 2

[Switching to inferior 2 [process 11731200] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]

[Switching to thread 2.1 (process 11731200)]

#0  0xd0594fc8 in ?? ()

(gdb) info threads

  Id   Target Id         Frame

  1.1  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.2  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.3  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.4  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

* 2.1  process 11731200  0xd0594fc8 in ?? ()

  3.1  process 16843200  0xd0594fc8 in ?? ()

(gdb) info sharedlibrary

warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing.

warning: "/usr/lib/libcrypt.a": member "shr.o" missing.

warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing.

warning: "/usr/lib/libc.a": member "shr.o" missing.

warning: Could not load shared library symbols for 4 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o).

Use the "info sharedlibrary" command to see the complete listing.

Do you need "set solib-search-path" or "set sysroot"?

From        To          Syms Read   Shared Object Library

                        No          /usr/lib/libpthreads.a(shr_comm.o)

                        No          /usr/lib/libcrypt.a(shr.o)

                        No          /usr/lib/libpthread.a(shr_xpg5.o)

                        No          /usr/lib/libc.a(shr.o)

(gdb)



________________________________
From: Gdb-patches <gdb-patches-bounces+aditya.kamath1=ibm.com@sourceware.org> on behalf of Aditya Kamath1 via Gdb-patches <gdb-patches@sourceware.org>
Sent: 27 January 2023 20:10
To: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>; simark@simark.ca <simark@simark.ca>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: [EXTERNAL] Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Hi Ulrich and community,

Thank you for the feedback for the fix of this bug. Please find attached the patch. {See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch}

So, I have fixed the bug and it works alright. Please find the test program, output with patch and without patch pasted below this email.

>+      if (s.find (member_name) != std::string::npos)
>+      {
>+       return object_bfd;
>+      }

>This matches the member name *anywhere* in the full filename,
>which could lead to spurious matches, I think.  The test
>should be more specific.

I have taken care of this in the patch.

There are a few changes that I want to explain below.

The variables pd_able, pd_active and pd_session are now kept in a map from process ID to a structure. This makes it easier for the AIX GDB code in aix-thread.c to manage them per process.

Secondly, the function pid_to_str () contains a beneath () call, which is why I had to put this function in the rs6000-aix-nat.c file.

Thirdly, if there was no object file, we previously called pd_disable () to disable thread debugging. This is incorrect now that we support multiple inferiors. Since the new object file function still relies on inferior_ptid up to a point, we must disable thread debugging only when we mourn the inferior or the process dies. Otherwise there is every chance that we disable thread debugging for the wrong inferior, namely whichever one inferior_ptid currently refers to. It would also wrongly clear pd_active for that inferior when a new inferior is born and its object file is being loaded. This change can be seen in the patch.
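
The per-process bookkeeping described above can be sketched stand-alone as follows. This is only an illustration under stated assumptions, not the code in the patch: the names thread_debug_state, per_process_state, state_for_pid and forget_pid are hypothetical, and pd_session is left out because its type comes from the AIX pthread debug library.

```cpp
#include <map>
#include <sys/types.h>

// Illustrative stand-in for the per-process thread-debugging state.
struct thread_debug_state
{
  bool pd_able = false;    // process can be debugged by the thread library
  bool pd_active = false;  // thread debugging is currently active
};

// Hypothetical container: one entry per debugged process, keyed by pid.
static std::map<pid_t, thread_debug_state> per_process_state;

// Return the state for PID, creating a default entry on first use.
static thread_debug_state &
state_for_pid (pid_t pid)
{
  return per_process_state[pid];
}

// Drop the state only when that process goes away, so deactivating one
// process never touches the flags of another.
static void
forget_pid (pid_t pid)
{
  per_process_state.erase (pid);
}
```

The point of keying on the process ID is that clearing pd_active for one process cannot affect whichever other process happens to be current at the time.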

I have written comments for the remaining changes in the patch.

Kindly give me feedback if anything is incorrect or could be done better. If not, kindly push this patch so that AIX folks can have a better debugging experience.

Have a nice day ahead.

Thanks and regards,
Aditya.

-------------------------------------------
Code:-


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
  {
    child = fork ();
    if (child > 0)
      printf ("I am child \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;
  pthread_t thread[NUM_THREADS];

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      int res;

      res = pthread_create (&thread[i], NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
  }

  return 0;
}


---------------------------------------------------
Output with patch applied:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) set detach-on-fork off

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 258]

[New Thread 515]

[New inferior 2 (Process 17498448)]

I am parent

[New inferior 3 (Process 11731454)]

I am parent

^C

Thread 1.1 received signal SIGINT, Interrupt.

[Switching to Thread 1]

0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) inferior 2

[Switching to inferior 2 [Process 17498448] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]

[Switching to thread 2.1 (Process 17498448)]

#0  0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) info threads

  Id   Target Id                          Frame

  1.1  Thread 1 (tid 25231849, running)   0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.2  Thread 258 (tid 33227061, running) thread_function (arg=0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

  1.3  Thread 515 (tid 23069149, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)


0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

* 2.1  Process 17498448                   0xd0594fc8 in _sigsetmask ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  3.1  Process 11731454                   0xd0594fc8 in _sigsetmask ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) info sharedlibrary

From        To          Syms Read   Shared Object Library

0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)

0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)

0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)

0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)

(*): Shared library is missing debugging information.


--------------------------------------------------------

Output without patch:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) set detach-on-fork off

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 1]

[New Thread 258]

[New Thread 515]

[New inferior 2 (process 11731200)]

I am parent

[New inferior 3 (process 16843200)]

I am parent

^C

Thread 1.1 received signal SIGINT, Interrupt.

0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) inferior 2

[Switching to inferior 2 [process 11731200] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]

[Switching to thread 2.1 (process 11731200)]

#0  0xd0594fc8 in ?? ()

(gdb) info threads

  Id   Target Id         Frame

  1.1  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.2  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.3  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.4  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

* 2.1  process 11731200  0xd0594fc8 in ?? ()

  3.1  process 16843200  0xd0594fc8 in ?? ()

(gdb) info sharedlibrary

warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing.

warning: "/usr/lib/libcrypt.a": member "shr.o" missing.

warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing.

warning: "/usr/lib/libc.a": member "shr.o" missing.

warning: Could not load shared library symbols for 4 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o).

Use the "info sharedlibrary" command to see the complete listing.

Do you need "set solib-search-path" or "set sysroot"?

From        To          Syms Read   Shared Object Library

                        No          /usr/lib/libpthreads.a(shr_comm.o)

                        No          /usr/lib/libcrypt.a(shr.o)

                        No          /usr/lib/libpthread.a(shr_xpg5.o)

                        No          /usr/lib/libc.a(shr.o)

(gdb)



________________________________
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Sent: 20 January 2023 20:14
To: simark@simark.ca <simark@simark.ca>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>In order to resolve this I need one piece of information: how can we iterate over the threads
>of a particular process? Is there a built-in function for that?
>Kindly let me know and that should solve this issue.

Instead of iterate_over_threads you could use the all_threads() iterator directly;
this can be specialized to only return threads of one inferior like this:

       for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
       {
            ...
       }

>Also kindly give me feedback on this patch if I need to change anything.

I think this change in solib-aix.c is not quite correct:
+      std::string s = bfd_get_filename (object_bfd.get ());
+      if (s.find (member_name) != std::string::npos)
+      {
+       return object_bfd;
+      }

This matches the member name *anywhere* in the full filename,
which could lead to spurious matches, I think.  The test
should be more specific.
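
As an illustration of what a more specific test could look like, here is a minimal stand-alone sketch. It assumes the full filename has the form "archive(member)", as in /usr/lib/libc.a(shr.o); the helper name names_archive_member is hypothetical and is not part of GDB or of the patch.

```cpp
#include <string>

// Hypothetical helper, not part of GDB: return true only when FULL_NAME
// has the form "<archive>(<member>)" with exactly MEMBER in parentheses
// and a non-empty archive part in front of it.
static bool
names_archive_member (const std::string &full_name,
                      const std::string &member)
{
  const std::string suffix = "(" + member + ")";

  // The archive part must be non-empty, so FULL_NAME must be longer.
  if (full_name.size () <= suffix.size ())
    return false;

  // Compare only the tail of FULL_NAME against "(member)".
  return full_name.compare (full_name.size () - suffix.size (),
                            suffix.size (), suffix) == 0;
}
```

Under that assumption, "shr.o" is accepted for /usr/lib/libc.a(shr.o), while a path that merely contains "shr.o" elsewhere, such as in a directory name, is rejected. A plain substring search would accept both.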

Bye,
Ulrich


[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 20464 bytes --]

From 96d5931f6eed659dc3c2f456a8713a1969d14ac4 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Thu, 2 Feb 2023 00:32:16 -0600
Subject: [PATCH] Fix Multi thread debug bug fix in AIX

 In the recent commit 98ed24fb35d89eb20179edf6c12f599c7a9e228e made by Mr. Tom there is a change in aix-thread.c file that changes

 static_cast <aix_thread_info *> in gdb to gdb::checked_static_cast <aix_thread_info *>

 AIX folks using the latest version will not be able to debug multi thread programs as a result of it

The error in AIX is as follows:-

internal-error checked_static_cast Assertion result != nullptr failed.

The reason being AIX shared library were not being loaded for a new inferior and top target was not set properly.

This patch is a fix for the same.
---
 gdb/aix-thread.c | 275 +++++++++++++++++++++++++++++++++--------------
 gdb/solib-aix.c  |  10 ++
 2 files changed, 205 insertions(+), 80 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..bfc7f901de0 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -70,7 +70,7 @@ static bool debug_aix_thread;
 
 /* Return whether to treat PID as a debuggable thread id.  */
 
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
+#define PD_TID(ptid, data)	(data->pd_active && ptid.tid () != 0)
 
 /* Success and failure values returned by pthdb callbacks.  */
 
@@ -149,14 +149,6 @@ static aix_thread_target aix_thread_ops;
 
 static CORE_ADDR pd_brk_addr;
 
-/* Whether the current application is debuggable by pthdb.  */
-
-static int pd_able = 0;
-
-/* Whether a threaded application is being debugged.  */
-
-static int pd_active = 0;
-
 /* Whether the current architecture is 64-bit.  
    Only valid when pd_able is true.  */
 
@@ -191,9 +183,60 @@ static pthdb_callbacks_t pd_callbacks = {
   NULL
 };
 
-/* Current pthdb session.  */
+/* Aix variable structure.  */
+struct aix_thread_variables 
+{
+  /* Whether the current application is debuggable by pthdb.  */
+  int pd_able;
+
+  /* Whether a threaded application is being debugged.  */
+  int pd_active;
+
+  /* Current pthdb session.  */
+  pthdb_session_t pd_session;
+};
+
+/* Key to our per-inferior data.  */
+static const registry<inferior>::key<aix_thread_variables>
+  aix_thread_variables_handle;
+
+/* Function to Get aix_thread_variables data.  */
+static struct aix_thread_variables*
+get_aix_thread_variables_data (struct inferior *inf)
+{
+  if (inf == NULL)
+    return NULL;
+
+  struct aix_thread_variables* data;
+
+  data = aix_thread_variables_handle.get (inf);
+  if (data == NULL)
+    data = aix_thread_variables_handle.emplace (inf);
 
-static pthdb_session_t pd_session;
+  return data;
+}
+
+/* Helper to get data for ptid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_ptid (ptid_t ptid)
+{
+  struct aix_thread_variables *data;
+  inferior *inf = find_inferior_ptid (current_inferior ()->process_target (),
+					ptid);
+  return get_aix_thread_variables_data (inf);
+}
+
+/* Helper to get data for pid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_pid (pid_t pid)
+{
+  struct aix_thread_variables *data;
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (),
+                                        pid);
+  return get_aix_thread_variables_data (inf);
+}
 
 /* Return a printable representation of pthdebug function return
    STATUS.  */
@@ -316,9 +359,11 @@ static void
 pid_to_prc (ptid_t *ptidp)
 {
   ptid_t ptid;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (*ptidp);
 
   ptid = *ptidp;
-  if (PD_TID (ptid))
+  if (PD_TID (ptid, data))
     *ptidp = ptid_t (ptid.pid ());
 }
 
@@ -508,14 +553,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
     if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+      inferior_ptid = ptid_t (user_current_pid);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -639,36 +683,32 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* Callback for counting GDB threads for process pid.  */
 
 static int
-giter_count (struct thread_info *thread, void *countp)
+giter_count (pid_t pid)
 {
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
-  return 0;
+  int gcount = 0;
+  process_stratum_target *proc_target
+    = current_inferior ()->process_target ();
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    gcount++;
+  return gcount;
 }
 
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
+/* Callback for accumulating GDB thread pids.  */
 
 static int
-giter_accum (struct thread_info *thread, void *bufp)
+giter_accum (void *bufp, pid_t pid)
 {
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
+  process_stratum_target *proc_target
+    = current_inferior ()->process_target ();
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+  {
+    **(struct thread_info ***) bufp = tp;
+    (*(struct thread_info ***) bufp)++;
+  }
+
   return 0;
 }
 
@@ -719,7 +759,10 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      /* We also need to keep in mind Trap and interrupt or any
+         signal that needs to be handled in pd_update ().  */
+
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -750,6 +793,11 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -759,11 +807,11 @@ sync_threadlists (int pid)
 
   for (cmd = PTHDB_LIST_FIRST;; cmd = PTHDB_LIST_NEXT)
     {
-      status = pthdb_pthread (pd_session, &pdtid, cmd);
+      status = pthdb_pthread (data->pd_session, &pdtid, cmd);
       if (status != PTHDB_SUCCESS || pdtid == PTHDB_INVALID_PTHREAD)
 	break;
 
-      status = pthdb_pthread_ptid (pd_session, pdtid, &pthid);
+      status = pthdb_pthread_ptid (data->pd_session, pdtid, &pthid);
       if (status != PTHDB_SUCCESS || pthid == PTHDB_INVALID_PTID)
 	continue;
 
@@ -780,7 +828,7 @@ sync_threadlists (int pid)
 
   for (pi = 0; pi < pcount; pi++)
     {
-      status = pthdb_pthread_tid (pd_session, pbuf[pi].pdtid, &tid);
+      status = pthdb_pthread_tid (data->pd_session, pbuf[pi].pdtid, &tid);
       if (status != PTHDB_SUCCESS)
 	tid = PTHDB_INVALID_TID;
       pbuf[pi].tid = tid;
@@ -790,10 +838,11 @@ sync_threadlists (int pid)
 
   /* Accumulate an array of GDB threads sorted by pid.  */
 
-  gcount = 0;
-  iterate_over_threads (giter_count, &gcount);
+  /* gcount is GDB thread count and pcount is pthreadlib thread count.  */
+
+  gcount = giter_count (pid);
   g = gbuf = XNEWVEC (struct thread_info *, gcount);
-  iterate_over_threads (giter_accum, &g);
+  giter_accum (&g, pid);
   qsort (gbuf, gcount, sizeof *gbuf, gcmp);
 
   /* Apply differences between the two arrays to GDB's thread list.  */
@@ -810,8 +859,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -830,6 +877,23 @@ sync_threadlists (int pid)
 
 	  cmp_result = ptid_cmp (pptid, gptid);
 
+	  /* If there is only one thread then we need not make the main 
+	     thread look like a thread.  It can stay as a process. This
+	     is useful when we have multiple inferiors, but only one is
+	     threaded.  So we need not make the other inferiors with only
+	     main thread, look like a threaded one.  For example, Thread
+	     1.1, 1.2, 2.1, 3.1 exists then it is useful to skip this for
+	     loop for 2.1 and 3.1 leaving them as main process thread with
+	     a dummy priv set.  */
+
+	  if (pcount == 1 && gcount == 1)
+	  {
+	    aix_thread_info *priv = new aix_thread_info;
+	    tp = find_thread_ptid (proc_target, gptid);
+	    tp->priv.reset (priv);
+	    break;
+	  }
+
 	  if (cmp_result == 0)
 	    {
 	      aix_thread_info *priv = get_aix_thread_info (gbuf[gi]);
@@ -841,8 +905,25 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      /* This is to make the main process thread now look
+		 like a thread.  */
+
+	      if (gptid.is_pid () && gptid.pid () == pptid.pid ())
+	      {
+		thread_change_ptid (proc_target, gptid, pptid);
+		aix_thread_info *priv = new aix_thread_info;
+		priv->pdtid = pbuf[pi].pdtid;
+		priv->tid = pbuf[pi].tid;
+		tp = find_thread_ptid (proc_target, pptid);
+		tp->priv.reset (priv);
+		pi++;
+		gi++;
+	      }
+	      else
+	      {
+		delete_thread (gbuf[gi]);
+		gi++;
+	      }
 	    }
 	  else
 	    {
@@ -888,10 +969,13 @@ pd_update (int pid)
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
 
-  if (!pd_active)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
+
+  if (!data->pd_active)
     return ptid_t (pid);
 
-  status = pthdb_session_update (pd_session);
+  status = pthdb_session_update (data->pd_session);
   if (status != PTHDB_SUCCESS)
     return ptid_t (pid);
 
@@ -918,15 +1002,17 @@ static ptid_t
 pd_activate (int pid)
 {
   int status;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
 		
   status = pthdb_session_init (pid, arch64 ? PEM_64BIT : PEM_32BIT,
 			       PTHDB_FLAG_REGS, &pd_callbacks, 
-			       &pd_session);
+			       &data->pd_session);
   if (status != PTHDB_SUCCESS)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
+  data->pd_active = 1;
   return pd_update (pid);
 }
 
@@ -935,12 +1021,15 @@ pd_activate (int pid)
 static void
 pd_deactivate (void)
 {
-  if (!pd_active)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!data->pd_active)
     return;
-  pthdb_session_destroy (pd_session);
+  pthdb_session_destroy (data->pd_session);
   
   pid_to_prc (&inferior_ptid);
-  pd_active = 0;
+  data->pd_active = 0;
 }
 
 /* An object file has just been loaded.  Check whether the current
@@ -953,8 +1042,14 @@ pd_enable (void)
   char *stub_name;
   struct bound_minimal_symbol ms;
 
+  if (!inferior_ptid.pid ())
+    return;
+  
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (data->pd_able)
     return;
 
   /* Check application word size.  */
@@ -978,7 +1073,7 @@ pd_enable (void)
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  data->pd_able = 1; 
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
@@ -991,28 +1086,27 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!data->pd_able)
     return;
-  if (pd_active)
+  if (data->pd_active)
     pd_deactivate ();
-  pd_able = 0;
+  data->pd_able = 0;
   current_inferior ()->unpush_target (&aix_thread_ops);
 }
 
 /* new_objfile observer callback.
 
    If OBJFILE is non-null, check whether a threaded application is
-   being debugged, and if so, prepare for thread debugging.
-
-   If OBJFILE is null, stop debugging threads.  */
+   being debugged, and if so, prepare for thread debugging.  */
 
 static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
     pd_enable ();
-  else
-    pd_disable ();
 }
 
 /* Attach to process specified by ARGS.  */
@@ -1042,8 +1136,10 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 {
   struct thread_info *thread;
   pthdb_tid_t tid[2];
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (ptid);
 
-  if (!PD_TID (ptid))
+  if (!PD_TID (ptid, data))
     {
       scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
       
@@ -1095,8 +1191,11 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
      pid-only ptids.  */
   gdb_assert (ptid.is_pid ());
 
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (ptid);
+
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
@@ -1229,11 +1328,13 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
   ppc_gdbarch_tdep *tdep = gdbarch_tdep<ppc_gdbarch_tdep> (gdbarch);
   int status, i;
   pthdb_context_t ctx;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
 		"fetch_regs_user_thread %lx\n", (long) pdtid);
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: fetch_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1362,8 +1463,10 @@ aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
 {
   struct thread_info *thread;
   pthdb_tid_t tid;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
-  if (!PD_TID (regcache->ptid ()))
+  if (!PD_TID (regcache->ptid (), data))
     beneath ()->fetch_registers (regcache, regno);
   else
     {
@@ -1511,6 +1614,8 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   pthdb_context_t ctx;
   uint32_t int32;
   uint64_t int64;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1518,7 +1623,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Retrieve the thread's current context for its non-register
      values.  */
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1576,7 +1681,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 	ctx.fpscr = tmp_fpscr;
     }
 
-  status = pthdb_pthread_setcontext (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_setcontext (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: "
 	     "pthdb_pthread_setcontext returned %s"),
@@ -1702,8 +1807,10 @@ aix_thread_target::store_registers (struct regcache *regcache, int regno)
 {
   struct thread_info *thread;
   pthdb_tid_t tid;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
-  if (!PD_TID (regcache->ptid ()))
+  if (!PD_TID (regcache->ptid (), data))
     beneath ()->store_registers (regcache, regno);
   else
     {
@@ -1741,7 +1848,7 @@ aix_thread_target::mourn_inferior ()
 {
   target_ops *beneath = this->beneath ();
 
-  pd_deactivate ();
+  pd_disable ();
   beneath->mourn_inferior ();
 }
 
@@ -1750,7 +1857,10 @@ aix_thread_target::mourn_inferior ()
 bool
 aix_thread_target::thread_alive (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!PD_TID (ptid, data))
     return beneath ()->thread_alive (ptid);
 
   /* We update the thread list every time the child stops, so all
@@ -1766,7 +1876,10 @@ aix_thread_target::thread_alive (ptid_t ptid)
 std::string
 aix_thread_target::pid_to_str (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (ptid);
+
+  if (!PD_TID (ptid, data))
     return beneath ()->pid_to_str (ptid);
 
   return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
@@ -1786,8 +1899,10 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
   pthdb_detachstate_t detachstate;
   int cancelpend;
   static char *ret = NULL;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (thread->ptid);
 
-  if (!PD_TID (thread->ptid))
+  if (!PD_TID (thread->ptid, data))
     return NULL;
 
   string_file buf;
@@ -1800,24 +1915,24 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
     /* i18n: Like "thread-identifier %d, [state] running, suspended" */
     buf.printf (_("tid %d"), (int)tid);
 
-  status = pthdb_pthread_state (pd_session, pdtid, &state);
+  status = pthdb_pthread_state (data->pd_session, pdtid, &state);
   if (status != PTHDB_SUCCESS)
     state = PST_NOTSUP;
   buf.printf (", %s", state2str (state));
 
-  status = pthdb_pthread_suspendstate (pd_session, pdtid, 
+  status = pthdb_pthread_suspendstate (data->pd_session, pdtid, 
 				       &suspendstate);
   if (status == PTHDB_SUCCESS && suspendstate == PSS_SUSPENDED)
     /* i18n: Like "Thread-Id %d, [state] running, suspended" */
     buf.printf (_(", suspended"));
 
-  status = pthdb_pthread_detachstate (pd_session, pdtid, 
+  status = pthdb_pthread_detachstate (data->pd_session, pdtid, 
 				      &detachstate);
   if (status == PTHDB_SUCCESS && detachstate == PDS_DETACHED)
     /* i18n: Like "Thread-Id %d, [state] running, detached" */
     buf.printf (_(", detached"));
 
-  pthdb_pthread_cancelpend (pd_session, pdtid, &cancelpend);
+  pthdb_pthread_cancelpend (data->pd_session, pdtid, &cancelpend);
   if (status == PTHDB_SUCCESS && cancelpend)
     /* i18n: Like "Thread-Id %d, [state] running, cancel pending" */
     buf.printf (_(", cancel pending"));
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index f483f54de13..09d033ef473 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -565,6 +565,7 @@ solib_aix_bfd_open (const char *pathname)
   const char *sep;
   int filename_len;
   int found_file;
+  std::string string_path = pathname;
 
   if (pathname[path_len - 1] != ')')
     return solib_bfd_open (pathname);
@@ -618,6 +619,15 @@ solib_aix_bfd_open (const char *pathname)
       if (member_name == bfd_get_filename (object_bfd.get ()))
 	break;
 
+      std::string s = bfd_get_filename (object_bfd.get ());
+
+      /* For every inferior after first int bfd system we 
+	 will have the pathname instead of the member name
+	 registered. Hence the below condition exists.  */
+
+      if (string_path.compare (s) == 0)
+	return object_bfd;
+
       object_bfd = gdb_bfd_openr_next_archived_file (archive_bfd.get (),
 						     object_bfd.get ());
     }
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-02  6:35                                                         ` Aditya Kamath1
@ 2023-02-02 17:43                                                           ` Ulrich Weigand
  2023-02-03 11:10                                                             ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2023-02-02 17:43 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>>Is there an existing gdb test case that exercises this code?
>>If not then it seems like a new test is warranted. 
>
>This I am not aware of at least when I tried finding. 

I think the question here is simply whether, if you run the
test suite both without and with your patch, are any of the
FAILs fixed with the patch?   If not, it would be good to
create a new test that fails without the patch and succeeds
with it, and add that to the test suite.


I think we're getting pretty close now, but I do still have
some additional comments on the latest patch:

> /* Return whether to treat PID as a debuggable thread id.  */
> 
>-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
>+#define PD_TID(ptid, data)	(data->pd_active && ptid.tid () != 0)

I'm not sure why the pd_active test is needed here
at all - if ptid.tid () != 0, thread debugging *must*
be active, otherwise you'd never have installed a ptid
with the tid field set, right?

If that check can indeed be omitted, that would simplify
the patch a bit since you wouldn't need to provide the
"data" struct in quite as many places.

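A minimal sketch of the simplified check, assuming a reduced stand-in for GDB's ptid_t. The names `mini_ptid` and `is_user_thread` are hypothetical and exist only for illustration; the point is that the predicate needs nothing but the ptid itself, so no per-inferior "data" pointer has to be passed around:

```cpp
#include <cassert>

// Hypothetical stand-in for GDB's ptid_t, reduced to its three fields.
struct mini_ptid
{
  int pid;
  long lwp;
  unsigned long tid;

  // A pid-only ptid has neither an lwp nor a tid component.
  bool is_pid () const { return lwp == 0 && tid == 0; }
};

// A tid is only ever installed once thread debugging is active,
// so a non-zero tid is sufficient to identify a user thread.
static bool
is_user_thread (const mini_ptid &ptid)
{
  return ptid.tid != 0;
}
```

With this shape, a ptid such as {pid, 0, 258} is treated as a thread, while a pid-only ptid falls through to the target beneath.
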
>/* Address of the function that libpthread will call when libpthdebug
>   is ready to be initialized.  */
>
> static CORE_ADDR pd_brk_addr;

I believe this needs to go into the aix_thread_variables struct;
the address might be different if the pthread library is loaded
at different addresses into different inferiors.

> /* Whether the current architecture is 64-bit.  
>    Only valid when pd_able is true.  */
>
>static int arch64;

Likewise - some inferiors may be 64-bit and others 32-bit.

In general, *any* static variable in this file is suspect.
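
To illustrate why file-level statics are a problem with several inferiors, here is a minimal sketch of per-inferior state. It assumes a plain std::unordered_map keyed by pid; the names `thread_vars` and `vars_for_pid` are hypothetical, and GDB itself would more likely use its own per-inferior registry rather than a map like this:

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical per-inferior state; the field names mirror the
// former file-level statics discussed above.
struct thread_vars
{
  bool pd_able = false;
  bool pd_active = false;
  std::uint64_t pd_brk_addr = 0;
  bool arch64 = false;
};

// Hypothetical storage keyed by process id (an assumption made
// only to keep this sketch self-contained).
static std::unordered_map<int, thread_vars> per_inferior;

// Return the state for PID, creating a default entry on first use.
static thread_vars &
vars_for_pid (int pid)
{
  return per_inferior[pid];
}
```

Setting arch64 or pd_brk_addr for one pid then leaves every other inferior's copy untouched, which is the property a single static variable cannot provide.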


>+/* Callback for counting GDB threads for process pid.  */
> 
> static int
>-giter_count (struct thread_info *thread, void *countp)
>+giter_count (pid_t pid)

This only was a callback because it was called via
iterate_over_threads.  Now that you're using the
all_threads C++ iterator, I think both of those
routines should just be inlined into its caller.
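
A rough sketch of what the inlined form could look like, assuming a simplified thread record and a plain vector in place of GDB's thread list. The names `mini_thread` and `threads_of_pid` are hypothetical; a single range-based loop does both the counting and the accumulation that previously needed two callbacks:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified thread record.
struct mini_thread
{
  int pid;
  unsigned long tid;
};

// Collect the threads belonging to PID in one pass, then sort them
// by tid.  The size of the result replaces the separate count.
static std::vector<const mini_thread *>
threads_of_pid (const std::vector<mini_thread> &all, int pid)
{
  std::vector<const mini_thread *> out;

  for (const mini_thread &t : all)
    if (t.pid == pid)
      out.push_back (&t);

  std::sort (out.begin (), out.end (),
	     [] (const mini_thread *a, const mini_thread *b)
	     { return a->tid < b->tid; });
  return out;
}
```

The caller no longer needs a counting helper at all, since the vector carries its own length.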


>@@ -565,6 +565,7 @@ solib_aix_bfd_open (const char *pathname)
>   const char *sep;
>   int filename_len;
>   int found_file;
>+  std::string string_path = pathname;
> 
>   if (pathname[path_len - 1] != ')')
>     return solib_bfd_open (pathname);
>@@ -618,6 +619,15 @@ solib_aix_bfd_open (const char *pathname)
>       if (member_name == bfd_get_filename (object_bfd.get ()))
> 	break;
> 
>+      std::string s = bfd_get_filename (object_bfd.get ());
>+
>+      /* For every inferior after first int bfd system we 
>+	 will have the pathname instead of the member name
>+	 registered. Hence the below condition exists.  */
>+
>+      if (string_path.compare (s) == 0)
>+	return object_bfd;

That's still not quite right, as the pathname component
might have been changed here:
  /* Calling solib_find makes certain that sysroot path is set properly
     if program has a dependency on .a archive and sysroot is set via
     set sysroot command.  */
  gdb::unique_xmalloc_ptr<char> found_pathname
    = solib_find (filename.c_str (), &found_file);

I think a simple but correct way to check whether the BFD filename
already contains a member name is to check for the presence of an
opening parenthesis e.g. via:
    strrchr (bfd_get_filename (...), '(');
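
Spelled out as a small helper, that check could look like the sketch below. The function name `has_member_name` is hypothetical, and the argument stands in for whatever bfd_get_filename returns; only the standard strrchr call is relied upon:

```cpp
#include <cassert>
#include <cstring>

// Return true if FILENAME already carries an archive member name,
// i.e. it has the form "path/to/archive.a(member.o)".  Only the
// presence of an opening parenthesis is tested, so a changed
// directory prefix (for example via sysroot) does not matter.
static bool
has_member_name (const char *filename)
{
  return std::strrchr (filename, '(') != nullptr;
}
```

For example, "/usr/lib/libc.a(shr.o)" is recognized as already containing a member name, while "/usr/lib/libc.a" is not.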


Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-02 17:43                                                           ` Ulrich Weigand
@ 2023-02-03 11:10                                                             ` Aditya Kamath1
  2023-02-06 19:07                                                               ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2023-02-03 11:10 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 12099 bytes --]

Hi Ulrich, Tom and community,

Thank you for the feedback on the fix for this bug. Please find the patch attached. {See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch}.

I have fixed the bug and the patch works as expected. The test program and the output with and without the patch are pasted below this email.


>I'm not sure why the pd_active test is needed here
>at all - if ptid.tid () != 0, thread debugging *must*
>be active, otherwise you'd never have installed a ptid
>with the tid field set, right?

Maybe it was needed in previous versions, which added the main process as an extra thread. In this version of aix-thread.c I have found no place where pd_active is disabled while a tid > 0 exists. To have a tid at all we need pd_active enabled, which happens in pd_activate (). Otherwise, execution never reaches sync_threadlists () to set the tid. So I have removed the check as you suggested.

>> static CORE_ADDR pd_brk_addr;

>I believe this needs to go into the aix_thread_variables struct;

This is done. Kindly check in the patch.

>This only was a callback because it was called via
>iterate_over_threads.  Now that you're using the
>all_threads C++ iterator, I think both of those

This is done. Kindly check in the patch.

>I think a simple but correct way to check whether the BFD filename
>already contains a member name is to check for the presence of an
>opening parenthesis e.g. via:
>   strrchr (bfd_get_filename (...), '(');

This is also done.

>I think the question here is simply whether, if you run the
>test suite both without and with your patch, are any of the
>FAILs fixed with the patch?   If not, it would be good to
>create a new test that fails without the patch and succeeds
>with it, and add that to the test suite.

This is new to me. We will add the test as a follow-up in this same thread after this patch. I need one piece of information: which test suite should it go in, gdb.threads or gdb.base? Also, kindly point me to a simple existing test case that I can study; any simple hello-world program will do. I want to understand how the .exp file is written and how it decides whether a test passes or fails.

Kindly give me feedback on this patch in case anything is incorrect or can be done better. If not, kindly push this patch so that AIX folks can have a better debugging experience.

Have a nice day ahead.

Thanks and regards,
Aditya.

-------------------------------------------
Code:-


#include <stdio.h>

#include <unistd.h>

#include <stdlib.h>

#include <pthread.h>

#include <assert.h>


pthread_barrier_t barrier;


#define NUM_THREADS 2


void *

thread_function (void *arg)

{

  /* This ensures that the breakpoint is only hit after both threads

     are created, so the test can always switch to the non-event

     thread when the breakpoint triggers.  */


  pthread_barrier_wait (&barrier);

  pid_t child;


  child = fork ();

  if (child > 0)

    printf ("I am parent \n");

  else

  {

    child = fork ();

    if (child > 0)

      printf ("I am child \n");

    else

      printf ("I am grandchild \n");

  }

  while (1); /* break here */

}


int

main (void)

{

  int i;

  pthread_t thread[NUM_THREADS];


  alarm (300);


  pthread_barrier_init (&barrier, NULL, NUM_THREADS);


  for (i = 0; i < NUM_THREADS; i++)

    {

      int res;


      res = pthread_create (&thread[i], NULL,

                            thread_function, NULL);

      assert (res == 0);

    }


  while (1)

  {

    sleep (15);

  }


  return 0;

}

-------------------------------------------------

Output with patch:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) set detach-on-fork off

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 258]

[New Thread 515]

[New inferior 2 (process 18546974)]

[New inferior 3 (process 9634234)]

I am parent

I am parent

^C

Thread 1.1 received signal SIGINT, Interrupt.

[Switching to Thread 1]

0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) info sharedlibrary

From        To          Syms Read   Shared Object Library

0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)

0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)

0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)

0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)

(*): Shared library is missing debugging information.

(gdb) info threads

  Id   Target Id                          Frame

* 1.1  Thread 1 (tid 27263453, running)   0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.2  Thread 258 (tid 29819289, running) thread_function (arg=0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

  1.3  Thread 515 (tid 25297199, running) thread_function (arg=0x0)

    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32

  2.1  process 18546974                   0xd0594fc8 in _sigsetmask ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  3.1  process 9634234                    0xd0594fc8 in _sigsetmask ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) inferior 2

[Switching to inferior 2 [process 18546974] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]

[Switching to thread 2.1 (process 18546974)]

#0  0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) info sharedlibrary

From        To          Syms Read   Shared Object Library

0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)

0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)

0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)

0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)

(*): Shared library is missing debugging information.

-----------------------------------------------------------------------
Output without patch:-
Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

(gdb) set detach-on-fork off

(gdb) r

Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork

[New Thread 1]

[New Thread 258]

[New Thread 515]

[New inferior 2 (process 11731200)]

I am parent

[New inferior 3 (process 16843200)]

I am parent

^C

Thread 1.1 received signal SIGINT, Interrupt.

0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)

(gdb) inferior 2

[Switching to inferior 2 [process 11731200] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]

[Switching to thread 2.1 (process 11731200)]

#0  0xd0594fc8 in ?? ()

(gdb) info threads

  Id   Target Id         Frame

  1.1  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.2  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.3  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

  1.4  process 15270316  0xd0595fb0 in _p_nsleep ()

   from /usr/lib/libpthread.a(shr_xpg5.o)

* 2.1  process 11731200  0xd0594fc8 in ?? ()

  3.1  process 16843200  0xd0594fc8 in ?? ()

(gdb) info sharedlibrary

warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing.

warning: "/usr/lib/libcrypt.a": member "shr.o" missing.

warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing.

warning: "/usr/lib/libc.a": member "shr.o" missing.

warning: Could not load shared library symbols for 4 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o).

Use the "info sharedlibrary" command to see the complete listing.

Do you need "set solib-search-path" or "set sysroot"?

From        To          Syms Read   Shared Object Library

                        No          /usr/lib/libpthreads.a(shr_comm.o)

                        No          /usr/lib/libcrypt.a(shr.o)

                        No          /usr/lib/libpthread.a(shr_xpg5.o)

                        No          /usr/lib/libc.a(shr.o)
(gdb)



[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 26579 bytes --]

From 96613507fbf962eb90940c9763a60d2b50f6a46d Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Fri, 3 Feb 2023 04:24:24 -0600
Subject: [PATCH] Fix multi-thread debug bug in AIX

Commit 98ed24fb35d89eb20179edf6c12f599c7a9e228e, made by Tom, changed

static_cast <aix_thread_info *> in aix-thread.c to gdb::checked_static_cast <aix_thread_info *>

As a result, AIX users on the latest GDB cannot debug multi-threaded programs.

The error on AIX is as follows:

internal-error: checked_static_cast: Assertion `result != nullptr' failed.

The cause is that AIX shared libraries were not being loaded for a new inferior and the top target was not set properly.

This patch fixes both problems.
---
 gdb/aix-thread.c | 333 +++++++++++++++++++++++++++++------------------
 gdb/solib-aix.c  |  10 ++
 2 files changed, 219 insertions(+), 124 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..23e1aa6e90c 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -68,10 +68,6 @@ static bool debug_aix_thread;
 #define pthdb_tid_t	tid_t
 #endif
 
-/* Return whether to treat PID as a debuggable thread id.  */
-
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
-
 /* Success and failure values returned by pthdb callbacks.  */
 
 #define PDC_SUCCESS	PTHDB_SUCCESS
@@ -144,24 +140,6 @@ class aix_thread_target final : public target_ops
 
 static aix_thread_target aix_thread_ops;
 
-/* Address of the function that libpthread will call when libpthdebug
-   is ready to be initialized.  */
-
-static CORE_ADDR pd_brk_addr;
-
-/* Whether the current application is debuggable by pthdb.  */
-
-static int pd_able = 0;
-
-/* Whether a threaded application is being debugged.  */
-
-static int pd_active = 0;
-
-/* Whether the current architecture is 64-bit.  
-   Only valid when pd_able is true.  */
-
-static int arch64;
-
 /* Forward declarations for pthdb callbacks.  */
 
 static int pdc_symbol_addrs (pthdb_user_t, pthdb_symbol_t *, int);
@@ -191,9 +169,66 @@ static pthdb_callbacks_t pd_callbacks = {
   NULL
 };
 
-/* Current pthdb session.  */
+/* Aix variable structure.  */
+struct aix_thread_variables 
+{
+  /* Whether the current application is debuggable by pthdb.  */
+  int pd_able;
+
+  /* Whether a threaded application is being debugged.  */
+  int pd_active;
+
+  /* Current pthdb session.  */
+  pthdb_session_t pd_session;
+
+  /* Address of the function that libpthread will call when libpthdebug
+   is ready to be initialized.  */
+  CORE_ADDR pd_brk_addr;
+
+  /* Whether the current architecture is 64-bit.
+   Only valid when pd_able is true.  */
+  int arch64;
+};
 
-static pthdb_session_t pd_session;
+/* Key to our per-inferior data.  */
+static const registry<inferior>::key<aix_thread_variables>
+  aix_thread_variables_handle;
+
+/* Function to Get aix_thread_variables data.  */
+static struct aix_thread_variables*
+get_aix_thread_variables_data (struct inferior *inf)
+{
+  if (inf == NULL)
+    return NULL;
+
+  struct aix_thread_variables* data;
+
+  data = aix_thread_variables_handle.get (inf);
+  if (data == NULL)
+    data = aix_thread_variables_handle.emplace (inf);
+
+  return data;
+}
+
+/* Helper to get data for ptid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_ptid (ptid_t ptid)
+{
+  inferior *inf = find_inferior_ptid (current_inferior ()->process_target (),
+					ptid);
+  return get_aix_thread_variables_data (inf);
+}
+
+/* Helper to get data for pid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_pid (pid_t pid)
+{
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (),
+                                        pid);
+  return get_aix_thread_variables_data (inf);
+}
 
 /* Return a printable representation of pthdebug function return
    STATUS.  */
@@ -318,7 +353,7 @@ pid_to_prc (ptid_t *ptidp)
   ptid_t ptid;
 
   ptid = *ptidp;
-  if (PD_TID (ptid))
+  if (ptid.tid () != 0)
     *ptidp = ptid_t (ptid.pid ());
 }
 
@@ -389,6 +424,9 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_pid);
   
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_read_regs tid=%d flags=%s\n",
@@ -397,7 +435,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -423,7 +461,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -456,6 +494,10 @@ pdc_write_regs (pthdb_user_t user_current_pid,
      this is needed, I have implemented what I think it should do,
      however this code is untested.  */
 
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_pid);
+
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_write_regs tid=%d flags=%s\n",
 		(int) tid, hex_string (flags));
@@ -463,7 +505,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_WRITE_GPRS, tid, 
 		     (unsigned long) context->gpr, 0, NULL);
       else
@@ -479,7 +521,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  ptrace64aix (PTT_WRITE_SPRS, tid, 
 		       (unsigned long) &context->msr, 0, NULL);
@@ -508,14 +550,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
     if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+      inferior_ptid = ptid_t (user_current_pid);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -639,39 +680,6 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_count (struct thread_info *thread, void *countp)
-{
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
-  return 0;
-}
-
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_accum (struct thread_info *thread, void *bufp)
-{
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
-  return 0;
-}
-
 /* ptid comparison function */
 
 static int
@@ -719,7 +727,10 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      /* We also need to keep in mind Trap and interrupt or any
+         signal that needs to be handled in pd_update ().  */
+
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -750,6 +761,11 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -759,11 +775,11 @@ sync_threadlists (int pid)
 
   for (cmd = PTHDB_LIST_FIRST;; cmd = PTHDB_LIST_NEXT)
     {
-      status = pthdb_pthread (pd_session, &pdtid, cmd);
+      status = pthdb_pthread (data->pd_session, &pdtid, cmd);
       if (status != PTHDB_SUCCESS || pdtid == PTHDB_INVALID_PTHREAD)
 	break;
 
-      status = pthdb_pthread_ptid (pd_session, pdtid, &pthid);
+      status = pthdb_pthread_ptid (data->pd_session, pdtid, &pthid);
       if (status != PTHDB_SUCCESS || pthid == PTHDB_INVALID_PTID)
 	continue;
 
@@ -780,7 +796,7 @@ sync_threadlists (int pid)
 
   for (pi = 0; pi < pcount; pi++)
     {
-      status = pthdb_pthread_tid (pd_session, pbuf[pi].pdtid, &tid);
+      status = pthdb_pthread_tid (data->pd_session, pbuf[pi].pdtid, &tid);
       if (status != PTHDB_SUCCESS)
 	tid = PTHDB_INVALID_TID;
       pbuf[pi].tid = tid;
@@ -790,10 +806,17 @@ sync_threadlists (int pid)
 
   /* Accumulate an array of GDB threads sorted by pid.  */
 
+  /* gcount is GDB thread count and pcount is pthreadlib thread count.  */
+
   gcount = 0;
-  iterate_over_threads (giter_count, &gcount);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    gcount++;
   g = gbuf = XNEWVEC (struct thread_info *, gcount);
-  iterate_over_threads (giter_accum, &g);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+  {
+    **(struct thread_info ***) &g = tp;
+    (*(struct thread_info ***) &g)++;
+  }
   qsort (gbuf, gcount, sizeof *gbuf, gcmp);
 
   /* Apply differences between the two arrays to GDB's thread list.  */
@@ -810,8 +833,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -830,6 +851,23 @@ sync_threadlists (int pid)
 
 	  cmp_result = ptid_cmp (pptid, gptid);
 
+	  /* If there is only one thread then we need not make the main 
+	     thread look like a thread.  It can stay as a process. This
+	     is useful when we have multiple inferiors, but only one is
+	     threaded.  So we need not make the other inferiors with only
+	     main thread, look like a threaded one.  For example, Thread
+	     1.1, 1.2, 2.1, 3.1 exists then it is useful to skip this for
+	     loop for 2.1 and 3.1 leaving them as main process thread with
+	     a dummy priv set.  */
+
+	  if (pcount == 1 && gcount == 1)
+	  {
+	    aix_thread_info *priv = new aix_thread_info;
+	    tp = find_thread_ptid (proc_target, gptid);
+	    tp->priv.reset (priv);
+	    break;
+	  }
+
 	  if (cmp_result == 0)
 	    {
 	      aix_thread_info *priv = get_aix_thread_info (gbuf[gi]);
@@ -841,13 +879,28 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      /* This is to make the main process thread now look
+		 like a thread.  */
+
+	      if (gptid.is_pid () && gptid.pid () == pptid.pid ())
+	      {
+		thread_change_ptid (proc_target, gptid, pptid);
+		aix_thread_info *priv = new aix_thread_info;
+		priv->pdtid = pbuf[pi].pdtid;
+		priv->tid = pbuf[pi].tid;
+		tp = find_thread_ptid (proc_target, pptid);
+		tp->priv.reset (priv);
+		pi++;
+		gi++;
+	      }
+	      else
+	      {
+		delete_thread (gbuf[gi]);
+		gi++;
+	      }
 	    }
 	  else
 	    {
-	      process_stratum_target *proc_target
-		= current_inferior ()->process_target ();
 	      thread = add_thread (proc_target, pptid);
 
 	      aix_thread_info *priv = new aix_thread_info;
@@ -887,11 +940,14 @@ pd_update (int pid)
   ptid_t ptid;
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
+  struct aix_thread_variables *data;
 
-  if (!pd_active)
+  data = get_thread_data_helper_for_pid (pid);
+
+  if (!data->pd_active)
     return ptid_t (pid);
 
-  status = pthdb_session_update (pd_session);
+  status = pthdb_session_update (data->pd_session);
   if (status != PTHDB_SUCCESS)
     return ptid_t (pid);
 
@@ -918,15 +974,17 @@ static ptid_t
 pd_activate (int pid)
 {
   int status;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
 		
-  status = pthdb_session_init (pid, arch64 ? PEM_64BIT : PEM_32BIT,
+  status = pthdb_session_init (pid, data->arch64 ? PEM_64BIT : PEM_32BIT,
 			       PTHDB_FLAG_REGS, &pd_callbacks, 
-			       &pd_session);
+			       &data->pd_session);
   if (status != PTHDB_SUCCESS)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
+  data->pd_active = 1;
   return pd_update (pid);
 }
 
@@ -935,12 +993,15 @@ pd_activate (int pid)
 static void
 pd_deactivate (void)
 {
-  if (!pd_active)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!data->pd_active)
     return;
-  pthdb_session_destroy (pd_session);
+  pthdb_session_destroy (data->pd_session);
   
   pid_to_prc (&inferior_ptid);
-  pd_active = 0;
+  data->pd_active = 0;
 }
 
 /* An object file has just been loaded.  Check whether the current
@@ -952,13 +1013,19 @@ pd_enable (void)
   int status;
   char *stub_name;
   struct bound_minimal_symbol ms;
+  struct aix_thread_variables *data;
+
+  if (!inferior_ptid.pid ())
+    return;
+  
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (data->pd_able)
     return;
 
   /* Check application word size.  */
-  arch64 = register_size (target_gdbarch (), 0) == 8;
+  data->arch64 = register_size (target_gdbarch (), 0) == 8;
 
   /* Check whether the application is pthreaded.  */
   stub_name = NULL;
@@ -972,13 +1039,13 @@ pd_enable (void)
   ms = lookup_minimal_symbol (stub_name, NULL, NULL);
   if (ms.minsym == NULL)
     return;
-  pd_brk_addr = ms.value_address ();
-  if (!create_thread_event_breakpoint (target_gdbarch (), pd_brk_addr))
+  data->pd_brk_addr = ms.value_address ();
+  if (!create_thread_event_breakpoint (target_gdbarch (), data->pd_brk_addr))
     return;
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  data->pd_able = 1; 
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
@@ -991,28 +1058,27 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!data->pd_able)
     return;
-  if (pd_active)
+  if (data->pd_active)
     pd_deactivate ();
-  pd_able = 0;
+  data->pd_able = 0;
   current_inferior ()->unpush_target (&aix_thread_ops);
 }
 
 /* new_objfile observer callback.
 
    If OBJFILE is non-null, check whether a threaded application is
-   being debugged, and if so, prepare for thread debugging.
-
-   If OBJFILE is null, stop debugging threads.  */
+   being debugged, and if so, prepare for thread debugging.  */
 
 static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
     pd_enable ();
-  else
-    pd_disable ();
 }
 
 /* Attach to process specified by ARGS.  */
@@ -1042,8 +1108,11 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 {
   struct thread_info *thread;
   pthdb_tid_t tid[2];
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (ptid);
 
-  if (!PD_TID (ptid))
+  if (!(ptid.tid () != 0))
     {
       scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
       
@@ -1065,7 +1134,7 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 	       ptid.lwp ());
       tid[1] = 0;
 
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_CONTINUE, tid[0], (long long) 1,
 		     gdb_signal_to_host (sig), (PTRACE_TYPE_ARG5) tid);
       else
@@ -1082,6 +1151,7 @@ ptid_t
 aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
 			 target_wait_flags options)
 {
+  struct aix_thread_variables *data;
   {
     pid_to_prc (&ptid);
 
@@ -1095,8 +1165,10 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
      pid-only ptids.  */
   gdb_assert (ptid.is_pid ());
 
+  data = get_thread_data_helper_for_ptid (ptid);
+
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
@@ -1105,7 +1177,7 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
       struct gdbarch *gdbarch = regcache->arch ();
 
       if (regcache_read_pc (regcache)
-	  - gdbarch_decr_pc_after_break (gdbarch) == pd_brk_addr)
+	  - gdbarch_decr_pc_after_break (gdbarch) == data->pd_brk_addr)
 	return pd_activate (ptid.pid ());
     }
 
@@ -1229,18 +1301,20 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
   ppc_gdbarch_tdep *tdep = gdbarch_tdep<ppc_gdbarch_tdep> (gdbarch);
   int status, i;
   pthdb_context_t ctx;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
 		"fetch_regs_user_thread %lx\n", (long) pdtid);
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: fetch_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
 
   /* General-purpose registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_gprs64 (regcache, ctx.gpr);
   else
     for (i = 0; i < ppc_num_gprs; i++)
@@ -1253,7 +1327,7 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Special registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_sprs64 (regcache, ctx.iar, ctx.msr, ctx.cr, ctx.lr, ctx.ctr,
 			     ctx.xer, ctx.fpscr);
   else
@@ -1288,18 +1362,21 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
   int i;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"fetch_regs_kernel_thread tid=%lx regno=%d arch64=%d\n",
-		(long) tid, regno, arch64);
+		(long) tid, regno, data->arch64);
 
   /* General-purpose registers.  */
   if (regno == -1
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_gprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -1331,7 +1408,7 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -1363,7 +1440,7 @@ aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
   struct thread_info *thread;
   pthdb_tid_t tid;
 
-  if (!PD_TID (regcache->ptid ()))
+  if (!(regcache->ptid ().tid () != 0))
     beneath ()->fetch_registers (regcache, regno);
   else
     {
@@ -1511,6 +1588,8 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   pthdb_context_t ctx;
   uint32_t int32;
   uint64_t int64;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1518,7 +1597,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Retrieve the thread's current context for its non-register
      values.  */
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1528,7 +1607,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   for (i = 0; i < ppc_num_gprs; i++)
     if (REG_VALID == regcache->get_register_status (tdep->ppc_gp0_regnum + i))
       {
-	if (arch64)
+	if (data->arch64)
 	  {
 	    regcache->raw_collect (tdep->ppc_gp0_regnum + i, (void *) &int64);
 	    ctx.gpr[i] = int64;
@@ -1545,7 +1624,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
     fill_fprs (regcache, ctx.fpr);
 
   /* Special registers (always kept in ctx as 64 bits).  */
-  if (arch64)
+  if (data->arch64)
     {
       fill_sprs64 (regcache, &ctx.iar, &ctx.msr, &ctx.cr, &ctx.lr, &ctx.ctr,
 			     &ctx.xer, &ctx.fpscr);
@@ -1576,7 +1655,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 	ctx.fpscr = tmp_fpscr;
     }
 
-  status = pthdb_pthread_setcontext (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_setcontext (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: "
 	     "pthdb_pthread_setcontext returned %s"),
@@ -1602,6 +1681,9 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs  sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1613,7 +1695,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_fprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some regs may not be in the cache.  */
 	  ptrace64aix (PTT_READ_GPRS, tid, (unsigned long) gprs64, 0, NULL);
@@ -1646,7 +1728,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some registers won't be in the cache.  */
 	  ptrace64aix (PTT_READ_SPRS, tid, 
@@ -1703,7 +1785,7 @@ aix_thread_target::store_registers (struct regcache *regcache, int regno)
   struct thread_info *thread;
   pthdb_tid_t tid;
 
-  if (!PD_TID (regcache->ptid ()))
+  if (!(regcache->ptid ().tid () != 0))
     beneath ()->store_registers (regcache, regno);
   else
     {
@@ -1741,7 +1823,7 @@ aix_thread_target::mourn_inferior ()
 {
   target_ops *beneath = this->beneath ();
 
-  pd_deactivate ();
+  pd_disable ();
   beneath->mourn_inferior ();
 }
 
@@ -1750,7 +1832,7 @@ aix_thread_target::mourn_inferior ()
 bool
 aix_thread_target::thread_alive (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (!(ptid.tid () != 0))
     return beneath ()->thread_alive (ptid);
 
   /* We update the thread list every time the child stops, so all
@@ -1766,7 +1848,7 @@ aix_thread_target::thread_alive (ptid_t ptid)
 std::string
 aix_thread_target::pid_to_str (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (!(ptid.tid () != 0))
     return beneath ()->pid_to_str (ptid);
 
   return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
@@ -1786,8 +1868,11 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
   pthdb_detachstate_t detachstate;
   int cancelpend;
   static char *ret = NULL;
+  struct aix_thread_variables *data;
+
+  data = get_thread_data_helper_for_ptid (thread->ptid);
 
-  if (!PD_TID (thread->ptid))
+  if (!(thread->ptid.tid () != 0))
     return NULL;
 
   string_file buf;
@@ -1800,24 +1885,24 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
     /* i18n: Like "thread-identifier %d, [state] running, suspended" */
     buf.printf (_("tid %d"), (int)tid);
 
-  status = pthdb_pthread_state (pd_session, pdtid, &state);
+  status = pthdb_pthread_state (data->pd_session, pdtid, &state);
   if (status != PTHDB_SUCCESS)
     state = PST_NOTSUP;
   buf.printf (", %s", state2str (state));
 
-  status = pthdb_pthread_suspendstate (pd_session, pdtid, 
+  status = pthdb_pthread_suspendstate (data->pd_session, pdtid, 
 				       &suspendstate);
   if (status == PTHDB_SUCCESS && suspendstate == PSS_SUSPENDED)
     /* i18n: Like "Thread-Id %d, [state] running, suspended" */
     buf.printf (_(", suspended"));
 
-  status = pthdb_pthread_detachstate (pd_session, pdtid, 
+  status = pthdb_pthread_detachstate (data->pd_session, pdtid, 
 				      &detachstate);
   if (status == PTHDB_SUCCESS && detachstate == PDS_DETACHED)
     /* i18n: Like "Thread-Id %d, [state] running, detached" */
     buf.printf (_(", detached"));
 
-  pthdb_pthread_cancelpend (pd_session, pdtid, &cancelpend);
+  pthdb_pthread_cancelpend (data->pd_session, pdtid, &cancelpend);
   if (status == PTHDB_SUCCESS && cancelpend)
     /* i18n: Like "Thread-Id %d, [state] running, cancel pending" */
     buf.printf (_(", cancel pending"));
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index f483f54de13..671c17cba46 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -618,6 +618,16 @@ solib_aix_bfd_open (const char *pathname)
       if (member_name == bfd_get_filename (object_bfd.get ()))
 	break;
 
+      std::string s = bfd_get_filename (object_bfd.get ());
+
+      /* For every inferior after first int bfd system we 
+	 will have the pathname instead of the member name
+	 registered. Hence the below condition exists.  */
+
+      if (s.find ('(') != std::string::npos
+	  && s.find (member_name) != std::string::npos)
+	return object_bfd; 
+
       object_bfd = gdb_bfd_openr_next_archived_file (archive_bfd.get (),
 						     object_bfd.get ());
     }
-- 
2.31.1



* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-03 11:10                                                             ` Aditya Kamath1
@ 2023-02-06 19:07                                                               ` Ulrich Weigand
  2023-02-07 11:57                                                                 ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2023-02-06 19:07 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>>I think the question here is simply whether, if you run the
>>test suite both without and with your patch, are any of the
>>FAILs fixed with the patch?   If not, it would be good to
>>create a new test that fails without the patch and succeeds
>>with it, and add that to the test suite.
>
>So, this is something new to me. We will add it as a continuation in the same thread
>after this patch. I will need one information. Which test suite will we add it in?
>gdb.threads or gdb.base? Also, kindly suggest a simple test case that is written that
>I can see and learn. Any simple hello_world program will do. I want to understand how
>that exp file is written and how it compares to tell if a test case is pass or fail.  

I think this would fit better into gdb.threads, given that this is about the
interaction of multiple inferiors with the threading library on AIX.

I'd just look at existing test cases in that directory.  For simple tests, we
usually have a .c file and a .exp file with the same name.  The .exp file
starts out with instructions to build the test case, and start it up under
GDB.  Then follow a series of test statements which are verified against
the output of the GDB under test.  As a simple example in a related area,
you can look e.g. at fork-child-threads.{c,exp}.


>Kindly give me feedback for this patch, incase we can do anything better
>or is incorrect.

Some comments:

>@@ -508,14 +550,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
>   /* This is needed to eliminate the dependency of current thread
>      which is null so that thread reads the correct target memory.  */
>   {
>-    scoped_restore_current_thread restore_current_thread;
>+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>     /* Before the first inferior is added, we pass inferior_ptid.pid ()
>        from pd_enable () which is 0.  There is no need to switch threads
>        during first initialisation.  In the rest of the callbacks the
>        current thread needs to be correct.  */
>     if (user_current_pid != 0)
>-      switch_to_thread (current_inferior ()->process_target (),
>-			ptid_t (user_current_pid));
>+      inferior_ptid = ptid_t (user_current_pid);
>     status = target_read_memory (addr, (gdb_byte *) buf, len);
>   }

This seems unrelated to the rest of the changes at first glance.
Why is this necessary?

Also, is the "user_current_pid != 0" check even still needed given
the change to pd_enable() below?

By comparison, the Linux version of this in proc-service.c also
switches the current inferior and address space:

  scoped_restore_current_inferior restore_inferior;
  set_current_inferior (ph->thread->inf);

  scoped_restore_current_program_space restore_current_progspace;
  set_current_program_space (ph->thread->inf->pspace);

  scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
  inferior_ptid = ph->thread->ptid;

so we should probably do the same for consistency.

Also, the same logic will be required in pdc_write_data, where it
is currently missing completely.
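
For reference, the save-and-restore idiom used in both versions can be
sketched standalone as below.  This is only an illustration of the pattern,
not GDB's implementation: scoped_restore_sketch, current_pid_sketch and
read_with_pid_selected are hypothetical names, and a plain int stands in
for inferior_ptid.

```cpp
#include <cassert>

// Toy stand-in for GDB's global inferior_ptid (assumption: a plain int).
static int current_pid_sketch = 0;

// Minimal RAII helper: remembers *VAR on construction and puts the saved
// value back when the object goes out of scope.
template <typename T>
class scoped_restore_sketch
{
public:
  explicit scoped_restore_sketch (T *var)
    : m_var (var), m_saved (*var)
  {
    assert (var != nullptr);
  }

  ~scoped_restore_sketch () { *m_var = m_saved; }

  scoped_restore_sketch (const scoped_restore_sketch &) = delete;
  scoped_restore_sketch &operator= (const scoped_restore_sketch &) = delete;

private:
  T *m_var;
  T m_saved;
};

// Hypothetical callback: select PID for the duration of a "memory read"
// and return the value the read would have seen.
static int
read_with_pid_selected (int pid)
{
  scoped_restore_sketch<int> restore (&current_pid_sketch);
  current_pid_sketch = pid;
  return current_pid_sketch;
}
```

Whatever value the global had before the call is back in place once the
function returns, including on early returns.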

>+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
>+  {
>+    **(struct thread_info ***) &g = tp;
>+    (*(struct thread_info ***) &g)++;
>+  }

This looks unnecessarily complicated.  Isn't this just
   *g++ = tp;
?
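
A small standalone check of that equivalence, as a sketch: the struct
thread_info below is a stand-in (not GDB's type), and fill_with_casts and
fill_simple are hypothetical helper names used only for this comparison.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for GDB's thread_info; only pointer identity matters here.
struct thread_info { int id; };

// Form used in the patch: take the address of the cursor G, cast it to the
// type it already has, then write through it and advance it.
static void
fill_with_casts (thread_info **buf, thread_info **src, std::size_t n)
{
  thread_info **g = buf;
  for (std::size_t i = 0; i < n; i++)
    {
      **(thread_info ***) &g = src[i];
      (*(thread_info ***) &g)++;
    }
  assert (g == buf + n);
}

// Simplified form: store through the cursor and post-increment it.
static void
fill_simple (thread_info **buf, thread_info **src, std::size_t n)
{
  thread_info **g = buf;
  for (std::size_t i = 0; i < n; i++)
    *g++ = src[i];
  assert (g == buf + n);
}
```

The casts were only needed in the old callbacks because the cursor arrived
as a void *; with a typed local cursor they are no-ops.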

>+	  /* If there is only one thread then we need not make the main 
>+	     thread look like a thread.  It can stay as a process. This
>+	     is useful when we have multiple inferiors, but only one is
>+	     threaded.  So we need not make the other inferiors with only
>+	     main thread, look like a threaded one.  For example, Thread
>+	     1.1, 1.2, 2.1, 3.1 exists then it is useful to skip this for
>+	     loop for 2.1 and 3.1 leaving them as main process thread with
>+	     a dummy priv set.  */
>+
>+	  if (pcount == 1 && gcount == 1)
>+	  {
>+	    aix_thread_info *priv = new aix_thread_info;
>+	    tp = find_thread_ptid (proc_target, gptid);
>+	    tp->priv.reset (priv);
>+	    break;
>+	  }

Is this a change in behavior to current GDB?  I thought if the
application (whether a single inferior or one of multiple inferiors)
is threaded in the sense that it uses the libpthread library we
wanted to show it as threaded, so that the user can e.g. see the
thread ID in info threads.

>+	      /* This is to make the main process thread now look
>+		 like a thread.  */
>+
>+	      if (gptid.is_pid () && gptid.pid () == pptid.pid ())
>+	      {
>+		thread_change_ptid (proc_target, gptid, pptid);
>+		aix_thread_info *priv = new aix_thread_info;
>+		priv->pdtid = pbuf[pi].pdtid;
>+		priv->tid = pbuf[pi].tid;
>+		tp = find_thread_ptid (proc_target, pptid);
>+		tp->priv.reset (priv);
>+		pi++;
>+		gi++;
>+	      }
>+	      else
>+	      {
>+		delete_thread (gbuf[gi]);
>+		gi++;
>+	      }

This logic is still confusing me.  Why is the
   gptid.pid () == pptid.pid ()
check still needed?  I thought we now collected only threads
of a single process to begin with, so they all ought to have
the same PID?

Also, if the point is the gptid.is_pid () check, this can
really only happen once per inferior, as it is switched
from non-threaded to threaded mode, right?  Maybe it
would simplify the logic to have all that (including
the code under 
  if (pcount == 1 && gcount == 1)
above if it is actually needed) in a separate statement
before that loop.

I.e. directly before the loop, have a separate check
whether the current process only has a single thread,
whose ptid_t is still in the pid-only format, and if
so, upgrade it to full TID format using the main thread's
TID.  Only after that, go through the loop to handle
any other threads we may also have.  (At that point,
all GDB threads should already always be in TID format.)
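
As a rough standalone model of that pre-loop step (a sketch only, under the
assumption that the thread library's list is sorted with the main thread
first): ptid_sketch is a toy version of ptid_t, and upgrade_main_thread is
a hypothetical name, not an existing GDB function.

```cpp
#include <cassert>
#include <vector>

// Toy model of GDB's ptid_t as a (pid, lwp, tid) triple.
struct ptid_sketch
{
  int pid;
  long lwp;
  unsigned long tid;

  // A pid-only ptid has neither an lwp nor a tid component.
  bool is_pid () const { return lwp == 0 && tid == 0; }
};

// GTHREADS is GDB's view of one process; PTHIDS are the pthread ids
// reported by the thread library for the same process.
// Returns true if the single pid-only entry was upgraded to TID format.
static bool
upgrade_main_thread (std::vector<ptid_sketch> &gthreads,
                     const std::vector<unsigned long> &pthids)
{
  if (gthreads.size () != 1 || !gthreads[0].is_pid () || pthids.empty ())
    return false;

  // Assumption: the first library entry is the main thread.
  gthreads[0].tid = pthids[0];
  return true;
}
```

After this step runs once, the merge loop only ever sees entries in TID
format, so it needs no special case for the main thread.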

>-  if (!PD_TID (ptid))
>+  if (!(ptid.tid () != 0))

That should just be "if (ptid.tid () == 0)" then.
(Here and in a few other places.)

>@@ -1741,7 +1823,7 @@ aix_thread_target::mourn_inferior ()
> {
>   target_ops *beneath = this->beneath ();
> 
>-  pd_deactivate ();
>+  pd_disable ();
>   beneath->mourn_inferior ();
> }

Why is this necessary?  If it is, do we even need two
separate pd_deactivate and pd_disable routines any more?

>@@ -618,6 +618,16 @@ solib_aix_bfd_open (const char *pathname)
>       if (member_name == bfd_get_filename (object_bfd.get ()))
> 	break;
> 
>+      std::string s = bfd_get_filename (object_bfd.get ());
>+
>+      /* For every inferior after first int bfd system we 
>+	 will have the pathname instead of the member name
>+	 registered. Hence the below condition exists.  */
>+
>+      if (s.find ('(') != std::string::npos
>+	  && s.find (member_name) != std::string::npos)
>+	return object_bfd; 

Ah, I guess you also need to ensure the member_name follows
immediately after the '(', otherwise there could be confusion
if the member name happens to be part of the file name as well.
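
One possible shape for such a stricter check, as a standalone sketch: it
assumes the registered name has the form "ARCHIVE(MEMBER)", and
filename_names_member is a hypothetical helper, not an existing GDB or BFD
function.

```cpp
#include <cassert>
#include <string>

// Return true if BFD_FILENAME has the form "ARCHIVE(MEMBER_NAME)", i.e.
// MEMBER_NAME starts right after the last '(' and is followed only by ')'.
static bool
filename_names_member (const std::string &bfd_filename,
                       const std::string &member_name)
{
  assert (!member_name.empty ());

  std::string::size_type open = bfd_filename.rfind ('(');
  if (open == std::string::npos)
    return false;

  // Compare everything after the '(' against "MEMBER_NAME)".
  return bfd_filename.compare (open + 1, std::string::npos,
                               member_name + ")") == 0;
}
```

With this, "/usr/lib/libc.a(shr.o)" matches member "shr.o", while a path
that merely contains "shr.o" somewhere before the parenthesis does not.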


Bye,
Ulrich



* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-06 19:07                                                               ` Ulrich Weigand
@ 2023-02-07 11:57                                                                 ` Aditya Kamath1
  2023-02-08 18:44                                                                   ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2023-02-07 11:57 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 18085 bytes --]

Hi Ulrich, Tom and community,

Please find attached the patch. I have written my answers to the previous comments. Kindly let me know if we need more changes. If not kindly push this to the community code. The sample output and programs are pasted below this email.

>I think this would fit better into gdb.threads, given that this is about the
>interaction of multiple inferiors with the threading library on AIX.

I will do this immediately after this patch is done.

>>     if (user_current_pid != 0)
>>+      inferior_ptid = ptid_t (user_current_pid);

>This seems unrelated to the rest of the changes at first glance.
>Why is this necessary?

We need to be in the right context when we read memory. Before entering the target wait, we call switch_to_no_thread (), which sets inferior_ptid to null, but target_read_memory needs the correct inferior_ptid. Also, once the application is threaded there is no longer a thread with ptid_t (pid), so inferior_ptid has to be set directly as shown in the patch. Previously we used switch_to_thread (). If the application is threaded and we pass only ptid_t (user_current_pid) to switch_to_thread (), it crashes, because the main thread's ptid is now ptid_t (pid, 0, tid). Hence we set inferior_ptid, which keeps things simple.

>Also, is the "user_current_pid != 0" check even still needed given
>the change to pd_enable() below?

I have removed this. You were right.

>By comparison, the Linux version of this in proc-service.c also
>switches the current inferior and address space:
>  scoped_restore_current_inferior restore_inferior;
>  set_current_inferior (ph->thread->inf);
>  scoped_restore_current_program_space restore_current_progspace;
>  set_current_program_space (ph->thread->inf->pspace);
>  scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>  inferior_ptid = ph->thread->ptid;
>so we should probably do the same for consistency.

Kindly allow me to disagree with you on this. In infrun.c, do_target_wait_1 () calls switch_to_inferior_no_thread (). The function is as follows:


void
switch_to_inferior_no_thread (inferior *inf)
{
  set_current_inferior (inf);
  switch_to_no_thread ();
  set_current_program_space (inf->pspace);
}


This already sets the current inferior and program space to the same values we would set in pdc_read_data if we followed the Linux code, so adding those changes makes no difference. switch_to_no_thread () sets inferior_ptid to null_ptid, which is why we set only inferior_ptid in pdc_read_data and nothing else. So I suggest we stick to this plan. Secondly, things work without making the same change in pdc_write_data; I have not seen anything fail. So I don't think we should add it there. What do you say?

>This looks unnecessarily complicated.  Isn't this just
>   *g++ = tp;

I have changed this.

>Is this a change in behavior to current GDB?  I thought if the
>application (whether a single inferior or one of multiple inferiors)
>is threaded in the sense that it uses the libpthread library we
>wanted to show it as threaded, so that the user can e.g. see the
>thread ID in info threads.

You are right. I had read somewhere, which I cannot recall now, that a process should be shown as threaded only when it has multiple threads. I checked the Linux output and it behaves as you described, so I have removed the gcount == 1 && pcount == 1 condition.

>This logic is still confusing me.  Why is the
>   gptid.pid () == pptid.pid ()
>check still needed?  I thought we now collected only threads
>of a single process to begin with, so they all ought to have
>the same PID?

>Also, if the point is the gptid.is_pid () check, this can
>really only happen once per inferior, as it is switched
>from non-threaded to threaded mode, right?

I have removed the gptid.pid () == pptid.pid () condition. I had added it because gcount (the GDB thread count) was not counted per process before, and I was worried about swapping the wrong process. Now we do not need it.

As for the gptid.is_pid () check, I suggest we keep it where it is: if cmp_result is > 0 and the GDB entry is still the pid-only main process, we swap it for the main thread. The rest of the loop stays the same. The reason is that handling the pi and gi variables becomes complex otherwise. When this swap happens we need to increment both pi and gi, because we have then taken care of the main thread in both the pthread library list and GDB's list. This swap happens only once. The first event is the main process becoming pthreaded: once the swap happens, pi and gi both become one, and since gcount = pcount = 1 we exit the for loop. Thread addition events come after this.
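
The loop behaviour described above can be modelled with a small standalone sketch. This is only an illustration under simplifying assumptions: threads are reduced to a single long id, the value 0 stands for a pid-only ptid, and sync_result and sync_lists are hypothetical names that do not exist in aix-thread.c.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: a thread is identified by one number.  The value 0
// stands for a pid-only ptid, i.e. the main process before it is known
// to the pthread library.
struct sync_result
{
  std::vector<long> threads;         // GDB's list after the sync
  std::vector<std::string> actions;  // what was done, in order
};

// Walk two sorted lists in step: PBUF is the pthread library's view,
// GBUF is GDB's view.  Both indexes only ever move forward.
static sync_result
sync_lists (const std::vector<long> &pbuf, const std::vector<long> &gbuf)
{
  sync_result res;
  std::size_t pi = 0, gi = 0;

  while (pi < pbuf.size () || gi < gbuf.size ())
    {
      if (pi == pbuf.size ())
        {
          // Only GDB still knows this thread: it has exited.
          res.actions.push_back ("delete " + std::to_string (gbuf[gi]));
          gi++;
        }
      else if (gi == gbuf.size ())
        {
          // Only the pthread library knows this thread: it is new.
          res.threads.push_back (pbuf[pi]);
          res.actions.push_back ("add " + std::to_string (pbuf[pi]));
          pi++;
        }
      else if (pbuf[pi] == gbuf[gi])
        {
          res.threads.push_back (pbuf[pi]);
          res.actions.push_back ("keep " + std::to_string (pbuf[pi]));
          pi++;
          gi++;
        }
      else if (pbuf[pi] > gbuf[gi])
        {
          if (gbuf[gi] == 0)
            {
              // The pid-only main process becomes the main thread.
              // Both lists are consumed, so both indexes advance.
              res.threads.push_back (pbuf[pi]);
              res.actions.push_back ("upgrade main to "
                                     + std::to_string (pbuf[pi]));
              pi++;
            }
          else
            res.actions.push_back ("delete " + std::to_string (gbuf[gi]));
          gi++;
        }
      else
        {
          res.threads.push_back (pbuf[pi]);
          res.actions.push_back ("add " + std::to_string (pbuf[pi]));
          pi++;
        }
    }

  return res;
}
```

In this model, sync_lists ({1}, {0}) performs the single upgrade and stops, and a later sync_lists ({1, 258, 515}, {1}) keeps thread 1 and adds the two new threads.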

>That should just be "if (ptid.tid () == 0)" then.
This is done.

>-  pd_deactivate ();
>+  pd_disable ();
>Why is this necessary?  If it is, do we even need two
>separate pd_deactivate and pd_disable routines any more?

When the process exits, all of its threads exit too by the time we reach mourn_inferior, so we disable everything. And yes, I have removed pd_deactivate ().

>>+      if (s.find ('(') != std::string::npos
>>+        && s.find (member_name) != std::string::npos)
>>+      return object_bfd;

>Ah, I guess you also need to ensure the member_name follows
>immediately after the '(', otherwise there could be confusion
>if the member name happens to be part of the file name as well.

I have changed this as you suggested. Kindly check the patch and let me know :)
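
For reference, one way to express "the member name must follow immediately after the '('" is sketched below. The helper name filename_has_member is hypothetical and the code is a standalone illustration, not the exact code in the patch; it also assumes that the last '(' in the name is the one that introduces the archive member.

```cpp
#include <cassert>
#include <string>

// Return true if FILENAME has the form "...(MEMBER...", i.e. MEMBER
// starts right after the last '(' in FILENAME.  A member name that only
// appears somewhere in the path part does not count.
static bool
filename_has_member (const std::string &filename, const std::string &member)
{
  std::string::size_type open = filename.rfind ('(');
  if (open == std::string::npos)
    return false;

  // compare () clamps the length to the end of FILENAME, so a short
  // tail simply compares unequal.
  return filename.compare (open + 1, member.size (), member) == 0;
}
```

With this check "/usr/lib/libpthread.a(shr_xpg5.o)" matches the member "shr_xpg5.o", while a path such as "/usr/lib/shr.o/libc.a(other.o)" does not match the member "shr.o".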

Have a nice day ahead.

Thanks and regards,
Aditya.

-------------------------------------------
Code:-


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
  {
    child = fork ();
    if (child > 0)
      printf ("I am child \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;
  pthread_t thread[NUM_THREADS];

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      int res;

      res = pthread_create (&thread[i], NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
  }

  return 0;
}

-------------------------------------------------

Output with patch:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 15991124)]
I am parent
[New inferior 3 (process 20840796)]
I am parent
^Cin
Thread 1.1 received signal SIGINT, Interrupt.
[Switching to Thread 1]
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info threads
  Id   Target Id                          Frame
* 1.1  Thread 1 (tid 33947921, running)   0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.2  Thread 258 (tid 37421465, running) thread_function (arg=0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
  1.3  Thread 515 (tid 32899441, running) thread_function (arg=0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
  2.1  Thread 515 (tid 33751493, running) 0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  3.1  Thread 258 (tid 34931151, running) 0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb)


-----------------------------------------------------------------------
Output without patch:-
Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 1]
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 11731200)]
I am parent
[New inferior 3 (process 16843200)]
I am parent
^C
Thread 1.1 received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) inferior 2
[Switching to inferior 2 [process 11731200] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]
[Switching to thread 2.1 (process 11731200)]
#0  0xd0594fc8 in ?? ()
(gdb) info threads
  Id   Target Id         Frame
  1.1  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.2  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.3  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.4  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
* 2.1  process 11731200  0xd0594fc8 in ?? ()
  3.1  process 16843200  0xd0594fc8 in ?? ()
(gdb) info sharedlibrary
warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing.
warning: "/usr/lib/libcrypt.a": member "shr.o" missing.
warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing.
warning: "/usr/lib/libc.a": member "shr.o" missing.
warning: Could not load shared library symbols for 4 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o).
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
From        To          Syms Read   Shared Object Library
                        No          /usr/lib/libpthreads.a(shr_comm.o)
                        No          /usr/lib/libcrypt.a(shr.o)
                        No          /usr/lib/libpthread.a(shr_xpg5.o)
                        No          /usr/lib/libc.a(shr.o)
(gdb)


________________________________
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Sent: 07 February 2023 00:37
To: simark@simark.ca <simark@simark.ca>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>>I think the question here is simply whether, if you run the
>>test suite both without and with your patch, are any of the
>>FAILs fixed with the patch?   If not, it would be good to
>>create a new test that fails without the patch and succeeds
>>with it, and add that to the test suite.
>
>So, this is something new to me. We will add it as a continuation in the same thread
>after this patch. I will need one information. Which test suite will we add it in?
>gdb.threads or gdb.base? Also, kindly suggest a simple test case that is written that
>I can see and learn. Any simple hello_world program will do. I want to understand how
>that exp file is written and how it compares to tell if a test case is pass or fail.

I think this would fit better into gdb.threads, given that this is about the
interaction of multiple inferiors with the threading library on AIX.

I'd just look at existing test cases in that directory.  For simple tests, we
usually have a .c file and a .exp file with the same name.  The .exp file
starts out with instructions to build the test case, and start it up under
GDB.  Then follow a series of test statements which are verified against
the output of the GDB under test.  As a simple example in a related area,
you can look e.g. at fork-child-threads.{c,exp}.


>Kindly give me feedback for this patch, incase we can do anything better
>or is incorrect.

Some comments:

>@@ -508,14 +550,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
>   /* This is needed to eliminate the dependency of current thread
>      which is null so that thread reads the correct target memory.  */
>   {
>-    scoped_restore_current_thread restore_current_thread;
>+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>     /* Before the first inferior is added, we pass inferior_ptid.pid ()
>        from pd_enable () which is 0.  There is no need to switch threads
>        during first initialisation.  In the rest of the callbacks the
>        current thread needs to be correct.  */
>     if (user_current_pid != 0)
>-      switch_to_thread (current_inferior ()->process_target (),
>-                      ptid_t (user_current_pid));
>+      inferior_ptid = ptid_t (user_current_pid);
>     status = target_read_memory (addr, (gdb_byte *) buf, len);
>   }

This seems unrelated to the rest of the changes at first glance.
Why is this necessary?

Also, is the "user_current_pid != 0" check even still needed given
the change to pd_enable() below?

By comparison, the Linux version of this in proc-service.c also
switches the current inferior and address space:

  scoped_restore_current_inferior restore_inferior;
  set_current_inferior (ph->thread->inf);

  scoped_restore_current_program_space restore_current_progspace;
  set_current_program_space (ph->thread->inf->pspace);

  scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
  inferior_ptid = ph->thread->ptid;

so we should probably do the same for consistency.

Also, the same logic will be required in pdc_write_data, where it
is currently missing completely.

>+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
>+  {
>+    **(struct thread_info ***) &g = tp;
>+    (*(struct thread_info ***) &g)++;
>+  }

This looks unnecessarily complicated.  Isn't this just
   *g++ = tp;
?

>+        /* If there is only one thread then we need not make the main
>+           thread look like a thread.  It can stay as a process. This
>+           is useful when we have multiple inferiors, but only one is
>+           threaded.  So we need not make the other inferiors with only
>+           main thread, look like a threaded one.  For example, Thread
>+           1.1, 1.2, 2.1, 3.1 exists then it is useful to skip this for
>+           loop for 2.1 and 3.1 leaving them as main process thread with
>+           a dummy priv set.  */
>+
>+        if (pcount == 1 && gcount == 1)
>+        {
>+          aix_thread_info *priv = new aix_thread_info;
>+          tp = find_thread_ptid (proc_target, gptid);
>+          tp->priv.reset (priv);
>+          break;
>+        }

Is this a change in behavior to current GDB?  I thought if the
application (whether a single inferior or one of multiple inferiors)
is threaded in the sense that it uses the libpthread library we
wanted to show it as threaded, so that the user can e.g. see the
thread ID in info threads.

>+            /* This is to make the main process thread now look
>+               like a thread.  */
>+
>+            if (gptid.is_pid () && gptid.pid () == pptid.pid ())
>+            {
>+              thread_change_ptid (proc_target, gptid, pptid);
>+              aix_thread_info *priv = new aix_thread_info;
>+              priv->pdtid = pbuf[pi].pdtid;
>+              priv->tid = pbuf[pi].tid;
>+              tp = find_thread_ptid (proc_target, pptid);
>+              tp->priv.reset (priv);
>+              pi++;
>+              gi++;
>+            }
>+            else
>+            {
>+              delete_thread (gbuf[gi]);
>+              gi++;
>+            }

This logic is still confusing me.  Why is the
   gptid.pid () == pptid.pid ()
check still needed?  I thought we now collected only threads
of a single process to begin with, so they all ought to have
the same PID?

Also, if the point is the gptid.is_pid () check, this can
really only happen once per inferior, as it is switched
from non-threaded to threaded mode, right?  Maybe it
would simplify the logic to have all that (including
the code under
  if (pcount == 1 && gcount == 1)
above if it is actually needed) in a separate statement
before that loop.

I.e. directly before the loop, have a separate check
whether the current process only has a single thread,
whose ptid_t is still in the pid-only format, and if
so, upgrade it to full TID format using the main thread's
TID.  Only after that, go through the loop to handle
any other threads we may also have.  (At that point,
all GDB threads should already always be in TID format.)

>-  if (!PD_TID (ptid))
>+  if (!(ptid.tid () != 0))

That should just be "if (ptid.tid () == 0)" then.
(Here and in a few other places.)

>@@ -1741,7 +1823,7 @@ aix_thread_target::mourn_inferior ()
> {
>   target_ops *beneath = this->beneath ();
>
>-  pd_deactivate ();
>+  pd_disable ();
>   beneath->mourn_inferior ();
> }

Why is this necessary?  If it is, do we even need two
separate pd_deactivate and pd_disable routines any more?

>@@ -618,6 +618,16 @@ solib_aix_bfd_open (const char *pathname)
>       if (member_name == bfd_get_filename (object_bfd.get ()))
>        break;
>
>+      std::string s = bfd_get_filename (object_bfd.get ());
>+
>+      /* For every inferior after first int bfd system we
>+       will have the pathname instead of the member name
>+       registered. Hence the below condition exists.  */
>+
>+      if (s.find ('(') != std::string::npos
>+        && s.find (member_name) != std::string::npos)
>+      return object_bfd;

Ah, I guess you also need to ensure the member_name follows
immediately after the '(', otherwise there could be confusion
if the member name happens to be part of the file name as well.


Bye,
Ulrich


[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 26021 bytes --]

From 1f5e0d208173d36c6f940e4f98f73f9e93934fe9 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Tue, 7 Feb 2023 05:42:13 -0600
Subject: [PATCH] Fix Multi thread debug bug fix in AIX

In the recent commit 98ed24fb35d89eb20179edf6c12f599c7a9e228e made by Mr. Tom there is a change  in aix-thread.c file that changes

static_cast <aix_thread_info *> in gdb to gdb::checked_static_cast <aix_thread_info *>

AIX folks using the latest version will not be able to debug multi thread programs as a result of it

The error in AIX is as follows:-

internal-error: checked_static_cast: Assertion 'result != nullptr' failed.

The reason is that once the threads are synchronised with sync_threadlists () and threads are added with priv -

We iterate over threads to get the thread who caused the event and return its ptid

However the all_threads_safe () function will always return NULL as thread is yet to be updated in that list

So what happens is we were not setting the top target as threads as

Shared library was not loaded for a new process and

gcount was not counted per process

This patch is a fix for the same.
---
 gdb/aix-thread.c | 327 +++++++++++++++++++++++++++--------------------
 gdb/solib-aix.c  |  14 ++
 2 files changed, 205 insertions(+), 136 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..1d800f8b7be 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -68,10 +68,6 @@ static bool debug_aix_thread;
 #define pthdb_tid_t	tid_t
 #endif
 
-/* Return whether to treat PID as a debuggable thread id.  */
-
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
-
 /* Success and failure values returned by pthdb callbacks.  */
 
 #define PDC_SUCCESS	PTHDB_SUCCESS
@@ -144,24 +140,6 @@ class aix_thread_target final : public target_ops
 
 static aix_thread_target aix_thread_ops;
 
-/* Address of the function that libpthread will call when libpthdebug
-   is ready to be initialized.  */
-
-static CORE_ADDR pd_brk_addr;
-
-/* Whether the current application is debuggable by pthdb.  */
-
-static int pd_able = 0;
-
-/* Whether a threaded application is being debugged.  */
-
-static int pd_active = 0;
-
-/* Whether the current architecture is 64-bit.  
-   Only valid when pd_able is true.  */
-
-static int arch64;
-
 /* Forward declarations for pthdb callbacks.  */
 
 static int pdc_symbol_addrs (pthdb_user_t, pthdb_symbol_t *, int);
@@ -191,9 +169,66 @@ static pthdb_callbacks_t pd_callbacks = {
   NULL
 };
 
-/* Current pthdb session.  */
+/* Aix variable structure.  */
+struct aix_thread_variables 
+{
+  /* Whether the current application is debuggable by pthdb.  */
+  int pd_able;
+
+  /* Whether a threaded application is being debugged.  */
+  int pd_active;
+
+  /* Current pthdb session.  */
+  pthdb_session_t pd_session;
+
+  /* Address of the function that libpthread will call when libpthdebug
+   is ready to be initialized.  */
+  CORE_ADDR pd_brk_addr;
+
+  /* Whether the current architecture is 64-bit.
+   Only valid when pd_able is true.  */
+  int arch64;
+};
+
+/* Key to our per-inferior data.  */
+static const registry<inferior>::key<aix_thread_variables>
+  aix_thread_variables_handle;
 
-static pthdb_session_t pd_session;
+/* Function to Get aix_thread_variables data.  */
+static struct aix_thread_variables*
+get_aix_thread_variables_data (struct inferior *inf)
+{
+  if (inf == NULL)
+    return NULL;
+
+  struct aix_thread_variables* data;
+
+  data = aix_thread_variables_handle.get (inf);
+  if (data == NULL)
+    data = aix_thread_variables_handle.emplace (inf);
+
+  return data;
+}
+
+/* Helper to get data for ptid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_ptid (ptid_t ptid)
+{
+  inferior *inf = find_inferior_ptid (current_inferior ()->process_target (),
+					ptid);
+  return get_aix_thread_variables_data (inf);
+}
+
+/* Helper to get data for pid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_pid (pid_t pid)
+{
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (),
+                                        pid);
+  return get_aix_thread_variables_data (inf);
+}
 
 /* Return a printable representation of pthdebug function return
    STATUS.  */
@@ -318,7 +353,7 @@ pid_to_prc (ptid_t *ptidp)
   ptid_t ptid;
 
   ptid = *ptidp;
-  if (PD_TID (ptid))
+  if (ptid.tid () != 0)
     *ptidp = ptid_t (ptid.pid ());
 }
 
@@ -389,6 +424,9 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_pid);
   
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_read_regs tid=%d flags=%s\n",
@@ -397,7 +435,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -423,7 +461,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -456,6 +494,10 @@ pdc_write_regs (pthdb_user_t user_current_pid,
      this is needed, I have implemented what I think it should do,
      however this code is untested.  */
 
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_pid);
+
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_write_regs tid=%d flags=%s\n",
 		(int) tid, hex_string (flags));
@@ -463,7 +505,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_WRITE_GPRS, tid, 
 		     (unsigned long) context->gpr, 0, NULL);
       else
@@ -479,7 +521,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  ptrace64aix (PTT_WRITE_SPRS, tid, 
 		       (unsigned long) &context->msr, 0, NULL);
@@ -508,14 +550,12 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
-    if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+    inferior_ptid = ptid_t (user_current_pid);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -639,39 +679,6 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_count (struct thread_info *thread, void *countp)
-{
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
-  return 0;
-}
-
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_accum (struct thread_info *thread, void *bufp)
-{
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
-  return 0;
-}
-
 /* ptid comparison function */
 
 static int
@@ -719,7 +726,10 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      /* We also need to keep in mind Trap and interrupt or any
+         signal that needs to be handled in pd_update ().  */
+
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -750,6 +760,11 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -759,11 +774,11 @@ sync_threadlists (int pid)
 
   for (cmd = PTHDB_LIST_FIRST;; cmd = PTHDB_LIST_NEXT)
     {
-      status = pthdb_pthread (pd_session, &pdtid, cmd);
+      status = pthdb_pthread (data->pd_session, &pdtid, cmd);
       if (status != PTHDB_SUCCESS || pdtid == PTHDB_INVALID_PTHREAD)
 	break;
 
-      status = pthdb_pthread_ptid (pd_session, pdtid, &pthid);
+      status = pthdb_pthread_ptid (data->pd_session, pdtid, &pthid);
       if (status != PTHDB_SUCCESS || pthid == PTHDB_INVALID_PTID)
 	continue;
 
@@ -780,7 +795,7 @@ sync_threadlists (int pid)
 
   for (pi = 0; pi < pcount; pi++)
     {
-      status = pthdb_pthread_tid (pd_session, pbuf[pi].pdtid, &tid);
+      status = pthdb_pthread_tid (data->pd_session, pbuf[pi].pdtid, &tid);
       if (status != PTHDB_SUCCESS)
 	tid = PTHDB_INVALID_TID;
       pbuf[pi].tid = tid;
@@ -790,10 +805,17 @@ sync_threadlists (int pid)
 
   /* Accumulate an array of GDB threads sorted by pid.  */
 
+  /* gcount is GDB thread count and pcount is pthreadlib thread count.  */
+
   gcount = 0;
-  iterate_over_threads (giter_count, &gcount);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    gcount++;
   g = gbuf = XNEWVEC (struct thread_info *, gcount);
-  iterate_over_threads (giter_accum, &g);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+  {
+    *g = tp;
+    *g++;
+  }
   qsort (gbuf, gcount, sizeof *gbuf, gcmp);
 
   /* Apply differences between the two arrays to GDB's thread list.  */
@@ -810,8 +832,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -841,13 +861,28 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      /* This is to make the main process thread now look
+		 like a thread.  */
+
+	      if (gptid.is_pid ())
+	      {
+		thread_change_ptid (proc_target, gptid, pptid);
+		aix_thread_info *priv = new aix_thread_info;
+		priv->pdtid = pbuf[pi].pdtid;
+		priv->tid = pbuf[pi].tid;
+		tp = find_thread_ptid (proc_target, pptid);
+		tp->priv.reset (priv);
+		pi++;
+		gi++;
+	      }
+	      else
+	      {
+		delete_thread (gbuf[gi]);
+		gi++;
+	      }
 	    }
 	  else
 	    {
-	      process_stratum_target *proc_target
-		= current_inferior ()->process_target ();
 	      thread = add_thread (proc_target, pptid);
 
 	      aix_thread_info *priv = new aix_thread_info;
@@ -887,11 +922,14 @@ pd_update (int pid)
   ptid_t ptid;
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
+  struct aix_thread_variables *data;
+
+  data = get_thread_data_helper_for_pid (pid);
 
-  if (!pd_active)
+  if (!data->pd_active)
     return ptid_t (pid);
 
-  status = pthdb_session_update (pd_session);
+  status = pthdb_session_update (data->pd_session);
   if (status != PTHDB_SUCCESS)
     return ptid_t (pid);
 
@@ -918,31 +956,20 @@ static ptid_t
 pd_activate (int pid)
 {
   int status;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
 		
-  status = pthdb_session_init (pid, arch64 ? PEM_64BIT : PEM_32BIT,
+  status = pthdb_session_init (pid, data->arch64 ? PEM_64BIT : PEM_32BIT,
 			       PTHDB_FLAG_REGS, &pd_callbacks, 
-			       &pd_session);
+			       &data->pd_session);
   if (status != PTHDB_SUCCESS)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
+  data->pd_active = 1;
   return pd_update (pid);
 }
 
-/* Undo the effects of pd_activate().  */
-
-static void
-pd_deactivate (void)
-{
-  if (!pd_active)
-    return;
-  pthdb_session_destroy (pd_session);
-  
-  pid_to_prc (&inferior_ptid);
-  pd_active = 0;
-}
-
 /* An object file has just been loaded.  Check whether the current
    application is pthreaded, and if so, prepare for thread debugging.  */
 
@@ -952,13 +979,19 @@ pd_enable (void)
   int status;
   char *stub_name;
   struct bound_minimal_symbol ms;
+  struct aix_thread_variables *data;
+
+  if (!inferior_ptid.pid ())
+    return;
+  
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (data->pd_able)
     return;
 
   /* Check application word size.  */
-  arch64 = register_size (target_gdbarch (), 0) == 8;
+  data->arch64 = register_size (target_gdbarch (), 0) == 8;
 
   /* Check whether the application is pthreaded.  */
   stub_name = NULL;
@@ -972,13 +1005,13 @@ pd_enable (void)
   ms = lookup_minimal_symbol (stub_name, NULL, NULL);
   if (ms.minsym == NULL)
     return;
-  pd_brk_addr = ms.value_address ();
-  if (!create_thread_event_breakpoint (target_gdbarch (), pd_brk_addr))
+  data->pd_brk_addr = ms.value_address ();
+  if (!create_thread_event_breakpoint (target_gdbarch (), data->pd_brk_addr))
     return;
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  data->pd_able = 1; 
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
@@ -991,28 +1024,31 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!data->pd_able)
     return;
-  if (pd_active)
-    pd_deactivate ();
-  pd_able = 0;
+  if (!data->pd_active)
+    return;
+  pthdb_session_destroy (data->pd_session);
+ 
+  pid_to_prc (&inferior_ptid);
+  data->pd_active = 0;
+  data->pd_able = 0;
   current_inferior ()->unpush_target (&aix_thread_ops);
 }
 
 /* new_objfile observer callback.
 
    If OBJFILE is non-null, check whether a threaded application is
-   being debugged, and if so, prepare for thread debugging.
-
-   If OBJFILE is null, stop debugging threads.  */
+   being debugged, and if so, prepare for thread debugging.  */
 
 static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
     pd_enable ();
-  else
-    pd_disable ();
 }
 
 /* Attach to process specified by ARGS.  */
@@ -1042,8 +1078,11 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 {
   struct thread_info *thread;
   pthdb_tid_t tid[2];
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (ptid);
 
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     {
       scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
       
@@ -1065,7 +1104,7 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 	       ptid.lwp ());
       tid[1] = 0;
 
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_CONTINUE, tid[0], (long long) 1,
 		     gdb_signal_to_host (sig), (PTRACE_TYPE_ARG5) tid);
       else
@@ -1082,6 +1121,7 @@ ptid_t
 aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
 			 target_wait_flags options)
 {
+  struct aix_thread_variables *data;
   {
     pid_to_prc (&ptid);
 
@@ -1095,8 +1135,10 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
      pid-only ptids.  */
   gdb_assert (ptid.is_pid ());
 
+  data = get_thread_data_helper_for_ptid (ptid);
+
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
@@ -1105,7 +1147,7 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
       struct gdbarch *gdbarch = regcache->arch ();
 
       if (regcache_read_pc (regcache)
-	  - gdbarch_decr_pc_after_break (gdbarch) == pd_brk_addr)
+	  - gdbarch_decr_pc_after_break (gdbarch) == data->pd_brk_addr)
 	return pd_activate (ptid.pid ());
     }
 
@@ -1229,18 +1271,20 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
   ppc_gdbarch_tdep *tdep = gdbarch_tdep<ppc_gdbarch_tdep> (gdbarch);
   int status, i;
   pthdb_context_t ctx;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
 		"fetch_regs_user_thread %lx\n", (long) pdtid);
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: fetch_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
 
   /* General-purpose registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_gprs64 (regcache, ctx.gpr);
   else
     for (i = 0; i < ppc_num_gprs; i++)
@@ -1253,7 +1297,7 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Special registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_sprs64 (regcache, ctx.iar, ctx.msr, ctx.cr, ctx.lr, ctx.ctr,
 			     ctx.xer, ctx.fpscr);
   else
@@ -1288,18 +1332,21 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
   int i;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"fetch_regs_kernel_thread tid=%lx regno=%d arch64=%d\n",
-		(long) tid, regno, arch64);
+		(long) tid, regno, data->arch64);
 
   /* General-purpose registers.  */
   if (regno == -1
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_gprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -1331,7 +1378,7 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -1363,7 +1410,7 @@ aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
   struct thread_info *thread;
   pthdb_tid_t tid;
 
-  if (!PD_TID (regcache->ptid ()))
+  if (regcache->ptid ().tid () == 0)
     beneath ()->fetch_registers (regcache, regno);
   else
     {
@@ -1511,6 +1558,8 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   pthdb_context_t ctx;
   uint32_t int32;
   uint64_t int64;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1518,7 +1567,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Retrieve the thread's current context for its non-register
      values.  */
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1528,7 +1577,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   for (i = 0; i < ppc_num_gprs; i++)
     if (REG_VALID == regcache->get_register_status (tdep->ppc_gp0_regnum + i))
       {
-	if (arch64)
+	if (data->arch64)
 	  {
 	    regcache->raw_collect (tdep->ppc_gp0_regnum + i, (void *) &int64);
 	    ctx.gpr[i] = int64;
@@ -1545,7 +1594,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
     fill_fprs (regcache, ctx.fpr);
 
   /* Special registers (always kept in ctx as 64 bits).  */
-  if (arch64)
+  if (data->arch64)
     {
       fill_sprs64 (regcache, &ctx.iar, &ctx.msr, &ctx.cr, &ctx.lr, &ctx.ctr,
 			     &ctx.xer, &ctx.fpscr);
@@ -1576,7 +1625,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 	ctx.fpscr = tmp_fpscr;
     }
 
-  status = pthdb_pthread_setcontext (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_setcontext (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: "
 	     "pthdb_pthread_setcontext returned %s"),
@@ -1602,6 +1651,9 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs  sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1613,7 +1665,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_fprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some regs may not be in the cache.  */
 	  ptrace64aix (PTT_READ_GPRS, tid, (unsigned long) gprs64, 0, NULL);
@@ -1646,7 +1698,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some registers won't be in the cache.  */
 	  ptrace64aix (PTT_READ_SPRS, tid, 
@@ -1703,7 +1755,7 @@ aix_thread_target::store_registers (struct regcache *regcache, int regno)
   struct thread_info *thread;
   pthdb_tid_t tid;
 
-  if (!PD_TID (regcache->ptid ()))
+  if (regcache->ptid ().tid () == 0)
     beneath ()->store_registers (regcache, regno);
   else
     {
@@ -1741,7 +1793,7 @@ aix_thread_target::mourn_inferior ()
 {
   target_ops *beneath = this->beneath ();
 
-  pd_deactivate ();
+  pd_disable ();
   beneath->mourn_inferior ();
 }
 
@@ -1750,7 +1802,7 @@ aix_thread_target::mourn_inferior ()
 bool
 aix_thread_target::thread_alive (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     return beneath ()->thread_alive (ptid);
 
   /* We update the thread list every time the child stops, so all
@@ -1766,7 +1818,7 @@ aix_thread_target::thread_alive (ptid_t ptid)
 std::string
 aix_thread_target::pid_to_str (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     return beneath ()->pid_to_str (ptid);
 
   return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
@@ -1786,8 +1838,11 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
   pthdb_detachstate_t detachstate;
   int cancelpend;
   static char *ret = NULL;
+  struct aix_thread_variables *data;
+
+  data = get_thread_data_helper_for_ptid (thread->ptid);
 
-  if (!PD_TID (thread->ptid))
+  if (thread->ptid.tid () == 0)
     return NULL;
 
   string_file buf;
@@ -1800,24 +1855,24 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
     /* i18n: Like "thread-identifier %d, [state] running, suspended" */
     buf.printf (_("tid %d"), (int)tid);
 
-  status = pthdb_pthread_state (pd_session, pdtid, &state);
+  status = pthdb_pthread_state (data->pd_session, pdtid, &state);
   if (status != PTHDB_SUCCESS)
     state = PST_NOTSUP;
   buf.printf (", %s", state2str (state));
 
-  status = pthdb_pthread_suspendstate (pd_session, pdtid, 
+  status = pthdb_pthread_suspendstate (data->pd_session, pdtid, 
 				       &suspendstate);
   if (status == PTHDB_SUCCESS && suspendstate == PSS_SUSPENDED)
     /* i18n: Like "Thread-Id %d, [state] running, suspended" */
     buf.printf (_(", suspended"));
 
-  status = pthdb_pthread_detachstate (pd_session, pdtid, 
+  status = pthdb_pthread_detachstate (data->pd_session, pdtid, 
 				      &detachstate);
   if (status == PTHDB_SUCCESS && detachstate == PDS_DETACHED)
     /* i18n: Like "Thread-Id %d, [state] running, detached" */
     buf.printf (_(", detached"));
 
-  pthdb_pthread_cancelpend (pd_session, pdtid, &cancelpend);
+  pthdb_pthread_cancelpend (data->pd_session, pdtid, &cancelpend);
   if (status == PTHDB_SUCCESS && cancelpend)
     /* i18n: Like "Thread-Id %d, [state] running, cancel pending" */
     buf.printf (_(", cancel pending"));
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index f483f54de13..6be81064ebd 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -618,6 +618,20 @@ solib_aix_bfd_open (const char *pathname)
       if (member_name == bfd_get_filename (object_bfd.get ()))
 	break;
 
+      std::string s = bfd_get_filename (object_bfd.get ());
+
+      /* For every inferior after first int bfd system we 
+	 will have the pathname instead of the member name
+	 registered. Hence the below condition exists.  */
+
+      if (s.find ('(') != std::string::npos)
+	{
+	  int pos = s.find ('(');
+	  int len = s.find (')') - s.find ('(');
+	  if (s.substr (pos+1, len-1) == member_name) 
+	    return object_bfd;
+	}
+
       object_bfd = gdb_bfd_openr_next_archived_file (archive_bfd.get (),
 						     object_bfd.get ());
     }
-- 
2.31.1
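
The solib-aix.c hunk above compares the text between '(' and ')' in a BFD filename of the form "archive(member)" against the member name. A standalone sketch of that extraction is shown below. It is not the code from the patch: the function name archive_member_name is hypothetical, and the sketch additionally rejects names without a closing ')', which the hunk does not check.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of GDB.  Stores in MEMBER the text
// between the first '(' and the next ')' in NAME, e.g. "shr.o" for
// "/usr/lib/libc.a(shr.o)".  Returns false (leaving MEMBER untouched)
// if NAME has no well-formed "(member)" part.
bool
archive_member_name (const std::string &name, std::string &member)
{
  std::string::size_type open = name.find ('(');
  if (open == std::string::npos)
    return false;

  std::string::size_type close = name.find (')', open + 1);
  if (close == std::string::npos)
    return false;

  member = name.substr (open + 1, close - open - 1);
  return true;
}
```

Using std::string::size_type rather than int for the positions avoids narrowing the result of find ().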


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-07 11:57                                                                 ` Aditya Kamath1
@ 2023-02-08 18:44                                                                   ` Ulrich Weigand
  2023-02-10 16:33                                                                     ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2023-02-08 18:44 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>>This seems unrelated to the rest of the changes at first glance.
>>Why is this necessary?
>
>So, we need to be in the right context when we read memory. Before
>coming into the target wait, we switch_to_no_thread () due to which our
>inferior_ptid is set to null. Our target_memory needs the correct
>inferior_ptid.  Also, in case we don't have a ptid_t (pid) and the
>application is threaded we need the inferior_ptid to be set correctly
>like shown in the patch.

Understood.

>Previously we used switch_to_thread ().. Now if the application is
>threaded and we only pass ptid_t (user_current_pid) to switch_to_thread ()
>it will crash as main thread looks different or is ptid_t (pid, 0, tid).

This part I don't quite understand yet - how/why does it crash?

>>By comparison, the Linux version of this in proc-service.c also
>>switches the current inferior and address space:
> > scoped_restore_current_inferior restore_inferior;
> > set_current_inferior (ph->thread->inf);
> > scoped_restore_current_program_space restore_current_progspace;
> > set_current_program_space (ph->thread->inf->pspace);
> > scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
> > inferior_ptid = ph->thread->ptid;
>> so we should probably do the same for consistency.

>So, kindly allow me to disagree with you on this. What is happening is in
>inferior.c in do_target_wait1 () we call switch_to_inferior_no_thread ()..
[snip]
>Here we already set the current inferior and program space to the same
>values that pdc_read_memory would set if it did what Linux does.
>So, it does not make any difference to add the changes like Linux does

Well, it does look like if you entered the callback in this particular
context, the inferior may have already been set up correctly.  However,
in theory the callback could also be called in different contexts, and
just as a precaution it would be preferable to have it always work
correctly.  The semantics of the callback is to read memory of a
particular process as identified via the pthdb_user_t argument, and
we should write the routine so that it always does what's needed to
implement that semantics correctly.

>Secondly, things work if we do not do the same for pdc_write_memory.
>I have not seen anything not work. So, I don't think it is good to
>add it there. What say??

Similarly, I agree that everything may currently "work" without
adding the equivalent change to pdc_write_memory, but most likely
this is simply because that callback may just not be used very much.

But as a precaution, and to accommodate potential future changes
e.g. in the libpthdebug.a library, it would be preferable to
implement the semantics correctly.  (Also, it just looks surprising
to see the read and write implementation differ when there is no
apparent reason why that should be the case.)
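
The idea of having both callbacks establish their own context could look roughly like the sketch below. It is not GDB code: scoped_restore_sketch, current_ptid, read_memory_cb and write_memory_cb are hypothetical stand-ins for a scoped restore, inferior_ptid and the pdc_read_memory / pdc_write_memory callbacks. It only illustrates that each callback selects the requested context itself and puts the caller's value back on exit.

```cpp
#include <cassert>

// Hypothetical stand-in for a global "currently selected" id.
static int current_ptid = 0;

// Minimal RAII helper: remembers a variable's value and puts it back
// when the enclosing scope ends.
template <typename T>
class scoped_restore_sketch
{
public:
  explicit scoped_restore_sketch (T *var) : m_var (var), m_saved (*var) {}
  ~scoped_restore_sketch () { *m_var = m_saved; }
  scoped_restore_sketch (const scoped_restore_sketch &) = delete;
  scoped_restore_sketch &operator= (const scoped_restore_sketch &) = delete;

private:
  T *m_var;
  T m_saved;
};

// Both callbacks set up the context themselves instead of relying on
// whatever the caller happened to leave selected.
int
read_memory_cb (int user_ptid)
{
  scoped_restore_sketch<int> save (&current_ptid);
  current_ptid = user_ptid;
  return current_ptid;   // The context the "read" would run in.
}

int
write_memory_cb (int user_ptid)
{
  scoped_restore_sketch<int> save (&current_ptid);
  current_ptid = user_ptid;
  return current_ptid;   // The context the "write" would run in.
}
```

Because the read and write paths share the same pattern, neither depends on the state a particular caller set up beforehand.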

>>This looks unnecessarily complicated.  Isn't this just
>  > *g++ = tp;
>
>This I have changed. 

The code now looks like:
>+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
>+  {
>+    *g = tp;
>+    *g++;
>+  }

Which is weird, as *g++ dereferences g for no reason.  This should
simply be:

  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
    *g++ = tp;
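
As a standalone illustration of the "*g++ = tp" idiom (a sketch with a hypothetical fill_buffer helper and plain ints instead of thread_info pointers):

```cpp
#include <cassert>
#include <vector>

// Copies every element of ITEMS into the buffer starting at OUT and
// returns the position one past the last element written.
// OUT must have room for items.size () elements.
int *
fill_buffer (const std::vector<int> &items, int *out)
{
  for (int item : items)
    *out++ = item;   // Store through the pointer, then advance it.
  return out;
}
```

The single statement stores through the current position and then advances the pointer, so no separate increment line is needed.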


>As far as the check gptid.is_pid () is concerned, I will suggest we
>keep it there. If cmp_result is > 0 and we have a main process swap
>to create a thread. Rest is same in the loop. The reason being handling
>pi and gi variables becomes complex otherwise. When this swap happens,
>we need to increment both pi and gi.. Because we have taken care of the
>main threads in both pthread library and GDB. And this for loop is
>executed only once. So, the first event is main process being
>pthreaded. Once the swap happens pi and gi become one and since
>gcount = pcount = 1 we exit the for loop. Thread addition events comes
>after this. 

Hmm, handling the initial switch of a single PID-only thread
to the PID/TID-style ptid_t separately before still seems
a bit clearer to me.  But in the end your proposed code looks
correct now so I'd be fine with it as is, if you prefer.


Except for the few things mentioned above, this now looks ready to
be committed to me.  However, I'm not sure the commit message
fully describes the latest version of the patch, after we've
gone through all those iterations ...  Can you come up with a
message that maybe starts out with the high-level change
(along the lines of "update aix-thread.c to handle threads in
multiple inferiors"), and goes from there into the specific
details (aix_thread_variables structure, handling only a
single inferior per sync_threadlists invocation, solib fixes
for multiple inferiors, ...)?  Thanks!

Bye,
Ulrich

 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-08 18:44                                                                   ` Ulrich Weigand
@ 2023-02-10 16:33                                                                     ` Aditya Kamath1
  2023-02-10 16:46                                                                       ` Aditya Kamath1
  2023-02-13 19:01                                                                       ` Ulrich Weigand
  0 siblings, 2 replies; 49+ messages in thread
From: Aditya Kamath1 @ 2023-02-10 16:33 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 12895 bytes --]

Hi Ulrich, Tom and community,

Please find attached the patch. {See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch}

I have also attached a document that I propose as the commit message {See: Fix Multi Thread debug fix for AIX.pdf}. The same text is used as the commit message of this patch.

The bug is fixed and the test cases run fine. Kindly check the sample output pasted below this email.

In the document I have explained every change as specifically and in as much detail as possible.

Honestly, the previous version of the patch still had a bug, in particular when following the child. I found it while testing more thoroughly. Kindly see the section "Solution Part 1: - ", where I have explained it.

Kindly suggest changes if needed. Otherwise, kindly let me know if this is ready for commit.

>Previously we used switch_to_thread ().. Now if the application is
>threaded and we only pass ptid_t (user_current_pid) to switch_to_thread ()
>it will crash as main thread looks different or is ptid_t (pid, 0, tid).

> This part I don't quite understand yet - how/why does it crash?

Kindly check "Solution Part 2: - " of the document, where I have explained this.

>Similarly, I agree that everything may currently "work" without
>adding the equivalent change to pdc_write_memory, but most likely
>this is simply because that callback may just not be used very much.

Yes, I agree.  We have changed the user_current_pid variable to thread so that we always switch to the right context. Kindly let me know if this is all right or if any changes are necessary here.

>Can you come up with a
>message that maybe starts out with the high-level change
>(along the lines of "update aix-thread.c to handle threads in
>multiple inferiors"), and goes from there into the specific
>details (aix_thread_variables structure, handling only a
>single inferior per sync_threadlists invocation, solib fixes
>for multiple inferiors, ...)?  Thanks!

Sure, the PDF attached to this email has it. Kindly let me know if we can do this better.

Have a nice day ahead.

Thanks and regards,
Aditya.


--------------------------------------------------------------------------------
Code:-


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
  {
    child = fork ();
    if (child > 0)
      printf ("I am child \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;
  pthread_t thread[NUM_THREADS];

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      int res;

      res = pthread_create (&thread[i], NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
  }

  return 0;
}

-------------------------------------------------
Output with patch:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 15335754)]
[New inferior 3 (process 8061404)]
I am parent
I am parent
^C
Thread 1.1 received signal SIGINT, Interrupt.
[Switching to Thread 1]
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)
0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)
0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)
0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)
(*): Shared library is missing debugging information.
(gdb) info threads
  Id   Target Id                          Frame
* 1.1  Thread 1 (tid 30671243, running)   0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.2  Thread 258 (tid 34406781, running) thread_function (arg=0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
  1.3  Thread 515 (tid 36307315, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
  2.1  process 15335754                   0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  3.1  process 8061404                    0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) inferior 2
[Switching to inferior 2 [process 15335754] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]
[Switching to thread 2.1 (process 15335754)]
#0  0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb)

-----------------------------------------------------------------------
Output without patch:-
Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 1]
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 11731200)]
I am parent
[New inferior 3 (process 16843200)]
I am parent
^C
Thread 1.1 received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) inferior 2
[Switching to inferior 2 [process 11731200] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]
[Switching to thread 2.1 (process 11731200)]
#0  0xd0594fc8 in ?? ()
(gdb) info threads
  Id   Target Id         Frame
  1.1  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.2  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.3  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.4  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
* 2.1  process 11731200  0xd0594fc8 in ?? ()
  3.1  process 16843200  0xd0594fc8 in ?? ()
(gdb) info sharedlibrary
warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing.
warning: "/usr/lib/libcrypt.a": member "shr.o" missing.
warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing.
warning: "/usr/lib/libc.a": member "shr.o" missing.
warning: Could not load shared library symbols for 4 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o).
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
From        To          Syms Read   Shared Object Library
                        No          /usr/lib/libpthreads.a(shr_comm.o)
                        No          /usr/lib/libcrypt.a(shr.o)
                        No          /usr/lib/libpthread.a(shr_xpg5.o)
                        No          /usr/lib/libc.a(shr.o)
(gdb)

[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 44777 bytes --]

From ae2f4812d5cf561ac24bbd51cfdaa532c73ea900 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Fri, 10 Feb 2023 09:36:11 -0600
Subject: [PATCH] Fix Multi Thread debug fix for AIX
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The bug:- The recent commit 98ed24fb35d89eb20179edf6c12f599c7a9e228e changed aix-thread.c to use

gdb::checked_static_cast<aix_thread_info *> instead of static_cast<aix_thread_info *>.

As a result, AIX users of the latest GDB are not able to debug multi-threaded programs.

The error on AIX is as follows: -

internal-error: checked_static_cast: Assertion `result != nullptr' failed.

The root cause of the issue:- The private data was not set for the first thread, i.e. the main thread, of a process. On AIX, an “info threads” command showed the main process as “process <pid>” without private data set, and a new thread “Thread <tid>” representing the same thread was added with private data set. When we iterate_over_threads () we call get_aix_thread_info (). This leads to the crash, as the main process thread “process <pid>” has no private data. The checked static cast therefore does not allow us to debug any further, and rightly so, as we had a thread with no private data.
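
The failure mode described above can be modelled with a small standalone sketch. This is not GDB's implementation: all names ending in _sketch are hypothetical, and the cast is assumed to reject a null result as the assertion text suggests. It throws instead of aborting so that the behaviour can be checked.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-ins for the per-thread private data types.
struct private_info_sketch
{
  virtual ~private_info_sketch () = default;
};

struct aix_info_sketch : private_info_sketch
{
  long tid = 0;
};

// A thread whose private data may or may not have been set.
struct thread_sketch
{
  std::unique_ptr<private_info_sketch> priv;
};

// Assumed behaviour: the checked cast refuses to return a null result.
aix_info_sketch *
checked_cast_sketch (private_info_sketch *p)
{
  aix_info_sketch *result = dynamic_cast<aix_info_sketch *> (p);
  if (result == nullptr)
    throw std::logic_error ("checked cast: result != nullptr failed");
  return result;
}

// Models a lookup of the private data of a thread.
aix_info_sketch *
get_info_sketch (thread_sketch &t)
{
  return checked_cast_sketch (t.priv.get ());
}
```

In this model a thread whose priv was never set makes get_info_sketch () fail, while a thread with priv set returns its data.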

What should be the fix: - Removing the main process thread, i.e. “process <pid>”, was the first proposed solution, as the “Thread <tid>” representing it already exists with private data set. This happens in the sync_threadlists () code for AIX.

Solution Part 1: -

Why the change?

The delete_thread () call in the cmp_result > 0 block of the for loop in sync_threadlists (), which applies the difference between the pthread and GDB thread lists, fails to delete the main process thread. The reason is that “process <pid>” is the current process, and thus GDB core will not delete it even though we ask it to. Hence, even if we add the “thread <tid>” representing the same “process <pid>” in the next iteration of the for loop, we will not be successful.

This forces us to change the main process thread “process <pid>” to “thread <tid>” via thread_change_ptid () and to set its private data. These changes can be seen in sync_threadlists ().

However, before this can work, we need to keep in mind that our libpthread library is only ready when the following condition in wait () of aix-thread.c is satisfied.

/* Check whether libpthdebug might be ready to be initialized.  */
  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
      && status->sig () == GDB_SIGNAL_TRAP)

Until then, changing “process <pid>” to “thread <tid>” is incorrect, even though the session is ready and initialised via the pd_enable () and pd_activate () functions respectively. Therefore we keep a variable pthdebugready in all functions that lead to sync_threadlists (), so that we change the process thread to a thread with private data only when libpthdebug is initialised for that particular process.

The first if condition in sync_threadlists (), shown below, handles the case where the pthread debug library is not initialised. It only sets priv for the main process thread.

if (gbuf[0]->ptid.is_pid () && !pthdebugready)
    {
      aix_thread_info *priv = new aix_thread_info;
      tp->priv.reset (priv);
    }

The second if condition, shown below, changes “process <pid>” to “thread <tid>” once the pthread debug library is initialised.

if (gptid.is_pid () && pthdebugready)
    {
      thread_change_ptid (proc_target, gptid, pptid);
      aix_thread_info *priv = new aix_thread_info;
      priv->pdtid = pbuf[pi].pdtid;
      priv->tid = pbuf[pi].tid;
      tp->priv.reset (priv);
      gi++;
      pi++;
    }
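
The two conditions can also be modelled outside GDB. The sketch below is only an illustration of the decision being made: toy_ptid, toy_priv, toy_thread and sync_main_thread are hypothetical names, not GDB's thread_info, aix_thread_info or thread_change_ptid ().

```cpp
#include <cassert>
#include <memory>

/* Hypothetical stand-in for ptid_t.  */
struct toy_ptid
{
  int pid;
  long tid;   /* 0 means "process <pid>", non-zero means "thread <tid>".  */
  bool is_pid () const { return tid == 0; }
};

/* Hypothetical stand-in for aix_thread_info.  */
struct toy_priv
{
  long pdtid = 0;
  long tid = 0;
};

struct toy_thread
{
  toy_ptid ptid;
  std::unique_ptr<toy_priv> priv;
};

/* Mirror of the two if conditions: while the library is not ready, only
   attach empty private data; once it is ready, rename the entry and fill
   the private data.  Returns true if the entry was renamed.  */
static bool
sync_main_thread (toy_thread &tp, bool pthdebugready,
		  long pthid, long pdtid, long ktid)
{
  if (!tp.ptid.is_pid ())
    return false;

  tp.priv.reset (new toy_priv);
  if (!pthdebugready)
    return false;

  tp.ptid.tid = pthid;
  tp.priv->pdtid = pdtid;
  tp.priv->tid = ktid;
  return true;
}
```

In this model the entry keeps its “process <pid>” identity, but already has private data, until the ready flag is passed as true.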

Failing to do so leads to two problems. First, when we fetch_registers (), our regcache->ptid, though changed to ptid_t (pid, 0, tid), will not be able to get the private data in the case where we switch to a child process from the parent process via the “inferior 2” command. This leads to a crash because private data was not set for the thread: we incorrectly changed “process <pid>” to “thread <tid>” before the process itself could raise a trap and tell the debugger that it is ready for thread debugging.

Example of the crash:-
(gdb) set detach-on-fork off
(gdb) r
Starting program:
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 21627386)]
I am parent
[New inferior 3 (process 9372064)]
I am parent
^C
Thread 1.1 received signal SIGINT, Interrupt.
[Switching to Thread 1]
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) inferior 2
[Switching to inferior 2 [process 21627386] (/home /gdb_tests/ultimate-multi-thread-fork)]
[Switching to thread 2.1 (Thread 515)]
(gdb) c
Continuing.
./../gdbsupport/gdb-checked-static-cast.h:58: internal-error: checked_static_cast: Assertion `result != nullptr' failed.

The process stack of the crash is as below:-

0x000000010059ef60  aix_thread_info* gdb::checked_static_cast<aix_thread_info*, private_thread_info>(private_thread_info*)(0x0) + 0x7c
0x0000000100596ea0  get_aix_thread_info(thread_info*)(0x0) + 0x34
0x000000010059b778  aix_thread_target::fetch_registers(regcache*, int)(0x11001f3f8, 0x1107c5030, 0x4000000000) + 0xf8
0x00000001003675f0  target_fetch_registers(regcache*, int)(0x1107c5030, 0x40e0ddf00d) + 0x6c
0x00000001005817c0  regcache::raw_update(int)(0x1107c5030, 0x401001f3f8) + 0x94
0x0000000100581904  readable_regcache::raw_read(int, unsigned char*)(0x1107c5030, 0x4000000203, 0xfffffffffffebc0) + 0x8c
0x0000000100581f54  readable_regcache::cooked_read(int, unsigned char*)(0x1107c5030, 0x40ffffeb90, 0xfffffffffffebc0) + 0xec
0x0000000100daba10  register_status readable_regcache::cooked_read<unsigned long, void>(int, unsigned long*)(0x1107c5030, 0x40ffffec50, 0xfffffffffffed10) + 0xd4
0x00000001005826a0  regcache_cooked_read_unsigned(regcache*, int, unsigned long*)(0x1107c5030, 0x40ffffecd0, 0xfffffffffffed10) + 0x70
0x0000000100584e2c  regcache_read_pc(regcache*)(0x1107c5030) + 0xa4
0x0000000100387614  handle_signal_stop(execution_control_state*)(0xffffffffffff3a8) + 0x158
0x00000001003864e4

Second, if we follow the child instead of the parent and change “process <pid>” to “thread <tid>” before the process itself raises a trap and tells the debugger “I am ready for threads”, then switch_to_thread () in follow_fork () does not find “process <pid>”. This leads to the assertion failure shown below, and rightly so, because we changed threads without the library being initialised. It happens when follow_fork () is called and we switch to the child thread there.

(gdb) set detach-on-fork off
(gdb) set follow-fork-mode child
(gdb) r
Starting program:
[New Thread 258]
[New Thread 515]
[Attaching after Thread 515 fork to child process 18809098]
[New inferior 2 (process 18809098)]
thread.c:1337: internal-error: switch_to_thread: Assertion `thr != NULL' failed.

The process stack is as follows:-
0x0000000100036590  internal_error_loc(char const*, int, char const*, ...)(0x10192ba70, 0x53900000000, 0x10192b970) + 0x58
0x0000000100619918  switch_to_thread(thread_info*)(0x0) + 0x48
0x000000010037635c  follow_fork()() + 0x4c8
0x0000000100385af8  handle_inferior_event(execution_control_state*)(0xffffffffffff3a8) + 0xda8
0x00000001003809d0  fetch_inferior_event()() + 0x2f8
0x0000000100719a0c  inferior_event_handler(inferior_event_type)(0x10207a50) + 0x38
0x000000010039228c  infrun_async_inferior_event_handler(void*)(0x0) + 0x30
0x0000000100671d18  check_async_event_handlers()() + 0x94
0x000000010066e32c  gdb_do_one_event(int)(0xfffffffffffff840) + 0xb4
0x0000000100001dcc  start_event_loop()() + 0x28
0x0000000100001fd4  captured_command_loop()() + 0x58
0x000000010000414c  captured_main(void*)(0xffffffffffffa60) + 0x2c
0x0000000100004220  gdb_main(captured_main_args*)(0xffffffffffffa60) + 0x20

This is what justifies the new pthdebugready parameter of sync_threadlists () and the changes in its for loop.

Also, we no longer use iterate_over_threads to count our GDB threads. Instead we count them inline via for (thread_info *tp : all_threads (proc_target, ptid_t (pid))).
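
As a rough illustration of the inline counting, here is a minimal sketch that walks a plain list and counts only the entries of one process. listed_thread and count_threads_of are hypothetical names, and the vector merely stands in for GDB's thread list; it is not how all_threads () is implemented.

```cpp
#include <cassert>
#include <vector>

/* Hypothetical thread record; GDB's thread_info carries far more.  */
struct listed_thread
{
  int pid;
  long tid;
};

/* Count the threads that belong to PID with a range-based for loop,
   instead of passing a counting callback to an iterator function.  */
static int
count_threads_of (const std::vector<listed_thread> &threads, int pid)
{
  int gcount = 0;
  for (const listed_thread &tp : threads)
    if (tp.pid == pid)
      gcount++;
  return gcount;
}
```

The same loop shape can then be reused to fill the array that is sorted afterwards, so no callback or void pointer is needed.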

Solution Part 2: -

Since we switch_to_no_thread before a wait (), on a thread detection event or any other event that makes us use the thread callbacks, we need to be in the right context while we read and write data for threads. That is why we switch our inferior_ptid, current_inferior and program space in pdc_read_data () and now also in pdc_write_data ().
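
The save, switch and restore pattern used in these callbacks can be sketched in isolation. This is a standalone model under assumptions: scoped_value, current_context and read_in_context are hypothetical names, and the guard is only a simplified analogue of what GDB's scoped_restore does for inferior_ptid.

```cpp
#include <cassert>

/* Hypothetical global standing in for inferior_ptid.  */
static int current_context = 0;

/* Minimal RAII guard: remembers the old value, installs the new one and
   puts the old value back when the scope ends.  */
template <typename T>
class scoped_value
{
public:
  scoped_value (T &var, T new_value) : m_var (var), m_saved (var)
  { m_var = new_value; }
  ~scoped_value () { m_var = m_saved; }
  scoped_value (const scoped_value &) = delete;
  scoped_value &operator= (const scoped_value &) = delete;

private:
  T &m_var;
  T m_saved;
};

/* Hypothetical callback: works with CURRENT_CONTEXT switched to CONTEXT
   and returns the context it observed while working.  */
static int
read_in_context (int context)
{
  scoped_value<int> guard (current_context, context);
  return current_context;
}
```

The caller's context is untouched after the call, which is the property the callbacks need because they run while no thread is selected.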

So why did we make this change
-    if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-                       ptid_t (user_current_pid));
in pdc_read_data (), and change our user variable, which was the process ID, to a thread? Wasn’t it already doing the job?

Consider an event where the parent process is threaded and we have a fork (). When we do a pd_update () after the beneath->wait () in the thread target's wait (), we call sync_threadlists () as well. There we call pthdb_pthread (data->pd_session, &pdtid, cmd);

This will now use ptid_t (user_current_pid) to switch threads. However, the main thread of our parent process is threaded, i.e. it is ptid_t (user_current_pid, 0, tid). Hence we will crash with an assertion failure saying that thread ptid_t (user_current_pid) has not been found.

To avoid this, we now pass the thread directly. So, on any event after the main process has come to look like a main thread, there is no confusion about which thread, inferior_ptid or program space to switch to, especially when a process is multi-threaded.

Solution Part 3: -

In AIX we use several variables for different purposes, such as pd_active, pd_able, arch64, pd_brk_addr and pd_session. These variables are unique per inferior. Hence we need to keep them in a map <inferior, structure>, where the structure holds all of these variables for one inferior. For this we use GDB's built-in registry for every inferior. This change is part of this patch.
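
The map <inferior, structure> idea can be sketched with a plain std::map. This is only a model: toy_inferior, toy_thread_variables and get_variables are hypothetical names, and the patch itself uses GDB's registry key rather than a std::map.

```cpp
#include <cassert>
#include <map>

/* Hypothetical stand-in for GDB's inferior.  */
struct toy_inferior
{
  int num;
};

/* A subset of the per-inferior variables named above.  */
struct toy_thread_variables
{
  int pd_able = 0;
  int pd_active = 0;
  int arch64 = 0;
};

static std::map<const toy_inferior *, toy_thread_variables> per_inferior_data;

/* Return the variables for INF, creating a zeroed entry on first use.
   Returns nullptr when there is no inferior.  */
static toy_thread_variables *
get_variables (const toy_inferior *inf)
{
  if (inf == nullptr)
    return nullptr;
  return &per_inferior_data[inf];
}
```

With this layout, activating thread debugging for one inferior leaves the flags of every other inferior unchanged, which a single set of static variables cannot provide.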

Solution Part 4: -

We figured out that the top target for a new inferior born after the main inferior was incorrect after the process became threaded.

The root cause was that the shared library was not being loaded for the new process. The reason is that we changed our shared library file name in the BFD registry from member_name to path(member_name).

Hence the changes in solib-aix.c take care of the new pattern, so that the shared library is loaded correctly for every new inferior as well. They do so by pattern matching the ‘(’ character and checking whether the member_name follows it in the new pattern registered in the BFD registry, as shown in the solib-aix.c changes in this patch.
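
To make the path(member_name) matching concrete, here is a minimal standalone sketch. member_name_matches is a hypothetical helper and is not the code in solib-aix.c; it only assumes that the registered name is either a bare member name or has the form path(member_name).

```cpp
#include <cassert>
#include <string>

/* Return true if REGISTERED names MEMBER, either directly (old pattern)
   or as the text between '(' and ')' (new pattern).  */
static bool
member_name_matches (const std::string &registered,
		     const std::string &member)
{
  std::string::size_type open = registered.find ('(');
  if (open == std::string::npos)
    return registered == member;

  std::string::size_type close = registered.find (')', open + 1);
  if (close == std::string::npos)
    return false;

  /* Compare only the text between the parentheses.  */
  return registered.compare (open + 1, close - open - 1, member) == 0;
}
```

For example, a registered name such as /usr/lib/libpthread.a(shr_xpg5.o) matches the member shr_xpg5.o under this sketch, while a different member name does not.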

-----------------------------------------------------------------------------------------------------------------

These 4 solution parts together fix the bug.
---
 gdb/aix-thread.c | 431 ++++++++++++++++++++++++++++-------------------
 gdb/solib-aix.c  |  14 ++
 2 files changed, 271 insertions(+), 174 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..f86d9429f71 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -68,10 +68,6 @@ static bool debug_aix_thread;
 #define pthdb_tid_t	tid_t
 #endif
 
-/* Return whether to treat PID as a debuggable thread id.  */
-
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
-
 /* Success and failure values returned by pthdb callbacks.  */
 
 #define PDC_SUCCESS	PTHDB_SUCCESS
@@ -144,38 +140,20 @@ class aix_thread_target final : public target_ops
 
 static aix_thread_target aix_thread_ops;
 
-/* Address of the function that libpthread will call when libpthdebug
-   is ready to be initialized.  */
-
-static CORE_ADDR pd_brk_addr;
-
-/* Whether the current application is debuggable by pthdb.  */
-
-static int pd_able = 0;
-
-/* Whether a threaded application is being debugged.  */
-
-static int pd_active = 0;
-
-/* Whether the current architecture is 64-bit.  
-   Only valid when pd_able is true.  */
-
-static int arch64;
-
 /* Forward declarations for pthdb callbacks.  */
 
-static int pdc_symbol_addrs (pthdb_user_t, pthdb_symbol_t *, int);
-static int pdc_read_data (pthdb_user_t, void *, pthdb_addr_t, size_t);
-static int pdc_write_data (pthdb_user_t, void *, pthdb_addr_t, size_t);
-static int pdc_read_regs (pthdb_user_t user, pthdb_tid_t tid,
+static int pdc_symbol_addrs (thread_info*, pthdb_symbol_t *, int);
+static int pdc_read_data (thread_info*, void *, pthdb_addr_t, size_t);
+static int pdc_write_data (thread_info*, void *, pthdb_addr_t, size_t);
+static int pdc_read_regs (thread_info* user, pthdb_tid_t tid,
 			  unsigned long long flags, 
 			  pthdb_context_t *context);
-static int pdc_write_regs (pthdb_user_t user, pthdb_tid_t tid,
+static int pdc_write_regs (thread_info* user, pthdb_tid_t tid,
 			   unsigned long long flags, 
 			   pthdb_context_t *context);
-static int pdc_alloc (pthdb_user_t, size_t, void **);
-static int pdc_realloc (pthdb_user_t, void *, size_t, void **);
-static int pdc_dealloc (pthdb_user_t, void *);
+static int pdc_alloc (thread_info*, size_t, void **);
+static int pdc_realloc (thread_info*, void *, size_t, void **);
+static int pdc_dealloc (thread_info*, void *);
 
 /* pthdb callbacks.  */
 
@@ -191,9 +169,66 @@ static pthdb_callbacks_t pd_callbacks = {
   NULL
 };
 
-/* Current pthdb session.  */
+/* Aix variable structure.  */
+struct aix_thread_variables 
+{
+  /* Whether the current application is debuggable by pthdb.  */
+  int pd_able;
+
+  /* Whether a threaded application is being debugged.  */
+  int pd_active;
+
+  /* Current pthdb session.  */
+  pthdb_session_t pd_session;
+
+  /* Address of the function that libpthread will call when libpthdebug
+   is ready to be initialized.  */
+  CORE_ADDR pd_brk_addr;
+
+  /* Whether the current architecture is 64-bit.
+   Only valid when pd_able is true.  */
+  int arch64;
+};
+
+/* Key to our per-inferior data.  */
+static const registry<inferior>::key<aix_thread_variables>
+  aix_thread_variables_handle;
+
+/* Function to Get aix_thread_variables data.  */
+static struct aix_thread_variables*
+get_aix_thread_variables_data (struct inferior *inf)
+{
+  if (inf == NULL)
+    return NULL;
+
+  struct aix_thread_variables* data;
+
+  data = aix_thread_variables_handle.get (inf);
+  if (data == NULL)
+    data = aix_thread_variables_handle.emplace (inf);
+
+  return data;
+}
+
+/* Helper to get data for ptid in a function.  */
 
-static pthdb_session_t pd_session;
+static struct aix_thread_variables*
+get_thread_data_helper_for_ptid (ptid_t ptid)
+{
+  inferior *inf = find_inferior_ptid (current_inferior ()->process_target (),
+					ptid);
+  return get_aix_thread_variables_data (inf);
+}
+
+/* Helper to get data for pid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_pid (pid_t pid)
+{
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (),
+                                        pid);
+  return get_aix_thread_variables_data (inf);
+}
 
 /* Return a printable representation of pthdebug function return
    STATUS.  */
@@ -318,7 +353,7 @@ pid_to_prc (ptid_t *ptidp)
   ptid_t ptid;
 
   ptid = *ptidp;
-  if (PD_TID (ptid))
+  if (ptid.tid () != 0)
     *ptidp = ptid_t (ptid.pid ());
 }
 
@@ -326,7 +361,7 @@ pid_to_prc (ptid_t *ptidp)
    the address of SYMBOLS[<i>].name.  */
 
 static int
-pdc_symbol_addrs (pthdb_user_t user_current_pid, pthdb_symbol_t *symbols, int count)
+pdc_symbol_addrs (thread_info *user_current_thread, pthdb_symbol_t *symbols, int count)
 {
   struct bound_minimal_symbol ms;
   int i;
@@ -334,8 +369,8 @@ pdc_symbol_addrs (pthdb_user_t user_current_pid, pthdb_symbol_t *symbols, int co
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
-		"pdc_symbol_addrs (user_current_pid = %ld, symbols = 0x%lx, count = %d)\n",
-		user_current_pid, (long) symbols, count);
+		"pdc_symbol_addrs (user_current_pid = %d, symbols = 0x%lx, count = %d)\n",
+		user_current_thread->ptid.pid (), (long) symbols, count);
 
   for (i = 0; i < count; i++)
     {
@@ -373,7 +408,7 @@ pdc_symbol_addrs (pthdb_user_t user_current_pid, pthdb_symbol_t *symbols, int co
    If successful return 0, else non-zero is returned.  */
 
 static int
-pdc_read_regs (pthdb_user_t user_current_pid,
+pdc_read_regs (thread_info *user_current_thread,
 	       pthdb_tid_t tid,
 	       unsigned long long flags,
 	       pthdb_context_t *context)
@@ -389,6 +424,9 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_thread->ptid.pid ());
   
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_read_regs tid=%d flags=%s\n",
@@ -397,7 +435,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -423,7 +461,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -445,7 +483,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
    If successful return 0, else non-zero is returned.  */
 
 static int
-pdc_write_regs (pthdb_user_t user_current_pid,
+pdc_write_regs (thread_info *user_current_thread,
 		pthdb_tid_t tid,
 		unsigned long long flags,
 		pthdb_context_t *context)
@@ -456,6 +494,10 @@ pdc_write_regs (pthdb_user_t user_current_pid,
      this is needed, I have implemented what I think it should do,
      however this code is untested.  */
 
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_thread->ptid.pid ());
+
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_write_regs tid=%d flags=%s\n",
 		(int) tid, hex_string (flags));
@@ -463,7 +505,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_WRITE_GPRS, tid, 
 		     (unsigned long) context->gpr, 0, NULL);
       else
@@ -479,7 +521,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  ptrace64aix (PTT_WRITE_SPRS, tid, 
 		       (unsigned long) &context->msr, 0, NULL);
@@ -495,27 +537,30 @@ pdc_write_regs (pthdb_user_t user_current_pid,
 /* pthdb callback: read LEN bytes from process ADDR into BUF.  */
 
 static int
-pdc_read_data (pthdb_user_t user_current_pid, void *buf,
+pdc_read_data (thread_info *user_current_thread, void *buf,
 	       pthdb_addr_t addr, size_t len)
 {
   int status, ret;
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
-		"pdc_read_data (user_current_pid = %ld, buf = 0x%lx, addr = %s, len = %ld)\n",
-		user_current_pid, (long) buf, hex_string (addr), len);
+		"pdc_read_data (user_current_pid = %d, buf = 0x%lx, addr = %s, len = %ld)\n",
+		user_current_thread->ptid.pid (), (long) buf, hex_string (addr), len);
 
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
-    if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+    inferior_ptid = user_current_thread->ptid;
+    scoped_restore_current_inferior restore_inferior;
+    set_current_inferior (user_current_thread->inf);
+
+    scoped_restore_current_program_space restore_current_progspace;
+    set_current_program_space (user_current_thread->inf->pspace);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -529,17 +574,27 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
 /* pthdb callback: write LEN bytes from BUF to process ADDR.  */
 
 static int
-pdc_write_data (pthdb_user_t user_current_pid, void *buf,
+pdc_write_data (thread_info *user_current_thread, void *buf,
 		pthdb_addr_t addr, size_t len)
 {
   int status, ret;
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
-		"pdc_write_data (user_current_pid = %ld, buf = 0x%lx, addr = %s, len = %ld)\n",
-		user_current_pid, (long) buf, hex_string (addr), len);
+		"pdc_write_data (user_current_pid = %d, buf = 0x%lx, addr = %s, len = %ld)\n",
+		user_current_thread->ptid.pid (), (long) buf, hex_string (addr), len);
+
+  {
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
+    inferior_ptid = user_current_thread->ptid;
+    scoped_restore_current_inferior restore_inferior;
+    set_current_inferior (user_current_thread->inf);
+
+    scoped_restore_current_program_space restore_current_progspace;
+    set_current_program_space (user_current_thread->inf->pspace);
+    status = target_write_memory (addr, (gdb_byte *) buf, len);
+  }
 
-  status = target_write_memory (addr, (gdb_byte *) buf, len);
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
 
   if (debug_aix_thread)
@@ -552,12 +607,12 @@ pdc_write_data (pthdb_user_t user_current_pid, void *buf,
    in BUFP.  */
 
 static int
-pdc_alloc (pthdb_user_t user_current_pid, size_t len, void **bufp)
+pdc_alloc (thread_info *user_current_thread, size_t len, void **bufp)
 {
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
-		"pdc_alloc (user_current_pid = %ld, len = %ld, bufp = 0x%lx)\n",
-		user_current_pid, len, (long) bufp);
+		"pdc_alloc (user_current_pid = %d, len = %ld, bufp = 0x%lx)\n",
+		user_current_thread->ptid.pid (), len, (long) bufp);
   *bufp = xmalloc (len);
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -574,12 +629,12 @@ pdc_alloc (pthdb_user_t user_current_pid, size_t len, void **bufp)
    pointer to the result in BUFP.  */
 
 static int
-pdc_realloc (pthdb_user_t user_current_pid, void *buf, size_t len, void **bufp)
+pdc_realloc (thread_info *user_current_thread, void *buf, size_t len, void **bufp)
 {
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
-		"pdc_realloc (user_current_pid = %ld, buf = 0x%lx, len = %ld, bufp = 0x%lx)\n",
-		user_current_pid, (long) buf, len, (long) bufp);
+		"pdc_realloc (user_current_pid = %d, buf = 0x%lx, len = %ld, bufp = 0x%lx)\n",
+		user_current_thread->ptid.pid (), (long) buf, len, (long) bufp);
   *bufp = xrealloc (buf, len);
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -591,11 +646,11 @@ pdc_realloc (pthdb_user_t user_current_pid, void *buf, size_t len, void **bufp)
    realloc callback.  */
 
 static int
-pdc_dealloc (pthdb_user_t user_current_pid, void *buf)
+pdc_dealloc (thread_info *user_current_thread, void *buf)
 {
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
-		"pdc_free (user_current_pid = %ld, buf = 0x%lx)\n", user_current_pid,
+		"pdc_free (user_current_pid = %d, buf = 0x%lx)\n", user_current_thread->ptid.pid (),
 		(long) buf);
   xfree (buf);
   return PDC_SUCCESS;
@@ -639,39 +694,6 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_count (struct thread_info *thread, void *countp)
-{
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
-  return 0;
-}
-
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_accum (struct thread_info *thread, void *bufp)
-{
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
-  return 0;
-}
-
 /* ptid comparison function */
 
 static int
@@ -719,7 +741,10 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      /* We also need to keep in mind Trap and interrupt or any
+         signal that needs to be handled in pd_update ().  */
+
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -741,7 +766,7 @@ get_signaled_thread (int pid)
        have difficulty with certain call patterns */
 
 static void
-sync_threadlists (int pid)
+sync_threadlists (pid_t pid, int pthdebugready)
 {
   int cmd, status;
   int pcount, psize, pi, gcount, gi;
@@ -750,6 +775,11 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -759,11 +789,11 @@ sync_threadlists (int pid)
 
   for (cmd = PTHDB_LIST_FIRST;; cmd = PTHDB_LIST_NEXT)
     {
-      status = pthdb_pthread (pd_session, &pdtid, cmd);
+      status = pthdb_pthread (data->pd_session, &pdtid, cmd);
       if (status != PTHDB_SUCCESS || pdtid == PTHDB_INVALID_PTHREAD)
 	break;
 
-      status = pthdb_pthread_ptid (pd_session, pdtid, &pthid);
+      status = pthdb_pthread_ptid (data->pd_session, pdtid, &pthid);
       if (status != PTHDB_SUCCESS || pthid == PTHDB_INVALID_PTID)
 	continue;
 
@@ -780,7 +810,7 @@ sync_threadlists (int pid)
 
   for (pi = 0; pi < pcount; pi++)
     {
-      status = pthdb_pthread_tid (pd_session, pbuf[pi].pdtid, &tid);
+      status = pthdb_pthread_tid (data->pd_session, pbuf[pi].pdtid, &tid);
       if (status != PTHDB_SUCCESS)
 	tid = PTHDB_INVALID_TID;
       pbuf[pi].tid = tid;
@@ -790,13 +820,32 @@ sync_threadlists (int pid)
 
   /* Accumulate an array of GDB threads sorted by pid.  */
 
+  /* gcount is GDB thread count and pcount is pthreadlib thread count.  */
+
   gcount = 0;
-  iterate_over_threads (giter_count, &gcount);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    gcount++;
   g = gbuf = XNEWVEC (struct thread_info *, gcount);
-  iterate_over_threads (giter_accum, &g);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    *g++ = tp;
   qsort (gbuf, gcount, sizeof *gbuf, gcmp);
 
+  tp = find_thread_ptid (proc_target, ptid_t (pid));
+
+  /* If the pthreadlibrary is not ready to debug 
+     then this is just a main process which needs 
+     a priv to be set.  The if condition below does 
+     the same.  Otherwise we go to the for loop to 
+     sync the pthread and GDB thread lists.  */
+
+  if (gbuf[0]->ptid.is_pid () && !pthdebugready)
+    {
+      aix_thread_info *priv = new aix_thread_info;
+      tp->priv.reset (priv);
+    }
+
   /* Apply differences between the two arrays to GDB's thread list.  */
+  else  
   for (pi = gi = 0; pi < pcount || gi < gcount;)
     {
       if (pi == pcount)
@@ -810,8 +859,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -841,13 +888,27 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      /* This is to make the main process thread now look
+                 like a thread.  */
+
+	      if (gptid.is_pid () && pthdebugready)
+		{
+		  thread_change_ptid (proc_target, gptid, pptid);
+		  aix_thread_info *priv = new aix_thread_info;
+                  priv->pdtid = pbuf[pi].pdtid;
+                  priv->tid = pbuf[pi].tid;
+		  tp->priv.reset (priv);
+		  gi++;
+		  pi++;
+		}
+	      else
+		{
+		  delete_thread (gbuf[gi]);
+		  gi++;
+		}
 	    }
 	  else
 	    {
-	      process_stratum_target *proc_target
-		= current_inferior ()->process_target ();
 	      thread = add_thread (proc_target, pptid);
 
 	      aix_thread_info *priv = new aix_thread_info;
@@ -881,21 +942,24 @@ iter_tid (struct thread_info *thread, void *tidp)
    return a pid-only ptid with PID.  */
 
 static ptid_t
-pd_update (int pid)
+pd_update (pid_t pid, int pthdebugready)
 {
   int status;
   ptid_t ptid;
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
+  struct aix_thread_variables *data;
 
-  if (!pd_active)
+  data = get_thread_data_helper_for_pid (pid);
+
+  if (!data->pd_active)
     return ptid_t (pid);
 
-  status = pthdb_session_update (pd_session);
+  status = pthdb_session_update (data->pd_session);
   if (status != PTHDB_SUCCESS)
     return ptid_t (pid);
 
-  sync_threadlists (pid);
+  sync_threadlists (pid, pthdebugready);
 
   /* Define "current thread" as one that just received a trap signal.  */
 
@@ -915,32 +979,22 @@ pd_update (int pid)
    for that thread.  Otherwise, return a ptid-only ptid using PID.  */
 
 static ptid_t
-pd_activate (int pid)
+pd_activate (pid_t pid, int pthdebugready)
 {
   int status;
-		
-  status = pthdb_session_init (pid, arch64 ? PEM_64BIT : PEM_32BIT,
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
+  thread_info *thread = find_thread_ptid (current_inferior (), ptid_t (pid));
+  
+  status = pthdb_session_init (thread, data->arch64 ? PEM_64BIT : PEM_32BIT,
 			       PTHDB_FLAG_REGS, &pd_callbacks, 
-			       &pd_session);
+			       &data->pd_session);
   if (status != PTHDB_SUCCESS)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
-  return pd_update (pid);
-}
-
-/* Undo the effects of pd_activate().  */
-
-static void
-pd_deactivate (void)
-{
-  if (!pd_active)
-    return;
-  pthdb_session_destroy (pd_session);
-  
-  pid_to_prc (&inferior_ptid);
-  pd_active = 0;
+  data->pd_active = 1;
+  return pd_update (pid, pthdebugready);
 }
 
 /* An object file has just been loaded.  Check whether the current
@@ -952,17 +1006,24 @@ pd_enable (void)
   int status;
   char *stub_name;
   struct bound_minimal_symbol ms;
+  struct aix_thread_variables *data;
+
+  if (!inferior_ptid.pid ())
+    return;
+  
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (data->pd_able)
     return;
 
   /* Check application word size.  */
-  arch64 = register_size (target_gdbarch (), 0) == 8;
+  data->arch64 = register_size (target_gdbarch (), 0) == 8;
 
   /* Check whether the application is pthreaded.  */
   stub_name = NULL;
-  status = pthdb_session_pthreaded (inferior_ptid.pid (), PTHDB_FLAG_REGS,
+  thread_info *thread = find_thread_ptid (current_inferior (), inferior_ptid);
+  status = pthdb_session_pthreaded (thread, PTHDB_FLAG_REGS,
 				    &pd_callbacks, &stub_name);
   if ((status != PTHDB_SUCCESS
        && status != PTHDB_NOT_PTHREADED) || !stub_name)
@@ -972,18 +1033,18 @@ pd_enable (void)
   ms = lookup_minimal_symbol (stub_name, NULL, NULL);
   if (ms.minsym == NULL)
     return;
-  pd_brk_addr = ms.value_address ();
-  if (!create_thread_event_breakpoint (target_gdbarch (), pd_brk_addr))
+  data->pd_brk_addr = ms.value_address ();
+  if (!create_thread_event_breakpoint (target_gdbarch (), data->pd_brk_addr))
     return;
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  data->pd_able = 1; 
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
      activate thread debugging.  */
-  pd_activate (inferior_ptid.pid ());
+  pd_activate (inferior_ptid.pid (), 0);
 }
 
 /* Undo the effects of pd_enable().  */
@@ -991,28 +1052,31 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!data->pd_able)
+    return;
+  if (!data->pd_active)
     return;
-  if (pd_active)
-    pd_deactivate ();
-  pd_able = 0;
+  pthdb_session_destroy (data->pd_session);
+ 
+  pid_to_prc (&inferior_ptid);
+  data->pd_active = 0;
+  data->pd_able = 0;
   current_inferior ()->unpush_target (&aix_thread_ops);
 }
 
 /* new_objfile observer callback.
 
    If OBJFILE is non-null, check whether a threaded application is
-   being debugged, and if so, prepare for thread debugging.
-
-   If OBJFILE is null, stop debugging threads.  */
+   being debugged, and if so, prepare for thread debugging.  */
 
 static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
     pd_enable ();
-  else
-    pd_disable ();
 }
 
 /* Attach to process specified by ARGS.  */
@@ -1042,8 +1106,11 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 {
   struct thread_info *thread;
   pthdb_tid_t tid[2];
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (ptid);
 
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     {
       scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
       
@@ -1065,7 +1132,7 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 	       ptid.lwp ());
       tid[1] = 0;
 
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_CONTINUE, tid[0], (long long) 1,
 		     gdb_signal_to_host (sig), (PTRACE_TYPE_ARG5) tid);
       else
@@ -1082,6 +1149,7 @@ ptid_t
 aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
 			 target_wait_flags options)
 {
+  struct aix_thread_variables *data;
   {
     pid_to_prc (&ptid);
 
@@ -1095,8 +1163,10 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
      pid-only ptids.  */
   gdb_assert (ptid.is_pid ());
 
+  data = get_thread_data_helper_for_ptid (ptid);
+
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
@@ -1105,11 +1175,11 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
       struct gdbarch *gdbarch = regcache->arch ();
 
       if (regcache_read_pc (regcache)
-	  - gdbarch_decr_pc_after_break (gdbarch) == pd_brk_addr)
-	return pd_activate (ptid.pid ());
+	  - gdbarch_decr_pc_after_break (gdbarch) == data->pd_brk_addr)
+	return pd_activate (ptid.pid (), 1);
     }
 
-  return pd_update (ptid.pid ());
+  return pd_update (ptid.pid (), 0);
 }
 
 /* Record that the 64-bit general-purpose registers contain VALS.  */
@@ -1229,18 +1299,20 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
   ppc_gdbarch_tdep *tdep = gdbarch_tdep<ppc_gdbarch_tdep> (gdbarch);
   int status, i;
   pthdb_context_t ctx;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
 		"fetch_regs_user_thread %lx\n", (long) pdtid);
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: fetch_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
 
   /* General-purpose registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_gprs64 (regcache, ctx.gpr);
   else
     for (i = 0; i < ppc_num_gprs; i++)
@@ -1253,7 +1325,7 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Special registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_sprs64 (regcache, ctx.iar, ctx.msr, ctx.cr, ctx.lr, ctx.ctr,
 			     ctx.xer, ctx.fpscr);
   else
@@ -1288,18 +1360,21 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
   int i;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"fetch_regs_kernel_thread tid=%lx regno=%d arch64=%d\n",
-		(long) tid, regno, arch64);
+		(long) tid, regno, data->arch64);
 
   /* General-purpose registers.  */
   if (regno == -1
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_gprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -1331,7 +1406,7 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -1363,7 +1438,7 @@ aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
   struct thread_info *thread;
   pthdb_tid_t tid;
 
-  if (!PD_TID (regcache->ptid ()))
+  if (regcache->ptid ().tid () == 0)
     beneath ()->fetch_registers (regcache, regno);
   else
     {
@@ -1511,6 +1586,8 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   pthdb_context_t ctx;
   uint32_t int32;
   uint64_t int64;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1518,7 +1595,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Retrieve the thread's current context for its non-register
      values.  */
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1528,7 +1605,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   for (i = 0; i < ppc_num_gprs; i++)
     if (REG_VALID == regcache->get_register_status (tdep->ppc_gp0_regnum + i))
       {
-	if (arch64)
+	if (data->arch64)
 	  {
 	    regcache->raw_collect (tdep->ppc_gp0_regnum + i, (void *) &int64);
 	    ctx.gpr[i] = int64;
@@ -1545,7 +1622,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
     fill_fprs (regcache, ctx.fpr);
 
   /* Special registers (always kept in ctx as 64 bits).  */
-  if (arch64)
+  if (data->arch64)
     {
       fill_sprs64 (regcache, &ctx.iar, &ctx.msr, &ctx.cr, &ctx.lr, &ctx.ctr,
 			     &ctx.xer, &ctx.fpscr);
@@ -1576,7 +1653,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 	ctx.fpscr = tmp_fpscr;
     }
 
-  status = pthdb_pthread_setcontext (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_setcontext (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: "
 	     "pthdb_pthread_setcontext returned %s"),
@@ -1602,6 +1679,9 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs  sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1613,7 +1693,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_fprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some regs may not be in the cache.  */
 	  ptrace64aix (PTT_READ_GPRS, tid, (unsigned long) gprs64, 0, NULL);
@@ -1646,7 +1726,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some registers won't be in the cache.  */
 	  ptrace64aix (PTT_READ_SPRS, tid, 
@@ -1703,7 +1783,7 @@ aix_thread_target::store_registers (struct regcache *regcache, int regno)
   struct thread_info *thread;
   pthdb_tid_t tid;
 
-  if (!PD_TID (regcache->ptid ()))
+  if (regcache->ptid ().tid () == 0)
     beneath ()->store_registers (regcache, regno);
   else
     {
@@ -1741,7 +1821,7 @@ aix_thread_target::mourn_inferior ()
 {
   target_ops *beneath = this->beneath ();
 
-  pd_deactivate ();
+  pd_disable ();
   beneath->mourn_inferior ();
 }
 
@@ -1750,7 +1830,7 @@ aix_thread_target::mourn_inferior ()
 bool
 aix_thread_target::thread_alive (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     return beneath ()->thread_alive (ptid);
 
   /* We update the thread list every time the child stops, so all
@@ -1766,7 +1846,7 @@ aix_thread_target::thread_alive (ptid_t ptid)
 std::string
 aix_thread_target::pid_to_str (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     return beneath ()->pid_to_str (ptid);
 
   return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
@@ -1786,8 +1866,11 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
   pthdb_detachstate_t detachstate;
   int cancelpend;
   static char *ret = NULL;
+  struct aix_thread_variables *data;
+
+  data = get_thread_data_helper_for_ptid (thread->ptid);
 
-  if (!PD_TID (thread->ptid))
+  if (thread->ptid.tid () == 0)
     return NULL;
 
   string_file buf;
@@ -1800,24 +1883,24 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
     /* i18n: Like "thread-identifier %d, [state] running, suspended" */
     buf.printf (_("tid %d"), (int)tid);
 
-  status = pthdb_pthread_state (pd_session, pdtid, &state);
+  status = pthdb_pthread_state (data->pd_session, pdtid, &state);
   if (status != PTHDB_SUCCESS)
     state = PST_NOTSUP;
   buf.printf (", %s", state2str (state));
 
-  status = pthdb_pthread_suspendstate (pd_session, pdtid, 
+  status = pthdb_pthread_suspendstate (data->pd_session, pdtid, 
 				       &suspendstate);
   if (status == PTHDB_SUCCESS && suspendstate == PSS_SUSPENDED)
     /* i18n: Like "Thread-Id %d, [state] running, suspended" */
     buf.printf (_(", suspended"));
 
-  status = pthdb_pthread_detachstate (pd_session, pdtid, 
+  status = pthdb_pthread_detachstate (data->pd_session, pdtid, 
 				      &detachstate);
   if (status == PTHDB_SUCCESS && detachstate == PDS_DETACHED)
     /* i18n: Like "Thread-Id %d, [state] running, detached" */
     buf.printf (_(", detached"));
 
-  pthdb_pthread_cancelpend (pd_session, pdtid, &cancelpend);
+  pthdb_pthread_cancelpend (data->pd_session, pdtid, &cancelpend);
   if (status == PTHDB_SUCCESS && cancelpend)
     /* i18n: Like "Thread-Id %d, [state] running, cancel pending" */
     buf.printf (_(", cancel pending"));
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index f483f54de13..6be81064ebd 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -618,6 +618,20 @@ solib_aix_bfd_open (const char *pathname)
       if (member_name == bfd_get_filename (object_bfd.get ()))
 	break;
 
+      std::string s = bfd_get_filename (object_bfd.get ());
+
+      /* For every inferior after first int bfd system we 
+	 will have the pathname instead of the member name
+	 registered. Hence the below condition exists.  */
+
+      if (s.find ('(') != std::string::npos)
+	{
+	  int pos = s.find ('(');
+	  int len = s.find (')') - s.find ('(');
+	  if (s.substr (pos+1, len-1) == member_name) 
+	    return object_bfd;
+	}
+
       object_bfd = gdb_bfd_openr_next_archived_file (archive_bfd.get (),
 						     object_bfd.get ());
     }
-- 
2.31.1


[-- Attachment #3: Fix Multi Thread debug fix for AIX.pdf --]
[-- Type: application/pdf, Size: 187222 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-10 16:33                                                                     ` Aditya Kamath1
@ 2023-02-10 16:46                                                                       ` Aditya Kamath1
  2023-02-13 19:01                                                                       ` Ulrich Weigand
  1 sibling, 0 replies; 49+ messages in thread
From: Aditya Kamath1 @ 2023-02-10 16:46 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 20554 bytes --]

Hi Ulrich, Tom and community,

Please find attached the patch. {See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch}

The previous email did not reach the mailing list because it exceeded the list's size limit.

The bug is fixed, and the output is pasted below in this email.

In the patch commit message I have explained every change in as much detail as possible.

To be honest, the previous version of the patch still had a bug, in particular when following the child. I found it while testing more thoroughly. Kindly see the section "Solution Part 1: - " in the patch commit message, where I have explained it.

Kindly suggest changes if any are needed. Otherwise, kindly let me know if this is ready for commit.

>Now if the application is
>threaded and we only pass ptid_t (user_current_pid) to switch_to_thread ()
>it will crash as main thread looks different or is ptid_t (pid, 0, tid).

> This part I don't quite understand yet - how/why does it crash?

Kindly check the section "Solution Part 2: - " of the patch, where I have explained this.

>Similarly, I agree that everything may currently "work" without
>adding the equivalent change to pdc_write_memory, but most likely
>this is simply because that callback may just not be used very much.

Yes, I agree.  We have changed the user_current_pid variable to thread so that we always switch to the right context. Kindly let me know whether this is alright or whether any changes are necessary here.

>Can you come up with a
>message that maybe starts out with the high-level change
>and goes from there into the specific
>details.... Thanks!

Sure, this patch has it. Kindly let me know if we can do this better.

Have a nice day ahead.

Thanks and regards,
Aditya.

--------------------------------------------------------------------------------
Code:-


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
  {
    child = fork ();
    if (child > 0)
      printf ("I am child \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;
  pthread_t thread[NUM_THREADS];

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      int res;

      res = pthread_create (&thread[i], NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
  }

  return 0;
}

-------------------------------------------------
Output with patch:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 15335754)]
[New inferior 3 (process 8061404)]
I am parent
I am parent
^C
Thread 1.1 received signal SIGINT, Interrupt.
[Switching to Thread 1]
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)
0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)
0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)
0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)
(*): Shared library is missing debugging information.
(gdb) info threads
  Id   Target Id                          Frame
* 1.1  Thread 1 (tid 30671243, running)   0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.2  Thread 258 (tid 34406781, running) thread_function (arg=0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
  1.3  Thread 515 (tid 36307315, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0)
    at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
  2.1  process 15335754                   0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  3.1  process 8061404                    0xd0594fc8 in _sigsetmask ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) inferior 2
[Switching to inferior 2 [process 15335754] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]
[Switching to thread 2.1 (process 15335754)]
#0  0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb)

-----------------------------------------------------------------------
Output without patch:-
Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 1]
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 11731200)]
I am parent
[New inferior 3 (process 16843200)]
I am parent
^C
Thread 1.1 received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) inferior 2
[Switching to inferior 2 [process 11731200] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)]
[Switching to thread 2.1 (process 11731200)]
#0  0xd0594fc8 in ?? ()
(gdb) info threads
  Id   Target Id         Frame
  1.1  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.2  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.3  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
  1.4  process 15270316  0xd0595fb0 in _p_nsleep ()
   from /usr/lib/libpthread.a(shr_xpg5.o)
* 2.1  process 11731200  0xd0594fc8 in ?? ()
  3.1  process 16843200  0xd0594fc8 in ?? ()
(gdb) info sharedlibrary
warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing.
warning: "/usr/lib/libcrypt.a": member "shr.o" missing.
warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing.
warning: "/usr/lib/libc.a": member "shr.o" missing.
warning: Could not load shared library symbols for 4 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o).
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
From        To          Syms Read   Shared Object Library
                        No          /usr/lib/libpthreads.a(shr_comm.o)
                        No          /usr/lib/libcrypt.a(shr.o)
                        No          /usr/lib/libpthread.a(shr_xpg5.o)
                        No          /usr/lib/libc.a(shr.o)
(gdb)



________________________________
From: Aditya Kamath1 <Aditya.Kamath1@ibm.com>
Sent: 10 February 2023 22:03
To: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>; simark@simark.ca <simark@simark.ca>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Hi Ulrich, Tom and community,

Please find attached the patch. {See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch}

Also find attached a document that I have proposed as a commit message. {See: Fix Multi Thread debug fix for AIX.pdf}.. This same document is used in the commit message of this patch.

So the bug is fixed and test cases run alright. Kindly check the sample output of the same pasted below this email.

In the document I have explained every change in as much detail as possible.

To be honest, the previous version of the patch still had a bug, in particular when following the child. I found it while testing more thoroughly. Kindly see the section "Solution Part 1: - ", where I have explained it.

Kindly suggest changes if any are needed. Otherwise, kindly let me know if this is ready for commit.

>Previously we used switch_to_thread ().. Now if the application is
>threaded and we only pass ptid_t (user_current_pid) to switch_to_thread ()
>it will crash as main thread looks different or is ptid_t (pid, 0, tid).

> This part I don't quite understand yet - how/why does it crash?

Kindly check "Solution Part 2: - " of the document, where I have explained this.

>Similarly, I agree that everything may currently "work" without
>adding the equivalent change to pdc_write_memory, but most likely
>this is simply because that callback may just not be used very much.

Yes, I agree.  We have changed the user_current_pid variable to thread so that we always switch to the right context. Kindly let me know whether this is alright or whether any changes are necessary here.

>Can you come up with a
>message that maybe starts out with the high-level change
>(along the lines of "update aix-thread.c to handle threads in
>multiple inferiors"), and goes from there into the specific
>details (aix_thread_variables structure, handling only a
>single inferior per sync_threadlists invocation, solib fixes
>for multiple inferiors, ...)?  Thanks!

Sure, so the pdf attached in this email has it. Kindly suggest me if we can do this better.

Have a nice day ahead.

Thanks and regards,
Aditya.


________________________________
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Sent: 09 February 2023 00:14
To: simark@simark.ca <simark@simark.ca>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>>This seems unrelated to the rest of the changes at first glance.
>>Why is this necessary?
>
>So, when we need to be in the right context when we read memory. Before
>coming into the target wait, we switch_to_no_thread () due to which our
>inferior_ptid is set to null. Our target_memory needs the correct
>inferior_ptid.  Also, in case we don't have a ptid_t (pid) and the
>application is threaded we need the inferior_ptid to be set correctly
>like shown in the patch.

Understood.

>Previously we used switch_to_thread ().. Now if the application is
>threaded and we only pass ptid_t (user_current_pid) to switch_to_thread ()
>it will crash as main thread looks different or is ptid_t (pid, 0, tid).

This part I don't quite understand yet - how/why does it crash?

>>By comparison, the Linux version of this in proc-service.c also
>>switches the current inferior and address space:
> > scoped_restore_current_inferior restore_inferior;
> > set_current_inferior (ph->thread->inf);
> > scoped_restore_current_program_space restore_current_progspace;
> > set_current_program_space (ph->thread->inf->pspace);
> > scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
> > inferior_ptid = ph->thread->ptid;
>> so we should probably do the same for consistency.

>So, kindly allow me to disagree with you on this. What is happening is in
>inferior.c in do_target_wait1 () we call switch_to_inferior_no_thread ()..
[snip]
>Here we already set the correct current inferior and program space to
>the same thing as that if we set in pdc_read_memory like linux.
>So, it does not make any difference to add the changes like linux does

Well, it does look like if you entered the callback in this particular
context, the inferior may have already been set up correctly.  However,
in theory the callback could also be called in different contexts, and
just as a precaution it would be preferable to have it always work
correctly.  The semantics of the callback is to read memory of a
particular process as identified via the pthdb_user_t argument, and
we should write the routine so that it always does what's needed to
implement that semantics correctly.

>Secondly, things work if we do not do the same for pdc_write_memory.
>I have not seen anything not work. So, I don't think it is good to
>add it there. What say??

Similarly, I agree that everything may currently "work" without
adding the equivalent change to pdc_write_memory, but most likely
this is simply because that callback may just not be used very much.

But as a precaution, and to accommodate potential future changes
e.g. in the libpthdebug.a library, it would be preferable to
implement the semantics correctly.  (Also, it just looks surprising
to see the read and write implementation differ when there is no
apparent reason why that should be the case.)
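
As an illustration of the point about giving the read and write callbacks the same semantics, here is a minimal, self-contained sketch. It is not the aix-thread.c code: current_pid, fake_memory, scoped_restore_int, raw_read, raw_write, read_callback and write_callback are all hypothetical names standing in for inferior_ptid, target memory, GDB's scoped_restore and the pdc_* callbacks. It only shows the shape: each callback selects the process named by its argument and restores the previous context on return.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>

/* Hypothetical stand-in for GDB's global inferior_ptid.  */
static int current_pid = 0;

/* Hypothetical per-process memory, 16 zero-initialised bytes per pid.  */
static std::map<int, std::array<char, 16>> fake_memory;

/* Minimal RAII guard: remember *VAR now, put the value back on scope
   exit.  */
class scoped_restore_int
{
public:
  explicit scoped_restore_int (int *var) : m_var (var), m_saved (*var) {}
  ~scoped_restore_int () { *m_var = m_saved; }
  scoped_restore_int (const scoped_restore_int &) = delete;
  scoped_restore_int &operator= (const scoped_restore_int &) = delete;

private:
  int *m_var;
  int m_saved;
};

/* Low-level accessors that act on whatever process is current.  */
static void
raw_read (char *buf, std::size_t addr, std::size_t len)
{
  assert (addr + len <= 16);
  std::memcpy (buf, fake_memory[current_pid].data () + addr, len);
}

static void
raw_write (const char *buf, std::size_t addr, std::size_t len)
{
  assert (addr + len <= 16);
  std::memcpy (fake_memory[current_pid].data () + addr, buf, len);
}

/* Callbacks: both select the process named by USER_PID and restore the
   previous context on return, so neither depends on the caller's
   state.  */
static void
read_callback (int user_pid, char *buf, std::size_t addr, std::size_t len)
{
  scoped_restore_int restore (&current_pid);
  current_pid = user_pid;
  raw_read (buf, addr, len);
}

static void
write_callback (int user_pid, const char *buf, std::size_t addr,
		std::size_t len)
{
  scoped_restore_int restore (&current_pid);
  current_pid = user_pid;
  raw_write (buf, addr, len);
}
```

With this shape, a caller that has switched to no thread at all can still read or write the memory of any process, and its own notion of the current process is unchanged afterwards.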

>>This looks unnecessarily complicated.  Isn't this just
>  > *g++ = tp;
>
>This I have changed.

The code now looks like:
>+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
>+  {
>+    *g = tp;
>+    *g++;
>+  }

Which is weird, as *g++ dereferences g for no reason.  This should
simply be:

  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
    *g++ = tp;
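
For readers less familiar with the idiom, a small self-contained sketch follows. fake_thread and collect_threads are hypothetical names, not GDB code; the sketch only demonstrates that `*g++ = x` stores through the pointer and then advances it, so no separate increment statement is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/* Hypothetical stand-in for GDB's thread_info; only the idiom matters.  */
struct fake_thread
{
  int id;
};

/* Store a pointer to every element of THREADS into the buffer starting
   at G and return the number of entries written.  The caller must make
   sure the buffer has room for threads.size () entries.  */
static std::size_t
collect_threads (std::vector<fake_thread> &threads, fake_thread **g)
{
  fake_thread **start = g;

  for (fake_thread &tp : threads)
    *g++ = &tp;		/* Store through G, then advance G by one slot.  */

  return static_cast<std::size_t> (g - start);
}
```

A bare `*g++;` statement would advance the pointer too, but the dereference it performs is discarded, which is why it reads as a mistake.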


>As far as the check gptid.is_pid () is concerned, I will suggest we
>keep it there. If cmp_result is > 0 and we have a main process swap
>to create a thread. Rest is same in the loop. The reason being handling
>pi and gi variables becomes complex otherwise. When this swap happens,
>we need to increment both pi and gi.. Because we have taken care of the
>main threads in both pthread library and GDB. And this for loop is
>executed only once. So, the first event is main process being
>pthreaded. Once the swap happens pi and gi become one and since
>gcount = pcount = 1 we exit the for loop. Thread addition events comes
>after this.

Hmm, handling the initial switch of a single PID-only thread
to the PID/TID-style ptid_t separately before still seems
a bit clearer to me.  But in the end your proposed code looks
correct now so I'd be fine with it as is, if you prefer.


Except for the few things mentioned above, this now looks ready to
be committed to me.  However, I'm not sure the commit message
fully describes the latest version of the patch, after we've
gone through all those iterations ...  Can you come up with a
message that maybe starts out with the high-level change
(along the lines of "update aix-thread.c to handle threads in
multiple inferiors"), and goes from there into the specific
details (aix_thread_variables structure, handling only a
single inferior per sync_threadlists invocation, solib fixes
for multiple inferiors, ...)?  Thanks!

Bye,
Ulrich



[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 44777 bytes --]

From ae2f4812d5cf561ac24bbd51cfdaa532c73ea900 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Fri, 10 Feb 2023 09:36:11 -0600
Subject: [PATCH] Fix Multi Thread debug fix for AIX
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The bug:- The recent commit 98ed24fb35d89eb20179edf6c12f599c7a9e228e changed aix-thread.c to use gdb::checked_static_cast<aix_thread_info *> instead of static_cast<aix_thread_info *>.

As a result, AIX users on the latest version are not able to debug multi-threaded programs.

The error in AIX is as follows: -

internal-error: checked_static_cast: Assertion `result != nullptr' failed.

The root cause of the issue:- The private data was not set for the first (main) thread of a process. In AIX, when we ran an “info threads” command, we showed the main process as “process <pid>” without private data set and added a new thread “Thread <tid>” representing the same thread with private data set. When we iterate_over_threads () we call get_aix_thread_info (). This led to the crash because the main process thread “process <pid>” had no private data. Hence the checked static cast does not allow us to debug any further, and rightly so, as we had a thread with no private data.
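
The failure mode can be pictured with a small, self-contained sketch. This is not GDB's gdb::checked_static_cast nor the real aix_thread_info; private_thread_info_like, aix_like_info, fake_thread, try_get_info and get_info_checked are hypothetical names. It assumes only what is described above: a strict accessor insists that a thread carries private data of the expected type, and a thread without any private data trips that check.

```cpp
#include <cassert>
#include <memory>
#include <utility>

/* Hypothetical base class for per-thread private data.  */
struct private_thread_info_like
{
  virtual ~private_thread_info_like () = default;
};

/* Hypothetical AIX-flavoured private data.  */
struct aix_like_info : private_thread_info_like
{
  int tid = 0;
};

/* Hypothetical thread object; PRIV may be empty.  */
struct fake_thread
{
  std::unique_ptr<private_thread_info_like> priv;
};

/* Lenient accessor: returns nullptr when T has no private data of the
   expected type.  */
static aix_like_info *
try_get_info (const fake_thread &t)
{
  return dynamic_cast<aix_like_info *> (t.priv.get ());
}

/* Strict accessor: in the spirit of a checked cast, a missing or
   mismatched private data pointer is treated as an internal error.  */
static aix_like_info *
get_info_checked (const fake_thread &t)
{
  aix_like_info *result = try_get_info (t);
  assert (result != nullptr);
  return result;
}
```

In this picture, the “process <pid>” entry corresponds to a fake_thread whose priv was never set, so the strict accessor aborts as soon as an iteration over all threads reaches it.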

What should be the fix: - Removing the main process thread, i.e. “process <pid>”, was the first proposed solution, as the “Thread <tid>” representing the same thread already exists with private data set. This happens in the sync_threadlists () code of AIX.

Solution Part 1: -

Why the change?

The delete_thread () call in the cmp_result > 0 block of the for loop in sync_threadlists (), which applies the difference between the pthread and GDB thread lists, fails to delete the main process thread. The reason is that “process <pid>” is the current process, so GDB core will not delete it even though we call delete_thread () on it. Hence, even if we add the “thread <tid>” representing the same “process <pid>” in the next iteration of the for loop, we will not be successful.

This forces us to change the main process thread “process <pid>” to “thread <tid>” via thread_change_ptid () and to set its private data. These changes can be seen in sync_threadlists ().

However, we also need to keep in mind that our libpthread library is only ready once the following condition in wait () of aix-thread.c is satisfied.

/* Check whether libpthdebug might be ready to be initialized.  */
  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
      && status->sig () == GDB_SIGNAL_TRAP)

Until then, changing "process <pid>" to "thread <tid>" is incorrect, even though the session has already been enabled and initialised through pd_enable () and pd_activate (). We therefore pass a pthdebugready flag through every function that leads to sync_threadlists (), so that the process thread is turned into a thread with private data only once libpthdebug is initialised for that particular process.

The first if condition in sync_threadlists (), shown below, handles the case where the pthread debug library is not yet initialised. It only sets priv on the main process thread.

if (gbuf[0]->ptid.is_pid () && !pthdebugready)
    {
      aix_thread_info *priv = new aix_thread_info;
      tp->priv.reset (priv);
    }

The second if condition, shown below, changes "process <pid>" to "thread <tid>" once the pthread debug library is initialised.

if (gptid.is_pid () && pthdebugready)
    {
      thread_change_ptid (proc_target, gptid, pptid);
      aix_thread_info *priv = new aix_thread_info;
      priv->pdtid = pbuf[pi].pdtid;
      priv->tid = pbuf[pi].tid;
      tp->priv.reset (priv);
      gi++;
      pi++;
    }
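The two conditions can be summarised as a small decision function. This is only a sketch for a GDB-side entry that has no match on the pthread side; sync_action and choose_sync_action are hypothetical names, and the real code makes the first decision before the loop and the other two inside its cmp_result > 0 branch.

```cpp
#include <cassert>

// Hypothetical summary of what happens to an unmatched GDB thread entry.
enum class sync_action
{
  set_priv_only,     // library not ready: keep "process <pid>", give it priv
  rename_to_thread,  // library ready: turn "process <pid>" into "thread <tid>"
  delete_thread      // ordinary stale thread entry: delete it
};

sync_action
choose_sync_action (bool gdb_entry_is_pid_only, bool pthdebugready)
{
  if (gdb_entry_is_pid_only && !pthdebugready)
    return sync_action::set_priv_only;
  if (gdb_entry_is_pid_only && pthdebugready)
    return sync_action::rename_to_thread;
  return sync_action::delete_thread;
}
```

The point of the flag is visible in the first two cases: the same pid-only entry is treated differently depending on whether the process has raised its trap yet.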

Changing the ptid too early causes two problems. First, in fetch_registers (), the regcache ptid has already been changed to ptid_t (pid, 0, tid), but the private data cannot be found when we switch from the parent process to a child process with the "inferior 2" command. This leads to a crash because private data was not set for the thread: we incorrectly changed "process <pid>" to "thread <tid>" before the process itself raised the trap that tells the debugger it is ready for thread debugging.

Example of the crash:
(gdb) set detach-on-fork off
(gdb) r
Starting program:
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 21627386)]
I am parent
[New inferior 3 (process 9372064)]
I am parent
^C
Thread 1.1 received signal SIGINT, Interrupt.
[Switching to Thread 1]
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) inferior 2
[Switching to inferior 2 [process 21627386] (/home /gdb_tests/ultimate-multi-thread-fork)]
[Switching to thread 2.1 (Thread 515)]
(gdb) c
Continuing.
./../gdbsupport/gdb-checked-static-cast.h:58: internal-error: checked_static_cast: Assertion `result != nullptr' failed.

The stack trace of this crash is as follows:

0x000000010059ef60  aix_thread_info* gdb::checked_static_cast<aix_thread_info*, private_thread_info>(private_thread_info*)(0x0) + 0x7c
0x0000000100596ea0  get_aix_thread_info(thread_info*)(0x0) + 0x34
0x000000010059b778  aix_thread_target::fetch_registers(regcache*, int)(0x11001f3f8, 0x1107c5030, 0x4000000000) + 0xf8
0x00000001003675f0  target_fetch_registers(regcache*, int)(0x1107c5030, 0x40e0ddf00d) + 0x6c
0x00000001005817c0  regcache::raw_update(int)(0x1107c5030, 0x401001f3f8) + 0x94
0x0000000100581904  readable_regcache::raw_read(int, unsigned char*)(0x1107c5030, 0x4000000203, 0xfffffffffffebc0) + 0x8c
0x0000000100581f54  readable_regcache::cooked_read(int, unsigned char*)(0x1107c5030, 0x40ffffeb90, 0xfffffffffffebc0) + 0xec
0x0000000100daba10  register_status readable_regcache::cooked_read<unsigned long, void>(int, unsigned long*)(0x1107c5030, 0x40ffffec50, 0xfffffffffffed10) + 0xd4
0x00000001005826a0  regcache_cooked_read_unsigned(regcache*, int, unsigned long*)(0x1107c5030, 0x40ffffecd0, 0xfffffffffffed10) + 0x70
0x0000000100584e2c  regcache_read_pc(regcache*)(0x1107c5030) + 0xa4
0x0000000100387614  handle_signal_stop(execution_control_state*)(0xffffffffffff3a8) + 0x158
0x00000001003864e4

Second, if we follow the child instead of the parent and change "process <pid>" to "thread <tid>" before the process raises the trap that tells the debugger "I am ready for threads", then switch_to_thread () in follow_fork () cannot find "process <pid>". This leads to the assertion failure shown below, and rightly so, because we changed threads before the library was initialised. It happens when follow_fork () is called and we switch to the child thread there.

(gdb) set detach-on-fork off
(gdb) set follow-fork-mode child
(gdb) r
Starting program:
[New Thread 258]
[New Thread 515]
[Attaching after Thread 515 fork to child process 18809098]
[New inferior 2 (process 18809098)]
thread.c:1337: internal-error: switch_to_thread: Assertion `thr != NULL' failed.

The stack trace is as follows:
0x0000000100036590  internal_error_loc(char const*, int, char const*, ...)(0x10192ba70, 0x53900000000, 0x10192b970) + 0x58
0x0000000100619918  switch_to_thread(thread_info*)(0x0) + 0x48
0x000000010037635c  follow_fork()() + 0x4c8
0x0000000100385af8  handle_inferior_event(execution_control_state*)(0xffffffffffff3a8) + 0xda8
0x00000001003809d0  fetch_inferior_event()() + 0x2f8
0x0000000100719a0c  inferior_event_handler(inferior_event_type)(0x10207a50) + 0x38
0x000000010039228c  infrun_async_inferior_event_handler(void*)(0x0) + 0x30
0x0000000100671d18  check_async_event_handlers()() + 0x94
0x000000010066e32c  gdb_do_one_event(int)(0xfffffffffffff840) + 0xb4
0x0000000100001dcc  start_event_loop()() + 0x28
0x0000000100001fd4  captured_command_loop()() + 0x58
0x000000010000414c  captured_main(void*)(0xffffffffffffa60) + 0x2c
0x0000000100004220  gdb_main(captured_main_args*)(0xffffffffffffa60) + 0x20

This is why sync_threadlists () now takes the pthdebugready parameter and why its for loop was changed.

Also, we no longer use iterate_over_threads () to count GDB threads. Instead, we loop inline with for (thread_info *tp : all_threads (proc_target, ptid_t (pid))).
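As a rough illustration of counting only the threads of one process, the sketch below filters a flat thread list by pid. ptid_model and threads_of_process are hypothetical; GDB's all_threads () range does the filtering itself.

```cpp
#include <cassert>
#include <vector>

// Hypothetical flat ptid: tid == 0 models a pid-only "process <pid>" entry.
struct ptid_model
{
  int pid;
  long tid;
};

// Collect every entry that belongs to PID, including the pid-only one.
std::vector<ptid_model>
threads_of_process (const std::vector<ptid_model> &all_threads_model, int pid)
{
  std::vector<ptid_model> result;
  for (const ptid_model &p : all_threads_model)
    if (p.pid == pid)
      result.push_back (p);
  return result;
}
```

Unlike the removed giter_count () and giter_accum () callbacks, which skipped entries whose tid is zero, this filter keeps the pid-only entry, so the main process thread takes part in the comparison.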

Solution Part 2:

Because we switch_to_no_thread () before a wait (), any event that makes us use the thread callbacks (thread detection, for example) requires us to be in the right context while we read and write data for threads. That is why we now switch inferior_ptid, the current inferior and the program space in both pdc_read_data () and pdc_write_data ().
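The save, switch and restore pattern used in those callbacks can be sketched with a small RAII helper. This is a simplified model, not GDB's scoped_restore: scoped_restore_model, current_ptid_model and read_in_context are hypothetical names, and a plain int stands in for inferior_ptid.

```cpp
#include <cassert>

// Minimal RAII helper: remembers a variable's value and restores it when
// the helper goes out of scope.
template<typename T>
class scoped_restore_model
{
public:
  explicit scoped_restore_model (T *var)
    : m_var (var), m_saved (*var)
  {}

  ~scoped_restore_model ()
  { *m_var = m_saved; }

  scoped_restore_model (const scoped_restore_model &) = delete;
  scoped_restore_model &operator= (const scoped_restore_model &) = delete;

private:
  T *m_var;
  T m_saved;
};

// Stand-in for the global "current thread" state; 0 means no thread.
int current_ptid_model = 0;

// Model of a callback: switch to the thread it was given, do the work,
// and let the destructor put the previous context back.
int
read_in_context (int thread_ptid)
{
  scoped_restore_model<int> restore (&current_ptid_model);
  current_ptid_model = thread_ptid;
  return current_ptid_model;  /* The "read" sees the callback's thread.  */
}
```

The callback sees the context of the thread it was given, while the caller's context (no thread selected) is unchanged after the call returns.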

So why did we make this change
-    if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-                       ptid_t (user_current_pid));
 in pdc_read_data (), and change the user argument from a process ID to a thread? Wasn't the old code already doing the job?

Consider the case where the parent process is threaded and calls fork (). When we call pd_update () after beneath->wait () in the thread target's wait (), sync_threadlists () is called as well, and it in turn calls pthdb_pthread (data->pd_session, &pdtid, cmd).

The callback would then use ptid_t (user_current_pid) in switch_to_thread (). However, the main thread of the parent process is already threaded, i.e. its ptid is ptid_t (user_current_pid, 0, tid). Hence we crash with an assertion failure because the thread ptid_t (user_current_pid) cannot be found.

To avoid this, we now pass the thread directly. Once the main process has been turned into a main thread, there is no confusion about which thread, inferior_ptid or program space to switch to, especially when a process is multi-threaded.

Solution Part 3: On AIX we use several variables for different purposes: pd_active, pd_able, arch64, pd_brk_addr and pd_session. Their values are specific to each inferior, so we need to keep them in a map <inferior, structure>, where the structure holds all of these variables for one inferior. For this we use GDB's built-in registry, keyed per inferior. This change is part of this patch.
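The idea of per-inferior state can be sketched with a plain map. This is only an illustration: the patch uses GDB's registry, whereas thread_vars_model and per_inferior_vars below are hypothetical, an int stands in for the inferior, and pd_session is left out.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical copy of the per-inferior fields described above.
struct thread_vars_model
{
  int pd_able = 0;
  int pd_active = 0;
  int arch64 = 0;
  unsigned long pd_brk_addr = 0;
};

class per_inferior_vars
{
public:
  // Return the variables for INFERIOR_NUM, creating a zeroed entry on
  // first use.
  thread_vars_model &
  get (int inferior_num)
  {
    return m_vars[inferior_num];
  }

private:
  std::unordered_map<int, thread_vars_model> m_vars;
};
```

Activating thread debugging for one inferior then leaves every other inferior's flags untouched, which a single set of static variables cannot guarantee.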

Solution Part 4:

We found that the top target for a new inferior created after the main inferior was incorrect once the process was threaded.

The root cause was that the shared library was not being loaded for the new process, because the shared library file name registered in the BFD registry changed from member_name to path(member_name).

The changes in solib-aix.c therefore handle the new pattern, so that the shared library is loaded correctly for every new inferior as well. They look for the '(' character and check whether member_name follows it in the name registered in the BFD registry.
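A minimal sketch of that kind of matching is shown here. It is not the code from solib-aix.c: matches_member is a hypothetical helper, and treating a name without '(' as the bare member name is an assumption.

```cpp
#include <cassert>
#include <string>

// Return true if REGISTERED names MEMBER_NAME, either as the bare member
// name or in the "path(member_name)" form.
bool
matches_member (const std::string &registered, const std::string &member_name)
{
  std::string::size_type open = registered.find ('(');
  if (open == std::string::npos)
    return registered == member_name;

  // The member name must directly follow '(' and be closed by ')'.
  std::string::size_type end = open + 1 + member_name.size ();
  return registered.compare (open + 1, member_name.size (), member_name) == 0
	 && end < registered.size ()
	 && registered[end] == ')';
}
```

Requiring the closing ')' keeps a member name that is only a prefix of the registered one from matching.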

-----------------------------------------------------------------------------------------------------------------

Together, these four solution parts fix the bug.
---
 gdb/aix-thread.c | 431 ++++++++++++++++++++++++++++-------------------
 gdb/solib-aix.c  |  14 ++
 2 files changed, 271 insertions(+), 174 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..f86d9429f71 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -68,10 +68,6 @@ static bool debug_aix_thread;
 #define pthdb_tid_t	tid_t
 #endif
 
-/* Return whether to treat PID as a debuggable thread id.  */
-
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
-
 /* Success and failure values returned by pthdb callbacks.  */
 
 #define PDC_SUCCESS	PTHDB_SUCCESS
@@ -144,38 +140,20 @@ class aix_thread_target final : public target_ops
 
 static aix_thread_target aix_thread_ops;
 
-/* Address of the function that libpthread will call when libpthdebug
-   is ready to be initialized.  */
-
-static CORE_ADDR pd_brk_addr;
-
-/* Whether the current application is debuggable by pthdb.  */
-
-static int pd_able = 0;
-
-/* Whether a threaded application is being debugged.  */
-
-static int pd_active = 0;
-
-/* Whether the current architecture is 64-bit.  
-   Only valid when pd_able is true.  */
-
-static int arch64;
-
 /* Forward declarations for pthdb callbacks.  */
 
-static int pdc_symbol_addrs (pthdb_user_t, pthdb_symbol_t *, int);
-static int pdc_read_data (pthdb_user_t, void *, pthdb_addr_t, size_t);
-static int pdc_write_data (pthdb_user_t, void *, pthdb_addr_t, size_t);
-static int pdc_read_regs (pthdb_user_t user, pthdb_tid_t tid,
+static int pdc_symbol_addrs (thread_info*, pthdb_symbol_t *, int);
+static int pdc_read_data (thread_info*, void *, pthdb_addr_t, size_t);
+static int pdc_write_data (thread_info*, void *, pthdb_addr_t, size_t);
+static int pdc_read_regs (thread_info* user, pthdb_tid_t tid,
 			  unsigned long long flags, 
 			  pthdb_context_t *context);
-static int pdc_write_regs (pthdb_user_t user, pthdb_tid_t tid,
+static int pdc_write_regs (thread_info* user, pthdb_tid_t tid,
 			   unsigned long long flags, 
 			   pthdb_context_t *context);
-static int pdc_alloc (pthdb_user_t, size_t, void **);
-static int pdc_realloc (pthdb_user_t, void *, size_t, void **);
-static int pdc_dealloc (pthdb_user_t, void *);
+static int pdc_alloc (thread_info*, size_t, void **);
+static int pdc_realloc (thread_info*, void *, size_t, void **);
+static int pdc_dealloc (thread_info*, void *);
 
 /* pthdb callbacks.  */
 
@@ -191,9 +169,66 @@ static pthdb_callbacks_t pd_callbacks = {
   NULL
 };
 
-/* Current pthdb session.  */
+/* Aix variable structure.  */
+struct aix_thread_variables 
+{
+  /* Whether the current application is debuggable by pthdb.  */
+  int pd_able;
+
+  /* Whether a threaded application is being debugged.  */
+  int pd_active;
+
+  /* Current pthdb session.  */
+  pthdb_session_t pd_session;
+
+  /* Address of the function that libpthread will call when libpthdebug
+   is ready to be initialized.  */
+  CORE_ADDR pd_brk_addr;
+
+  /* Whether the current architecture is 64-bit.
+   Only valid when pd_able is true.  */
+  int arch64;
+};
+
+/* Key to our per-inferior data.  */
+static const registry<inferior>::key<aix_thread_variables>
+  aix_thread_variables_handle;
+
+/* Function to Get aix_thread_variables data.  */
+static struct aix_thread_variables*
+get_aix_thread_variables_data (struct inferior *inf)
+{
+  if (inf == NULL)
+    return NULL;
+
+  struct aix_thread_variables* data;
+
+  data = aix_thread_variables_handle.get (inf);
+  if (data == NULL)
+    data = aix_thread_variables_handle.emplace (inf);
+
+  return data;
+}
+
+/* Helper to get data for ptid in a function.  */
 
-static pthdb_session_t pd_session;
+static struct aix_thread_variables*
+get_thread_data_helper_for_ptid (ptid_t ptid)
+{
+  inferior *inf = find_inferior_ptid (current_inferior ()->process_target (),
+					ptid);
+  return get_aix_thread_variables_data (inf);
+}
+
+/* Helper to get data for pid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_pid (pid_t pid)
+{
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (),
+                                        pid);
+  return get_aix_thread_variables_data (inf);
+}
 
 /* Return a printable representation of pthdebug function return
    STATUS.  */
@@ -318,7 +353,7 @@ pid_to_prc (ptid_t *ptidp)
   ptid_t ptid;
 
   ptid = *ptidp;
-  if (PD_TID (ptid))
+  if (ptid.tid () != 0)
     *ptidp = ptid_t (ptid.pid ());
 }
 
@@ -326,7 +361,7 @@ pid_to_prc (ptid_t *ptidp)
    the address of SYMBOLS[<i>].name.  */
 
 static int
-pdc_symbol_addrs (pthdb_user_t user_current_pid, pthdb_symbol_t *symbols, int count)
+pdc_symbol_addrs (thread_info *user_current_thread, pthdb_symbol_t *symbols, int count)
 {
   struct bound_minimal_symbol ms;
   int i;
@@ -334,8 +369,8 @@ pdc_symbol_addrs (pthdb_user_t user_current_pid, pthdb_symbol_t *symbols, int co
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
-		"pdc_symbol_addrs (user_current_pid = %ld, symbols = 0x%lx, count = %d)\n",
-		user_current_pid, (long) symbols, count);
+		"pdc_symbol_addrs (user_current_pid = %d, symbols = 0x%lx, count = %d)\n",
+		user_current_thread->ptid.pid (), (long) symbols, count);
 
   for (i = 0; i < count; i++)
     {
@@ -373,7 +408,7 @@ pdc_symbol_addrs (pthdb_user_t user_current_pid, pthdb_symbol_t *symbols, int co
    If successful return 0, else non-zero is returned.  */
 
 static int
-pdc_read_regs (pthdb_user_t user_current_pid,
+pdc_read_regs (thread_info *user_current_thread,
 	       pthdb_tid_t tid,
 	       unsigned long long flags,
 	       pthdb_context_t *context)
@@ -389,6 +424,9 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_thread->ptid.pid ());
   
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_read_regs tid=%d flags=%s\n",
@@ -397,7 +435,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -423,7 +461,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -445,7 +483,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
    If successful return 0, else non-zero is returned.  */
 
 static int
-pdc_write_regs (pthdb_user_t user_current_pid,
+pdc_write_regs (thread_info *user_current_thread,
 		pthdb_tid_t tid,
 		unsigned long long flags,
 		pthdb_context_t *context)
@@ -456,6 +494,10 @@ pdc_write_regs (pthdb_user_t user_current_pid,
      this is needed, I have implemented what I think it should do,
      however this code is untested.  */
 
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_thread->ptid.pid ());
+
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_write_regs tid=%d flags=%s\n",
 		(int) tid, hex_string (flags));
@@ -463,7 +505,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_WRITE_GPRS, tid, 
 		     (unsigned long) context->gpr, 0, NULL);
       else
@@ -479,7 +521,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  ptrace64aix (PTT_WRITE_SPRS, tid, 
 		       (unsigned long) &context->msr, 0, NULL);
@@ -495,27 +537,30 @@ pdc_write_regs (pthdb_user_t user_current_pid,
 /* pthdb callback: read LEN bytes from process ADDR into BUF.  */
 
 static int
-pdc_read_data (pthdb_user_t user_current_pid, void *buf,
+pdc_read_data (thread_info *user_current_thread, void *buf,
 	       pthdb_addr_t addr, size_t len)
 {
   int status, ret;
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
-		"pdc_read_data (user_current_pid = %ld, buf = 0x%lx, addr = %s, len = %ld)\n",
-		user_current_pid, (long) buf, hex_string (addr), len);
+		"pdc_read_data (user_current_pid = %d, buf = 0x%lx, addr = %s, len = %ld)\n",
+		user_current_thread->ptid.pid (), (long) buf, hex_string (addr), len);
 
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
-    if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+    inferior_ptid = user_current_thread->ptid;
+    scoped_restore_current_inferior restore_inferior;
+    set_current_inferior (user_current_thread->inf);
+
+    scoped_restore_current_program_space restore_current_progspace;
+    set_current_program_space (user_current_thread->inf->pspace);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -529,17 +574,27 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
 /* pthdb callback: write LEN bytes from BUF to process ADDR.  */
 
 static int
-pdc_write_data (pthdb_user_t user_current_pid, void *buf,
+pdc_write_data (thread_info *user_current_thread, void *buf,
 		pthdb_addr_t addr, size_t len)
 {
   int status, ret;
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
-		"pdc_write_data (user_current_pid = %ld, buf = 0x%lx, addr = %s, len = %ld)\n",
-		user_current_pid, (long) buf, hex_string (addr), len);
+		"pdc_write_data (user_current_pid = %d, buf = 0x%lx, addr = %s, len = %ld)\n",
+		user_current_thread->ptid.pid (), (long) buf, hex_string (addr), len);
+
+  {
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
+    inferior_ptid = user_current_thread->ptid;
+    scoped_restore_current_inferior restore_inferior;
+    set_current_inferior (user_current_thread->inf);
+
+    scoped_restore_current_program_space restore_current_progspace;
+    set_current_program_space (user_current_thread->inf->pspace);
+    status = target_write_memory (addr, (gdb_byte *) buf, len);
+  }
 
-  status = target_write_memory (addr, (gdb_byte *) buf, len);
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
 
   if (debug_aix_thread)
@@ -552,12 +607,12 @@ pdc_write_data (pthdb_user_t user_current_pid, void *buf,
    in BUFP.  */
 
 static int
-pdc_alloc (pthdb_user_t user_current_pid, size_t len, void **bufp)
+pdc_alloc (thread_info *user_current_thread, size_t len, void **bufp)
 {
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
-		"pdc_alloc (user_current_pid = %ld, len = %ld, bufp = 0x%lx)\n",
-		user_current_pid, len, (long) bufp);
+		"pdc_alloc (user_current_pid = %d, len = %ld, bufp = 0x%lx)\n",
+		user_current_thread->ptid.pid (), len, (long) bufp);
   *bufp = xmalloc (len);
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -574,12 +629,12 @@ pdc_alloc (pthdb_user_t user_current_pid, size_t len, void **bufp)
    pointer to the result in BUFP.  */
 
 static int
-pdc_realloc (pthdb_user_t user_current_pid, void *buf, size_t len, void **bufp)
+pdc_realloc (thread_info *user_current_thread, void *buf, size_t len, void **bufp)
 {
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
-		"pdc_realloc (user_current_pid = %ld, buf = 0x%lx, len = %ld, bufp = 0x%lx)\n",
-		user_current_pid, (long) buf, len, (long) bufp);
+		"pdc_realloc (user_current_pid = %d, buf = 0x%lx, len = %ld, bufp = 0x%lx)\n",
+		user_current_thread->ptid.pid (), (long) buf, len, (long) bufp);
   *bufp = xrealloc (buf, len);
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -591,11 +646,11 @@ pdc_realloc (pthdb_user_t user_current_pid, void *buf, size_t len, void **bufp)
    realloc callback.  */
 
 static int
-pdc_dealloc (pthdb_user_t user_current_pid, void *buf)
+pdc_dealloc (thread_info *user_current_thread, void *buf)
 {
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
-		"pdc_free (user_current_pid = %ld, buf = 0x%lx)\n", user_current_pid,
+		"pdc_free (user_current_pid = %d, buf = 0x%lx)\n", user_current_thread->ptid.pid (),
 		(long) buf);
   xfree (buf);
   return PDC_SUCCESS;
@@ -639,39 +694,6 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_count (struct thread_info *thread, void *countp)
-{
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
-  return 0;
-}
-
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_accum (struct thread_info *thread, void *bufp)
-{
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
-  return 0;
-}
-
 /* ptid comparison function */
 
 static int
@@ -719,7 +741,10 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      /* We also need to keep in mind Trap and interrupt or any
+         signal that needs to be handled in pd_update ().  */
+
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -741,7 +766,7 @@ get_signaled_thread (int pid)
        have difficulty with certain call patterns */
 
 static void
-sync_threadlists (int pid)
+sync_threadlists (pid_t pid, int pthdebugready)
 {
   int cmd, status;
   int pcount, psize, pi, gcount, gi;
@@ -750,6 +775,11 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -759,11 +789,11 @@ sync_threadlists (int pid)
 
   for (cmd = PTHDB_LIST_FIRST;; cmd = PTHDB_LIST_NEXT)
     {
-      status = pthdb_pthread (pd_session, &pdtid, cmd);
+      status = pthdb_pthread (data->pd_session, &pdtid, cmd);
       if (status != PTHDB_SUCCESS || pdtid == PTHDB_INVALID_PTHREAD)
 	break;
 
-      status = pthdb_pthread_ptid (pd_session, pdtid, &pthid);
+      status = pthdb_pthread_ptid (data->pd_session, pdtid, &pthid);
       if (status != PTHDB_SUCCESS || pthid == PTHDB_INVALID_PTID)
 	continue;
 
@@ -780,7 +810,7 @@ sync_threadlists (int pid)
 
   for (pi = 0; pi < pcount; pi++)
     {
-      status = pthdb_pthread_tid (pd_session, pbuf[pi].pdtid, &tid);
+      status = pthdb_pthread_tid (data->pd_session, pbuf[pi].pdtid, &tid);
       if (status != PTHDB_SUCCESS)
 	tid = PTHDB_INVALID_TID;
       pbuf[pi].tid = tid;
@@ -790,13 +820,32 @@ sync_threadlists (int pid)
 
   /* Accumulate an array of GDB threads sorted by pid.  */
 
+  /* gcount is GDB thread count and pcount is pthreadlib thread count.  */
+
   gcount = 0;
-  iterate_over_threads (giter_count, &gcount);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    gcount++;
   g = gbuf = XNEWVEC (struct thread_info *, gcount);
-  iterate_over_threads (giter_accum, &g);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    *g++ = tp;
   qsort (gbuf, gcount, sizeof *gbuf, gcmp);
 
+  tp = find_thread_ptid (proc_target, ptid_t (pid));
+
+  /* If the pthreadlibrary is not ready to debug 
+     then this is just a main process which needs 
+     a priv to be set.  The if condition below does 
+     the same.  Otherwise we go to the for loop to 
+     sync the pthread and GDB thread lists.  */
+
+  if (gbuf[0]->ptid.is_pid () && !pthdebugready)
+    {
+      aix_thread_info *priv = new aix_thread_info;
+      tp->priv.reset (priv);
+    }
+
   /* Apply differences between the two arrays to GDB's thread list.  */
+  else  
   for (pi = gi = 0; pi < pcount || gi < gcount;)
     {
       if (pi == pcount)
@@ -810,8 +859,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -841,13 +888,27 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      /* This is to make the main process thread now look
+                 like a thread.  */
+
+	      if (gptid.is_pid () && pthdebugready)
+		{
+		  thread_change_ptid (proc_target, gptid, pptid);
+		  aix_thread_info *priv = new aix_thread_info;
+                  priv->pdtid = pbuf[pi].pdtid;
+                  priv->tid = pbuf[pi].tid;
+		  tp->priv.reset (priv);
+		  gi++;
+		  pi++;
+		}
+	      else
+		{
+		  delete_thread (gbuf[gi]);
+		  gi++;
+		}
 	    }
 	  else
 	    {
-	      process_stratum_target *proc_target
-		= current_inferior ()->process_target ();
 	      thread = add_thread (proc_target, pptid);
 
 	      aix_thread_info *priv = new aix_thread_info;
@@ -881,21 +942,24 @@ iter_tid (struct thread_info *thread, void *tidp)
    return a pid-only ptid with PID.  */
 
 static ptid_t
-pd_update (int pid)
+pd_update (pid_t pid, int pthdebugready)
 {
   int status;
   ptid_t ptid;
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
+  struct aix_thread_variables *data;
 
-  if (!pd_active)
+  data = get_thread_data_helper_for_pid (pid);
+
+  if (!data->pd_active)
     return ptid_t (pid);
 
-  status = pthdb_session_update (pd_session);
+  status = pthdb_session_update (data->pd_session);
   if (status != PTHDB_SUCCESS)
     return ptid_t (pid);
 
-  sync_threadlists (pid);
+  sync_threadlists (pid, pthdebugready);
 
   /* Define "current thread" as one that just received a trap signal.  */
 
@@ -915,32 +979,22 @@ pd_update (int pid)
    for that thread.  Otherwise, return a ptid-only ptid using PID.  */
 
 static ptid_t
-pd_activate (int pid)
+pd_activate (pid_t pid, int pthdebugready)
 {
   int status;
-		
-  status = pthdb_session_init (pid, arch64 ? PEM_64BIT : PEM_32BIT,
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
+  thread_info *thread = find_thread_ptid (current_inferior (), ptid_t (pid));
+  
+  status = pthdb_session_init (thread, data->arch64 ? PEM_64BIT : PEM_32BIT,
 			       PTHDB_FLAG_REGS, &pd_callbacks, 
-			       &pd_session);
+			       &data->pd_session);
   if (status != PTHDB_SUCCESS)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
-  return pd_update (pid);
-}
-
-/* Undo the effects of pd_activate().  */
-
-static void
-pd_deactivate (void)
-{
-  if (!pd_active)
-    return;
-  pthdb_session_destroy (pd_session);
-  
-  pid_to_prc (&inferior_ptid);
-  pd_active = 0;
+  data->pd_active = 1;
+  return pd_update (pid, pthdebugready);
 }
 
 /* An object file has just been loaded.  Check whether the current
@@ -952,17 +1006,24 @@ pd_enable (void)
   int status;
   char *stub_name;
   struct bound_minimal_symbol ms;
+  struct aix_thread_variables *data;
+
+  if (!inferior_ptid.pid ())
+    return;
+  
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (data->pd_able)
     return;
 
   /* Check application word size.  */
-  arch64 = register_size (target_gdbarch (), 0) == 8;
+  data->arch64 = register_size (target_gdbarch (), 0) == 8;
 
   /* Check whether the application is pthreaded.  */
   stub_name = NULL;
-  status = pthdb_session_pthreaded (inferior_ptid.pid (), PTHDB_FLAG_REGS,
+  thread_info *thread = find_thread_ptid (current_inferior (), inferior_ptid);
+  status = pthdb_session_pthreaded (thread, PTHDB_FLAG_REGS,
 				    &pd_callbacks, &stub_name);
   if ((status != PTHDB_SUCCESS
        && status != PTHDB_NOT_PTHREADED) || !stub_name)
@@ -972,18 +1033,18 @@ pd_enable (void)
   ms = lookup_minimal_symbol (stub_name, NULL, NULL);
   if (ms.minsym == NULL)
     return;
-  pd_brk_addr = ms.value_address ();
-  if (!create_thread_event_breakpoint (target_gdbarch (), pd_brk_addr))
+  data->pd_brk_addr = ms.value_address ();
+  if (!create_thread_event_breakpoint (target_gdbarch (), data->pd_brk_addr))
     return;
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  data->pd_able = 1; 
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
      activate thread debugging.  */
-  pd_activate (inferior_ptid.pid ());
+  pd_activate (inferior_ptid.pid (), 0);
 }
 
 /* Undo the effects of pd_enable().  */
@@ -991,28 +1052,31 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!data->pd_able)
+    return;
+  if (!data->pd_active)
     return;
-  if (pd_active)
-    pd_deactivate ();
-  pd_able = 0;
+  pthdb_session_destroy (data->pd_session);
+ 
+  pid_to_prc (&inferior_ptid);
+  data->pd_active = 0;
+  data->pd_able = 0;
   current_inferior ()->unpush_target (&aix_thread_ops);
 }
 
 /* new_objfile observer callback.
 
    If OBJFILE is non-null, check whether a threaded application is
-   being debugged, and if so, prepare for thread debugging.
-
-   If OBJFILE is null, stop debugging threads.  */
+   being debugged, and if so, prepare for thread debugging.  */
 
 static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
     pd_enable ();
-  else
-    pd_disable ();
 }
 
 /* Attach to process specified by ARGS.  */
@@ -1042,8 +1106,11 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 {
   struct thread_info *thread;
   pthdb_tid_t tid[2];
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (ptid);
 
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     {
       scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
       
@@ -1065,7 +1132,7 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 	       ptid.lwp ());
       tid[1] = 0;
 
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_CONTINUE, tid[0], (long long) 1,
 		     gdb_signal_to_host (sig), (PTRACE_TYPE_ARG5) tid);
       else
@@ -1082,6 +1149,7 @@ ptid_t
 aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
 			 target_wait_flags options)
 {
+  struct aix_thread_variables *data;
   {
     pid_to_prc (&ptid);
 
@@ -1095,8 +1163,10 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
      pid-only ptids.  */
   gdb_assert (ptid.is_pid ());
 
+  data = get_thread_data_helper_for_ptid (ptid);
+
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
@@ -1105,11 +1175,11 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
       struct gdbarch *gdbarch = regcache->arch ();
 
       if (regcache_read_pc (regcache)
-	  - gdbarch_decr_pc_after_break (gdbarch) == pd_brk_addr)
-	return pd_activate (ptid.pid ());
+	  - gdbarch_decr_pc_after_break (gdbarch) == data->pd_brk_addr)
+	return pd_activate (ptid.pid (), 1);
     }
 
-  return pd_update (ptid.pid ());
+  return pd_update (ptid.pid (), 0);
 }
 
 /* Record that the 64-bit general-purpose registers contain VALS.  */
@@ -1229,18 +1299,20 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
   ppc_gdbarch_tdep *tdep = gdbarch_tdep<ppc_gdbarch_tdep> (gdbarch);
   int status, i;
   pthdb_context_t ctx;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
 		"fetch_regs_user_thread %lx\n", (long) pdtid);
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: fetch_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
 
   /* General-purpose registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_gprs64 (regcache, ctx.gpr);
   else
     for (i = 0; i < ppc_num_gprs; i++)
@@ -1253,7 +1325,7 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Special registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_sprs64 (regcache, ctx.iar, ctx.msr, ctx.cr, ctx.lr, ctx.ctr,
 			     ctx.xer, ctx.fpscr);
   else
@@ -1288,18 +1360,21 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
   int i;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"fetch_regs_kernel_thread tid=%lx regno=%d arch64=%d\n",
-		(long) tid, regno, arch64);
+		(long) tid, regno, data->arch64);
 
   /* General-purpose registers.  */
   if (regno == -1
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_gprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -1331,7 +1406,7 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -1363,7 +1438,7 @@ aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
   struct thread_info *thread;
   pthdb_tid_t tid;
 
-  if (!PD_TID (regcache->ptid ()))
+  if (regcache->ptid ().tid () == 0)
     beneath ()->fetch_registers (regcache, regno);
   else
     {
@@ -1511,6 +1586,8 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   pthdb_context_t ctx;
   uint32_t int32;
   uint64_t int64;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1518,7 +1595,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Retrieve the thread's current context for its non-register
      values.  */
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1528,7 +1605,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   for (i = 0; i < ppc_num_gprs; i++)
     if (REG_VALID == regcache->get_register_status (tdep->ppc_gp0_regnum + i))
       {
-	if (arch64)
+	if (data->arch64)
 	  {
 	    regcache->raw_collect (tdep->ppc_gp0_regnum + i, (void *) &int64);
 	    ctx.gpr[i] = int64;
@@ -1545,7 +1622,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
     fill_fprs (regcache, ctx.fpr);
 
   /* Special registers (always kept in ctx as 64 bits).  */
-  if (arch64)
+  if (data->arch64)
     {
       fill_sprs64 (regcache, &ctx.iar, &ctx.msr, &ctx.cr, &ctx.lr, &ctx.ctr,
 			     &ctx.xer, &ctx.fpscr);
@@ -1576,7 +1653,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 	ctx.fpscr = tmp_fpscr;
     }
 
-  status = pthdb_pthread_setcontext (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_setcontext (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: "
 	     "pthdb_pthread_setcontext returned %s"),
@@ -1602,6 +1679,9 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs  sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1613,7 +1693,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_fprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some regs may not be in the cache.  */
 	  ptrace64aix (PTT_READ_GPRS, tid, (unsigned long) gprs64, 0, NULL);
@@ -1646,7 +1726,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some registers won't be in the cache.  */
 	  ptrace64aix (PTT_READ_SPRS, tid, 
@@ -1703,7 +1783,7 @@ aix_thread_target::store_registers (struct regcache *regcache, int regno)
   struct thread_info *thread;
   pthdb_tid_t tid;
 
-  if (!PD_TID (regcache->ptid ()))
+  if (regcache->ptid ().tid () == 0)
     beneath ()->store_registers (regcache, regno);
   else
     {
@@ -1741,7 +1821,7 @@ aix_thread_target::mourn_inferior ()
 {
   target_ops *beneath = this->beneath ();
 
-  pd_deactivate ();
+  pd_disable ();
   beneath->mourn_inferior ();
 }
 
@@ -1750,7 +1830,7 @@ aix_thread_target::mourn_inferior ()
 bool
 aix_thread_target::thread_alive (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     return beneath ()->thread_alive (ptid);
 
   /* We update the thread list every time the child stops, so all
@@ -1766,7 +1846,7 @@ aix_thread_target::thread_alive (ptid_t ptid)
 std::string
 aix_thread_target::pid_to_str (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     return beneath ()->pid_to_str (ptid);
 
   return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
@@ -1786,8 +1866,11 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
   pthdb_detachstate_t detachstate;
   int cancelpend;
   static char *ret = NULL;
+  struct aix_thread_variables *data;
+
+  data = get_thread_data_helper_for_ptid (thread->ptid);
 
-  if (!PD_TID (thread->ptid))
+  if (thread->ptid.tid () == 0)
     return NULL;
 
   string_file buf;
@@ -1800,24 +1883,24 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
     /* i18n: Like "thread-identifier %d, [state] running, suspended" */
     buf.printf (_("tid %d"), (int)tid);
 
-  status = pthdb_pthread_state (pd_session, pdtid, &state);
+  status = pthdb_pthread_state (data->pd_session, pdtid, &state);
   if (status != PTHDB_SUCCESS)
     state = PST_NOTSUP;
   buf.printf (", %s", state2str (state));
 
-  status = pthdb_pthread_suspendstate (pd_session, pdtid, 
+  status = pthdb_pthread_suspendstate (data->pd_session, pdtid, 
 				       &suspendstate);
   if (status == PTHDB_SUCCESS && suspendstate == PSS_SUSPENDED)
     /* i18n: Like "Thread-Id %d, [state] running, suspended" */
     buf.printf (_(", suspended"));
 
-  status = pthdb_pthread_detachstate (pd_session, pdtid, 
+  status = pthdb_pthread_detachstate (data->pd_session, pdtid, 
 				      &detachstate);
   if (status == PTHDB_SUCCESS && detachstate == PDS_DETACHED)
     /* i18n: Like "Thread-Id %d, [state] running, detached" */
     buf.printf (_(", detached"));
 
-  pthdb_pthread_cancelpend (pd_session, pdtid, &cancelpend);
+  pthdb_pthread_cancelpend (data->pd_session, pdtid, &cancelpend);
   if (status == PTHDB_SUCCESS && cancelpend)
     /* i18n: Like "Thread-Id %d, [state] running, cancel pending" */
     buf.printf (_(", cancel pending"));
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index f483f54de13..6be81064ebd 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -618,6 +618,20 @@ solib_aix_bfd_open (const char *pathname)
       if (member_name == bfd_get_filename (object_bfd.get ()))
 	break;
 
+      std::string s = bfd_get_filename (object_bfd.get ());
+
+      /* For every inferior after first int bfd system we 
+	 will have the pathname instead of the member name
+	 registered. Hence the below condition exists.  */
+
+      if (s.find ('(') != std::string::npos)
+	{
+	  int pos = s.find ('(');
+	  int len = s.find (')') - s.find ('(');
+	  if (s.substr (pos+1, len-1) == member_name) 
+	    return object_bfd;
+	}
+
       object_bfd = gdb_bfd_openr_next_archived_file (archive_bfd.get (),
 						     object_bfd.get ());
     }
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-10 16:33                                                                     ` Aditya Kamath1
  2023-02-10 16:46                                                                       ` Aditya Kamath1
@ 2023-02-13 19:01                                                                       ` Ulrich Weigand
  2023-02-14 14:13                                                                         ` Aditya Kamath1
  1 sibling, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2023-02-13 19:01 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>Also find attached a document that I have proposed as a commit message.

Thanks for the details.  However, this is too much detail for a
commit message - you don't need to include full debug sessions
documenting how you found a bug; instead, please concisely
summarize *what* the bug *is*, and how (at a high level) the
patch fixes the bug.

To be more specific, I'd write a commit message for this patch
somewhat along these lines:


Fix multi-threaded debugging under AIX

Multi-threaded debugging using the libpthdebug debug interface
is currently broken due to multiple issues.

When debugging a single inferior, we were getting assertion
failures in get_aix_thread_info as no tp->priv structure was
allocated for the main thread.  Fix this by switching the main
thread from a (pid, 0, 0) ptid_t to a (pid, 0, tid) ptid_t and
allocating the tp->priv structure in sync_threadlists.

As a result, the switch_to_thread call in pdc_read_data could
now fail since the main thread no longer uses (pid, 0, 0).
Replace the call by only switching inferior_ptid, the current
inferior, and the current address space (like proc-service.c).
Add similar switching to pdc_write_data where it was missing
completely.

When debugging multiple inferiors, an additional set of
problems prevented correct multi-threaded debugging:

First of all, aix-thread.c used to have a number of global
variables holding per-inferior information.  Switch these
to a per-inferior data structure instead.

Also, sync_threadlists was getting confused as we were
comparing the list of threads returned by libpthdebug
for *one* process with GDB's list of threads for *all*
processes.  Only use the GDB threads of the current
inferior instead.

Finally, the presence of the thread library in any but
the first inferior was not correctly detected due to a
bug in solib-aix.c, where the BFD file name for shared
library members was changed when the library was loaded
for the first time, which caused the library to no longer
be recognized by name when loaded a second time.


(I'm not saying you need to use this exact message - maybe
I'm missing something or getting some details wrong.  But
this is the style you should be roughly going for.)
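
As an illustration of the "per-inferior data structure" point in the
message above, here is a minimal sketch.  It is not the code from the
patch: the field names follow the diff, but the types are simplified
and the registry keyed by PID (get_thread_data_for_pid,
forget_thread_data_for_pid) is a hypothetical stand-in for whatever
per-inferior storage GDB actually uses.

```cpp
#include <unordered_map>

// Hypothetical, simplified stand-in for the per-inferior state that
// replaces the former file-scope globals.  Field names follow the
// patch; the types are placeholders.
struct aix_thread_variables
{
  bool pd_able = false;    // thread debugging may become available
  bool pd_active = false;  // libpthdebug session is initialised
  int pd_session = 0;      // placeholder for the session handle
  bool arch64 = false;     // inferior is a 64-bit process
};

// Hypothetical registry: one entry per debugged process.
static std::unordered_map<int, aix_thread_variables> per_inferior_data;

// Return the state for PID, creating a default entry on first use.
aix_thread_variables *
get_thread_data_for_pid (int pid)
{
  return &per_inferior_data[pid];
}

// Drop the state for PID, for example when the inferior goes away.
void
forget_thread_data_for_pid (int pid)
{
  per_inferior_data.erase (pid);
}
```

With this shape, activating thread debugging in one inferior leaves
the flags of every other inferior untouched.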

B.t.w. the fact that the message is now getting so long is
actually an indicator that it might be preferable to break
the patch out into multiple commits - the solib change is
one obvious stand-alone patch, and maybe it would make sense
to split off the aix-thread changes also needed for single-
inferior debugging from the multi-inferior support changes.
Given that we've already spent a long time on this patch,
I'm not insisting on this change - but something to keep
in mind for future patches.


Some additional comments on your latest changes:

>However, we also need to keep in mind that before we think this will
>work, our libpthread library is only ready when the following condition
>in the wait () of aix-thread.c is satisfied.
>
>/* Check whether libpthdebug might be ready to be initialized.  */
>  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
>      && status->sig () == GDB_SIGNAL_TRAP)
>
>Until then changing the “process <pid>” to “thread <tid>” is incorrect.
>Even though the session is ready and initialised via pd_enable () and
>pd_activate () functions respectively. Therefore this made us to keep
>a variable pthdebugready in all functions that lead to sync_threadlists ()
>so that we change the process thread to a thread with private data only
>when libpthdebug is initialised for a particular process.

I do not understand this change.  The ->pd_active flag is supposed to
track exactly that information, why do we need to duplicate it into
yet another flag?   Note that the point of the "if" block above
is that *it is calling pd_activate()*, which will set ->pd_active
if the library is in fact ready to be used.  If there's anything
wrong that causes pd_active to be set when the thread library is,
in fact, not yet active, that's a bug we need to find and fix.

Also, as long as the thread library is not ready, we should not be
calling sync_threadlists in the first place.
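
A minimal sketch of the invariant described above, assuming a
simplified model: the activation step is the only place that sets the
flag, it sets it from the result of the initialisation attempt, and
the sync step refuses to run while the flag is clear.  The names
session_state, try_activate and maybe_sync are hypothetical, and
init_fn merely stands in for the real pthdb_session_init call.

```cpp
// Hypothetical per-inferior state holding the single readiness flag.
struct session_state
{
  bool pd_active = false;
};

// Stand-in for the library initialisation call; returns true on success.
using init_fn = bool (*) ();

// Try to activate thread debugging.  The flag records whether the
// initialisation succeeded, so no second "ready" flag is needed.
bool
try_activate (session_state &s, init_fn init)
{
  if (s.pd_active)
    return true;
  s.pd_active = init ();
  return s.pd_active;
}

// Stand-in for the thread-list sync: it must never run while the
// library is not active.  Returns true if a sync was performed.
bool
maybe_sync (const session_state &s, int &sync_count)
{
  if (!s.pd_active)
    return false;
  ++sync_count;
  return true;
}
```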

>So why did we make this change
>-    if (user_current_pid != 0)
>-      switch_to_thread (current_inferior ()->process_target (),
>-                       ptid_t (user_current_pid));
> in pdc_read_data and change our user variable which was the process
> ID to a thread? Wasn’t it already doing the job?

>This now will use the ptid_t (user_current_pid) to switch the thread ().
>However, our parent process or main thread of it, is threaded i.e is ptid_t
>(user_current_pid, 0, tid). Hence, we will crash with an assertion
>failure that thread ptid_t (user_current_pid) has not been found.

Ah, I see.  That makes sense.

>-static int pdc_read_data (pthdb_user_t, void *, pthdb_addr_t, size_t);
>-static int pdc_write_data (pthdb_user_t, void *, pthdb_addr_t, size_t);
>+static int pdc_read_data (thread_info*, void *, pthdb_addr_t, size_t);
>+static int pdc_write_data (thread_info*, void *, pthdb_addr_t, size_t);

These changes are also confusing.  First of all, my understanding has
been that the signature of these functions is fixed by the OS, since
they are passed as callbacks to pthdb_session_init.  This means you
cannot just go and change them arbitrarily ...

In addition, I'm not sure the change makes sense semantically.  Note
that you create one pd_session object *per inferior*, not one per
thread.  The pthdb_user_t identifies the pd_session, so it doesn't
make sense to use the thread_info pointer as pthdb_user_t, even
if that were possible from an API perspective.

What was the reason for not just continuing to use the PID here?
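
To illustrate the point about the callback signature, here is a
sketch under stated assumptions: the callback type is fixed by the
library, and the opaque user value (the PID in this thread) selects
the per-inferior session.  The types user_handle and read_callback
and the fake_memory table are hypothetical; they only mimic the shape
of a pthdb_user_t-style callback and are not the libpthdebug API.

```cpp
#include <cstddef>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical stand-in for the opaque per-session user value.
using user_handle = long;

// Fixed callback shape dictated by the (imagined) library: the first
// argument identifies the session, not an individual thread.
using read_callback = int (*) (user_handle, void *, std::size_t, std::size_t);

// Hypothetical per-process memory images, keyed by the user value.
static std::map<user_handle, std::vector<unsigned char>> fake_memory;

// Read LEN bytes at ADDR from the process identified by USER.
// Returns 0 on success and 1 on failure.
int
read_data (user_handle user, void *buf, std::size_t addr, std::size_t len)
{
  auto it = fake_memory.find (user);
  if (it == fake_memory.end ())
    return 1;
  const std::vector<unsigned char> &mem = it->second;
  if (addr > mem.size () || len > mem.size () - addr)
    return 1;
  std::memcpy (buf, mem.data () + addr, len);
  return 0;
}
```

Because the callback only ever needs to find the right process, a PID
is sufficient as the user value in this model.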

>+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>     /* Before the first inferior is added, we pass inferior_ptid.pid ()
>        from pd_enable () which is 0.  There is no need to switch threads
>        during first initialisation.  In the rest of the callbacks the
>        current thread needs to be correct.  */
>-    if (user_current_pid != 0)
>-      switch_to_thread (current_inferior ()->process_target (),
>-			ptid_t (user_current_pid));
>+    inferior_ptid = user_current_thread->ptid;

If you no longer use switch_to_thread, can't you then continue to
use ptid_t (user_current_pid)?  This is only used during the
target_read_memory call, which should go down to the process
target, which doesn't require TIDs.
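
The save-and-restore pattern under discussion can be sketched in
isolation as follows.  This is a generic RAII helper written for
illustration, not GDB's scoped_restore; current_pid is a hypothetical
stand-in for the inferior_ptid global, and read_with_pid stands in
for a memory read that consults it.

```cpp
// Generic RAII helper: remember a variable's value on construction
// and put it back on destruction, whatever happened in between.
template <typename T>
class scoped_value_restore
{
public:
  explicit scoped_value_restore (T &var) : m_var (var), m_saved (var) {}
  ~scoped_value_restore () { m_var = m_saved; }

  scoped_value_restore (const scoped_value_restore &) = delete;
  scoped_value_restore &operator= (const scoped_value_restore &) = delete;

private:
  T &m_var;
  T m_saved;
};

// Hypothetical stand-in for the global that selects the process.
static int current_pid = 0;

// Temporarily point the global at PID for the duration of one read.
int
read_with_pid (int pid)
{
  scoped_value_restore<int> restore (current_pid);
  current_pid = pid;
  // Stand-in for a read that looks at the global to pick the process.
  return current_pid;
}
```

Only the PID is switched here, which matches the observation that the
memory read goes down to the process target and needs no TID.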

>+  tp = find_thread_ptid (proc_target, ptid_t (pid));
>+
>+  /* If the pthreadlibrary is not ready to debug 
>+     then this is just a main process which needs 
>+     a priv to be set.  The if condition below does 
>+     the same.  Otherwise we go to the for loop to 
>+     sync the pthread and GDB thread lists.  */

This goes back to my question above, if the library is not yet
ready, first of all we should never even get here, and second,
all PTIDs should still be PID-only and nobody should ever look
for any aix_thread_info ...

> static ptid_t
>-pd_activate (int pid)
>+pd_activate (pid_t pid, int pthdebugready)

I don't understand this flag - the point of this function is
to *find out whether* the library is ready - either
pthdb_session_init succeeds (and thus the library is ready)
or it fails (and thus the library is not ready).

>   /* If we're debugging a core file or an attached inferior, the
>      pthread library may already have been initialized, so try to
>      activate thread debugging.  */
>-  pd_activate (inferior_ptid.pid ());
>+  pd_activate (inferior_ptid.pid (), 0);

I guess this is the point?  As the comment says, this should only
ever make any difference for core files or attaching to an
already running inferior, never for starting up an inferior under
GDB.  If this isn't correct, we need to understand why.


Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-13 19:01                                                                       ` Ulrich Weigand
@ 2023-02-14 14:13                                                                         ` Aditya Kamath1
  2023-02-16 19:46                                                                           ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2023-02-14 14:13 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 15058 bytes --]

Hi Ulrich, Tom and community,

Thank you for all the guidance in this bug fix so far.

There are two problems we still need to fix for the follow-child case before we commit, and I felt they needed a discussion before we make any further adjustments to the AIX target code. Please find attached the latest version of the patch. I have removed the thread_info parameter from the callbacks, since you pointed out that pthdb_user_t is what the OS defines.

Consider the program and the output below for the explanation.

What happens is that after a new process is born, its pthread library gets initialised and we change its ptid from ptid (pid, 0, 0) to ptid (pid, 0, tid). Since we follow the fork, the code in the inferior.c file will switch to the child thread, where the child is reported as ptid (pid, 0, 0) but exists as ptid (pid, 0, tid). This leads to the crash. We did try two variables in the previous patch, if you recall, but your point that pd_active already exists for this purpose was valid. So something is still incorrect that I have not understood. We call pd_activate () in only two places. Is follow_fork () expecting us to switch to the child process first and only then change the ptid of the child? If yes, how should we proceed? And if not, where are we going wrong?
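
The mismatch described above can be reproduced in a toy model.  The
sketch below is not GDB code: simple_ptid, find_exact and
find_any_in_process are hypothetical names, and the thread list is a
plain vector.  It only shows why an exact lookup for (pid, 0, 0)
returns nothing once the only thread of that process has been
renamed to (pid, 0, tid).

```cpp
#include <vector>

// Hypothetical, simplified (pid, lwp, tid) triple.
struct simple_ptid
{
  int pid;
  long lwp;
  unsigned long tid;

  bool operator== (const simple_ptid &other) const
  {
    return pid == other.pid && lwp == other.lwp && tid == other.tid;
  }
};

// Exact lookup: all three fields must match.
const simple_ptid *
find_exact (const std::vector<simple_ptid> &threads, simple_ptid wanted)
{
  for (const simple_ptid &t : threads)
    if (t == wanted)
      return &t;
  return nullptr;
}

// Lookup by process only: returns the first thread belonging to PID.
const simple_ptid *
find_any_in_process (const std::vector<simple_ptid> &threads, int pid)
{
  for (const simple_ptid &t : threads)
    if (t.pid == pid)
      return &t;
  return nullptr;
}
```

In this model the caller that still holds (pid, 0, 0) gets a null
result from the exact lookup, which corresponds to the
`thr != NULL' assertion in the output below.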

Also, with ptid_t (pid, 0, 0) versus our main thread being ptid_t (pid, 0, tid), we might need a smarter way to switch to the main thread's process space and set the right current inferior in pdc_read_data. Please check this in the patch and let me know if we can do it better. I have done it in a nicer way in fetch_registers (). That is why I had changed the first parameter of pdc_read_data () in the previous version of the patch.

These are the only issues blocking us from committing this change. The rest is the same as before, and the remaining cases work correctly.

--------------------------

>If you no longer use switch_to_thread, can't you then continue to
>use ptid_t (user_current_pid)?

So the reason is stated in the first paragraph of this email.

>B.t.w. the fact that the message is now getting so long is
>actually an indicator that it might be preferable to break
>the patch out into multiple commits - the solib change is
>one obvious stand-alone patch, and maybe it would make sense
>to split off the aix-thread changes also needed for single-
>inferior debugging from the multi-inferior support changes.
>Given that we've already spent a long time on this patch,
>I'm not insisting on this change - but something to keep
>in mind for future patches.

Yes, I have changed the commit message. Thank you. Sure will learn the art of splitting patches.
--------------------------

Waiting for a reply soon.

Have a nice day ahead.

With curiosity and regards,
Aditya.


-----------------------------------------
Code :-

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
  {
    child = fork ();
    if (child > 0)
      printf ("I am child \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;
  pthread_t thread[NUM_THREADS];

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      int res;

      res = pthread_create (&thread[i], NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
  }

  return 0;
}

-----------------------------------------
problematic output:-


Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set follow-fork-mode child
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New Thread 515]
[Attaching after Thread 515 fork to child process 14090746]
[New inferior 2 (process 14090746)]
thread.c:1337: internal-error: switch_to_thread: Assertion `thr != NULL' failed.
A problem internal to GDB has been detected,

-----------------------------------------
Procstack for the crash.


---------- tid# 33554919 (pthread ID:      1) ----------
0x090000000016a474  __fd_poll(??, ??, ??) + 0xb4
0x0000000100674954  poll(0x1104b6e70, 0x4, 0xffffffffffffffff) + 0x2c
0x0000000100675f5c  gdb_wait_for_event(int)(0x1ffffe9c8) + 0xd0
0x0000000100674df8  gdb_do_one_event(int)(0xffffffff1072b9d0) + 0x1d4
0x00000001000093b4  gdb_readline_wrapper(char const*)(0x11072b9d0) + 0xec
0x0000000100038e2c  defaulted_query(char const*, char, char*)(0x101742c80, 0x101ab0850, 0xfffffffffffeca8) + 0x370
0x0000000100039218  query(char const*, ...)(0x101742c80) + 0x4c
0x00000001000375b0  internal_vproblem(internal_problem*, char const*, int, char const*, char*)(0x110002540, 0x10190eb30, 0x53910166090, 0x10190ea30, 0xfffffffffffef08) + 0x378
0x0000000100037a14  internal_verror(char const*, int, char const*, char*)(0x10190eb30, 0x539ffffed10, 0x10190ea30, 0xfffffffffffef08) + 0x40
0x00000001000365b0  internal_error_loc(char const*, int, char const*, ...)(0x10190eb30, 0x53900000000, 0x10190ea30) + 0x58
0x00000001005a6ddc  switch_to_thread(thread_info*)(0x0) + 0x48
0x00000001003767cc  follow_fork()() + 0x4c8
0x0000000100385f68  handle_inferior_event(execution_control_state*)(0xffffffffffff3a8) + 0xda8
0x0000000100380e40  fetch_inferior_event()() + 0x2f8
0x0000000100a1ef04  inferior_event_handler(inferior_event_type)(0x10207830) + 0x38
0x00000001003926fc  infrun_async_inferior_event_handler(void*)(0x0) + 0x30
0x00000001006786c4  check_async_event_handlers()() + 0x94
0x0000000100674cd8  gdb_do_one_event(int)(0xfffffffffffff840) + 0xb4
0x0000000100001dcc  start_event_loop()() + 0x28
0x0000000100001fd4  captured_command_loop()() + 0x58
0x000000010000414c  captured_main(void*)(0xffffffffffffa60) + 0x2c
0x0000000100004220  gdb_main(captured_main_args*)(0xffffffffffffa60) + 0x20
0x0000000100000a9c  main(0x200000000, 0xffffffffffffb00) + 0x58

________________________________
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Sent: 14 February 2023 00:31
To: simark@simark.ca <simark@simark.ca>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>Also find attached a document that I have proposed as a commit message.

Thanks for the details.  However, this is too much detail for a
commit message - you don't need to include full debug sessions
documenting how you found a bug; instead, please concisely
summarize *what* the bug *is*, and how (at a high level) the
patch fixes the bug.

To be more specific, I'd write a commit message for this patch
somewhat along these lines:


Fix multi-threaded debugging under AIX

Multi-threaded debugging using the libpthdebug debug interface
is currently broken due to multiple issues.

When debugging a single inferior, we were getting assertion
failures in get_aix_thread_info as no tp->priv structure was
allocated for the main thread.  Fix this by switching the main
thread from a (pid, 0, 0) ptid_t to a (pid, 0, tid) ptid_t and
allocaing the tp->priv structure in sync_threadlists.

As a result, the switch_to_thread call in pdc_read_data could
now fail since the main thread no longer uses (pid, 0, 0).
Replace the call by only switching inferior_ptid, the current
inferior, and the current address space (like proc-service.c).
Add similar switching to pdc_write_data where it was missing
completely.

When debugging multiple inferiors, an additional set of
problems prevented correct multi-threaded debugging:

First of all, aix-thread.c used to have a number of global
variables holding per-inferior information.  Switch these
to a per-inferior data structure instead.

Also, sync_threadlists was getting confused as we were
comparing the list of threads returned by libpthdebug
for *one* process with GDB's list of threads for *all*
processes.  Only use the GDB threads of the current
inferior instead.

Finally, the presence of the thread library in any but
the first inferior was not correctly detected due to a
bug in solib-aix.c, where the BFD file name for shared
library members was changed when the library was loaded
for the first time, which caused the library to no longer
be recognized by name when loaded a second time,


(I'm not saying you need to use this exact message - maybe
I'm missing something or getting some details wrong.  But
this is the style you should be roughly going for.)

B.t.w. the fact that the message is now getting so long is
actually an indicator that it might be preferable to break
the patch out into multiple commits - the solib change is
one obvious stand-alone patch, and maybe it would make sense
to split off the aix-thread changes also needed for single-
inferior debugging from the multi-inferior support changes.
Given that we've already spent a long time on this patch,
I'm not insisting on this change - but something to keep
in mind for future patches.


Some additional comments on your latest changes:

>However, we also need to keep in mind that before we think this will
>work, our libpthread library is only ready when the following condition
>in the wait () of aix-thread.c is satisfied.
>
>/* Check whether libpthdebug might be ready to be initialized.  */
>  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
>      && status->sig () == GDB_SIGNAL_TRAP)
>
>Until then changing the “process <pid>” to “thread <tid>” is incorrect.
>Even though the session is ready and initalised via pd_enable () and
>pd_activate () >functions respectively. Therfore this made us to keep
>a variable pthdebugready in all functions that lead to sync_threadlists ()
>so that we change the process thread to a thread with private data only
>when libpthdebug is initialised for a particular process.

I do not understand this change.  The ->pd_active flag is supposed to
track exactly that information, why do we need to duplicate it into
yet another flag?   Note that the point of the the "if" block above
is that *it is calling pd_activacte()*, which will set ->pd_active
if the library is in fact ready to be used.  If there's anything
wrong that causes pd_active to be set when the thread library is,
in fact, not yet active, that's a bug we need to find and fix.

Also, as long as the thread library is not ready, we should not be
calling sync_threadlists in the first place.

>So why did we make this change
>-    if (user_current_pid != 0)
>-      switch_to_thread (current_inferior ()->process_target (),
>-                       ptid_t (user_current_pid));
> in pdc_read_data and change our user variable which was the process
> ID to a thread? Wasn’t it already doing the job?

>This now will use the ptid_t (user_current_pid) to switch the thread ().
>However, our parent process or main thread of it, is threaded i.e is ptid_t
>(user_current_pid, 0, tid). Hence, we will crash with an assertion
>failure that thread ptid_t (user_current_pid) has not been found.

Ah, I see.  That makes sense.

>-static int pdc_read_data (pthdb_user_t, void *, pthdb_addr_t, size_t);
>-static int pdc_write_data (pthdb_user_t, void *, pthdb_addr_t, size_t);
>+static int pdc_read_data (thread_info*, void *, pthdb_addr_t, size_t);
>+static int pdc_write_data (thread_info*, void *, pthdb_addr_t, size_t);

These changes are also confusing.  First of all, my understanding has
been that the signature of these functions is fixed by the OS, since
they are passed as callbacks to pthdb_session_init.  This means you
cannot just go and change them arbitrarily ...

In addition, I'm not sure the change makes sense semantically.  Note
that you create one pd_session object *per inferior*, not one per
thread.  The pthdb_user_t identifies the pd_session, so it doesn't
make sense to use the thread_info pointer as pthdb_user_t, even
if that were possible from an API perspective.

What was the reason for not just continuing to use the PID here?

>+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>     /* Before the first inferior is added, we pass inferior_ptid.pid ()
>        from pd_enable () which is 0.  There is no need to switch threads
>        during first initialisation.  In the rest of the callbacks the
>        current thread needs to be correct.  */
>-    if (user_current_pid != 0)
>-      switch_to_thread (current_inferior ()->process_target (),
>-                      ptid_t (user_current_pid));
>+    inferior_ptid = user_current_thread->ptid;

If you no longer use switch_to_thread, can't you then continue to
use ptid_t (user_current_pid)?  This is only used during the
target_read_memory call, which should go down to the process
target, which doesn't require TIDs.
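
As a rough illustration of the difference (a standalone sketch, not
GDB code; simple_ptid, thread_list, find_thread_exact and find_process
are hypothetical names loosely modelled on ptid_t and the thread list):
an exact lookup of the PID-only ptid fails once the main thread is
registered as (pid, 0, tid), while a process-level lookup that only
compares the PID still succeeds.

```cpp
#include <cassert>
#include <set>
#include <tuple>

// Simplified stand-in for GDB's ptid_t: (pid, lwp, tid).
struct simple_ptid
{
  int pid;
  long lwp;
  unsigned long tid;

  bool operator< (const simple_ptid &o) const
  {
    return std::tie (pid, lwp, tid) < std::tie (o.pid, o.lwp, o.tid);
  }
};

using thread_list = std::set<simple_ptid>;

// Exact-match lookup, which is what switching to a thread would need.
static bool
find_thread_exact (const thread_list &threads, simple_ptid ptid)
{
  return threads.count (ptid) != 0;
}

// Process-level lookup: any thread belonging to PID will do.
static bool
find_process (const thread_list &threads, int pid)
{
  for (const simple_ptid &p : threads)
    if (p.pid == pid)
      return true;
  return false;
}
```

If the list holds only (42, 0, 1), then find_thread_exact for
(42, 0, 0) returns false, whereas find_process for 42 returns true.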

>+  tp = find_thread_ptid (proc_target, ptid_t (pid));
>+
>+  /* If the pthreadlibrary is not ready to debug
>+     then this is just a main process which needs
>+     a priv to be set.  The if condition below does
>+     the same.  Otherwise we go to the for loop to
>+     sync the pthread and GDB thread lists.  */

This goes back to my question above, if the library is not yet
ready, first of all we should never even get here, and second,
all PTIDs should still be PID-only and nobody should ever look
for any aix_thread_info ...

> static ptid_t
>-pd_activate (int pid)
>+pd_activate (pid_t pid, int pthdebugready)

I don't understand this flag - the point of this function is
to *find out whether* the library is ready - either
pthdb_session_init succeeds (and thus the library is ready)
or it fails (and thus the library is not ready).
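
In other words, the outcome of the init call is itself the readiness
test.  A minimal standalone sketch of that pattern (thread_debug_state
and try_activate are hypothetical names, and init_session merely
stands in for pthdb_session_init, returning 0 on success):

```cpp
#include <cassert>
#include <functional>

// Hypothetical per-inferior state; only the "active" bit matters here.
struct thread_debug_state
{
  bool active = false;
};

// Try to activate thread debugging.  No "ready" flag is passed in:
// the result of INIT_SESSION decides whether the library is ready.
static bool
try_activate (thread_debug_state &state,
	      const std::function<int ()> &init_session)
{
  if (state.active)
    return true;		// already activated earlier
  if (init_session () != 0)
    return false;		// library not ready yet, stay inactive
  state.active = true;
  return true;
}
```

The caller can invoke try_activate at every candidate point (startup,
attach, thread-event breakpoint); it becomes active exactly when the
init call first succeeds.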

>   /* If we're debugging a core file or an attached inferior, the
>      pthread library may already have been initialized, so try to
>      activate thread debugging.  */
>-  pd_activate (inferior_ptid.pid ());
>+  pd_activate (inferior_ptid.pid (), 0);

I guess this is the point?  As the comment says, this should only
ever make any difference for core files or attaching to an
already running inferior, never for starting up an inferior under
GDB.  If this isn't correct, we need to understand why.


Bye,
Ulrich


[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 30371 bytes --]

From 13c7037a058a63b079c9e270bebf4e0a27ad2dd7 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Tue, 14 Feb 2023 07:45:05 -0600
Subject: [PATCH] Fix multi-threaded debugging under AIX

Multi-threaded debugging using the libpthdebug debug interface
is currently broken due to multiple issues.

When debugging a single inferior, we were getting assertion
failures in get_aix_thread_info as no tp->priv structure was
allocated for the main thread.

We fixed this by switching the main
thread from a (pid, 0, 0) ptid_t to a (pid, 0, tid) ptid_t and
allocating the tp->priv structure in sync_threadlists.

As a result, the switch_to_thread call in pdc_read_data could
now fail since the main thread no longer uses (pid, 0, 0).

So we replaced the call by only switching inferior_ptid, the current
inferior, and the current address space (like proc-service.c).
Add similar switching to pdc_write_data where it was missing
completely.

When debugging multiple inferiors, an additional set of
problems prevented correct multi-threaded debugging:

First of all, aix-thread.c used to have a number of global
variables holding per-inferior information.

We switched these
to a per-inferior data structure instead.

Also, sync_threadlists was getting confused as we were
comparing the list of threads returned by libpthdebug
for *one* process with GDB's list of threads for *all*
processes. Now we only use the GDB threads of the current
inferior instead.

Finally, the presence of the thread library in any but
the first inferior was not correctly detected due to a
bug in solib-aix.c, where the BFD file name for shared
library members was changed when the library was loaded
for the first time, which caused the library to no longer
be recognized by name when loaded a second time.
---
 gdb/aix-thread.c | 391 ++++++++++++++++++++++++++++++-----------------
 gdb/solib-aix.c  |  14 ++
 2 files changed, 263 insertions(+), 142 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..c900be53ee5 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -68,10 +68,6 @@ static bool debug_aix_thread;
 #define pthdb_tid_t	tid_t
 #endif
 
-/* Return whether to treat PID as a debuggable thread id.  */
-
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
-
 /* Success and failure values returned by pthdb callbacks.  */
 
 #define PDC_SUCCESS	PTHDB_SUCCESS
@@ -144,24 +140,6 @@ class aix_thread_target final : public target_ops
 
 static aix_thread_target aix_thread_ops;
 
-/* Address of the function that libpthread will call when libpthdebug
-   is ready to be initialized.  */
-
-static CORE_ADDR pd_brk_addr;
-
-/* Whether the current application is debuggable by pthdb.  */
-
-static int pd_able = 0;
-
-/* Whether a threaded application is being debugged.  */
-
-static int pd_active = 0;
-
-/* Whether the current architecture is 64-bit.  
-   Only valid when pd_able is true.  */
-
-static int arch64;
-
 /* Forward declarations for pthdb callbacks.  */
 
 static int pdc_symbol_addrs (pthdb_user_t, pthdb_symbol_t *, int);
@@ -191,9 +169,66 @@ static pthdb_callbacks_t pd_callbacks = {
   NULL
 };
 
-/* Current pthdb session.  */
+/* Aix variable structure.  */
+struct aix_thread_variables 
+{
+  /* Whether the current application is debuggable by pthdb.  */
+  int pd_able;
+
+  /* Whether a threaded application is being debugged.  */
+  int pd_active;
+
+  /* Current pthdb session.  */
+  pthdb_session_t pd_session;
+
+  /* Address of the function that libpthread will call when libpthdebug
+   is ready to be initialized.  */
+  CORE_ADDR pd_brk_addr;
+
+  /* Whether the current architecture is 64-bit.
+   Only valid when pd_able is true.  */
+  int arch64;
+};
+
+/* Key to our per-inferior data.  */
+static const registry<inferior>::key<aix_thread_variables>
+  aix_thread_variables_handle;
+
+/* Function to Get aix_thread_variables data.  */
+static struct aix_thread_variables*
+get_aix_thread_variables_data (struct inferior *inf)
+{
+  if (inf == NULL)
+    return NULL;
+
+  struct aix_thread_variables* data;
 
-static pthdb_session_t pd_session;
+  data = aix_thread_variables_handle.get (inf);
+  if (data == NULL)
+    data = aix_thread_variables_handle.emplace (inf);
+
+  return data;
+}
+
+/* Helper to get data for ptid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_ptid (ptid_t ptid)
+{
+  inferior *inf = find_inferior_ptid (current_inferior ()->process_target (),
+					ptid);
+  return get_aix_thread_variables_data (inf);
+}
+
+/* Helper to get data for pid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_pid (pid_t pid)
+{
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (),
+                                        pid);
+  return get_aix_thread_variables_data (inf);
+}
 
 /* Return a printable representation of pthdebug function return
    STATUS.  */
@@ -318,7 +353,7 @@ pid_to_prc (ptid_t *ptidp)
   ptid_t ptid;
 
   ptid = *ptidp;
-  if (PD_TID (ptid))
+  if (ptid.tid () != 0)
     *ptidp = ptid_t (ptid.pid ());
 }
 
@@ -389,6 +424,9 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_pid);
   
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_read_regs tid=%d flags=%s\n",
@@ -397,7 +435,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -423,7 +461,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -456,6 +494,10 @@ pdc_write_regs (pthdb_user_t user_current_pid,
      this is needed, I have implemented what I think it should do,
      however this code is untested.  */
 
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_pid);
+
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_write_regs tid=%d flags=%s\n",
 		(int) tid, hex_string (flags));
@@ -463,7 +505,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_WRITE_GPRS, tid, 
 		     (unsigned long) context->gpr, 0, NULL);
       else
@@ -479,7 +521,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  ptrace64aix (PTT_WRITE_SPRS, tid, 
 		       (unsigned long) &context->msr, 0, NULL);
@@ -499,6 +541,20 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
 	       pthdb_addr_t addr, size_t len)
 {
   int status, ret;
+  thread_info *thread = find_thread_ptid (current_inferior ()->process_target (), ptid_t (user_current_pid));
+  /* If the pthread debug library is loaded, then we need the ptid_t (pid, 0 ,tid).
+     Since the main thread in the below for loop will be in the first iteration
+     we will break.  */
+
+  if (!thread)
+  {
+    for (thread_info *tp: all_threads (current_inferior ()->process_target (),
+                                        ptid_t (user_current_pid)))
+      {
+	thread = tp; 
+	break;
+      } 
+  }
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
@@ -508,14 +564,17 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
-    if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+    inferior_ptid = ptid_t (user_current_pid);
+    scoped_restore_current_inferior restore_inferior;
+    set_current_inferior (thread->inf);
+
+    scoped_restore_current_program_space restore_current_progspace;
+    set_current_program_space (thread->inf->pspace);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -533,13 +592,37 @@ pdc_write_data (pthdb_user_t user_current_pid, void *buf,
 		pthdb_addr_t addr, size_t len)
 {
   int status, ret;
+  thread_info *thread = find_thread_ptid (current_inferior (), 
+					  ptid_t (user_current_pid));
+  /* If the pthread debug library is loaded, then we need the ptid_t (pid, 0 ,tid).
+     Since the main thread in the below for loop will be in the first iteration
+     we will break.  */ 
+  if (!thread)
+  {
+    for (thread_info *tp: all_threads (current_inferior ()->process_target (), 
+					ptid_t (user_current_pid)))
+      {
+        thread = tp;
+        break;
+      }
+  }
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"pdc_write_data (user_current_pid = %ld, buf = 0x%lx, addr = %s, len = %ld)\n",
 		user_current_pid, (long) buf, hex_string (addr), len);
 
-  status = target_write_memory (addr, (gdb_byte *) buf, len);
+  {
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
+    inferior_ptid = ptid_t (user_current_pid);
+    scoped_restore_current_inferior restore_inferior;
+    set_current_inferior (thread->inf);
+
+    scoped_restore_current_program_space restore_current_progspace;
+    set_current_program_space (thread->inf->pspace);
+    status = target_write_memory (addr, (gdb_byte *) buf, len);
+  }
+
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
 
   if (debug_aix_thread)
@@ -639,39 +722,6 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_count (struct thread_info *thread, void *countp)
-{
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
-  return 0;
-}
-
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_accum (struct thread_info *thread, void *bufp)
-{
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
-  return 0;
-}
-
 /* ptid comparison function */
 
 static int
@@ -719,7 +769,10 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      /* We also need to keep in mind Trap and interrupt or any
+         signal that needs to be handled in pd_update ().  */
+
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -741,7 +794,7 @@ get_signaled_thread (int pid)
        have difficulty with certain call patterns */
 
 static void
-sync_threadlists (int pid)
+sync_threadlists (pid_t pid) 
 {
   int cmd, status;
   int pcount, psize, pi, gcount, gi;
@@ -750,6 +803,11 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -759,11 +817,11 @@ sync_threadlists (int pid)
 
   for (cmd = PTHDB_LIST_FIRST;; cmd = PTHDB_LIST_NEXT)
     {
-      status = pthdb_pthread (pd_session, &pdtid, cmd);
+      status = pthdb_pthread (data->pd_session, &pdtid, cmd);
       if (status != PTHDB_SUCCESS || pdtid == PTHDB_INVALID_PTHREAD)
 	break;
 
-      status = pthdb_pthread_ptid (pd_session, pdtid, &pthid);
+      status = pthdb_pthread_ptid (data->pd_session, pdtid, &pthid);
       if (status != PTHDB_SUCCESS || pthid == PTHDB_INVALID_PTID)
 	continue;
 
@@ -780,7 +838,7 @@ sync_threadlists (int pid)
 
   for (pi = 0; pi < pcount; pi++)
     {
-      status = pthdb_pthread_tid (pd_session, pbuf[pi].pdtid, &tid);
+      status = pthdb_pthread_tid (data->pd_session, pbuf[pi].pdtid, &tid);
       if (status != PTHDB_SUCCESS)
 	tid = PTHDB_INVALID_TID;
       pbuf[pi].tid = tid;
@@ -790,13 +848,26 @@ sync_threadlists (int pid)
 
   /* Accumulate an array of GDB threads sorted by pid.  */
 
+  /* gcount is GDB thread count and pcount is pthreadlib thread count.  */
+
   gcount = 0;
-  iterate_over_threads (giter_count, &gcount);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    gcount++;
   g = gbuf = XNEWVEC (struct thread_info *, gcount);
-  iterate_over_threads (giter_accum, &g);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    *g++ = tp;
   qsort (gbuf, gcount, sizeof *gbuf, gcmp);
 
+  tp = find_thread_ptid (proc_target, ptid_t (pid));
+
+  /* If the pthreadlibrary is not ready to debug 
+     then this is just a main process which needs 
+     a priv to be set.  The if condition below does 
+     the same.  Otherwise we go to the for loop to 
+     sync the pthread and GDB thread lists.  */
+
   /* Apply differences between the two arrays to GDB's thread list.  */
+
   for (pi = gi = 0; pi < pcount || gi < gcount;)
     {
       if (pi == pcount)
@@ -810,8 +881,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -841,13 +910,27 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      /* This is to make the main process thread now look
+                 like a thread.  */
+
+	      if (gptid.is_pid ())
+		{
+		  thread_change_ptid (proc_target, gptid, pptid);
+		  aix_thread_info *priv = new aix_thread_info;
+                  priv->pdtid = pbuf[pi].pdtid;
+                  priv->tid = pbuf[pi].tid;
+		  tp->priv.reset (priv);
+		  gi++;
+		  pi++;
+		}
+	      else
+		{
+		  delete_thread (gbuf[gi]);
+		  gi++;
+		}
 	    }
 	  else
 	    {
-	      process_stratum_target *proc_target
-		= current_inferior ()->process_target ();
 	      thread = add_thread (proc_target, pptid);
 
 	      aix_thread_info *priv = new aix_thread_info;
@@ -881,17 +964,20 @@ iter_tid (struct thread_info *thread, void *tidp)
    return a pid-only ptid with PID.  */
 
 static ptid_t
-pd_update (int pid)
+pd_update (pid_t pid)
 {
   int status;
   ptid_t ptid;
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
+  struct aix_thread_variables *data;
+
+  data = get_thread_data_helper_for_pid (pid);
 
-  if (!pd_active)
+  if (!data->pd_active)
     return ptid_t (pid);
 
-  status = pthdb_session_update (pd_session);
+  status = pthdb_session_update (data->pd_session);
   if (status != PTHDB_SUCCESS)
     return ptid_t (pid);
 
@@ -915,34 +1001,23 @@ pd_update (int pid)
    for that thread.  Otherwise, return a ptid-only ptid using PID.  */
 
 static ptid_t
-pd_activate (int pid)
+pd_activate (pid_t pid)
 {
   int status;
-		
-  status = pthdb_session_init (pid, arch64 ? PEM_64BIT : PEM_32BIT,
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
+  
+  status = pthdb_session_init (pid, data->arch64 ? PEM_64BIT : PEM_32BIT,
 			       PTHDB_FLAG_REGS, &pd_callbacks, 
-			       &pd_session);
+			       &data->pd_session);
   if (status != PTHDB_SUCCESS)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
+  data->pd_active = 1;
   return pd_update (pid);
 }
 
-/* Undo the effects of pd_activate().  */
-
-static void
-pd_deactivate (void)
-{
-  if (!pd_active)
-    return;
-  pthdb_session_destroy (pd_session);
-  
-  pid_to_prc (&inferior_ptid);
-  pd_active = 0;
-}
-
 /* An object file has just been loaded.  Check whether the current
    application is pthreaded, and if so, prepare for thread debugging.  */
 
@@ -952,13 +1027,19 @@ pd_enable (void)
   int status;
   char *stub_name;
   struct bound_minimal_symbol ms;
+  struct aix_thread_variables *data;
+
+  if (!inferior_ptid.pid ())
+    return;
+  
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (data->pd_able)
     return;
 
   /* Check application word size.  */
-  arch64 = register_size (target_gdbarch (), 0) == 8;
+  data->arch64 = register_size (target_gdbarch (), 0) == 8;
 
   /* Check whether the application is pthreaded.  */
   stub_name = NULL;
@@ -972,13 +1053,13 @@ pd_enable (void)
   ms = lookup_minimal_symbol (stub_name, NULL, NULL);
   if (ms.minsym == NULL)
     return;
-  pd_brk_addr = ms.value_address ();
-  if (!create_thread_event_breakpoint (target_gdbarch (), pd_brk_addr))
+  data->pd_brk_addr = ms.value_address ();
+  if (!create_thread_event_breakpoint (target_gdbarch (), data->pd_brk_addr))
     return;
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  data->pd_able = 1; 
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
@@ -991,28 +1072,31 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!data->pd_able)
     return;
-  if (pd_active)
-    pd_deactivate ();
-  pd_able = 0;
+  if (!data->pd_active)
+    return;
+  pthdb_session_destroy (data->pd_session);
+ 
+  pid_to_prc (&inferior_ptid);
+  data->pd_active = 0;
+  data->pd_able = 0;
   current_inferior ()->unpush_target (&aix_thread_ops);
 }
 
 /* new_objfile observer callback.
 
    If OBJFILE is non-null, check whether a threaded application is
-   being debugged, and if so, prepare for thread debugging.
-
-   If OBJFILE is null, stop debugging threads.  */
+   being debugged, and if so, prepare for thread debugging.  */
 
 static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
     pd_enable ();
-  else
-    pd_disable ();
 }
 
 /* Attach to process specified by ARGS.  */
@@ -1042,8 +1126,11 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 {
   struct thread_info *thread;
   pthdb_tid_t tid[2];
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (ptid);
 
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     {
       scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
       
@@ -1065,7 +1152,7 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 	       ptid.lwp ());
       tid[1] = 0;
 
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_CONTINUE, tid[0], (long long) 1,
 		     gdb_signal_to_host (sig), (PTRACE_TYPE_ARG5) tid);
       else
@@ -1082,6 +1169,7 @@ ptid_t
 aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
 			 target_wait_flags options)
 {
+  struct aix_thread_variables *data;
   {
     pid_to_prc (&ptid);
 
@@ -1095,8 +1183,10 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
      pid-only ptids.  */
   gdb_assert (ptid.is_pid ());
 
+  data = get_thread_data_helper_for_ptid (ptid);
+
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
@@ -1105,7 +1195,7 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
       struct gdbarch *gdbarch = regcache->arch ();
 
       if (regcache_read_pc (regcache)
-	  - gdbarch_decr_pc_after_break (gdbarch) == pd_brk_addr)
+	  - gdbarch_decr_pc_after_break (gdbarch) == data->pd_brk_addr)
 	return pd_activate (ptid.pid ());
     }
 
@@ -1229,18 +1319,20 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
   ppc_gdbarch_tdep *tdep = gdbarch_tdep<ppc_gdbarch_tdep> (gdbarch);
   int status, i;
   pthdb_context_t ctx;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
 		"fetch_regs_user_thread %lx\n", (long) pdtid);
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: fetch_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
 
   /* General-purpose registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_gprs64 (regcache, ctx.gpr);
   else
     for (i = 0; i < ppc_num_gprs; i++)
@@ -1253,7 +1345,7 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Special registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_sprs64 (regcache, ctx.iar, ctx.msr, ctx.cr, ctx.lr, ctx.ctr,
 			     ctx.xer, ctx.fpscr);
   else
@@ -1288,18 +1380,21 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
   int i;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"fetch_regs_kernel_thread tid=%lx regno=%d arch64=%d\n",
-		(long) tid, regno, arch64);
+		(long) tid, regno, data->arch64);
 
   /* General-purpose registers.  */
   if (regno == -1
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_gprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -1331,7 +1426,7 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -1362,12 +1457,16 @@ aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
 {
   struct thread_info *thread;
   pthdb_tid_t tid;
+  thread = find_thread_ptid (current_inferior ()->process_target (), ptid_t (regcache->ptid ().pid (), 0, regcache->ptid ().tid ()));
+
+  /* If a new inferior is born, then its pthread debug library is yet to
+     be initialised and hence has no private data.  So the below if
+     condition exists.  */
 
-  if (!PD_TID (regcache->ptid ()))
+  if (regcache->ptid ().tid () == 0)
     beneath ()->fetch_registers (regcache, regno);
   else
     {
-      thread = find_thread_ptid (current_inferior (), regcache->ptid ());
       aix_thread_info *priv = get_aix_thread_info (thread);
       tid = priv->tid;
 
@@ -1511,6 +1610,8 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   pthdb_context_t ctx;
   uint32_t int32;
   uint64_t int64;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1518,7 +1619,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Retrieve the thread's current context for its non-register
      values.  */
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1528,7 +1629,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   for (i = 0; i < ppc_num_gprs; i++)
     if (REG_VALID == regcache->get_register_status (tdep->ppc_gp0_regnum + i))
       {
-	if (arch64)
+	if (data->arch64)
 	  {
 	    regcache->raw_collect (tdep->ppc_gp0_regnum + i, (void *) &int64);
 	    ctx.gpr[i] = int64;
@@ -1545,7 +1646,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
     fill_fprs (regcache, ctx.fpr);
 
   /* Special registers (always kept in ctx as 64 bits).  */
-  if (arch64)
+  if (data->arch64)
     {
       fill_sprs64 (regcache, &ctx.iar, &ctx.msr, &ctx.cr, &ctx.lr, &ctx.ctr,
 			     &ctx.xer, &ctx.fpscr);
@@ -1576,7 +1677,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 	ctx.fpscr = tmp_fpscr;
     }
 
-  status = pthdb_pthread_setcontext (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_setcontext (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: "
 	     "pthdb_pthread_setcontext returned %s"),
@@ -1602,6 +1703,9 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs  sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1613,7 +1717,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_fprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some regs may not be in the cache.  */
 	  ptrace64aix (PTT_READ_GPRS, tid, (unsigned long) gprs64, 0, NULL);
@@ -1646,7 +1750,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some registers won't be in the cache.  */
 	  ptrace64aix (PTT_READ_SPRS, tid, 
@@ -1703,7 +1807,7 @@ aix_thread_target::store_registers (struct regcache *regcache, int regno)
   struct thread_info *thread;
   pthdb_tid_t tid;
 
-  if (!PD_TID (regcache->ptid ()))
+  if (regcache->ptid ().tid () == 0)
     beneath ()->store_registers (regcache, regno);
   else
     {
@@ -1741,7 +1845,7 @@ aix_thread_target::mourn_inferior ()
 {
   target_ops *beneath = this->beneath ();
 
-  pd_deactivate ();
+  pd_disable ();
   beneath->mourn_inferior ();
 }
 
@@ -1750,7 +1854,7 @@ aix_thread_target::mourn_inferior ()
 bool
 aix_thread_target::thread_alive (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     return beneath ()->thread_alive (ptid);
 
   /* We update the thread list every time the child stops, so all
@@ -1766,7 +1870,7 @@ aix_thread_target::thread_alive (ptid_t ptid)
 std::string
 aix_thread_target::pid_to_str (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     return beneath ()->pid_to_str (ptid);
 
   return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
@@ -1786,8 +1890,11 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
   pthdb_detachstate_t detachstate;
   int cancelpend;
   static char *ret = NULL;
+  struct aix_thread_variables *data;
+
+  data = get_thread_data_helper_for_ptid (thread->ptid);
 
-  if (!PD_TID (thread->ptid))
+  if (thread->ptid.tid () == 0)
     return NULL;
 
   string_file buf;
@@ -1800,24 +1907,24 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
     /* i18n: Like "thread-identifier %d, [state] running, suspended" */
     buf.printf (_("tid %d"), (int)tid);
 
-  status = pthdb_pthread_state (pd_session, pdtid, &state);
+  status = pthdb_pthread_state (data->pd_session, pdtid, &state);
   if (status != PTHDB_SUCCESS)
     state = PST_NOTSUP;
   buf.printf (", %s", state2str (state));
 
-  status = pthdb_pthread_suspendstate (pd_session, pdtid, 
+  status = pthdb_pthread_suspendstate (data->pd_session, pdtid, 
 				       &suspendstate);
   if (status == PTHDB_SUCCESS && suspendstate == PSS_SUSPENDED)
     /* i18n: Like "Thread-Id %d, [state] running, suspended" */
     buf.printf (_(", suspended"));
 
-  status = pthdb_pthread_detachstate (pd_session, pdtid, 
+  status = pthdb_pthread_detachstate (data->pd_session, pdtid, 
 				      &detachstate);
   if (status == PTHDB_SUCCESS && detachstate == PDS_DETACHED)
     /* i18n: Like "Thread-Id %d, [state] running, detached" */
     buf.printf (_(", detached"));
 
-  pthdb_pthread_cancelpend (pd_session, pdtid, &cancelpend);
+  pthdb_pthread_cancelpend (data->pd_session, pdtid, &cancelpend);
   if (status == PTHDB_SUCCESS && cancelpend)
     /* i18n: Like "Thread-Id %d, [state] running, cancel pending" */
     buf.printf (_(", cancel pending"));
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index f483f54de13..6be81064ebd 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -618,6 +618,20 @@ solib_aix_bfd_open (const char *pathname)
       if (member_name == bfd_get_filename (object_bfd.get ()))
 	break;
 
+      std::string s = bfd_get_filename (object_bfd.get ());
+
+      /* For every inferior after the first, the BFD system will
+	 have the pathname instead of the member name registered.
+	 Hence the below condition exists.  */
+
+      if (s.find ('(') != std::string::npos)
+	{
+	  int pos = s.find ('(');
+	  int len = s.find (')') - s.find ('(');
+	  if (s.substr (pos+1, len-1) == member_name) 
+	    return object_bfd;
+	}
+
       object_bfd = gdb_bfd_openr_next_archived_file (archive_bfd.get (),
 						     object_bfd.get ());
     }
-- 
2.31.1
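
A side note on the solib-aix.c hunk above: the added check pulls the member name out of a BFD file name of the form archive(member). A minimal standalone sketch of that string handling follows. It assumes a hypothetical helper archive_member_name, which is not a GDB function, and it uses size_type rather than int for the positions.

```cpp
#include <string>

// Hypothetical helper for illustration only; it is not part of GDB.
// Given a name of the form "archive(member)", return the text between
// the first '(' and the next ')'.  Return an empty string when the
// name does not have that shape.
static std::string
archive_member_name (const std::string &filename)
{
  std::string::size_type open = filename.find ('(');
  if (open == std::string::npos)
    return "";

  std::string::size_type close = filename.find (')', open + 1);
  if (close == std::string::npos)
    return "";

  return filename.substr (open + 1, close - open - 1);
}
```

The result can then be compared against member_name in the same way as the substr expression in the hunk. The extra check for a missing ')' is an addition of this sketch and is not in the patch.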


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-14 14:13                                                                         ` Aditya Kamath1
@ 2023-02-16 19:46                                                                           ` Ulrich Weigand
  2023-02-17 11:26                                                                             ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2023-02-16 19:46 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>So what is happening is that the when after a new process is born,
>its pthread library is getting intialised and we have changed its
>ptid from ptid (pid, 0, 0) to ptid (pid, 0, tid). Since we follow
>fork the code in inferior.c file will switch to the thread child
>where the child is reported as ptid (pid, 0, 0) but exists as
>ptid (pid, 0, tid). This leads to this crash. We did try with two
>variables if you recall in the previous patch. But your point of
>pd_active being there for it was valid. So somehow something isn't
>correct that I did not understand. We have pd_activate () in only
>two places. So is follow_fork () is expecting us to switch to child
>process and then change the ptid of the child?? If yes, how do we go??
>And if not where are we going wrong here. 

Not sure if I follow you exactly here, but my understanding is
indeed that follow_fork should initially create an inferior for
the new process with just a single main thread using (pid, 0, 0).
Subsequently, aix-thread should detect that the new inferior
uses pthreads and then switch its ptid to (pid, 0, tid).

Not sure where exactly this goes wrong for you.  What is the
path leading to this crash you're observing?


>Also this ptid_t (pid, 0, 0) and our main thread being
>ptid_t (pid, 0, tid) might need a smarted way to switch to the
>main thread's process space and set the right current inferior
>process in pdc_read_memory. Kindly check it in this patch and
>let me know if we can do it better. 

So you currently do:

+  thread_info *thread = find_thread_ptid (current_inferior (),
+                                         ptid_t (user_current_pid));
+  /* If the pthread debug library is loaded, then we need the ptid_t (pid, 0 ,tid).
+     Since the main thread in the below for loop will be in the first iteration
+     we will break.  */
+  if (!thread)
+  {
+    for (thread_info *tp: all_threads (current_inferior ()->process_target (),
+                                       ptid_t (user_current_pid)))
+      {
+        thread = tp;
+        break;
+      }
+  }
[...]
+  {
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
+    inferior_ptid = ptid_t (user_current_pid);
+    scoped_restore_current_inferior restore_inferior;
+    set_current_inferior (thread->inf);
+
+    scoped_restore_current_program_space restore_current_progspace;
+    set_current_program_space (thread->inf->pspace);
+    status = target_write_memory (addr, (gdb_byte *) buf, len);
+  }

This is overkill.  Note that at no point do you actually ever
use "thread" itself, only "thread->inf".  But you can determine
the inferior associated with a PID directly via find_inferior_pid
without ever involving any thread.
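
To illustrate the point in isolation, here is a toy sketch. It is not GDB code: toy_inferior and toy_find_inferior_pid are hypothetical stand-ins. It only shows that the owning inferior can be located from the PID alone, so the lookup does not depend on whether the main thread is currently labelled (pid, 0, 0) or (pid, 0, tid).

```cpp
#include <vector>

// Toy stand-ins for illustration only; these are not GDB's types.
struct toy_inferior
{
  int pid;
  int pspace_id;   // Stands in for the inferior's program space.
};

// Hypothetical lookup: find the inferior that owns PID by scanning the
// inferior list directly.  No thread, and hence no ptid, is involved.
static toy_inferior *
toy_find_inferior_pid (std::vector<toy_inferior> &inferiors, int pid)
{
  for (toy_inferior &inf : inferiors)
    if (inf.pid == pid)
      return &inf;
  return nullptr;
}
```

A caller still has to handle the nullptr case before touching the result.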

Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-16 19:46                                                                           ` Ulrich Weigand
@ 2023-02-17 11:26                                                                             ` Aditya Kamath1
  2023-02-17 12:04                                                                               ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2023-02-17 11:26 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 9181 bytes --]

Hi Ulrich, Tom and community.

Please find attached the patch. {See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch}

>Not sure if I follow you exactly here, but my understanding is
>indeed that follow_fork should initially create an inferior for
>the new process with just a single main thread using (pid, 0, 0).
>Subsequently, aix-thread should detect that the new inferior
>uses pthreads and then switch its ptid to (pid, 0, tid).

>Not sure where exactly this goes wrong for you.  What is the
>path leading to this crash you're observing?

Let me explain this using the example code pasted at the end of this email. Consider the case where we follow the child {the only case in which the previous version of the patch was failing}.

---------------------------------------------------------------------
>(gdb) set follow-fork-mode child
>(gdb) set detach-on-fork off
>(gdb) r
>Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
>[New Thread 258]
>[New Thread 515]
>[Attaching after Thread 515 fork to child process 10748216]
>[New inferior 2 (process 10748216)]
---------------------------------------------------------------------

Up to this point, GDB knows via wait () in rs6000-aix-nat.c that a fork event occurred and that the child process is <10748216, 0, 0>. In the GDB core, the child is <pid, 0, 0>.

GDB has three things to do here:
** Handle the new object file notification for the new process.
** Handle the new inferior notification, since a new inferior was created {kindly see initialize_aix_thread () in aix-thread.c, which arranges for aix_thread_inferior_created () to be called}.
** Handle the follow_fork () event, since GDB must decide whether it has to follow the child.
Note:- At this point our child process is still <pid, 0, 0>.

These three things are executed from the main GDB event loop in exactly this order.

So, the first task gets done: the new object file notifier calls pd_enable (), which calls pd_activate (), which in turn reaches sync_threadlists (), where we change the child process to <pid, 0, tid> if the libpthdebug session is initialised successfully.

For the second task, the new inferior notifier calls aix_thread_inferior_created (), which goes through pd_enable () and pd_activate () to sync_threadlists ().

Now we have changed the child process from <pid, 0, 0> to <pid, 0, tid>.

But no one has informed the GDB core about this change. It still believes that the child process is <pid, 0, 0>, based on the set_forked () status it got from the rs6000-aix-nat wait () when the fork () event was detected. So when it handles the third task, which is to follow the child process as we requested, the child ptid passed to switch_to_thread is <pid, 0, 0>. But this thread no longer exists, since the first two events changed it. GDB is surprised not to find it and cannot debug further, offering to dump core with the assertion below. {Kindly see the backtrace I pasted in the previous email to verify this.}

-----------------------------------------------------------------------
>thread.c:1337: internal-error: switch_to_thread: Assertion `thr != NULL' failed.
>A problem internal to GDB has been detected,
>further debugging may prove unreliable.
----------------------------------------------------------------------------
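
The sequence above can be reproduced in a small toy model. It is only a sketch: toy_ptid and toy_thread_table are hypothetical names, not GDB's thread list. It shows why a lookup with the ptid reported at fork time fails once the thread has been re-keyed to (pid, 0, tid).

```cpp
#include <map>
#include <tuple>

// Toy model for illustration only; none of these names are GDB's.
// A ptid is modelled as the triple (pid, lwp, tid).
using toy_ptid = std::tuple<int, long, long>;

struct toy_thread_table
{
  // Maps each known ptid to a GDB-style thread number.
  std::map<toy_ptid, int> threads;

  // Return a pointer to the thread number for PTID, or nullptr if no
  // thread with exactly that ptid is registered.
  const int *
  find (const toy_ptid &ptid) const
  {
    auto it = threads.find (ptid);
    return it == threads.end () ? nullptr : &it->second;
  }

  // Re-key an existing thread from OLD_PTID to NEW_PTID.
  void
  change_ptid (const toy_ptid &old_ptid, const toy_ptid &new_ptid)
  {
    auto it = threads.find (old_ptid);
    if (it == threads.end ())
      return;
    int num = it->second;
    threads.erase (it);
    threads[new_ptid] = num;
  }
};
```

In this model, registering (10748216, 0, 0), calling change_ptid to (10748216, 0, 1), and then calling find with the original triple returns nullptr, which is the situation that trips the assertion in switch_to_thread.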

The crux is when we decide to set pd_active, that is, to mark thread debugging as active. We do it in two places: in wait () and in pd_enable () in aix-thread.c. We need to do it at the right point, which we currently do not. If any non-target function such as follow_fork still needs to use ptid_t (pid, 0, 0), then we must not change the thread ptid yet, even though the library is initialised. This is where we are going wrong.

Let me also describe what I tried today without success. I removed pd_activate () from pd_enable () and called pd_activate () only from wait (), for all events {currently we do so only for a trap}. The problem is that with many inferiors, a couple of them may have no private data set, because pd_enable () no longer sets it via pd_activate (). If we then iterate_over_threads () over these inferiors at any point, we crash, since their threads don’t have private data yet.

So this is a tricky problem. I wonder how the Linux folks handled it. If anyone knows, it would help if you could share it, even at a high level; I could not understand how they handle the same situation.


>But you can determine
>the inferior associated with a PID directly via find_inferior_pid
>without ever involving any thread.

This is done in this patch. Thank you Ulrich.

Let me know what you think. Awaiting your reply.

Have a nice day ahead.

Thanks and regards,
Aditya.

-------------------------------------------------------------------
Code :-

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */

  pthread_barrier_wait (&barrier);
  pid_t child;

  child = fork ();
  if (child > 0)
    printf ("I am parent \n");
  else
  {
    child = fork ();
    if (child > 0)
      printf ("I am child \n");
    else
      printf ("I am grandchild \n");
  }
  while (1); /* break here */
}

int
main (void)
{
  int i;
  pthread_t thread[NUM_THREADS];

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      int res;

      res = pthread_create (&thread[i], NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
  {
    sleep (15);
  }

  return 0;
}







[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 29654 bytes --]

From b469b0bc402666a7c06bb13b693185980c992f55 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Fri, 17 Feb 2023 03:16:34 -0600
Subject: [PATCH] Fix multi-threaded debugging under AIX

Multi-threaded debugging using the libpthdebug debug interface
is currently broken due to multiple issues.

When debugging a single inferior, we were getting assertion
failures in get_aix_thread_info as no tp->priv structure was
allocated for the main thread.

We fixed this by switching the main thread from a (pid, 0, 0)
ptid_t to a (pid, 0, tid) ptid_t and allocating the tp->priv
structure in sync_threadlists.

As a result, the switch_to_thread call in pdc_read_data could
now fail since the main thread no longer uses (pid, 0, 0).

So we replaced the call by only switching inferior_ptid, the current
inferior, and the current address space (like proc-service.c).
Add similar switching to pdc_write_data where it was missing
completely.

When debugging multiple inferiors, an additional set of
problems prevented correct multi-threaded debugging:

First of all, aix-thread.c used to have a number of global
variables holding per-inferior information.

We switched these to a per-inferior data structure instead.

Also, sync_threadlists was getting confused as we were
comparing the list of threads returned by libpthdebug
for *one* process with GDB's list of threads for *all*
processes. Now we only use the GDB threads of the current
inferior instead.

Finally, the presence of the thread library in any but
the first inferior was not correctly detected due to a
bug in solib-aix.c, where the BFD file name for shared
library members was changed when the library was loaded
for the first time, which caused the library to no longer
be recognized by name when loaded a second time.
---
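
Note (not part of the commit message): the move from file-scope globals to per-inferior state can be pictured with the toy model below. It is only a sketch. toy_thread_variables, toy_per_inferior_data and toy_get_variables are hypothetical names, and a std::map keyed by an inferior number stands in for GDB's registry key. The lookup creates a zero-initialised entry on first use, similar in spirit to the get-then-emplace pattern in get_aix_thread_variables_data.

```cpp
#include <map>

// Toy model for illustration only; this is not GDB's registry API.
struct toy_thread_variables
{
  int pd_able = 0;     // Whether the application is debuggable by pthdb.
  int pd_active = 0;   // Whether thread debugging is active.
  int arch64 = 0;      // Whether the inferior is 64-bit.
};

// Hypothetical per-inferior store, keyed by an inferior number.
static std::map<int, toy_thread_variables> toy_per_inferior_data;

// Return the state for INF_NUM, creating a zero-initialised entry on
// first use.  Pointers into a std::map stay valid across insertions.
static toy_thread_variables *
toy_get_variables (int inf_num)
{
  return &toy_per_inferior_data[inf_num];
}
```

With this layout, setting pd_active for one inferior leaves every other inferior's flag untouched, which is what the global variables could not provide.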
 gdb/aix-thread.c | 369 +++++++++++++++++++++++++++++------------------
 gdb/solib-aix.c  |  14 ++
 2 files changed, 240 insertions(+), 143 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..c6153e1f71e 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -68,10 +68,6 @@ static bool debug_aix_thread;
 #define pthdb_tid_t	tid_t
 #endif
 
-/* Return whether to treat PID as a debuggable thread id.  */
-
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
-
 /* Success and failure values returned by pthdb callbacks.  */
 
 #define PDC_SUCCESS	PTHDB_SUCCESS
@@ -144,24 +140,6 @@ class aix_thread_target final : public target_ops
 
 static aix_thread_target aix_thread_ops;
 
-/* Address of the function that libpthread will call when libpthdebug
-   is ready to be initialized.  */
-
-static CORE_ADDR pd_brk_addr;
-
-/* Whether the current application is debuggable by pthdb.  */
-
-static int pd_able = 0;
-
-/* Whether a threaded application is being debugged.  */
-
-static int pd_active = 0;
-
-/* Whether the current architecture is 64-bit.  
-   Only valid when pd_able is true.  */
-
-static int arch64;
-
 /* Forward declarations for pthdb callbacks.  */
 
 static int pdc_symbol_addrs (pthdb_user_t, pthdb_symbol_t *, int);
@@ -191,9 +169,66 @@ static pthdb_callbacks_t pd_callbacks = {
   NULL
 };
 
-/* Current pthdb session.  */
+/* Aix variable structure.  */
+struct aix_thread_variables 
+{
+  /* Whether the current application is debuggable by pthdb.  */
+  int pd_able;
+
+  /* Whether a threaded application is being debugged.  */
+  int pd_active;
+
+  /* Current pthdb session.  */
+  pthdb_session_t pd_session;
+
+  /* Address of the function that libpthread will call when libpthdebug
+   is ready to be initialized.  */
+  CORE_ADDR pd_brk_addr;
+
+  /* Whether the current architecture is 64-bit.
+   Only valid when pd_able is true.  */
+  int arch64;
+};
+
+/* Key to our per-inferior data.  */
+static const registry<inferior>::key<aix_thread_variables>
+  aix_thread_variables_handle;
+
+/* Function to Get aix_thread_variables data.  */
+static struct aix_thread_variables*
+get_aix_thread_variables_data (struct inferior *inf)
+{
+  if (inf == NULL)
+    return NULL;
 
-static pthdb_session_t pd_session;
+  struct aix_thread_variables* data;
+
+  data = aix_thread_variables_handle.get (inf);
+  if (data == NULL)
+    data = aix_thread_variables_handle.emplace (inf);
+
+  return data;
+}
+
+/* Helper to get data for ptid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_ptid (ptid_t ptid)
+{
+  inferior *inf = find_inferior_ptid (current_inferior ()->process_target (),
+					ptid);
+  return get_aix_thread_variables_data (inf);
+}
+
+/* Helper to get data for pid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_pid (pid_t pid)
+{
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (),
+                                        pid);
+  return get_aix_thread_variables_data (inf);
+}
 
 /* Return a printable representation of pthdebug function return
    STATUS.  */
@@ -318,7 +353,7 @@ pid_to_prc (ptid_t *ptidp)
   ptid_t ptid;
 
   ptid = *ptidp;
-  if (PD_TID (ptid))
+  if (ptid.tid () != 0)
     *ptidp = ptid_t (ptid.pid ());
 }
 
@@ -389,6 +424,9 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_pid);
   
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_read_regs tid=%d flags=%s\n",
@@ -397,7 +435,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -423,7 +461,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -456,6 +494,10 @@ pdc_write_regs (pthdb_user_t user_current_pid,
      this is needed, I have implemented what I think it should do,
      however this code is untested.  */
 
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_pid);
+
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_write_regs tid=%d flags=%s\n",
 		(int) tid, hex_string (flags));
@@ -463,7 +505,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_WRITE_GPRS, tid, 
 		     (unsigned long) context->gpr, 0, NULL);
       else
@@ -479,7 +521,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  ptrace64aix (PTT_WRITE_SPRS, tid, 
 		       (unsigned long) &context->msr, 0, NULL);
@@ -499,7 +541,9 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
 	       pthdb_addr_t addr, size_t len)
 {
   int status, ret;
-
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (), 
+				     user_current_pid);
+  
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"pdc_read_data (user_current_pid = %ld, buf = 0x%lx, addr = %s, len = %ld)\n",
@@ -508,14 +552,17 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
-    if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+    inferior_ptid = ptid_t (user_current_pid);
+    scoped_restore_current_inferior restore_inferior;
+    set_current_inferior (inf);
+
+    scoped_restore_current_program_space restore_current_progspace;
+    set_current_program_space (inf->pspace);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -533,13 +580,25 @@ pdc_write_data (pthdb_user_t user_current_pid, void *buf,
 		pthdb_addr_t addr, size_t len)
 {
   int status, ret;
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (), 
+                                     user_current_pid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"pdc_write_data (user_current_pid = %ld, buf = 0x%lx, addr = %s, len = %ld)\n",
 		user_current_pid, (long) buf, hex_string (addr), len);
 
-  status = target_write_memory (addr, (gdb_byte *) buf, len);
+  {
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
+    inferior_ptid = ptid_t (user_current_pid);
+    scoped_restore_current_inferior restore_inferior;
+    set_current_inferior (inf);
+
+    scoped_restore_current_program_space restore_current_progspace;
+    set_current_program_space (inf->pspace);
+    status = target_write_memory (addr, (gdb_byte *) buf, len);
+  }
+
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
 
   if (debug_aix_thread)
@@ -639,39 +698,6 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_count (struct thread_info *thread, void *countp)
-{
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
-  return 0;
-}
-
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_accum (struct thread_info *thread, void *bufp)
-{
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
-  return 0;
-}
-
 /* ptid comparison function */
 
 static int
@@ -719,7 +745,10 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      /* We also need to keep in mind Trap and interrupt or any
+         signal that needs to be handled in pd_update ().  */
+
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -741,7 +770,7 @@ get_signaled_thread (int pid)
        have difficulty with certain call patterns */
 
 static void
-sync_threadlists (int pid)
+sync_threadlists (pid_t pid) 
 {
   int cmd, status;
   int pcount, psize, pi, gcount, gi;
@@ -750,6 +779,11 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -759,11 +793,11 @@ sync_threadlists (int pid)
 
   for (cmd = PTHDB_LIST_FIRST;; cmd = PTHDB_LIST_NEXT)
     {
-      status = pthdb_pthread (pd_session, &pdtid, cmd);
+      status = pthdb_pthread (data->pd_session, &pdtid, cmd);
       if (status != PTHDB_SUCCESS || pdtid == PTHDB_INVALID_PTHREAD)
 	break;
 
-      status = pthdb_pthread_ptid (pd_session, pdtid, &pthid);
+      status = pthdb_pthread_ptid (data->pd_session, pdtid, &pthid);
       if (status != PTHDB_SUCCESS || pthid == PTHDB_INVALID_PTID)
 	continue;
 
@@ -780,7 +814,7 @@ sync_threadlists (int pid)
 
   for (pi = 0; pi < pcount; pi++)
     {
-      status = pthdb_pthread_tid (pd_session, pbuf[pi].pdtid, &tid);
+      status = pthdb_pthread_tid (data->pd_session, pbuf[pi].pdtid, &tid);
       if (status != PTHDB_SUCCESS)
 	tid = PTHDB_INVALID_TID;
       pbuf[pi].tid = tid;
@@ -790,13 +824,26 @@ sync_threadlists (int pid)
 
   /* Accumulate an array of GDB threads sorted by pid.  */
 
+  /* gcount is GDB thread count and pcount is pthreadlib thread count.  */
+
   gcount = 0;
-  iterate_over_threads (giter_count, &gcount);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    gcount++;
   g = gbuf = XNEWVEC (struct thread_info *, gcount);
-  iterate_over_threads (giter_accum, &g);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    *g++ = tp;
   qsort (gbuf, gcount, sizeof *gbuf, gcmp);
 
+  tp = find_thread_ptid (proc_target, ptid_t (pid));
+
+  /* If the pthreadlibrary is not ready to debug 
+     then this is just a main process which needs 
+     a priv to be set.  The if condition below does 
+     the same.  Otherwise we go to the for loop to 
+     sync the pthread and GDB thread lists.  */
+
   /* Apply differences between the two arrays to GDB's thread list.  */
+
   for (pi = gi = 0; pi < pcount || gi < gcount;)
     {
       if (pi == pcount)
@@ -810,8 +857,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -841,13 +886,27 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      /* This is to make the main process thread now look
+                 like a thread.  */
+
+	      if (gptid.is_pid ())
+		{
+		  thread_change_ptid (proc_target, gptid, pptid);
+		  aix_thread_info *priv = new aix_thread_info;
+                  priv->pdtid = pbuf[pi].pdtid;
+                  priv->tid = pbuf[pi].tid;
+		  tp->priv.reset (priv);
+		  gi++;
+		  pi++;
+		}
+	      else
+		{
+		  delete_thread (gbuf[gi]);
+		  gi++;
+		}
 	    }
 	  else
 	    {
-	      process_stratum_target *proc_target
-		= current_inferior ()->process_target ();
 	      thread = add_thread (proc_target, pptid);
 
 	      aix_thread_info *priv = new aix_thread_info;
@@ -881,17 +940,20 @@ iter_tid (struct thread_info *thread, void *tidp)
    return a pid-only ptid with PID.  */
 
 static ptid_t
-pd_update (int pid)
+pd_update (pid_t pid)
 {
   int status;
   ptid_t ptid;
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
+  struct aix_thread_variables *data;
+
+  data = get_thread_data_helper_for_pid (pid);
 
-  if (!pd_active)
+  if (!data->pd_active)
     return ptid_t (pid);
 
-  status = pthdb_session_update (pd_session);
+  status = pthdb_session_update (data->pd_session);
   if (status != PTHDB_SUCCESS)
     return ptid_t (pid);
 
@@ -915,34 +977,23 @@ pd_update (int pid)
    for that thread.  Otherwise, return a ptid-only ptid using PID.  */
 
 static ptid_t
-pd_activate (int pid)
+pd_activate (pid_t pid)
 {
   int status;
-		
-  status = pthdb_session_init (pid, arch64 ? PEM_64BIT : PEM_32BIT,
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
+  
+  status = pthdb_session_init (pid, data->arch64 ? PEM_64BIT : PEM_32BIT,
 			       PTHDB_FLAG_REGS, &pd_callbacks, 
-			       &pd_session);
+			       &data->pd_session);
   if (status != PTHDB_SUCCESS)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
+  data->pd_active = 1;
   return pd_update (pid);
 }
 
-/* Undo the effects of pd_activate().  */
-
-static void
-pd_deactivate (void)
-{
-  if (!pd_active)
-    return;
-  pthdb_session_destroy (pd_session);
-  
-  pid_to_prc (&inferior_ptid);
-  pd_active = 0;
-}
-
 /* An object file has just been loaded.  Check whether the current
    application is pthreaded, and if so, prepare for thread debugging.  */
 
@@ -952,13 +1003,19 @@ pd_enable (void)
   int status;
   char *stub_name;
   struct bound_minimal_symbol ms;
+  struct aix_thread_variables *data;
+
+  if (!inferior_ptid.pid ())
+    return;
+  
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (data->pd_able)
     return;
 
   /* Check application word size.  */
-  arch64 = register_size (target_gdbarch (), 0) == 8;
+  data->arch64 = register_size (target_gdbarch (), 0) == 8;
 
   /* Check whether the application is pthreaded.  */
   stub_name = NULL;
@@ -972,13 +1029,13 @@ pd_enable (void)
   ms = lookup_minimal_symbol (stub_name, NULL, NULL);
   if (ms.minsym == NULL)
     return;
-  pd_brk_addr = ms.value_address ();
-  if (!create_thread_event_breakpoint (target_gdbarch (), pd_brk_addr))
+  data->pd_brk_addr = ms.value_address ();
+  if (!create_thread_event_breakpoint (target_gdbarch (), data->pd_brk_addr))
     return;
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  data->pd_able = 1; 
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
@@ -991,28 +1048,31 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!data->pd_able)
+    return;
+  if (!data->pd_active)
     return;
-  if (pd_active)
-    pd_deactivate ();
-  pd_able = 0;
+  pthdb_session_destroy (data->pd_session);
+ 
+  pid_to_prc (&inferior_ptid);
+  data->pd_active = 0;
+  data->pd_able = 0;
   current_inferior ()->unpush_target (&aix_thread_ops);
 }
 
 /* new_objfile observer callback.
 
    If OBJFILE is non-null, check whether a threaded application is
-   being debugged, and if so, prepare for thread debugging.
-
-   If OBJFILE is null, stop debugging threads.  */
+   being debugged, and if so, prepare for thread debugging.  */
 
 static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
     pd_enable ();
-  else
-    pd_disable ();
 }
 
 /* Attach to process specified by ARGS.  */
@@ -1042,8 +1102,11 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 {
   struct thread_info *thread;
   pthdb_tid_t tid[2];
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (ptid);
 
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     {
       scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
       
@@ -1065,7 +1128,7 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 	       ptid.lwp ());
       tid[1] = 0;
 
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_CONTINUE, tid[0], (long long) 1,
 		     gdb_signal_to_host (sig), (PTRACE_TYPE_ARG5) tid);
       else
@@ -1082,6 +1145,7 @@ ptid_t
 aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
 			 target_wait_flags options)
 {
+  struct aix_thread_variables *data;
   {
     pid_to_prc (&ptid);
 
@@ -1095,8 +1159,10 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
      pid-only ptids.  */
   gdb_assert (ptid.is_pid ());
 
+  data = get_thread_data_helper_for_ptid (ptid);
+
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
@@ -1105,7 +1171,7 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
       struct gdbarch *gdbarch = regcache->arch ();
 
       if (regcache_read_pc (regcache)
-	  - gdbarch_decr_pc_after_break (gdbarch) == pd_brk_addr)
+	  - gdbarch_decr_pc_after_break (gdbarch) == data->pd_brk_addr)
 	return pd_activate (ptid.pid ());
     }
 
@@ -1229,18 +1295,20 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
   ppc_gdbarch_tdep *tdep = gdbarch_tdep<ppc_gdbarch_tdep> (gdbarch);
   int status, i;
   pthdb_context_t ctx;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
 		"fetch_regs_user_thread %lx\n", (long) pdtid);
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: fetch_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
 
   /* General-purpose registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_gprs64 (regcache, ctx.gpr);
   else
     for (i = 0; i < ppc_num_gprs; i++)
@@ -1253,7 +1321,7 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Special registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_sprs64 (regcache, ctx.iar, ctx.msr, ctx.cr, ctx.lr, ctx.ctr,
 			     ctx.xer, ctx.fpscr);
   else
@@ -1288,18 +1356,21 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
   int i;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"fetch_regs_kernel_thread tid=%lx regno=%d arch64=%d\n",
-		(long) tid, regno, arch64);
+		(long) tid, regno, data->arch64);
 
   /* General-purpose registers.  */
   if (regno == -1
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_gprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -1331,7 +1402,7 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -1362,12 +1433,16 @@ aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
 {
   struct thread_info *thread;
   pthdb_tid_t tid;
+  thread = find_thread_ptid (current_inferior ()->process_target (), ptid_t (regcache->ptid ().pid (), 0, regcache->ptid ().tid ()));
 
-  if (!PD_TID (regcache->ptid ()))
+  /* If a new inferior is born, then its pthread debug library is yet to
+     be initialised and hence has no private data.  So the below if
+     condition exists.  */
+
+  if (regcache->ptid ().tid () == 0)
     beneath ()->fetch_registers (regcache, regno);
   else
     {
-      thread = find_thread_ptid (current_inferior (), regcache->ptid ());
       aix_thread_info *priv = get_aix_thread_info (thread);
       tid = priv->tid;
 
@@ -1511,6 +1586,8 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   pthdb_context_t ctx;
   uint32_t int32;
   uint64_t int64;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1518,7 +1595,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Retrieve the thread's current context for its non-register
      values.  */
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1528,7 +1605,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   for (i = 0; i < ppc_num_gprs; i++)
     if (REG_VALID == regcache->get_register_status (tdep->ppc_gp0_regnum + i))
       {
-	if (arch64)
+	if (data->arch64)
 	  {
 	    regcache->raw_collect (tdep->ppc_gp0_regnum + i, (void *) &int64);
 	    ctx.gpr[i] = int64;
@@ -1545,7 +1622,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
     fill_fprs (regcache, ctx.fpr);
 
   /* Special registers (always kept in ctx as 64 bits).  */
-  if (arch64)
+  if (data->arch64)
     {
       fill_sprs64 (regcache, &ctx.iar, &ctx.msr, &ctx.cr, &ctx.lr, &ctx.ctr,
 			     &ctx.xer, &ctx.fpscr);
@@ -1576,7 +1653,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 	ctx.fpscr = tmp_fpscr;
     }
 
-  status = pthdb_pthread_setcontext (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_setcontext (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: "
 	     "pthdb_pthread_setcontext returned %s"),
@@ -1602,6 +1679,9 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs  sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1613,7 +1693,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_fprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some regs may not be in the cache.  */
 	  ptrace64aix (PTT_READ_GPRS, tid, (unsigned long) gprs64, 0, NULL);
@@ -1646,7 +1726,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some registers won't be in the cache.  */
 	  ptrace64aix (PTT_READ_SPRS, tid, 
@@ -1703,7 +1783,7 @@ aix_thread_target::store_registers (struct regcache *regcache, int regno)
   struct thread_info *thread;
   pthdb_tid_t tid;
 
-  if (!PD_TID (regcache->ptid ()))
+  if (regcache->ptid ().tid () == 0)
     beneath ()->store_registers (regcache, regno);
   else
     {
@@ -1741,7 +1821,7 @@ aix_thread_target::mourn_inferior ()
 {
   target_ops *beneath = this->beneath ();
 
-  pd_deactivate ();
+  pd_disable ();
   beneath->mourn_inferior ();
 }
 
@@ -1750,7 +1830,7 @@ aix_thread_target::mourn_inferior ()
 bool
 aix_thread_target::thread_alive (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     return beneath ()->thread_alive (ptid);
 
   /* We update the thread list every time the child stops, so all
@@ -1766,7 +1846,7 @@ aix_thread_target::thread_alive (ptid_t ptid)
 std::string
 aix_thread_target::pid_to_str (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     return beneath ()->pid_to_str (ptid);
 
   return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
@@ -1786,8 +1866,11 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
   pthdb_detachstate_t detachstate;
   int cancelpend;
   static char *ret = NULL;
+  struct aix_thread_variables *data;
+
+  data = get_thread_data_helper_for_ptid (thread->ptid);
 
-  if (!PD_TID (thread->ptid))
+  if (thread->ptid.tid () == 0)
     return NULL;
 
   string_file buf;
@@ -1800,24 +1883,24 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
     /* i18n: Like "thread-identifier %d, [state] running, suspended" */
     buf.printf (_("tid %d"), (int)tid);
 
-  status = pthdb_pthread_state (pd_session, pdtid, &state);
+  status = pthdb_pthread_state (data->pd_session, pdtid, &state);
   if (status != PTHDB_SUCCESS)
     state = PST_NOTSUP;
   buf.printf (", %s", state2str (state));
 
-  status = pthdb_pthread_suspendstate (pd_session, pdtid, 
+  status = pthdb_pthread_suspendstate (data->pd_session, pdtid, 
 				       &suspendstate);
   if (status == PTHDB_SUCCESS && suspendstate == PSS_SUSPENDED)
     /* i18n: Like "Thread-Id %d, [state] running, suspended" */
     buf.printf (_(", suspended"));
 
-  status = pthdb_pthread_detachstate (pd_session, pdtid, 
+  status = pthdb_pthread_detachstate (data->pd_session, pdtid, 
 				      &detachstate);
   if (status == PTHDB_SUCCESS && detachstate == PDS_DETACHED)
     /* i18n: Like "Thread-Id %d, [state] running, detached" */
     buf.printf (_(", detached"));
 
-  pthdb_pthread_cancelpend (pd_session, pdtid, &cancelpend);
+  pthdb_pthread_cancelpend (data->pd_session, pdtid, &cancelpend);
   if (status == PTHDB_SUCCESS && cancelpend)
     /* i18n: Like "Thread-Id %d, [state] running, cancel pending" */
     buf.printf (_(", cancel pending"));
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index f483f54de13..6be81064ebd 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -618,6 +618,20 @@ solib_aix_bfd_open (const char *pathname)
       if (member_name == bfd_get_filename (object_bfd.get ()))
 	break;
 
+      std::string s = bfd_get_filename (object_bfd.get ());
+
+      /* For every inferior after the first, the BFD system
+	 will have the pathname instead of the member name
+	 registered.  Hence the below condition exists.  */
+
+      if (s.find ('(') != std::string::npos)
+	{
+	  int pos = s.find ('(');
+	  int len = s.find (')') - s.find ('(');
+	  if (s.substr (pos+1, len-1) == member_name) 
+	    return object_bfd;
+	}
+
       object_bfd = gdb_bfd_openr_next_archived_file (archive_bfd.get (),
 						     object_bfd.get ());
     }
-- 
2.31.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-17 11:26                                                                             ` Aditya Kamath1
@ 2023-02-17 12:04                                                                               ` Ulrich Weigand
  2023-02-17 13:22                                                                                 ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2023-02-17 12:04 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>GDB has three things to do here.
>** It will get a new object file notifier of the new process, 
>** new inferior is created so the new inferior notifier
>** And follow_fork () event since GDB must decide it has to follow the child. 
>Note:- Our child process is <pid, 0, 0> 
>
>This is the exact order in which the above 3 things will be executed from the
>main GDB event loop. 

Hmm.  I understood the order to be a bit different.  I agree there will first
be new objfile notifiers for the main executable and shared libraries.

But then we should get to infrun.c:follow_fork, which calls
infrun.c:follow_fork_inferior, which does in sequence:

      switch_to_thread (*child_inf->threads ().begin ());
      post_create_inferior (0);

So it should *first* do the switch_to_thread, and only *then*
call post_create_inferior (which in turn triggers the new
inferior notifier).

As a result, I understand it should be fine to switch ptid_t
in the new inferior notifier.  But I agree that already switching
it in the new objfile notifier is a problem.

I see that linux-thread-db.c has special code for that:

  /* When attaching / handling fork child, don't try loading libthread_db
     until we know about all shared libraries.  */
  if (inf->in_initial_library_scan)
    return false;

I believe you need to similarly skip calling pd_activate
from pd_enable if that in_initial_library_scan flag is
true for the current inferior.
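
A minimal sketch of that suggestion follows. It uses hypothetical
stand-in types (inferior_model, enable_model, activate_model) rather
than the real aix-thread.c code, and only illustrates the idea of
deferring activation while the initial library scan is in progress:

```cpp
#include <cassert>

// Hypothetical stand-in for the per-inferior state discussed above.
// None of these names exist in GDB; they only model the control flow.
struct inferior_model
{
  bool in_initial_library_scan = false;  // set while attach/fork setup runs
  bool pd_able = false;                  // thread layer has been prepared
  bool pd_active = false;                // thread debug session is running
};

// Models the activation step: only legal once the layer is prepared.
static void
activate_model (inferior_model &inf)
{
  assert (inf.pd_able);
  inf.pd_active = true;
}

// Models the enable step.  Returns true if activation was attempted,
// false if it was deferred because the library scan is still running.
static bool
enable_model (inferior_model &inf)
{
  inf.pd_able = true;

  // Don't try to activate until all shared libraries are known.
  if (inf.in_initial_library_scan)
    return false;

  activate_model (inf);
  return true;
}
```

In this model, a fork child that is still in its initial scan ends up
prepared but not active, and a later call after the scan has finished
activates it.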

Can you try if this resolves the problem?

Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-17 12:04                                                                               ` Ulrich Weigand
@ 2023-02-17 13:22                                                                                 ` Aditya Kamath1
  2023-02-17 14:18                                                                                   ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2023-02-17 13:22 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 4563 bytes --]

Hi Ulrich, Tom and community,

Yes, it resolves the issue. Thank you so much for all the support and guidance, Ulrich, Tom and community. Kindly let me know if any changes are needed. If not, kindly push these changes.

The outputs for both the follow-parent and follow-child cases, for the same code as in the previous email, are pasted below.

Kindly find attached the patch. {See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch}

Have a nice day ahead.

Thanks and regards,
Aditya.

---------------------
Output with following child
Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) set follow-fork-mode child
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New Thread 515]
[Attaching after Thread 515 fork to child process 10748334]
[New inferior 2 (process 10748334)]
[Attaching after process 10748334 fork to child process 9568638]
[New inferior 3 (process 9568638)]
I am grandchild
Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...

Thread 3.1 received signal SIGINT, Interrupt.
[Switching to process 9568638]
thread_function (arg=0x0) at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
32        while (1); /* break here */

-------------------
Output with following parent
Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork...
(gdb) set detach-on-fork off
(gdb) r
Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork
[New Thread 258]
[New Thread 515]
[New inferior 2 (process 9568642)]
I am parent
[New inferior 3 (process 15991254)]
I am parent

Thread 1.2 received signal SIGINT, Interrupt.
[Switching to Thread 258]
thread_function (arg=0x0) at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32
32        while (1); /* break here */
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
0xd05bc124  0xd05bf194  Yes (*)     /usr/lib/libpthreads.a(shr_comm.o)
0xd05bb240  0xd05bb9a1  Yes (*)     /usr/lib/libcrypt.a(shr.o)
0xd0576180  0xd05ba731  Yes (*)     /usr/lib/libpthread.a(shr_xpg5.o)
0xd0100e00  0xd0575123  Yes (*)     /usr/lib/libc.a(shr.o)
(*): Shared library is missing debugging information.
(gdb) info inferiors
  Num  Description       Connection           Executable
* 1    process 18743566  1 (native)           /home/aditya/gdb_tests/ultimate-multi-thread-fork
  2    process 9568642   1 (native)           /home/aditya/gdb_tests/ultimate-multi-thread-fork
  3    process 15991254  1 (native)           /home/aditya/gdb_tests/ultimate-multi-thread-fork
(gdb)

[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 29994 bytes --]

From 1fe6cdde8b650674e317beb5a94f6a45aa43de84 Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Fri, 17 Feb 2023 07:11:48 -0600
Subject: [PATCH] Fix multi-threaded debugging under AIX

Multi-threaded debugging using the libpthdebug debug interface
is currently broken due to multiple issues.

When debugging a single inferior, we were getting assertion
failures in get_aix_thread_info as no tp->priv structure was
allocated for the main thread.

We fixed this by switching the main thread from a (pid, 0, 0)
ptid_t to a (pid, 0, tid) ptid_t and allocating the tp->priv
structure in sync_threadlists.

As a result, the switch_to_thread call in pdc_read_data could
now fail since the main thread no longer uses (pid, 0, 0).

So we replaced the call by only switching inferior_ptid, the current
inferior, and the current address space (like proc-service.c).
Add similar switching to pdc_write_data where it was missing
completely.

When debugging multiple inferiors, an additional set of
problems prevented correct multi-threaded debugging:

First of all, aix-thread.c used to have a number of global
variables holding per-inferior information.

We switched these to a per-inferior data structure instead.
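
As an illustration only, the move from globals to per-inferior state
can be sketched like this.  The types below (thread_vars_model,
per_inferior_registry_model) are hypothetical and use a plain std::map
keyed by pid; the patch itself uses GDB's registry<inferior>::key
mechanism:

```cpp
#include <cassert>
#include <map>

// Hypothetical model of the state that used to live in globals.
struct thread_vars_model
{
  bool pd_able = false;
  bool pd_active = false;
  bool arch64 = false;
  unsigned long pd_brk_addr = 0;
};

// Hypothetical registry: one thread_vars_model per process id,
// created lazily on first lookup.
class per_inferior_registry_model
{
public:
  // Returns the state for PID, creating it if needed.  A pid of 0
  // means "no inferior yet" and yields nullptr.
  thread_vars_model *get (int pid)
  {
    assert (pid >= 0);
    if (pid == 0)
      return nullptr;
    return &m_data[pid];  // operator[] default-constructs on first use
  }

  // Drops the state for PID, e.g. when the inferior goes away.
  void clear (int pid) { m_data.erase (pid); }

private:
  std::map<int, thread_vars_model> m_data;
};
```

With this shape, marking one process as active leaves every other
process untouched, which a single set of globals cannot express.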

Also, sync_threadlists was getting confused as we were
comparing the list of threads returned by libpthdebug
for *one* process with GDB's list of threads for *all*
processes. Now we only use the GDB threads of the current
inferior instead.
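
A small sketch of why the restriction matters, using a hypothetical
thread_model type and helper rather than GDB's thread_info and
all_threads:

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a debugger-side thread entry.
struct thread_model
{
  int pid;   // owning process
  long tid;  // thread id within that process
};

// Returns only the entries that belong to PID.  Comparing this subset
// against the thread library's list for the same process avoids
// treating threads of other processes as stale.
static std::vector<thread_model>
threads_of_inferior (const std::vector<thread_model> &all, int pid)
{
  assert (pid > 0);
  std::vector<thread_model> result;
  for (const thread_model &t : all)
    if (t.pid == pid)
      result.push_back (t);
  return result;
}
```

If the unfiltered list were compared instead, every thread of a second
process would look like a thread that had disappeared from the first.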

We also skip calling pd_activate from pd_enable if the
in_initial_library_scan flag is true for the current inferior.

Finally, the presence of the thread library in any but
the first inferior was not correctly detected due to a
bug in solib-aix.c, where the BFD file name for shared
library members was changed when the library was loaded
for the first time, which caused the library to no longer
be recognized by name when loaded a second time.
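
The name check can be sketched as follows.  This is a hedged,
standalone illustration: archive_member_name and matches_member are
hypothetical helpers, and the assumption that a renamed entry looks
like "archive-path(member)" is taken from the library names GDB prints
on AIX, not from the BFD API:

```cpp
#include <cassert>
#include <string>

// Returns the text between the first '(' and the next ')', or an
// empty string if the name has no such parenthesised part.
static std::string
archive_member_name (const std::string &bfd_name)
{
  std::string::size_type open = bfd_name.find ('(');
  if (open == std::string::npos)
    return "";
  std::string::size_type close = bfd_name.find (')', open + 1);
  if (close == std::string::npos)
    return "";
  return bfd_name.substr (open + 1, close - open - 1);
}

// True if BFD_NAME refers to MEMBER_NAME, either directly or in the
// "archive(member)" form.
static bool
matches_member (const std::string &bfd_name, const std::string &member_name)
{
  assert (!member_name.empty ());
  if (bfd_name == member_name)
    return true;
  return archive_member_name (bfd_name) == member_name;
}
```

Searching for ')' only after the '(' keeps the length computation
well-defined even for names where a ')' appears earlier in the path.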
---
 gdb/aix-thread.c | 375 +++++++++++++++++++++++++++++------------------
 gdb/solib-aix.c  |  14 ++
 2 files changed, 246 insertions(+), 143 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..c7328d9dc64 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -68,10 +68,6 @@ static bool debug_aix_thread;
 #define pthdb_tid_t	tid_t
 #endif
 
-/* Return whether to treat PID as a debuggable thread id.  */
-
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
-
 /* Success and failure values returned by pthdb callbacks.  */
 
 #define PDC_SUCCESS	PTHDB_SUCCESS
@@ -144,24 +140,6 @@ class aix_thread_target final : public target_ops
 
 static aix_thread_target aix_thread_ops;
 
-/* Address of the function that libpthread will call when libpthdebug
-   is ready to be initialized.  */
-
-static CORE_ADDR pd_brk_addr;
-
-/* Whether the current application is debuggable by pthdb.  */
-
-static int pd_able = 0;
-
-/* Whether a threaded application is being debugged.  */
-
-static int pd_active = 0;
-
-/* Whether the current architecture is 64-bit.  
-   Only valid when pd_able is true.  */
-
-static int arch64;
-
 /* Forward declarations for pthdb callbacks.  */
 
 static int pdc_symbol_addrs (pthdb_user_t, pthdb_symbol_t *, int);
@@ -191,9 +169,66 @@ static pthdb_callbacks_t pd_callbacks = {
   NULL
 };
 
-/* Current pthdb session.  */
+/* Aix variable structure.  */
+struct aix_thread_variables 
+{
+  /* Whether the current application is debuggable by pthdb.  */
+  int pd_able;
+
+  /* Whether a threaded application is being debugged.  */
+  int pd_active;
+
+  /* Current pthdb session.  */
+  pthdb_session_t pd_session;
+
+  /* Address of the function that libpthread will call when libpthdebug
+   is ready to be initialized.  */
+  CORE_ADDR pd_brk_addr;
+
+  /* Whether the current architecture is 64-bit.
+   Only valid when pd_able is true.  */
+  int arch64;
+};
+
+/* Key to our per-inferior data.  */
+static const registry<inferior>::key<aix_thread_variables>
+  aix_thread_variables_handle;
+
+/* Function to Get aix_thread_variables data.  */
+static struct aix_thread_variables*
+get_aix_thread_variables_data (struct inferior *inf)
+{
+  if (inf == NULL)
+    return NULL;
+
+  struct aix_thread_variables* data;
+
+  data = aix_thread_variables_handle.get (inf);
+  if (data == NULL)
+    data = aix_thread_variables_handle.emplace (inf);
+
+  return data;
+}
+
+/* Helper to get data for ptid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_ptid (ptid_t ptid)
+{
+  inferior *inf = find_inferior_ptid (current_inferior ()->process_target (),
+					ptid);
+  return get_aix_thread_variables_data (inf);
+}
+
+/* Helper to get data for pid in a function.  */
 
-static pthdb_session_t pd_session;
+static struct aix_thread_variables*
+get_thread_data_helper_for_pid (pid_t pid)
+{
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (),
+                                        pid);
+  return get_aix_thread_variables_data (inf);
+}
 
 /* Return a printable representation of pthdebug function return
    STATUS.  */
@@ -318,7 +353,7 @@ pid_to_prc (ptid_t *ptidp)
   ptid_t ptid;
 
   ptid = *ptidp;
-  if (PD_TID (ptid))
+  if (ptid.tid () != 0)
     *ptidp = ptid_t (ptid.pid ());
 }
 
@@ -389,6 +424,9 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_pid);
   
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_read_regs tid=%d flags=%s\n",
@@ -397,7 +435,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -423,7 +461,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -456,6 +494,10 @@ pdc_write_regs (pthdb_user_t user_current_pid,
      this is needed, I have implemented what I think it should do,
      however this code is untested.  */
 
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_pid);
+
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_write_regs tid=%d flags=%s\n",
 		(int) tid, hex_string (flags));
@@ -463,7 +505,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_WRITE_GPRS, tid, 
 		     (unsigned long) context->gpr, 0, NULL);
       else
@@ -479,7 +521,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  ptrace64aix (PTT_WRITE_SPRS, tid, 
 		       (unsigned long) &context->msr, 0, NULL);
@@ -499,7 +541,9 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
 	       pthdb_addr_t addr, size_t len)
 {
   int status, ret;
-
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (), 
+				     user_current_pid);
+  
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"pdc_read_data (user_current_pid = %ld, buf = 0x%lx, addr = %s, len = %ld)\n",
@@ -508,14 +552,17 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
     /* Before the first inferior is added, we pass inferior_ptid.pid ()
        from pd_enable () which is 0.  There is no need to switch threads
        during first initialisation.  In the rest of the callbacks the
        current thread needs to be correct.  */
-    if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+    inferior_ptid = ptid_t (user_current_pid);
+    scoped_restore_current_inferior restore_inferior;
+    set_current_inferior (inf);
+
+    scoped_restore_current_program_space restore_current_progspace;
+    set_current_program_space (inf->pspace);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -533,13 +580,25 @@ pdc_write_data (pthdb_user_t user_current_pid, void *buf,
 		pthdb_addr_t addr, size_t len)
 {
   int status, ret;
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (), 
+                                     user_current_pid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"pdc_write_data (user_current_pid = %ld, buf = 0x%lx, addr = %s, len = %ld)\n",
 		user_current_pid, (long) buf, hex_string (addr), len);
 
-  status = target_write_memory (addr, (gdb_byte *) buf, len);
+  {
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
+    inferior_ptid = ptid_t (user_current_pid);
+    scoped_restore_current_inferior restore_inferior;
+    set_current_inferior (inf);
+
+    scoped_restore_current_program_space restore_current_progspace;
+    set_current_program_space (inf->pspace);
+    status = target_write_memory (addr, (gdb_byte *) buf, len);
+  }
+
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
 
   if (debug_aix_thread)
@@ -639,39 +698,6 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_count (struct thread_info *thread, void *countp)
-{
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
-  return 0;
-}
-
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_accum (struct thread_info *thread, void *bufp)
-{
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
-  return 0;
-}
-
 /* ptid comparison function */
 
 static int
@@ -719,7 +745,10 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      /* We also need to keep in mind Trap and interrupt or any
+         signal that needs to be handled in pd_update ().  */
+
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -741,7 +770,7 @@ get_signaled_thread (int pid)
        have difficulty with certain call patterns */
 
 static void
-sync_threadlists (int pid)
+sync_threadlists (pid_t pid) 
 {
   int cmd, status;
   int pcount, psize, pi, gcount, gi;
@@ -750,6 +779,11 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -759,11 +793,11 @@ sync_threadlists (int pid)
 
   for (cmd = PTHDB_LIST_FIRST;; cmd = PTHDB_LIST_NEXT)
     {
-      status = pthdb_pthread (pd_session, &pdtid, cmd);
+      status = pthdb_pthread (data->pd_session, &pdtid, cmd);
       if (status != PTHDB_SUCCESS || pdtid == PTHDB_INVALID_PTHREAD)
 	break;
 
-      status = pthdb_pthread_ptid (pd_session, pdtid, &pthid);
+      status = pthdb_pthread_ptid (data->pd_session, pdtid, &pthid);
       if (status != PTHDB_SUCCESS || pthid == PTHDB_INVALID_PTID)
 	continue;
 
@@ -780,7 +814,7 @@ sync_threadlists (int pid)
 
   for (pi = 0; pi < pcount; pi++)
     {
-      status = pthdb_pthread_tid (pd_session, pbuf[pi].pdtid, &tid);
+      status = pthdb_pthread_tid (data->pd_session, pbuf[pi].pdtid, &tid);
       if (status != PTHDB_SUCCESS)
 	tid = PTHDB_INVALID_TID;
       pbuf[pi].tid = tid;
@@ -790,13 +824,26 @@ sync_threadlists (int pid)
 
   /* Accumulate an array of GDB threads sorted by pid.  */
 
+  /* gcount is GDB thread count and pcount is pthreadlib thread count.  */
+
   gcount = 0;
-  iterate_over_threads (giter_count, &gcount);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    gcount++;
   g = gbuf = XNEWVEC (struct thread_info *, gcount);
-  iterate_over_threads (giter_accum, &g);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    *g++ = tp;
   qsort (gbuf, gcount, sizeof *gbuf, gcmp);
 
+  tp = find_thread_ptid (proc_target, ptid_t (pid));
+
+  /* If the pthreadlibrary is not ready to debug 
+     then this is just a main process which needs 
+     a priv to be set.  The if condition below does 
+     the same.  Otherwise we go to the for loop to 
+     sync the pthread and GDB thread lists.  */
+
   /* Apply differences between the two arrays to GDB's thread list.  */
+
   for (pi = gi = 0; pi < pcount || gi < gcount;)
     {
       if (pi == pcount)
@@ -810,8 +857,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -841,13 +886,27 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      /* This is to make the main process thread now look
+                 like a thread.  */
+
+	      if (gptid.is_pid ())
+		{
+		  thread_change_ptid (proc_target, gptid, pptid);
+		  aix_thread_info *priv = new aix_thread_info;
+                  priv->pdtid = pbuf[pi].pdtid;
+                  priv->tid = pbuf[pi].tid;
+		  tp->priv.reset (priv);
+		  gi++;
+		  pi++;
+		}
+	      else
+		{
+		  delete_thread (gbuf[gi]);
+		  gi++;
+		}
 	    }
 	  else
 	    {
-	      process_stratum_target *proc_target
-		= current_inferior ()->process_target ();
 	      thread = add_thread (proc_target, pptid);
 
 	      aix_thread_info *priv = new aix_thread_info;
@@ -881,17 +940,20 @@ iter_tid (struct thread_info *thread, void *tidp)
    return a pid-only ptid with PID.  */
 
 static ptid_t
-pd_update (int pid)
+pd_update (pid_t pid)
 {
   int status;
   ptid_t ptid;
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
+  struct aix_thread_variables *data;
+
+  data = get_thread_data_helper_for_pid (pid);
 
-  if (!pd_active)
+  if (!data->pd_active)
     return ptid_t (pid);
 
-  status = pthdb_session_update (pd_session);
+  status = pthdb_session_update (data->pd_session);
   if (status != PTHDB_SUCCESS)
     return ptid_t (pid);
 
@@ -915,34 +977,23 @@ pd_update (int pid)
    for that thread.  Otherwise, return a ptid-only ptid using PID.  */
 
 static ptid_t
-pd_activate (int pid)
+pd_activate (pid_t pid)
 {
   int status;
-		
-  status = pthdb_session_init (pid, arch64 ? PEM_64BIT : PEM_32BIT,
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
+  
+  status = pthdb_session_init (pid, data->arch64 ? PEM_64BIT : PEM_32BIT,
 			       PTHDB_FLAG_REGS, &pd_callbacks, 
-			       &pd_session);
+			       &data->pd_session);
   if (status != PTHDB_SUCCESS)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
+  data->pd_active = 1;
   return pd_update (pid);
 }
 
-/* Undo the effects of pd_activate().  */
-
-static void
-pd_deactivate (void)
-{
-  if (!pd_active)
-    return;
-  pthdb_session_destroy (pd_session);
-  
-  pid_to_prc (&inferior_ptid);
-  pd_active = 0;
-}
-
 /* An object file has just been loaded.  Check whether the current
    application is pthreaded, and if so, prepare for thread debugging.  */
 
@@ -952,13 +1003,19 @@ pd_enable (void)
   int status;
   char *stub_name;
   struct bound_minimal_symbol ms;
+  struct aix_thread_variables *data;
+
+  if (!inferior_ptid.pid ())
+    return;
+  
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (data->pd_able)
     return;
 
   /* Check application word size.  */
-  arch64 = register_size (target_gdbarch (), 0) == 8;
+  data->arch64 = register_size (target_gdbarch (), 0) == 8;
 
   /* Check whether the application is pthreaded.  */
   stub_name = NULL;
@@ -972,13 +1029,19 @@ pd_enable (void)
   ms = lookup_minimal_symbol (stub_name, NULL, NULL);
   if (ms.minsym == NULL)
     return;
-  pd_brk_addr = ms.value_address ();
-  if (!create_thread_event_breakpoint (target_gdbarch (), pd_brk_addr))
+  data->pd_brk_addr = ms.value_address ();
+  if (!create_thread_event_breakpoint (target_gdbarch (), data->pd_brk_addr))
     return;
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  data->pd_able = 1; 
+
+  inferior *inf = current_inferior (); 
+  /* When attaching / handling fork child, don't try loading libthread_db
+     until we know about all shared libraries.  */
+  if (inf->in_initial_library_scan)
+    return;
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
@@ -991,28 +1054,31 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!data->pd_able)
+    return;
+  if (!data->pd_active)
     return;
-  if (pd_active)
-    pd_deactivate ();
-  pd_able = 0;
+  pthdb_session_destroy (data->pd_session);
+ 
+  pid_to_prc (&inferior_ptid);
+  data->pd_active = 0;
+  data->pd_able = 0;
   current_inferior ()->unpush_target (&aix_thread_ops);
 }
 
 /* new_objfile observer callback.
 
    If OBJFILE is non-null, check whether a threaded application is
-   being debugged, and if so, prepare for thread debugging.
-
-   If OBJFILE is null, stop debugging threads.  */
+   being debugged, and if so, prepare for thread debugging.  */
 
 static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
     pd_enable ();
-  else
-    pd_disable ();
 }
 
 /* Attach to process specified by ARGS.  */
@@ -1042,8 +1108,11 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 {
   struct thread_info *thread;
   pthdb_tid_t tid[2];
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (ptid);
 
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     {
       scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
       
@@ -1065,7 +1134,7 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 	       ptid.lwp ());
       tid[1] = 0;
 
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_CONTINUE, tid[0], (long long) 1,
 		     gdb_signal_to_host (sig), (PTRACE_TYPE_ARG5) tid);
       else
@@ -1082,6 +1151,7 @@ ptid_t
 aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
 			 target_wait_flags options)
 {
+  struct aix_thread_variables *data;
   {
     pid_to_prc (&ptid);
 
@@ -1095,8 +1165,10 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
      pid-only ptids.  */
   gdb_assert (ptid.is_pid ());
 
+  data = get_thread_data_helper_for_ptid (ptid);
+
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
@@ -1105,7 +1177,7 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
       struct gdbarch *gdbarch = regcache->arch ();
 
       if (regcache_read_pc (regcache)
-	  - gdbarch_decr_pc_after_break (gdbarch) == pd_brk_addr)
+	  - gdbarch_decr_pc_after_break (gdbarch) == data->pd_brk_addr)
 	return pd_activate (ptid.pid ());
     }
 
@@ -1229,18 +1301,20 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
   ppc_gdbarch_tdep *tdep = gdbarch_tdep<ppc_gdbarch_tdep> (gdbarch);
   int status, i;
   pthdb_context_t ctx;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
 		"fetch_regs_user_thread %lx\n", (long) pdtid);
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: fetch_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
 
   /* General-purpose registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_gprs64 (regcache, ctx.gpr);
   else
     for (i = 0; i < ppc_num_gprs; i++)
@@ -1253,7 +1327,7 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Special registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_sprs64 (regcache, ctx.iar, ctx.msr, ctx.cr, ctx.lr, ctx.ctr,
 			     ctx.xer, ctx.fpscr);
   else
@@ -1288,18 +1362,21 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
   int i;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"fetch_regs_kernel_thread tid=%lx regno=%d arch64=%d\n",
-		(long) tid, regno, arch64);
+		(long) tid, regno, data->arch64);
 
   /* General-purpose registers.  */
   if (regno == -1
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_gprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -1331,7 +1408,7 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -1362,12 +1439,16 @@ aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
 {
   struct thread_info *thread;
   pthdb_tid_t tid;
+  thread = find_thread_ptid (current_inferior ()->process_target (), ptid_t (regcache->ptid ().pid (), 0, regcache->ptid ().tid ()));
 
-  if (!PD_TID (regcache->ptid ()))
+  /* If a new inferior is born, then its pthread debug library is yet to
+     initialised and hence has no private data. So the below if condition
+     exists.  */
+
+  if (regcache->ptid ().tid () == 0)
     beneath ()->fetch_registers (regcache, regno);
   else
     {
-      thread = find_thread_ptid (current_inferior (), regcache->ptid ());
       aix_thread_info *priv = get_aix_thread_info (thread);
       tid = priv->tid;
 
@@ -1511,6 +1592,8 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   pthdb_context_t ctx;
   uint32_t int32;
   uint64_t int64;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1518,7 +1601,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Retrieve the thread's current context for its non-register
      values.  */
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1528,7 +1611,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   for (i = 0; i < ppc_num_gprs; i++)
     if (REG_VALID == regcache->get_register_status (tdep->ppc_gp0_regnum + i))
       {
-	if (arch64)
+	if (data->arch64)
 	  {
 	    regcache->raw_collect (tdep->ppc_gp0_regnum + i, (void *) &int64);
 	    ctx.gpr[i] = int64;
@@ -1545,7 +1628,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
     fill_fprs (regcache, ctx.fpr);
 
   /* Special registers (always kept in ctx as 64 bits).  */
-  if (arch64)
+  if (data->arch64)
     {
       fill_sprs64 (regcache, &ctx.iar, &ctx.msr, &ctx.cr, &ctx.lr, &ctx.ctr,
 			     &ctx.xer, &ctx.fpscr);
@@ -1576,7 +1659,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 	ctx.fpscr = tmp_fpscr;
     }
 
-  status = pthdb_pthread_setcontext (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_setcontext (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: "
 	     "pthdb_pthread_setcontext returned %s"),
@@ -1602,6 +1685,9 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs  sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1613,7 +1699,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_fprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some regs may not be in the cache.  */
 	  ptrace64aix (PTT_READ_GPRS, tid, (unsigned long) gprs64, 0, NULL);
@@ -1646,7 +1732,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some registers won't be in the cache.  */
 	  ptrace64aix (PTT_READ_SPRS, tid, 
@@ -1703,7 +1789,7 @@ aix_thread_target::store_registers (struct regcache *regcache, int regno)
   struct thread_info *thread;
   pthdb_tid_t tid;
 
-  if (!PD_TID (regcache->ptid ()))
+  if (regcache->ptid ().tid () == 0)
     beneath ()->store_registers (regcache, regno);
   else
     {
@@ -1741,7 +1827,7 @@ aix_thread_target::mourn_inferior ()
 {
   target_ops *beneath = this->beneath ();
 
-  pd_deactivate ();
+  pd_disable ();
   beneath->mourn_inferior ();
 }
 
@@ -1750,7 +1836,7 @@ aix_thread_target::mourn_inferior ()
 bool
 aix_thread_target::thread_alive (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     return beneath ()->thread_alive (ptid);
 
   /* We update the thread list every time the child stops, so all
@@ -1766,7 +1852,7 @@ aix_thread_target::thread_alive (ptid_t ptid)
 std::string
 aix_thread_target::pid_to_str (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     return beneath ()->pid_to_str (ptid);
 
   return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
@@ -1786,8 +1872,11 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
   pthdb_detachstate_t detachstate;
   int cancelpend;
   static char *ret = NULL;
+  struct aix_thread_variables *data;
+
+  data = get_thread_data_helper_for_ptid (thread->ptid);
 
-  if (!PD_TID (thread->ptid))
+  if (thread->ptid.tid () == 0)
     return NULL;
 
   string_file buf;
@@ -1800,24 +1889,24 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
     /* i18n: Like "thread-identifier %d, [state] running, suspended" */
     buf.printf (_("tid %d"), (int)tid);
 
-  status = pthdb_pthread_state (pd_session, pdtid, &state);
+  status = pthdb_pthread_state (data->pd_session, pdtid, &state);
   if (status != PTHDB_SUCCESS)
     state = PST_NOTSUP;
   buf.printf (", %s", state2str (state));
 
-  status = pthdb_pthread_suspendstate (pd_session, pdtid, 
+  status = pthdb_pthread_suspendstate (data->pd_session, pdtid, 
 				       &suspendstate);
   if (status == PTHDB_SUCCESS && suspendstate == PSS_SUSPENDED)
     /* i18n: Like "Thread-Id %d, [state] running, suspended" */
     buf.printf (_(", suspended"));
 
-  status = pthdb_pthread_detachstate (pd_session, pdtid, 
+  status = pthdb_pthread_detachstate (data->pd_session, pdtid, 
 				      &detachstate);
   if (status == PTHDB_SUCCESS && detachstate == PDS_DETACHED)
     /* i18n: Like "Thread-Id %d, [state] running, detached" */
     buf.printf (_(", detached"));
 
-  pthdb_pthread_cancelpend (pd_session, pdtid, &cancelpend);
+  pthdb_pthread_cancelpend (data->pd_session, pdtid, &cancelpend);
   if (status == PTHDB_SUCCESS && cancelpend)
     /* i18n: Like "Thread-Id %d, [state] running, cancel pending" */
     buf.printf (_(", cancel pending"));
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index f483f54de13..6be81064ebd 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -618,6 +618,20 @@ solib_aix_bfd_open (const char *pathname)
       if (member_name == bfd_get_filename (object_bfd.get ()))
 	break;
 
+      std::string s = bfd_get_filename (object_bfd.get ());
+
+      /* For every inferior after first int bfd system we 
+	 will have the pathname instead of the member name
+	 registered. Hence the below condition exists.  */
+
+      if (s.find ('(') != std::string::npos)
+	{
+	  int pos = s.find ('(');
+	  int len = s.find (')') - s.find ('(');
+	  if (s.substr (pos+1, len-1) == member_name) 
+	    return object_bfd;
+	}
+
       object_bfd = gdb_bfd_openr_next_archived_file (archive_bfd.get (),
 						     object_bfd.get ());
     }
-- 
2.31.1
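
The solib-aix.c hunk above compares the text between '(' and ')' in the BFD file name against the member name. A minimal standalone sketch of that extraction follows. The helper name archive_member_name is hypothetical and not part of GDB. Unlike the hunk, the sketch also checks for a missing ')'.

```cpp
#include <optional>
#include <string>

// Hypothetical helper (not part of GDB): given a file name of the form
// "archive(member)", return the text between the first '(' and the
// first ')' that follows it.  Returns std::nullopt when the name has no
// well-formed parenthesised part.
std::optional<std::string>
archive_member_name (const std::string &filename)
{
  std::string::size_type open = filename.find ('(');
  if (open == std::string::npos)
    return std::nullopt;

  std::string::size_type close = filename.find (')', open + 1);
  if (close == std::string::npos)
    return std::nullopt;

  return filename.substr (open + 1, close - open - 1);
}
```

A caller would compare the returned value with the requested member name and treat std::nullopt as "not an archive member name".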


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-17 13:22                                                                                 ` Aditya Kamath1
@ 2023-02-17 14:18                                                                                   ` Ulrich Weigand
  2023-02-17 15:15                                                                                     ` Aditya Kamath1
  0 siblings, 1 reply; 49+ messages in thread
From: Ulrich Weigand @ 2023-02-17 14:18 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>Yes, it resolves the issue.

Excellent.  A few final comments on the patch, including one
change I hadn't noticed before:

>@@ -508,14 +552,17 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
>   /* This is needed to eliminate the dependency of current thread
>      which is null so that thread reads the correct target memory.  */
>   {
>-    scoped_restore_current_thread restore_current_thread;
>+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>     /* Before the first inferior is added, we pass inferior_ptid.pid ()
>        from pd_enable () which is 0.  There is no need to switch threads
>        during first initialisation.  In the rest of the callbacks the
>        current thread needs to be correct.  */
This comment is no longer relevant as the code relating to it was
deleted.  The comment should be deleted as well.
>-    if (user_current_pid != 0)
>-      switch_to_thread (current_inferior ()->process_target (),
>-			ptid_t (user_current_pid));
>+    inferior_ptid = ptid_t (user_current_pid);
>+    scoped_restore_current_inferior restore_inferior;
>+    set_current_inferior (inf);
>+
>+    scoped_restore_current_program_space restore_current_progspace;
>+    set_current_program_space (inf->pspace);
>     status = target_read_memory (addr, (gdb_byte *) buf, len);
>   }
>   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;

 
>+  tp = find_thread_ptid (proc_target, ptid_t (pid));
>+
>+  /* If the pthreadlibrary is not ready to debug 
>+     then this is just a main process which needs 
>+     a priv to be set.  The if condition below does 
>+     the same.  Otherwise we go to the for loop to 
>+     sync the pthread and GDB thread lists.  */
>+
>   /* Apply differences between the two arrays to GDB's thread list.  */
>+
>   for (pi = gi = 0; pi < pcount || gi < gcount;)

These changes all seem to be leftovers from previous attempts,
I guess they should be removed again.

>+  inferior *inf = current_inferior (); 
>+  /* When attaching / handling fork child, don't try loading libthread_db
>+     until we know about all shared libraries.  */
>+  if (inf->in_initial_library_scan)
>+    return;

"libthread_db" is Linux specific.  Please update the comment so
it makes sense in the AIX context.
 
>@@ -1362,12 +1439,16 @@ aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
> {
>   struct thread_info *thread;
>   pthdb_tid_t tid;
>+  thread = find_thread_ptid (current_inferior ()->process_target (), ptid_t (regcache->ptid ().pid (), 0, regcache->ptid ().tid ()));
> 
>-  if (!PD_TID (regcache->ptid ()))
>+  /* If a new inferior is born, then its pthread debug library is yet to
>+     initialised and hence has no private data. So the below if condition
>+     exists.  */
>+
>+  if (regcache->ptid ().tid () == 0)
>     beneath ()->fetch_registers (regcache, regno);
>   else
>     {
>-      thread = find_thread_ptid (current_inferior (), regcache->ptid ());
>       aix_thread_info *priv = get_aix_thread_info (thread);
>       tid = priv->tid;

I hadn't seen this change below, it doesn't really make sense to me.
You really need to use regcache->ptid here, this should be correct.
When did you see a case where this was not correct?   Does this still
happen now that we have the in_initial_library_scan check?

Bye,
Ulrich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-17 14:18                                                                                   ` Ulrich Weigand
@ 2023-02-17 15:15                                                                                     ` Aditya Kamath1
  2023-02-17 19:14                                                                                       ` Ulrich Weigand
  0 siblings, 1 reply; 49+ messages in thread
From: Aditya Kamath1 @ 2023-02-17 15:15 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


[-- Attachment #1.1: Type: text/plain, Size: 6694 bytes --]

Hi Ulrich,

Please find attached the patch with all the changes mentioned. Kindly let me know if any more changes are needed. If not, kindly check this in.

Have a nice day ahead.

Thanks and regards,
Aditya.

>>@@ -508,14 +552,17 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
>>   /* This is needed to eliminate the dependency of current thread
>>      which is null so that thread reads the correct target memory.  */
>> {
>>-    scoped_restore_current_thread restore_current_thread;
>>+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>>     /* Before the first inferior is added, we pass inferior_ptid.pid ()
>>       from pd_enable () which is 0.  There is no need to switch threads
>>        during first initialisation.  In the rest of the callbacks the
>>        current thread needs to be correct.  */
>This comment is no longer relevant as the code relating to it was
>deleted.  The comment should be deleted as well.
>>-    if (user_current_pid != 0)
>>-      switch_to_thread (current_inferior ()->process_target (),
>>-                      ptid_t (user_current_pid));
>>+    inferior_ptid = ptid_t (user_current_pid);
>>+    scoped_restore_current_inferior restore_inferior;
>>+    set_current_inferior (inf);
>>+
>>+    scoped_restore_current_program_space restore_current_progspace;
>>+    set_current_program_space (inf->pspace);
>>     status = target_read_memory (addr, (gdb_byte *) buf, len);

Done.

>>+  tp = find_thread_ptid (proc_target, ptid_t (pid));
>>+
>>+  /* If the pthreadlibrary is not ready to debug
>>+     then this is just a main process which needs
>>+     a priv to be set.  The if condition below does
>>+     the same.  Otherwise we go to the for loop to
>>+     sync the pthread and GDB thread lists.  */


>These changes all seem to be leftovers from previous attempts,
>I guess they should be removed again.

Done.

>>+  inferior *inf = current_inferior ();
>>+  /* When attaching / handling fork child, don't try loading libthread_db
>>+     until we know about all shared libraries.  */
>>+  if (inf->in_initial_library_scan)
>>+    return;

>"libthread_db" is Linux specific.  Please update the comment so
>it makes sense in the AIX context.

Done.


>>@@ -1362,12 +1439,16 @@ aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
>>+  thread = find_thread_ptid (current_inferior ()->process_target (), ptid_t (regcache->ptid ().pid (), 0, regcache->ptid ().tid ()));
>I hadn't seen this change below, it doesn't really make sense to me.
>You really need to use regcache->ptid here, this should be correct.
>When did you see a case where this was not correct?   Does this still
>happen now that we have the in_initial_library_scan check?

This works with that flag change. Removed it. Thanks.

From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Date: Friday, 17 February 2023 at 7:48 PM
To: simark@simark.ca <simark@simark.ca>, Aditya Kamath1 <Aditya.Kamath1@ibm.com>, gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>Yes, it resolves the issue.

Excellent.  A few final comments on the patch, including one
change I hadn't noticed before:

>@@ -508,14 +552,17 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
>   /* This is needed to eliminate the dependency of current thread
>      which is null so that thread reads the correct target memory.  */
>   {
>-    scoped_restore_current_thread restore_current_thread;
>+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>     /* Before the first inferior is added, we pass inferior_ptid.pid ()
>        from pd_enable () which is 0.  There is no need to switch threads
>        during first initialisation.  In the rest of the callbacks the
>        current thread needs to be correct.  */
This comment is no longer relevant as the code relating to it was
deleted.  The comment should be deleted as well.
>-    if (user_current_pid != 0)
>-      switch_to_thread (current_inferior ()->process_target (),
>-                      ptid_t (user_current_pid));
>+    inferior_ptid = ptid_t (user_current_pid);
>+    scoped_restore_current_inferior restore_inferior;
>+    set_current_inferior (inf);
>+
>+    scoped_restore_current_program_space restore_current_progspace;
>+    set_current_program_space (inf->pspace);
>     status = target_read_memory (addr, (gdb_byte *) buf, len);
>   }
>   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;


>+  tp = find_thread_ptid (proc_target, ptid_t (pid));
>+
>+  /* If the pthreadlibrary is not ready to debug
>+     then this is just a main process which needs
>+     a priv to be set.  The if condition below does
>+     the same.  Otherwise we go to the for loop to
>+     sync the pthread and GDB thread lists.  */
>+
>   /* Apply differences between the two arrays to GDB's thread list.  */
>+
>   for (pi = gi = 0; pi < pcount || gi < gcount;)

These changes all seem to be leftovers from previous attempts,
I guess they should be removed again.

>+  inferior *inf = current_inferior ();
>+  /* When attaching / handling fork child, don't try loading libthread_db
>+     until we know about all shared libraries.  */
>+  if (inf->in_initial_library_scan)
>+    return;

"libthread_db" is Linux specific.  Please update the comment so
it makes sense in the AIX context.

>@@ -1362,12 +1439,16 @@ aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
> {
>   struct thread_info *thread;
>   pthdb_tid_t tid;
>+  thread = find_thread_ptid (current_inferior ()->process_target (), ptid_t (regcache->ptid ().pid (), 0, regcache->ptid ().tid ()));
>
>-  if (!PD_TID (regcache->ptid ()))
>+  /* If a new inferior is born, then its pthread debug library is yet to
>+     initialised and hence has no private data. So the below if condition
>+     exists.  */
>+
>+  if (regcache->ptid ().tid () == 0)
>     beneath ()->fetch_registers (regcache, regno);
>   else
>     {
>-      thread = find_thread_ptid (current_inferior (), regcache->ptid ());
>       aix_thread_info *priv = get_aix_thread_info (thread);
>       tid = priv->tid;

I hadn't seen this change below, it doesn't really make sense to me.
You really need to use regcache->ptid here, this should be correct.
When did you see a case where this was not correct?   Does this still
happen now that we have the in_initial_library_scan check?

Bye,
Ulrich

[-- Attachment #2: 0001-Fix-multi-thread-debug-bug-in-AIX.patch --]
[-- Type: application/octet-stream, Size: 29404 bytes --]

From 3d6af4224374c3f775360db5b13aa766a4331a2d Mon Sep 17 00:00:00 2001
From: Aditya Vidyadhar Kamath <Aditya.Kamath1@ibm.com>
Date: Fri, 17 Feb 2023 09:07:44 -0600
Subject: [PATCH] Fix multi-threaded debugging under AIX

Multi-threaded debugging using the libpthdebug debug interface
is currently broken due to multiple issues.

When debugging a single inferior, we were getting assertion
failures in get_aix_thread_info as no tp->priv structure was
allocated for the main thread.

We fixed this by switching the main thread from a (pid, 0, 0) ptid_t
to a (pid, 0, tid) ptid_t and allocating the tp->priv structure in
sync_threadlists.

As a result, the switch_to_thread call in pdc_read_data could
now fail since the main thread no longer uses (pid, 0, 0).

So we replaced the call by only switching inferior_ptid, the current
inferior, and the current address space (like proc-service.c).
Add similar switching to pdc_write_data where it was missing
completely.

When debugging multiple inferiors, an additional set of
problems prevented correct multi-threaded debugging:

First of all, aix-thread.c used to have a number of global
variables holding per-inferior information.

We switched these to a per-inferior data structure instead.
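
As a rough illustration of the per-inferior idea, the sketch below keeps one state record per process and creates it on first use. It is a simplified stand-in under stated assumptions: the names thread_state and per_process_registry are hypothetical, and a plain map keyed by pid replaces GDB's registry<inferior>::key mechanism.

```cpp
#include <memory>
#include <unordered_map>

// Simplified stand-in for the patch's aix_thread_variables.
struct thread_state
{
  bool pd_able = false;    // process can be debugged via the thread library
  bool pd_active = false;  // thread debugging session is running
  bool arch64 = false;     // process is 64-bit
};

// Hypothetical registry keyed by process id.  GDB itself attaches the
// data to the inferior object rather than using a map like this one.
class per_process_registry
{
public:
  // Return the state for PID, creating a default entry on first use.
  thread_state &get (int pid)
  {
    auto &slot = m_states[pid];
    if (slot == nullptr)
      slot = std::make_unique<thread_state> ();
    return *slot;
  }

  // Drop the state for PID, for example when the process goes away.
  void erase (int pid) { m_states.erase (pid); }

private:
  std::unordered_map<int, std::unique_ptr<thread_state>> m_states;
};
```

With this shape, setting pd_active for one process leaves every other process's flags untouched, which is the property the global variables could not provide.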

Also, sync_threadlists was getting confused as we were
comparing the list of threads returned by libpthdebug
for *one* process with GDB's list of threads for *all*
processes. Now we only use the GDB threads of the current
inferior instead.
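
The comparison itself is a merge walk over two sorted lists. The sketch below shows that walk in isolation, assuming both lists hold plain numeric ids for the same process. The names thread_diff and diff_sorted_ids are hypothetical, and the special case that converts the pid-only main thread is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct thread_diff
{
  std::vector<std::uint64_t> to_add;     // known to the thread library only
  std::vector<std::uint64_t> to_delete;  // known to the debugger only
};

// Compare two id lists for one process.  Both inputs must be sorted in
// ascending order and free of duplicates.
thread_diff
diff_sorted_ids (const std::vector<std::uint64_t> &lib_ids,
                 const std::vector<std::uint64_t> &dbg_ids)
{
  thread_diff result;
  std::size_t pi = 0, gi = 0;

  while (pi < lib_ids.size () || gi < dbg_ids.size ())
    {
      if (pi == lib_ids.size ())
        result.to_delete.push_back (dbg_ids[gi++]);  // library list exhausted
      else if (gi == dbg_ids.size ())
        result.to_add.push_back (lib_ids[pi++]);     // debugger list exhausted
      else if (lib_ids[pi] == dbg_ids[gi])
        {
          pi++;                                      // present in both lists
          gi++;
        }
      else if (lib_ids[pi] < dbg_ids[gi])
        result.to_add.push_back (lib_ids[pi++]);
      else
        result.to_delete.push_back (dbg_ids[gi++]);
    }

  return result;
}
```

If the debugger-side list also contained threads of other processes, they would all land in to_delete, which matches the confusion described above.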

We also skip calling pd_activate from pd_enable if the
in_initial_library_scan flag is true for the current inferior.

Finally, the presence of the thread library in any but
the first inferior was not correctly detected due to a
bug in solib-aix.c, where the BFD file name for shared
library members was changed when the library was loaded
for the first time, which caused the library to no longer
be recognized by name when loaded a second time.
---
 gdb/aix-thread.c | 370 ++++++++++++++++++++++++++++-------------------
 gdb/solib-aix.c  |  14 ++
 2 files changed, 238 insertions(+), 146 deletions(-)

diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e556c153576..b7a16d76cf7 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -68,10 +68,6 @@ static bool debug_aix_thread;
 #define pthdb_tid_t	tid_t
 #endif
 
-/* Return whether to treat PID as a debuggable thread id.  */
-
-#define PD_TID(ptid)	(pd_active && ptid.tid () != 0)
-
 /* Success and failure values returned by pthdb callbacks.  */
 
 #define PDC_SUCCESS	PTHDB_SUCCESS
@@ -144,24 +140,6 @@ class aix_thread_target final : public target_ops
 
 static aix_thread_target aix_thread_ops;
 
-/* Address of the function that libpthread will call when libpthdebug
-   is ready to be initialized.  */
-
-static CORE_ADDR pd_brk_addr;
-
-/* Whether the current application is debuggable by pthdb.  */
-
-static int pd_able = 0;
-
-/* Whether a threaded application is being debugged.  */
-
-static int pd_active = 0;
-
-/* Whether the current architecture is 64-bit.  
-   Only valid when pd_able is true.  */
-
-static int arch64;
-
 /* Forward declarations for pthdb callbacks.  */
 
 static int pdc_symbol_addrs (pthdb_user_t, pthdb_symbol_t *, int);
@@ -191,9 +169,66 @@ static pthdb_callbacks_t pd_callbacks = {
   NULL
 };
 
-/* Current pthdb session.  */
+/* Aix variable structure.  */
+struct aix_thread_variables 
+{
+  /* Whether the current application is debuggable by pthdb.  */
+  int pd_able;
+
+  /* Whether a threaded application is being debugged.  */
+  int pd_active;
+
+  /* Current pthdb session.  */
+  pthdb_session_t pd_session;
+
+  /* Address of the function that libpthread will call when libpthdebug
+   is ready to be initialized.  */
+  CORE_ADDR pd_brk_addr;
+
+  /* Whether the current architecture is 64-bit.
+   Only valid when pd_able is true.  */
+  int arch64;
+};
+
+/* Key to our per-inferior data.  */
+static const registry<inferior>::key<aix_thread_variables>
+  aix_thread_variables_handle;
+
+/* Function to Get aix_thread_variables data.  */
+static struct aix_thread_variables*
+get_aix_thread_variables_data (struct inferior *inf)
+{
+  if (inf == NULL)
+    return NULL;
+
+  struct aix_thread_variables* data;
+
+  data = aix_thread_variables_handle.get (inf);
+  if (data == NULL)
+    data = aix_thread_variables_handle.emplace (inf);
+
+  return data;
+}
+
+/* Helper to get data for ptid in a function.  */
 
-static pthdb_session_t pd_session;
+static struct aix_thread_variables*
+get_thread_data_helper_for_ptid (ptid_t ptid)
+{
+  inferior *inf = find_inferior_ptid (current_inferior ()->process_target (),
+					ptid);
+  return get_aix_thread_variables_data (inf);
+}
+
+/* Helper to get data for pid in a function.  */
+
+static struct aix_thread_variables*
+get_thread_data_helper_for_pid (pid_t pid)
+{
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (),
+                                        pid);
+  return get_aix_thread_variables_data (inf);
+}
 
 /* Return a printable representation of pthdebug function return
    STATUS.  */
@@ -318,7 +353,7 @@ pid_to_prc (ptid_t *ptidp)
   ptid_t ptid;
 
   ptid = *ptidp;
-  if (PD_TID (ptid))
+  if (ptid.tid () != 0)
     *ptidp = ptid_t (ptid.pid ());
 }
 
@@ -389,6 +424,9 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_pid);
   
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_read_regs tid=%d flags=%s\n",
@@ -397,7 +435,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -423,7 +461,7 @@ pdc_read_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -456,6 +494,10 @@ pdc_write_regs (pthdb_user_t user_current_pid,
      this is needed, I have implemented what I think it should do,
      however this code is untested.  */
 
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_pid (user_current_pid);
+
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, "pdc_write_regs tid=%d flags=%s\n",
 		(int) tid, hex_string (flags));
@@ -463,7 +505,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* General-purpose registers.  */
   if (flags & PTHDB_FLAG_GPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_WRITE_GPRS, tid, 
 		     (unsigned long) context->gpr, 0, NULL);
       else
@@ -479,7 +521,7 @@ pdc_write_regs (pthdb_user_t user_current_pid,
   /* Special-purpose registers.  */
   if (flags & PTHDB_FLAG_SPRS)
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  ptrace64aix (PTT_WRITE_SPRS, tid, 
 		       (unsigned long) &context->msr, 0, NULL);
@@ -499,7 +541,9 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
 	       pthdb_addr_t addr, size_t len)
 {
   int status, ret;
-
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (), 
+				     user_current_pid);
+  
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"pdc_read_data (user_current_pid = %ld, buf = 0x%lx, addr = %s, len = %ld)\n",
@@ -508,14 +552,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
   /* This is needed to eliminate the dependency of current thread
      which is null so that thread reads the correct target memory.  */
   {
-    scoped_restore_current_thread restore_current_thread;
-    /* Before the first inferior is added, we pass inferior_ptid.pid ()
-       from pd_enable () which is 0.  There is no need to switch threads
-       during first initialisation.  In the rest of the callbacks the
-       current thread needs to be correct.  */
-    if (user_current_pid != 0)
-      switch_to_thread (current_inferior ()->process_target (),
-			ptid_t (user_current_pid));
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
+    inferior_ptid = ptid_t (user_current_pid);
+    scoped_restore_current_inferior restore_inferior;
+    set_current_inferior (inf);
+
+    scoped_restore_current_program_space restore_current_progspace;
+    set_current_program_space (inf->pspace);
     status = target_read_memory (addr, (gdb_byte *) buf, len);
   }
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
@@ -533,13 +576,25 @@ pdc_write_data (pthdb_user_t user_current_pid, void *buf,
 		pthdb_addr_t addr, size_t len)
 {
   int status, ret;
+  inferior *inf = find_inferior_pid (current_inferior ()->process_target (), 
+                                     user_current_pid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"pdc_write_data (user_current_pid = %ld, buf = 0x%lx, addr = %s, len = %ld)\n",
 		user_current_pid, (long) buf, hex_string (addr), len);
 
-  status = target_write_memory (addr, (gdb_byte *) buf, len);
+  {
+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
+    inferior_ptid = ptid_t (user_current_pid);
+    scoped_restore_current_inferior restore_inferior;
+    set_current_inferior (inf);
+
+    scoped_restore_current_program_space restore_current_progspace;
+    set_current_program_space (inf->pspace);
+    status = target_write_memory (addr, (gdb_byte *) buf, len);
+  }
+
   ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
 
   if (debug_aix_thread)
@@ -639,39 +694,6 @@ pcmp (const void *p1v, const void *p2v)
   return p1->pthid < p2->pthid ? -1 : p1->pthid > p2->pthid;
 }
 
-/* iterate_over_threads() callback for counting GDB threads.
-
-   Do not count the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_count (struct thread_info *thread, void *countp)
-{
-  if (PD_TID (thread->ptid))
-    (*(int *) countp)++;
-  return 0;
-}
-
-/* iterate_over_threads() callback for accumulating GDB thread pids.
-
-   Do not include the main thread (whose tid is zero).  This matches
-   the list of threads provided by the pthreaddebug library, which
-   does not include that main thread either, and thus allows us
-   to compare the two lists.  */
-
-static int
-giter_accum (struct thread_info *thread, void *bufp)
-{
-  if (PD_TID (thread->ptid))
-    {
-      **(struct thread_info ***) bufp = thread;
-      (*(struct thread_info ***) bufp)++;
-    }
-  return 0;
-}
-
 /* ptid comparison function */
 
 static int
@@ -719,7 +741,10 @@ get_signaled_thread (int pid)
 		    sizeof (thrinf), &ktid, 1) != 1)
 	break;
 
-      if (thrinf.ti_cursig == SIGTRAP)
+      /* We also need to keep in mind Trap and interrupt or any
+         signal that needs to be handled in pd_update ().  */
+
+      if (thrinf.ti_cursig)
 	return thrinf.ti_tid;
     }
 
@@ -741,7 +766,7 @@ get_signaled_thread (int pid)
        have difficulty with certain call patterns */
 
 static void
-sync_threadlists (int pid)
+sync_threadlists (pid_t pid) 
 {
   int cmd, status;
   int pcount, psize, pi, gcount, gi;
@@ -750,6 +775,11 @@ sync_threadlists (int pid)
   pthdb_pthread_t pdtid;
   pthread_t pthid;
   pthdb_tid_t tid;
+  process_stratum_target *proc_target
+            = current_inferior ()->process_target ();
+  thread_info  *tp;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
 
   /* Accumulate an array of libpthdebug threads sorted by pthread id.  */
 
@@ -759,11 +789,11 @@ sync_threadlists (int pid)
 
   for (cmd = PTHDB_LIST_FIRST;; cmd = PTHDB_LIST_NEXT)
     {
-      status = pthdb_pthread (pd_session, &pdtid, cmd);
+      status = pthdb_pthread (data->pd_session, &pdtid, cmd);
       if (status != PTHDB_SUCCESS || pdtid == PTHDB_INVALID_PTHREAD)
 	break;
 
-      status = pthdb_pthread_ptid (pd_session, pdtid, &pthid);
+      status = pthdb_pthread_ptid (data->pd_session, pdtid, &pthid);
       if (status != PTHDB_SUCCESS || pthid == PTHDB_INVALID_PTID)
 	continue;
 
@@ -780,7 +810,7 @@ sync_threadlists (int pid)
 
   for (pi = 0; pi < pcount; pi++)
     {
-      status = pthdb_pthread_tid (pd_session, pbuf[pi].pdtid, &tid);
+      status = pthdb_pthread_tid (data->pd_session, pbuf[pi].pdtid, &tid);
       if (status != PTHDB_SUCCESS)
 	tid = PTHDB_INVALID_TID;
       pbuf[pi].tid = tid;
@@ -790,13 +820,18 @@ sync_threadlists (int pid)
 
   /* Accumulate an array of GDB threads sorted by pid.  */
 
+  /* gcount is GDB thread count and pcount is pthreadlib thread count.  */
+
   gcount = 0;
-  iterate_over_threads (giter_count, &gcount);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    gcount++;
   g = gbuf = XNEWVEC (struct thread_info *, gcount);
-  iterate_over_threads (giter_accum, &g);
+  for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
+    *g++ = tp;
   qsort (gbuf, gcount, sizeof *gbuf, gcmp);
 
   /* Apply differences between the two arrays to GDB's thread list.  */
+
   for (pi = gi = 0; pi < pcount || gi < gcount;)
     {
       if (pi == pcount)
@@ -810,8 +845,6 @@ sync_threadlists (int pid)
 	  priv->pdtid = pbuf[pi].pdtid;
 	  priv->tid = pbuf[pi].tid;
 
-	  process_stratum_target *proc_target
-	    = current_inferior ()->process_target ();
 	  thread = add_thread_with_info (proc_target,
 					 ptid_t (pid, 0, pbuf[pi].pthid),
 					 priv);
@@ -841,13 +874,28 @@ sync_threadlists (int pid)
 	    }
 	  else if (cmp_result > 0)
 	    {
-	      delete_thread (gbuf[gi]);
-	      gi++;
+	      /* This is to make the main process thread now look
+                 like a thread.  */
+
+	      if (gptid.is_pid ())
+		{
+		  tp = find_thread_ptid (proc_target, gptid);
+		  thread_change_ptid (proc_target, gptid, pptid);
+		  aix_thread_info *priv = new aix_thread_info;
+		  priv->pdtid = pbuf[pi].pdtid;
+		  priv->tid = pbuf[pi].tid;
+		  tp->priv.reset (priv);
+		  gi++;
+		  pi++;
+		}
+	      else
+		{
+		  delete_thread (gbuf[gi]);
+		  gi++;
+		}
 	    }
 	  else
 	    {
-	      process_stratum_target *proc_target
-		= current_inferior ()->process_target ();
 	      thread = add_thread (proc_target, pptid);
 
 	      aix_thread_info *priv = new aix_thread_info;
@@ -881,17 +929,20 @@ iter_tid (struct thread_info *thread, void *tidp)
    return a pid-only ptid with PID.  */
 
 static ptid_t
-pd_update (int pid)
+pd_update (pid_t pid)
 {
   int status;
   ptid_t ptid;
   pthdb_tid_t tid;
   struct thread_info *thread = NULL;
+  struct aix_thread_variables *data;
+
+  data = get_thread_data_helper_for_pid (pid);
 
-  if (!pd_active)
+  if (!data->pd_active)
     return ptid_t (pid);
 
-  status = pthdb_session_update (pd_session);
+  status = pthdb_session_update (data->pd_session);
   if (status != PTHDB_SUCCESS)
     return ptid_t (pid);
 
@@ -915,34 +966,23 @@ pd_update (int pid)
    for that thread.  Otherwise, return a ptid-only ptid using PID.  */
 
 static ptid_t
-pd_activate (int pid)
+pd_activate (pid_t pid)
 {
   int status;
-		
-  status = pthdb_session_init (pid, arch64 ? PEM_64BIT : PEM_32BIT,
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_pid (pid);
+  
+  status = pthdb_session_init (pid, data->arch64 ? PEM_64BIT : PEM_32BIT,
 			       PTHDB_FLAG_REGS, &pd_callbacks, 
-			       &pd_session);
+			       &data->pd_session);
   if (status != PTHDB_SUCCESS)
     {
       return ptid_t (pid);
     }
-  pd_active = 1;
+  data->pd_active = 1;
   return pd_update (pid);
 }
 
-/* Undo the effects of pd_activate().  */
-
-static void
-pd_deactivate (void)
-{
-  if (!pd_active)
-    return;
-  pthdb_session_destroy (pd_session);
-  
-  pid_to_prc (&inferior_ptid);
-  pd_active = 0;
-}
-
 /* An object file has just been loaded.  Check whether the current
    application is pthreaded, and if so, prepare for thread debugging.  */
 
@@ -952,13 +992,19 @@ pd_enable (void)
   int status;
   char *stub_name;
   struct bound_minimal_symbol ms;
+  struct aix_thread_variables *data;
+
+  if (!inferior_ptid.pid ())
+    return;
+  
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   /* Don't initialize twice.  */
-  if (pd_able)
+  if (data->pd_able)
     return;
 
   /* Check application word size.  */
-  arch64 = register_size (target_gdbarch (), 0) == 8;
+  data->arch64 = register_size (target_gdbarch (), 0) == 8;
 
   /* Check whether the application is pthreaded.  */
   stub_name = NULL;
@@ -972,13 +1018,19 @@ pd_enable (void)
   ms = lookup_minimal_symbol (stub_name, NULL, NULL);
   if (ms.minsym == NULL)
     return;
-  pd_brk_addr = ms.value_address ();
-  if (!create_thread_event_breakpoint (target_gdbarch (), pd_brk_addr))
+  data->pd_brk_addr = ms.value_address ();
+  if (!create_thread_event_breakpoint (target_gdbarch (), data->pd_brk_addr))
     return;
 
   /* Prepare for thread debugging.  */
   current_inferior ()->push_target (&aix_thread_ops);
-  pd_able = 1;
+  data->pd_able = 1; 
+
+  inferior *inf = current_inferior (); 
+  /* When attaching / handling fork child, don't try activating
+     thread debugging until we know about all shared libraries.  */ 
+  if (inf->in_initial_library_scan)
+    return;
 
   /* If we're debugging a core file or an attached inferior, the
      pthread library may already have been initialized, so try to
@@ -991,28 +1043,31 @@ pd_enable (void)
 static void
 pd_disable (void)
 {
-  if (!pd_able)
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
+
+  if (!data->pd_able)
     return;
-  if (pd_active)
-    pd_deactivate ();
-  pd_able = 0;
+  if (!data->pd_active)
+    return;
+  pthdb_session_destroy (data->pd_session);
+ 
+  pid_to_prc (&inferior_ptid);
+  data->pd_active = 0;
+  data->pd_able = 0;
   current_inferior ()->unpush_target (&aix_thread_ops);
 }
 
 /* new_objfile observer callback.
 
    If OBJFILE is non-null, check whether a threaded application is
-   being debugged, and if so, prepare for thread debugging.
-
-   If OBJFILE is null, stop debugging threads.  */
+   being debugged, and if so, prepare for thread debugging.  */
 
 static void
 new_objfile (struct objfile *objfile)
 {
   if (objfile)
     pd_enable ();
-  else
-    pd_disable ();
 }
 
 /* Attach to process specified by ARGS.  */
@@ -1042,8 +1097,11 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 {
   struct thread_info *thread;
   pthdb_tid_t tid[2];
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (ptid);
 
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     {
       scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
       
@@ -1065,7 +1123,7 @@ aix_thread_target::resume (ptid_t ptid, int step, enum gdb_signal sig)
 	       ptid.lwp ());
       tid[1] = 0;
 
-      if (arch64)
+      if (data->arch64)
 	ptrace64aix (PTT_CONTINUE, tid[0], (long long) 1,
 		     gdb_signal_to_host (sig), (PTRACE_TYPE_ARG5) tid);
       else
@@ -1082,6 +1140,7 @@ ptid_t
 aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
 			 target_wait_flags options)
 {
+  struct aix_thread_variables *data;
   {
     pid_to_prc (&ptid);
 
@@ -1095,8 +1154,10 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
      pid-only ptids.  */
   gdb_assert (ptid.is_pid ());
 
+  data = get_thread_data_helper_for_ptid (ptid);
+
   /* Check whether libpthdebug might be ready to be initialized.  */
-  if (!pd_active && status->kind () == TARGET_WAITKIND_STOPPED
+  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
       && status->sig () == GDB_SIGNAL_TRAP)
     {
       process_stratum_target *proc_target
@@ -1105,7 +1166,7 @@ aix_thread_target::wait (ptid_t ptid, struct target_waitstatus *status,
       struct gdbarch *gdbarch = regcache->arch ();
 
       if (regcache_read_pc (regcache)
-	  - gdbarch_decr_pc_after_break (gdbarch) == pd_brk_addr)
+	  - gdbarch_decr_pc_after_break (gdbarch) == data->pd_brk_addr)
 	return pd_activate (ptid.pid ());
     }
 
@@ -1229,18 +1290,20 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
   ppc_gdbarch_tdep *tdep = gdbarch_tdep<ppc_gdbarch_tdep> (gdbarch);
   int status, i;
   pthdb_context_t ctx;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
 		"fetch_regs_user_thread %lx\n", (long) pdtid);
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: fetch_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
 
   /* General-purpose registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_gprs64 (regcache, ctx.gpr);
   else
     for (i = 0; i < ppc_num_gprs; i++)
@@ -1253,7 +1316,7 @@ fetch_regs_user_thread (struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Special registers.  */
 
-  if (arch64)
+  if (data->arch64)
     supply_sprs64 (regcache, ctx.iar, ctx.msr, ctx.cr, ctx.lr, ctx.ctr,
 			     ctx.xer, ctx.fpscr);
   else
@@ -1288,18 +1351,21 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
   struct ptxsprs sprs64;
   struct ptsprs sprs32;
   int i;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog,
 		"fetch_regs_kernel_thread tid=%lx regno=%d arch64=%d\n",
-		(long) tid, regno, arch64);
+		(long) tid, regno, data->arch64);
 
   /* General-purpose registers.  */
   if (regno == -1
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_gprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_GPRS, tid, 
 			    (unsigned long) gprs64, 0, NULL))
@@ -1331,7 +1397,7 @@ fetch_regs_kernel_thread (struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  if (!ptrace64aix (PTT_READ_SPRS, tid, 
 			    (unsigned long) &sprs64, 0, NULL))
@@ -1363,7 +1429,11 @@ aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
   struct thread_info *thread;
   pthdb_tid_t tid;
 
-  if (!PD_TID (regcache->ptid ()))
+  /* If a new inferior is born, then its pthread debug library is yet to
+     be initialised and hence has no private data.  So the below if
+     condition exists.  */
+
+  if (regcache->ptid ().tid () == 0)
     beneath ()->fetch_registers (regcache, regno);
   else
     {
@@ -1511,6 +1581,8 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   pthdb_context_t ctx;
   uint32_t int32;
   uint64_t int64;
+  struct aix_thread_variables *data;
+  data = get_thread_data_helper_for_ptid (inferior_ptid);
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1518,7 +1590,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 
   /* Retrieve the thread's current context for its non-register
      values.  */
-  status = pthdb_pthread_context (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_context (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: pthdb_pthread_context returned %s"),
 	   pd_status2str (status));
@@ -1528,7 +1600,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
   for (i = 0; i < ppc_num_gprs; i++)
     if (REG_VALID == regcache->get_register_status (tdep->ppc_gp0_regnum + i))
       {
-	if (arch64)
+	if (data->arch64)
 	  {
 	    regcache->raw_collect (tdep->ppc_gp0_regnum + i, (void *) &int64);
 	    ctx.gpr[i] = int64;
@@ -1545,7 +1617,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
     fill_fprs (regcache, ctx.fpr);
 
   /* Special registers (always kept in ctx as 64 bits).  */
-  if (arch64)
+  if (data->arch64)
     {
       fill_sprs64 (regcache, &ctx.iar, &ctx.msr, &ctx.cr, &ctx.lr, &ctx.ctr,
 			     &ctx.xer, &ctx.fpscr);
@@ -1576,7 +1648,7 @@ store_regs_user_thread (const struct regcache *regcache, pthdb_pthread_t pdtid)
 	ctx.fpscr = tmp_fpscr;
     }
 
-  status = pthdb_pthread_setcontext (pd_session, pdtid, &ctx);
+  status = pthdb_pthread_setcontext (data->pd_session, pdtid, &ctx);
   if (status != PTHDB_SUCCESS)
     error (_("aix-thread: store_registers: "
 	     "pthdb_pthread_setcontext returned %s"),
@@ -1602,6 +1674,9 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
   double fprs[ppc_num_fprs];
   struct ptxsprs sprs64;
   struct ptsprs  sprs32;
+  struct aix_thread_variables *data;
+  
+  data = get_thread_data_helper_for_ptid (regcache->ptid ());
 
   if (debug_aix_thread)
     gdb_printf (gdb_stdlog, 
@@ -1613,7 +1688,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
       || (tdep->ppc_gp0_regnum <= regno
 	  && regno < tdep->ppc_gp0_regnum + ppc_num_fprs))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some regs may not be in the cache.  */
 	  ptrace64aix (PTT_READ_GPRS, tid, (unsigned long) gprs64, 0, NULL);
@@ -1646,7 +1721,7 @@ store_regs_kernel_thread (const struct regcache *regcache, int regno,
 
   if (regno == -1 || special_register_p (gdbarch, regno))
     {
-      if (arch64)
+      if (data->arch64)
 	{
 	  /* Pre-fetch: some registers won't be in the cache.  */
 	  ptrace64aix (PTT_READ_SPRS, tid, 
@@ -1703,7 +1778,7 @@ aix_thread_target::store_registers (struct regcache *regcache, int regno)
   struct thread_info *thread;
   pthdb_tid_t tid;
 
-  if (!PD_TID (regcache->ptid ()))
+  if (regcache->ptid ().tid () == 0)
     beneath ()->store_registers (regcache, regno);
   else
     {
@@ -1741,7 +1816,7 @@ aix_thread_target::mourn_inferior ()
 {
   target_ops *beneath = this->beneath ();
 
-  pd_deactivate ();
+  pd_disable ();
   beneath->mourn_inferior ();
 }
 
@@ -1750,7 +1825,7 @@ aix_thread_target::mourn_inferior ()
 bool
 aix_thread_target::thread_alive (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     return beneath ()->thread_alive (ptid);
 
   /* We update the thread list every time the child stops, so all
@@ -1766,7 +1841,7 @@ aix_thread_target::thread_alive (ptid_t ptid)
 std::string
 aix_thread_target::pid_to_str (ptid_t ptid)
 {
-  if (!PD_TID (ptid))
+  if (ptid.tid () == 0)
     return beneath ()->pid_to_str (ptid);
 
   return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
@@ -1786,8 +1861,11 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
   pthdb_detachstate_t detachstate;
   int cancelpend;
   static char *ret = NULL;
+  struct aix_thread_variables *data;
+
+  data = get_thread_data_helper_for_ptid (thread->ptid);
 
-  if (!PD_TID (thread->ptid))
+  if (thread->ptid.tid () == 0)
     return NULL;
 
   string_file buf;
@@ -1800,24 +1878,24 @@ aix_thread_target::extra_thread_info (struct thread_info *thread)
     /* i18n: Like "thread-identifier %d, [state] running, suspended" */
     buf.printf (_("tid %d"), (int)tid);
 
-  status = pthdb_pthread_state (pd_session, pdtid, &state);
+  status = pthdb_pthread_state (data->pd_session, pdtid, &state);
   if (status != PTHDB_SUCCESS)
     state = PST_NOTSUP;
   buf.printf (", %s", state2str (state));
 
-  status = pthdb_pthread_suspendstate (pd_session, pdtid, 
+  status = pthdb_pthread_suspendstate (data->pd_session, pdtid, 
 				       &suspendstate);
   if (status == PTHDB_SUCCESS && suspendstate == PSS_SUSPENDED)
     /* i18n: Like "Thread-Id %d, [state] running, suspended" */
     buf.printf (_(", suspended"));
 
-  status = pthdb_pthread_detachstate (pd_session, pdtid, 
+  status = pthdb_pthread_detachstate (data->pd_session, pdtid, 
 				      &detachstate);
   if (status == PTHDB_SUCCESS && detachstate == PDS_DETACHED)
     /* i18n: Like "Thread-Id %d, [state] running, detached" */
     buf.printf (_(", detached"));
 
-  pthdb_pthread_cancelpend (pd_session, pdtid, &cancelpend);
+  pthdb_pthread_cancelpend (data->pd_session, pdtid, &cancelpend);
   if (status == PTHDB_SUCCESS && cancelpend)
     /* i18n: Like "Thread-Id %d, [state] running, cancel pending" */
     buf.printf (_(", cancel pending"));
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index f483f54de13..6be81064ebd 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -618,6 +618,20 @@ solib_aix_bfd_open (const char *pathname)
       if (member_name == bfd_get_filename (object_bfd.get ()))
 	break;
 
+      std::string s = bfd_get_filename (object_bfd.get ());
+
+      /* For every inferior after the first, the BFD system will
+	 have the pathname registered instead of the member name.
+	 Hence the below condition exists.  */
+
+      if (s.find ('(') != std::string::npos)
+	{
+	  int pos = s.find ('(');
+	  int len = s.find (')') - s.find ('(');
+	  if (s.substr (pos+1, len-1) == member_name) 
+	    return object_bfd;
+	}
+
       object_bfd = gdb_bfd_openr_next_archived_file (archive_bfd.get (),
 						     object_bfd.get ());
     }
-- 
2.31.1



* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
  2023-02-17 15:15                                                                                     ` Aditya Kamath1
@ 2023-02-17 19:14                                                                                       ` Ulrich Weigand
  0 siblings, 0 replies; 49+ messages in thread
From: Ulrich Weigand @ 2023-02-17 19:14 UTC (permalink / raw)
  To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>Please find attached the patch with all the changes mentioned.

This version looks good to me.  I've checked it in now.

Thanks,
Ulrich



* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
@ 2022-11-08 12:00 Aditya Kamath1
  0 siblings, 0 replies; 49+ messages in thread
From: Aditya Kamath1 @ 2022-11-08 12:00 UTC (permalink / raw)
  To: Ulrich Weigand, simark, gdb-patches; +Cc: Sangamesh Mallayya


Hi Ulrich,


>You should find out why the "priv" field isn't
>set up correctly, and fix whatever was going
>wrong there.  (I believe this should have been
>done in sync_threadlists.)

You were right about this. What is happening is that the main process and the thread representing it are treated as two separate threads by the libpthread library. The main process had no private data set, whereas the thread representing it had. Both of them should have private data, and it must be the same.

For example, consider the program below [Program credits: GDB test case continue-pending-status.c]:


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  pthread_barrier_wait (&barrier);

  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    sleep (1);

  return 0;
}

Here is the GDB output for the above program. When I switched to GDB thread 2, which is the same thread as GDB thread 1 (the main process), and interrupted, it was thread 1 that received the signal. So when we added private data in sync_threadlists (), we added it for thread 2 but not for thread 1, which is the main-process entry for that same thread. This is why we got the assertion failure: thread 1 did not have private data.


Reading symbols from /home/XYZ/gdb_tests/continue-pending-status...
(gdb) r
Starting program: /home/XYZ/gdb_tests/continue-pending-status
[New Thread 1]
^C[New Thread 258]
[New Thread 515]

Thread 1 received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info threads
  Id   Target Id                          Frame
* 1    process 12059046                   0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
  2    Thread 1 (tid 39125487, running)   0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
  3    Thread 258 (tid 23396809, running) thread_function (arg=0x0) at continue-pending-status.c:36
  4    Thread 515 (tid 36503883, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)

0x0) at continue-pending-status.c:36
(gdb) thread 2
[Switching to thread 2 (Thread 1)]
#0  0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) c
Continuing.
^C
Thread 1 received signal SIGINT, Interrupt.
[Switching to process 12059046]
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb)
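
The duplicate entry seen in the output above can be illustrated with a small standalone sketch. It does not use GDB's real types: toy_ptid, toy_priv, toy_thread and toy_sync_one are hypothetical, simplified stand-ins, and the matching rule (reuse the pid-only entry of the same process) is an assumption made for illustration. The idea shown is that when the thread library first reports a pthread, the existing pid-only entry is re-labelled and given private data, instead of a second entry being added next to it.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins for GDB's ptid_t, aix_thread_info and thread_info.
// Names and fields are simplified and exist only for this illustration.
struct toy_ptid
{
  int pid;
  long tid;   // 0 means "pid-only": the entry stands for the whole process.
  bool is_pid () const { return tid == 0; }
};

struct toy_priv
{
  long pdtid;  // handle used by the thread debug library
  long ktid;   // kernel thread id
};

struct toy_thread
{
  toy_ptid ptid;
  std::unique_ptr<toy_priv> priv;  // null until thread debugging fills it in
};

// Record one pthread reported by the thread library.  If the list still
// holds the pid-only entry for this process, re-label that entry in place
// and attach private data; otherwise append a new entry with private data.
inline void
toy_sync_one (std::vector<toy_thread> &threads, int pid, long pthid,
              long pdtid, long ktid)
{
  assert (pthid != 0);  // a pthread id of 0 would look like a pid-only ptid
  for (toy_thread &t : threads)
    if (t.ptid.pid == pid && t.ptid.is_pid ())
      {
        t.ptid.tid = pthid;
        t.priv.reset (new toy_priv {pdtid, ktid});
        return;
      }
  threads.push_back (toy_thread {toy_ptid {pid, pthid},
                                 std::unique_ptr<toy_priv> (
                                   new toy_priv {pdtid, ktid})});
}
```

With this shape, a list that starts with a single pid-only entry still has one entry after the first pthread is reported, and every entry that carries a non-zero tid also carries private data.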

I have written my comments in the patch. I hope this works; if it is right, kindly push it to git, otherwise let me know what you think.

Have a nice day ahead.

Thanks and regards,
Aditya.
________________________________
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Sent: 28 October 2022 15:19
To: simark@simark.ca <simark@simark.ca>; Aditya Kamath1 <Aditya.Kamath1@ibm.com>; gdb-patches@sourceware.org <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

> static aix_thread_info *
> get_aix_thread_info (thread_info *thread)
> {
>+  if (thread->priv == NULL)
>+    return NULL;

This doesn't look right.  Note that all users of
get_aix_thread_info assume the pointer returned
from there is never NULL.

You should find out why the "priv" field isn't
set up correctly, and fix whatever was going
wrong there.  (I believe this should have been
done in sync_threadlists.)

Bye,
Ulrich



end of thread, other threads:[~2023-02-17 19:14 UTC | newest]

Thread overview: 49+ messages
2022-10-25  6:47 [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch Aditya Kamath1
2022-10-28  9:49 ` Ulrich Weigand
2022-11-08 12:00   ` Aditya Kamath1
2022-11-08 12:17     ` Ulrich Weigand
2022-11-13 18:15       ` Aditya Kamath1
2022-11-15 18:16         ` Ulrich Weigand
2022-11-21  8:27           ` Aditya Kamath1
2022-11-23 14:15             ` Ulrich Weigand
2022-11-23 16:03               ` Aditya Kamath1
2022-11-23 17:09                 ` Ulrich Weigand
2022-11-23 18:45                   ` Aditya Kamath1
2022-11-29  8:18                     ` Aditya Kamath1
2022-11-30 14:57                       ` Ulrich Weigand
2022-12-02  7:50                         ` Aditya Kamath1
2022-12-05 18:33                           ` Ulrich Weigand
2022-12-08 10:28                             ` Aditya Kamath1
2022-12-08 10:46                               ` Aditya Kamath1
2022-12-08 16:29                               ` Ulrich Weigand
2022-12-15 12:58                                 ` Aditya Kamath1
2022-12-15 15:53                                   ` Ulrich Weigand
2022-12-19  6:30                                     ` Aditya Kamath1
2022-12-22 12:50                                       ` Ulrich Weigand
2022-12-26 13:18                                         ` Aditya Kamath1
2023-01-09 14:04                                           ` Ulrich Weigand
2023-01-10 12:23                                             ` Aditya Kamath1
2023-01-11 13:31                                               ` Ulrich Weigand
2023-01-13 14:06                                                 ` Aditya Kamath1
2023-01-20 14:44                                                   ` Ulrich Weigand
2023-01-27 14:40                                                     ` Aditya Kamath1
2023-01-30 19:54                                                       ` Tom Tromey
2023-02-02  6:24                                                       ` Aditya Kamath1
2023-02-02  6:35                                                         ` Aditya Kamath1
2023-02-02 17:43                                                           ` Ulrich Weigand
2023-02-03 11:10                                                             ` Aditya Kamath1
2023-02-06 19:07                                                               ` Ulrich Weigand
2023-02-07 11:57                                                                 ` Aditya Kamath1
2023-02-08 18:44                                                                   ` Ulrich Weigand
2023-02-10 16:33                                                                     ` Aditya Kamath1
2023-02-10 16:46                                                                       ` Aditya Kamath1
2023-02-13 19:01                                                                       ` Ulrich Weigand
2023-02-14 14:13                                                                         ` Aditya Kamath1
2023-02-16 19:46                                                                           ` Ulrich Weigand
2023-02-17 11:26                                                                             ` Aditya Kamath1
2023-02-17 12:04                                                                               ` Ulrich Weigand
2023-02-17 13:22                                                                                 ` Aditya Kamath1
2023-02-17 14:18                                                                                   ` Ulrich Weigand
2023-02-17 15:15                                                                                     ` Aditya Kamath1
2023-02-17 19:14                                                                                       ` Ulrich Weigand
2022-11-08 12:00 Aditya Kamath1
