Hi all,

I have fixed the issue with multiple thread detection. Please find attached the patch and explanation.

The problem is that on AIX we are not able to debug multiple threads: the threads are not visible in a debug session even though they exist in the inferior.

Fact 1:- In AIX, before we enter the target wait code to fetch the pid of the inferior, we switch to no thread with switch_to_no_thread(), reached through switch_to_inferior_no_thread() in inferior.c.

Fact 2:- How do we add threads in AIX to a single inferior process?

1. We start off by initialising the pthread debug library using pthdb_session_pthreaded().
2. We then initialise a thread session by calling pthdb_session_init() from pd_activate(), passing callback functions that the library calls to read/write memory and look up symbols.
3. We then update our session variable if pthdb_session_init() was successful [PTHDB_SUCCESS].

More about this can be read in https://www.ibm.com/docs/en/aix/7.1?topic=programming-developing-multithreaded-program-debuggers

The cause of the bug:- The GDB core has called switch_to_no_thread(), so no thread is selected until we return the pid from wait(). So, when pd_activate() is called from wait() in aix-thread.c and in turn calls pthdb_session_init(), we receive PTHDB_NOT_THREADED.

Reason:- Because the GDB core has switched to no thread, the pthdb_session_init() callback functions end up reading and writing incorrect addresses.

What is the solution:- We should switch to the thread of the pid that has just been returned from beneath->wait() using switch_to_thread(), which is what this patch does.
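To make the failure mode described above concrete, here is a minimal, self-contained C sketch. It is not GDB or libpthdebug code: every name in it (wait_sketch, session_init_sketch, read_threaded_flag_cb, the SKETCH_* values and the pid 4242) is a hypothetical stand-in, and it only models the assumption stated in this mail, namely that the session-init callbacks can read the inferior correctly only while one of its threads is selected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names are GDB or libpthdebug APIs.  */
enum sketch_status { SKETCH_SUCCESS, SKETCH_NOT_THREADED };

#define INFERIOR_PID 4242       /* made-up pid of the debuggee */

static int current_pid = 0;     /* 0 models "switched to no thread" */

/* Models a memory-read callback: it can only see the inferior's data
   while a thread of that inferior is selected.  */
static bool
read_threaded_flag_cb (int *flag)
{
  assert (flag != NULL);
  if (current_pid != INFERIOR_PID)
    return false;               /* no context, so the read fails */
  *flag = 1;                    /* models "the inferior is threaded" */
  return true;
}

/* Models session initialisation, which relies on the callback.  */
static enum sketch_status
session_init_sketch (void)
{
  int flag = 0;

  if (!read_threaded_flag_cb (&flag) || flag == 0)
    return SKETCH_NOT_THREADED;
  return SKETCH_SUCCESS;
}

/* Models the wait path: the core has selected no thread, the layer
   beneath reports a pid, and only then do we try to start the session.  */
enum sketch_status
wait_sketch (bool switch_before_init)
{
  int event_pid;

  current_pid = 0;              /* the core switched to no thread */
  event_pid = INFERIOR_PID;     /* what the wait beneath returned */
  if (switch_before_init)
    current_pid = event_pid;    /* the step this patch adds */
  return session_init_sketch ();
}
```

In this model, wait_sketch (false) returns SKETCH_NOT_THREADED and wait_sketch (true) returns SKETCH_SUCCESS; the only difference between the two calls is whether a thread is selected before the session is initialised.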
Once this is done, before we check whether the pthread debug library is initialised using:

  if (!pd_active
      && status->kind () == TARGET_WAITKIND_STOPPED
      && status->sig () == GDB_SIGNAL_TRAP)

all our registers hold the information the pthdb_session_init() callback functions need about which addresses and symbols to read/write. Thus this time, when we call pthdb_session_init(), since we are switched to a thread, our session is successfully established, and pd_update () along with its friend sync_threadlists () will take care of adding threads.

The other change is a simple semantic one: instead of passing 1, we now need to pass the process ID.

Let me know what you think.

(See patch:- 0001-Fix-for-multiple-thread-detection-in-AIX.patch)

Thanks and regards,
Aditya.

-------------------------------------------------------------------------

This can be shown by the following program:-

#include <assert.h>
#include <pthread.h>
#include <unistd.h>

pthread_barrier_t barrier;

#define NUM_THREADS 2

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */
  pthread_barrier_wait (&barrier);

  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL, thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    sleep (1);

  return 0;
}

Output without patch:-

(gdb) r
Starting program: /home/aditya/gdb_tests/continue-pending-status
^C
Program received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info threads
  Id   Target Id          Frame
* 1    process 29557240   0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb)

Output with patch:-

Reading symbols from /home/aditya/gdb_tests/continue-pending-status...
(gdb) r
Starting program: /home/aditya/gdb_tests/continue-pending-status
[New Thread 1]
^C[New Thread 258]
[New Thread 515]

Thread 1 received signal SIGINT, Interrupt.
0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
(gdb) info threads
  Id   Target Id                            Frame
* 1    process 29557210                     0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
  2    Thread 1 (tid 120197499, running)    0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
  3    Thread 258 (tid 130486575, running)  thread_function (arg=0x0) at continue-pending-status.c:36
  4    Thread 515 (tid 131666371, running)  thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.) 0x0) at continue-pending-status.c:36
(gdb)

________________________________
From: Aditya Kamath1
Sent: 22 July 2022 22:33
To: Ulrich Weigand; Sangamesh Mallayya; simon.marchi@efficios.com; gdb-patches@sourceware.org; simark@simark.ca
Subject: Re: [PATCH] Fix-for-multiple-thread-detection-in-AIX.patch

Hi all,

This applies in the multi-threaded case.

Why do we need to reset inferior_ptid in wait?

AIX uses the pthread debug library, which has to be initialised first. When wait reports an event due to a thread creation, once we fetch the pid using waitpid(), the following condition is satisfied:

  /* Check whether libpthdebug might be ready to be initialized.  */
  if (!pd_active
      && status->kind () == TARGET_WAITKIND_STOPPED
      && status->sig () == GDB_SIGNAL_TRAP)

and we enter pd_activate(). After a successful pthread debug library initialisation we have to start a thread debug session using pthdb_session_init(). While it does that, the library uses our callbacks to read symbols, data etc. This is where the problem is: we are not able to read them successfully because the process ID information is incorrect. The right address cannot be fetched in pdc_read_data(), where we call target_read_memory(), or in pdc_symbol_addrs().
Kindly read https://www.ibm.com/docs/en/aix/7.1?topic=programming-developing-multithreaded-program-debuggers for more on what I shared.

What are we thinking might be the solution?

We initialise gdb with an observer that notifies us when a thread's ptid changes. On a wait event, inferior_ptid is set to null by switch_to_no_thread () together with reinit_frame_cache (). With this we lose all the pid information needed to read our data and symbols, as our thread ptid is now null. So it is essential for us, with the way the AIX code is designed, that we either reset inferior_ptid in the AIX thread wait code before we return to the target-independent code (target.c, infrun.c) that would otherwise do it for us, or use

  thread_change_ptid (process_stratum_target *targ, ptid_t old_ptid, ptid_t new_ptid)

to inform GDB, so that we get the right address for target_read_memory() in aix-thread.c.

It would be great if you could tell us whether we are thinking along the right lines.

________________________________
From: Ulrich Weigand
Sent: 19 July 2022 17:51
To: Sangamesh Mallayya; Aditya Kamath1; simon.marchi@efficios.com; gdb-patches@sourceware.org; simark@simark.ca
Subject: Re: [PATCH] Fix-for-multiple-thread-detection-in-AIX.patch

Aditya Kamath1 wrote:

>The reason:- Since a new thread addition causes a thread target to
>wait, in AIX once the event ptid is got with the waitpid(), we need to
>set the inferior_ptid variable. Every time we come into
>aix_thread_target::wait() we check if libpthdebug might be ready to be
>initialized.In doing so we call pd_activate(). Here the session needs
>to be successfully initialised failing to which just a pid is
>returned.
>We do not enter pd_update() in the former case to take care of the rest
>of the thread addition process. The pthdb_session_init() is dependent
>on inferior_ptid variable as per our observations to return
>PTHDB_SUCCESS.

I think the change to pd_enable makes sense, passing 1 to pd_activate
seems clearly incorrect now. Simon, you recently changed pd_activate
to take a PID instead of a boolean - any comments on this?

However, I do not see why the change to ::wait is necessary, or even
correct. Note that when ::wait calls pd_activate or pd_update, it
already passes the correct pid. I do not see any path from ::wait
to pd_enable.

Bye,
Ulrich
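
The point about passing 1 to a function that now takes a PID can be illustrated with a small standalone C sketch. This is not the GDB code: activate_with_flag_sketch, activate_with_pid_sketch and the pid 4242 are hypothetical names and values, and the sketch only assumes what the message above states, that the parameter changed from a boolean to a process ID. Because both are plain integers, a call site that still passes 1 compiles without complaint but now names the wrong process.

```c
#include <stdbool.h>

/* Hypothetical model, not GDB code.  */
#define INFERIOR_PID 4242       /* made-up pid of the debuggee */

/* Old-style interface: the argument is a yes/no flag.  */
bool
activate_with_flag_sketch (int attach)
{
  return attach != 0;
}

/* New-style interface: the argument names the process to activate.
   Only the inferior's own pid leads to a successful activation.  */
bool
activate_with_pid_sketch (int pid)
{
  return pid == INFERIOR_PID;
}
```

With these definitions, activate_with_flag_sketch (1) succeeds, activate_with_pid_sketch (1) silently fails because 1 is taken as a pid, and activate_with_pid_sketch (INFERIOR_PID) succeeds. That is the kind of mismatch the pd_enable change removes.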