Hi Ulrich, Please find attached the new patch. See [0001-Fix-Multi-thread-debug-bug-fix-in-AIX.patch]. >Why does the first thread look so different? That's not the >case with Linux threads. I believe even if you re-use the >thread structure, you'll still need to switch the ptid to one >that indicates a thread instead of a non-threaded process. >That's what the old code attempted to do as far as I can see. >if it got it wrong in certain corner cases, they need to be fixed. >but completely removing that logic seems just wrong. I misunderstood the code and what it was trying to do. You were right. In an attempt to balance pcount and gcount I messed the code up. I went one version back and corrected the same. I underestimated the power of ptid class. As I explored parts of GDB I realised stuff we can do with the same. Actually, we need not delete the main thread. Instead, we can change the ptid from what represented the main ptid to now the main thread ptid. So, if my main thread has no private data but it has a thread info I know for a fact that this is the first time my debugee code will be multi-threaded or pthreaded. Hence, I make the change to the ptid representing the main ptid and set the private data. That is what these two lines do. + if (tp != NULL && tp->priv == NULL) + { + thread_change_ptid (proc_target, tp->ptid, + ptid_t (pid, 0, pbuf[pi].pthid)); + tp->priv.reset (priv); + } + else + thread = add_thread_with_info (proc_target, + ptid_t (pid, 0, pbuf[pi].pthid), + priv); This will make our main thread look like how threads look like in the linux world while using GDB. Kindly see the output in the last para of this email with the code. - switch_to_thread (current_inferior ()->process_target (), -                 ptid_t (user_current_pid)); +     inferior *inf = find_inferior_ptid (current_inferior ()-> process_target (), +                              ptid_t (user_current_pid)); + for (thread_info *tp: inf->threads ()) +      if (tp != NULL) + { + switch_to_thread (tp); + break; + } If you recall, we did this change a few months back to ensure we are in the right context while reading memory. So, here's the thing. So far, we had our man, the main process thread guy to save us here using ptid_t (user_current_pid). But now we have the multi-threaded ptid in case our debugee is multi-threaded and out ptid will be like ptid_t (pid, 0, pbuf[pi].pthid).. What I mean is there is no ptid representing a process in the latter case. Hence this change.. Let me know if I have thought this correctly and what you think of my analysis. If this is good kindly push this, otherwise let me know what I need to change. Have a nice day ahead. Thanks and regards, Aditya. ----------------------------------------------------------------------- PROGRAM:- [ Credits -- GDB test case continuous pending under gdb.threads ] #include #include #include #include #include pthread_barrier_t barrier; #define NUM_THREADS 3 void * thread_function (void *arg) { /* This ensures that the breakpoint is only hit after both threads are created, so the test can always switch to the non-event thread when the breakpoint triggers. */ pthread_barrier_wait (&barrier); while (1); /* break here */ } int main (void) { int i; alarm (300); pthread_barrier_init (&barrier, NULL, NUM_THREADS); for (i = 0; i < NUM_THREADS; i++) { pthread_t thread; int res; res = pthread_create (&thread, NULL, thread_function, NULL); assert (res == 0); } while (1) sleep (1); return 0; } --------------------------------------------------------------- Output:- Reading symbols from /home/xyz/gdb_tests/continue-pending-status... (gdb) r Starting program: /home/xyz/gdb_tests/continue-pending-status ^C[New Thread 258] [New Thread 515] [New Thread 772] Thread 1 received signal SIGINT, Interrupt. [Switching to Thread 1] 0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o) (gdb) info threads Id Target Id Frame * 1 Thread 1 (tid 32309733, running) 0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o) 2 Thread 258 (tid 31850777, running) thread_function (arg=0x0) at /home/xyz/gdb_tests/continue-pending-status.c:36 3 Thread 515 (tid 30474663, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.) 0x0) at /home/xyz/gdb_tests/continue-pending-status.c:36 4 Thread 772 (tid 33423627, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.) 0x0) at /home/xyz/gdb_tests/continue-pending-status.c:36 (gdb) ________________________________ From: Ulrich Weigand Sent: 15 November 2022 23:46 To: simark@simark.ca ; Aditya Kamath1 ; gdb-patches@sourceware.org Cc: Sangamesh Mallayya Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch Aditya Kamath1 wrote: >>So I think instead of adding a "priv" struct to that GDB thread >>identifying the main process, the sync_threadlists routine should >>actually just delete it (or replace it with the actual first thread, >>whatever is easier). > >I have chosen not to add the first main thread as new thread. Instead, >we carry on with main process thread itself adding private data to it. >Kindly see the first if condition. I observed this with the linux folks >where in their output as you mentioned do not add any new threads the >first time on recognition of multi thread debugee for the main process. OK, but this is still weird: >* 1 process 26149278 0xd0595fb0 in _p_nsleep () > 2 Thread 258 (tid 24445361, running) thread_function (arg=0x0) > 3 Thread 515 (tid 16187681, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.) Why does the first thread look so different? That's not the case with Linux threads. I believe even if you re-use the thread structure, you'll still need to switch the ptid to one that indicates a thread instead of a non-threaded process. >A couple of things I want to inform you is that the way the second >for loop is executing is not correct from here on to sync both the >buffer lists [pthread and GDB thread]. Since we are now not adding >multiple threads for the same process main thread one representing >the GDB thread and the other by the pthread those conditions and >indices like pi and gi will fail. Now there has not pcount - 1 >threads in the GDB thread buffer always. Condition 2 and 3 in the >patch take care of them for addition and deletion of threads. The new logic doesn't look correct to me - note that it never even looks at thread IDs any more, just the raw number of threads. So for example if *any* thread exits, the code will always delete the *last* thread from the GDB list - whether this is actually the one that exited or not. I do think it is necessary to compare thread IDs - you need to map the thread IDs retrieved by libpthdebug against the thread IDs already present in GDB's thread list. If a matching thread ID is present in both lists, it should not be touched. If a thread ID occurs only in the libpthdebug list, it needs to be added to GDB's list. If a thread ID occurs only in GDB's list, it needs to be removed from there. That's what the old code attempted to do as far as I can see; if it got it wrong in certain corner cases, they need to be fixed; but completely removing that logic seems just wrong. Bye, Ulrich