Hi Ulrich, I'm confused why this is the correct place. From what I can see, in this scenario, we have: - libpthdebug reports some threads using a thread ID, i.e. pbuf has ptid_t (pid, 0, pthid1) .. ptid_t (pid, 0, pthidN) with pcount >= 1. - GDB only has one single thread in unthreaded mode, i.e. gbuf has ptid_t (pid, 0, 0) with gcount == 1. So when running the loop, during the first iteration, we should compare ptid_cmp (ptid_t (pid, 0, pthid1), ptid_t (pid, 0, 0)) which should be > 0 since pthid1 > 0. Right? This means we'll get into the branch that just does: delete_thread (gbuf[gi]); thereby deleting the original thread. Does this not happen for you? What is going on instead? [ Note that this is a simplified case with only a single process; in the multi-process scenario, this may be more complex. >What is going on instead? The one I highlighted in bold does not happen. In our first iteration pcount is 1 but gcount is 0. That is why when gi =0 == gcount becomes true and control enters into that block instead of going into the else block.. >[ Note that this is a simplified case with only a single process; in the multi-process scenario, this may be more complex. This I agree. I just checked now.. Hmm. So, something is going wrong here.. gcount = 0; iterate_over_threads (giter_count, &gcount); g = gbuf = XNEWVEC (struct thread_info *, gcount); iterate_over_threads (giter_accum, &g); qsort (gbuf, gcount, sizeof *gbuf, gcmp); Let's me check these lines. Hope I answered why I was doing it.. The rest of the changes you mentioned I will do it.. ________________________________ From: Ulrich Weigand Sent: 23 November 2022 19:45 To: simark@simark.ca ; Aditya Kamath1 ; gdb-patches@sourceware.org Cc: Sangamesh Mallayya Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch Aditya Kamath1 wrote: >@@ -514,8 +514,16 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf, > during first initialisation. In the rest of the callbacks the > current thread needs to be correct. */ > if (user_current_pid != 0) >- switch_to_thread (current_inferior ()->process_target (), >- ptid_t (user_current_pid)); >+ { >+ inferior *inf = find_inferior_ptid (current_inferior ()-> process_target (), >+ ptid_t (user_current_pid)); This would be simpler using find_inferior_pid. >+ for (thread_info *tp: inf->threads ()) >+ if (tp != NULL) This would be simpler using any_thread_of_inferior. >+ { >+ switch_to_thread (tp); >+ break; >+ } >+ } > status = target_read_memory (addr, (gdb_byte *) buf, len); However, switching to just any random thread of the process seems odd. Looking at sol-thread.c, they don't switch to a thread at all in the equivalent place, but rather do this: scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid); if (inferior_ptid.tid_p () || !target_thread_alive (inferior_ptid)) { /* It's either a thread or an LWP that isn't alive. Any live LWP will do so use the first available. NOTE: We don't need to call switch_to_thread; we're just reading memory. */ inferior_ptid = procfs_first_available (); } Since your xfer_partial routine only ever looks at the PID component of the ptid, I'm wondering if we couldn't similarly just switch inferior_ptid, without actually switching theads. Something along the lines of scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid); if (user_current_pid != 0) inferior_ptid = ptid_t (user_current_pid); Does this work for you? >- if (thrinf.ti_cursig == SIGTRAP) >+ /* In a multi threaded application user can interrupt the main >+ thread as well. This function should return the tid in this >+ case apart from threads that can trap or be interrupted. */ Whitespace problem. >+ >+ if (thrinf.ti_cursig == SIGTRAP || thrinf.ti_cursig == SIGINT) > return thrinf.ti_tid; This seems an unrelated change? If this is actually necessary, then all the comments (e.g. at the top of this function, or at the call site) likewise need to be updated - they only refer to trap signals currently. > process_stratum_target *proc_target > = current_inferior ()->process_target (); >- thread = add_thread_with_info (proc_target, >- ptid_t (pid, 0, pbuf[pi].pthid), >- priv); >+ >+ thread_info *tp = find_thread_ptid (proc_target, ptid_t (pid)); >+ >+ /* If the pthread library is used then we replace the main >+ with the thread having the main thread ID and process ID. >+ Otherwise this is a new thread and we need to add it. */ >+ if (tp != NULL && tp->priv == NULL) >+ { >+ thread_change_ptid (proc_target, tp->ptid, >+ ptid_t (pid, 0, pbuf[pi].pthid)); >+ tp->priv.reset (priv); >+ } >+ else >+ thread = add_thread_with_info (proc_target, >+ ptid_t (pid, 0, pbuf[pi].pthid), >+ priv); I'm confused why this is the correct place. From what I can see, in this scenario, we have: - libpthdebug reports some threads using a thread ID, i.e. pbuf has ptid_t (pid, 0, pthid1) .. ptid_t (pid, 0, pthidN) with pcount >= 1. - GDB only has one single thread in unthreaded mode, i.e. gbuf has ptid_t (pid, 0, 0) with gcount == 1. So when running the loop, during the first iteration, we should compare ptid_cmp (ptid_t (pid, 0, pthid1), ptid_t (pid, 0, 0)) which should be > 0 since pthid1 > 0. Right? This means we'll get into the branch that just does: delete_thread (gbuf[gi]); thereby deleting the original thread. Does this not happen for you? What is going on instead? [ Note that this is a simplified case with only a single process; in the multi-process scenario, this may be more complex. ] Bye, Ulrich