Hi Ulrich, Tom and community, Thank you for the feedback for the fix of this bug. Please find attached the patch. {See: 0001-Fix-multi-thread-debug-bug-in-AIX.patch}. So, I have fixed the bug and it works alright. Please find the test program, output with patch and without patch pasted below this email. >I'm not sure why the pd_active test is needed here >at all - if ptid.tid () != 0, thread debugging *must* >be active, otherwise you'd never have installed a ptid >with the tid field set, right? May be in the previous versions it was needed as they were adding main process as an extra thread. In this version of the aix-thread.c there is no place I have found where pd_active is disabled and we have a tid > 0. For us to have a tid we need pd_active enabled anyway which happens in pd_activate (). Otherwise, execution will not get into sync_threadlists () to set the tid. So, I have removed it from there as you suggested. >> static CORE_ADDR pd_brk_addr; >I believe this needs to go into the aix_thread_variables struct; This is done. Kindly check in the patch. >This only was a callback because it was called via >iterate_over_threads. Now that you're using the >all_threads C++ iterator, I think both of those This is done. Kindly check in the patch. >I think a simple but correct way to check whether the BFD filename >already contains a member name is to check for the presence of an >opening parenthesis e.g. via: > strrchr (bfd_get_filename (...), '('); This is also done. >I think the question here is simply whether, if you run the >test suite both without and with your patch, are any of the >FAILs fixed with the patch? If not, it would be good to >create a new test that fails without the patch and succeeds >with it, and add that to the test suite. So, this is something new to me. We will add it as a continuation in the same thread after this patch. I will need one information. Which test suite will we add it in? gdb.threads or gdb.base? Also, kindly suggest a simple test case that is written that I can see and learn. Any simple hello_world program will do. I want to understand how that exp file is written and how it compares to tell if a test case is pass or fail. Kindly give me feedback for this patch, incase we can do anything better or is incorrect. If not, kindly push this patch so that AIX folks can have a better debugging experience. Have a nice day ahead. Thanks and regards, Aditya. ------------------------------------------- Code:- #include #include #include #include #include pthread_barrier_t barrier; #define NUM_THREADS 2 void * thread_function (void *arg) { /* This ensures that the breakpoint is only hit after both threads are created, so the test can always switch to the non-event thread when the breakpoint triggers. */ pthread_barrier_wait (&barrier); pid_t child; child = fork (); if (child > 0) printf ("I am parent \n"); else { child = fork (); if (child > 0) printf ("I am child \n"); else printf ("I am grandchild \n"); } while (1); /* break here */ } int main (void) { int i; pthread_t thread[NUM_THREADS]; alarm (300); pthread_barrier_init (&barrier, NULL, NUM_THREADS); for (i = 0; i < NUM_THREADS; i++) { int res; res = pthread_create (&thread[i], NULL, thread_function, NULL); assert (res == 0); } while (1) { sleep (15); } return 0; } ------------------------------------------------- Output with patch:- Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork... (gdb) set detach-on-fork off (gdb) r Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork [New Thread 258] [New Thread 515] [New inferior 2 (process 18546974)] [New inferior 3 (process 9634234)] I am parent I am parent ^C Thread 1.1 received signal SIGINT, Interrupt. [Switching to Thread 1] 0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o) (gdb) info sharedlibrary From To Syms Read Shared Object Library 0xd05bc124 0xd05bf194 Yes (*) /usr/lib/libpthreads.a(shr_comm.o) 0xd05bb240 0xd05bb9a1 Yes (*) /usr/lib/libcrypt.a(shr.o) 0xd0576180 0xd05ba731 Yes (*) /usr/lib/libpthread.a(shr_xpg5.o) 0xd0100e00 0xd0575123 Yes (*) /usr/lib/libc.a(shr.o) (*): Shared library is missing debugging information. (gdb) info threads Id Target Id Frame * 1.1 Thread 1 (tid 27263453, running) 0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o) 1.2 Thread 258 (tid 29819289, running) thread_function (arg=0x0) at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32 1.3 Thread 515 (tid 25297199, running) thread_function (arg=0x0) 0x0) at /home/aditya/gdb_tests/ultimate-multi-thread-fork.c:32 2.1 process 18546974 0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o) 3.1 process 9634234 0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o) (gdb) inferior 2 [Switching to inferior 2 [process 18546974] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)] [Switching to thread 2.1 (process 18546974)] #0 0xd0594fc8 in _sigsetmask () from /usr/lib/libpthread.a(shr_xpg5.o) (gdb) info sharedlibrary From To Syms Read Shared Object Library 0xd05bc124 0xd05bf194 Yes (*) /usr/lib/libpthreads.a(shr_comm.o) 0xd05bb240 0xd05bb9a1 Yes (*) /usr/lib/libcrypt.a(shr.o) 0xd0576180 0xd05ba731 Yes (*) /usr/lib/libpthread.a(shr_xpg5.o) 0xd0100e00 0xd0575123 Yes (*) /usr/lib/libc.a(shr.o) (*): Shared library is missing debugging information. ----------------------------------------------------------------------- Output without patch:- Reading symbols from /home/aditya/gdb_tests/ultimate-multi-thread-fork... (gdb) set detach-on-fork off (gdb) r Starting program: /home/aditya/gdb_tests/ultimate-multi-thread-fork [New Thread 1] [New Thread 258] [New Thread 515] [New inferior 2 (process 11731200)] I am parent [New inferior 3 (process 16843200)] I am parent ^C Thread 1.1 received signal SIGINT, Interrupt. 0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o) (gdb) inferior 2 [Switching to inferior 2 [process 11731200] (/home/aditya/gdb_tests/ultimate-multi-thread-fork)] [Switching to thread 2.1 (process 11731200)] #0 0xd0594fc8 in ?? () (gdb) info threads Id Target Id Frame 1.1 process 15270316 0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o) 1.2 process 15270316 0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o) 1.3 process 15270316 0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o) 1.4 process 15270316 0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o) * 2.1 process 11731200 0xd0594fc8 in ?? () 3.1 process 16843200 0xd0594fc8 in ?? () (gdb) info sharedlibrary warning: "/usr/lib/libpthreads.a": member "shr_comm.o" missing. warning: "/usr/lib/libcrypt.a": member "shr.o" missing. warning: "/usr/lib/libpthread.a": member "shr_xpg5.o" missing. warning: "/usr/lib/libc.a": member "shr.o" missing. warning: Could not load shared library symbols for 4 libraries, e.g. /usr/lib/libpthreads.a(shr_comm.o). Use the "info sharedlibrary" command to see the complete listing. Do you need "set solib-search-path" or "set sysroot"? From To Syms Read Shared Object Library No /usr/lib/libpthreads.a(shr_comm.o) No /usr/lib/libcrypt.a(shr.o) No /usr/lib/libpthread.a(shr_xpg5.o) No /usr/lib/libc.a(shr.o) (gdb) ________________________________ From: Ulrich Weigand Sent: 02 February 2023 23:13 To: simark@simark.ca ; Aditya Kamath1 ; gdb-patches@sourceware.org Cc: Sangamesh Mallayya Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch Aditya Kamath1 wrote: >>Is there an existing gdb test case that exercises this code? >>If not then it seems like a new test is warranted. > >This I am not aware of at least when I tried finding. I think the question here is simply whether, if you run the test suite both without and with your patch, are any of the FAILs fixed with the patch? If not, it would be good to create a new test that fails without the patch and succeeds with it, and add that to the test suite. I think we're getting pretty close now, but I do still have some additional comments on the latest patch: > /* Return whether to treat PID as a debuggable thread id. */ > >-#define PD_TID(ptid) (pd_active && ptid.tid () != 0) >+#define PD_TID(ptid, data) (data->pd_active && ptid.tid () != 0) I'm not sure why the pd_active test is needed here at all - if ptid.tid () != 0, thread debugging *must* be active, otherwise you'd never have installed a ptid with the tid field set, right? If that check can indeed be omitted, that would simplify the patch a bit since you wouldn't need to provide the "data" struct in quite as many places. >/* Address of the function that libpthread will call when libpthdebug > is ready to be initialized. */ > > static CORE_ADDR pd_brk_addr; I believe this needs to go into the aix_thread_variables struct; the address might be different if the pthread library is loaded at different addresses into different inferiors. > /* Whether the current architecture is 64-bit. > Only valid when pd_able is true. */ > >static int arch64; Likewise - some inferiors may be 64-bit and others 32-bit. In general, *any* static variable in this file is suspect. >+/* Callback for counting GDB threads for process pid. */ > > static int >-giter_count (struct thread_info *thread, void *countp) >+giter_count (pid_t pid) This only was a callback because it was called via iterate_over_threads. Now that you're using the all_threads C++ iterator, I think both of those routines should just be inlined into its caller. >@@ -565,6 +565,7 @@ solib_aix_bfd_open (const char *pathname) > const char *sep; > int filename_len; > int found_file; >+ std::string string_path = pathname; > > if (pathname[path_len - 1] != ')') > return solib_bfd_open (pathname); >@@ -618,6 +619,15 @@ solib_aix_bfd_open (const char *pathname) > if (member_name == bfd_get_filename (object_bfd.get ())) > break; > >+ std::string s = bfd_get_filename (object_bfd.get ()); >+ >+ /* For every inferior after first int bfd system we >+ will have the pathname instead of the member name >+ registered. Hence the below condition exists. */ >+ >+ if (string_path.compare (s) == 0) >+ return object_bfd; That's still not quite right, as the pathname component might have been changed here: /* Calling solib_find makes certain that sysroot path is set properly if program has a dependency on .a archive and sysroot is set via set sysroot command. */ gdb::unique_xmalloc_ptr found_pathname = solib_find (filename.c_str (), &found_file); I think a simple but correct way to check whether the BFD filename already contains a member name is to check for the presence of an opening parenthesis e.g. via: strrchr (bfd_get_filename (...), '('); Bye, Ulrich