public inbox for gdb-patches@sourceware.org
* [RFC] Using all three fields of ptid and handling thread events from rs6000-aix-nat.c
@ 2024-04-15 14:37 Aditya Kamath1
  2024-04-15 17:36 ` John Baldwin
  2024-04-16  8:52 ` Ulrich Weigand
  0 siblings, 2 replies; 6+ messages in thread
From: Aditya Kamath1 @ 2024-04-15 14:37 UTC (permalink / raw)
  To: Aditya Kamath1 via Gdb-patches, Ulrich Weigand; +Cc: Sangamesh Mallayya


Respected community members,

Hi,

I am currently working on fixing a bug in how the thread exit event is handled on AIX and displayed in the UI. Currently, on AIX we miss the event, which results in an incorrect "info threads" display.

As an example, take program 1 pasted below this email. It is a single process that creates three more threads.

The GDB output on AIX is:

(gdb) b main
Breakpoint 1 at 0x10000788: file //gdb_tests/continue-pending-status_exit_test.c, line 44.
(gdb) r
Starting program: /gdb_tests/continue-pending-status_exit_test
Breakpoint 1, main () at //gdb_tests/continue-pending-status_exit_test.c:44
44        alarm (300);
(gdb) c
Hello World
Hello World
Hello World
[New Thread 258]
[New Thread 515]
[New Thread 772]
Thread 1 received signal SIGINT, Interrupt.
0xd0611d70 in _p_nsleep () from /usr/lib/libpthreads.a(_shr_xpg5.o)
(gdb) info threads
  Id   Target Id                                          Frame
* 1    Thread 1 (tid 26607979) (tid 26607979, running)    0xd0611d70 in _p_nsleep () from /usr/lib/libpthreads.a(_shr_xpg5.o)
  2    Thread 258 (tid 30998799) (tid 30998799, finished) aix-thread: ptrace (52, 30998799) returned -1 (errno = 3 The process does not exist.)

The Linux output for the same program is:

(gdb) b main
Breakpoint 1 at 0x10000990: file test_thread.c, line 27.
(gdb) r
Starting program: /home/buildusr/binutils-gdb/gdb/test_thread
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1,
main () at test_thread.c:27
27        alarm (300);
(gdb) c
Continuing.
[New Thread 0x7ffff7c7f170 (LWP 4032543)]
[New Thread 0x7ffff746f170 (LWP 4032544)]
[New Thread 0x7ffff6c5f170 (LWP 4032545)]
Hello World
Hello World
Hello World
[Thread 0x7ffff6c5f170 (LWP 4032545) exited]
[Thread 0x7ffff746f170 (LWP 4032544) exited]
[Thread 0x7ffff7c7f170 (LWP 4032543) exited]
^C
Thread 1 "test_thread" received signal SIGINT, Interrupt.
0x00007ffff7da5b04 in nanosleep () from /lib64/libc.so.6
(gdb) info threads
  Id   Target Id                                         Frame
* 1    Thread 0x7ffff7ff3e00 (LWP 4032541)
0x00007ffff7da5b04 in nanosleep () from /lib64/libc.so.6
(gdb)


Why this happens:

In Linux, I see the following in linux-nat.c (line 2278 at <https://sourceware.org/git?p=binutils-gdb.git;a=blob;f=gdb/linux-nat.c;h=2602e1f240d0056b0e199a952d86bbaf96e6d2f3;hb=34d5ac9244ccfe566232469ec3bef1329f0bc42e#l2278>):

      /* Check if the thread has exited.  */
      if (WIFEXITED (status) || WIFSIGNALED (status))
        {

This is where the thread exit gets handled when wait () is called.

But on AIX, we miss the exit in sync_threadlists () even though the code to handle it is there. If we observe sync_threadlists (), pcount and gcount remain the same despite the threads exiting, so we never capture the exit.

The debugger checks why it had to wait (), goes to pd_activate (), then to pd_update (), and then to sync_threadlists (), where the thread creation events are captured.

One interesting thing here is that on Linux “Hello World” is printed after the thread creation event is captured, but on AIX it is printed before.

Are we missing something on AIX? One piece of information I want to give: we do go to wait () before “Hello World” is printed, but we do not call pd_activate () there, and even if we do [by deliberately skipping the if condition in wait () in aix-thread.c], the thread debug session is not set up successfully. Only in the next wait () event do we capture the new threads, i.e. the add_thread () event.

There is one more wait () event after this, but here as well AIX fails to capture the thread exit event in sync_threadlists (), since pcount and gcount are the same.

So we could neither capture the thread exit nor find a way to correct the thread list later.

Kindly let me know, from your experience, why you think this happens. This will help us fix the issue on AIX. Are all these thread-related issues due to GDB core expecting certain callbacks in rs6000-aix-nat.c that currently do not exist?

So, looking at the Linux UI and this bug, I also want to do two things in the GDB code:

1: Move the kernel thread handling to rs6000-aix-nat.c by using all three fields of ptid, i.e. pid, lwp and tid, so that we are in sync with Linux both in event handling and in UI display. [As Ulrich mentions here<https://sourceware.org/pipermail/gdb-patches/2023-February/197365.html>]
2: Contribute this test case as well, to make sure these bugs are caught. Is there anything similar already? With continue-pending-status.exp I missed this issue because of the while (1) in the thread.
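
To illustrate point 1, here is a minimal sketch of an identifier that carries all three fields. It is a toy C analogue, not GDB's ptid_t: the struct name, the field types and the helper names are assumptions made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy three-field thread identifier (hypothetical, for illustration).  */
struct toy_ptid
{
  int pid;            /* process id */
  long lwp;           /* kernel thread id */
  unsigned long tid;  /* user-level (thread library) id */
};

/* Order two identifiers by pid, then lwp, then tid.
   Returns a negative, zero or positive value like strcmp.  */
static int
toy_ptid_cmp (const struct toy_ptid *a, const struct toy_ptid *b)
{
  assert (a != NULL && b != NULL);
  if (a->pid != b->pid)
    return a->pid < b->pid ? -1 : 1;
  if (a->lwp != b->lwp)
    return a->lwp < b->lwp ? -1 : 1;
  if (a->tid != b->tid)
    return a->tid < b->tid ? -1 : 1;
  return 0;
}

/* Two identifiers name the same thread only if all fields match.  */
static bool
toy_ptid_equal (const struct toy_ptid *a, const struct toy_ptid *b)
{
  return toy_ptid_cmp (a, b) == 0;
}
```

With this layout, the AIX thread shown above as "Thread 258 (tid 30998799)" would carry 30998799 in lwp and 258 in tid, under whatever pid the process has.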

Kindly let me know if the community is okay with the above, or if there is anything we have to keep in mind.

Have a nice day ahead.

Thanks and regards,
Aditya.





========================================================================
Program 1:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 3

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */
  pthread_barrier_wait (&barrier);

  printf ("Hello World \n"); /* break here */
  return NULL;
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    sleep (1);

  return 0;
}


* Re: [RFC] Using all three fields of ptid and handling thread events from rs6000-aix-nat.c
  2024-04-15 14:37 [RFC] Using all three fields of ptid and handling thread events from rs6000-aix-nat.c Aditya Kamath1
@ 2024-04-15 17:36 ` John Baldwin
  2024-04-17 10:36   ` Aditya Kamath1
  2024-04-16  8:52 ` Ulrich Weigand
  1 sibling, 1 reply; 6+ messages in thread
From: John Baldwin @ 2024-04-15 17:36 UTC (permalink / raw)
  To: Aditya Kamath1, Aditya Kamath1 via Gdb-patches, Ulrich Weigand
  Cc: Sangamesh Mallayya

On 4/15/24 7:37 AM, Aditya Kamath1 wrote:
> Respected community members,
> 
> Hi,
> 
> I am currently working on fixing a bug on handling the thread exit event in AIX and displaying the same in UI. Currently, in AIX we miss the event resulting in incorrect display of info threads.
> 
> For example, for the program 1 pasted below this email. This is a single process creating three more threads.
> 
> The GDB output in AIX is:-
> 
> (gdb) b main
> Breakpoint 1 at 0x10000788: file //gdb_tests/continue-pending-status_exit_test.c, line 44.
> (gdb) r
> Starting program: /gdb_tests/continue-pending-status_exit_test
> Breakpoint 1, main () at //gdb_tests/continue-pending-status_exit_test.c:44
> 44        alarm (300);
> (gdb) c
> Hello World
> Hello World
> Hello World
> [New Thread 258]
> [New Thread 515]
> [New Thread 772]
> Thread 1 received signal SIGINT, Interrupt.
> 0xd0611d70 in _p_nsleep () from /usr/lib/libpthreads.a(_shr_xpg5.o)
> (gdb) info threads
>    Id   Target Id                                          Frame
> * 1    Thread 1 (tid 26607979) (tid 26607979, running)    0xd0611d70 in _p_nsleep () from /usr/lib/libpthreads.a(_shr_xpg5.o)
>    2    Thread 258 (tid 30998799) (tid 30998799, finished) aix-thread: ptrace (52, 30998799) returned -1 (errno = 3 The process does not exist.)
> 
> The Linux output for the same is
> 
> (gdb) b main
> Breakpoint 1 at 0x10000990: file test_thread.c, line 27.
> (gdb) r
> Starting program: /home/buildusr/binutils-gdb/gdb/test_thread
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> 
> Breakpoint 1,
> main () at test_thread.c:27
> 27        alarm (300);
> (gdb) c
> Continuing.
> [New Thread 0x7ffff7c7f170 (LWP 4032543)]
> [New Thread 0x7ffff746f170 (LWP 4032544)]
> [New Thread 0x7ffff6c5f170 (LWP 4032545)]
> Hello World
> Hello World
> Hello World
> [Thread 0x7ffff6c5f170 (LWP 4032545) exited]
> [Thread 0x7ffff746f170 (LWP 4032544) exited]
> [Thread 0x7ffff7c7f170 (LWP 4032543) exited]
> ^C
> Thread 1 "test_thread" received signal SIGINT, Interrupt.
> 0x00007ffff7da5b04 in nanosleep () from /lib64/libc.so.6
> (gdb) info threads
>    Id   Target Id                                         Frame
> * 1    Thread 0x7ffff7ff3e00 (LWP 4032541)
> 0x00007ffff7da5b04 in nanosleep () from /lib64/libc.so.6
> (gdb)
> 
> 
> Reason why this happened.
> 
> In Linux, I see in the linux-nat.c /* Check if the thread has exited.  */
> 2278<https://sourceware.org/git?p=binutils-gdb.git;a=blob;f=gdb/linux-nat.c;h=2602e1f240d0056b0e199a952d86bbaf96e6d2f3;hb=34d5ac9244ccfe566232469ec3bef1329f0bc42e#l2278>       if (WIFEXITED (status) || WIFSIGNALED (status))
> 2279<https://sourceware.org/git?p=binutils-gdb.git;a=blob;f=gdb/linux-nat.c;h=2602e1f240d0056b0e199a952d86bbaf96e6d2f3;hb=34d5ac9244ccfe566232469ec3bef1329f0bc42e#l2279>         {
> 
> This is where it gets handled when wait () is called.
> 
> But in AIX, in sync_threadlist () despite having the code we miss the bus. If we observe the sync_threadlists () code our pcount and gcount remain the same despite the threads exiting resulting in, we not capturing the exit.

Even though pcount == gcount, in theory your ptids should be different in pbuf[]
vs gbuf[].

That said, when I added thread support to FreeBSD (albeit just using LWPs and not
bothering with libthread_db), there in my 'update_thread_list' target I first
call 'prune_threads' which uses the 'target::thread_alive' hook to remove any
threads that are no longer active, and then call a function that does the
equivalent of fetching pbuf[] and adding any threads whose ptid is not already
present (see fbsd_nat.c::fbsd_add_threads).
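
A minimal sketch of this prune-then-add update, written as plain C over a toy array of thread ids rather than GDB's real thread list. All names here (toy_thread_list, toy_prune, toy_add_new) are hypothetical, and "alive" is assumed to mean "still reported by the OS"; a real target would answer that question through its thread_alive hook.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_THREADS 16

/* Toy stand-in for the debugger's thread list: an array of ids.  */
struct toy_thread_list
{
  long ids[TOY_MAX_THREADS];
  size_t count;
};

/* Return true if ID occurs in IDS[0..COUNT-1].  */
static bool
toy_contains (const long *ids, size_t count, long id)
{
  for (size_t i = 0; i < count; i++)
    if (ids[i] == id)
      return true;
  return false;
}

/* Step 1: drop every known thread the OS no longer reports.
   Returns the number of threads removed.  */
static size_t
toy_prune (struct toy_thread_list *known, const long *live, size_t nlive)
{
  assert (known != NULL);
  size_t kept = 0;
  for (size_t i = 0; i < known->count; i++)
    if (toy_contains (live, nlive, known->ids[i]))
      known->ids[kept++] = known->ids[i];
  size_t removed = known->count - kept;
  known->count = kept;
  return removed;
}

/* Step 2: add every reported id that is not known yet.
   Returns the number of threads added.  */
static size_t
toy_add_new (struct toy_thread_list *known, const long *live, size_t nlive)
{
  assert (known != NULL);
  size_t added = 0;
  for (size_t i = 0; i < nlive; i++)
    if (known->count < TOY_MAX_THREADS
        && !toy_contains (known->ids, known->count, live[i]))
      {
        known->ids[known->count++] = live[i];
        added++;
      }
  return added;
}
```

Calling toy_prune first and toy_add_new second keeps the two concerns separate: neither step needs the lists to be sorted, at the cost of a quadratic scan that is harmless for small thread counts.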

The approach of using qsort with two sets of lists and comparing the lists should
be correct from what I can tell, but it is also more complex and might be harder
to debug?

> The debugger checks why it had to wait (), goes to pd_activate (), then to pd_update (), then to sync_threadlists () where the event of threads born are captured.
> 
> One interesting thing here is that in Linux the print “Hello World” happens after the thread born event is captured but in AIX it happens before.

Linux raises a ptrace() stop when a new thread is created.  When using libthread_db
I believe you should be placing a breakpoint in your thread library so that you can
capture new threads in userspace before they start executing (e.g. in the internal
routine in your thread library that eventually calls the thread's start routine).

For FreeBSD with LWP-backed threads I ended up just adding a new ptrace event in the
kernel when a new thread starts executing and use that to add threads instead.

In many cases this race wouldn't matter much for users though.

> Are we missing something in AIX?? One information I want to give is we do go to wait () before the print Hello world happens, but we do not call pd_activate () and even if we do [We call it on purpose skipping the if condition in wait () in aix-thread.c], the thread debug session is not successful. In the next wait () event is when we capture the new thread or add_thread () event.
> 
> There is one more wait () event that happens after this, but here as well AIX fails to capture the thread_exit event in sync_threadlists () since the pcount and gcount are the same.

I'm still not sure how you are missing the thread exit event.  Even though
pcount == gcount it looks like sync_threadlists still walks the two lists using
ptid_cmp so should still DTRT?
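
To make the point concrete, here is a rough sketch of such a sorted two-list walk. It is not the aix-thread.c code: toy_sync, toy_id_cmp and the use of plain long ids are assumptions for illustration. It shows that equal counts do not hide a difference, as long as the ids themselves differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator for toy thread ids.  */
static int
toy_id_cmp (const void *a, const void *b)
{
  long x = *(const long *) a;
  long y = *(const long *) b;
  return (x > y) - (x < y);
}

/* Compare the ids reported by the thread library (PBUF) with the ids
   the debugger already knows (GBUF).  Both arrays are sorted in place.
   The number of ids only in PBUF is stored in *ADDED, the number of
   ids only in GBUF in *DELETED.  */
static void
toy_sync (long *pbuf, size_t pcount, long *gbuf, size_t gcount,
          size_t *added, size_t *deleted)
{
  assert (added != NULL && deleted != NULL);
  qsort (pbuf, pcount, sizeof *pbuf, toy_id_cmp);
  qsort (gbuf, gcount, sizeof *gbuf, toy_id_cmp);

  size_t pi = 0, gi = 0;
  *added = 0;
  *deleted = 0;
  while (pi < pcount || gi < gcount)
    {
      if (pi == pcount)
        {
          (*deleted)++;         /* Known thread no longer reported.  */
          gi++;
        }
      else if (gi == gcount)
        {
          (*added)++;           /* Reported thread not yet known.  */
          pi++;
        }
      else
        {
          int cmp = toy_id_cmp (&pbuf[pi], &gbuf[gi]);
          if (cmp == 0)
            {
              pi++;             /* Same thread on both sides.  */
              gi++;
            }
          else if (cmp < 0)
            {
              (*added)++;
              pi++;
            }
          else
            {
              (*deleted)++;
              gi++;
            }
        }
    }
}
```

With pbuf = {1, 258, 1029} and gbuf = {1, 258, 772} the counts match, yet the walk reports one added and one deleted thread. The walk only reports nothing when both lists hold exactly the same ids.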

-- 
John Baldwin



* Re: [RFC] Using all three fields of ptid and handling thread events from rs6000-aix-nat.c
  2024-04-15 14:37 [RFC] Using all three fields of ptid and handling thread events from rs6000-aix-nat.c Aditya Kamath1
  2024-04-15 17:36 ` John Baldwin
@ 2024-04-16  8:52 ` Ulrich Weigand
  1 sibling, 0 replies; 6+ messages in thread
From: Ulrich Weigand @ 2024-04-16  8:52 UTC (permalink / raw)
  To: gdb-patches, Aditya Kamath1; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>But in AIX, in sync_threadlist () despite having the code we miss the bus.
>If we observe the sync_threadlists () code our pcount and gcount remain the
>same despite the threads exiting resulting in, we not capturing the exit.

As John already mentioned, I think the main question is why sync_threadlists
doesn't appear to correctly update the thread list.  What is the list of threads
it receives from the OS here?  What is supposed to happen according to the
AIX interface specification?

>1: Move the kernel thread handling part to rs6000-aix-nat.c by using all the
>three fields in ptid i.e. pid, lwp and tid, so we can be in sync with Linux
>both in terms of event handling and UI display. [As Ulrich mentions here]

While I still agree this would be useful to do, I don't see how it would
address the problem you're currently seeing.

Bye,
Ulrich



* RE: [RFC] Using all three fields of ptid and handling thread events from rs6000-aix-nat.c
  2024-04-15 17:36 ` John Baldwin
@ 2024-04-17 10:36   ` Aditya Kamath1
  2024-04-17 18:48     ` Ulrich Weigand
  2024-04-20 20:25     ` John Baldwin
  0 siblings, 2 replies; 6+ messages in thread
From: Aditya Kamath1 @ 2024-04-17 10:36 UTC (permalink / raw)
  To: John Baldwin, Aditya Kamath1 via Gdb-patches, Ulrich Weigand
  Cc: Sangamesh Mallayya


Respected John, Ulrich and community members,

Hi,

Thank you for your feedback. The pointer to 'prune_threads' was useful, as I was trying to get to the root cause and to find a place to call sync_threadlists () to capture the thread exit event.

>Even though pcount == gcount, in theory your ptid's should be different in pbuf[]
>vs gbuf[].

>That said, when I added thread support to FreeBSD (albeit just using LWPs and not
>bothering with libthread_db), there in my 'update_thread_list' target I first
>call 'prune_threads' which uses the 'target::thread_alive' hook to remove any
>threads that are no longer active, and then call a function that does the
>equivalent of fetching pbuf[] and adding any threads whose ptid is not already
>present (see fbsd_nat.c::fbsd_add_threads).

>The approach of using qsort with two sets of lists and comparing the lists should
>be correct from what I can tell, but it is also more complex and might be harder
>to debug?

>I'm still not sure how we you are missing the thread exit event.  Even though
>pcount == gcount it looks like sync_threadlists still walks the two lists using
>ptid_cmp so should still DTRT?

Currently on AIX, we do not call prune_threads (). Even if we did call it, it would internally call our AIX thread_alive (), whose ‘return in_thread_list (proc_target, ptid);’ always returns true for a thread that is in the GDB thread list, even if that thread has since reached the finished state on AIX. (Also, we never made an attempt to delete such threads.)

On AIX we have the idle, run, sleep, ready and terminated thread states. In sync_threadlists (), once a thread is created, we use the loop for (cmd = PTHDB_LIST_FIRST;; cmd = PTHDB_LIST_NEXT) to iterate through the threads. pthdb_pthread () gives us the pdtid for cmd, and pthdb_pthread_ptid () gives us the pthtid. Then we update pbuf.

Here is the catch.

Even if our thread has reached the terminated state, the above information does not change. It remains the same. This makes the gbuf and pbuf lists exactly the same, so we do not delete the threads.

In theory pbuf and gbuf should differ, but that does not happen.

For program 1 pasted below this email, I print three things:

the cmp_result for each thread, the state number of each thread, and pcount and gcount.
They are:
0
0
0
0
state = 2
state = 5
state = 5
state = 5
pcount = 4, gcount = 4

There are 3 threads apart from the main thread, and the creation event of each was captured correctly. This is the output of the last sync_threadlists () call before I pressed interrupt, while the main thread was in the while (1) loop of main. [These are printf statements added for debugging.]

A cmp_result of 0 for every thread means there is no difference between pbuf and gbuf. If we observe the states, the main thread is 2, which is PST_RUN, and the others are 5, which is PST_TERM.

I am planning to use
“status = pthdb_pthread_state (data->pd_session, pdtid, &state);”

And if the state is PST_TERM, then I want to leave that thread's entry out of the pbuf list and make sure pcount is reduced accordingly. This can fix the issue.
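
A rough sketch of that filtering step, as standalone C rather than the actual aix-thread.c change. The struct, the enum and the function name are hypothetical, and the numeric values 2 and 5 are chosen only to match the state numbers printed above; they are not taken from the AIX headers.

```c
#include <assert.h>
#include <stddef.h>

/* Toy thread states, numbered like the debug output above.  */
enum toy_state
{
  TOY_RUN = 2,
  TOY_TERM = 5
};

/* Toy pbuf entry: a thread id plus the state reported for it.  */
struct toy_pthread
{
  long id;
  enum toy_state state;
};

/* Compact BUF in place so that only non-terminated entries remain,
   keeping their relative order.  Returns the new entry count, which
   plays the role of the reduced pcount.  */
static size_t
toy_drop_terminated (struct toy_pthread *buf, size_t count)
{
  assert (buf != NULL || count == 0);
  size_t kept = 0;
  for (size_t i = 0; i < count; i++)
    if (buf[i].state != TOY_TERM)
      buf[kept++] = buf[i];
  return kept;
}
```

Applied to the four entries above (one in state 2, three in state 5), this leaves a single entry, so a later comparison against a four-entry gbuf would see three threads to delete.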

So these are my findings on why we missed the thread exit event and what can fix it. Kindly let me know what you think, and whether I missed something.

>>1: Move the kernel thread handling part to rs6000-aix-nat.c by using all the
>>three fields in ptid i.e. pid, lwp and tid, so we can be in sync with Linux
>>both in terms of event handling and UI display. [As Ulrich mentions here]

>While I still agree this would be useful to do, I don't see how it would
>address the problem you're currently seeing.

Yes, it will not solve the issue. Since I did not know how exactly other targets call delete_thread () on the thread exit event, I thought we might be facing this issue because we are not handling the kernel thread part.

Have a nice day ahead.

Thanks and regards,
Aditya.


===============================================
PROGRAM 1
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <pthread.h>
#include <assert.h>

pthread_barrier_t barrier;

#define NUM_THREADS 3

void *
thread_function (void *arg)
{
  /* This ensures that the breakpoint is only hit after both threads
     are created, so the test can always switch to the non-event
     thread when the breakpoint triggers.  */
  pthread_barrier_wait (&barrier);

  while (1); /* break here */
}

int
main (void)
{
  int i;

  alarm (300);

  pthread_barrier_init (&barrier, NULL, NUM_THREADS);

  for (i = 0; i < NUM_THREADS; i++)
    {
      pthread_t thread;
      int res;

      res = pthread_create (&thread, NULL,
                            thread_function, NULL);
      assert (res == 0);
    }

  while (1)
    sleep (1);

  return 0;
}



* Re: [RFC] Using all three fields of ptid and handling thread events from rs6000-aix-nat.c
  2024-04-17 10:36   ` Aditya Kamath1
@ 2024-04-17 18:48     ` Ulrich Weigand
  2024-04-20 20:25     ` John Baldwin
  1 sibling, 0 replies; 6+ messages in thread
From: Ulrich Weigand @ 2024-04-17 18:48 UTC (permalink / raw)
  To: gdb-patches, Aditya Kamath1, jhb; +Cc: Sangamesh Mallayya

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>I am planning to use 
>“status = pthdb_pthread_state (data->pd_session, pdtid, &state);”
>
>And if the state is PST_TERM then I want to remove the thread vector
>from the pbuf list and make sure pcount is also reduced.
>This can fix the issue. 

This makes sense to me.

Bye,
Ulrich


* Re: [RFC] Using all three fields of ptid and handling thread events from rs6000-aix-nat.c
  2024-04-17 10:36   ` Aditya Kamath1
  2024-04-17 18:48     ` Ulrich Weigand
@ 2024-04-20 20:25     ` John Baldwin
  1 sibling, 0 replies; 6+ messages in thread
From: John Baldwin @ 2024-04-20 20:25 UTC (permalink / raw)
  To: Aditya Kamath1, Aditya Kamath1 via Gdb-patches, Ulrich Weigand
  Cc: Sangamesh Mallayya

On 4/17/24 3:36 AM, Aditya Kamath1 wrote:
> I am planning to use
> “status = pthdb_pthread_state (data->pd_session, pdtid, &state);”
> 
> And if the state is PST_TERM then I want to remove the thread vector from the pbuf list and make sure pcount is also reduced. This can fix the issue.

This sounds good to me.  I would also recommend fixing your ::thread_is_alive
method to treat a thread whose state is PST_TERM as dead (if you have a
thread_is_alive method, if you don't have such a method, I would recommend
adding one).
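
As a minimal sketch of that suggestion, assuming a toy table of known threads rather than GDB's thread list or the AIX pthdb session: the names below are hypothetical, and the state values 2 and 5 simply mirror the numbers printed earlier in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy thread states (values mirror the debug output in this thread).  */
enum toy_thread_state
{
  TOY_ST_RUN = 2,
  TOY_ST_TERM = 5
};

/* Toy record of a thread the debugger knows about.  */
struct toy_entry
{
  long id;
  enum toy_thread_state state;
};

/* A thread counts as alive only if it is in LIST and has not reached
   the terminated state.  Being in the list is not enough.  */
static bool
toy_thread_alive (const struct toy_entry *list, size_t count, long id)
{
  assert (list != NULL || count == 0);
  for (size_t i = 0; i < count; i++)
    if (list[i].id == id)
      return list[i].state != TOY_ST_TERM;
  return false;
}
```

The difference from a pure membership check is the state test: a terminated thread that is still listed is reported as dead, which is what lets a prune pass remove it.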

-- 
John Baldwin


