From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
To: "simark@simark.ca" <simark@simark.ca>,
	Aditya Kamath1 <Aditya.Kamath1@ibm.com>,
	"gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
Date: Mon, 5 Dec 2022 18:33:29 +0000
Message-ID: <8302c3570292b864ab21176e58bdee546f6e4544.camel@de.ibm.com>
In-Reply-To: <CH2PR15MB354486F155F5CC97501D056DD6179@CH2PR15MB3544.namprd15.prod.outlook.com>

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>>I'm not sure why it is necessary to handle this in the process layer
>>(rs6000-aix-nat.c) instead of the thread layer (aix-thread.c).
>>What specifically breaks if you do not have these rs6000-aix-nat.c
>>changes?
>
>So, if you observe output 3 or 4, the program first becomes
>multi-threaded, i.e. the thread events are handled first, and then the
>threads fork.  When this happens, I cannot return ptid_t (parent_pid).
>If I do so, the GDB core will treat it as a new process and add it to
>my thread list as, say, process 100, despite 'thread 1' already
>representing the same.  So I need to correctly report which thread did
>the fork () event, i.e. which thread of the process is the one that
>gave birth to a new inferior process [say 2 or 3 in output 3 below],
>in other words which thread caused the multi-process event while the
>process is multi-threaded.  This has to be handled here, as control
>from target.c comes directly to rs6000-aix-nat::wait and not through
>aix-thread.c::wait, since fork () is a process event.

So this last bit seems to be the problem.  Could you elaborate on
what the exact call stack is?  I thought once the thread layer is
initialized, calls to ::wait should always go through it ...
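
(If it's easier, you can get that call stack by running GDB on itself
and breaking on the two wait implementations; the symbol names below
are just my guess at what they are called in your tree, so please
adjust as needed:

   $ gdb ./gdb
   (top-gdb) break rs6000_nat_target::wait
   (top-gdb) break aix_thread_target::wait
   (top-gdb) run ./your-test-program
   ... reproduce the fork event in the inferior GDB ...
   (top-gdb) bt

The backtrace at the point where the "wrong" wait is entered should
tell us which layer is being bypassed, and by whom.)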

>>If you *do* need to handle LWPs (kernel thread IDs) in the process
>>layer (this can be a reasonable choice, and it is done by several other
>>native targets), then it should be *consistent*, and *all* LWP handling
>>should be in the process layer.  In particular, under no circumstances
>>does it make sense to duplicate the "find current/signalled thread"
>>code in *both* the process and thread layers.
>
>This is not straightforward to do.  The reason is that if our
>application is pthreaded, we need our sync_threadlists() code to detect
>the multiple threads and sync them.  We cannot handle this in
>rs6000-aix-nat.c with the current design of the code.  Say the child
>process is multi-threaded; things can get complex.  It would require us
>to move the whole GDB list and pthread list sync code to
>rs6000-aix-nat.c.  The essence, or unique selling point, of the
>aix-thread.c code would be lost.

So the way this works e.g. on Linux is that the process layer handles
both processes and the *kernel* aspect of threads, while the thread
layer handles the *user-space* (libc/libpthread) aspect of threads.

In terms of the GDB ptid_t, this means that both the "pid" and "lwp"
fields are "owned" by the process layer (which would be rs6000-aix-nat.c
in your case), while only the "tid" field is owned by the thread
layer (which would be aix-thread.c).

Linux does that because it allows correctly debugging programs that
only use the kernel threading capabilities without using libpthread,
e.g. by directly calling the "clone" system call and not "pthread_create".
Such threads won't be in the thread list managed by the user space
library, but are still handled by the process layer in GDB, tracked
as an LWP without an associated TID.
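
Just to illustrate what I mean, in terms of code this would look
something like the following (only a sketch, not tested on AIX; "ktid"
and "pdtid" are placeholders for whatever kernel-thread and
pthread-debug IDs you actually have available, and "proc_target" for
the process stratum target):

   /* Process layer (rs6000-aix-nat.c): report the kernel thread that
      caused the event using pid + lwp only; the tid field stays 0.  */
   return ptid_t (pid, ktid, 0);

   /* Thread layer (aix-thread.c): when syncing with libpthread, add
      only the tid on top of the ptid the process layer created.  */
   thread_change_ptid (proc_target,
                       ptid_t (pid, ktid, 0),
                       ptid_t (pid, ktid, pdtid));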

Not sure if something like that is even possible in AIX.  If it does
make sense to handle things similarly in AIX (one other reason would
be ptrace operations that require LWPs, e.g. the VSX register
access you had in another thread), some code would indeed need
to move, e.g. everything related to accessing *kernel* threads
(fetch_regs_kernel_thread etc.), while code that accesses *user*
threads via the libpthread accessors (fetch_regs_user_thread etc.)
would still remain in aix-thread.c.
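
Roughly, the thread layer would then dispatch on whether a ptid carries
a user-level tid at all, something like this (again only a sketch; I
haven't checked the exact signatures in your tree):

   void
   aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
   {
     /* No user-level tid in the ptid: a plain kernel thread, so let
        the process layer beneath (rs6000-aix-nat.c) handle it,
        including the ptrace calls that need the kernel thread id.  */
     if (regcache->ptid ().tid () == 0)
       beneath ()->fetch_registers (regcache, regno);
     else
       {
         /* A pthread known to libpthread: the existing user-thread
            path via the pthdb accessors (fetch_regs_user_thread)
            stays as it is.  */
       }
   }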


>>>[Switching to process 16777620]
>
>>This outputs inferior_ptid ...
>
>Yes, you were right
>
>>>* 1.1  process 16777620  0xd0595fb0 in _p_nsleep ()
>>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>>  1.2  process 16777620  0xd0595fb0 in _p_nsleep ()
>>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>>  1.3  process 16777620  0xd0595fb0 in _p_nsleep ()
>>>   from /usr/lib/libpthread.a(shr_xpg5.o)
>>>  2.1  process 8323570   0xd0594fc8 in ?? ()
>>>  3.1  process 17957172  0xd0594fc8 in ?? ()
>
>>... and this outputs the ptid values for those threads.
>
>>If it says "process ...", then those ptid values have not
>>properly been switched over to the (pid, lwp, tid) format.

>While debugging in depth over the last two days I realised our
>pid_to_str is needed in rs6000-aix-nat.c, as control comes here in
>search of it.  If it isn't there, GDB treats all threads as processes.

This is again very suspicious.  We obviously already have
threads, so the thread layer should be initialized.  This
means that any "pid_to_str" call should go through the
*thread* layer (implementation in aix-thread.c).  If that
doesn't happen, we should understand why.  (This may be the
same problem that causes "wait" to be called from the
wrong layer, as seen above.)
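
For reference, what I would expect the thread layer to provide is
something along these lines (a sketch from memory, not verified against
your tree):

   /* aix-thread.c: threads known to libpthread get a "Thread ..."
      description; everything else (plain processes or kernel threads)
      is deferred to the layer beneath.  */
   std::string
   aix_thread_target::pid_to_str (ptid_t ptid)
   {
     if (ptid.tid () == 0)
       return beneath ()->pid_to_str (ptid);

     return string_printf (_("Thread %s"), pulongest (ptid.tid ()));
   }

If "info threads" still prints "process ..." for entries that are
really threads, this function was never reached for them, which is
exactly the layering problem we need to understand.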


>>You should verify that the sync_threadlists code handles
>>all multi-process cases correctly.  I haven't looked at
>>this in detail, but are you sure that here:
>
>>>@@ -841,8 +829,22 @@ sync_threadlists (int pid)
>>>            }
>>>          else if (cmp_result > 0)
>>>            {
>>>-             delete_thread (gbuf[gi]);
>
>
>>you never accidentally switch the *pid* part (if "gptid"
>>belongs to a different pid than "pptid")?
>
>So, this is not the reason. I have added an assertion here just
>to be sure. I get what you are thinking.

Having an assertion is of course good, but it isn't obvious to
me that it can never be hit.


>>Hmm.  So when "wait" returns, it needs to determine which thread
>>triggered the event that caused ptrace to stop.  On Linux, "wait"
>>will actually return the LWP of that thread, so it can be directly
>>used.  It seems on AIX, "wait" only returns a PID, and you do not
>>immediately know which thread caused the event?
>
>>In that case, I can see why you'd have to consider SIGINT as well
>>as SIGTRAP. However, it seems to me that even those two are not the
>>*only* cases that can cause "wait" to return - doesn't *any* signal
>>(potentially) trigger a ptrace intercept (causing wait to return)?
>
>>But that's probably a more general problem, and wouldn't occur in
>>this simple test case.
>
>Exactly.  So I tried debugging a few examples that raise some of the
>other signals mentioned in this document
>[https://www.ibm.com/docs/en/sdk-java-technology/8?topic=reference-signal-handling].
>In AIX we have most of the signals mentioned in the link.  It does not
>block us from doing things, nor does it crash in case of a segmentation
>fault signal [from our debugger code].  Abort also works fine.  Let me
>know what you think.

The point is that if GDB stops because the target received a signal, it
should automatically switch to the particular thread where the signal
was in fact received.  I don't think this will actually happen in all
cases with the current code.

Shouldn't you instead check for *any* signal in get_signaled_thread?
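
I.e., something along these lines, if I read the getthrds() interface
correctly (field names from memory, please double-check against
<procinfo.h>):

   /* Return the kernel thread ID of a thread in process PID that
      currently has a signal to report, or 0 if there is none.
      Checking ti_cursig != 0 instead of ti_cursig == SIGTRAP means we
      also find the right thread for SIGINT, SIGSEGV, SIGABRT, ...  */
   static tid_t
   get_signaled_thread (int pid)
   {
     struct thrdsinfo64 thrinf;
     tid_t ktid = 0;

     while (getthrds (pid, &thrinf, sizeof (thrinf), &ktid, 1) == 1)
       {
         if (thrinf.ti_cursig != 0)
           return thrinf.ti_tid;
       }

     return 0;
   }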


Bye,
Ulrich

