public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
From: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
To: "simark@simark.ca" <simark@simark.ca>,
	Aditya Kamath1 <Aditya.Kamath1@ibm.com>,
	"gdb-patches@sourceware.org" <gdb-patches@sourceware.org>
Cc: Sangamesh Mallayya <sangamesh.swamy@in.ibm.com>
Subject: Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
Date: Mon, 13 Feb 2023 19:01:48 +0000	[thread overview]
Message-ID: <10cb526ef81fd8a63fd776e51bf9505c8dff1e33.camel@de.ibm.com> (raw)
In-Reply-To: <DM6PR15MB3545F7F15C72DF6970E4D0AED6DE9@DM6PR15MB3545.namprd15.prod.outlook.com>

Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:

>Also find attached a document that I have proposed as a commit message.

Thanks for the details.  However, this is too much detail for a
commit message - you don't need to include full debug sessions
documenting how you found a bug; instead, please concisely
summarize *what* the bug *is*, and how (at a high level) the
patch fixes the bug.

To be more specific, I'd write a commit message for this patch
somewhat along these lines:


Fix multi-threaded debugging under AIX

Multi-threaded debugging using the libpthdebug debug interface
is currently broken due to multiple issues.

When debugging a single inferior, we were getting assertion
failures in get_aix_thread_info as no tp->priv structure was
allocated for the main thread.  Fix this by switching the main
thread from a (pid, 0, 0) ptid_t to a (pid, 0, tid) ptid_t and
allocaing the tp->priv structure in sync_threadlists.

As a result, the switch_to_thread call in pdc_read_data could
now fail since the main thread no longer uses (pid, 0, 0).
Replace the call by only switching inferior_ptid, the current
inferior, and the current address space (like proc-service.c).
Add similar switching to pdc_write_data where it was missing
completely.

When debugging multiple inferiors, an additional set of
problems prevented correct multi-threaded debugging:

First of all, aix-thread.c used to have a number of global
variables holding per-inferior information.  Switch these
to a per-inferior data structure instead.

Also, sync_threadlists was getting confused as we were
comparing the list of threads returned by libpthdebug
for *one* process with GDB's list of threads for *all*
processes.  Only use the GDB threads of the current
inferior instead.

Finally, the presence of the thread library in any but
the first inferior was not correctly detected due to a
bug in solib-aix.c, where the BFD file name for shared
library members was changed when the library was loaded
for the first time, which caused the library to no longer
be recognized by name when loaded a second time,


(I'm not saying you need to use this exact message - maybe
I'm missing something or getting some details wrong.  But
this is the style you should be roughly going for.)

B.t.w. the fact that the message is now getting so long is
actually an indicator that it might be preferable to break
the patch out into multiple commits - the solib change is
one obvious stand-alone patch, and maybe it would make sense
to split off the aix-thread changes also needed for single-
inferior debugging from the multi-inferior support changes.
Given that we've already spent a long time on this patch,
I'm not insisting on this change - but something to keep
in mind for future patches.


Some additional comments on your latest changes:

>However, we also need to keep in mind that before we think this will
>work, our libpthread library is only ready when the following condition
>in the wait () of aix-thread.c is satisfied.
>
>/* Check whether libpthdebug might be ready to be initialized.  */
>  if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
>      && status->sig () == GDB_SIGNAL_TRAP)
>
>Until then changing the “process <pid>” to “thread <tid>” is incorrect.
>Even though the session is ready and initalised via pd_enable () and
>pd_activate () >functions respectively. Therfore this made us to keep
>a variable pthdebugready in all functions that lead to sync_threadlists ()
>so that we change the process thread to a thread with private data only
>when libpthdebug is initialised for a particular process.

I do not understand this change.  The ->pd_active flag is supposed to
track exactly that information, why do we need to duplicate it into
yet another flag?   Note that the point of the the "if" block above
is that *it is calling pd_activacte()*, which will set ->pd_active
if the library is in fact ready to be used.  If there's anything
wrong that causes pd_active to be set when the thread library is,
in fact, not yet active, that's a bug we need to find and fix.

Also, as long as the thread library is not ready, we should not be
calling sync_threadlists in the first place.

>So why did we make this change
>-    if (user_current_pid != 0)
>-      switch_to_thread (current_inferior ()->process_target (),
>-                       ptid_t (user_current_pid));
> in pdc_read_data and change our user variable which was the process
> ID to a thread? Wasn’t it already doing the job?

>This now will use the ptid_t (user_current_pid) to switch the thread ().
>However, our parent process or main thread of it, is threaded i.e is ptid_t
>(user_current_pid, 0, tid). Hence, we will crash with an assertion
>failure that thread ptid_t (user_current_pid) has not been found.

Ah, I see.  That makes sense.

>-static int pdc_read_data (pthdb_user_t, void *, pthdb_addr_t, size_t);
>-static int pdc_write_data (pthdb_user_t, void *, pthdb_addr_t, size_t);
>+static int pdc_read_data (thread_info*, void *, pthdb_addr_t, size_t);
>+static int pdc_write_data (thread_info*, void *, pthdb_addr_t, size_t);

These changes are also confusing.  First of all, my understanding has
been that the signature of these functions is fixed by the OS, since
they are passed as callbacks to pthdb_session_init.  This means you
cannot just go and change them arbitrarily ...

In addition, I'm not sure the change makes sense semantically.  Note
that you create one pd_session object *per inferior*, not one per
thread.  The pthdb_user_t identifies the pd_session, so it doesn't
make sense to use the thread_info pointer as pthdb_user_t, even
if that were possible from an API perspective.

What was the reason for not just continuing to use the PID here?

>+    scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
>     /* Before the first inferior is added, we pass inferior_ptid.pid ()
>        from pd_enable () which is 0.  There is no need to switch threads
>        during first initialisation.  In the rest of the callbacks the
>        current thread needs to be correct.  */
>-    if (user_current_pid != 0)
>-      switch_to_thread (current_inferior ()->process_target (),
>-			ptid_t (user_current_pid));
>+    inferior_ptid = user_current_thread->ptid;

If you no longer use switch_to_thread, can't you then continue to
use ptid_t (user_current_pid)?  This is only used during the
target_read_memory call, which should go down to the process
target, which doesn't require TIDs.

>+  tp = find_thread_ptid (proc_target, ptid_t (pid));
>+
>+  /* If the pthreadlibrary is not ready to debug 
>+     then this is just a main process which needs 
>+     a priv to be set.  The if condition below does 
>+     the same.  Otherwise we go to the for loop to 
>+     sync the pthread and GDB thread lists.  */

This goes back to my question above, if the library is not yet
ready, first of all we should never even get here, and second,
all PTIDs should still be PID-only and nobody should ever look
for any aix_thread_info ...

> static ptid_t
>-pd_activate (int pid)
>+pd_activate (pid_t pid, int pthdebugready)

I don't understand this flag - the point of this function is
to *find out whether* the library is ready - either
pthdb_session_init succeeds (and thus the library is ready)
or it fails (and thus the library is not ready).

>   /* If we're debugging a core file or an attached inferior, the
>      pthread library may already have been initialized, so try to
>      activate thread debugging.  */
>-  pd_activate (inferior_ptid.pid ());
>+  pd_activate (inferior_ptid.pid (), 0);

I guess this is the point?  As the comment says, this should only
ever make any difference for core files or attaching to an
already running inferior, never for starting up an inferior under
GDB.  If this isn't correct, we need to understand why.


Bye,
Ulrich


  parent reply	other threads:[~2023-02-13 19:01 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-25  6:47 Aditya Kamath1
2022-10-28  9:49 ` Ulrich Weigand
2022-11-08 12:00   ` Aditya Kamath1
2022-11-08 12:17     ` Ulrich Weigand
2022-11-13 18:15       ` Aditya Kamath1
2022-11-15 18:16         ` Ulrich Weigand
2022-11-21  8:27           ` Aditya Kamath1
2022-11-23 14:15             ` Ulrich Weigand
2022-11-23 16:03               ` Aditya Kamath1
2022-11-23 17:09                 ` Ulrich Weigand
2022-11-23 18:45                   ` Aditya Kamath1
2022-11-29  8:18                     ` Aditya Kamath1
2022-11-30 14:57                       ` Ulrich Weigand
2022-12-02  7:50                         ` Aditya Kamath1
2022-12-05 18:33                           ` Ulrich Weigand
2022-12-08 10:28                             ` Aditya Kamath1
2022-12-08 10:46                               ` Aditya Kamath1
2022-12-08 16:29                               ` Ulrich Weigand
2022-12-15 12:58                                 ` Aditya Kamath1
2022-12-15 15:53                                   ` Ulrich Weigand
2022-12-19  6:30                                     ` Aditya Kamath1
2022-12-22 12:50                                       ` Ulrich Weigand
2022-12-26 13:18                                         ` Aditya Kamath1
2023-01-09 14:04                                           ` Ulrich Weigand
2023-01-10 12:23                                             ` Aditya Kamath1
2023-01-11 13:31                                               ` Ulrich Weigand
2023-01-13 14:06                                                 ` Aditya Kamath1
2023-01-20 14:44                                                   ` Ulrich Weigand
2023-01-27 14:40                                                     ` Aditya Kamath1
2023-01-30 19:54                                                       ` Tom Tromey
2023-02-02  6:24                                                       ` Aditya Kamath1
2023-02-02  6:35                                                         ` Aditya Kamath1
2023-02-02 17:43                                                           ` Ulrich Weigand
2023-02-03 11:10                                                             ` Aditya Kamath1
2023-02-06 19:07                                                               ` Ulrich Weigand
2023-02-07 11:57                                                                 ` Aditya Kamath1
2023-02-08 18:44                                                                   ` Ulrich Weigand
2023-02-10 16:33                                                                     ` Aditya Kamath1
2023-02-10 16:46                                                                       ` Aditya Kamath1
2023-02-13 19:01                                                                       ` Ulrich Weigand [this message]
2023-02-14 14:13                                                                         ` Aditya Kamath1
2023-02-16 19:46                                                                           ` Ulrich Weigand
2023-02-17 11:26                                                                             ` Aditya Kamath1
2023-02-17 12:04                                                                               ` Ulrich Weigand
2023-02-17 13:22                                                                                 ` Aditya Kamath1
2023-02-17 14:18                                                                                   ` Ulrich Weigand
2023-02-17 15:15                                                                                     ` Aditya Kamath1
2023-02-17 19:14                                                                                       ` Ulrich Weigand
2022-11-08 12:00 Aditya Kamath1

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=10cb526ef81fd8a63fd776e51bf9505c8dff1e33.camel@de.ibm.com \
    --to=ulrich.weigand@de.ibm.com \
    --cc=Aditya.Kamath1@ibm.com \
    --cc=gdb-patches@sourceware.org \
    --cc=sangamesh.swamy@in.ibm.com \
    --cc=simark@simark.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).