* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] <CH2PR15MB35449137854F561F1EA7DE04D6D69@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2023-02-02 17:43 ` Ulrich Weigand
[not found] ` <DM6PR15MB3545C1FD15BEC2D160B3AB81D6D79@DM6PR15MB3545.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2023-02-02 17:43 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>>Is there an existing gdb test case that exercises this code?
>>If not then it seems like a new test is warranted.
>
>This I am not aware of at least when I tried finding.
I think the question here is simply whether, if you run the
test suite both without and with your patch, are any of the
FAILs fixed with the patch? If not, it would be good to
create a new test that fails without the patch and succeeds
with it, and add that to the test suite.
I think we're getting pretty close now, but I do still have
some additional comments on the latest patch:
> /* Return whether to treat PID as a debuggable thread id. */
>
>-#define PD_TID(ptid) (pd_active && ptid.tid () != 0)
>+#define PD_TID(ptid, data) (data->pd_active && ptid.tid () != 0)
I'm not sure why the pd_active test is needed here
at all - if ptid.tid () != 0, thread debugging *must*
be active, otherwise you'd never have installed a ptid
with the tid field set, right?
If that check can indeed be omitted, that would simplify
the patch a bit since you wouldn't need to provide the
"data" struct in quite as many places.
>/* Address of the function that libpthread will call when libpthdebug
> is ready to be initialized. */
>
> static CORE_ADDR pd_brk_addr;
I believe this needs to go into the aix_thread_variables struct;
the address might be different if the pthread library is loaded
at different addresses into different inferiors.
> /* Whether the current architecture is 64-bit.
> Only valid when pd_able is true. */
>
>static int arch64;
Likewise - some inferiors may be 64-bit and others 32-bit.
In general, *any* static variable in this file is suspect.
>+/* Callback for counting GDB threads for process pid. */
>
> static int
>-giter_count (struct thread_info *thread, void *countp)
>+giter_count (pid_t pid)
This only was a callback because it was called via
iterate_over_threads. Now that you're using the
all_threads C++ iterator, I think both of those
routines should just be inlined into its caller.
>@@ -565,6 +565,7 @@ solib_aix_bfd_open (const char *pathname)
> const char *sep;
> int filename_len;
> int found_file;
>+ std::string string_path = pathname;
>
> if (pathname[path_len - 1] != ')')
> return solib_bfd_open (pathname);
>@@ -618,6 +619,15 @@ solib_aix_bfd_open (const char *pathname)
> if (member_name == bfd_get_filename (object_bfd.get ()))
> break;
>
>+ std::string s = bfd_get_filename (object_bfd.get ());
>+
>+ /* For every inferior after first int bfd system we
>+ will have the pathname instead of the member name
>+ registered. Hence the below condition exists. */
>+
>+ if (string_path.compare (s) == 0)
>+ return object_bfd;
That's still not quite right, as the pathname component
might have been changed here:
/* Calling solib_find makes certain that sysroot path is set properly
if program has a dependency on .a archive and sysroot is set via
set sysroot command. */
gdb::unique_xmalloc_ptr<char> found_pathname
= solib_find (filename.c_str (), &found_file);
I think a simple but correct way to check whether the BFD filename
already contains a member name is to check for the presence of an
opening parenthesis e.g. via:
strrchr (bfd_get_filename (...), '(');
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <DM6PR15MB3545C1FD15BEC2D160B3AB81D6D79@DM6PR15MB3545.namprd15.prod.outlook.com>
@ 2023-02-06 19:07 ` Ulrich Weigand
[not found] ` <CH2PR15MB35449148A6541A1581B1749ED6DB9@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2023-02-06 19:07 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>>I think the question here is simply whether, if you run the
>>test suite both without and with your patch, are any of the
>>FAILs fixed with the patch? If not, it would be good to
>>create a new test that fails without the patch and succeeds
>>with it, and add that to the test suite.
>
>So, this is something new to me. We will add it as a continuation in the same thread
>after this patch. I will need one information. Which test suite will we add it in?
>gdb.threads or gdb.base? Also, kindly suggest a simple test case that is written that
>I can see and learn. Any simple hello_world program will do. I want to understand how
>that exp file is written and how it compares to tell if a test case is pass or fail.
I think this would fit better into gdb.threads, given that this is about the
interaction of multiple inferiors with the threading library on AIX.
I'd just look at existing test cases in that directory. For simple tests, we
usually have a .c file and a .exp file with the same name. The .exp file
starts out with instructions to build the test case, and start it up under
GDB. Then follow a series of test statements which are verified against
the output of the GDB under test. As a simple example in a related area,
you can look e.g. at fork-child-threads.{c,exp}.
>Kindly give me feedback for this patch, incase we can do anything better
>or is incorrect.
Some comments:
>@@ -508,14 +550,13 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
> /* This is needed to eliminate the dependency of current thread
> which is null so that thread reads the correct target memory. */
> {
>- scoped_restore_current_thread restore_current_thread;
>+ scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
> /* Before the first inferior is added, we pass inferior_ptid.pid ()
> from pd_enable () which is 0. There is no need to switch threads
> during first initialisation. In the rest of the callbacks the
> current thread needs to be correct. */
> if (user_current_pid != 0)
>- switch_to_thread (current_inferior ()->process_target (),
>- ptid_t (user_current_pid));
>+ inferior_ptid = ptid_t (user_current_pid);
> status = target_read_memory (addr, (gdb_byte *) buf, len);
> }
This seems unrelated to the rest of the changes at first glance.
Why is this necessary?
Also, is the "user_current_pid != 0" check even still needed given
the change to pd_enable() below?
By comparison, the Linux version of this in proc-service.c also
switches the current inferior and address space:
scoped_restore_current_inferior restore_inferior;
set_current_inferior (ph->thread->inf);
scoped_restore_current_program_space restore_current_progspace;
set_current_program_space (ph->thread->inf->pspace);
scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
inferior_ptid = ph->thread->ptid;
so we should probably do the same for consistency.
Also, the same logic will be required in pdc_write_data, where it
is currently missing completely.
>+ for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
>+ {
>+ **(struct thread_info ***) &g = tp;
>+ (*(struct thread_info ***) &g)++;
>+ }
This looks unnecessarily complicated. Isn't this just
*g++ = tp;
?
>+ /* If there is only one thread then we need not make the main
>+ thread look like a thread. It can stay as a process. This
>+ is useful when we have multiple inferiors, but only one is
>+ threaded. So we need not make the other inferiors with only
>+ main thread, look like a threaded one. For example, Thread
>+ 1.1, 1.2, 2.1, 3.1 exists then it is useful to skip this for
>+ loop for 2.1 and 3.1 leaving them as main process thread with
>+ a dummy priv set. */
>+
>+ if (pcount == 1 && gcount == 1)
>+ {
>+ aix_thread_info *priv = new aix_thread_info;
>+ tp = find_thread_ptid (proc_target, gptid);
>+ tp->priv.reset (priv);
>+ break;
>+ }
Is this a change in behavior to current GDB? I thought if the
application (whether a single inferior or one of multiple inferiors)
is threaded in the sense that it uses the libpthread library we
wanted to show it as threaded, so that the user can e.g. see the
thread ID in info threads.
>+ /* This is to make the main process thread now look
>+ like a thread. */
>+
>+ if (gptid.is_pid () && gptid.pid () == pptid.pid ())
>+ {
>+ thread_change_ptid (proc_target, gptid, pptid);
>+ aix_thread_info *priv = new aix_thread_info;
>+ priv->pdtid = pbuf[pi].pdtid;
>+ priv->tid = pbuf[pi].tid;
>+ tp = find_thread_ptid (proc_target, pptid);
>+ tp->priv.reset (priv);
>+ pi++;
>+ gi++;
>+ }
>+ else
>+ {
>+ delete_thread (gbuf[gi]);
>+ gi++;
>+ }
This logic is still confusing me. Why is the
gptid.pid () == pptid.pid ()
check still needed? I thought we now collected only threads
of a single process to begin with, so they all ought to have
the same PID?
Also, if the point is the gptid.is_pid () check, this can
really only happen once per inferior, as it is switched
from non-threaded to threaded mode, right? Maybe it
would simplify the logic to have all that (including
the code under
if (pcount == 1 && gcount == 1)
above if it is actually needed) in a separate statement
before that loop.
I.e. directly before the loop, have a separate check
whether the current process only has a single thread,
whose ptid_t is still in the pid-only format, and if
so, upgrade it to full TID format using the main thread's
TID. Only after that, go through the loop to handle
any other threads we may also have. (At that point,
all GDB threads should already always be in TID format.)
>- if (!PD_TID (ptid))
>+ if (!(ptid.tid () != 0))
That should just be "if (ptid.tid () == 0)" then.
(Here and in a few other places.)
>@@ -1741,7 +1823,7 @@ aix_thread_target::mourn_inferior ()
> {
> target_ops *beneath = this->beneath ();
>
>- pd_deactivate ();
>+ pd_disable ();
> beneath->mourn_inferior ();
> }
Why is this necessary? If it is, do we even need two
separate pd_deactivate and pd_disable routines any more?
>@@ -618,6 +618,16 @@ solib_aix_bfd_open (const char *pathname)
> if (member_name == bfd_get_filename (object_bfd.get ()))
> break;
>
>+ std::string s = bfd_get_filename (object_bfd.get ());
>+
>+ /* For every inferior after first int bfd system we
>+ will have the pathname instead of the member name
>+ registered. Hence the below condition exists. */
>+
>+ if (s.find ('(') != std::string::npos
>+ && s.find (member_name) != std::string::npos)
>+ return object_bfd;
Ah, I guess you also need to ensure the member_name follows
immediately after the '(', otherwise there could be confusion
if the member name happens to be part of the file name as well.
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB35449148A6541A1581B1749ED6DB9@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2023-02-08 18:44 ` Ulrich Weigand
[not found] ` <DM6PR15MB3545F7F15C72DF6970E4D0AED6DE9@DM6PR15MB3545.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2023-02-08 18:44 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>>This seems unrelated to the rest of the changes at first glance.
>>Why is this necessary?
>
>So, when we need to be in the right context when we read memory. Before
>coming into the target wait, we switch_to_no_thread () due to which our
>inferior_ptid is set to null. Our target_memory needs the correct
>inferior_ptid. Also, in case we don't have a ptid_t (pid) and the
>application is threaded we need the inferior_ptid to be set correctly
>like shown in the patch.
Understood.
>Previously we used switch_to_thread ().. Now if the application is
>theraded and we only pass ptid_t (user_current_pid) to switch_to_thread ()
>it will crash as main thread looks different or is ptid_t (pid, 0, tid).
This part I don't quite understand yet - how/why does it crash?
>>By comparison, the Linux version of this in proc-service.c also
>>switches the current inferior and address space:
> > scoped_restore_current_inferior restore_inferior;
> > set_current_inferior (ph->thread->inf);
> > scoped_restore_current_program_space restore_current_progspace;
> > set_current_program_space (ph->thread->inf->pspace);
> > scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
> > inferior_ptid = ph->thread->ptid;
>> so we should probably do the same for consistency.
>So, kindly allow me to disagree with you on this. What is happening is in
>inferior.c in do_target_wait1 () we call switch_to_inferior_no_thread ()..
[snip]
>Here we already set the correct current inferior and program space to
>the same thing as that if we set in pdc_read_memory like linux.
>So, it does not make any difference to add the changes like linux does
Well, it does look like if you entered the callback in this particular
context, the inferior may have already been set up correctly. However,
in theory the callback could also be called in different contexts, and
just as a precaution it would be preferable to have it always work
correctly. The semantics of the callback is to read memory of a
particular process as identified via the pthdb_user_t argument, and
we should write the routine so that it always does what's needed to
implement that semantics correctly.
>Secondly, things work if we do not do the same for pdc_write_memory.
>I have not seen anything not work. So, I don't think it is good to
>add it there. What say??
Similarly, I agree that everything may currently "work" without
adding the equivalent change to pdc_write_memory, but most likely
this is simply because that callback may just not be used very much.
But as a precaution, and to accommodate potential future changes
e.g. in the libpthdebug.a library, if would be preferable to
implement the semantics correctly. (Also, it just looks surprising
to see the read and write implementation differ when there is no
apparent reason why that should be the case.)
>>This looks unnecessarily complicated. Isn't this just
> > *g++ = tp;
>
>This I have changed.
The code now looks like:
>+ for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
>+ {
>+ *g = tp;
>+ *g++;
>+ }
Which is weird, as *g++ dereferences g for no reason. This should
simply be:
for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
*g++ = tp;
>As far as the check gptid.is_pid () is concerned, I will suggest we
>keep it there. If cmp_result is > 0 and we have a main process swap
>to create a thread. Rest is same in the loop. The reason being handling
>pi and gi variables becomes complex otherwise. When this swap happens,
>we need to increment both pi and gi.. Because we have taken care of the
>main threads in both pthread library and GDB. And this for loop is
>executed only once. So, the first event is main process being
>pthreaded. Once the swap happens pi and gi become one and since
>gcount = pcount = 1 we exit the for loop. Thread addition events comes
>after this.
Hmm, handling the initial switch of a single PID-only thread
to the PID/TID-style ptid_t separately before still seems
a bit clearer to me. But in the end your proposed code looks
correct now so I'd be fine with it as is, if you prefer.
Except for the few things mentioned above, this now looks ready to
be committed to me. However, I'm not sure the commit message
fully describes the latest version of the patch, after we've
gone through all those iterations ... Can you come up with a
message that maybe starts out with the high-level change
(along the lines of "update aix-thread.c to handle threads in
multiple inferiors"), and goes from there into the specific
details (aix_thread_variables structure, handling only a
single inferior per sync_threadlists invocation, solib fixes
for multiple inferiors, ...)? Thanks!
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <DM6PR15MB3545F7F15C72DF6970E4D0AED6DE9@DM6PR15MB3545.namprd15.prod.outlook.com>
@ 2023-02-13 19:01 ` Ulrich Weigand
[not found] ` <CH2PR15MB35444D2E4EC3340265427F47D6A29@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2023-02-13 19:01 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>Also find attached a document that I have proposed as a commit message.
Thanks for the details. However, this is too much detail for a
commit message - you don't need to include full debug sessions
documenting how you found a bug; instead, please concisely
summarize *what* the bug *is*, and how (at a high level) the
patch fixes the bug.
To be more specific, I'd write a commit message for this patch
somewhat along these lines:
Fix multi-threaded debugging under AIX
Multi-threaded debugging using the libpthdebug debug interface
is currently broken due to multiple issues.
When debugging a single inferior, we were getting assertion
failures in get_aix_thread_info as no tp->priv structure was
allocated for the main thread. Fix this by switching the main
thread from a (pid, 0, 0) ptid_t to a (pid, 0, tid) ptid_t and
allocaing the tp->priv structure in sync_threadlists.
As a result, the switch_to_thread call in pdc_read_data could
now fail since the main thread no longer uses (pid, 0, 0).
Replace the call by only switching inferior_ptid, the current
inferior, and the current address space (like proc-service.c).
Add similar switching to pdc_write_data where it was missing
completely.
When debugging multiple inferiors, an additional set of
problems prevented correct multi-threaded debugging:
First of all, aix-thread.c used to have a number of global
variables holding per-inferior information. Switch these
to a per-inferior data structure instead.
Also, sync_threadlists was getting confused as we were
comparing the list of threads returned by libpthdebug
for *one* process with GDB's list of threads for *all*
processes. Only use the GDB threads of the current
inferior instead.
Finally, the presence of the thread library in any but
the first inferior was not correctly detected due to a
bug in solib-aix.c, where the BFD file name for shared
library members was changed when the library was loaded
for the first time, which caused the library to no longer
be recognized by name when loaded a second time,
(I'm not saying you need to use this exact message - maybe
I'm missing something or getting some details wrong. But
this is the style you should be roughly going for.)
B.t.w. the fact that the message is now getting so long is
actually an indicator that it might be preferable to break
the patch out into multiple commits - the solib change is
one obvious stand-alone patch, and maybe it would make sense
to split off the aix-thread changes also needed for single-
inferior debugging from the multi-inferior support changes.
Given that we've already spent a long time on this patch,
I'm not insisting on this change - but something to keep
in mind for future patches.
Some additional comments on your latest changes:
>However, we also need to keep in mind that before we think this will
>work, our libpthread library is only ready when the following condition
>in the wait () of aix-thread.c is satisfied.
>
>/* Check whether libpthdebug might be ready to be initialized. */
> if (!data->pd_active && status->kind () == TARGET_WAITKIND_STOPPED
> && status->sig () == GDB_SIGNAL_TRAP)
>
>Until then changing the “process <pid>” to “thread <tid>” is incorrect.
>Even though the session is ready and initalised via pd_enable () and
>pd_activate () >functions respectively. Therfore this made us to keep
>a variable pthdebugready in all functions that lead to sync_threadlists ()
>so that we change the process thread to a thread with private data only
>when libpthdebug is initialised for a particular process.
I do not understand this change. The ->pd_active flag is supposed to
track exactly that information, why do we need to duplicate it into
yet another flag? Note that the point of the the "if" block above
is that *it is calling pd_activacte()*, which will set ->pd_active
if the library is in fact ready to be used. If there's anything
wrong that causes pd_active to be set when the thread library is,
in fact, not yet active, that's a bug we need to find and fix.
Also, as long as the thread library is not ready, we should not be
calling sync_threadlists in the first place.
>So why did we make this change
>- if (user_current_pid != 0)
>- switch_to_thread (current_inferior ()->process_target (),
>- ptid_t (user_current_pid));
> in pdc_read_data and change our user variable which was the process
> ID to a thread? Wasn’t it already doing the job?
>This now will use the ptid_t (user_current_pid) to switch the thread ().
>However, our parent process or main thread of it, is threaded i.e is ptid_t
>(user_current_pid, 0, tid). Hence, we will crash with an assertion
>failure that thread ptid_t (user_current_pid) has not been found.
Ah, I see. That makes sense.
>-static int pdc_read_data (pthdb_user_t, void *, pthdb_addr_t, size_t);
>-static int pdc_write_data (pthdb_user_t, void *, pthdb_addr_t, size_t);
>+static int pdc_read_data (thread_info*, void *, pthdb_addr_t, size_t);
>+static int pdc_write_data (thread_info*, void *, pthdb_addr_t, size_t);
These changes are also confusing. First of all, my understanding has
been that the signature of these functions is fixed by the OS, since
they are passed as callbacks to pthdb_session_init. This means you
cannot just go and change them arbitrarily ...
In addition, I'm not sure the change makes sense semantically. Note
that you create one pd_session object *per inferior*, not one per
thread. The pthdb_user_t identifies the pd_session, so it doesn't
make sense to use the thread_info pointer as pthdb_user_t, even
if that were possible from an API perspective.
What was the reason for not just continuing to use the PID here?
>+ scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
> /* Before the first inferior is added, we pass inferior_ptid.pid ()
> from pd_enable () which is 0. There is no need to switch threads
> during first initialisation. In the rest of the callbacks the
> current thread needs to be correct. */
>- if (user_current_pid != 0)
>- switch_to_thread (current_inferior ()->process_target (),
>- ptid_t (user_current_pid));
>+ inferior_ptid = user_current_thread->ptid;
If you no longer use switch_to_thread, can't you then continue to
use ptid_t (user_current_pid)? This is only used during the
target_read_memory call, which should go down to the process
target, which doesn't require TIDs.
>+ tp = find_thread_ptid (proc_target, ptid_t (pid));
>+
>+ /* If the pthreadlibrary is not ready to debug
>+ then this is just a main process which needs
>+ a priv to be set. The if condition below does
>+ the same. Otherwise we go to the for loop to
>+ sync the pthread and GDB thread lists. */
This goes back to my question above, if the library is not yet
ready, first of all we should never even get here, and second,
all PTIDs should still be PID-only and nobody should ever look
for any aix_thread_info ...
> static ptid_t
>-pd_activate (int pid)
>+pd_activate (pid_t pid, int pthdebugready)
I don't understand this flag - the point of this function is
to *find out whether* the library is ready - either
pthdb_session_init succeeds (and thus the library is ready)
or it fails (and thus the library is not ready).
> /* If we're debugging a core file or an attached inferior, the
> pthread library may already have been initialized, so try to
> activate thread debugging. */
>- pd_activate (inferior_ptid.pid ());
>+ pd_activate (inferior_ptid.pid (), 0);
I guess this is the point? As the comment says, this should only
ever make any difference for core files or attaching to an
already running inferior, never for starting up an inferior under
GDB. If this isn't correct, we need to understand why.
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB35444D2E4EC3340265427F47D6A29@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2023-02-16 19:46 ` Ulrich Weigand
[not found] ` <CH2PR15MB354455A756AF729AE5F796E2D6A19@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2023-02-16 19:46 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>So what is happening is that the when after a new process is born,
>its pthread library is getting intialised and we have changed its
>ptid from ptid (pid, 0, 0) to ptid (pid, 0, tid). Since we follow
>fork the code in inferior.c file will switch to the thread child
>where the child is reported as ptid (pid, 0, 0) but exists as
>ptid (pid, 0, tid). This leads to this crash. We did try with two
>variables if you recall in the previous patch. But your point of
>pd_active being there for it was valid. So somehow something isn't
>correct that I did not understand. We have pd_activate () in only
>two places. So is follow_fork () is expecting us to switch to child
>process and then change the ptid of the child?? If yes, how do we go??
>And if not where are we going wrong here.
Not sure if I follow you exactly here, but my understanding is
indeed that follow_fork should initially create an inferior for
the new process with just a single main thread using (pid, 0, 0).
Subsequently, aix-thread should detect that the new inferior
uses pthreads and then switch its ptid to (pid, 0, tid).
Not sure where exactly this goes wrong for you. What is the
path leading to this crash you're observing?
>Also this ptid_t (pid, 0, 0) and our main thread being
>ptid_t (pid, 0, tid) might need a smarted way to switch to the
>main thread's process space and set the right current inferior
>process in pdc_read_memory. Kindly check it in this patch and
>let me know if we can do it better.
So you currently do:
+ thread_info *thread = find_thread_ptid (current_inferior (),
+ ptid_t (user_current_pid));
+ /* If the pthread debug library is loaded, then we need the ptid_t (pid, 0 ,tid).
+ Since the main thread in the below for loop will be in the first iteration
+ we will break. */
+ if (!thread)
+ {
+ for (thread_info *tp: all_threads (current_inferior ()->process_target (),
+ ptid_t (user_current_pid)))
+ {
+ thread = tp;
+ break;
+ }
+ }
[...]
+ {
+ scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
+ inferior_ptid = ptid_t (user_current_pid);
+ scoped_restore_current_inferior restore_inferior;
+ set_current_inferior (thread->inf);
+
+ scoped_restore_current_program_space restore_current_progspace;
+ set_current_program_space (thread->inf->pspace);
+ status = target_write_memory (addr, (gdb_byte *) buf, len);
+ }
This is overkill. Note that at no point do you actually ever
use "thread" itself, only "thread->inf". But you can determine
the inferior associated with a PID directly via find_inferior_pid
without ever involving any thread.
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB354455A756AF729AE5F796E2D6A19@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2023-02-17 12:04 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544B015A880690CA1F6284AD6A19@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2023-02-17 12:04 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>GDB has three things to do here.
>** It will get a new object file notifier of the new process,
>** new inferior is created so the new inferior notifier
>** And follow_fork () event since GDB must decide it has to follow the child.
>Note:- Our child process is <pid, 0, 0>
>
>This is the exact order in which the above 3 things will be executed from the
>main GDB event loop.
Hmm. I understood the order to be a bit different. I agree there will first
be new objfile notifiers for the main executable and shared libraries.
But then we should get to infrun.c:follow_fork, which calls
infrun.c:follow_fork_inferior, which does in sequence:
switch_to_thread (*child_inf->threads ().begin ());
post_create_inferior (0);
So it should *first* do the switch_to_thread, and only *then*
call post_create_inferior (which in turn triggers the new
inferior notifier).
As a result, I understand it should be fine to switch ptid_t
in the new inferior notifier. But I agree that already switching
it in the new objfile notifier is a problem.
I see that linux-thread-db.c has special code for that:
/* When attaching / handling fork child, don't try loading libthread_db
until we know about all shared libraries. */
if (inf->in_initial_library_scan)
return false;
I believe you need to similarly skip calling pd_activate
from pd_enable if that in_initial_library_scan flag is
true for the current inferior.
Can you try if this resolves the problem?
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB3544B015A880690CA1F6284AD6A19@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2023-02-17 14:18 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544E905156ECE35D5ADCD43D6A19@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2023-02-17 14:18 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>Yes, it resolves the issue.
Excellent. A few final comment on the patch, including one
change I hadn't noticed before:
>@@ -508,14 +552,17 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
> /* This is needed to eliminate the dependency of current thread
> which is null so that thread reads the correct target memory. */
> {
>- scoped_restore_current_thread restore_current_thread;
>+ scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
> /* Before the first inferior is added, we pass inferior_ptid.pid ()
> from pd_enable () which is 0. There is no need to switch threads
> during first initialisation. In the rest of the callbacks the
> current thread needs to be correct. */
This comment is no longer relevant as the code relating to it was
deleted. The comment should be deleted as well.
>- if (user_current_pid != 0)
>- switch_to_thread (current_inferior ()->process_target (),
>- ptid_t (user_current_pid));
>+ inferior_ptid = ptid_t (user_current_pid);
>+ scoped_restore_current_inferior restore_inferior;
>+ set_current_inferior (inf);
>+
>+ scoped_restore_current_program_space restore_current_progspace;
>+ set_current_program_space (inf->pspace);
> status = target_read_memory (addr, (gdb_byte *) buf, len);
> }
> ret = status == 0 ? PDC_SUCCESS : PDC_FAILURE;
>+ tp = find_thread_ptid (proc_target, ptid_t (pid));
>+
>+ /* If the pthreadlibrary is not ready to debug
>+ then this is just a main process which needs
>+ a priv to be set. The if condition below does
>+ the same. Otherwise we go to the for loop to
>+ sync the pthread and GDB thread lists. */
>+
> /* Apply differences between the two arrays to GDB's thread list. */
>+
> for (pi = gi = 0; pi < pcount || gi < gcount;)
These changes all seem to be leftovers from previous attempts,
I guess they should be removed again.
>+ inferior *inf = current_inferior ();
>+ /* When attaching / handling fork child, don't try loading libthread_db
>+ until we know about all shared libraries. */
>+ if (inf->in_initial_library_scan)
>+ return;
"libthread_db" is Linux specific. Please update the comment so
it makes sense in the AIX context.
>@@ -1362,12 +1439,16 @@ aix_thread_target::fetch_registers (struct regcache *regcache, int regno)
> {
> struct thread_info *thread;
> pthdb_tid_t tid;
>+ thread = find_thread_ptid (current_inferior ()->process_target (), ptid_t (regcache->ptid ().pid (), 0, regcache->ptid ().tid ()));
>
>- if (!PD_TID (regcache->ptid ()))
>+ /* If a new inferior is born, then its pthread debug library is yet to
>+ initialised and hence has no private data. So the below if condition
>+ exists. */
>+
>+ if (regcache->ptid ().tid () == 0)
> beneath ()->fetch_registers (regcache, regno);
> else
> {
>- thread = find_thread_ptid (current_inferior (), regcache->ptid ());
> aix_thread_info *priv = get_aix_thread_info (thread);
> tid = priv->tid;
I hadn't seen this change below, it doesn't really make sense to me.
You really need to use regcache->ptid here, this should be correct.
When did you see a case where this was not correct? Does this still
happen now that we have the in_initial_library_scan check?
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB3544E905156ECE35D5ADCD43D6A19@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2023-02-17 19:14 ` Ulrich Weigand
0 siblings, 0 replies; 22+ messages in thread
From: Ulrich Weigand @ 2023-02-17 19:14 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>Please find attached the patch with all the changes mentioned.
This version looks good to me. I've checked it in now.
Thanks,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB3544A44737D2C7BF3CDC9BAAD6CC9@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2023-01-30 19:54 ` Tom Tromey
0 siblings, 0 replies; 22+ messages in thread
From: Tom Tromey @ 2023-01-30 19:54 UTC (permalink / raw)
To: Aditya Kamath1 via Gdb-patches
Cc: Ulrich Weigand, simark, Aditya Kamath1, Sangamesh Mallayya
>>>>> Aditya Kamath1 via Gdb-patches <gdb-patches@sourceware.org> writes:
> We now have all variables {pd_able, pd_active and pd_session} now in a
> map of process ID and structure. This will help us make AIX GDB code
> easy to manage them per process in the aix-thread.c file.
I don't really know what this is about, but it's probably better to
attach the data directly to the inferior using the registry system.
(You can't use private_inferior as apparently that's reserved for the
process stratum.)
Search for registry<inferior> for some examples.
> Secondly, in the function pid_to_str () there is a beneath () call,
> which is why I had to put this function in rs6000-aix-nat.c file.
I wonder why it's necessary, as it seems to me that
aix_thread_target::pid_to_str should have already handled the 'thread'
case, so the inherited method ought to be good enough.
> ---------------------------------------------------
> Output with patch applied:-
Is there an existing gdb test case that exercises this code?
If not then it seems like a new test is warranted.
> static void
> pd_disable (void)
> {
> - if (!pd_able)
> + if (!tmap[inferior_ptid.pid ()].pd_able)
> return;
> - if (pd_active)
> + if (tmap[inferior_ptid.pid ()].pd_active)
> pd_deactivate ();
> - pd_able = 0;
> + tmap[inferior_ptid.pid ()].pd_able = 0;
> current_inferior ()->unpush_target (&aix_thread_ops);
> + tmap.erase (inferior_ptid.pid ());
> }
It's better to pass in a ptid or even the aix_thread_variables object
itself than to rely on globals in low-level functions like this.
> diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
> index f483f54de13..74676cbfa50 100644
> --- a/gdb/solib-aix.c
> +++ b/gdb/solib-aix.c
> @@ -550,6 +550,10 @@ solib_aix_in_dynsym_resolve_code (CORE_ADDR pc)
> return 0;
> }
> +/* For multi inferiors, post object file name change
> + we store the new names in this vector. */
> +std::vector<std::string> aix_slib_name;
> + std::string s = bfd_get_filename (object_bfd.get ());
> + auto it = aix_slib_name.begin ();
> + while (it != aix_slib_name.end ())
> + {
> + std::string s1 = *it;
> + if (s1.compare(s) == 0)
> + return object_bfd;
> + it++;
This doesn't look right to me at all. Using a global means that BFDs
from one inferior might "leak" to another, based solely on whether a
certain name was ever seen. Also nothing ever cleans out the global
vector.
It's better to attach this data to the relevant BFD using the registry
system, and not use a global at all.
Tom
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <BY5PR15MB3540261B391EA10F29AF5120D6C29@BY5PR15MB3540.namprd15.prod.outlook.com>
@ 2023-01-20 14:44 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544A44737D2C7BF3CDC9BAAD6CC9@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2023-01-20 14:44 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>Inorder to resolve the same I request for one information. How can we iterate_over_threads
>of a particular process. What is that function. Is there any built-in available??
>Kindly let me know and that should solve this issue.
Instead of iterate_over_threads you could use the all_threads() iterator directly;
this can be specialized to only return threads of one inferior like this:
for (thread_info *tp : all_threads (proc_target, ptid_t (pid)))
{
...
}
>Also kindly give me feedback on this patch if I need to change anything.
I think this change in solib-aix.c is not quite correct:
+ std::string s = bfd_get_filename (object_bfd.get ());
+ if (s.find (member_name) != std::string::npos)
+ {
+ return object_bfd;
+ }
This matches the member name *anywhere* in the full filename,
which could lead to spurious matches, I think. The test
should be more specific.
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB354417A03BCE77E3BA1C3A20D6FF9@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2023-01-11 13:31 ` Ulrich Weigand
[not found] ` <BY5PR15MB3540261B391EA10F29AF5120D6C29@BY5PR15MB3540.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2023-01-11 13:31 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>So, looking at this, I have missed out something may be minor or
>major causing the bug which I am unaware of in the code base.
The one problem I see is that there are still global variables
that seem to get clobbered if there are multiple inferiors,
in particular pd_session and pd_brk_addr. Those will get
overwritten by each pd_activate call for each inferior.
Note that pthdb_session_init registers the pid, so the resulting
pd_session should be different for each inferior. If you just
overwrite it, then the pid values passed to all callbacks will
always reflect only the latest inferior.
Similarly, pd_brk_addr is potentially different in each inferior,
if libpthread is loaded at different addresses.
You should create a struct containing all per-inferior thread-
related data members, and then allocate one such struct per
inferior.
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB3544BD03B93F39BA8E9E5452D6EC9@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2023-01-09 14:04 ` Ulrich Weigand
[not found] ` <CH2PR15MB354417A03BCE77E3BA1C3A20D6FF9@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2023-01-09 14:04 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>Here I have added a print statement to ensure we are able to find the member in the archive.
>
>What's interesting is for the first inferior this works fine for all shared libraries.
>For the second one and every inferior thereafter the output is as shown below in the next paragraph,
>object_bfd is shr.o --- compared with --- member_name is shr_comm.o in path /usr/lib/libpthreads.a(shr_comm.o)
>object_bfd is /usr/lib/libpthreads.a(shr_comm.o) --- compared with --- member_name is shr_comm.o
>in path /usr/lib/libpthreads.a(shr_comm.o)
>I was surprised that the bfd_get_filename (object_bfd.get ()) is returning the pathname
>instead of the object file descriptor. Everything until here seems to correct in the
>solib_aix_bfd_open () function and this makes it hard for me to understand what is going on.
Looks like this is because solib_aix_bfd_open *changes* the BFD filename here:
/* Override the returned bfd's name with the name returned from solib_find
along with appended parenthesized member name in order to allow commands
listing all shared libraries to display. Otherwise, we would only be
displaying the name of the archive member object. */
std::string fname = string_printf ("%s%s",
bfd_get_filename (archive_bfd.get ()),
sep);
bfd_set_filename (object_bfd.get (), fname.c_str ());
so when the same BFD gets checked a second time, you'll now see the changed
filename instead of the original one.
I think you'll have to allow for that modified form of the name as well.
>Even if I allow a pathname match to the member_name we end up losing all the
>information of our threads in the first process though we still have the
>process information.
This needs further debugging to understand what's going on once you allow
that match. That original problem should be fixed by that change, so
there's probably something else as well ...
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB35449A5A2BA7153A02CE913BD6E59@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2022-12-22 12:50 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544BD03B93F39BA8E9E5452D6EC9@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2022-12-22 12:50 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>So, I guess if we manage to do something similar just like for the
>first inferior, we will get to the solution, but I did not understand
>how Linux might be reading the symbol again and again for a new
>inferior or AIX for that matter for the first inferior. Kindly let
>me know how we can do something similar or are we missing something
>here that I have not kept in mind in our attempt to solve this for
>AIX and GDB community.
So the way is works on Linux is that linux-thread-db.c registers
both an inferior_created observer and a new_objfile observer.
The inferior_created observer gets called immediately after the
new inferior is noticed - at this point, only the main executable
is detected by GDB, not any of the shared libraries. So this will
only successfully detect a libpthread-linked executable if it was
*statically* linked. For dynamically linked executables, the
new_objfile observer will later get called as well, once for each
shared library. As soon as this happens for the libpthread shared
library, the fact that the inferior is multithreaded is detected.
At first glance, this should actually work the same on AIX - the
aix-thread.c file also registers both inferior_created and
new_objfile observers. Not sure why this doesn't work ...
Do you see the new_objfile observer being called for all the
shared libraries in the second inferior?
If not, are shared libraries actually detected correctly in the
second inferior (e.g. does "info sharedlibrary" show them correctly
in the second inferior)? If not, maybe solib-aix.c also needs to
be reviewed whether it handles multiple inferiors correctly ...
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB3544B1D8ED7EA9E56FEC79E5D6E19@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2022-12-15 15:53 ` Ulrich Weigand
[not found] ` <CH2PR15MB35449A5A2BA7153A02CE913BD6E59@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2022-12-15 15:53 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>[New Thread 258]
>[New Thread 515]
>fetch_regs_kernel_thread tid=225018d regno=64 arch64=0
>[New inferior 2 (process 8061286)]
>pdc_free (user_current_pid = 17957132, buf = 0x11016f370)
>pdc_free (user_current_pid = 17957132, buf = 0x11016f3b0)
>pdc_free (user_current_pid = 17957132, buf = 0x11016f4f0)
>pdc_free (user_current_pid = 17957132, buf = 0x1104e3a70)
>pdc_free (user_current_pid = 17957132, buf = 0x1108af0d0)
>pdc_symbol_addrs (user_current_pid = 17957132, symbols = 0xfffffffffffdef8, count = 1)
> symbols[0].name = "__n_pthreads"
> returning PDC_FAILURE
>pdc_symbol_addrs (user_current_pid = 8061286, symbols = 0xfffffffffffe248, count = 1)
> symbols[0].name = "__n_pthreads"
> returning PDC_FAILURE
>I am parent
>[New process 17957132]
>[New inferior 3 (process 17433000)]
>pdc_symbol_addrs (user_current_pid = 17957132, symbols = 0xfffffffffffdef8, count = 1)
> symbols[0].name = "__n_pthreads"
> returning PDC_FAILURE
I think the problem here may be that the lookup_minimal_symbol
call in pdc_symbol_addrs has to be performed in the correct
process address space. This wasn't an issue before since the
routine was only called for the first process anyway.
Look at the equivalent routine on Linux, which is
ps_pglobal_lookup in proc-service.c. This does:
inferior *inf = ph->thread->inf;
scoped_restore_current_program_space restore_pspace;
set_current_program_space (inf->pspace);
You'll need to do the equivalent (set the program space
to the one appropriate for the inferior referenced by
user_current_pid).
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB35447EE88C64F7F36A0C61F5D61D9@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2022-12-08 16:29 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544B1D8ED7EA9E56FEC79E5D6E19@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2022-12-08 16:29 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>>So this last bit seems to be the problem. Could you elaborate on
>>what the exact call stack is? I thought once the thread layer is
>>initialized, calls to ::wait should always go through it ...
>
>Kindly see the backtrace sections
>BT:- Thread_wait [which is on a thread event like new thread born or main process is pthreaded],
>BT:- Post thread wait in rs6000-aix-nat::wait [which is the beneath ()->wait () in aix_thread_target::wait],
>BT:- If direct rs6000-aix-nat::wait [ where in output 3 and 4 {below in this email} you can see it will directly come to rs6000-aix-nat.c if the main process after having threads forks or uses a fork () call ] pasted below in this email.
I'm only replying to this is right now, because that seems to
be the fundamental problem that ultimately causes a lot of the
other issues you're seeing.
It seems the core problem is that you're not initializing the
thread layer correctly for any but the first inferior! So all
other inferiors started with fork are assumed to be single-
threaded ...
If you look at a backtrace like this:
>BT:- If direct rs6000-aix-nat::wait
>
>Thread 1 hit Breakpoint 2, rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...,
> ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
>695 set_sigint_trap ();
>(gdb) bt
>#0 rs6000_nat_target::wait (this=0x1100a2e10 <_rs6000aixnat.rw_>, ptid=...,
> ourstatus=0xffffffffffff360, options=...) at rs6000-aix-nat.c:695
>#1 0x0000000100340778 in target_wait (ptid=..., status=0xffffffffffff360, options=...) at target.c:2598
you see that the target.c code uses the current inferior's
"top_target" to find the appropriate target routines:
target_ops *target = current_inferior ()->top_target ();
[...]
ptid_t event_ptid = target->wait (ptid, status, options);
For a multi-threaded process "top_target" *should* point to
aix_thread_ops, which is achieved by this call in pd_enable:
current_inferior ()->push_target (&aix_thread_ops);
However, note that this is applied only to *one* inferior.
You actually need to do this for *all* new inferiors as soon
as they are detected to become multi-threaded.
This does not happen because aix-thread.c currently has a static
global pd_able variable that applies to GDB as a whole. Back in
the days where this was introduced, that was probably correct
since a single GDB session could only debug one single inferior
back then. But for multiple inferiors, any of which can be
multi-threaded, this does not work.
I think you should first of all work on fixing this, and then
go back to validating your test scenarios without any of the
other changes - many of those likely will no longer be
necessary then.
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB354486F155F5CC97501D056DD6179@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2022-12-05 18:33 ` Ulrich Weigand
[not found] ` <CH2PR15MB35447EE88C64F7F36A0C61F5D61D9@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2022-12-05 18:33 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>>I'm not sure why it is necessary to handle this in the process layer
>>(rs6000-aix-nat.c) instead of the thread layer (aix-thread.c).
>>What specifically breaks if you do not have these rs6000-aix-nat.c
>>changes?
>
>So, if you observe output 3 or 4, the program first multi threads,
>I mean thread events are handled first and then the threads fork.
>So, when this happens, I cannot return ptid_t (parent_pid). If I do
>so, the GDB core will treat it as a new process and add it in my
>threadlist as say process 100 despite existence of 'thread 1'
>representing the same. So, I need to correctly send which thread
>did the fork () event or which thread of the process is the one who
>gave birth to a new inferior process [say 2 or 3 in output 3 below],
>I mean which thread caused the mult process event when the process
>is mutli threaded. This has to handled here as control from target.c
>comes directly to rs6000-aix-nat::wait and not through
>aix-thread.c::wait since fork () is a process event..
So this last bit seems to be the problem. Could you elaborate on
what the exact call stack is? I thought once the thread layer is
initialized, calls to ::wait should always go through it ...
>>If you *do* need to handle LWPs (kernel thread IDs) in the process
>>layer (this can be a reasonable choice, and it done by several other
>>native targets), then it should be *consistent*, and *all* LWP handling
>>should be in the process layer. In particular, under no circumstances
>>does it make sense to duplicate the "find current/signalled thread"
>>code in *both* the process any thread layers.
>
>This not straightforward to do. The reason being say our application is pthreaded
>We need our sync_threadlists() code to detect multiple threads and sync..
>We cannot handle this in rs6000-aix-nat.c with the current design of the code..
>Let's say child process is multi-threaded things can get complex..
>It will require us to move that whole GDB list and Pthread list sync code to
>rs6000-aix-nat.c code. The essence or most selling product or the USP
>[Unique Selling Proposition] of aix-thread.c code will be lost.
So the way this works e.g. on Linux is that the process layer handles
both processes and the *kernel* aspect of threads, while the thread
layer handles the *user-space* (libc/libpthread) aspect of threads.
In terms of the GDB ptid_t, this means that both the "pid" and "lwp"
field are "owned" by the process layer (which would be rs6000-aix-nat.c
in your case), while only the "tid" field is owned by the thread
layer (which would be aix-thread.c).
Linux does that because it allows correctly debugging programs that
only use the kernel threading capabilities without using libpthread,
e.g. by directly calling the "clone" system call and not "pthread_create".
Such threads won't be in the thread list managed by the user space
library, but are still handled by the process layer in GDB, tracked
as lwp without associated tid.
Not sure if something like that is even possible in AIX. If it does
make sense to handle things similarly in AIX (one other reason would
be ptrace commands that require LWPs, e.g. like the VSX register
access you had in another thread), some code would indeed need
to move, e.g. everything related to accessing *kernel* threads
(fetch_regs_kernel_thread etc.), while code that accesses *user*
threads via the libpthread accessors (fetch_regs_user_thread etc.)
would still remain in aix-thread.c.
>>>[Switching to process 16777620]
>
>>This outputs inferior_ptid ...
>
>Yes, you were right
>
>>>* 1.1 process 16777620 0xd0595fb0 in _p_nsleep ()
>>> from /usr/lib/libpthread.a(shr_xpg5.o)
>>> 1.2 process 16777620 0xd0595fb0 in _p_nsleep ()
>>> from /usr/lib/libpthread.a(shr_xpg5.o)
>>> 1.3 process 16777620 0xd0595fb0 in _p_nsleep ()
>>> from /usr/lib/libpthread.a(shr_xpg5.o)
>>> 2.1 process 8323570 0xd0594fc8 in ?? ()
>>> 3.1 process 17957172 0xd0594fc8 in ?? ()
>
>>... and this outputs the ptid values for those threads.
>
>>If it says "process ...", then those ptid values have not
>>properly been switched over to the (pid, lwp, tid) format.
>While debugged in depth last two days I realised our pid_to_str
>is needed in rs6000-aix-nat.c as control comes here in search of it.
>If it doesn't GDB treats all threads as process.
This is again very suspicious. We obviously already have
threads, so the thread layer should be initialized. This
means that any "pid_to_str" call should go through the
*thread* layer (implementation in aix-thread.c). If that
doesn't happen, we should understand why. (This may be the
same problem that causes "wait" to be called from the
wrong layer, as seen above.)
>>You should verify that the sync_threadlists code handles
>>all multi-process cases correctly. I haven't looked at
>>this in detail, but are you sure that here:
>
>>>@@ -841,8 +829,22 @@ sync_threadlists (int pid)
> >> }
> >> else if (cmp_result > 0)
> >> {
>>>- delete_thread (gbuf[gi]);
>
>
>>you never accidentally switch the *pid* part (if "gptid"
>>belows to a different pid than "pptid")?
>
>So, this is not the reason. I have added an assertion here just
>to be sure. I get what you are thinking.
Having an assertion is of course good, but it isn't obvious to
me that this never can be hit.
>>Hmm. So when "wait" returns, it needs to determine which thread
>>triggered the event that caused ptrace to stop. On Linux, "wait"
>>will actually return the LWP of that thread, so it can be directly
>>used. It seems on AIX, "wait" only returns a PID, and you do not
>>immediately know which thread caused the event?
>
>>In that case, I can see why you'd have to consider SIGINT as well
>>as SIGTRAP. However, it seems to me that even those two are not the
>>*only* cases that can cause "wait" to return - doesn't *any* signal
>>(potentially) trigger a ptrace intercept (causing wait to return)?
>
>>But that's probably a more general problem, and wouldn't occur in
>>this simple test case.
>
>Exactly. So I tried debugging few examples causing a few other signals
>as mentioned in this document [https://www.ibm.com/docs/en/sdk-java-technology/8?topic=reference-signal-handling].
>In AIX we have most of them mentioned in the link. It does not block
>us from doing things or crashes incase of a segment fault signal
>[from our debugger code]. Abort also works fine. Let me know what you think.
The point is if GDB stops because the target received a signal, it
should automatically switch to the particular thread where the signal
was in fact received. I don't think this will actually happen in all
cases with the current code.
Shouldn't you instead check for *any* signal in get_signaled_thread?
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB354465C64C4B32DA395C110AD6129@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2022-11-30 14:57 ` Ulrich Weigand
[not found] ` <CH2PR15MB354486F155F5CC97501D056DD6179@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2022-11-30 14:57 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>Before we had an extra thread 'process 100' apart from 'thread 1'.
>So, in case someone interrupted a thread with ctrl+c.. In the
>pd_update () even if we don't have thread who signalled this interrupt
>when we return ptid_t (pid) it was fine. But now with no 'process 100'
>and only 'thread 1', we need to take care of interrupt as well,
>otherwise GDB core will take ptid_t(100) as a new process.
>So the change
>+ if (thrinf.ti_cursig == SIGTRAP || thrinf.ti_cursig == SIGINT)
Hmm. So when "wait" returns, it needs to determine which thread
triggered the event that caused ptrace to stop. On Linux, "wait"
will actually return the LWP of that thread, so it can be directly
used. It seems on AIX, "wait" only returns a PID, and you do not
immediately know which thread caused the event?
In that case, I can see why you'd have to consider SIGINT as well
as SIGTRAP. However, it seems to me that even those two are not the
*only* cases that can cause "wait" to return - doesn't *any* signal
(potentially) trigger a ptrace intercept (causing wait to return)?
But that's probably a more general problem, and wouldn't occur in
this simple test case.
>In this case if we return a ptid_t (pid) then gdb core treats it
>as a new process since our parent thread is pthreaded GDB core
>only knows threads like ptid_t (pid, tid, pthid) and not
>ptid_t (pid). In order to avoid this, the proposed patch uses a
>function call find_the_return_ptid () to figure out the same.
>The changes like below are for this reason.
>
>ourstatus->set_forked (ptid_t (pid));
>- return ptid_t (parent_pid);
>+ return find_the_return_ptid (parent_pid);
>
>Kindly note that the control will always not come from aix-thread.c
>for such events. Hence, we cannot take care of the same there,
>though it will be a relief if we can do that.
I'm not sure why it is necessary to handle this in the process layer
(rs6000-aix-nat.c) instead of the thread layer (aix-thread.c).
What specifically breaks if you do not have these rs6000-aix-nat.c
changes?
If you *do* need to handle LWPs (kernel thread IDs) in the process
layer (this can be a reasonable choice, and it done by several other
native targets), then it should be *consistent*, and *all* LWP handling
should be in the process layer. In particular, under no circumstances
does it make sense to duplicate the "find current/signalled thread"
code in *both* the process any thread layers.
>[Switching to process 16777620]
This outputs inferior_ptid ...
>0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
>(gdb) info threads
> Id Target Id Frame
>* 1.1 process 16777620 0xd0595fb0 in _p_nsleep ()
> from /usr/lib/libpthread.a(shr_xpg5.o)
> 1.2 process 16777620 0xd0595fb0 in _p_nsleep ()
> from /usr/lib/libpthread.a(shr_xpg5.o)
> 1.3 process 16777620 0xd0595fb0 in _p_nsleep ()
> from /usr/lib/libpthread.a(shr_xpg5.o)
> 2.1 process 8323570 0xd0594fc8 in ?? ()
> 3.1 process 17957172 0xd0594fc8 in ?? ()
... and this outputs the ptid values for those threads.
If it says "process ...", then those ptid values have not
properly been switched over to the (pid, lwp, tid) format.
You should verify that the sync_threadlists code handles
all multi-process cases correctly. I haven't looked at
this in detail, but are you sure that here:
@@ -841,8 +829,22 @@ sync_threadlists (int pid)
}
else if (cmp_result > 0)
{
- delete_thread (gbuf[gi]);
- gi++;
+ if (gptid.is_pid ())
+ {
+ thread_change_ptid (proc_target, gptid, pptid);
you never accidentally switch the *pid* part (if "gptid"
belows to a different pid than "pptid")?
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB35441FE8F8E1FECD83202EF6D60C9@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2022-11-23 17:09 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544AB7CA50F834387522DB0D60C9@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2022-11-23 17:09 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>Hmm. So, something is going wrong here..
>gcount = 0;
> iterate_over_threads (giter_count, &gcount);
> g = gbuf = XNEWVEC (struct thread_info *, gcount);
> iterate_over_threads (giter_accum, &g);
> qsort (gbuf, gcount, sizeof *gbuf, gcmp);
Looks like this is deliberate:
/* iterate_over_threads() callback for counting GDB threads.
Do not count the main thread (whose tid is zero). This matches
the list of threads provided by the pthreaddebug library, which
does not include that main thread either, and thus allows us
to compare the two lists. */
static int
giter_count (struct thread_info *thread, void *countp)
{
if (PD_TID (thread->ptid))
(*(int *) countp)++;
return 0;
}
Maybe that comment is wrong about pthreaddebug not including
the main thread? Or maybe that changed between AIX versions?
In any case, something needs to be fixed here.
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB35442C5A578DC4042072FC83D60A9@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2022-11-23 14:15 ` Ulrich Weigand
[not found] ` <CH2PR15MB35441FE8F8E1FECD83202EF6D60C9@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2022-11-23 14:15 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>@@ -514,8 +514,16 @@ pdc_read_data (pthdb_user_t user_current_pid, void *buf,
> during first initialisation. In the rest of the callbacks the
> current thread needs to be correct. */
> if (user_current_pid != 0)
>- switch_to_thread (current_inferior ()->process_target (),
>- ptid_t (user_current_pid));
>+ {
>+ inferior *inf = find_inferior_ptid (current_inferior ()-> process_target (),
>+ ptid_t (user_current_pid));
This would be simpler using find_inferior_pid.
>+ for (thread_info *tp: inf->threads ())
>+ if (tp != NULL)
This would be simpler using any_thread_of_inferior.
>+ {
>+ switch_to_thread (tp);
>+ break;
>+ }
>+ }
> status = target_read_memory (addr, (gdb_byte *) buf, len);
However, switching to just any random thread of the process seems odd.
Looking at sol-thread.c, they don't switch to a thread at all
in the equivalent place, but rather do this:
scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
if (inferior_ptid.tid_p () || !target_thread_alive (inferior_ptid))
{
/* It's either a thread or an LWP that isn't alive. Any live
LWP will do so use the first available.
NOTE: We don't need to call switch_to_thread; we're just
reading memory. */
inferior_ptid = procfs_first_available ();
}
Since your xfer_partial routine only ever looks at the PID
component of the ptid, I'm wondering if we couldn't similarly
just switch inferior_ptid, without actually switching theads.
Something along the lines of
scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
if (user_current_pid != 0)
inferior_ptid = ptid_t (user_current_pid);
Does this work for you?
>- if (thrinf.ti_cursig == SIGTRAP)
>+ /* In a multi threaded application user can interrupt the main
>+ thread as well. This function should return the tid in this
>+ case apart from threads that can trap or be interrupted. */
Whitespace problem.
>+
>+ if (thrinf.ti_cursig == SIGTRAP || thrinf.ti_cursig == SIGINT)
> return thrinf.ti_tid;
This seems an unrelated change? If this is actually necessary,
then all the comments (e.g. at the top of this function, or at
the call site) likewise need to be updated - they only refer to
trap signals currently.
> process_stratum_target *proc_target
> = current_inferior ()->process_target ();
>- thread = add_thread_with_info (proc_target,
>- ptid_t (pid, 0, pbuf[pi].pthid),
>- priv);
>+
>+ thread_info *tp = find_thread_ptid (proc_target, ptid_t (pid));
>+
>+ /* If the pthread library is used then we replace the main
>+ with the thread having the main thread ID and process ID.
>+ Otherwise this is a new thread and we need to add it. */
>+ if (tp != NULL && tp->priv == NULL)
>+ {
>+ thread_change_ptid (proc_target, tp->ptid,
>+ ptid_t (pid, 0, pbuf[pi].pthid));
>+ tp->priv.reset (priv);
>+ }
>+ else
>+ thread = add_thread_with_info (proc_target,
>+ ptid_t (pid, 0, pbuf[pi].pthid),
>+ priv);
I'm confused why this is the correct place. From what I can see,
in this scenario, we have:
- libpthdebug reports some threads using a thread ID, i.e. pbuf has
ptid_t (pid, 0, pthid1)
..
ptid_t (pid, 0, pthidN)
with pcount >= 1.
- GDB only has one single thread in unthreaded mode, i.e. gbuf has
ptid_t (pid, 0, 0)
with gcount == 1.
So when running the loop, during the first iteration, we should compare
ptid_cmp (ptid_t (pid, 0, pthid1), ptid_t (pid, 0, 0))
which should be > 0 since pthid1 > 0. Right?
This means we'll get into the branch that just does:
delete_thread (gbuf[gi]);
thereby deleting the original thread. Does this not happen for you?
What is going on instead?
[ Note that this is a simplified case with only a single process; in the
multi-process scenario, this may be more complex. ]
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB3544E198B0622C3AD22597A2D6029@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2022-11-15 18:16 ` Ulrich Weigand
[not found] ` <CH2PR15MB35442C5A578DC4042072FC83D60A9@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2022-11-15 18:16 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>>So I think instead of adding a "priv" struct to that GDB thread
>>identifying the main process, the sync_threadlists routine should
>>actually just delete it (or replace it with the actual first thread,
>>whatever is easier).
>
>I have chosen not to add the first main thread as new thread. Instead,
>we carry on with main process thread itself adding private data to it.
>Kindly see the first if condition. I observed this with the linux folks
>where in their output as you mentioned do not add any new threads the
>first time on recognition of multi thread debugee for the main process.
OK, but this is still weird:
>* 1 process 26149278 0xd0595fb0 in _p_nsleep ()
> 2 Thread 258 (tid 24445361, running) thread_function (arg=0x0)
> 3 Thread 515 (tid 16187681, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)
Why does the first thread look so different? That's not the
case with Linux threads. I believe even if you re-use the
thread structure, you'll still need to switch the ptid to one
that indicates a thread instead of a non-threaded process.
>A couple of things I want to inform you is that the way the second
>for loop is executing is not correct from here on to sync both the
>buffer lists [pthread and GDB thread]. Since we are now not adding
>multiple threads for the same process main thread one representing
>the GDB thread and the other by the pthread those conditions and
>indices like pi and gi will fail. Now there has not pcount - 1
>threads in the GDB thread buffer always. Condition 2 and 3 in the
>patch take care of them for addition and deletion of threads.
The new logic doesn't look correct to me - note that it never
even looks at thread IDs any more, just the raw number of threads.
So for example if *any* thread exits, the code will always delete
the *last* thread from the GDB list - whether this is actually
the one that exited or not.
I do think it is necessary to compare thread IDs - you need to
map the thread IDs retrieved by libpthdebug against the thread
IDs already present in GDB's thread list. If a matching thread
ID is present in both lists, it should not be touched. If a
thread ID occurs only in the libpthdebug list, it needs to be
added to GDB's list. If a thread ID occurs only in GDB's list,
it needs to be removed from there.
That's what the old code attempted to do as far as I can see;
if it got it wrong in certain corner cases, they need to be fixed;
but completely removing that logic seems just wrong.
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] ` <CH2PR15MB3544280C0A0B3D7F2F35BD0FD63F9@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2022-11-08 12:17 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544E198B0622C3AD22597A2D6029@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2022-11-08 12:17 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
>>You should find out why the "priv" field isn't
>>set up correctly, and fix whatever was going
>>wrong there. (I believe this should have been
>>done in sync_threadlists.)
>
>You were right about this. What is happening is the main process
>and the thread representing it are treated as two separate threads
>by the libpthread library. Main process had no private data set
>whereas the thread representing it had. Usually, both of them
>should have it and their private data must be the same.
I see. I agree this is the root cause of the issue, but the fix
doesn't look quite right to me.
You should not even *have* the duplicate GDB thread in the first place.
>(gdb) info threads
> Id Target Id Frame
>* 1 process 12059046 0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
> 2 Thread 1 (tid 39125487, running) 0xd0595fb0 in _p_nsleep () from /usr/lib/libpthread.a(shr_xpg5.o)
> 3 Thread 258 (tid 23396809, running) thread_function (arg=0x0) at continue-pending-status.c:36
> 4 Thread 515 (tid 36503883, running) thread_function (arg=warning: (Internal error: pc 0x0 in read in psymtab, but not in symtab.)
This is not how GDB threads are handled on any other platform. While
you do have a single dummy thread representing the process if it is
*non-threaded*, as soon as the process is recognized as multi-threaded,
you will only see a single GDB thread per target thread, and no
separate "thread" for the whole process.
So I think instead of adding a "priv" struct to that GDB thread
identifying the main process, the sync_threadlists routine should
actually just delete it (or replace it with the actual first thread,
whatever is easier).
Looking at the code, there already seems to be a place where
sync_threadlists deletes GDB threads that do not match onto
and of the threads reported by the AIX thread library - can
you verify why this doesn't trigger here?
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch
[not found] <CH2PR15MB35447ECA01D9D2F75AEACC1DD6319@CH2PR15MB3544.namprd15.prod.outlook.com>
@ 2022-10-28 9:49 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544280C0A0B3D7F2F35BD0FD63F9@CH2PR15MB3544.namprd15.prod.outlook.com>
0 siblings, 1 reply; 22+ messages in thread
From: Ulrich Weigand @ 2022-10-28 9:49 UTC (permalink / raw)
To: simark, Aditya Kamath1, gdb-patches; +Cc: Sangamesh Mallayya
Aditya Kamath1 <Aditya.Kamath1@ibm.com> wrote:
> static aix_thread_info *
> get_aix_thread_info (thread_info *thread)
> {
>+ if (thread->priv == NULL)
>+ return NULL;
This doesn't look right. Note that all users of
get_aix_thread_info assume the pointer returned
from there is never NULL.
You should find out why the "priv" field isn't
set up correctly, and fix whatever was going
wrong there. (I believe this should have been
done in sync_threadlists.)
Bye,
Ulrich
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2023-02-17 19:14 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <CH2PR15MB35449137854F561F1EA7DE04D6D69@CH2PR15MB3544.namprd15.prod.outlook.com>
2023-02-02 17:43 ` [PATCH] 0001-Fix-multi-thread-debug-bug-in-AIX.patch Ulrich Weigand
[not found] ` <DM6PR15MB3545C1FD15BEC2D160B3AB81D6D79@DM6PR15MB3545.namprd15.prod.outlook.com>
2023-02-06 19:07 ` Ulrich Weigand
[not found] ` <CH2PR15MB35449148A6541A1581B1749ED6DB9@CH2PR15MB3544.namprd15.prod.outlook.com>
2023-02-08 18:44 ` Ulrich Weigand
[not found] ` <DM6PR15MB3545F7F15C72DF6970E4D0AED6DE9@DM6PR15MB3545.namprd15.prod.outlook.com>
2023-02-13 19:01 ` Ulrich Weigand
[not found] ` <CH2PR15MB35444D2E4EC3340265427F47D6A29@CH2PR15MB3544.namprd15.prod.outlook.com>
2023-02-16 19:46 ` Ulrich Weigand
[not found] ` <CH2PR15MB354455A756AF729AE5F796E2D6A19@CH2PR15MB3544.namprd15.prod.outlook.com>
2023-02-17 12:04 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544B015A880690CA1F6284AD6A19@CH2PR15MB3544.namprd15.prod.outlook.com>
2023-02-17 14:18 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544E905156ECE35D5ADCD43D6A19@CH2PR15MB3544.namprd15.prod.outlook.com>
2023-02-17 19:14 ` Ulrich Weigand
[not found] <CH2PR15MB35447ECA01D9D2F75AEACC1DD6319@CH2PR15MB3544.namprd15.prod.outlook.com>
2022-10-28 9:49 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544280C0A0B3D7F2F35BD0FD63F9@CH2PR15MB3544.namprd15.prod.outlook.com>
2022-11-08 12:17 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544E198B0622C3AD22597A2D6029@CH2PR15MB3544.namprd15.prod.outlook.com>
2022-11-15 18:16 ` Ulrich Weigand
[not found] ` <CH2PR15MB35442C5A578DC4042072FC83D60A9@CH2PR15MB3544.namprd15.prod.outlook.com>
2022-11-23 14:15 ` Ulrich Weigand
[not found] ` <CH2PR15MB35441FE8F8E1FECD83202EF6D60C9@CH2PR15MB3544.namprd15.prod.outlook.com>
2022-11-23 17:09 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544AB7CA50F834387522DB0D60C9@CH2PR15MB3544.namprd15.prod.outlook.com>
[not found] ` <CH2PR15MB354465C64C4B32DA395C110AD6129@CH2PR15MB3544.namprd15.prod.outlook.com>
2022-11-30 14:57 ` Ulrich Weigand
[not found] ` <CH2PR15MB354486F155F5CC97501D056DD6179@CH2PR15MB3544.namprd15.prod.outlook.com>
2022-12-05 18:33 ` Ulrich Weigand
[not found] ` <CH2PR15MB35447EE88C64F7F36A0C61F5D61D9@CH2PR15MB3544.namprd15.prod.outlook.com>
2022-12-08 16:29 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544B1D8ED7EA9E56FEC79E5D6E19@CH2PR15MB3544.namprd15.prod.outlook.com>
2022-12-15 15:53 ` Ulrich Weigand
[not found] ` <CH2PR15MB35449A5A2BA7153A02CE913BD6E59@CH2PR15MB3544.namprd15.prod.outlook.com>
2022-12-22 12:50 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544BD03B93F39BA8E9E5452D6EC9@CH2PR15MB3544.namprd15.prod.outlook.com>
2023-01-09 14:04 ` Ulrich Weigand
[not found] ` <CH2PR15MB354417A03BCE77E3BA1C3A20D6FF9@CH2PR15MB3544.namprd15.prod.outlook.com>
2023-01-11 13:31 ` Ulrich Weigand
[not found] ` <BY5PR15MB3540261B391EA10F29AF5120D6C29@BY5PR15MB3540.namprd15.prod.outlook.com>
2023-01-20 14:44 ` Ulrich Weigand
[not found] ` <CH2PR15MB3544A44737D2C7BF3CDC9BAAD6CC9@CH2PR15MB3544.namprd15.prod.outlook.com>
2023-01-30 19:54 ` Tom Tromey
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).