public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] gdb: try to load libthread_db only after reading all shared libraries when attaching
@ 2021-03-30 17:21 Simon Marchi
  2021-04-02 17:14 ` Tom Tromey
  2021-05-04 18:02 ` [PATCH v2] gdb: try to load libthread_db only after reading all shared libraries when attaching / handling a fork child Simon Marchi
  0 siblings, 2 replies; 15+ messages in thread
From: Simon Marchi @ 2021-03-30 17:21 UTC (permalink / raw)
  To: gdb-patches; +Cc: Florian Weimer, Simon Marchi

Note: this is an ugly patch, I'd rather not push it like this, but I
send it to start the discussion and gather feedback.

When trying to attach to a pthread process on a Linux system with glibc 2.33,
we get:

    $ ./gdb -q -nx --data-directory=data-directory -p 1472010
    Attaching to process 1472010
    [New LWP 1472013]
    [New LWP 1472014]
    [New LWP 1472015]
    Error while reading shared library symbols for /usr/lib/libpthread.so.0:
    Cannot find user-level thread for LWP 1472015: generic error
    0x00007ffff6d3637f in poll () from /usr/lib/libc.so.6
    (gdb)

When attaching to a process, GDB reads the shared library list from the
process.  For each shared library (if "set auto-solib-add" is on), it
reads its symbols and calls the "new_objfile" observable.

The libthread-db code monitors this observable, and if it sees an
objfile named somewhat like "libpthread.so" go by, it tries to load
libthread_db.so in the GDB process itself.  libthread_db knows how to
navigate libpthread's data structures to get information about the
existing threads.

To locate data structures, libthread_db calls ps_pglobal_lookup
(implemented in proc-service.c), passing in a symbol name and expecting
an address in return.

Before glibc 2.33, libthread_db always asked for symbols found in
libpthread.  There was no ordering problem: since we were always trying
to load libthread_db in reaction to processing libpthread (and reading
in its symbols) and libthread_db only asked symbols from libpthread, the
requested symbols could always be found.  Starting with glibc 2.33,
libthread_db now asks for a symbol name that can be found in
/lib/ld-linux-x86-64.so.2 (_rtld_global).  And the ordering in which GDB
reads the shared library list from the inferior is unfortunate, in that
libpthread is processed before ld-linux.  So when loading libthread_db
in reaction to processing libpthread, and libthread_db requests the
symbol that is from ld-linux, GDB is not yet able to supply it.

That symbol lookup happens in the thread_from_lwp function, when we call
td_ta_map_lwp2thr_p, and an exception is thrown at this point:

    #0  0x00007ffff6681012 in __cxxabiv1::__cxa_throw (obj=0x60e000006100, tinfo=0x555560033b50 <typeinfo for gdb_exception_error>, dest=0x55555d9404bc <gdb_exception_error::~gdb_exception_error()>) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:78
    #1  0x000055555e5d3734 in throw_it(return_reason, errors, const char *, typedef __va_list_tag __va_list_tag *) (reason=RETURN_ERROR, error=GENERIC_ERROR, fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", ap=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdbsupport/common-exceptions.cc:200
    #2  0x000055555e5d37d4 in throw_verror (error=GENERIC_ERROR, fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", ap=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdbsupport/common-exceptions.cc:208
    #3  0x000055555e0b0ed2 in verror (string=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", args=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdb/utils.c:171
    #4  0x000055555e5e898a in error (fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s") at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:43
    #5  0x000055555d06b4bc in thread_from_lwp (stopped=0x617000035d80, ptid=...) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:418
    #6  0x000055555d07040d in try_thread_db_load_1 (info=0x60c000011140) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:912
    #7  0x000055555d071103 in try_thread_db_load (library=0x55555f0c62a0 "libthread_db.so.1", check_auto_load_safe=false) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1014
    #8  0x000055555d072168 in try_thread_db_load_from_sdir () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1091
    #9  0x000055555d072d1c in thread_db_load_search () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1146
    #10 0x000055555d07365c in thread_db_load () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1203
    #11 0x000055555d07373e in check_for_thread_db () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1246
    #12 0x000055555d0738ab in thread_db_new_objfile (objfile=0x61300000c0c0) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1275
    #13 0x000055555bd10740 in std::__invoke_impl<void, void (*&)(objfile*), objfile*> (__f=@0x616000068d88: 0x55555d073745 <thread_db_new_objfile(objfile*)>) at /usr/include/c++/10.2.0/bits/invoke.h:60
    #14 0x000055555bd02096 in std::__invoke_r<void, void (*&)(objfile*), objfile*> (__fn=@0x616000068d88: 0x55555d073745 <thread_db_new_objfile(objfile*)>) at /usr/include/c++/10.2.0/bits/invoke.h:153
    #15 0x000055555bce0392 in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7fffffffb4a0: 0x61300000c0c0) at /usr/include/c++/10.2.0/bits/std_function.h:291
    #16 0x000055555d3595c0 in std::function<void (objfile*)>::operator()(objfile*) const (this=0x616000068d88, __args#0=0x61300000c0c0) at /usr/include/c++/10.2.0/bits/std_function.h:622
    #17 0x000055555d356b7f in gdb::observers::observable<objfile*>::notify (this=0x555566727020 <gdb::observers::new_objfile>, args#0=0x61300000c0c0) at /home/simark/src/binutils-gdb/gdb/../gdbsupport/observable.h:106
    #18 0x000055555da3f228 in symbol_file_add_with_addrs (abfd=0x61200001ccc0, name=0x6190000d9090 "/usr/lib/libpthread.so.0", add_flags=..., addrs=0x7fffffffbc10, flags=..., parent=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1131
    #19 0x000055555da3f763 in symbol_file_add_from_bfd (abfd=0x61200001ccc0, name=0x6190000d9090 "/usr/lib/libpthread.so.0", add_flags=<error reading variable: Cannot access memory at address 0xffffffffffffffb0>, addrs=0x7fffffffbc10, flags=<error reading variable: Cannot access memory at address 0xffffffffffffffc0>, parent=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1167
    #20 0x000055555d95f9fa in solib_read_symbols (so=0x6190000d8e80, flags=...) at /home/simark/src/binutils-gdb/gdb/solib.c:681
    #21 0x000055555d96233d in solib_add (pattern=0x0, from_tty=0, readsyms=1) at /home/simark/src/binutils-gdb/gdb/solib.c:987
    #22 0x000055555d93646e in enable_break (info=0x608000008f20, from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:2238
    #23 0x000055555d93cfc0 in svr4_solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:3049
    #24 0x000055555d96610d in solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib.c:1195
    #25 0x000055555cdee318 in post_create_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:318
    #26 0x000055555ce00e6e in setup_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:2439
    #27 0x000055555ce59c34 in handle_one (event=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:4887
    #28 0x000055555ce5cd00 in stop_all_threads () at /home/simark/src/binutils-gdb/gdb/infrun.c:5064
    #29 0x000055555ce7f0da in stop_waiting (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:8006
    #30 0x000055555ce67f5c in handle_signal_stop (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:6062
    #31 0x000055555ce63653 in handle_inferior_event (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:5727
    #32 0x000055555ce4f297 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4105
    #33 0x000055555cdbe3bf in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42
    #34 0x000055555d018047 in handle_target_event (error=0, client_data=0x0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:4060
    #35 0x000055555e5ea77e in handle_file_event (file_ptr=0x60600008b1c0, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:575
    #36 0x000055555e5eb09c in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:701
    #37 0x000055555e5e8d19 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212
    #38 0x000055555dd6e0d4 in wait_sync_command_done () at /home/simark/src/binutils-gdb/gdb/top.c:528
    #39 0x000055555dd6e372 in maybe_wait_sync_command_done (was_sync=0) at /home/simark/src/binutils-gdb/gdb/top.c:545
    #40 0x000055555d0ec7c8 in catch_command_errors (command=0x55555ce01bb8 <attach_command(char const*, int)>, arg=0x7fffffffe28d "1472010", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:452
    #41 0x000055555d0f03ad in captured_main_1 (context=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1149
    #42 0x000055555d0f1239 in captured_main (data=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1232
    #43 0x000055555d0f1315 in gdb_main (args=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1257
    #44 0x000055555bb70cf9 in main (argc=7, argv=0x7fffffffde88) at /home/simark/src/binutils-gdb/gdb/gdb.c:32

The exception is caught here:

    #0  __cxxabiv1::__cxa_begin_catch (exc_obj_in=0x60e0000060e0) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_catch.cc:84
    #1  0x000055555d95fded in solib_read_symbols (so=0x6190000d8e80, flags=...) at /home/simark/src/binutils-gdb/gdb/solib.c:689
    #2  0x000055555d96233d in solib_add (pattern=0x0, from_tty=0, readsyms=1) at /home/simark/src/binutils-gdb/gdb/solib.c:987
    #3  0x000055555d93646e in enable_break (info=0x608000008f20, from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:2238
    #4  0x000055555d93cfc0 in svr4_solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:3049
    #5  0x000055555d96610d in solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib.c:1195
    #6  0x000055555cdee318 in post_create_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:318
    #7  0x000055555ce00e6e in setup_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:2439
    #8  0x000055555ce59c34 in handle_one (event=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:4887
    #9  0x000055555ce5cd00 in stop_all_threads () at /home/simark/src/binutils-gdb/gdb/infrun.c:5064
    #10 0x000055555ce7f0da in stop_waiting (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:8006
    #11 0x000055555ce67f5c in handle_signal_stop (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:6062
    #12 0x000055555ce63653 in handle_inferior_event (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:5727
    #13 0x000055555ce4f297 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4105
    #14 0x000055555cdbe3bf in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42
    #15 0x000055555d018047 in handle_target_event (error=0, client_data=0x0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:4060
    #16 0x000055555e5ea77e in handle_file_event (file_ptr=0x60600008b1c0, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:575
    #17 0x000055555e5eb09c in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:701
    #18 0x000055555e5e8d19 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212
    #19 0x000055555dd6e0d4 in wait_sync_command_done () at /home/simark/src/binutils-gdb/gdb/top.c:528
    #20 0x000055555dd6e372 in maybe_wait_sync_command_done (was_sync=0) at /home/simark/src/binutils-gdb/gdb/top.c:545
    #21 0x000055555d0ec7c8 in catch_command_errors (command=0x55555ce01bb8 <attach_command(char const*, int)>, arg=0x7fffffffe28d "1472010", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:452
    #22 0x000055555d0f03ad in captured_main_1 (context=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1149
    #23 0x000055555d0f1239 in captured_main (data=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1232
    #24 0x000055555d0f1315 in gdb_main (args=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1257
    #25 0x000055555bb70cf9 in main (argc=7, argv=0x7fffffffde88) at /home/simark/src/binutils-gdb/gdb/gdb.c:32

Catching the exception at this point means that the thread_db_info
object for this inferior will be left in place, despite the failure to
load libthread_db.  This means that there won't be further attempts at
loading libthread_db, because thread_db_load will exit early.  This
patch adds a try/catch around calling try_thread_db_load_1 in
try_thread_db_load, such that if some exception is thrown while trying
to load libthread_db, we reset / delete the thread_db_info for that
inferior.  That alone makes attach work fine again, because
check_for_thread_db is called again in the thread_db_inferior_created
observer (that is after we learned about all shared libraries), and
libthread_db is successfully loaded then.

I think that's a good way to make it work: when attaching, the
inferior_created observable is called once everything has stabilized,
when we learned about all shared libraries, so that is a good time to
try to load libthread_db.

The only problem then is that when we first try (and fail) to load
libthread_db, in reaction to learning about libpthread, we show this
warning:

    warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.

This is misleading, because we do succeed in loading it later.  So when
attaching, I think we want to defer trying to load libthread_db until we
have learned about all shared libraries.  And that leads to the ugly
part of the patch.  What I've implemented in this patch is to delay
clearing the inferior::needs_setup flag, so that we can check for it in
thread_db_load.

I considered other solutions, like having a "new_objfiles" observable,
which would fire once after a series of objfiles are created.  When
attaching, we would get a single "new_objfiles" notification after all
shared library objfiles have been created, so that the thread-db code
could use that.  But it didn't seem nice anymore when actually
implementing it.

Change-Id: I7a279836cfbb2b362b4fde11b196b4aab82f5efb
---
 gdb/infcmd.c          |  7 ++-----
 gdb/linux-thread-db.c | 24 +++++++++++++++++++-----
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/gdb/infcmd.c b/gdb/infcmd.c
index 9b0186dd391c..aaac17cc73b7 100644
--- a/gdb/infcmd.c
+++ b/gdb/infcmd.c
@@ -347,6 +347,8 @@ post_create_inferior (int from_tty)
      if the now pushed target supports hardware watchpoints.  */
   breakpoint_re_set ();
 
+  current_inferior ()->needs_setup = 0;
+
   gdb::observers::inferior_created.notify (current_inferior ());
 }
 
@@ -2418,11 +2420,6 @@ proceed_after_attach (inferior *inf)
 void
 setup_inferior (int from_tty)
 {
-  struct inferior *inferior;
-
-  inferior = current_inferior ();
-  inferior->needs_setup = 0;
-
   /* If no exec file is yet known, try to determine it from the
      process itself.  */
   if (get_exec_file (0) == NULL)
diff --git a/gdb/linux-thread-db.c b/gdb/linux-thread-db.c
index de8687e99c73..b7a500079fb2 100644
--- a/gdb/linux-thread-db.c
+++ b/gdb/linux-thread-db.c
@@ -1011,8 +1011,19 @@ try_thread_db_load (const char *library, bool check_auto_load_safe)
   if (strchr (library, '/') != NULL)
     info->filename = gdb_realpath (library).release ();
 
-  if (try_thread_db_load_1 (info))
-    return true;
+  try
+    {
+      if (try_thread_db_load_1 (info))
+	return true;
+    }
+  catch (const gdb_exception &except)
+    {
+      if (libthread_db_debug)
+	{
+	  exception_fprintf (gdb_stdlog, except,
+			     "Warning: try_thread_db_load: ");
+	}
+    }
 
   /* This library "refused" to work on current inferior.  */
   delete_thread_db_info (current_inferior ()->process_target (),
@@ -1183,10 +1194,13 @@ has_libpthread (void)
 static bool
 thread_db_load (void)
 {
-  struct thread_db_info *info;
+  inferior *inf = current_inferior ();
 
-  info = get_thread_db_info (current_inferior ()->process_target (),
-			     inferior_ptid.pid ());
+  if (inf->needs_setup)
+    return false;
+
+  thread_db_info *info = get_thread_db_info (inf->process_target (),
+					     inferior_ptid.pid ());
 
   if (info != NULL)
     return true;
-- 
2.30.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] gdb: try to load libthread_db only after reading all shared libraries when attaching
  2021-03-30 17:21 [PATCH] gdb: try to load libthread_db only after reading all shared libraries when attaching Simon Marchi
@ 2021-04-02 17:14 ` Tom Tromey
  2021-04-06 21:08   ` Kevin Buettner
  2021-04-07  2:45   ` Simon Marchi
  2021-05-04 18:02 ` [PATCH v2] gdb: try to load libthread_db only after reading all shared libraries when attaching / handling a fork child Simon Marchi
  1 sibling, 2 replies; 15+ messages in thread
From: Tom Tromey @ 2021-04-02 17:14 UTC (permalink / raw)
  To: Simon Marchi via Gdb-patches; +Cc: Simon Marchi, Florian Weimer

Simon> Before glibc 2.33, libthread_db always asked for symbols found in
Simon> libpthread.  There was no ordering problem: since we were always trying
Simon> to load libthread_db in reaction to processing libpthread (and reading
Simon> in its symbols) and libthread_db only asked symbols from libpthread, the
Simon> requested symbols could always be found.  Starting with glibc 2.33,
Simon> libthread_db now asks for a symbol name that can be found in
Simon> /lib/ld-linux-x86-64.so.2 (_rtld_global).  And the ordering in which GDB
Simon> reads the shared library list from the inferior is unfortunate, in that
Simon> libpthread is processed before ld-linux.  So when loading libthread_db
Simon> in reaction to processing libpthread, and libthread_db requests the
Simon> symbol that is from ld-linux, GDB is not yet able to supply it.

Adding to my dislike of libthread_db.  Linux would be improved if it
were removed entirely.

Simon> I think that's a good way to make it work: when attaching, the
Simon> inferior_created observable is called once everything has stabilized,
Simon> when we learned about all shared libraries, so that is a good time to
Simon> try to load libthread_db.

Yeah.  We already have other code doing something similar --
SYMFILE_DEFER_BP_RESET was added to defer a breakpoint reset until all
the available shared libraries were loaded, the idea here being to avoid
multiple expensive resets one after another.

Simon> I considered other solutions, like having a "new_objfiles" observable,
Simon> which would fire once after a series of objfiles are created.  When
Simon> attaching, we would get a single "new_objfiles" notification after all
Simon> shared library objfiles have been created, so that the thread-db code
Simon> could use that.  But it didn't seem nice anymore when actually
Simon> implementing it.

Perhaps the SYMFILE_DEFER_BP_RESET approach could be generalized somehow.
Or is that too closely related to what you tried?

Also, does gdbserver need a similar treatment?  I know it has to load
libthread_db (one of the problems with this library) but is the decision
to do so drive by gdb or by itself?  I don't recall.

Tom

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] gdb: try to load libthread_db only after reading all shared libraries when attaching
  2021-04-02 17:14 ` Tom Tromey
@ 2021-04-06 21:08   ` Kevin Buettner
  2021-04-07  2:49     ` Simon Marchi
  2021-04-07  2:45   ` Simon Marchi
  1 sibling, 1 reply; 15+ messages in thread
From: Kevin Buettner @ 2021-04-06 21:08 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches, Florian Weimer

On Fri, 02 Apr 2021 11:14:23 -0600
Tom Tromey <tom@tromey.com> wrote:

> Also, does gdbserver need a similar treatment?  I know it has to load
> libthread_db (one of the problems with this library) but is the decision
> to do so drive by gdb or by itself?  I don't recall.

gdbserver definitely has problems too.  It needs gdb to look up
symbols (via the qSymbol packet), but is only able to do so at certain
times.

I'm trying to puzzle it out now...

Kevin


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] gdb: try to load libthread_db only after reading all shared libraries when attaching
  2021-04-02 17:14 ` Tom Tromey
  2021-04-06 21:08   ` Kevin Buettner
@ 2021-04-07  2:45   ` Simon Marchi
  2021-04-08 18:46     ` Tom Tromey
  1 sibling, 1 reply; 15+ messages in thread
From: Simon Marchi @ 2021-04-07  2:45 UTC (permalink / raw)
  To: Tom Tromey, Simon Marchi via Gdb-patches; +Cc: Florian Weimer



On 2021-04-02 1:14 p.m., Tom Tromey wrote:
> Simon> Before glibc 2.33, libthread_db always asked for symbols found in
> Simon> libpthread.  There was no ordering problem: since we were always trying
> Simon> to load libthread_db in reaction to processing libpthread (and reading
> Simon> in its symbols) and libthread_db only asked symbols from libpthread, the
> Simon> requested symbols could always be found.  Starting with glibc 2.33,
> Simon> libthread_db now asks for a symbol name that can be found in
> Simon> /lib/ld-linux-x86-64.so.2 (_rtld_global).  And the ordering in which GDB
> Simon> reads the shared library list from the inferior is unfortunate, in that
> Simon> libpthread is processed before ld-linux.  So when loading libthread_db
> Simon> in reaction to processing libpthread, and libthread_db requests the
> Simon> symbol that is from ld-linux, GDB is not yet able to supply it.
> 
> Adding to my dislike of libthread_db.  Linux would be improved if it
> were removed entirely.
> 
> Simon> I think that's a good way to make it work: when attaching, the
> Simon> inferior_created observable is called once everything has stabilized,
> Simon> when we learned about all shared libraries, so that is a good time to
> Simon> try to load libthread_db.
> 
> Yeah.  We already have other code doing something similar --
> SYMFILE_DEFER_BP_RESET was added to defer a breakpoint reset until all
> the available shared libraries were loaded, the idea here being to avoid
> multiple expensive resets one after another.
> 
> Simon> I considered other solutions, like having a "new_objfiles" observable,
> Simon> which would fire once after a series of objfiles are created.  When
> Simon> attaching, we would get a single "new_objfiles" notification after all
> Simon> shared library objfiles have been created, so that the thread-db code
> Simon> could use that.  But it didn't seem nice anymore when actually
> Simon> implementing it.
> 
> Perhaps the SYMFILE_DEFER_BP_RESET approach could be generalized somehow.
> Or is that too closely related to what you tried?

If I understand correctly, the "bp reset" mechanism is: pass
SYMFILE_DEFER_BP_RESET when loading one or more objfile, and manually
call breakpoint_re_set after that.

I don't think we can directly use the same mechanism here: we don't
really want to create a SYMFILE_DEFER_LOADING_LIBTHREAD_DB flag, that
would be quite an abstraction breach.  Or do you mean a
SYMFILE_DEFER_CALLING_THE_NEW_OBJFILE_OBSERVABLE flag?

> Also, does gdbserver need a similar treatment?  I know it has to load
> libthread_db (one of the problems with this library) but is the decision
> to do so drive by gdb or by itself?  I don't recall.

Probably, I didn't think of testing with GDBserver, it's probably broken
there too.  I'll try next time I fiddle with this.

Simon

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] gdb: try to load libthread_db only after reading all shared libraries when attaching
  2021-04-06 21:08   ` Kevin Buettner
@ 2021-04-07  2:49     ` Simon Marchi
  2021-04-07 23:51       ` Kevin Buettner
  0 siblings, 1 reply; 15+ messages in thread
From: Simon Marchi @ 2021-04-07  2:49 UTC (permalink / raw)
  To: Kevin Buettner, Tom Tromey; +Cc: Florian Weimer, gdb-patches

On 2021-04-06 5:08 p.m., Kevin Buettner via Gdb-patches wrote:
> On Fri, 02 Apr 2021 11:14:23 -0600
> Tom Tromey <tom@tromey.com> wrote:
> 
>> Also, does gdbserver need a similar treatment?  I know it has to load
>> libthread_db (one of the problems with this library) but is the decision
>> to do so drive by gdb or by itself?  I don't recall.
> 
> gdbserver definitely has problems too.  It needs gdb to look up
> symbols (via the qSymbol packet), but is only able to do so at certain
> times.
> 
> I'm trying to puzzle it out now...

Hi Kevin,

Just wondering, are you actively working on this problem?  I am not
currently assigned on this at $day_job, so all I can put on it is
leftovers of time here and there.  If you want to propose another patch
/ solution, feel free!

Simon

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] gdb: try to load libthread_db only after reading all shared libraries when attaching
  2021-04-07  2:49     ` Simon Marchi
@ 2021-04-07 23:51       ` Kevin Buettner
  0 siblings, 0 replies; 15+ messages in thread
From: Kevin Buettner @ 2021-04-07 23:51 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches, Tom Tromey, Florian Weimer

On Tue, 6 Apr 2021 22:49:50 -0400
Simon Marchi <simon.marchi@polymtl.ca> wrote:

> On 2021-04-06 5:08 p.m., Kevin Buettner via Gdb-patches wrote:
> > On Fri, 02 Apr 2021 11:14:23 -0600
> > Tom Tromey <tom@tromey.com> wrote:
> >   
> >> Also, does gdbserver need a similar treatment?  I know it has to load
> >> libthread_db (one of the problems with this library) but is the decision
> >> to do so drive by gdb or by itself?  I don't recall.  
> > 
> > gdbserver definitely has problems too.  It needs gdb to look up
> > symbols (via the qSymbol packet), but is only able to do so at certain
> > times.
> > 
> > I'm trying to puzzle it out now...  
> 
> Hi Kevin,
> 
> Just wondering, are you actively working on this problem?  I am not
> currently assigned on this at $day_job, so all I can put on it is
> leftovers of time here and there.  If you want to propose another patch
> / solution, feel free!

Hi Simon,

Yes, I'm actively working on it.  I first came across it while doing some
gdbserver testing, so that's where my focus is at the moment.  We
definitely need to come up with a solution (possibly different ones)
for both gdb and gdbserver.

Kevin


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] gdb: try to load libthread_db only after reading all shared libraries when attaching
  2021-04-07  2:45   ` Simon Marchi
@ 2021-04-08 18:46     ` Tom Tromey
  0 siblings, 0 replies; 15+ messages in thread
From: Tom Tromey @ 2021-04-08 18:46 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Tom Tromey, Simon Marchi via Gdb-patches, Florian Weimer

Simon> I don't think we can directly use the same mechanism here: we don't
Simon> really want to create a SYMFILE_DEFER_LOADING_LIBTHREAD_DB flag, that
Simon> would be quite an abstraction breach.  Or do you mean a
Simon> SYMFILE_DEFER_CALLING_THE_NEW_OBJFILE_OBSERVABLE flag?

Not sure, it just seemed to me that the idea is the same: when a number
of libraries become visible all at once, GDB should defer some work
until after they've all been handled internally.

Tom

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2] gdb: try to load libthread_db only after reading all shared libraries when attaching / handling a fork child
  2021-03-30 17:21 [PATCH] gdb: try to load libthread_db only after reading all shared libraries when attaching Simon Marchi
  2021-04-02 17:14 ` Tom Tromey
@ 2021-05-04 18:02 ` Simon Marchi
  2021-05-30 14:12   ` Joel Brobecker
  2021-06-04 22:42   ` Kevin Buettner
  1 sibling, 2 replies; 15+ messages in thread
From: Simon Marchi @ 2021-05-04 18:02 UTC (permalink / raw)
  To: gdb-patches; +Cc: Florian Weimer, Kevin Buettner, Simon Marchi

<note>
Since I wrote this patch, Florian Weimer merged this in glibc that
mitigates the issue:

    https://sourceware.org/git/?p=glibc.git;a=commit;h=a64afc225240b2b27129ccfb0516d7c958b98040

But his commit message notes that it's "To make this work until GDB can
be fixed", so we might also want to fix this on the GDB side.  Also, the
patch can be desirable even if just the performance / waste less cycles
aspect.

Compared to v1, v2 uses a dedicated inferior flag instead of trying to
re-use needs_setup.  It also makes things work with the extended-remote
target.
</note>

When trying to attach to a pthread process on a Linux system with glibc 2.33,
we get:

    $ ./gdb -q -nx --data-directory=data-directory -p 1472010
    Attaching to process 1472010
    [New LWP 1472013]
    [New LWP 1472014]
    [New LWP 1472015]
    Error while reading shared library symbols for /usr/lib/libpthread.so.0:
    Cannot find user-level thread for LWP 1472015: generic error
    0x00007ffff6d3637f in poll () from /usr/lib/libc.so.6
    (gdb)

When attaching to a process (or handling a fork child, an operation very
similar to attaching), GDB reads the shared library list from the
process.  For each shared library (if "set auto-solib-add" is on), it
reads its symbols and calls the "new_objfile" observable.

The libthread-db code monitors this observable, and if it sees an
objfile named somewhat like "libpthread.so" go by, it tries to load
libthread_db.so in the GDB process itself.  libthread_db knows how to
navigate libpthread's data structures to get information about the
existing threads.

To locate these data structures, libthread_db calls ps_pglobal_lookup
(implemented in proc-service.c), passing in a symbol name and expecting
an address in return.

Before glibc 2.33, libthread_db always asked for symbols found in
libpthread.  There was no ordering problem: since we were always trying
to load libthread_db in reaction to processing libpthread (and reading
in its symbols) and libthread_db only asked symbols from libpthread, the
requested symbols could always be found.  Starting with glibc 2.33,
libthread_db now asks for a symbol name that can be found in
/lib/ld-linux-x86-64.so.2 (_rtld_global).  And the ordering in which GDB
reads the shared libraries from the inferior when attaching is
unfortunate, in that libpthread is processed before ld-linux.  So when
loading libthread_db in reaction to processing libpthread, and
libthread_db requests the symbol that is from ld-linux, GDB is not yet
able to supply it.

That problematic symbol lookup happens in the thread_from_lwp function,
when we call td_ta_map_lwp2thr_p, and an exception is thrown at this
point:

    #0  0x00007ffff6681012 in __cxxabiv1::__cxa_throw (obj=0x60e000006100, tinfo=0x555560033b50 <typeinfo for gdb_exception_error>, dest=0x55555d9404bc <gdb_exception_error::~gdb_exception_error()>) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:78
    #1  0x000055555e5d3734 in throw_it(return_reason, errors, const char *, typedef __va_list_tag __va_list_tag *) (reason=RETURN_ERROR, error=GENERIC_ERROR, fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", ap=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdbsupport/common-exceptions.cc:200
    #2  0x000055555e5d37d4 in throw_verror (error=GENERIC_ERROR, fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", ap=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdbsupport/common-exceptions.cc:208
    #3  0x000055555e0b0ed2 in verror (string=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", args=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdb/utils.c:171
    #4  0x000055555e5e898a in error (fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s") at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:43
    #5  0x000055555d06b4bc in thread_from_lwp (stopped=0x617000035d80, ptid=...) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:418
    #6  0x000055555d07040d in try_thread_db_load_1 (info=0x60c000011140) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:912
    #7  0x000055555d071103 in try_thread_db_load (library=0x55555f0c62a0 "libthread_db.so.1", check_auto_load_safe=false) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1014
    #8  0x000055555d072168 in try_thread_db_load_from_sdir () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1091
    #9  0x000055555d072d1c in thread_db_load_search () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1146
    #10 0x000055555d07365c in thread_db_load () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1203
    #11 0x000055555d07373e in check_for_thread_db () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1246
    #12 0x000055555d0738ab in thread_db_new_objfile (objfile=0x61300000c0c0) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1275
    #13 0x000055555bd10740 in std::__invoke_impl<void, void (*&)(objfile*), objfile*> (__f=@0x616000068d88: 0x55555d073745 <thread_db_new_objfile(objfile*)>) at /usr/include/c++/10.2.0/bits/invoke.h:60
    #14 0x000055555bd02096 in std::__invoke_r<void, void (*&)(objfile*), objfile*> (__fn=@0x616000068d88: 0x55555d073745 <thread_db_new_objfile(objfile*)>) at /usr/include/c++/10.2.0/bits/invoke.h:153
    #15 0x000055555bce0392 in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7fffffffb4a0: 0x61300000c0c0) at /usr/include/c++/10.2.0/bits/std_function.h:291
    #16 0x000055555d3595c0 in std::function<void (objfile*)>::operator()(objfile*) const (this=0x616000068d88, __args#0=0x61300000c0c0) at /usr/include/c++/10.2.0/bits/std_function.h:622
    #17 0x000055555d356b7f in gdb::observers::observable<objfile*>::notify (this=0x555566727020 <gdb::observers::new_objfile>, args#0=0x61300000c0c0) at /home/simark/src/binutils-gdb/gdb/../gdbsupport/observable.h:106
    #18 0x000055555da3f228 in symbol_file_add_with_addrs (abfd=0x61200001ccc0, name=0x6190000d9090 "/usr/lib/libpthread.so.0", add_flags=..., addrs=0x7fffffffbc10, flags=..., parent=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1131
    #19 0x000055555da3f763 in symbol_file_add_from_bfd (abfd=0x61200001ccc0, name=0x6190000d9090 "/usr/lib/libpthread.so.0", add_flags=<error reading variable: Cannot access memory at address 0xffffffffffffffb0>, addrs=0x7fffffffbc10, flags=<error reading variable: Cannot access memory at address 0xffffffffffffffc0>, parent=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1167
    #20 0x000055555d95f9fa in solib_read_symbols (so=0x6190000d8e80, flags=...) at /home/simark/src/binutils-gdb/gdb/solib.c:681
    #21 0x000055555d96233d in solib_add (pattern=0x0, from_tty=0, readsyms=1) at /home/simark/src/binutils-gdb/gdb/solib.c:987
    #22 0x000055555d93646e in enable_break (info=0x608000008f20, from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:2238
    #23 0x000055555d93cfc0 in svr4_solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:3049
    #24 0x000055555d96610d in solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib.c:1195
    #25 0x000055555cdee318 in post_create_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:318
    #26 0x000055555ce00e6e in setup_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:2439
    #27 0x000055555ce59c34 in handle_one (event=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:4887
    #28 0x000055555ce5cd00 in stop_all_threads () at /home/simark/src/binutils-gdb/gdb/infrun.c:5064
    #29 0x000055555ce7f0da in stop_waiting (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:8006
    #30 0x000055555ce67f5c in handle_signal_stop (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:6062
    #31 0x000055555ce63653 in handle_inferior_event (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:5727
    #32 0x000055555ce4f297 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4105
    #33 0x000055555cdbe3bf in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42
    #34 0x000055555d018047 in handle_target_event (error=0, client_data=0x0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:4060
    #35 0x000055555e5ea77e in handle_file_event (file_ptr=0x60600008b1c0, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:575
    #36 0x000055555e5eb09c in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:701
    #37 0x000055555e5e8d19 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212
    #38 0x000055555dd6e0d4 in wait_sync_command_done () at /home/simark/src/binutils-gdb/gdb/top.c:528
    #39 0x000055555dd6e372 in maybe_wait_sync_command_done (was_sync=0) at /home/simark/src/binutils-gdb/gdb/top.c:545
    #40 0x000055555d0ec7c8 in catch_command_errors (command=0x55555ce01bb8 <attach_command(char const*, int)>, arg=0x7fffffffe28d "1472010", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:452
    #41 0x000055555d0f03ad in captured_main_1 (context=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1149
    #42 0x000055555d0f1239 in captured_main (data=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1232
    #43 0x000055555d0f1315 in gdb_main (args=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1257
    #44 0x000055555bb70cf9 in main (argc=7, argv=0x7fffffffde88) at /home/simark/src/binutils-gdb/gdb/gdb.c:32

The exception is caught here:

    #0  __cxxabiv1::__cxa_begin_catch (exc_obj_in=0x60e0000060e0) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_catch.cc:84
    #1  0x000055555d95fded in solib_read_symbols (so=0x6190000d8e80, flags=...) at /home/simark/src/binutils-gdb/gdb/solib.c:689
    #2  0x000055555d96233d in solib_add (pattern=0x0, from_tty=0, readsyms=1) at /home/simark/src/binutils-gdb/gdb/solib.c:987
    #3  0x000055555d93646e in enable_break (info=0x608000008f20, from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:2238
    #4  0x000055555d93cfc0 in svr4_solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:3049
    #5  0x000055555d96610d in solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib.c:1195
    #6  0x000055555cdee318 in post_create_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:318
    #7  0x000055555ce00e6e in setup_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:2439
    #8  0x000055555ce59c34 in handle_one (event=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:4887
    #9  0x000055555ce5cd00 in stop_all_threads () at /home/simark/src/binutils-gdb/gdb/infrun.c:5064
    #10 0x000055555ce7f0da in stop_waiting (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:8006
    #11 0x000055555ce67f5c in handle_signal_stop (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:6062
    #12 0x000055555ce63653 in handle_inferior_event (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:5727
    #13 0x000055555ce4f297 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4105
    #14 0x000055555cdbe3bf in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42
    #15 0x000055555d018047 in handle_target_event (error=0, client_data=0x0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:4060
    #16 0x000055555e5ea77e in handle_file_event (file_ptr=0x60600008b1c0, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:575
    #17 0x000055555e5eb09c in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:701
    #18 0x000055555e5e8d19 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212
    #19 0x000055555dd6e0d4 in wait_sync_command_done () at /home/simark/src/binutils-gdb/gdb/top.c:528
    #20 0x000055555dd6e372 in maybe_wait_sync_command_done (was_sync=0) at /home/simark/src/binutils-gdb/gdb/top.c:545
    #21 0x000055555d0ec7c8 in catch_command_errors (command=0x55555ce01bb8 <attach_command(char const*, int)>, arg=0x7fffffffe28d "1472010", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:452
    #22 0x000055555d0f03ad in captured_main_1 (context=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1149
    #23 0x000055555d0f1239 in captured_main (data=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1232
    #24 0x000055555d0f1315 in gdb_main (args=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1257
    #25 0x000055555bb70cf9 in main (argc=7, argv=0x7fffffffde88) at /home/simark/src/binutils-gdb/gdb/gdb.c:32

Catching the exception at this point means that the thread_db_info
object for this inferior will be left in place, despite the failure to
load libthread_db.  This means that there won't be further attempts at
loading libthread_db, because thread_db_load will think that
libthread_db_debug is already loaded for this inferior and will always
exit early.  To fix this, add a try/catch around calling
try_thread_db_load_1 in try_thread_db_load, such that if some exception
is thrown while trying to load libthread_db, we reset / delete the
thread_db_info for that inferior.  That alone makes attach work fine
again, because check_for_thread_db is called again in the
thread_db_inferior_created observer (that happens after we learned about
all shared libraries and their symbols), and libthread_db is
successfully loaded then.

When attaching, I think that the inferior_created is a good place to try
to load libthread_db: it is called once everything has stabilized, when
we learned about all shared libraries.

The only problem then is that when we first try (and fail) to load
libthread_db, in reaction to learning about libpthread, we show this
warning:

    warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.

This is misleading, because we do succeed in loading it later.  So when
attaching, I think we shouldn't try to load libthread_db in reaction to
the new_objfile events, we should wait until we have learned about all
shared libraries (using the inferior_created observable).  To do so, add
an `in_initial_library_scan` flag to struct inferior.  This flag is used
to postpone loading libthread_db if we are attaching or handling a fork
child.

When debugging remotely with GDBserver, the same problem happens, except
that the qSymbol mechanism (allowing the remote side to ask GDB for
symbols values) is involved.  The fix there is the same idea, we make
GDB wait until all shared libraries and their symbols are known before
sending out a qSymbol packet.  This way, we never present the remote
side a state where libpthread.so's symbols are known but ld-linux's
symbols aren't.

gdb/ChangeLog:

        * inferior.h (class inferior) <in_initial_library_scan>: New.
        * infcmd.c (post_create_inferior): Set in_initial_library_scan.
        * infrun.c (follow_fork_inferior): Likewise.
        * linux-thread-db.c (try_thread_db_load): Catch exception thrown
	by try_thread_db_load_1
        (thread_db_load): Return early if in_initial_library_scan is
	set.
        * remote.c (remote_new_objfile): Return early if
	in_initial_library_scan is set.

Change-Id: I7a279836cfbb2b362b4fde11b196b4aab82f5efb
---
 gdb/infcmd.c          |  4 ++++
 gdb/inferior.h        |  4 ++++
 gdb/infrun.c          |  5 +++++
 gdb/linux-thread-db.c | 24 +++++++++++++++++++-----
 gdb/remote.c          | 19 +++++++++++++++++--
 5 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/gdb/infcmd.c b/gdb/infcmd.c
index 5aa6b00f20f3..6a4e1779f79b 100644
--- a/gdb/infcmd.c
+++ b/gdb/infcmd.c
@@ -312,6 +312,10 @@ post_create_inferior (int from_tty)
       const unsigned solib_add_generation
 	= current_program_space->solib_add_generation;
 
+      scoped_restore restore_in_initial_library_scan
+	= make_scoped_restore (&current_inferior ()->in_initial_library_scan,
+			       true);
+
       /* Create the hooks to handle shared library load and unload
 	 events.  */
       solib_create_inferior_hook (from_tty);
diff --git a/gdb/inferior.h b/gdb/inferior.h
index e0a7d622ccbb..977c8c8ba8aa 100644
--- a/gdb/inferior.h
+++ b/gdb/inferior.h
@@ -522,6 +522,10 @@ class inferior : public refcounted_object
      architecture/description.  */
   bool needs_setup = false;
 
+  /* True when we read reading the library list of the inferior during an
+     attach or handling a fork child.  */
+  bool in_initial_library_scan = false;
+
   /* Private data used by the target vector implementation.  */
   std::unique_ptr<private_inferior> priv;
 
diff --git a/gdb/infrun.c b/gdb/infrun.c
index 90bab8d984f5..397f6ce2e29e 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -521,6 +521,9 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
 		 breakpoint.  If a "cloned-VM" event was propagated
 		 better throughout the core, this wouldn't be
 		 required.  */
+	      scoped_restore restore_in_initial_library_scan
+		= make_scoped_restore (&child_inf->in_initial_library_scan,
+				       true);
 	      solib_create_inferior_hook (0);
 	    }
 	}
@@ -656,6 +659,8 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
 	     shared libraries, and install the solib event breakpoint.
 	     If a "cloned-VM" event was propagated better throughout
 	     the core, this wouldn't be required.  */
+	  scoped_restore restore_in_initial_library_scan
+	    = make_scoped_restore (&child_inf->in_initial_library_scan, true);
 	  solib_create_inferior_hook (0);
 	}
 
diff --git a/gdb/linux-thread-db.c b/gdb/linux-thread-db.c
index 1c6ea4debd88..3657874d1e5c 100644
--- a/gdb/linux-thread-db.c
+++ b/gdb/linux-thread-db.c
@@ -1011,8 +1011,17 @@ try_thread_db_load (const char *library, bool check_auto_load_safe)
   if (strchr (library, '/') != NULL)
     info->filename = gdb_realpath (library).release ();
 
-  if (try_thread_db_load_1 (info))
-    return true;
+  try
+    {
+      if (try_thread_db_load_1 (info))
+	return true;
+    }
+  catch (const gdb_exception_error &except)
+    {
+      if (libthread_db_debug)
+	exception_fprintf (gdb_stdlog, except,
+			   "Warning: While trying to load libthread_db: ");
+    }
 
   /* This library "refused" to work on current inferior.  */
   delete_thread_db_info (current_inferior ()->process_target (),
@@ -1183,10 +1192,15 @@ has_libpthread (void)
 static bool
 thread_db_load (void)
 {
-  struct thread_db_info *info;
+  inferior *inf = current_inferior ();
 
-  info = get_thread_db_info (current_inferior ()->process_target (),
-			     inferior_ptid.pid ());
+  /* When attaching / handling fork child, don't try loading libthread_db
+     until we know about all shared libraries.  */
+  if (inf->in_initial_library_scan)
+    return false;
+
+  thread_db_info *info = get_thread_db_info (inf->process_target (),
+					     inferior_ptid.pid ());
 
   if (info != NULL)
     return true;
diff --git a/gdb/remote.c b/gdb/remote.c
index a7312a9fc2d1..1f7780f79bad 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -14515,8 +14515,23 @@ remote_new_objfile (struct objfile *objfile)
 {
   remote_target *remote = get_current_remote_target ();
 
-  if (remote != NULL)			/* Have a remote connection.  */
-    remote->remote_check_symbols ();
+  /* First, check whether the current inferior's process target is a remote
+     target.  */
+  if (remote == nullptr)
+    return;
+
+  /* If we are attaching or handling a fork child, we receive new objfile
+     events in between each found objfile.  The libraries are read in an
+     undefined order, so if we gave the remote side a chance to look up
+     symbols, we would give it an incomplete picture of the inferior (some
+     libraries are loaded in the inferior, but GDB doesn't know about them
+     and their symbols yet).  So, skip these events, we'll give the remote
+     a chance to look up symbols once all the loaded libraries and their
+     symbols are known.  */
+    if (current_inferior ()->in_initial_library_scan)
+      return;
+
+  remote->remote_check_symbols ();
 }
 
 /* Pull all the tracepoints defined on the target and create local
-- 
2.30.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] gdb: try to load libthread_db only after reading all shared libraries when attaching / handling a fork child
  2021-05-04 18:02 ` [PATCH v2] gdb: try to load libthread_db only after reading all shared libraries when attaching / handling a fork child Simon Marchi
@ 2021-05-30 14:12   ` Joel Brobecker
  2021-05-31  0:48     ` Simon Marchi
  2021-06-04 22:42   ` Kevin Buettner
  1 sibling, 1 reply; 15+ messages in thread
From: Joel Brobecker @ 2021-05-30 14:12 UTC (permalink / raw)
  To: Simon Marchi via Gdb-patches; +Cc: Joel Brobecker

On Tue, May 04, 2021 at 02:02:45PM -0400, Simon Marchi via Gdb-patches wrote:
> <note>
> Since I wrote this patch, Florian Weimer merged this in glibc that
> mitigates the issue:
> 
>     https://sourceware.org/git/?p=glibc.git;a=commit;h=a64afc225240b2b27129ccfb0516d7c958b98040
> 
> But his commit message notes that it's "To make this work until GDB can
> be fixed", so we might also want to fix this on the GDB side.  Also, the
> patch can be desirable even if just the performance / waste less cycles
> aspect.
> 
> Compared to v1, v2 uses a dedicated inferior flag instead of trying to
> re-use needs_setup.  It also makes things work with the extended-remote
> target.
> </note>
> 
> When trying to attach to a pthread process on a Linux system with glibc 2.33,
> we get:
> 
>     $ ./gdb -q -nx --data-directory=data-directory -p 1472010
>     Attaching to process 1472010
>     [New LWP 1472013]
>     [New LWP 1472014]
>     [New LWP 1472015]
>     Error while reading shared library symbols for /usr/lib/libpthread.so.0:
>     Cannot find user-level thread for LWP 1472015: generic error
>     0x00007ffff6d3637f in poll () from /usr/lib/libc.so.6
>     (gdb)
> 
> When attaching to a process (or handling a fork child, an operation very
> similar to attaching), GDB reads the shared library list from the
> process.  For each shared library (if "set auto-solib-add" is on), it
> reads its symbols and calls the "new_objfile" observable.
> 
> The libthread-db code monitors this observable, and if it sees an
> objfile named somewhat like "libpthread.so" go by, it tries to load
> libthread_db.so in the GDB process itself.  libthread_db knows how to
> navigate libpthread's data structures to get information about the
> existing threads.
> 
> To locate these data structures, libthread_db calls ps_pglobal_lookup
> (implemented in proc-service.c), passing in a symbol name and expecting
> an address in return.
> 
> Before glibc 2.33, libthread_db always asked for symbols found in
> libpthread.  There was no ordering problem: since we were always trying
> to load libthread_db in reaction to processing libpthread (and reading
> in its symbols) and libthread_db only asked symbols from libpthread, the
> requested symbols could always be found.  Starting with glibc 2.33,
> libthread_db now asks for a symbol name that can be found in
> /lib/ld-linux-x86-64.so.2 (_rtld_global).  And the ordering in which GDB
> reads the shared libraries from the inferior when attaching is
> unfortunate, in that libpthread is processed before ld-linux.  So when
> loading libthread_db in reaction to processing libpthread, and
> libthread_db requests the symbol that is from ld-linux, GDB is not yet
> able to supply it.
> 
> That problematic symbol lookup happens in the thread_from_lwp function,
> when we call td_ta_map_lwp2thr_p, and an exception is thrown at this
> point:
> 
>     #0  0x00007ffff6681012 in __cxxabiv1::__cxa_throw (obj=0x60e000006100, tinfo=0x555560033b50 <typeinfo for gdb_exception_error>, dest=0x55555d9404bc <gdb_exception_error::~gdb_exception_error()>) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:78
>     #1  0x000055555e5d3734 in throw_it(return_reason, errors, const char *, typedef __va_list_tag __va_list_tag *) (reason=RETURN_ERROR, error=GENERIC_ERROR, fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", ap=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdbsupport/common-exceptions.cc:200
>     #2  0x000055555e5d37d4 in throw_verror (error=GENERIC_ERROR, fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", ap=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdbsupport/common-exceptions.cc:208
>     #3  0x000055555e0b0ed2 in verror (string=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s", args=0x7fffffffaae0) at /home/simark/src/binutils-gdb/gdb/utils.c:171
>     #4  0x000055555e5e898a in error (fmt=0x55555f0c5360 "Cannot find user-level thread for LWP %ld: %s") at /home/simark/src/binutils-gdb/gdbsupport/errors.cc:43
>     #5  0x000055555d06b4bc in thread_from_lwp (stopped=0x617000035d80, ptid=...) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:418
>     #6  0x000055555d07040d in try_thread_db_load_1 (info=0x60c000011140) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:912
>     #7  0x000055555d071103 in try_thread_db_load (library=0x55555f0c62a0 "libthread_db.so.1", check_auto_load_safe=false) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1014
>     #8  0x000055555d072168 in try_thread_db_load_from_sdir () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1091
>     #9  0x000055555d072d1c in thread_db_load_search () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1146
>     #10 0x000055555d07365c in thread_db_load () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1203
>     #11 0x000055555d07373e in check_for_thread_db () at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1246
>     #12 0x000055555d0738ab in thread_db_new_objfile (objfile=0x61300000c0c0) at /home/simark/src/binutils-gdb/gdb/linux-thread-db.c:1275
>     #13 0x000055555bd10740 in std::__invoke_impl<void, void (*&)(objfile*), objfile*> (__f=@0x616000068d88: 0x55555d073745 <thread_db_new_objfile(objfile*)>) at /usr/include/c++/10.2.0/bits/invoke.h:60
>     #14 0x000055555bd02096 in std::__invoke_r<void, void (*&)(objfile*), objfile*> (__fn=@0x616000068d88: 0x55555d073745 <thread_db_new_objfile(objfile*)>) at /usr/include/c++/10.2.0/bits/invoke.h:153
>     #15 0x000055555bce0392 in std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke(std::_Any_data const&, objfile*&&) (__functor=..., __args#0=@0x7fffffffb4a0: 0x61300000c0c0) at /usr/include/c++/10.2.0/bits/std_function.h:291
>     #16 0x000055555d3595c0 in std::function<void (objfile*)>::operator()(objfile*) const (this=0x616000068d88, __args#0=0x61300000c0c0) at /usr/include/c++/10.2.0/bits/std_function.h:622
>     #17 0x000055555d356b7f in gdb::observers::observable<objfile*>::notify (this=0x555566727020 <gdb::observers::new_objfile>, args#0=0x61300000c0c0) at /home/simark/src/binutils-gdb/gdb/../gdbsupport/observable.h:106
>     #18 0x000055555da3f228 in symbol_file_add_with_addrs (abfd=0x61200001ccc0, name=0x6190000d9090 "/usr/lib/libpthread.so.0", add_flags=..., addrs=0x7fffffffbc10, flags=..., parent=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1131
>     #19 0x000055555da3f763 in symbol_file_add_from_bfd (abfd=0x61200001ccc0, name=0x6190000d9090 "/usr/lib/libpthread.so.0", add_flags=<error reading variable: Cannot access memory at address 0xffffffffffffffb0>, addrs=0x7fffffffbc10, flags=<error reading variable: Cannot access memory at address 0xffffffffffffffc0>, parent=0x0) at /home/simark/src/binutils-gdb/gdb/symfile.c:1167
>     #20 0x000055555d95f9fa in solib_read_symbols (so=0x6190000d8e80, flags=...) at /home/simark/src/binutils-gdb/gdb/solib.c:681
>     #21 0x000055555d96233d in solib_add (pattern=0x0, from_tty=0, readsyms=1) at /home/simark/src/binutils-gdb/gdb/solib.c:987
>     #22 0x000055555d93646e in enable_break (info=0x608000008f20, from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:2238
>     #23 0x000055555d93cfc0 in svr4_solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:3049
>     #24 0x000055555d96610d in solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib.c:1195
>     #25 0x000055555cdee318 in post_create_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:318
>     #26 0x000055555ce00e6e in setup_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:2439
>     #27 0x000055555ce59c34 in handle_one (event=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:4887
>     #28 0x000055555ce5cd00 in stop_all_threads () at /home/simark/src/binutils-gdb/gdb/infrun.c:5064
>     #29 0x000055555ce7f0da in stop_waiting (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:8006
>     #30 0x000055555ce67f5c in handle_signal_stop (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:6062
>     #31 0x000055555ce63653 in handle_inferior_event (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:5727
>     #32 0x000055555ce4f297 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4105
>     #33 0x000055555cdbe3bf in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42
>     #34 0x000055555d018047 in handle_target_event (error=0, client_data=0x0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:4060
>     #35 0x000055555e5ea77e in handle_file_event (file_ptr=0x60600008b1c0, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:575
>     #36 0x000055555e5eb09c in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:701
>     #37 0x000055555e5e8d19 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212
>     #38 0x000055555dd6e0d4 in wait_sync_command_done () at /home/simark/src/binutils-gdb/gdb/top.c:528
>     #39 0x000055555dd6e372 in maybe_wait_sync_command_done (was_sync=0) at /home/simark/src/binutils-gdb/gdb/top.c:545
>     #40 0x000055555d0ec7c8 in catch_command_errors (command=0x55555ce01bb8 <attach_command(char const*, int)>, arg=0x7fffffffe28d "1472010", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:452
>     #41 0x000055555d0f03ad in captured_main_1 (context=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1149
>     #42 0x000055555d0f1239 in captured_main (data=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1232
>     #43 0x000055555d0f1315 in gdb_main (args=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1257
>     #44 0x000055555bb70cf9 in main (argc=7, argv=0x7fffffffde88) at /home/simark/src/binutils-gdb/gdb/gdb.c:32
> 
> The exception is caught here:
> 
>     #0  __cxxabiv1::__cxa_begin_catch (exc_obj_in=0x60e0000060e0) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_catch.cc:84
>     #1  0x000055555d95fded in solib_read_symbols (so=0x6190000d8e80, flags=...) at /home/simark/src/binutils-gdb/gdb/solib.c:689
>     #2  0x000055555d96233d in solib_add (pattern=0x0, from_tty=0, readsyms=1) at /home/simark/src/binutils-gdb/gdb/solib.c:987
>     #3  0x000055555d93646e in enable_break (info=0x608000008f20, from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:2238
>     #4  0x000055555d93cfc0 in svr4_solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib-svr4.c:3049
>     #5  0x000055555d96610d in solib_create_inferior_hook (from_tty=0) at /home/simark/src/binutils-gdb/gdb/solib.c:1195
>     #6  0x000055555cdee318 in post_create_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:318
>     #7  0x000055555ce00e6e in setup_inferior (from_tty=0) at /home/simark/src/binutils-gdb/gdb/infcmd.c:2439
>     #8  0x000055555ce59c34 in handle_one (event=...) at /home/simark/src/binutils-gdb/gdb/infrun.c:4887
>     #9  0x000055555ce5cd00 in stop_all_threads () at /home/simark/src/binutils-gdb/gdb/infrun.c:5064
>     #10 0x000055555ce7f0da in stop_waiting (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:8006
>     #11 0x000055555ce67f5c in handle_signal_stop (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:6062
>     #12 0x000055555ce63653 in handle_inferior_event (ecs=0x7fffffffd170) at /home/simark/src/binutils-gdb/gdb/infrun.c:5727
>     #13 0x000055555ce4f297 in fetch_inferior_event () at /home/simark/src/binutils-gdb/gdb/infrun.c:4105
>     #14 0x000055555cdbe3bf in inferior_event_handler (event_type=INF_REG_EVENT) at /home/simark/src/binutils-gdb/gdb/inf-loop.c:42
>     #15 0x000055555d018047 in handle_target_event (error=0, client_data=0x0) at /home/simark/src/binutils-gdb/gdb/linux-nat.c:4060
>     #16 0x000055555e5ea77e in handle_file_event (file_ptr=0x60600008b1c0, ready_mask=1) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:575
>     #17 0x000055555e5eb09c in gdb_wait_for_event (block=0) at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:701
>     #18 0x000055555e5e8d19 in gdb_do_one_event () at /home/simark/src/binutils-gdb/gdbsupport/event-loop.cc:212
>     #19 0x000055555dd6e0d4 in wait_sync_command_done () at /home/simark/src/binutils-gdb/gdb/top.c:528
>     #20 0x000055555dd6e372 in maybe_wait_sync_command_done (was_sync=0) at /home/simark/src/binutils-gdb/gdb/top.c:545
>     #21 0x000055555d0ec7c8 in catch_command_errors (command=0x55555ce01bb8 <attach_command(char const*, int)>, arg=0x7fffffffe28d "1472010", from_tty=1, do_bp_actions=false) at /home/simark/src/binutils-gdb/gdb/main.c:452
>     #22 0x000055555d0f03ad in captured_main_1 (context=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1149
>     #23 0x000055555d0f1239 in captured_main (data=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1232
>     #24 0x000055555d0f1315 in gdb_main (args=0x7fffffffdd10) at /home/simark/src/binutils-gdb/gdb/main.c:1257
>     #25 0x000055555bb70cf9 in main (argc=7, argv=0x7fffffffde88) at /home/simark/src/binutils-gdb/gdb/gdb.c:32
> 
> Catching the exception at this point means that the thread_db_info
> object for this inferior will be left in place, despite the failure to
> load libthread_db.  This means that there won't be further attempts at
> loading libthread_db, because thread_db_load will think that
> libthread_db_debug is already loaded for this inferior and will always
> exit early.  To fix this, add a try/catch around calling
> try_thread_db_load_1 in try_thread_db_load, such that if some exception
> is thrown while trying to load libthread_db, we reset / delete the
> thread_db_info for that inferior.  That alone makes attach work fine
> again, because check_for_thread_db is called again in the
> thread_db_inferior_created observer (that happens after we learned about
> all shared libraries and their symbols), and libthread_db is
> successfully loaded then.
> 
> When attaching, I think that the inferior_created is a good place to try
> to load libthread_db: it is called once everything has stabilized, when
> we learned about all shared libraries.
> 
> The only problem then is that when we first try (and fail) to load
> libthread_db, in reaction to learning about libpthread, we show this
> warning:
> 
>     warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
> 
> This is misleading, because we do succeed in loading it later.  So when
> attaching, I think we shouldn't try to load libthread_db in reaction to
> the new_objfile events, we should wait until we have learned about all
> shared libraries (using the inferior_created observable).  To do so, add
> an `in_initial_library_scan` flag to struct inferior.  This flag is used
> to postpone loading libthread_db if we are attaching or handling a fork
> child.
> 
> When debugging remotely with GDBserver, the same problem happens, except
> that the qSymbol mechanism (allowing the remote side to ask GDB for
> symbols values) is involved.  The fix there is the same idea, we make
> GDB wait until all shared libraries and their symbols are known before
> sending out a qSymbol packet.  This way, we never present the remote
> side a state where libpthread.so's symbols are known but ld-linux's
> symbols aren't.
> 
> gdb/ChangeLog:
> 
>         * inferior.h (class inferior) <in_initial_library_scan>: New.
>         * infcmd.c (post_create_inferior): Set in_initial_library_scan.
>         * infrun.c (follow_fork_inferior): Likewise.
>         * linux-thread-db.c (try_thread_db_load): Catch exception thrown
> 	by try_thread_db_load_1
>         (thread_db_load): Return early if in_initial_library_scan is
> 	set.
>         * remote.c (remote_new_objfile): Return early if
> 	in_initial_library_scan is set.

Has anyone had a chance to look at this patch? It definitely deserves
a review. For me, the patch looks reasonable, but it's been a long time
since I last looked into this (complex and delicate) area of the code...

Thank you!

> 
> Change-Id: I7a279836cfbb2b362b4fde11b196b4aab82f5efb
> ---
>  gdb/infcmd.c          |  4 ++++
>  gdb/inferior.h        |  4 ++++
>  gdb/infrun.c          |  5 +++++
>  gdb/linux-thread-db.c | 24 +++++++++++++++++++-----
>  gdb/remote.c          | 19 +++++++++++++++++--
>  5 files changed, 49 insertions(+), 7 deletions(-)
> 
> diff --git a/gdb/infcmd.c b/gdb/infcmd.c
> index 5aa6b00f20f3..6a4e1779f79b 100644
> --- a/gdb/infcmd.c
> +++ b/gdb/infcmd.c
> @@ -312,6 +312,10 @@ post_create_inferior (int from_tty)
>        const unsigned solib_add_generation
>  	= current_program_space->solib_add_generation;
>  
> +      scoped_restore restore_in_initial_library_scan
> +	= make_scoped_restore (&current_inferior ()->in_initial_library_scan,
> +			       true);
> +
>        /* Create the hooks to handle shared library load and unload
>  	 events.  */
>        solib_create_inferior_hook (from_tty);
> diff --git a/gdb/inferior.h b/gdb/inferior.h
> index e0a7d622ccbb..977c8c8ba8aa 100644
> --- a/gdb/inferior.h
> +++ b/gdb/inferior.h
> @@ -522,6 +522,10 @@ class inferior : public refcounted_object
>       architecture/description.  */
>    bool needs_setup = false;
>  
> +  /* True when we read reading the library list of the inferior during an
> +     attach or handling a fork child.  */
> +  bool in_initial_library_scan = false;
> +
>    /* Private data used by the target vector implementation.  */
>    std::unique_ptr<private_inferior> priv;
>  
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 90bab8d984f5..397f6ce2e29e 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -521,6 +521,9 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
>  		 breakpoint.  If a "cloned-VM" event was propagated
>  		 better throughout the core, this wouldn't be
>  		 required.  */
> +	      scoped_restore restore_in_initial_library_scan
> +		= make_scoped_restore (&child_inf->in_initial_library_scan,
> +				       true);
>  	      solib_create_inferior_hook (0);
>  	    }
>  	}
> @@ -656,6 +659,8 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
>  	     shared libraries, and install the solib event breakpoint.
>  	     If a "cloned-VM" event was propagated better throughout
>  	     the core, this wouldn't be required.  */
> +	  scoped_restore restore_in_initial_library_scan
> +	    = make_scoped_restore (&child_inf->in_initial_library_scan, true);
>  	  solib_create_inferior_hook (0);
>  	}
>  
> diff --git a/gdb/linux-thread-db.c b/gdb/linux-thread-db.c
> index 1c6ea4debd88..3657874d1e5c 100644
> --- a/gdb/linux-thread-db.c
> +++ b/gdb/linux-thread-db.c
> @@ -1011,8 +1011,17 @@ try_thread_db_load (const char *library, bool check_auto_load_safe)
>    if (strchr (library, '/') != NULL)
>      info->filename = gdb_realpath (library).release ();
>  
> -  if (try_thread_db_load_1 (info))
> -    return true;
> +  try
> +    {
> +      if (try_thread_db_load_1 (info))
> +	return true;
> +    }
> +  catch (const gdb_exception_error &except)
> +    {
> +      if (libthread_db_debug)
> +	exception_fprintf (gdb_stdlog, except,
> +			   "Warning: While trying to load libthread_db: ");
> +    }
>  
>    /* This library "refused" to work on current inferior.  */
>    delete_thread_db_info (current_inferior ()->process_target (),
> @@ -1183,10 +1192,15 @@ has_libpthread (void)
>  static bool
>  thread_db_load (void)
>  {
> -  struct thread_db_info *info;
> +  inferior *inf = current_inferior ();
>  
> -  info = get_thread_db_info (current_inferior ()->process_target (),
> -			     inferior_ptid.pid ());
> +  /* When attaching / handling fork child, don't try loading libthread_db
> +     until we know about all shared libraries.  */
> +  if (inf->in_initial_library_scan)
> +    return false;
> +
> +  thread_db_info *info = get_thread_db_info (inf->process_target (),
> +					     inferior_ptid.pid ());
>  
>    if (info != NULL)
>      return true;
> diff --git a/gdb/remote.c b/gdb/remote.c
> index a7312a9fc2d1..1f7780f79bad 100644
> --- a/gdb/remote.c
> +++ b/gdb/remote.c
> @@ -14515,8 +14515,23 @@ remote_new_objfile (struct objfile *objfile)
>  {
>    remote_target *remote = get_current_remote_target ();
>  
> -  if (remote != NULL)			/* Have a remote connection.  */
> -    remote->remote_check_symbols ();
> +  /* First, check whether the current inferior's process target is a remote
> +     target.  */
> +  if (remote == nullptr)
> +    return;
> +
> +  /* If we are attaching or handling a fork child, we receive new objfile
> +     events in between each found objfile.  The libraries are read in an
> +     undefined order, so if we gave the remote side a chance to look up
> +     symbols, we would give it an incomplete picture of the inferior (some
> +     libraries are loaded in the inferior, but GDB doesn't know about them
> +     and their symbols yet).  So, skip these events, we'll give the remote
> +     a chance to look up symbols once all the loaded libraries and their
> +     symbols are known.  */
> +    if (current_inferior ()->in_initial_library_scan)
> +      return;
> +
> +  remote->remote_check_symbols ();
>  }
>  
>  /* Pull all the tracepoints defined on the target and create local
> -- 
> 2.30.1

-- 
Joel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] gdb: try to load libthread_db only after reading all shared libraries when attaching / handling a fork child
  2021-05-30 14:12   ` Joel Brobecker
@ 2021-05-31  0:48     ` Simon Marchi
  0 siblings, 0 replies; 15+ messages in thread
From: Simon Marchi @ 2021-05-31  0:48 UTC (permalink / raw)
  To: Joel Brobecker, Simon Marchi via Gdb-patches

> Has anyone had a chance to look at this patch? It definitely deserves
> a review. For me, the patch looks reasonable, but it's been a long time
> since I last looked into this (complex and delicate) area of the code...

Indeed, I'd ideally like to have at least another person look at the
problem and solution and tell me if they think it makes sense, or if
they would have a better idea for the solution.

Simon

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] gdb: try to load libthread_db only after reading all shared libraries when attaching / handling a fork child
  2021-05-04 18:02 ` [PATCH v2] gdb: try to load libthread_db only after reading all shared libraries when attaching / handling a fork child Simon Marchi
  2021-05-30 14:12   ` Joel Brobecker
@ 2021-06-04 22:42   ` Kevin Buettner
  2021-06-08 20:59     ` Simon Marchi
  1 sibling, 1 reply; 15+ messages in thread
From: Kevin Buettner @ 2021-06-04 22:42 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches, Florian Weimer, Carlos O'Donell

On Tue,  4 May 2021 14:02:45 -0400
Simon Marchi <simon.marchi@polymtl.ca> wrote:

> Since I wrote this patch, Florian Weimer merged this in glibc that
> mitigates the issue:
> 
>     https://sourceware.org/git/?p=glibc.git;a=commit;h=a64afc225240b2b27129ccfb0516d7c958b98040
> 
> But his commit message notes that it's "To make this work until GDB can
> be fixed", so we might also want to fix this on the GDB side.  Also, the
> patch can be desirable even if just the performance / waste less cycles
> aspect.
> 
> Compared to v1, v2 uses a dedicated inferior flag instead of trying to
> re-use needs_setup.  It also makes things work with the extended-remote
> target.

Florian has provided me with packages for glibc-2.33.9000, which is a
test release for glibc-2.34.  One of the really big changes in 2.34 that
affects GDB is that libpthread is largely empty now; when I examined
it, I found that it contained some versioning symbols and little else. 
E.g...

0000000000000000 g    DO *ABS*	0000000000000000  GLIBC_2.3.3 GLIBC_2.3.3
0000000000000000 g    DO *ABS*	0000000000000000  GLIBC_2.3.4 GLIBC_2.3.4
0000000000001110 g    DF .text	0000000000000005 (GLIBC_2.28) __libpthread_version_placeholder

(These aren't all of the symbols, just a sample.  But nearly all of them are
like the ones shown here.)

The important stuff that used to be contained in libpthread has been
moved to libc.  E.g. pthread_join is now found alongside execvp in
libc-2.33.9000.so...

000000000008c1c0 g    DF .text	0000000000000013  GLIBC_2.34  pthread_join
000000000008c1c0 g    DF .text	0000000000000013 (GLIBC_2.2.5) pthread_join
00000000000d9dd0 g    DF .text	0000000000000013  GLIBC_2.2.5 execvp

I've tested your patch on a Fedora 34 machine updated to use
glibc-2.33.9000 and have found that your patch is necessary to avoid
lots of failures in (e.g.) gdb.threads/attach-many-short-lived-threads.exp.
Without your patch, I see 79 passes and 30 failures for this test. 
With your patch, we get the expected 109 passes.

Your patch doesn't solve all of the problems when using glibc 2.34.  I
have a one line fix (maybe hack) which allows us to do thread debugging
when starting the program via "run" or "start".  I'll post that
patch separately, probably some time next week.  (Basically, we now
have to look for "libc" in addition to "libpthread" in order to trigger
libthread_db related loading / initialization in GDB.  It's possible that
there may be a better way to do it though.)

I've looked over your patch and am happy with the approach that
you've taken.  I think it should go in.

Kevin


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] gdb: try to load libthread_db only after reading all shared libraries when attaching / handling a fork child
  2021-06-04 22:42   ` Kevin Buettner
@ 2021-06-08 20:59     ` Simon Marchi
  2021-06-08 21:28       ` Carlos O'Donell
                         ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Simon Marchi @ 2021-06-08 20:59 UTC (permalink / raw)
  To: Kevin Buettner; +Cc: gdb-patches, Florian Weimer, Carlos O'Donell

On 2021-06-04 6:42 p.m., Kevin Buettner wrote:
> Florian has provided me with packages for glibc-2.33.9000, which is a
> test release for glibc-2.34.  One of the really big changes in 2.34 that
> affects GDB is that libpthread is largely empty now; when I examined
> it, I found that it contained some versioning symbols and little else. 
> E.g...
> 
> 0000000000000000 g    DO *ABS*	0000000000000000  GLIBC_2.3.3 GLIBC_2.3.3
> 0000000000000000 g    DO *ABS*	0000000000000000  GLIBC_2.3.4 GLIBC_2.3.4
> 0000000000001110 g    DF .text	0000000000000005 (GLIBC_2.28) __libpthread_version_placeholder
> 
> (These aren't all of the symbols, just a sample.  But nearly all of them are
> like the ones shown here.)
> 
> The important stuff that used to be contained in libpthread has been
> moved to libc.  E.g. pthread_join is now found alongside execvp in
> libc-2.33.9000.so...
> 
> 000000000008c1c0 g    DF .text	0000000000000013  GLIBC_2.34  pthread_join
> 000000000008c1c0 g    DF .text	0000000000000013 (GLIBC_2.2.5) pthread_join
> 00000000000d9dd0 g    DF .text	0000000000000013  GLIBC_2.2.5 execvp
> 
> I've tested your patch on a Fedora 34 machine updated to use
> glibc-2.33.9000 and have found that your patch is necessary to avoid
> lots of failures in (e.g.) gdb.threads/attach-many-short-lived-threads.exp.
> Without your patch, I see 79 passes and 30 failures for this test. 
> With your patch, we get the expected 109 passes.
>
> Your patch doesn't solve all of the problems when using glibc 2.34.  I
> have a one line fix (maybe hack) which allows us to do thread debugging
> when starting the program via "run" or "start".  I'll post that
> patch separately, probably some time next week.  (Basically, we now
> have to look for "libc" in addition to "libpthread" in order to trigger
> libthread_db related loading / initialization in GDB.  It's possible that
> there may be a better way to do it though.)

Hmm, although if libpthread.so is still there, even is mostly empty, it
can still be used by GDB as a trigger to load libthread_db:

- libc.so gets loaded, we don't try to load libthread_db
- The mostly empty libpthread.so gets loaded, so we try to load
  libthread_db.  libthread_db requests a bunch of symbols.  These
  symbols are found, because they are in libc, which is already loaded.

Is the long term plan to completely get rid of libpthread.so, or is it
going to stay as this mostly empty library?  If it is going away, then I
agree we have to change the way we trigger the load of libthread_db.

> I've looked over your patch and am happy with the approach that
> you've taken.  I think it should go in.

Thanks, I pushed it, given it doesn't seem to make things worse.

Simon

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] gdb: try to load libthread_db only after reading all shared libraries when attaching / handling a fork child
  2021-06-08 20:59     ` Simon Marchi
@ 2021-06-08 21:28       ` Carlos O'Donell
  2021-06-08 21:53       ` Kevin Buettner
  2021-06-09  6:39       ` Florian Weimer
  2 siblings, 0 replies; 15+ messages in thread
From: Carlos O'Donell @ 2021-06-08 21:28 UTC (permalink / raw)
  To: Simon Marchi, Kevin Buettner; +Cc: gdb-patches, Florian Weimer

On 6/8/21 4:59 PM, Simon Marchi wrote:
> On 2021-06-04 6:42 p.m., Kevin Buettner wrote:
>> Florian has provided me with packages for glibc-2.33.9000, which is a
>> test release for glibc-2.34.  One of the really big changes in 2.34 that
>> affects GDB is that libpthread is largely empty now; when I examined
>> it, I found that it contained some versioning symbols and little else. 
>> E.g...
>>
>> 0000000000000000 g    DO *ABS*	0000000000000000  GLIBC_2.3.3 GLIBC_2.3.3
>> 0000000000000000 g    DO *ABS*	0000000000000000  GLIBC_2.3.4 GLIBC_2.3.4
>> 0000000000001110 g    DF .text	0000000000000005 (GLIBC_2.28) __libpthread_version_placeholder
>>
>> (These aren't all of the symbols, just a sample.  But nearly all of them are
>> like the ones shown here.)
>>
>> The important stuff that used to be contained in libpthread has been
>> moved to libc.  E.g. pthread_join is now found alongside execvp in
>> libc-2.33.9000.so...
>>
>> 000000000008c1c0 g    DF .text	0000000000000013  GLIBC_2.34  pthread_join
>> 000000000008c1c0 g    DF .text	0000000000000013 (GLIBC_2.2.5) pthread_join
>> 00000000000d9dd0 g    DF .text	0000000000000013  GLIBC_2.2.5 execvp
>>
>> I've tested your patch on a Fedora 34 machine updated to use
>> glibc-2.33.9000 and have found that your patch is necessary to avoid
>> lots of failures in (e.g.) gdb.threads/attach-many-short-lived-threads.exp.
>> Without your patch, I see 79 passes and 30 failures for this test. 
>> With your patch, we get the expected 109 passes.
>>
>> Your patch doesn't solve all of the problems when using glibc 2.34.  I
>> have a one line fix (maybe hack) which allows us to do thread debugging
>> when starting the program via "run" or "start".  I'll post that
>> patch separately, probably some time next week.  (Basically, we now
>> have to look for "libc" in addition to "libpthread" in order to trigger
>> libthread_db related loading / initialization in GDB.  It's possible that
>> there may be a better way to do it though.)

I'm responding here in my timezone ahead of Florian in the event that you
want to fix something else related to the questions you ask here.

Fixing this allows our glibc testing to progress into Fedora Rawhide so
we can test upstream glibc changes more widely.

Upstream glibc 2.34 release is planned for August 1st, with an ABI freeze
July 1st, and testing the whole month, but the sooner we test i.e. CI/CD
the better.

Integrating upstream glibc and breaking gdb is not a good developer
experience for the GNU Toolchain :-)

Thank you for helping!
 
> Hmm, although if libpthread.so is still there, even is mostly empty, it
> can still be used by GDB as a trigger to load libthread_db:

Only for now.
 
> - libc.so gets loaded, we don't try to load libthread_db
> - The mostly empty libpthread.so gets loaded, so we try to load
>   libthread_db.  libthread_db requests a bunch of symbols.  These
>   symbols are found, because they are in libc, which is already loaded.

The goal is to remove all such dependencies.
 
> Is the long term plan to completely get rid of libpthread.so, or is it
> going to stay as this mostly empty library?  If it is going away, then I
> agree we have to change the way we trigger the load of libthread_db.

We wanted to get rid of it entirely in glibc 2.34, but downstream distributions
requested it stay to improve compatibility with existing software stacks.

We should expect that eventually it will go away entirely in a subsequent
glibc release.

You should absolutely expect new applications to be able to use POSIX Thread
interfaces and have no NEEDED on libpthread.so.0.

>> I've looked over your patch and am happy with the approach that
>> you've taken.  I think it should go in.
> 
> Thanks, I pushed it, given it doesn't seem to make things worse.

Thank you!

Having this go forward means we can do more upstream glibc integration testing
with glibc in Fedora Rawhide.

-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] gdb: try to load libthread_db only after reading all shared libraries when attaching / handling a fork child
  2021-06-08 20:59     ` Simon Marchi
  2021-06-08 21:28       ` Carlos O'Donell
@ 2021-06-08 21:53       ` Kevin Buettner
  2021-06-09  6:39       ` Florian Weimer
  2 siblings, 0 replies; 15+ messages in thread
From: Kevin Buettner @ 2021-06-08 21:53 UTC (permalink / raw)
  To: Simon Marchi; +Cc: gdb-patches, Florian Weimer, Carlos O'Donell

On Tue, 8 Jun 2021 16:59:47 -0400
Simon Marchi <simon.marchi@polymtl.ca> wrote:

> > Your patch doesn't solve all of the problems when using glibc 2.34.  I
> > have a one line fix (maybe hack) which allows us to do thread debugging
> > when starting the program via "run" or "start".  I'll post that
> > patch separately, probably some time next week.  (Basically, we now
> > have to look for "libc" in addition to "libpthread" in order to trigger
> > libthread_db related loading / initialization in GDB.  It's possible that
> > there may be a better way to do it though.)  
> 
> Hmm, although if libpthread.so is still there, even is mostly empty, it
> can still be used by GDB as a trigger to load libthread_db:
> 
> - libc.so gets loaded, we don't try to load libthread_db
> - The mostly empty libpthread.so gets loaded, so we try to load
>   libthread_db.  libthread_db requests a bunch of symbols.  These
>   symbols are found, because they are in libc, which is already loaded.
> 
> Is the long term plan to completely get rid of libpthread.so, or is it
> going to stay as this mostly empty library?  If it is going away, then I
> agree we have to change the way we trigger the load of libthread_db.

Fedora 34 / glibc-2.33.9000:

[kev@f34-2 gdb]$ ldd testsuite/outputs/gdb.threads/tls/tls 
	linux-vdso.so.1 (0x00007ffcf94fa000)
	libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007ff0ba9af000)
	libm.so.6 => /lib64/libm.so.6 (0x00007ff0ba8d4000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007ff0ba8b9000)
	libc.so.6 => /lib64/libc.so.6 (0x00007ff0ba6c6000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ff0babf0000)

Fedora 34 / glibc-2.33:

[kev@f34-1 gdb]$ ldd testsuite/outputs/gdb.threads/tls/tls 
	linux-vdso.so.1 (0x00007fff32dc0000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f815f6de000)
	libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f815f4bf000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f815f37b000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f815f360000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f815f191000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f815f721000)

Note that libpthread is missing from the ldd output for the
glibc-2.33.9000 machine.

I'll post my patch soon...

Kevin


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] gdb: try to load libthread_db only after reading all shared libraries when attaching / handling a fork child
  2021-06-08 20:59     ` Simon Marchi
  2021-06-08 21:28       ` Carlos O'Donell
  2021-06-08 21:53       ` Kevin Buettner
@ 2021-06-09  6:39       ` Florian Weimer
  2 siblings, 0 replies; 15+ messages in thread
From: Florian Weimer @ 2021-06-09  6:39 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Kevin Buettner, gdb-patches, Carlos O'Donell

* Simon Marchi:

> Hmm, although if libpthread.so is still there, even is mostly empty, it
> can still be used by GDB as a trigger to load libthread_db:
>
> - libc.so gets loaded, we don't try to load libthread_db
> - The mostly empty libpthread.so gets loaded, so we try to load
>   libthread_db.  libthread_db requests a bunch of symbols.  These
>   symbols are found, because they are in libc, which is already loaded.
>
> Is the long term plan to completely get rid of libpthread.so, or is it
> going to stay as this mostly empty library?  If it is going away, then I
> agree we have to change the way we trigger the load of libthread_db.

Relinking applications will remove the libpthread.so.0 reference.

We already have an init_complete probe in elf/rtld.c in glibc, perhaps
it's possible to load libthread_db at that point?

Thanks,
Florian


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-06-09  6:39 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-30 17:21 [PATCH] gdb: try to load libthread_db only after reading all shared libraries when attaching Simon Marchi
2021-04-02 17:14 ` Tom Tromey
2021-04-06 21:08   ` Kevin Buettner
2021-04-07  2:49     ` Simon Marchi
2021-04-07 23:51       ` Kevin Buettner
2021-04-07  2:45   ` Simon Marchi
2021-04-08 18:46     ` Tom Tromey
2021-05-04 18:02 ` [PATCH v2] gdb: try to load libthread_db only after reading all shared libraries when attaching / handling a fork child Simon Marchi
2021-05-30 14:12   ` Joel Brobecker
2021-05-31  0:48     ` Simon Marchi
2021-06-04 22:42   ` Kevin Buettner
2021-06-08 20:59     ` Simon Marchi
2021-06-08 21:28       ` Carlos O'Donell
2021-06-08 21:53       ` Kevin Buettner
2021-06-09  6:39       ` Florian Weimer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).