From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 19489 invoked by alias); 25 May 2011 15:37:14 -0000 Mailing-List: contact archer-help@sourceware.org; run by ezmlm Sender: Precedence: bulk List-Post: List-Help: List-Subscribe: List-Id: Received: (qmail 19267 invoked by uid 22791); 25 May 2011 15:37:12 -0000 X-SWARE-Spam-Status: No, hits=-1.3 required=5.0 tests=AWL,BAYES_00,SPF_FAIL X-Spam-Check-By: sourceware.org Date: Wed, 25 May 2011 15:37:00 -0000 From: Gary Benson To: archer@sourceware.org Subject: Improved linker-debugger interface Message-ID: <20110525153649.GB3149@redhat.com> Mail-Followup-To: archer@sourceware.org MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="JYK4vJDZwFMowpUq" Content-Disposition: inline X-SW-Source: 2011-q2/txt/msg00000.txt.bz2 --JYK4vJDZwFMowpUq Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-length: 3801 Hi all, These past few weeks I've been working on the following bug: https://bugzilla.redhat.com/show_bug.cgi?id=658851 aka http://sources.redhat.com/bugzilla/show_bug.cgi?id=2328 "_dl_debug_state() RT_CONSISTENT called too early" and trying to figure out a fix that also addresses this: https://bugzilla.redhat.com/show_bug.cgi?id=698001 "improve GDB performance on an application performing a lot of object loading." and this: http://sourceware.org/bugzilla/show_bug.cgi?id=11839 "gdb does not detect calls to dlmopen" It's taken me a few tries, but I think I finally have an interface that will work. This mail is so I can run it past a wider audience before I start implementing the gdb side in earnest. The current linker-debugger interface has a structure (r_debug) containing a list of loaded libraries, and an empty function (_dl_debug_state) which the linker calls both before and after modifying this list. It exists solely for the debugger to set a breakpoint on. - There is one place where glibc calls _dl_debug_state earlier than Solaris libc. This is #658851. It is unlikely that glibc will ever be changed to make it compatible with Solaris libc as this could break compatibility with previous glibcs. - This interface was presumably invented before dlmopen() was, so there's only provision in it for one namespace. In glibc each namespace has it's own r_debug structure, but there is no way for the linker to communicate the addresses of the others to the debugger. This is PR 11839. - In normal use gdb only needs to stop _after_ the list is modified. Because _dl_debug_state is called both before and after, gdb stops twice as often as it needs to. This is #698001, the gist of it at any rate. - When stop-on-solib-events is set, however, it is either necessary or at least useful to stop both before and after library loads. I'm not 100% sure on this point, but if nothing else it preserves the existing behaviour, and without it it probably becomes a lot more difficult to debug linker issues and possibly other stuff too. The solution I'm proposing is this: - Add a new function for gdb to break on, _dl_debug_state_extended, to supplement _dl_debug_state. Everywhere that _dl_debug_state is called, _dl_debug_state_extended is also called. In the case where glibc calls _dl_debug_state earlier than Solaris libc we move the call to _dl_debug_state_extended to the Solaris location. This fixes #658851, with the caveat that some provision must be made for halting before STT_GNU_IFUNC relocations. - _dl_debug_state_extended contains SystemTap probes, currently two, but extendable to as many as we want. gdb will look for these probes, and if it finds them it will set breakpoints on the ones it is interested in, falling back to breaking on _dl_debug_state if the required probes are not found. The two current probes are r_debug_mod_starting (called before the list is modified) and r_debug_mod_complete (called after). gdb will always stop on r_debug_mod_complete, to update its internal lists, but it will only stop on r_debug_mod_starting when stop-on-solib-events is set. This halves the number of stops, addressing #698001. - All probes have as namespace id and r_debug address arguments, allowing gdb to discover namespaces other than the default. This opens the door to fixing PR 11839. I've attached a patch for glibc that implements its side of the interface, designed to apply after the SystemTap support that was added for Fedora 15, so if what I've written doesn't make sense then maybe at least the code will! Does this all seem ok? Cheers, Gary -- http://gbenson.net/ --JYK4vJDZwFMowpUq Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="glibc.patch" Content-length: 5325 diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h index d040590..226904e 100644 --- a/sysdeps/generic/ldsodefs.h +++ b/sysdeps/generic/ldsodefs.h @@ -958,6 +958,11 @@ extern void _dl_sort_fini (struct link_map *l, struct link_map **maps, extern void _dl_debug_state (void); rtld_hidden_proto (_dl_debug_state) +/* The dynamic linker calls this function after having changed any + shared object mappings. */ +extern void _dl_debug_state_extended (Lmid_t ns, struct r_debug *r) + internal_function; + /* Initialize `struct r_debug' if it has not already been done. The argument is the run-time load address of the dynamic linker, to be put in the `r_ldbase' member. Returns the address of the structure. */ diff --git a/elf/dl-debug.c b/elf/dl-debug.c index 2538364..84cc163 100644 --- a/elf/dl-debug.c +++ b/elf/dl-debug.c @@ -76,3 +76,19 @@ _dl_debug_state (void) { } rtld_hidden_def (_dl_debug_state) + + +/* This function exists solely to contain SystemTap probes which the + debugger can set breakpoints on. */ +void +_dl_debug_state_extended (Lmid_t ns, struct r_debug *r) +{ + if (r->r_state == RT_CONSISTENT) + { + LIBC_PROBE (r_debug_mod_complete, 2, ns, r); + } + else + { + LIBC_PROBE (r_debug_mod_starting, 2, ns, r); + } +} diff --git a/elf/rtld.c b/elf/rtld.c index 174954b..004060b 100644 --- a/elf/rtld.c +++ b/elf/rtld.c @@ -1658,6 +1658,7 @@ ERROR: ld.so: object '%s' cannot be loaded as audit interface: %s; ignored.\n", /* We start adding objects. */ r->r_state = RT_ADD; _dl_debug_state (); + _dl_debug_state_extended (LM_ID_BASE, r); /* Auditing checkpoint: we are ready to signal that the initial map is being constructed. */ @@ -2361,6 +2362,7 @@ ERROR: ld.so: object '%s' cannot be loaded as audit interface: %s; ignored.\n", r = _dl_debug_initialize (0, LM_ID_BASE); r->r_state = RT_CONSISTENT; _dl_debug_state (); + _dl_debug_state_extended (LM_ID_BASE, r); #ifndef MAP_COPY /* We must munmap() the cache file. */ diff --git a/elf/dl-open.c b/elf/dl-open.c index cf8e8cc..b487203 100644 --- a/elf/dl-open.c +++ b/elf/dl-open.c @@ -293,7 +293,9 @@ dl_open_worker (void *a) } #endif - /* Notify the debugger all new objects are now ready to go. */ + /* Notify the debugger all new objects are now ready to go. + Note that the corresponding _dl_debug_state_extended call + to this happens further on, after relocation is complete. */ struct r_debug *r = _dl_debug_initialize (0, args->nsid); r->r_state = RT_CONSISTENT; _dl_debug_state (); @@ -460,6 +462,9 @@ cannot load any more object with static TLS")); _dl_fatal_printf (N_("\ TLS generation counter wrapped! Please report this.")); + /* Notify the debugger all new objects have been relocated. */ + _dl_debug_state_extended (args->nsid, r); + /* Run the initializer functions of new objects. */ _dl_init (new, args->argc, args->argv, args->env); diff --git a/elf/dl-close.c b/elf/dl-close.c index efb2b58..49a3d07 100644 --- a/elf/dl-close.c +++ b/elf/dl-close.c @@ -471,6 +471,7 @@ _dl_close_worker (struct link_map *map) struct r_debug *r = _dl_debug_initialize (0, nsid); r->r_state = RT_DELETE; _dl_debug_state (); + _dl_debug_state_extended (nsid, r); if (unload_global) { @@ -724,6 +725,7 @@ _dl_close_worker (struct link_map *map) /* Notify the debugger those objects are finalized and gone. */ r->r_state = RT_CONSISTENT; _dl_debug_state (); + _dl_debug_state_extended (nsid, r); /* Recheck if we need to retry, release the lock. */ out: diff --git a/elf/dl-load.c b/elf/dl-load.c index f866066..7588ddf 100644 --- a/elf/dl-load.c +++ b/elf/dl-load.c @@ -793,7 +793,7 @@ _dl_init_paths (const char *llp) static void __attribute__ ((noreturn, noinline)) lose (int code, int fd, const char *name, char *realname, struct link_map *l, - const char *msg, struct r_debug *r) + const char *msg, struct r_debug *r, Lmid_t nsid) { /* The file might already be closed. */ if (fd != -1) @@ -805,6 +805,7 @@ lose (int code, int fd, const char *name, char *realname, struct link_map *l, { r->r_state = RT_CONSISTENT; _dl_debug_state (); + _dl_debug_state_extended (nsid, r); } _dl_signal_error (code, name, NULL, msg); @@ -843,7 +844,7 @@ _dl_map_object_from_fd (const char *name, int fd, struct filebuf *fbp, errval = errno; call_lose: lose (errval, fd, name, realname, l, errstring, - make_consistent ? r : NULL); + make_consistent ? r : NULL, nsid); } /* Look again to see if the real name matched another already loaded. */ @@ -950,6 +951,7 @@ _dl_map_object_from_fd (const char *name, int fd, struct filebuf *fbp, linking has not been used before. */ r->r_state = RT_ADD; _dl_debug_state (); + _dl_debug_state_extended (nsid, r); make_consistent = true; } else @@ -1645,7 +1647,7 @@ open_verify (const char *name, struct filebuf *fbp, struct link_map *loader, name = strdupa (realname); free (realname); } - lose (errval, fd, name, NULL, NULL, errstring, NULL); + lose (errval, fd, name, NULL, NULL, errstring, NULL, 0); } /* See whether the ELF header is what we expect. */ --JYK4vJDZwFMowpUq--