public inbox for gdb-patches@sourceware.org
From: Ben Woodard <woodard@redhat.com>
To: gdb-patches@sourceware.org
Cc: Kevin Buettner <kevinb@redhat.com>, Ben Woodard <woodard@redhat.com>
Subject: [PATCH v4] gdb, gdbserver: support dlmopen()
Date: Tue, 24 May 2022 11:40:03 -0700
Message-ID: <BD66A0A2-84F4-4E3C-BEC7-C3BFC2B896CB@redhat.com>

Sorry for the late feedback. I was not aware of Markus’s patch until yesterday when Kevin pointed it out to me. I’ve just started playing with a test build that Kevin gave me. 

> On 2022-03-09 12:24, Metzger, Markus T wrote:
> > Hello Pedro,
> > 
> >> Should the commit have a Co-Authored-By: tag for H.J.? 
> > 
> > Definitely. I didn't know we were using such tags. 
> > 
> > 
> >> I'm wondering whether ideally symbol lookup would be updated to handle 
> >> different namespaces. I'm thinking of for example 
> >> svr4_iterate_over_objfiles_in_search_order. 
> > 
> > Do you mean that we should restrict the search to the namespace of the 
> > current_objfile argument? And use, say, the default namespace if that is nullptr? 
> Something like that. I mean, doesn't the dynamic loader restrict resolving symbols to the same namespace (and then the global namespace, I guess)? Not sure about completely restricting to the namespace is what users would want, but at least I'd think we should search symbols in the current namespace first before searching the global namespace and then other namespaces (*). The idea being that evaluating an expression in GDB should yield the same result the same expression would yield if coded in the program.
>
> (*) Similar to how we search symbols in scope, and can still print static globals of other files even if they're not in scope.
>
> It's just a question at this point, I haven't thought about actual use cases, but I think it's worth it to ponder about it.

I have actually thought about this a lot, and I have customers who have asked me for this feature.

The real use case is not dlmopen(); I’m not really sure I’ve seen anyone actually use dlmopen(). The real use case is LD_AUDIT and DT_AUDIT / DT_DEPAUDIT. The dynamic linker is antiquated and doesn’t meet our needs but it evidently cannot be changed. LD_AUDIT allows us to get in there and change the dynamic linker’s behavior so that it can be made to meet our needs. 
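For readers who have not used the audit interface, here is a minimal sketch of an audit library built on the glibc hooks documented in rtld-audit(7). It is not code from Spindle or from the patch under review; the log format is made up for illustration. Built as a shared object and named in LD_AUDIT (or recorded via DT_AUDIT), it is loaded by the dynamic linker into its own linkage namespace and reports every object that gets loaded.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdint.h>
#include <stdio.h>

/* Called once by the dynamic linker when the audit library is loaded.
   Returning a nonzero version tells the linker to use this library.  */
unsigned int
la_version (unsigned int version)
{
  (void) version;
  return LAV_CURRENT;
}

/* Called for each object the dynamic linker loads.  LMID identifies the
   link-map list (linkage namespace) the object was added to.  */
unsigned int
la_objopen (struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
  (void) cookie;

  const char *ns = lmid == LM_ID_BASE ? "base"
                   : lmid == LM_ID_NEWLM ? "new" : "other";
  const char *name = (map->l_name != NULL && map->l_name[0] != '\0')
                     ? map->l_name : "(main program)";

  /* Illustrative log line only; a real tool would do its work here.  */
  fprintf (stderr, "audit: loaded %s in %s namespace\n", name, ns);

  /* 0 means we do not want symbol-binding callbacks for this object.  */
  return 0;
}
```

Something like `gcc -shared -fPIC -o auditlib.so auditlib.c` followed by `LD_AUDIT=./auditlib.so ./app` would be the usual way to try it; the file and program names here are placeholders.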

Tool developers need to debug their code the same way that app developers do. Without support like Markus’s patch, this is EXTREMELY difficult. When we wrote Spindle (https://computing.llnl.gov/projects/spindle), getting it working was PAINFUL. We had to write tools to write tools. The thing is, tool developers and app developers are different. Treating everything as one big namespace is OK, but it is unwieldy for tool developers. One big thing that we need is the ability to disambiguate symbols based upon which namespace they are found in.

App developers use quite a few different compilers. The tool developers have had extra work maintaining their tools because they had to make them work with every compiler and libstdc++ combination in use, since the tools were using the same libstdc++ and boost libs that the apps used. This introduced a lot of extra support cost for the tool developers. What they are trying to get to is a place where they can build one version of the tool using their preferred version of libstdc++ and boost (and whatever other libraries) and have it insulated from the application. Then they give the application authors some magic invocation for their link line which sets up DT_AUDIT and DT_DEPAUDIT, and voilà, their tool is “always on”. The upshot of this is that there will likely be two versions of libstdc++, and likely of other libraries, loaded into the process’s address space. With Markus’s patch we can now see all the libraries:

 (gdb) info shared
   From                To                  Syms Read   Shared Object Library
   0x00007ffff7fc8090  0x00007ffff7feea45  Yes         /lib64/ld-linux-x86-64.so.2
   0x00007ffff7fb12a0  0x00007ffff7fb9022  Yes         ./auditlib.so
   0x00007ffff7df73f0  0x00007ffff7eff532  Yes         /lib64/libstdc++.so.6
   0x00007ffff7c873b0  0x00007ffff7cf8b58  Yes         /lib64/libm.so.6
   0x00007ffff7c5a670  0x00007ffff7c70c05  Yes         /lib64/libgcc_s.so.1
   0x00007ffff7a82740  0x00007ffff7bf371d  Yes         /lib64/libc.so.6
   0x00007ffff7fc8090  0x00007ffff7feea45  Yes         /lib64/ld-linux-x86-64.so.2
   0x00007ffff777d740  0x00007ffff78ee71d  Yes         /lib64/libc.so.6

Great! The problem is we don’t know which ones are in which namespace. It would be great if we had something like:

 (gdb) info shared
   From                To                  Syms Read   NS  Shared Object Library
   0x00007ffff7fc8090  0x00007ffff7feea45  Yes         1   /lib64/ld-linux-x86-64.so.2
   0x00007ffff7fb12a0  0x00007ffff7fb9022  Yes         1   ./auditlib.so
   0x00007ffff7df73f0  0x00007ffff7eff532  Yes         1   /lib64/libstdc++.so.6
   0x00007ffff7c873b0  0x00007ffff7cf8b58  Yes         0   /lib64/libm.so.6
   0x00007ffff7c5a670  0x00007ffff7c70c05  Yes         0   /lib64/libgcc_s.so.1
   0x00007ffff7a82740  0x00007ffff7bf371d  Yes         1   /lib64/libc.so.6
   0x00007ffff7fc8090  0x00007ffff7feea45  Yes         0   /lib64/ld-linux-x86-64.so.2
   0x00007ffff777d740  0x00007ffff78ee71d  Yes         0   /lib64/libc.so.6
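The namespace an object lives in is something glibc already exposes to code running inside the process, through dlinfo() with RTLD_DI_LMID. The sketch below shows that query; it says nothing about how GDB itself would compute an NS column, and the helper name `namespace_of` is made up for illustration.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: return the link-map list (namespace) ID of the
   object behind HANDLE, or -1 if the query fails.  */
long
namespace_of (void *handle)
{
  Lmid_t lmid;

  if (handle == NULL || dlinfo (handle, RTLD_DI_LMID, &lmid) != 0)
    return -1;
  return (long) lmid;
}
```

A handle obtained with `dlopen (NULL, RTLD_NOW)` refers to the main program and reports LM_ID_BASE, which is 0; a handle returned by `dlmopen (LM_ID_NEWLM, ...)` would be expected to report a nonzero ID. On older glibc releases the program has to be linked with `-ldl`.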

Then, when we want to set a breakpoint on a symbol, we could qualify it with the namespace, something like:

(gdb) b 1::printf
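To make the proposed syntax concrete, here is a small sketch of how a location such as `1::printf` could be split into a namespace number and a symbol. This is not GDB code, and `split_linkage_namespace` is a made-up name. It relies on one assumption: C and C++ identifiers cannot start with a digit, so a leading decimal number followed by `::` cannot be confused with a C++ namespace qualifier.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of GDB.  If SPEC has the form "N::name"
   with N a decimal number, store N in *NS, point *SYMBOL at "name", and
   return 1.  Otherwise leave *NS alone, point *SYMBOL at SPEC, and
   return 0.  */
int
split_linkage_namespace (const char *spec, long *ns, const char **symbol)
{
  char *end;

  *symbol = spec;

  /* Ordinary identifiers and C++ qualifiers never start with a digit.  */
  if (!isdigit ((unsigned char) spec[0]))
    return 0;

  long value = strtol (spec, &end, 10);

  /* Require "::" followed by at least one more character.  */
  if (strncmp (end, "::", 2) != 0 || end[2] == '\0')
    return 0;

  *ns = value;
  *symbol = end + 2;
  return 1;
}
```

With this, `1::printf` yields namespace 1 and symbol `printf`, while `std::sort` is passed through untouched for the normal C++ lookup.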

The problem with lazily searching all namespaces and setting breakpoints at every matching function is most evident with C++ inline functions expanded from header files. There can be thousands of functions in the app which are not interesting to a tool developer. Tool developers tend to want to focus on developing the tool, while app developers focus on their app. The tool developer may only need the app as a reproducer while debugging their tool.

Similar problems show up in other places where there is no way to disambiguate namespaces:

When I say:

(gdb) list <function>

How do I get it to show me the version of the function which is in the first private namespace vs. the version which is in the application?

or when I do something like:

(gdb) break filename.c:231

How do I specify the filename.c TU which is part of the libstdc++.so that is linked into one of the audit libraries, rather than the one that is linked into the app? The path to the file isn’t sufficient, because both could have been built in the same directory structure from different sources. That is why I believe that we need some way to know which shared objects are in which namespace, and some way to specify which linkage namespace to search.

Since the linkage namespaces are on a linked list in glibc, I thought that referring to the default namespace as 0, counting the depth for the others, and then reusing the C++ namespace qualifier might be a good idea, but it is not something that I am hung up on. If someone has a better idea, great.

Anyway, that is my first pass at feedback. Once I get feedback from other developers that I work with and have more time to play with the build that Kevin gave me, I’ll have more for you guys.

-ben


> Pedro Alves


