* RTLD_DEEPBIND interaction with LD_PRELOAD
@ 2023-03-23 11:06 Matthew Parkinson
2023-03-31 14:52 ` Jonathon Anderson
From: Matthew Parkinson @ 2023-03-23 11:06 UTC (permalink / raw)
To: libc-alpha
When a shared library is loaded using RTLD_DEEPBIND, its symbol lookups no longer prefer the LD_PRELOADed libraries: the library's own dependencies are searched first. As a result, overriding the allocator with LD_PRELOAD does not work in applications that load libraries with RTLD_DEEPBIND. A minimal example can be found here:
deepbindexample/problem at main · mjp41/deepbindexample · GitHub<https://github.com/mjp41/deepbindexample/tree/main/problem>
This causes issues for a number of allocators and for AddressSanitizer. More examples can be found on the Bugzilla issue I raised:
30186 – RTLD_DEEPBIND interacts badly with LD_PRELOAD (sourceware.org)<https://sourceware.org/bugzilla/show_bug.cgi?id=30186>
And the twitter thread:
Twitter thread on RTLD_DEEPBIND<https://twitter.com/ParkyMatthew/status/1630500641708683268>
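To make the failure mode concrete, here is a minimal sketch of the loading side. It is not the code from the repository linked above, and the helper name `lookup_deepbound` is hypothetical. When an object is opened this way, its own symbol references (for example to `malloc`) are looked up in the object and its dependencies before the global scope, which is where LD_PRELOADed libraries live.

```c
#define _GNU_SOURCE   /* RTLD_DEEPBIND is a glibc extension */
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: open `path` with RTLD_DEEPBIND and return the
 * address of `symbol` inside it, or NULL on failure. The handle is left
 * open on purpose so that the returned pointer stays valid. */
void *lookup_deepbound(const char *path, const char *symbol)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_DEEPBIND);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return NULL;
    }
    return dlsym(handle, symbol);
}
```

An application might call something like `lookup_deepbound("./libplugin.so", "plugin_entry")` (both names made up for illustration). Any `malloc` call made inside that plugin then binds to the libc it depends on, not to the preloaded allocator.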
I am raising this on libc-alpha to discuss possible solutions, and how acceptable each would be to the community. This is the list I have so far from discussions with colleagues and feedback from Adhemerval Zanella and Siddhesh Poyarekar:
1. Malloc-only solutions
   a. Introduce new malloc-specific symbols for LD_PRELOAD
   b. Use malloc tunables to specify the allocator
2. General solutions
   a. Change RTLD_DEEPBIND to look at LD_PRELOADed libraries first
   b. Introduce a new environment variable LD_PRELOAD_OVERRIDE_DEEPBIND(*) that must be respected by RTLD_DEEPBIND
   c. Introduce a new flag RTLD_DEEPBIND_RESPECT_PRELOAD(*) that looks at LD_PRELOAD first
(*) Naming is not my strong point, just trying to be illustrative.
As an allocator person I am fine with something from the “Malloc-only solutions” group, but I also appreciate that anything added has to be maintained, so a quick, specific solution may be a bad long-term choice. The “General solutions” have far more ramifications, which I personally don’t understand.
Here are some more details of the specific ideas:
1a. This is probably the quickest solution. Introduce a collection of internal symbols that are used to override the allocator. I have put a very minimal PoC for a single call at:
deepbindexample/solutionopt at main · mjp41/deepbindexample · GitHub<https://github.com/mjp41/deepbindexample/tree/main/solutionopt>
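The core idea for something exposed would be:

    #include <stdio.h>

    /* The real implementation; hidden so that it cannot be interposed. */
    __attribute__((visibility("hidden")))
    void message_impl(void)
    {
        puts("lib.c: message_impl");
    }

    /* Weak reference: its address is NULL unless some loaded object defines it. */
    __attribute__((weak))
    extern void override_message(void);

    /* The public entry point that callers bind to. */
    void message(void)
    {
        if (override_message != NULL)
        {
            puts("lib.c: message -> override_message");
            override_message();
            return;
        }
        puts("lib.c: message -> message_impl");
        message_impl();
    }
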
Here `message` would be the libc function we want to be able to override. A library that wants to override this would provide both `message` and `override_message`. This would then work even in the presence of RTLD_DEEPBIND libraries. The call from a library that was loaded with RTLD_DEEPBIND would call the libc `message`, which would then call the `override_message` from the preload.
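For the other side of this arrangement, a preloaded library might look roughly like the sketch below. This is an illustration under assumptions, not the code from the PoC repository: the file name in the output strings, the helper `preload_message_impl` and the counter `preload_message_calls` are all hypothetical, and the counter exists only so the sketch can be checked.

```c
#include <stdio.h>

/* Hypothetical counter; present only so the sketch can be checked. */
int preload_message_calls = 0;

/* The replacement behaviour, shared by both entry points. */
static void preload_message_impl(void)
{
    preload_message_calls++;
    puts("preload.c: message");
}

/* Interposes libc's `message` for callers that bind through the
 * global scope (the ordinary LD_PRELOAD path). */
void message(void)
{
    preload_message_impl();
}

/* Reached through libc's `message` when the caller was loaded with
 * RTLD_DEEPBIND and therefore bound to libc directly. */
void override_message(void)
{
    preload_message_impl();
}
```

Routing both exported symbols to one static helper keeps the two paths identical and avoids any chance of `override_message` calling back into libc's `message`.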
This incurs a single load, compare and branch on the fast path when no LD_PRELOAD override is present. It does not suffer from the issues of the previous malloc hooks, because the override is resolved through a relocation rather than stored as a code pointer in the data segment.
1b. This was proposed by Siddhesh Poyarekar. I think the idea is to expose a “tunable” parameter that specifies which malloc library to use. This is very appealing and has a clear meaning to me. I worry a bit about when tunables are processed, and whether any allocation occurs before then.
2a. This would be the nicest solution if RTLD_DEEPBIND didn’t already exist. As it is, it would alter the semantics of existing programs, and hence is probably a compatibility nightmare.
2b and 2c. Both add a new feature to enable the desired behaviour. Personally, I prefer 2b as that doesn’t require everything that currently uses RTLD_DEEPBIND to be modified. However, I do not have enough experience to understand the consequences of either choice properly.
I am sure there are other possible approaches not outlined here, and I am sure there are consequences of each choice that I am not aware of. However, I do believe making LD_PRELOADing an allocator more reliable is an important feature for glibc.
--
Matthew Parkinson,
Principal Researcher
Microsoft
* Re: RTLD_DEEPBIND interaction with LD_PRELOAD
From: Jonathon Anderson @ 2023-03-31 14:52 UTC (permalink / raw)
To: Matthew Parkinson, libc-alpha
Hello all,
Wrapping symbols is of interest to us (the HPCToolkit folks), so I thought I would jump in here and bring up a 3rd option that we are excited about: LD_AUDIT-powered symbol injection.
We currently use LD_PRELOAD + dlsym(RTLD_NEXT) to wrap critical symbols; however, we have also encountered unavoidable limitations with this approach in some applications in the wild. We haven't run into RTLD_DEEPBIND before, but we have found many other issues:
- dlsym(): If the symbol is fetched directly with dlsym(), LD_PRELOAD does not apply. (And yes, there is code out there that does `dlsym(dlopen(libc.so.6))`. :/)
- dlopen(RTLD_LOCAL): dlsym(RTLD_NEXT) fails if the "victim" symbol is only loaded as the dependency of a library loaded with dlopen(RTLD_LOCAL).
- dlmopen() namespaces: LD_PRELOAD only applies to the main namespace; symbols in private dlmopen() namespaces are unaffected.
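As a rough illustration of the wrapping pattern and of the first limitation in the list, the sketch below wraps `rand` in the LD_PRELOAD + dlsym(RTLD_NEXT) style and then shows a lookup that sidesteps it. The choice of `rand`, the counter `wrapped_rand_calls` and the helper `lookup_in_libc` are assumptions made for the example; it is not HPCToolkit code.

```c
#define _GNU_SOURCE   /* for RTLD_NEXT */
#include <dlfcn.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counter of calls that went through the wrapper. */
int wrapped_rand_calls = 0;

/* LD_PRELOAD-style wrapper: it has the same name as the wrapped
 * function and forwards to the next definition in the search order. */
int rand(void)
{
    static int (*real_rand)(void);
    if (real_rand == NULL)
        real_rand = (int (*)(void)) dlsym(RTLD_NEXT, "rand");
    if (real_rand == NULL)
        abort();  /* no further definition was found */
    wrapped_rand_calls++;
    return real_rand();
}

/* The bypass from the first bullet: a lookup through an explicit libc
 * handle returns libc's own definition, not the wrapper above.
 * The handle is left open so the returned pointer stays valid. */
void *lookup_in_libc(const char *symbol)
{
    void *libc = dlopen("libc.so.6", RTLD_LAZY);
    return libc != NULL ? dlsym(libc, symbol) : NULL;
}
```

A caller that obtains `rand` through `lookup_in_libc("rand")` never increments `wrapped_rand_calls`, which is the behaviour the dlsym() bullet describes.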
The alternative we are considering uses LD_AUDIT's la_symbind hook to inject our wrappers. This hook fires *every* time a symbol gets bound or on dlsym(), avoiding the narrow application issues with LD_PRELOAD. The hook also receives the to-be-bound target as an argument, avoiding the issues with dlsym(RTLD_NEXT). The high power of this approach makes it a very appealing alternative to LD_PRELOAD.
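A minimal sketch of what such an auditor could look like follows, assuming a 64-bit target (hence `la_symbind64` and `Elf64_Sym`) and using `getpid` purely as an illustrative symbol. It is not taken from the test matrix linked in this message; `real_getpid` and `wrapped_getpid` are hypothetical names. It would be built as a shared object and named in LD_AUDIT.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Address the dynamic linker was about to bind; set by la_symbind64. */
static pid_t (*real_getpid)(void);

/* Hypothetical wrapper: a real tool would add its own logic here. */
static pid_t wrapped_getpid(void)
{
    return real_getpid();
}

/* Version handshake with the dynamic linker. */
unsigned int la_version(unsigned int version)
{
    (void) version;
    return LAV_CURRENT;
}

/* Request symbol-binding callbacks to and from every loaded object. */
unsigned int la_objopen(struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
    (void) map; (void) lmid; (void) cookie;
    return LA_FLG_BINDTO | LA_FLG_BINDFROM;
}

/* Called when a symbol is bound; the returned address is what the
 * caller ends up using, and sym->st_value is the original target. */
uintptr_t la_symbind64(Elf64_Sym *sym, unsigned int ndx,
                       uintptr_t *refcook, uintptr_t *defcook,
                       unsigned int *flags, const char *symname)
{
    (void) ndx; (void) refcook; (void) defcook; (void) flags;
    if (strcmp(symname, "getpid") == 0) {
        real_getpid = (pid_t (*)(void)) sym->st_value;
        return (uintptr_t) wrapped_getpid;
    }
    return sym->st_value;
}
```

Because the original target arrives in `sym->st_value`, the wrapper needs no dlsym(RTLD_NEXT) call to find the function it forwards to.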
We have not yet tried LD_AUDIT-powered symbol injection in the wild, but I did write [a small test matrix with some possible wrapper implementations][1] for preliminary research. So far, the basic LD_AUDIT-powered implementations are very promising and avoid the issues we see with LD_PRELOAD.
[1]: https://gitlab.com/blue42u/ldaudit-power-tests/-/tree/main/symbol-wrapping
-Jonathon