From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Fengkai Sun <qcloud1014@gmail.com>
Cc: libc-help@sourceware.org, Vivek Das Mohapatra <vivek@collabora.com>
Subject: Re: Enable preloading in dlopen-ed shared libraries?
Date: Mon, 26 Jul 2021 16:51:31 -0300
Message-ID: <e4d1f4df-fa71-4ca5-ede6-919581c709ae@linaro.org>
In-Reply-To: <CAF6YOcMAY8X0r7cYzpinzG0YoWtFex_ZztUvEescMM0_9Hn+eQ@mail.gmail.com>
On 23/07/2021 23:51, Fengkai Sun wrote:
> Hi Adhemerval,
>
> Thanks so much for your reply! I will explain my idea in detail and sorry
> for the unclearness.
>
>
>> Do you mean by preloading the shared library list using dlmopen in a new
>> namespace? Or do you mean to use RTLD_DEEPBIND with the preload libraries?
>>
>
> I mean making users able to specify a list of preload libraries in a shared
> library's local scope, so that when the library is loaded with
> RTLD_DEEPBIND on, the preload libraries will take precedence over any other
> library except the dlopen-ed one.
This seems like a very specific use case to justify the added complexity and
extra internal state in the loader. I want to better understand the issue you
are trying to solve and why you can't use the current tools instead.
>
>
>>> By doing so, the user can easily provide a different definition of a symbol
>>> from the one of the main executable, by enabling RTLD_DEEPBIND.
>>> This is useful under some circumstances. For example, a dlopen-ed library
>>> may want to use a separate heap from the main heap, and the user can
>>> provide another malloc implementation for that library.
>>
>> But how is this different than the malloc() interposition already supported
>> with LD_PRELOAD?
>>
>
> I found that LD_PRELOAD cannot provide a different definition for a
> dlopen-ed library from the main executable. Let's say we are preloading
> mymalloc.so in an executable:
Yes, LD_PRELOAD acts only on the default global scope.
>
> scope0(global scope): ./main_exe mymalloc.so libc.so libdl.so
> So the reference in the main executable is binded with the definition in
> mymalloc.so.
>
> A dlopen-ed shared library will have such kind of scope, let's call it
> lib1.so:
> scope0(global scope): ./main_exe mymalloc.so libc.so libdl.so
> scope1(local scope): lib1.so libc.so
>
> If lib1.so is loaded without RTLD_DEEPBIND, its reference to malloc will be
> binded to mymalloc.so too. That means shared libraries and the main
> executable are using the same heap, sometimes the user may want to prevent
> it.
> My goal is to preload libraries inside local scope, and it will look like:
> scope1(local scope): lib1.so othermalloc.so libc.so
>
> In this way, the main executable will never see the definition inside
> othermalloc.so, and lib1 can bind to it when RTLD_DEEPBIND is on.
Right, now I have a better grasp of what you are trying to do. I am not
sure I really like this: besides adding more complexity to the loader for
a very specific use case through environment variables, I think you could
implement it in a different way with a new interface we are aiming
to support in the upcoming 2.35 (RTLD_SHARED):

  void *h = dlmopen (LM_ID_NEWLM, "othermalloc.so", RTLD_NOW);
  Lmid_t id;
  dlinfo (h, RTLD_DI_LMID, &id);
  dlmopen (id, "lib1.so", RTLD_SHARED | RTLD_NOW);

lib1.so will then bind its dependencies only within the newly created
namespace scope (LD_PRELOAD won't be added in this case). A subsequent
dlopen ("lib1.so", ...) will then return the handle to the lib1.so already
created in the extra namespace.

(Vivek can correct me if I am wrong here)
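
A slightly fuller sketch of that sequence, with error checks, might look
like the following. The function name load_with_private_preload is
hypothetical, and RTLD_SHARED is only the flag proposed for 2.35 in this
thread: the sketch defines it to 0 when the installed headers lack it, in
which case both objects are simply loaded into the new namespace. On
glibc older than 2.34 this would also need -ldl at link time.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

/* RTLD_SHARED is the flag proposed in this thread; it is an assumption
   that the installed glibc provides it.  Fall back to 0 otherwise.  */
#ifndef RTLD_SHARED
# define RTLD_SHARED 0
#endif

/* Hypothetical helper: load PRELOAD into a fresh link-map namespace,
   then load LIB into that same namespace.  Returns the handle for LIB,
   or NULL on failure.  */
void *
load_with_private_preload (const char *preload, const char *lib)
{
  /* Create a new namespace whose first object is the preload library.  */
  void *h = dlmopen (LM_ID_NEWLM, preload, RTLD_NOW);
  if (h == NULL)
    {
      fprintf (stderr, "dlmopen (%s): %s\n", preload, dlerror ());
      return NULL;
    }

  /* Ask the loader which namespace the preload library landed in.  */
  Lmid_t id;
  if (dlinfo (h, RTLD_DI_LMID, &id) != 0)
    {
      fprintf (stderr, "dlinfo: %s\n", dlerror ());
      dlclose (h);
      return NULL;
    }

  /* Load the real library into the same namespace.  */
  void *lib_handle = dlmopen (id, lib, RTLD_SHARED | RTLD_NOW);
  if (lib_handle == NULL)
    {
      fprintf (stderr, "dlmopen (%s): %s\n", lib, dlerror ());
      dlclose (h);
    }
  return lib_handle;
}
```

A caller would use it as load_with_private_preload ("othermalloc.so",
"lib1.so") and then look up symbols in the returned handle with dlsym.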
>
>>> The auditing interface can do the similar thing, but after doing some
>>> experiments, I found that `la_symbind64' cannot catch the bindings of
>>> global variables, and it cannot hook all of the function bindings.
>>
>> The rtld-audit currently only works for symbols which require a PLT call;
>> global variables are accessed either directly through the GOT or through
>> copy relocations. I am working on extending la_symbind() to work with
>> bind-now binaries, so it would be called at loading time during symbol
>> resolution instead of during lazy binding resolution.
>>
>>>
>>> Would it be a good idea to add an interface to enable preloading in the
>>> local scope of dlopen-ed shared libraries?
>>
>> I am trying to understand better what you are trying to do here, because
>> you are mixing two different usercases here. The RTLD_DEEPBIND is usually
>> used for shared libraries to use its direct dependencies over the global
>> list, the rtld-audit interfaces are loaded in a different namespace.
>>
>
> I mentioned the rtld-audit interface here because it could provide
> different definitions for the same symbol. In la_symbind(), the user can
> return a different symbol address other than sym->st_value to achieve this:
>
> void *mymalloc(...) {...}
> void *la_symbind(...){
>   if (refcook == lib1_cook && strcmp(symname, "malloc") == 0)
>     return mymalloc;
>   else return sym->st_value;
> }
>
>
> However, la_symbind() currently happens in lazy binding resolution(thank
> you for clarifying it!), so it cannot work on every symbol for now.
Yeah, my plan is to fix it for the next release. The rtld-audit interface
might indeed be another way to implement it.
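
For reference, a minimal audit-module sketch of the la_symbind idea
quoted above could look like the following. It assumes a 64-bit target
(la_symbind64), and the names my_malloc and lib1_cookie are hypothetical.
my_malloc just forwards to malloc as a placeholder; a real replacement
would have to cover free, calloc and realloc consistently, since
pointers from one allocator must not be released by another. As noted
above, the hook currently only sees bindings that go through the PLT.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical replacement allocator (placeholder: forwards to malloc).
   A real one must also replace free/calloc/realloc consistently.  */
static void *
my_malloc (size_t n)
{
  return malloc (n);
}

/* Cookie identifying lib1.so, recorded when the loader reports it.  */
static uintptr_t *lib1_cookie;

/* Mandatory handshake: tell the loader which audit version we speak.  */
unsigned int
la_version (unsigned int version)
{
  (void) version;
  return LAV_CURRENT;
}

/* Called for every loaded object; remember lib1.so's cookie and ask
   to be notified of bindings to and from every object.  */
unsigned int
la_objopen (struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
  (void) lmid;
  if (map->l_name != NULL && strstr (map->l_name, "lib1.so") != NULL)
    lib1_cookie = cookie;
  return LA_FLG_BINDTO | LA_FLG_BINDFROM;
}

/* Called when a symbol is bound; redirect malloc only when the
   reference comes from lib1.so, otherwise keep the default.  */
uintptr_t
la_symbind64 (Elf64_Sym *sym, unsigned int ndx, uintptr_t *refcook,
              uintptr_t *defcook, unsigned int *flags, const char *symname)
{
  (void) ndx; (void) defcook; (void) flags;
  if (lib1_cookie != NULL && refcook == lib1_cookie
      && strcmp (symname, "malloc") == 0)
    return (uintptr_t) my_malloc;
  return sym->st_value;
}
```

Such a module would be built as a shared object and passed to the loader
through the LD_AUDIT environment variable.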
>
> Again, I'm sorry for mixing RTLD_DEEPBIND, dlmopen and rtld-audit all
> together in the first email. I hope I've made things clear now.
No problem, this is quite complex due to the multiple ways to change the
loader's dynamic symbol resolution.
>
> --
> Best,
> Fengkai
>