public inbox for libc-help@sourceware.org
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Fengkai Sun <qcloud1014@gmail.com>
Cc: libc-help@sourceware.org, Vivek Das Mohapatra <vivek@collabora.com>
Subject: Re: Enable preloading in dlopen-ed shared libraries?
Date: Mon, 26 Jul 2021 16:51:31 -0300
Message-ID: <e4d1f4df-fa71-4ca5-ede6-919581c709ae@linaro.org>
In-Reply-To: <CAF6YOcMAY8X0r7cYzpinzG0YoWtFex_ZztUvEescMM0_9Hn+eQ@mail.gmail.com>



On 23/07/2021 23:51, Fengkai Sun wrote:
> Hi Adhemerval,
> 
> Thanks so much for your reply! I will explain my idea in detail, and sorry
> for being unclear.
> 
> 
>> Do you mean by preloading the shared library list using dlmopen in a new
>> namespace? Or do you mean to use RTLD_DEEPBIND with the preload libraries?
>>
> 
> I mean making users able to specify a list of preload libraries in a shared
> library's local scope, so that when the library is loaded with
> RTLD_DEEPBIND on, the preload libraries will take precedence over any other
> library except the dlopen-ed one.

Adding this much complexity and extra internal state to the loader seems
hard to justify for such a specific case, so I want to better understand
the issue you are trying to solve and why the current tools do not cover it.

> 
> 
>>> By doing so, the user can easily provide a different definition of a symbol
>>> from the one of the main executable, by enabling RTLD_DEEPBIND.
>>> This is useful under some circumstances. For example, a dlopen-ed library
>>> may want to use a separate heap from the main heap, and the user can
>>> provide another malloc implementation for that library.
>>
>> But how is this different than the malloc() interposition already supported
>> with LD_PRELOAD?
>>
> 
> I found that LD_PRELOAD cannot provide a different definition for a
> dlopen-ed library from the main executable. Let's say we are preloading
> mymalloc.so in an executable:

Yes, LD_PRELOAD acts only on the default global scope.

> 
> scope0(global scope): ./main_exe  mymalloc.so  libc.so  libdl.so
> So the reference in the main executable is bound to the definition in
> mymalloc.so.
> 
> A dlopen-ed shared library will have such kind of scope, let's call it
> lib1.so:
> scope0(global scope): ./main_exe  mymalloc.so  libc.so  libdl.so
> scope1(local scope): lib1.so  libc.so
> 
> If lib1.so is loaded without RTLD_DEEPBIND, its reference to malloc will be
> bound to mymalloc.so too. That means shared libraries and the main
> executable are using the same heap, which the user may sometimes want to
> prevent.
> My goal is to preload libraries inside local scope, and it will look like:
> scope1(local scope): lib1.so  othermalloc.so  libc.so
> 
> In this way, the main executable will never see the definition inside
> othermalloc.so, and lib1 can bind to it when RTLD_DEEPBIND is on.
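
To make sure we are talking about the same lookup order, here is a toy
model of the scopes above.  It is only a sketch: the struct and function
names are made up for illustration and this is not how the loader is
implemented.  It assumes the usual rule that the global scope is searched
first unless RTLD_DEEPBIND is set, in which case the local scope goes first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: an "object" is a name plus the symbols it defines. */
struct object {
    const char *name;
    const char *const *symbols;   /* NULL-terminated list of symbol names */
};

/* A scope is an ordered, NULL-terminated list of objects. */
typedef const struct object *const *scope_t;

/* Return the first object in `scope` that defines `symbol`, or NULL. */
static const struct object *
lookup_in_scope (scope_t scope, const char *symbol)
{
    assert (scope != NULL && symbol != NULL);
    for (; *scope != NULL; ++scope)
        for (const char *const *s = (*scope)->symbols; *s != NULL; ++s)
            if (strcmp (*s, symbol) == 0)
                return *scope;
    return NULL;
}

/* Resolve a reference made from a dlopen-ed library.  With deepbind set
   the local scope is searched first, otherwise the global scope is. */
static const struct object *
resolve (scope_t global, scope_t local, int deepbind, const char *symbol)
{
    scope_t first = deepbind ? local : global;
    scope_t second = deepbind ? global : local;
    const struct object *def = lookup_in_scope (first, symbol);
    return def != NULL ? def : lookup_in_scope (second, symbol);
}

static const char *const malloc_syms[] = { "malloc", "free", NULL };
static const char *const no_syms[] = { NULL };

static const struct object main_exe = { "./main_exe", no_syms };
static const struct object mymalloc = { "mymalloc.so", malloc_syms };
static const struct object othermalloc = { "othermalloc.so", malloc_syms };
static const struct object libc = { "libc.so", malloc_syms };
static const struct object lib1 = { "lib1.so", no_syms };

/* scope0 and scope1 as in the message above (libdl.so left out). */
static const struct object *const scope0[] =
    { &main_exe, &mymalloc, &libc, NULL };
static const struct object *const scope1[] =
    { &lib1, &libc, NULL };
/* scope1 with the proposed local preload. */
static const struct object *const scope1_preload[] =
    { &lib1, &othermalloc, &libc, NULL };
```

In this model resolve (scope0, scope1_preload, 1, "malloc") picks
othermalloc.so, while the same call with deepbind set to 0 still picks
mymalloc.so, which matches the behaviour you describe.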

Right, now I have a better grasp of what you are trying to do.  I am not
sure I really like this: besides adding more complexity to the loader for
a very specific use case through environment variables, I think you could
implement it in a different way with a new interface we are aiming to
support in the upcoming 2.35 release (RTLD_SHARED):

  void *h = dlmopen (LM_ID_NEWLM, "othermalloc.so", RTLD_NOW);
  Lmid_t id;
  dlinfo (h, RTLD_DI_LMID, &id);
  dlmopen (id, "lib1.so", RTLD_SHARED | RTLD_NOW);

lib1.so will then bind its dependencies only within the newly created
namespace scope (LD_PRELOAD won't be added in this case).  A subsequent
dlopen ("lib1.so", ...) will then return the handle to the lib1.so already
loaded in the extra namespace.

(Vivek can correct me if I am wrong here)
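
For reference, the part of this that needs only dlmopen and dlinfo can
be sketched with error checking as below.  This is a minimal sketch, not a
tested recipe: the helper name load_pair_in_new_namespace is made up, and
RTLD_SHARED is deliberately left out because it is only a planned flag.
Without it, as far as I know, the new namespace gets its own copies of the
dependencies, libc included.

```c
#define _GNU_SOURCE             /* for dlmopen, dlinfo and Lmid_t */
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: load `first` into a fresh namespace, then load
   `second` into that same namespace.  Returns the handle of `second`
   (or NULL on failure) and stores the namespace id in *ns. */
static void *
load_pair_in_new_namespace (const char *first, const char *second,
                            Lmid_t *ns)
{
    assert (first != NULL && second != NULL && ns != NULL);

    /* LM_ID_NEWLM asks the loader to create a new, initially empty
       namespace and load `first` (plus its dependencies) into it. */
    void *h = dlmopen (LM_ID_NEWLM, first, RTLD_NOW);
    if (h == NULL)
        return NULL;

    /* Ask which namespace the handle ended up in. */
    if (dlinfo (h, RTLD_DI_LMID, ns) != 0) {
        dlclose (h);
        return NULL;
    }

    /* Load the second library into the namespace just created, so its
       symbol references resolve against objects living there. */
    void *h2 = dlmopen (*ns, second, RTLD_NOW);
    if (h2 == NULL)
        dlclose (h);
    return h2;
}
```

A caller would use it as load_pair_in_new_namespace ("othermalloc.so",
"lib1.so", &id) and check dlerror () when it returns NULL.  On glibc older
than 2.34 this needs to be linked with -ldl.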

> 
>>> The auditing interface can do a similar thing, but after doing some
>>> experiments, I found that `la_symbind64' cannot catch the bindings of
>>> global variables, and it cannot hook all of the function bindings.
>>
>> The rtld-audit currently only works for symbols which require a PLT call;
>> global variables are accessed either directly through the GOT or through
>> copy relocations.  I am working on extending la_symbind() to work with
>> bind-now binaries, so it would be called at loading time during symbol
>> resolution instead of during lazy binding resolution.
>>
>>>
>>> Would it be a good idea to add an interface to enable preloading in the
>>> local scope of dlopen-ed shared libraries?
>>
>> I am trying to understand better what you are trying to do here, because
>> you are mixing two different use cases.  RTLD_DEEPBIND is usually used
>> for shared libraries to prefer their direct dependencies over the global
>> list, while the rtld-audit interfaces are loaded in a different namespace.
>>
> 
> I mentioned the rtld-audit interface here because it could provide
> different definitions for the same symbol. In la_symbind(), the user can
> return a different symbol address other than sym->st_value to achieve this:
> 
> void *mymalloc(...) {...}
> void *la_symbind(...){
>     if (refcook == lib1_cook && strcmp(symname, "malloc") == 0)
>         return mymalloc;
>     else return sym->st_value;
> }
> 
> 
> However, la_symbind() currently happens in lazy binding resolution (thank
> you for clarifying it!), so it cannot work on every symbol for now.

Yeah, my plan is to fix it for the next release.  The rtld-audit interface
might indeed be another way to implement it.
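
A slightly more complete version of your la_symbind idea, using the
signatures from <link.h>, might look like the sketch below.  The names
my_malloc and lib1_cookie, and matching the library by the base name
"lib1.so", are assumptions made for illustration.  It only covers 64-bit
targets (la_symbind64) and, as discussed, is only called for bindings
that go through the PLT.

```c
#define _GNU_SOURCE             /* for Lmid_t via <dlfcn.h> */
#include <assert.h>
#include <link.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical replacement allocator; a placeholder that forwards to
   malloc.  A real one must also cover free, calloc and realloc. */
static void *
my_malloc (size_t size)
{
    return malloc (size);
}

/* Cookie of the object whose malloc references should be redirected. */
static uintptr_t *lib1_cookie;

unsigned int
la_version (unsigned int version)
{
    /* Accept the audit interface version offered by the loader. */
    return version;
}

unsigned int
la_objopen (struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
{
    (void) lmid;
    assert (map != NULL && cookie != NULL);

    /* Remember the cookie of lib1.so, matched by its base name. */
    const char *base = strrchr (map->l_name, '/');
    base = base != NULL ? base + 1 : map->l_name;
    if (strcmp (base, "lib1.so") == 0)
        lib1_cookie = cookie;

    /* Request la_symbind calls for bindings to and from this object. */
    return LA_FLG_BINDTO | LA_FLG_BINDFROM;
}

uintptr_t
la_symbind64 (Elf64_Sym *sym, unsigned int ndx, uintptr_t *refcook,
              uintptr_t *defcook, unsigned int *flags, const char *symname)
{
    (void) ndx;
    (void) defcook;
    (void) flags;

    /* Redirect only malloc references that originate from lib1.so. */
    if (refcook == lib1_cookie && strcmp (symname, "malloc") == 0)
        return (uintptr_t) my_malloc;
    return sym->st_value;
}
```

This would be built as a shared object and passed through LD_AUDIT.  Keep
in mind that audit modules live in their own namespace, so redirecting
malloc alone while leaving free untouched would mix two allocators.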

> 
> Again, I'm sorry for mixing RTLD_DEEPBIND, dlmopen and rtld-audit all
> together in the first email. I hope I've made things clear now.

No problem, this is quite complex due to the multiple ways to change the
loader's dynamic symbol resolution.

> 
> --
> Best,
> Fengkai
> 
