The future of static dlopen

* The future of static dlopen
@ 2017-12-16 13:28 Florian Weimer
  2017-12-20  7:03 ` Carlos O'Donell
  0 siblings, 1 reply; 5+ messages in thread
From: Florian Weimer @ 2017-12-16 13:28 UTC (permalink / raw)
  To: GNU C Library

Folklore has it that static dlopen and dlmopen are closely related. 
Both have an outer and inner libc, and thus share a similar problem of 
making sure that they have the same view of the process and share data 
as needed.

However, there is this code in _dl_map_object_from_fd:

   /* When loading into a namespace other than the base one we must
      avoid loading ld.so since there can only be one copy.  Ever.  */
   if (__glibc_unlikely (nsid != LM_ID_BASE)
       && (_dl_file_id_match_p (&id, &GL(dl_rtld_map).l_file_id)
	  || _dl_name_match_p (name, &GL(dl_rtld_map))))
     {
       /* This is indeed ld.so.  Create a new link_map which refers to
	 the real one for almost everything.  */
       l = _dl_new_object (realname, name, l_type, loader, mode, nsid);

So the dynamic linker is indeed shared across dlmopen namespaces.  If we 
want to share anything between libcs, we can simply do this by 
implementing it in ld.so instead.

However, this works only for dlmopen.  For static dlopen, there is no 
outer ld.so that can be shared.  Instead, a new inner ld.so is loaded 
but not initialized, leading to bugs such as bug 20802 (getauxval not 
working after static dlopen).
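
To make the symptom concrete, here is a minimal sketch of a check that
relies on getauxval.  In an ordinary process it succeeds; according to
bug 20802, this is the kind of call that stops working when it is made
from a DSO loaded via static dlopen, because it goes through the
uninitialized inner ld.so.  The helper name auxv_page_size_matches is
made up for illustration.

```c
#include <sys/auxv.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the page size reported by the
   auxiliary vector agrees with sysconf, and 0 otherwise (including
   the case where getauxval has no value to report).  */
int
auxv_page_size_matches (void)
{
  unsigned long from_auxv = getauxval (AT_PAGESZ);
  long from_sysconf = sysconf (_SC_PAGESIZE);

  if (from_auxv == 0 || from_sysconf <= 0)
    return 0;
  return from_auxv == (unsigned long) from_sysconf;
}
```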

In fact, when the inner ld.so appears to work, it only does so because 
it is bypassed.  For dlopen from the loaded DSOs, we have two different 
mechanisms, one for libc, one for libdl, which install the non-ld.so 
implementation of dlopen into the inner libc, called 
__libc_register_dl_open_hook and __libc_register_dlfcn_hook.  These 
hooks, when active, completely replace the implementation.  Here's the 
example for dlopen:

void *
__dlopen (const char *file, int mode DL_CALLER_DECL)
{
# ifdef SHARED
   if (__glibc_unlikely (_dlfcn_hook != NULL))
     return _dlfcn_hook->dlopen (file, mode, DL_CALLER);
# endif

This is not exactly harmless because there are still crash handlers 
which call dlopen as part of the crash reporting procedure (to load the 
libgcc unwinder).  It is possible, however, to mangle those function 
pointers (although this will of course break static dlopen from existing 
binaries, but we require recompilation already as there is no stable 
ABI; see bug 20204).
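
The replacement pattern can be sketched in isolation.  This is an
illustrative stand-in, not the glibc code: struct open_hook,
active_hook, register_open_hook, my_open and the two stub
implementations are hypothetical names, and the stubs only return
marker values so that the dispatch is visible.

```c
#include <stddef.h>

/* Hypothetical hook table, standing in for the structure that
   _dlfcn_hook points to.  */
struct open_hook
{
  int (*open_fn) (const char *file, int mode);
};

/* NULL means no hook is installed and the built-in code runs.  */
static const struct open_hook *active_hook;

/* Stand-in for the inner implementation that is bypassed.  */
static int
default_open (const char *file, int mode)
{
  (void) file;
  (void) mode;
  return -1;
}

/* Stand-in for the implementation supplied by the static executable.  */
static int
outer_open (const char *file, int mode)
{
  (void) file;
  return mode + 100;
}

const struct open_hook outer_hook = { outer_open };

void
register_open_hook (const struct open_hook *hook)
{
  active_hook = hook;
}

/* Once a hook is registered, the built-in path is never reached.  */
int
my_open (const char *file, int mode)
{
  if (active_hook != NULL)
    return active_hook->open_fn (file, mode);
  return default_open (file, mode);
}
```

Note that active_hook is an ordinary writable pointer: whoever can
overwrite it controls every later call to my_open.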

Let me stress again that these hooks are *not* needed for the dlmopen 
case.  There, _rtld_global_ro is fully initialized, and a call to 
GLRO(dl_open) just works (and so would a call to the ld.so function 
through an ELF relocation).

As the getauxval bug 20802 shows, the set of hooks is currently 
incomplete.  Another example is dlvsym support from libc.so itself for 
internal use, which is missing from elf/dl-libc.c (and which I need to 
implement libidn2 support for AI_IDN).  There are probably many other 
things missing as well, e.g. bug 10652 which still lacks root cause 
analysis.

This led me to wonder if there is a more natural way of implementing 
static dlopen.  The current scheme certainly has the advantage that it 
is possible to dlopen a DSO which is not linked against libc.so and 
ld.so (basically, without DT_NEEDED) with minimal extra overhead and 
dependency on additional files.  However, I'm not sure how common that 
use case is.  Our own use of static dlopen for NSS modules does not fit 
that.

If the static-dlopen-of-statically-linked-DSO is not a useful use case 
to support, maybe we should change the static dlopen implementation to 
load ld.so first and let it handle all further dynamic linking.  We 
would have to tweak the regular entry point so that the TLS 
initialization and some other steps are skipped because the main 
executable has already done that work.  At that point, we would load 
ld.so pretty much like the kernel would load it.  After the 
initialization, the dynamic loader would work just in the way it does 
for dynamically linked binaries.

But this leads to the question: Why do this at all?  Shouldn't we 
perhaps simply tell the kernel to load the dynamic loader for us?  That 
is, create a dynamically linked executable?

Since a statically linked executable is already tied to the libc.so and 
ld.so version it was created with, what exactly is the use case for 
static dlopen?

Should we remove support for static dlopen?  And use some other 
mechanism to implement NSS for statically linked binaries?

Thanks,
Florian

* Re: The future of static dlopen
  2017-12-16 13:28 The future of static dlopen Florian Weimer
@ 2017-12-20  7:03 ` Carlos O'Donell
  2017-12-20  7:24   ` Florian Weimer
  0 siblings, 1 reply; 5+ messages in thread
From: Carlos O'Donell @ 2017-12-20  7:03 UTC (permalink / raw)
  To: Florian Weimer, GNU C Library

On 12/16/2017 05:28 AM, Florian Weimer wrote:
> Folklore has it that static dlopen and dlmopen are closely related. 
> Both have an outer and inner libc, and thus share a similar problem 
> of making sure that they have the same view of the process and share 
> data as needed.

They are closely related only in that the static dlopen can be considered
as a form of a namespace. The objects opened by the static dlopen are
isolated from the static application linkages. Some of the problems are
the same, some are not.
 
> However, there is this code in _dl_map_object_from_fd:
> 
> /* When loading into a namespace other than the base one we must
>    avoid loading ld.so since there can only be one copy.  Ever.  */
> if (__glibc_unlikely (nsid != LM_ID_BASE)
>     && (_dl_file_id_match_p (&id, &GL(dl_rtld_map).l_file_id)
>         || _dl_name_match_p (name, &GL(dl_rtld_map))))
>   {
>     /* This is indeed ld.so.  Create a new link_map which refers to
>        the real one for almost everything.  */
>     l = _dl_new_object (realname, name, l_type, loader, mode, nsid);
> 
> So the dynamic linker is indeed shared across dlmopen namespaces.
> If we want to share anything between libcs, we can simply do this by 
> implementing it in ld.so instead.

We don't know if that is the best solution for what our users want.

* Allowing different dynamic loaders provides better isolation.
  - Would require a loader<->loader API.
  - Even better LD_AUDIT isolation.

* Allowing different dynamic loaders lets you load newer libraries
  than you can possibly support.
  - Load libraries in a chroot/container that may require a newer
    ld.so (so long as the new ld.so supports the loader<->loader API).

Your suggestion is the simplest solution though, which is to move any
needed features into the parent ld.so, and always ensure your outer
process uses the latest ld.so.
 
> However, this works only for dlmopen.  For static dlopen, there is
> no outer ld.so that can be shared.  Instead, a new inner ld.so is 
> loaded but not initialized, leading to bugs such as bug 20802 
> (getauxval not working after static dlopen).

There is an outer ld.so, but it's linked *into* the application.
 
> In fact, when the inner ld.so appears to work, it only does so 
> because it is bypassed.  For dlopen from the loaded DSOs, we have
> two different mechanisms, one for libc, one for libdl, which install
> the non-ld.so implementation of dlopen into the inner libc, called 
> __libc_register_dl_open_hook and __libc_register_dlfcn_hook.  These 
> hooks, when active, completely replace the implementation.  Here's 
> the example for dlopen:

The design of these hooks is to bridge the static ld.so into the
inner dynamic namespace, and effect what happens with dlmopen, having
just one dynamic loader.
 
> void *
> __dlopen (const char *file, int mode DL_CALLER_DECL)
> {
> # ifdef SHARED
>   if (__glibc_unlikely (_dlfcn_hook != NULL))
>     return _dlfcn_hook->dlopen (file, mode, DL_CALLER);
> # endif
> 
> This is not exactly harmless because there are still crash handlers 
> which call dlopen as part of the crash reporting procedure (to load 
> the libgcc unwinder).

What harm is caused by this? Could you expand on this a bit?

> It is possible, however, to mangle those function pointers (although
> this will of course break static dlopen from existing binaries, but
> we require recompilation already as there is no stable ABI; see bug
> 20204).

Correct, there is *no* stable ABI, you must always run your static
binary (that uses dlopen) with the *exact* matching glibc you built with.
You cannot upgrade glibc and expect static binaries using dlopen to continue
to work. This is a known limitation and we express it very clearly.
 
> Let me stress again that these hooks are *not* needed for the
> dlmopen case.  There, _rtld_global_ro is fully initialized, and a
> call to GLRO(dl_open) just works (and so would a call to the ld.so
> function through an ELF relocation).

Correct.
 
> As the getauxval bug 20802 shows, the set of hooks is currently 
> incomplete.  Another example is dlvsym support from libc.so itself 
> for internal use, which is missing from elf/dl-libc.c (and which I 
> need to implement libidn2 support for AI_IDN).  There are probably 
> many other things missing as well, e.g. bug 10652 which still lacks 
> root cause analysis.

Yes, the implementation of static dlopen has some rough edges.

> This led me to wonder if there is a more natural way of implementing 
> static dlopen.  The current scheme certainly has the advantage that 
> it is possible to dlopen a DSO which is not linked against libc.so 
> and ld.so (basically, without DT_NEEDED) with minimal extra overhead 
> and dependency on additional files.  However, I'm not sure how
> common that use case is.  Our own use of static dlopen for NSS
> modules does not fit that.

Right.

> If the static-dlopen-of-statically-linked-DSO is not a useful use 
> case to support, maybe we should change the static dlopen 
> implementation to load ld.so first and let it handle all further 
> dynamic linking.  We would have to tweak the regular entry point so 
> that the TLS initialization and some other steps are skipped because 
> the main executable has already done that work.  At that point, we 
> would load ld.so pretty much like the kernel would load it.  After 
> the initialization, the dynamic loader would work just in the way it 
> does for dynamically linked binaries.

The same system would let dlmopen use a distinct ld.so, and this would
allow you to chain-load a newer loader in userspace, and "step into"
a newer runtime, do something, and then "step out" and destroy the
namespace you created. This could let a parent process load newer-than-you
plugins that require completely new runtimes.

Notes:
"A Multi-User Virtual Machine"
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.61.5598&rep=rep1&type=pdf
- Uses dlmopen/LD_AUDIT.

> But this leads to the question: Why do this at all?  Shouldn't we 
> perhaps simply tell the kernel to load the dynamic loader for us? 
> That is, create a dynamically linked executable?

Sure, but I think it might be simpler to do what you suggest below :-)

> Since a statically linked executable is already tied to the libc.so 
> and ld.so version it was created with, what exactly is the use case 
> for static dlopen?

None except to support NSS. The point of a static executable is not
to have *any* dependencies.

> Should we remove support for static dlopen?  And use some other
> mechanism to implement NSS for statically linked binaries?

Yes, I think we *could* remove support for static dlopen if you could
solve the NSS issues.

It would be easiest to have a proxy process to handle these requests
for you... such a proxy process could be a proxy thread instead?
As you suggest earlier have the kernel start a new tid, and map into
your VMA a new dynamic executable that you can access and call into
for services?

At this point the kernel will just reject such patches saying it could
be implemented completely in the static executable e.g. map in a new
ld.so and bootstrap a new runtime properly.

The difficulty is that NSS plugins need to do a lot to interface with
their respective service providers.

-- 
Cheers,
Carlos.

* Re: The future of static dlopen
  2017-12-20  7:03 ` Carlos O'Donell
@ 2017-12-20  7:24   ` Florian Weimer
  2017-12-20 16:36     ` Zack Weinberg
  0 siblings, 1 reply; 5+ messages in thread
From: Florian Weimer @ 2017-12-20  7:24 UTC (permalink / raw)
  To: Carlos O'Donell, GNU C Library

On 12/20/2017 08:03 AM, Carlos O'Donell wrote:

> We don't know if that is the best solution for what our users want.
> 
> * Allowing different dynamic loaders provides better isolation.
>    - Would require a loader<->loader API.
>    - Even better LD_AUDIT isolation.
> 
> * Allowing different dynamic loaders lets you load newer libraries
>    than you can possibly support.
>    - Load libraries in a chroot/container that may require a newer
>      ld.so (so long as the new ld.so supports the loader<->loader API).
> 
> Your suggestion is the simplest solution though, which is to move any
> needed features into the parent ld.so, and always ensure your outer
> process uses the latest ld.so.

It's not a suggestion, it's what the loader currently does (and I have 
written a test case to verify that it actually works, i.e. that symbols 
implemented by the loader have the same address on both sides of dlmopen).

>> However, this works only for dlmopen.  For static dlopen, there is
>> no outer ld.so that can be shared.  Instead, a new inner ld.so is
>> loaded but not initialized, leading to bugs such as bug 20802
>> (getauxval not working after static dlopen).
> 
> There is an outer ld.so, but it's linked *into* the application.

It's code compiled from mostly the same sources in elf/, but I can 
assure you that it is *not* anything close to resembling ld.so at run 
time: It does not have a dynamic symbol table (so no interposition into 
libc).  It does not have its own link map entry.

>> In fact, when the inner ld.so appears to work, it only does so
>> because it is bypassed.  For dlopen from the loaded DSOs, we have
>> two different mechanisms, one for libc, one for libdl, which install
>> the non-ld.so implementation of dlopen into the inner libc, called
>> __libc_register_dl_open_hook and __libc_register_dlfcn_hook.  These
>> hooks, when active, completely replace the implementation.  Here's
>> the example for dlopen:
> 
> The design of these hooks is to bridge the static ld.so into the
> inner dynamic namespace, and effect what happens with dlmopen, having
> just one dynamic loader.

The mechanisms are completely different.  dlmopen works essentially the 
same as regular dynamic linking.  For static dlopen, providing dynamic 
linker functionality requires that we write custom hooks or other 
mechanisms, and use them to override ld.so behavior.  If we don't do 
that, loaded DSOs will use the uninitialized ld.so, which is unlikely to 
work.

>> void *
>> __dlopen (const char *file, int mode DL_CALLER_DECL)
>> {
>> # ifdef SHARED
>>   if (__glibc_unlikely (_dlfcn_hook != NULL))
>>     return _dlfcn_hook->dlopen (file, mode, DL_CALLER);
>> # endif
>>
>> This is not exactly harmless because there are still crash handlers
>> which call dlopen as part of the crash reporting procedure (to load
>> the libgcc unwinder).
> 
> What harm is caused by this? Could you expand on this a bit?

There are exploits which overwrite the hook pointers to achieve code 
execution.  This was particularly attractive when we still called dlopen 
on heap corruption.
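
Mangling, as mentioned earlier in the thread, means storing the
pointer in a form that cannot be forged without knowing a per-process
secret.  A rough sketch of the idea, assuming a plain XOR with a guard
value (glibc's PTR_MANGLE is architecture-specific and does more on
some targets; every name below is hypothetical, and the guard is a
fixed constant only to keep the example deterministic):

```c
#include <stdint.h>

typedef int (*hook_fn) (int);

/* Assumption: a real implementation would initialize this from a
   random source at process startup.  */
static uintptr_t pointer_guard = (uintptr_t) 0x5bd1e995u;

/* The hook is kept only in mangled form.  A separate flag records
   whether it is set, because the mangled value cannot be compared
   against NULL.  */
uintptr_t stored_hook;
static int hook_installed;

static uintptr_t
mangle (hook_fn fn)
{
  return (uintptr_t) fn ^ pointer_guard;
}

static hook_fn
demangle (uintptr_t value)
{
  return (hook_fn) (value ^ pointer_guard);
}

void
install_hook (hook_fn fn)
{
  stored_hook = mangle (fn);
  hook_installed = 1;
}

/* Calls the hook if one is installed, otherwise returns FALLBACK.  */
int
call_hook (int arg, int fallback)
{
  if (!hook_installed)
    return fallback;
  return demangle (stored_hook) (arg);
}

/* Example hook used to exercise the functions above.  */
int
twice (int x)
{
  return 2 * x;
}
```

An attacker who overwrites stored_hook with a raw address gets that
address XORed with the guard on the next call, so without the guard
value the jump target is not under their control.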

>> Should we remove support for static dlopen?  And use some other
>> mechanism to implement NSS for statically linked binaries?
> 
> Yes, I think we *could* remove support for static dlopen if you could
> solve the NSS issues.

Okay, I'll post a patch to add a deprecation notice to NEWS.

> It would be easiest to have a proxy process to handle these requests
> for you... such a proxy process could be a proxy thread instead?
> As you suggest earlier have the kernel start a new tid, and map into
> your VMA a new dynamic executable that you can access and call into
> for services?

I would just add an option to /usr/bin/getent which causes it to enter 
co-process mode.  It's not going to be extremely efficient (especially 
if we don't use a persistent subprocess), but it would be quite reliable, 
unlike what we have today.

Thanks,
Florian

* Re: The future of static dlopen
  2017-12-20  7:24   ` Florian Weimer
@ 2017-12-20 16:36     ` Zack Weinberg
  2017-12-20 17:03       ` Florian Weimer
  0 siblings, 1 reply; 5+ messages in thread
From: Zack Weinberg @ 2017-12-20 16:36 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Carlos O'Donell, GNU C Library

On Tue, Dec 19, 2017 at 11:24 PM, Florian Weimer <fweimer@redhat.com> wrote:
> On 12/20/2017 08:03 AM, Carlos O'Donell wrote:
>> It would be easiest to have a proxy process to handle these requests
>> for you... such a proxy process could be a proxy thread instead?
>> As you suggest earlier have the kernel start a new tid, and map into
>> your VMA a new dynamic executable that you can access and call into
>> for services?
>
> I would just add an option to /usr/bin/getent which causes it to enter
> co-process mode.  It's not going to be extremely efficient (especially if we
> don't use a persistent subprocess), but it would be quite reliable, unlike
> what we have today.

This seems like another case where making nscd less of an afterthought
would be a win.  It already does this job, after all.  In principle,
static binaries could just omit the no-nscd fallback path.  (In
practice, falling back to "files [dns]" might be the right thing,
since static binaries tend to get used for recovery.)

I also wonder what other C libraries do for static NSS.

zw

* Re: The future of static dlopen
  2017-12-20 16:36     ` Zack Weinberg
@ 2017-12-20 17:03       ` Florian Weimer
  0 siblings, 0 replies; 5+ messages in thread
From: Florian Weimer @ 2017-12-20 17:03 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Carlos O'Donell, GNU C Library

On 12/20/2017 05:36 PM, Zack Weinberg wrote:
> On Tue, Dec 19, 2017 at 11:24 PM, Florian Weimer <fweimer@redhat.com> wrote:
>> On 12/20/2017 08:03 AM, Carlos O'Donell wrote:
>>> It would be easiest to have a proxy process to handle these requests
>>> for you... such a proxy process could be a proxy thread instead?
>>> As you suggest earlier have the kernel start a new tid, and map into
>>> your VMA a new dynamic executable that you can access and call into
>>> for services?
>>
>> I would just add an option to /usr/bin/getent which causes it to enter
>> co-process mode.  It's not going to be extremely efficient (especially if we
>> don't use a persistent subprocess), but it would be quite reliable, unlike
>> what we have today.
> 
> This seems like another case where making nscd less of an afterthought
> would be a win.  It already does this job, after all.

I don't think nscd takes care of everything.  Enumeration is missing, I 
think, and so are some databases (aliasent, etherent).

> In principle,
> static binaries could just omit the no-nscd fallback path.  (In
> practice, falling back to "files [dns]" might be the right thing,
> since static binaries tend to get used for recovery.)

I thought about that as well, but I'm really not sure if using nscd 
brings a benefit.  From a distribution perspective, if you have to 
install nscd to run certain static binaries, that might also enable it 
for dynamically-linked binaries, and many users may not want that.

Thanks,
Florian
