public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: stsp <stsp2@yandex.ru>
To: Zack Weinberg <zack@owlfolio.org>
Cc: Carlos O'Donell <carlos@redhat.com>,
	libc-alpha@sourceware.org, Jonathon Anderson <janderson@rice.edu>,
	Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>,
	Szabolcs Nagy <szabolcs.nagy@arm.com>
Subject: Re: [PATCH v9 0/13] implement dlmem() function
Date: Fri, 14 Apr 2023 03:50:10 +0500	[thread overview]
Message-ID: <aaabd8e5-1bab-4e06-3b38-86bb35d896c5@yandex.ru> (raw)
In-Reply-To: <23436bcf-a715-971b-db54-5d80634ca0ef@redhat.com>


[-- Attachment #1.1: Type: text/plain, Size: 6122 bytes --]

Hi Zack, thanks for a reply.
Have you forgot to CC the e-mail to me,
or was it some spam filtering? Carlos have
just told me about it.
I'll use copy/paste from web interface to reply,
and will monitor the web archive for a few days
in case your mails to me are blocked somewhere.


> Let me try to highlight specific issues from the feedback you've been
> getting.  It appears to me that your *API* proposal -- not any details
> of its implementation, the API itself -- is unacceptable *BECAUSE*:
>
> * It introduces a callback from the dynamic loader into user code, that
>    is to be run with an internal lock held. glibc does already have
>    callbacks like that, but we do not want to introduce any more of
>    them, period.
> * It assumes that the dynamic loader will always, in the future, operate
>    by creating a large initial mapping and then modifying protection
>    flags on it.
Yes, those are the totally valid concerns
raised by Szabolcs. They were very difficult
to address actually, and eventually I came to
this proposal:
https://sourceware.org/pipermail/libc-alpha/2023-April/147243.html

It is needed to address exactly those, but
it may not be obvious yet how exactly they
are related. I think this is not a problem, you'll
see that in a next dlmem() draft. These problems
will be solved. Most of dlmem() can be off-loaded
to either proposed RTLD_NORELOC or something
similar.

> * It assumes that it the dynamic loader is, and will always be, capable
>    of taking a pointer to a block of memory -- no matter how it was
>    originally allocated -- and then chopping its virtual memory map up to
>    satisfy the requirements of the embedded ELF program header table,
>    without breaking anything else.
I think you are talking about a premap callback
here? It will be dropped per comment 1 of yours,
so I think this is mostly a duplicate.

> [On this specific note, I foresee
>    someone trying to write a dynamic module into memory allocated with
>    memalign() or even plain old malloc(); calling dlmem() on this is very
>    likely to apply PROT_NONE to address ranges containing malloc
>    metadata!]
No-no, it never changes the protection
of a source buffer! Nothing to worry about
here. Also there are some recommendations
in an API draft what mappings are supported
as a source and what not - using memalign()
space is very much unrecommended unless
you know what you do (see api draft for details).
But the protections are of course not altered.

> For instance, maybe we could have an fdlopen variant that
> called callback functions instead of mmap() and mprotect().  Unlike
> your dlmem_premap_t, I think it ought to be possible to avoid doing
> those specific operations with any locks held.
But what such a callback would do?
Plain read()?


So.
You are welcome to this lengthy thread. :)
I am attaching an API draft in case you are
curious about a details, but it was not yet
updated to the removal of a callback.
Sorry about that, just take my word on that
for now. :)

There are a few things that you miss about
the proposed design (as every newcomer
to this thread does), so let me show you an
examples of fdlopen() and dlopen_with_offset()
implemented on top of dlmem(). Most of things,
like eg the source buffer usage, will be more
clear after that:

static void *
fdlopen (int fd, int flags)
{
   off_t len;
   void *addr;
   void *handle;

   len = lseek (fd, 0, SEEK_END);
   lseek (fd, 0, SEEK_SET);
   addr = mmap (NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
   if (addr == MAP_FAILED)
     return NULL;
   handle = dlmem (addr, len, flags, NULL);
   munmap (addr, len);
   return handle;
}

If we run that example, then in /proc/self/maps
we will see the correct references to an elf file:

$ LD_LIBRARY_PATH=..:. ./tst-dlmem-fdlopen
unaligned buf gives buffer not aligned: Invalid argument
7fb413101000-7fb413102000 r--p 00000000 00:28 17195405
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413102000-7fb413103000 r-xp 00001000 00:28 17195405
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413103000-7fb413104000 r--p 00002000 00:28 17195405
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413104000-7fb413105000 r--p 00002000 00:28 17195405
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so
7fb413105000-7fb413106000 rw-p 00003000 00:28 17195405
/home/stas/src/glibc-dev/build/dlfcn/glreflib1.so

As you an see, there is no reference to the
full file, just to the elf segments. That's
because the above code just munmap()s the source
buffer after dlmem() returns. Its protection
was never changed.

Now, dlopen_with_offset() is slightly more difficult,
but is suggested for a read too:

/* Load the shared library from a container file.
    file   - file name.
    offset - solib offset within a container file.
             Highly recommended to be page-aligned.
    length - solib file length (not a container file length).
    flags  - dlopen() flags. */
void *dlopen_with_offset4 (const char *file, off_t offset, size_t length,
                            int flags)
{
     void *addr;
     void *handle;
     int fd;
     off_t pad_size = (offset & (getpagesize () - 1));
     off_t aligned_offset = offset - pad_size;
     size_t map_length = length + pad_size;

     fd = open (file, O_RDONLY);
     if (fd == -1)
         return NULL;
     addr = mmap (NULL, map_length, PROT_READ, MAP_PRIVATE, fd, aligned_offset);
     close(fd);
     if (addr == MAP_FAILED)
         return NULL;
     if (pad_size)
       {
         /* We need to fix alignment by hands. :-(
            And for that we need a shared mapping. */
         void *addr2 = mmap (NULL, length, PROT_READ | PROT_WRITE,
                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
         if (addr2 == MAP_FAILED)
           {
             munmap (addr, map_length);
             return NULL;
           }
         memcpy (addr2, addr + pad_size, length);
         munmap (addr, map_length);
         addr = addr2;
         map_length = length;
       }
     handle = dlmem (addr, length, flags, NULL);
     munmap (addr, map_length);
     return handle;
}

[-- Attachment #2: 0000-cover-letter.patch --]
[-- Type: text/x-patch, Size: 8231 bytes --]

From 46e5095ebfe63be4dcd813c4237d6a491a3f9768 Mon Sep 17 00:00:00 2001
From: Stas Sergeev <stsp2@yandex.ru>
Date: Mon, 13 Feb 2023 18:15:34 +0500
Subject: [PATCH v10 0/12] implement dlmem() function

This patch-set implements the dlmem() function that allows to load
the solib from the file-mapped or anonymously-shared memory buffer.
Generic memory buffer is also supported with DLMEM_GENBUF_SRC flag,
but it should not be preferred. Private anonymous mapping used as
a source buffer, also needs this flag to be set.

dlmem() suits as a building block for implementing the functions like
fdlopen() and dlopen_with_offset(), which are demo-implemented in a
test-case called tst-dlmem-extfns.
The reasons why it suits well for such file-based loaders, are below:
1. It correctly handles the file association of the original solib
   buffer if it was mmap'ed from a file (unless DLMEM_GENBUF_SRC
   of DLMEM_DONTREPLACE flag is set).
2. It allows to provide a solib name, which can be the file name.

With the above properties, the "direct" implementation of these functions
gives no advantages over implementing them with dlmem().

In addition, dlmem() has lots of optional functionality for the fine-grained
control over the loading process. It allows you to set nsid (like dlmopen()),
specify the solib relocation address and even relocate the solib into
the user's buffer. That "advanced" functionality is only needed for the
very specific use-cases, like virtualized environments where the relocation
address may have a special constraints, eg MAP_32BIT. In all other cases
it is advised to set the "dlm_args" pointer of dlmem() call to NULL, but
see "Limitations" below to find out when its not the case.

The API looks as below:

/* Callback for dlmem. */
typedef void *
(dlmem_premap_t) (void *mappref, size_t maplength, size_t mapalign,
	          void *cookie);

/* Do not replace mapping created by premap callback.
   dlmem() will then use memcpy(). */
#define DLMEM_DONTREPLACE 1
/* Treat source memory buffer as a generic unaligned buffer, rather
   than a file-backed or anonymously-shared mapping. Anonymous private
   mapping also needs this flag to be set. */
#define DLMEM_GENBUF_SRC 2

struct dlmem_args {
  /* Optional name to associate with the loaded object. */
  const char *soname;
  /* Namespace where to load the object. */
  Lmid_t nsid;
  /* dlmem-specific flags. */
  unsigned int flags;
  /* Optional premap callback. */
  dlmem_premap_t *premap;
  /* Optional argument for premap callback. */
  void *cookie;
};

/* Like `dlmopen', but loads shared object from memory buffer.  */
extern void *dlmem (const unsigned char *buffer, size_t size, int mode,
		    struct dlmem_args *dlm_args);

Advanced functionality:

In most cases dlm_args should just be set to NULL. It provides the
advanced functionality, most of which is obvious (soname, nsid).
The optional premap callback allows to set the relocation address for
the solib by mapping the destination space and returning its address.
More so, if DLMEM_DONTREPLACE flag is used, then the mapping
established by the premap callback, will not be replaced with the
file-backed mapping. In that case dlmem() have to use memcpy(), which
is likely even faster than mmaps() but doesn't end up with the proper
/proc/self/map_files or /proc/self/maps entries. So for example if the
premap callback uses MAP_SHARED, then with the use of the DLMEM_DONTREPLACE
flag you can get your solib relocated into a shared memory buffer.
Note that the premap callback may be called under glibc locks, so it
should restrict itself to the syscall functionality. Certainly no libdl
functions can be used. mmap(), shm_open(), open(), ftruncate(),
memfd_create() should be the sufficient set of functions that may
ever be in use in a premap callback, but in most cases the callback
should just be disabled.


Limitations:

- If you need to load the solib from anonymously-mapped buffer, you need
  to use MAP_SHARED|MAP_ANONYMOUS mmap flags when creating that buffer.
  If it is not possible in your use-case and the buffer was created
  with MAP_PRIVATE|MAP_ANONYMOUS flags, then DLMEM_GENBUF_SRC flag
  needs to be set when calling dlmem().
  Failure to follow that guide-line results in an UB (loader will not
  be able to properly lay out an elf segments).

- If you use a private file-backed mapping, then it shouldn't be
  modified by hands before passing to dlmem(). I.e. you can't apply
  mprotect() to it to change protection bits, and you can't apply
  memmove() to it to move the solib to the beginning of the buffer,
  and so on. dlmem() can only work with "virgin" private file-backed
  mappings. You can set DLMEM_GENBUF_SRC flag as a work-around if
  the mapping is already corrupted.
  Failure to follow that guide-line results in an UB (loader will not
  be able to properly lay out an elf segments).

- The need of mapping the entire solib (with debug info etc) may
  represent a problem on a 32bit architectures if the solib has an
  absurdly large size, like 3Gb or more.

- For the very same reason the efficient implementation of Android's
  dlopen_with_offset() is difficult, as in that case you'd need to
  map the entire file container, starting from the needed offset.
  The demo implementation in this patch implements dlopen_with_offset4()
  that has an additional "length" argument where the solib length
  should be passed.

- As linux doesn't implement MAP_UNALIGNED as some unixes did, the
  efficient implementation of dlopen_with_offset4() is difficult
  if the offset is not page-aligned. Demo in this example fixes the
  alignment by hands, using the MAP_SHARED|MAP_ANONYMOUS intermediate
  buffer. The alignment cannot be fixed in an existing buffer with
  memmove(), as that will make the file-backed mapping unacceptable
  for the use with dlmem(). I suspect that google's dlopen_with_offset()
  has similar limitation because mmap() with unaligned offset is
  not possible in any implementation, be it a "direct" implementation
  or "over-dlmem" implementation.


Changes in v10:
- addressed review comments of Adhemerval Zanella
- moved refactor patches to the beginning of the serie to simplify review
- fixed a few bugs in an elf relocation machinery after various hot discussions
- added a new test tst-dlmem-extfns that demo-implements dlopen_with_offset4()
  and fdlopen()
- studied and documented all limitations, most importantly those leading to UB
- better documented premap callback as suggested by Szabolcs Nagy
- added DLMEM_GENBUF_SRC flag for unaligned generic memory buffers

Changes in v9:
- use "zero-copy" machinery instead of memcpy(). It works on linux 5.13
  and newer, falling back to memcpy() otherwise. Suggested by Florian Weimer.
- implement fdlopen() using the above functionality. It is in a new test
  tst-dlmem-fdlopen. Suggested by Carlos O'Donell.
- add DLMEM_DONTREPLACE flag that doesn't replace the backing-store mapping.
  It switches back to memcpy(). Test-case is called tst-dlmem-shm.

Changes in v8:
- drop audit machinery and instead add an extra arg (optional pointer
  to a struct) to dlmem() itself that allows to install a custom premap
  callback or to specify nsid. Audit machinery was meant to allow
  controling over the pre-existing APIs like dlopen(), but if someone
  ever needs such extensions to dlopen(), he can trivially implement
  dlopen() on top of dlmem().

Changes in v7:
- add _dl_audit_premap audit extension and its usage example

Changes in v6:
- use __strdup("") for l_name as suggested by Andreas Schwab

Changes in v5:
- added _dl_audit_premap_dlmem audit extension for dlmem
- added tst-auditmod-dlmem.c test-case that feeds shm fd to dlmem()

Changes in v4:
- re-target to GLIBC_2.38
- add tst-auditdlmem.c test-case to test auditing
- drop length page-aligning in tst-dlmem: mmap() aligns length on its own
- bugfix: in do_mmapcpy() allow mmaps past end of buffer

Changes in v3:
- Changed prototype of dlmem() (and all the internal machinery) to
  use "const unsigned char *buffer" instead of "const char *buffer".

Changes in v2:
- use <support/test-driver.c> instead of "../test-skeleton.c"
- re-target to GLIBC_2.37
- update all libc.abilist files

-- 
2.37.2


  parent reply	other threads:[~2023-04-13 22:50 UTC|newest]

Thread overview: 107+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-18 16:50 Stas Sergeev
2023-03-18 16:50 ` [PATCH 01/13] elf: strdup() l_name if no realname [BZ #30100] Stas Sergeev
2023-03-29 13:54   ` Adhemerval Zanella Netto
2023-03-29 14:12     ` stsp
2023-03-29 14:19       ` Adhemerval Zanella Netto
2023-03-29 14:28         ` stsp
2023-03-29 14:30           ` Adhemerval Zanella Netto
2023-03-29 14:33             ` stsp
2023-03-18 16:50 ` [PATCH 02/13] elf: switch _dl_map_segment() to anonymous mapping Stas Sergeev
2023-03-29 17:01   ` Adhemerval Zanella Netto
2023-03-29 18:00     ` stsp
2023-03-29 18:29       ` Adhemerval Zanella Netto
2023-03-29 18:46         ` stsp
2023-03-29 19:17           ` Adhemerval Zanella Netto
2023-03-29 19:43             ` stsp
2023-03-18 16:51 ` [PATCH 03/13] elf: dont pass fd to _dl_process_pt_xx Stas Sergeev
2023-03-29 17:10   ` Adhemerval Zanella Netto
2023-03-30 16:08     ` stsp
2023-03-30 20:46       ` Adhemerval Zanella Netto
2023-03-31 12:02         ` Szabolcs Nagy
2023-03-31 12:54           ` Adhemerval Zanella Netto
2023-03-31 14:04             ` stsp
2023-03-18 16:51 ` [PATCH 04/13] elf: split _dl_map_object_from_fd() into reusable parts Stas Sergeev
2023-03-18 16:51 ` [PATCH 05/13] elf: split open_verify() " Stas Sergeev
2023-03-18 16:51 ` [PATCH 06/13] elf: load elf hdr fully in open_verify() Stas Sergeev
2023-03-18 16:51 ` [PATCH 07/13] elf: convert pread64 to callback in do_open_verify() Stas Sergeev
2023-03-18 16:51 ` [PATCH 08/13] elf: convert _dl_map_segments's mmap() to a callback Stas Sergeev
2023-03-18 16:51 ` [PATCH 09/13] elf: call _dl_map_segment() via premap callback Stas Sergeev
2023-03-18 16:51 ` [PATCH 10/13] elf: convert _dl_map_object to a callback Stas Sergeev
2023-03-18 16:51 ` [PATCH 11/13] elf: split _dl_check_loaded() from _dl_map_object Stas Sergeev
2023-03-18 16:51 ` [PATCH 12/13] dlfcn,elf: implement dlmem() [BZ #11767] Stas Sergeev
2023-03-29 13:45   ` Carlos O'Donell
2023-03-29 13:51     ` stsp
2023-03-29 14:10       ` Jonathon Anderson
2023-03-29 14:20         ` stsp
2023-03-29 14:31           ` Adhemerval Zanella Netto
2023-03-29 15:01             ` stsp
2023-03-29 14:35           ` Carlos O'Donell
2023-03-29 14:50             ` stsp
2023-03-29 15:20               ` Carlos O'Donell
2023-03-29 15:34                 ` stsp
2023-03-30  8:09         ` stsp
2023-03-18 16:51 ` [PATCH 13/13] dlfcn,elf: impl DLMEM_DONTREPLACE dlmem() flag Stas Sergeev
2023-03-29 12:32 ` [PATCH v9 0/13] implement dlmem() function Adhemerval Zanella Netto
2023-03-29 13:10   ` stsp
2023-03-29 13:18   ` stsp
2023-03-31 12:20     ` Szabolcs Nagy
2023-03-31 13:51       ` stsp
2023-03-31 14:49         ` Rich Felker
2023-03-31 14:56           ` stsp
2023-03-31 14:58             ` Rich Felker
2023-03-31 15:03               ` stsp
2023-03-31 14:44       ` stsp
2023-03-31 15:12       ` stsp
2023-03-31 17:12         ` Szabolcs Nagy
2023-03-31 17:36           ` stsp
2023-04-01  9:28             ` stsp
2023-04-03 10:04             ` Szabolcs Nagy
2023-04-03 10:43               ` stsp
2023-04-03 12:01                 ` Szabolcs Nagy
2023-04-03 13:07                   ` stsp
2023-04-05  7:29                   ` stsp
2023-04-05  8:51                     ` Szabolcs Nagy
2023-04-05  9:26                       ` stsp
2023-04-05  9:31                       ` Florian Weimer
2023-04-12 17:23                       ` stsp
2023-04-12 18:00                         ` stsp
2023-04-12 18:20                           ` Rich Felker
2023-04-12 18:46                             ` stsp
2023-04-12 19:52                               ` Zack Weinberg
2023-04-12 19:07                             ` stsp
2023-04-13 10:01                             ` stsp
2023-04-13 12:38                               ` Szabolcs Nagy
2023-04-13 15:59                                 ` stsp
2023-04-13 18:09                                   ` Adhemerval Zanella Netto
2023-04-13 18:59                                     ` stsp
2023-04-13 19:12                                       ` Adhemerval Zanella Netto
2023-04-13 19:29                                         ` stsp
2023-04-13 20:02                                           ` Adhemerval Zanella Netto
2023-04-13 20:21                                             ` stsp
2023-04-13 20:57                                             ` stsp
2023-04-14  7:07                                             ` stsp
2023-04-14  7:36                                             ` stsp
2023-04-14 11:30                                             ` stsp
2023-04-14 19:04                                             ` proof for dlmem() (Re: [PATCH v9 0/13] implement dlmem() function) stsp
2023-05-01 23:11                                               ` Zack Weinberg
2023-05-02  5:48                                                 ` stsp
2023-05-08 16:00                                                   ` stsp
2023-05-02  6:24                                                 ` stsp
2023-05-08 15:10                                 ` [PATCH v9 0/13] implement dlmem() function stsp
2023-03-31 18:47           ` stsp
2023-03-31 19:00             ` stsp
2023-03-29 13:17 ` Carlos O'Donell
2023-03-29 13:26   ` stsp
2023-03-29 17:03   ` stsp
2023-03-29 18:13     ` Carlos O'Donell
2023-03-29 18:29       ` stsp
2023-03-31 11:04       ` stsp
2023-04-13 21:17         ` Carlos O'Donell
2023-04-13 21:58           ` stsp
2023-04-13 22:08           ` stsp
2023-04-13 22:50           ` stsp [this message]
2023-04-14 16:15           ` Autoconf maintenance (extremely tangential to Re: [PATCH v9 0/13] implement dlmem() function) Zack Weinberg
2023-04-14 20:24             ` Carlos O'Donell
2023-04-14 20:40               ` Zack Weinberg
2023-05-08 15:05           ` [PATCH v9 0/13] implement dlmem() function stsp
2023-05-19  7:26           ` stsp

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aaabd8e5-1bab-4e06-3b38-86bb35d896c5@yandex.ru \
    --to=stsp2@yandex.ru \
    --cc=adhemerval.zanella@linaro.org \
    --cc=carlos@redhat.com \
    --cc=janderson@rice.edu \
    --cc=libc-alpha@sourceware.org \
    --cc=szabolcs.nagy@arm.com \
    --cc=zack@owlfolio.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).