public inbox for libc-alpha@sourceware.org
From: stsp <stsp2@yandex.ru>
To: Carlos O'Donell <carlos@redhat.com>,
	libc-alpha@sourceware.org, Jonathon Anderson <janderson@rice.edu>
Subject: Re: [PATCH v9 0/13] implement dlmem() function
Date: Fri, 31 Mar 2023 16:04:48 +0500
Message-ID: <09b644bc-3d6e-39cf-02c2-af5c5a72e248@yandex.ru>
In-Reply-To: <81d75bd3-147e-f85a-9955-0c7f0f2dfbeb@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 843 bytes --]

Hi Carlos, Jonathon.

29.03.2023 23:13, Carlos O'Donell wrote:
> The most important thing is the reasons for the change and that should come first.

Done, please see the attachment.

> This needs to explain the workload that requires the API.
>
> Why that workload is generic.
>
> Any core system library, like glibc, is not the place for *all* APIs, but the
> place for the most generic building blocks for ISO C, POSIX, BSD, GNU, and
> Linux APIs.

I hope I understood that suggestion correctly.
Please check whether my current description is
adequate.

And on the other front...
I studied and documented (in the attachment)
all the cases where my implementation fails to
lay out the ELF segments properly... I have to admit
such cases were possible. :(
I documented them and their mitigations in
the "Limitations" section.
Let me know if this is now adequate.

[-- Attachment #2: 0000-cover-letter.patch --]
[-- Type: text/x-patch, Size: 7080 bytes --]

From 46e5095ebfe63be4dcd813c4237d6a491a3f9768 Mon Sep 17 00:00:00 2001
From: Stas Sergeev <stsp2@yandex.ru>
Date: Mon, 13 Feb 2023 18:15:34 +0500
Subject: [PATCH v10 0/12] implement dlmem() function

This patch set implements the dlmem() function, which loads a shared
library (solib) from a page-aligned memory buffer. It serves as a building
block for functions like fdlopen() and dlopen_with_offset(), both of
which are demo-implemented in a test case called tst-dlmem-extfns.
It suits such file-based loaders well for the following reasons:
1. It correctly handles the file association of the original solib
   buffer if it was mmap'ed from a file.
2. It allows the caller to provide a solib name, which can be the file name.

With these properties, a "direct" implementation of these functions
has no advantage over implementing them on top of dlmem().

In addition, dlmem() has a lot of optional functionality for fine-grained
control over the loading process. It allows you to set the nsid (like
dlmopen()), specify the solib relocation address, and even relocate the
solib into the user's buffer. That "advanced" functionality is only needed
for very specific use cases, like virtualized environments where the
relocation address may have special constraints, e.g. MAP_32BIT. In all
other cases it is advised to set the "dlm_args" pointer of the dlmem() call
to NULL, but see "Limitations" below to find out when that is not the case.

The API looks as follows:

/* Callback for dlmem. */
typedef void *
(dlmem_premap_t) (void *mappref, size_t maplength, size_t mapalign,
	          void *cookie);

/* Do not replace mapping created by premap callback.
   dlmem() will then use memcpy(). */
#define DLMEM_DONTREPLACE 1

struct dlmem_args {
  /* Optional name to associate with the loaded object. */
  const char *soname;
  /* Namespace where to load the object. */
  Lmid_t nsid;
  /* dlmem-specific flags. */
  unsigned int flags;
  /* Optional premap callback. */
  dlmem_premap_t *premap;
  /* Optional argument for premap callback. */
  void *cookie;
};

/* Like `dlmopen', but loads shared object from memory buffer.  */
extern void *dlmem (const unsigned char *buffer, size_t size, int mode,
		    struct dlmem_args *dlm_args);
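
To illustrate how a file-based loader could sit on top of this prototype,
here is a minimal sketch of an fdlopen()-like helper. It is not the code
from tst-dlmem-extfns. The names fdlopen_sketch, dlmem_fn, fake_loader and
try_fdlopen_sketch are hypothetical. The loader is passed in as a function
pointer with the same shape as dlmem(), so the sketch can be tried without
a patched glibc; fake_loader is only a stand-in that checks the ELF magic.
The sketch also assumes the buffer may be unmapped once the loader returns.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* A forward declaration is enough, because the sketch only passes NULL. */
struct dlmem_args;

/* Hypothetical typedef with the same shape as the dlmem() prototype. */
typedef void *(*dlmem_fn) (const unsigned char *buffer, size_t size,
                           int mode, struct dlmem_args *dlm_args);

/* Map the whole file privately and hand the untouched mapping to the
   loader.  Returns the loader's handle, or NULL on failure. */
static void *
fdlopen_sketch (int fd, int mode, dlmem_fn loader)
{
  struct stat st;

  assert (loader != NULL);
  if (fstat (fd, &st) != 0 || st.st_size <= 0)
    return NULL;
  size_t len = (size_t) st.st_size;

  /* mmap() returns page-aligned memory.  The mapping is never written
     to or mprotect()ed, so it stays a pristine file-backed mapping. */
  void *buf = mmap (NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
  if (buf == MAP_FAILED)
    return NULL;

  void *handle = loader ((const unsigned char *) buf, len, mode, NULL);

  /* Assumption: the loader no longer needs the source buffer here. */
  munmap (buf, len);
  return handle;
}

/* Stand-in for dlmem(): accepts any buffer that starts with the ELF
   magic and returns a dummy non-NULL handle.  It loads nothing. */
static void *
fake_loader (const unsigned char *buffer, size_t size, int mode,
             struct dlmem_args *dlm_args)
{
  static int token;

  (void) mode;
  (void) dlm_args;
  if (size < 4 || memcmp (buffer, "\177ELF", 4) != 0)
    return NULL;
  return &token;
}

/* Returns 1 if the file at PATH was accepted by the stand-in loader.
   With a real dlmem() the mode would be e.g. RTLD_NOW instead of 0. */
static int
try_fdlopen_sketch (const char *path)
{
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return 0;
  void *handle = fdlopen_sketch (fd, 0, fake_loader);
  close (fd);
  return handle != NULL;
}
```

With the patch set applied, one would presumably pass dlmem itself instead
of fake_loader.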


In most cases dlm_args should just be set to NULL. It provides the
advanced functionality, most of which is self-explanatory (soname, nsid).
The premap callback allows setting the relocation address for the solib.
Moreover, if the DLMEM_DONTREPLACE flag is used, then the mapping
established by the premap callback will not be replaced with the
file-backed mapping. In that case dlmem() has to use memcpy(), which is
likely even faster than mmap() but doesn't end up with the proper
/proc/self/map_files or /proc/self/maps entries. So, for example, if the
premap callback uses MAP_SHARED, then with the DLMEM_DONTREPLACE flag you
can get your solib relocated into a shared memory buffer.
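
As an illustration of the shared-memory case, a premap callback might look
roughly like the sketch below. The name premap_shared is hypothetical, and
the callback contract is an assumption, since it is not spelled out here:
the sketch treats mappref as a hint it may ignore, expects mapalign to be a
power of two, returns the base of a writable region of at least maplength
bytes, and returns MAP_FAILED on error.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Has the shape of dlmem_premap_t.  Reserves a MAP_SHARED anonymous
   region whose start is aligned to MAPALIGN (assumed contract, see
   the text above). */
static void *
premap_shared (void *mappref, size_t maplength, size_t mapalign,
               void *cookie)
{
  (void) mappref;   /* Treated as a hint only (assumption). */
  (void) cookie;    /* Unused in this sketch. */
  assert (mapalign != 0 && (mapalign & (mapalign - 1)) == 0);

  size_t page = (size_t) sysconf (_SC_PAGESIZE);
  size_t align = mapalign < page ? page : mapalign;
  /* Round the length up to whole pages so the tail can be trimmed. */
  size_t len = (maplength + page - 1) & ~(page - 1);
  size_t total = len + align;

  /* Over-allocate, then keep only the aligned part.  The region is
     writable because dlmem() would memcpy() into it. */
  unsigned char *raw = mmap (NULL, total, PROT_READ | PROT_WRITE,
                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (raw == MAP_FAILED)
    return MAP_FAILED;

  uintptr_t start = ((uintptr_t) raw + align - 1) & ~((uintptr_t) align - 1);
  unsigned char *aligned = (unsigned char *) start;

  /* Give back the unused head and tail; both are page multiples. */
  size_t head = (size_t) (aligned - raw);
  if (head != 0)
    munmap (raw, head);
  size_t tail = total - head - len;
  if (tail != 0)
    munmap (aligned + len, tail);

  return aligned;
}
```

Such a callback would be placed in the premap field of struct dlmem_args
together with DLMEM_DONTREPLACE in the flags field.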

Limitations:

- If you need to load the solib from an anonymously-mapped buffer, you need
  to use the MAP_SHARED|MAP_ANONYMOUS mmap flags when creating that buffer.
  If that is not possible in your use case and the buffer was created
  with the MAP_PRIVATE|MAP_ANONYMOUS flags, then the DLMEM_DONTREPLACE flag
  needs to be set when calling dlmem().
  Failure to follow that guideline results in UB (the loader will not
  be able to properly lay out the ELF segments).

- If you use a private file-backed mapping, then it must not be
  modified by hand before it is passed to dlmem(). I.e. you can't apply
  mprotect() to it to change the protection bits, you can't apply
  memmove() to it to move the solib to the beginning of the buffer,
  and so on. dlmem() can only work with pristine private file-backed
  mappings. You can set the DLMEM_DONTREPLACE flag as a workaround if
  the mapping has already been modified.
  Failure to follow that guideline results in UB (the loader will not
  be able to properly lay out the ELF segments).

- The need to map the entire solib (with debug info etc.) may
  be a problem on 32-bit architectures if the solib is
  absurdly large, like 3 GB or more.

- For the very same reason, an efficient implementation of Android's
  dlopen_with_offset() is difficult, as in that case you'd need to
  map the entire container file, starting from the needed offset.
  The demo in this patch set implements dlopen_with_offset4() instead,
  which has an additional "length" argument in which the solib length
  should be passed.

- As Linux doesn't implement MAP_UNALIGNED as some Unixes did, an
  efficient implementation of dlopen_with_offset4() is difficult
  if the offset is not page-aligned. The demo in this patch set fixes the
  alignment by hand, using an intermediate MAP_SHARED|MAP_ANONYMOUS
  buffer. The alignment cannot be fixed in an existing buffer with
  memmove(), as that would make the file-backed mapping unacceptable
  for use with dlmem(). I suspect that Google's dlopen_with_offset()
  has a similar limitation, because mmap() with an unaligned offset is
  not possible in any implementation, be it a "direct" implementation
  or an "over-dlmem" implementation.
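
The intermediate-buffer approach from the first and last items above can be
sketched as follows. This is not the code from tst-dlmem-extfns; the names
stage_solib and stage_solib_from_path are hypothetical. The idea is to copy
the solib bytes from an arbitrary (possibly unaligned) file offset into a
fresh MAP_SHARED|MAP_ANONYMOUS buffer, which is page-aligned by construction
and has the flags that the first item asks for.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy LENGTH bytes found at OFFSET in FD into a new shared anonymous
   mapping.  Returns the page-aligned buffer, or MAP_FAILED on error. */
static void *
stage_solib (int fd, off_t offset, size_t length)
{
  if (length == 0)
    return MAP_FAILED;

  unsigned char *buf = mmap (NULL, length, PROT_READ | PROT_WRITE,
                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (buf == MAP_FAILED)
    return MAP_FAILED;

  /* pread() may return short counts, so loop until all bytes arrive. */
  size_t done = 0;
  while (done < length)
    {
      ssize_t n = pread (fd, buf + done, length - done,
                         offset + (off_t) done);
      if (n < 0 && errno == EINTR)
        continue;
      if (n <= 0)
        {
          /* Read error, or the file ended before LENGTH bytes. */
          munmap (buf, length);
          return MAP_FAILED;
        }
      done += (size_t) n;
    }
  return buf;
}

/* Convenience wrapper that opens PATH read-only first. */
static void *
stage_solib_from_path (const char *path, off_t offset, size_t length)
{
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return MAP_FAILED;
  void *buf = stage_solib (fd, offset, length);
  close (fd);
  return buf;
}
```

The returned buffer and the same length would then be passed to dlmem().
The price is one full copy of the solib, which is why this is a fallback
for the unaligned case rather than the efficient path.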

Changes in v10:
- addressed review comments of Adhemerval Zanella
- fixed a few bugs in the ELF relocation machinery after various heated discussions
- added a new test tst-dlmem-extfns that demo-implements dlopen_with_offset4()
  and fdlopen()
- studied and documented all limitations, most importantly those leading to UB

Changes in v9:
- use "zero-copy" machinery instead of memcpy(). It works on Linux 5.13
  and newer, falling back to memcpy() otherwise. Suggested by Florian Weimer.
- implement fdlopen() using the above functionality. It is in a new test
  tst-dlmem-fdlopen. Suggested by Carlos O'Donell.
- add DLMEM_DONTREPLACE flag that doesn't replace the backing-store mapping.
  It switches back to memcpy(). Test-case is called tst-dlmem-shm.

Changes in v8:
- drop the audit machinery and instead add an extra arg (an optional pointer
  to a struct) to dlmem() itself that allows installing a custom premap
  callback or specifying the nsid. The audit machinery was meant to allow
  control over pre-existing APIs like dlopen(), but if someone
  ever needs such extensions to dlopen(), they can trivially implement
  dlopen() on top of dlmem().

Changes in v7:
- add _dl_audit_premap audit extension and its usage example

Changes in v6:
- use __strdup("") for l_name as suggested by Andreas Schwab

Changes in v5:
- added _dl_audit_premap_dlmem audit extension for dlmem
- added tst-auditmod-dlmem.c test-case that feeds shm fd to dlmem()

Changes in v4:
- re-target to GLIBC_2.38
- add tst-auditdlmem.c test-case to test auditing
- drop length page-aligning in tst-dlmem: mmap() aligns length on its own
- bugfix: in do_mmapcpy() allow mmaps past end of buffer

Changes in v3:
- Changed prototype of dlmem() (and all the internal machinery) to
  use "const unsigned char *buffer" instead of "const char *buffer".

Changes in v2:
- use <support/test-driver.c> instead of "../test-skeleton.c"
- re-target to GLIBC_2.37
- update all libc.abilist files

-- 
2.37.2

