public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: stsp <stsp2@yandex.ru>
To: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>,
	Szabolcs Nagy <szabolcs.nagy@arm.com>,
	Rich Felker <dalias@libc.org>
Cc: libc-alpha@sourceware.org, janderson@rice.edu,
	Carlos O'Donell <carlos@redhat.com>
Subject: Re: [PATCH v9 0/13] implement dlmem() function
Date: Thu, 13 Apr 2023 23:59:03 +0500	[thread overview]
Message-ID: <59862084-0fe3-7642-d3b3-01bb87eef7db@yandex.ru> (raw)
In-Reply-To: <83ee7b42-7a50-e8d1-e9ca-58ec2a12a995@linaro.org>

[-- Attachment #1: Type: text/plain, Size: 4419 bytes --]

Hi Adhemerval,

13.04.2023 23:09, Adhemerval Zanella Netto пишет:
> It is being tiring to work with your proposal because Rich already
> brought the inherent issue of exposing the loader internals for ELF
> objects [1] about 10 years ago,

My proposal did not exist 10 years ago.
Maybe we all do not properly document
our proposals or an objections. So let me
ask you to please refer to a particular
comment rather than to some 10 years
ago thread, and then carefully document
how can that be the problem in my patches.

I think its a fair ask, e.g. Szabolcs asked
me for a more precise spec - I write. We
progress. Why can't you write the objections
in a detalization level that is enough to
make a progress?

I admit you probably couldn't do that initially,
because I poorly documented my API. But
when Carlos asked for a more detailed spec,
I did. Now you can express your objections
in a detailed manner, let me even attach
the prev API draft to help you doing that.
Of course this draft will be simplified a lot,
but for such a "generic" objections please
use its current form.

>   Szabolcs and myself are constantly
> trying to say that is not good libc interface (regardless it solved
> your specific problem),

And regardless that it was never even discussed.
Part of the failure is on me: I wrote an API draft
only when Carlos asked, I should probably do that
before anything else. Sorry, I am a novice. But its
written now for quite long, so why nothing changes?


>   and even Jonathan tried to explore different
> alternatives.

I am thankful to Jonathon for doing that.
Unfortunately I am unsatisfied with the proposal
of intercepting glibc's internal mmap()s on a
syscall level, and binding a non-trivial (I mean,
very non-trivial!) logic on them. But I am thankful
he proved this is even possible, as I was confident
in an otherwise.

> The main problem is no one from glibc is comfortable, including myself,
> to maintain and work with this interface.

Which no one, besides Szabolcs, have actually
even looked into? It is attached to this e-mail.
Why not to make the complains direct and precise?


>   I don't want to maintain
> another interface with a lot of corner cases.

I have expressed the plans at removing all the
corner cases that Szabolcs pointed to. If you
point more, I'll definitely take an actions.
Off-loading the biggest part to RTLD_NORELOC
will reduce the proposal considerably, avoid
the callback and most of other complexity, so
why such a prejudice? Why can't we just discuss,
amend, see what happens?

> Also the way approaching the glibc community, by not listening to
> criticism and ignoring multiple remarks;

I wonder if Szabolcs also thinks I ignore his
remarks. If you think I ignore yours, then either
that was a misunderstanding on my part, or
they were not referring to my API proposal
(which is attached), or something else from that
line, but definitely not my reluctance to change
the proposal. I want to address every possible
problem raised, but unfortunately so far I only
see the problems raised by Szabolcs.
He details them well and targets over my API
proposals. Its immediately obvious what does
he mean, and after his last review I spent a
full month (!) just thinking how to get rid of
a call-back, and finally got to this RTLD_NORELOC
proposal as a result.


>   saying that you have addresses
> the comments without any ack;

That's a bit strange. Yes, I do some actions
based on a comments I get, then post the
new RFC and write in a log that I addressed
this and that. If there is a further complain,
i can re-do what I thought to be addressed,
and replace it with something I believe now
addresses the comment. Its an iterative
process, misunderstanding happens, etc.
Of course if the policy is to not write "addressed
this and that comment" before getting an
ACK, then I will no longer do that, and write
"tried to address this and that, hope its now better".


>   or flooding the maillist with multiple
> version without addressing previous feedback is really, and I want to
> stress the really here, tiring.

Ok, should I post the next drafts to the bugzilla
then, if flooding ML is bad?
Its not a problem, really!
You tell me what to do, I do.
Why such a policy things are even becoming
a problems? I didn't mean to offend anyone
by flooding the ML, I'll post to bugzilla, and I
don't want you to be offended on me just because
of that. :)

[-- Attachment #2: 0000-cover-letter.patch --]
[-- Type: text/x-patch, Size: 8231 bytes --]

From 46e5095ebfe63be4dcd813c4237d6a491a3f9768 Mon Sep 17 00:00:00 2001
From: Stas Sergeev <stsp2@yandex.ru>
Date: Mon, 13 Feb 2023 18:15:34 +0500
Subject: [PATCH v10 0/12] implement dlmem() function

This patch-set implements the dlmem() function that allows to load
the solib from the file-mapped or anonymously-shared memory buffer.
Generic memory buffer is also supported with DLMEM_GENBUF_SRC flag,
but it should not be preferred. Private anonymous mapping used as
a source buffer, also needs this flag to be set.

dlmem() suits as a building block for implementing the functions like
fdlopen() and dlopen_with_offset(), which are demo-implemented in a
test-case called tst-dlmem-extfns.
The reasons why it suits well for such file-based loaders, are below:
1. It correctly handles the file association of the original solib
   buffer if it was mmap'ed from a file (unless DLMEM_GENBUF_SRC
   of DLMEM_DONTREPLACE flag is set).
2. It allows to provide a solib name, which can be the file name.

With the above properties, the "direct" implementation of these functions
gives no advantages over implementing them with dlmem().

In addition, dlmem() has lots of optional functionality for the fine-grained
control over the loading process. It allows you to set nsid (like dlmopen()),
specify the solib relocation address and even relocate the solib into
the user's buffer. That "advanced" functionality is only needed for the
very specific use-cases, like virtualized environments where the relocation
address may have a special constraints, eg MAP_32BIT. In all other cases
it is advised to set the "dlm_args" pointer of dlmem() call to NULL, but
see "Limitations" below to find out when its not the case.

The API looks as below:

/* Callback for dlmem. */
typedef void *
(dlmem_premap_t) (void *mappref, size_t maplength, size_t mapalign,
	          void *cookie);

/* Do not replace mapping created by premap callback.
   dlmem() will then use memcpy(). */
#define DLMEM_DONTREPLACE 1
/* Treat source memory buffer as a generic unaligned buffer, rather
   than a file-backed or anonymously-shared mapping. Anonymous private
   mapping also needs this flag to be set. */
#define DLMEM_GENBUF_SRC 2

struct dlmem_args {
  /* Optional name to associate with the loaded object. */
  const char *soname;
  /* Namespace where to load the object. */
  Lmid_t nsid;
  /* dlmem-specific flags. */
  unsigned int flags;
  /* Optional premap callback. */
  dlmem_premap_t *premap;
  /* Optional argument for premap callback. */
  void *cookie;
};

/* Like `dlmopen', but loads shared object from memory buffer.  */
extern void *dlmem (const unsigned char *buffer, size_t size, int mode,
		    struct dlmem_args *dlm_args);

Advanced functionality:

In most cases dlm_args should just be set to NULL. It provides the
advanced functionality, most of which is obvious (soname, nsid).
The optional premap callback allows to set the relocation address for
the solib by mapping the destination space and returning its address.
More so, if DLMEM_DONTREPLACE flag is used, then the mapping
established by the premap callback, will not be replaced with the
file-backed mapping. In that case dlmem() have to use memcpy(), which
is likely even faster than mmaps() but doesn't end up with the proper
/proc/self/map_files or /proc/self/maps entries. So for example if the
premap callback uses MAP_SHARED, then with the use of the DLMEM_DONTREPLACE
flag you can get your solib relocated into a shared memory buffer.
Note that the premap callback may be called under glibc locks, so it
should restrict itself to the syscall functionality. Certainly no libdl
functions can be used. mmap(), shm_open(), open(), ftruncate(),
memfd_create() should be the sufficient set of functions that may
ever be in use in a premap callback, but in most cases the callback
should just be disabled.


Limitations:

- If you need to load the solib from anonymously-mapped buffer, you need
  to use MAP_SHARED|MAP_ANONYMOUS mmap flags when creating that buffer.
  If it is not possible in your use-case and the buffer was created
  with MAP_PRIVATE|MAP_ANONYMOUS flags, then DLMEM_GENBUF_SRC flag
  needs to be set when calling dlmem().
  Failure to follow that guide-line results in an UB (loader will not
  be able to properly lay out an elf segments).

- If you use a private file-backed mapping, then it shouldn't be
  modified by hands before passing to dlmem(). I.e. you can't apply
  mprotect() to it to change protection bits, and you can't apply
  memmove() to it to move the solib to the beginning of the buffer,
  and so on. dlmem() can only work with "virgin" private file-backed
  mappings. You can set DLMEM_GENBUF_SRC flag as a work-around if
  the mapping is already corrupted.
  Failure to follow that guide-line results in an UB (loader will not
  be able to properly lay out an elf segments).

- The need of mapping the entire solib (with debug info etc) may
  represent a problem on a 32bit architectures if the solib has an
  absurdly large size, like 3Gb or more.

- For the very same reason the efficient implementation of Android's
  dlopen_with_offset() is difficult, as in that case you'd need to
  map the entire file container, starting from the needed offset.
  The demo implementation in this patch implements dlopen_with_offset4()
  that has an additional "length" argument where the solib length
  should be passed.

- As linux doesn't implement MAP_UNALIGNED as some unixes did, the
  efficient implementation of dlopen_with_offset4() is difficult
  if the offset is not page-aligned. Demo in this example fixes the
  alignment by hands, using the MAP_SHARED|MAP_ANONYMOUS intermediate
  buffer. The alignment cannot be fixed in an existing buffer with
  memmove(), as that will make the file-backed mapping unacceptable
  for the use with dlmem(). I suspect that google's dlopen_with_offset()
  has similar limitation because mmap() with unaligned offset is
  not possible in any implementation, be it a "direct" implementation
  or "over-dlmem" implementation.


Changes in v10:
- addressed review comments of Adhemerval Zanella
- moved refactor patches to the beginning of the serie to simplify review
- fixed a few bugs in an elf relocation machinery after various hot discussions
- added a new test tst-dlmem-extfns that demo-implements dlopen_with_offset4()
  and fdlopen()
- studied and documented all limitations, most importantly those leading to UB
- better documented premap callback as suggested by Szabolcs Nagy
- added DLMEM_GENBUF_SRC flag for unaligned generic memory buffers

Changes in v9:
- use "zero-copy" machinery instead of memcpy(). It works on linux 5.13
  and newer, falling back to memcpy() otherwise. Suggested by Florian Weimer.
- implement fdlopen() using the above functionality. It is in a new test
  tst-dlmem-fdlopen. Suggested by Carlos O'Donell.
- add DLMEM_DONTREPLACE flag that doesn't replace the backing-store mapping.
  It switches back to memcpy(). Test-case is called tst-dlmem-shm.

Changes in v8:
- drop audit machinery and instead add an extra arg (optional pointer
  to a struct) to dlmem() itself that allows to install a custom premap
  callback or to specify nsid. Audit machinery was meant to allow
  controling over the pre-existing APIs like dlopen(), but if someone
  ever needs such extensions to dlopen(), he can trivially implement
  dlopen() on top of dlmem().

Changes in v7:
- add _dl_audit_premap audit extension and its usage example

Changes in v6:
- use __strdup("") for l_name as suggested by Andreas Schwab

Changes in v5:
- added _dl_audit_premap_dlmem audit extension for dlmem
- added tst-auditmod-dlmem.c test-case that feeds shm fd to dlmem()

Changes in v4:
- re-target to GLIBC_2.38
- add tst-auditdlmem.c test-case to test auditing
- drop length page-aligning in tst-dlmem: mmap() aligns length on its own
- bugfix: in do_mmapcpy() allow mmaps past end of buffer

Changes in v3:
- Changed prototype of dlmem() (and all the internal machinery) to
  use "const unsigned char *buffer" instead of "const char *buffer".

Changes in v2:
- use <support/test-driver.c> instead of "../test-skeleton.c"
- re-target to GLIBC_2.37
- update all libc.abilist files

-- 
2.37.2


  reply	other threads:[~2023-04-13 18:59 UTC|newest]

Thread overview: 107+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-18 16:50 Stas Sergeev
2023-03-18 16:50 ` [PATCH 01/13] elf: strdup() l_name if no realname [BZ #30100] Stas Sergeev
2023-03-29 13:54   ` Adhemerval Zanella Netto
2023-03-29 14:12     ` stsp
2023-03-29 14:19       ` Adhemerval Zanella Netto
2023-03-29 14:28         ` stsp
2023-03-29 14:30           ` Adhemerval Zanella Netto
2023-03-29 14:33             ` stsp
2023-03-18 16:50 ` [PATCH 02/13] elf: switch _dl_map_segment() to anonymous mapping Stas Sergeev
2023-03-29 17:01   ` Adhemerval Zanella Netto
2023-03-29 18:00     ` stsp
2023-03-29 18:29       ` Adhemerval Zanella Netto
2023-03-29 18:46         ` stsp
2023-03-29 19:17           ` Adhemerval Zanella Netto
2023-03-29 19:43             ` stsp
2023-03-18 16:51 ` [PATCH 03/13] elf: dont pass fd to _dl_process_pt_xx Stas Sergeev
2023-03-29 17:10   ` Adhemerval Zanella Netto
2023-03-30 16:08     ` stsp
2023-03-30 20:46       ` Adhemerval Zanella Netto
2023-03-31 12:02         ` Szabolcs Nagy
2023-03-31 12:54           ` Adhemerval Zanella Netto
2023-03-31 14:04             ` stsp
2023-03-18 16:51 ` [PATCH 04/13] elf: split _dl_map_object_from_fd() into reusable parts Stas Sergeev
2023-03-18 16:51 ` [PATCH 05/13] elf: split open_verify() " Stas Sergeev
2023-03-18 16:51 ` [PATCH 06/13] elf: load elf hdr fully in open_verify() Stas Sergeev
2023-03-18 16:51 ` [PATCH 07/13] elf: convert pread64 to callback in do_open_verify() Stas Sergeev
2023-03-18 16:51 ` [PATCH 08/13] elf: convert _dl_map_segments's mmap() to a callback Stas Sergeev
2023-03-18 16:51 ` [PATCH 09/13] elf: call _dl_map_segment() via premap callback Stas Sergeev
2023-03-18 16:51 ` [PATCH 10/13] elf: convert _dl_map_object to a callback Stas Sergeev
2023-03-18 16:51 ` [PATCH 11/13] elf: split _dl_check_loaded() from _dl_map_object Stas Sergeev
2023-03-18 16:51 ` [PATCH 12/13] dlfcn,elf: implement dlmem() [BZ #11767] Stas Sergeev
2023-03-29 13:45   ` Carlos O'Donell
2023-03-29 13:51     ` stsp
2023-03-29 14:10       ` Jonathon Anderson
2023-03-29 14:20         ` stsp
2023-03-29 14:31           ` Adhemerval Zanella Netto
2023-03-29 15:01             ` stsp
2023-03-29 14:35           ` Carlos O'Donell
2023-03-29 14:50             ` stsp
2023-03-29 15:20               ` Carlos O'Donell
2023-03-29 15:34                 ` stsp
2023-03-30  8:09         ` stsp
2023-03-18 16:51 ` [PATCH 13/13] dlfcn,elf: impl DLMEM_DONTREPLACE dlmem() flag Stas Sergeev
2023-03-29 12:32 ` [PATCH v9 0/13] implement dlmem() function Adhemerval Zanella Netto
2023-03-29 13:10   ` stsp
2023-03-29 13:18   ` stsp
2023-03-31 12:20     ` Szabolcs Nagy
2023-03-31 13:51       ` stsp
2023-03-31 14:49         ` Rich Felker
2023-03-31 14:56           ` stsp
2023-03-31 14:58             ` Rich Felker
2023-03-31 15:03               ` stsp
2023-03-31 14:44       ` stsp
2023-03-31 15:12       ` stsp
2023-03-31 17:12         ` Szabolcs Nagy
2023-03-31 17:36           ` stsp
2023-04-01  9:28             ` stsp
2023-04-03 10:04             ` Szabolcs Nagy
2023-04-03 10:43               ` stsp
2023-04-03 12:01                 ` Szabolcs Nagy
2023-04-03 13:07                   ` stsp
2023-04-05  7:29                   ` stsp
2023-04-05  8:51                     ` Szabolcs Nagy
2023-04-05  9:26                       ` stsp
2023-04-05  9:31                       ` Florian Weimer
2023-04-12 17:23                       ` stsp
2023-04-12 18:00                         ` stsp
2023-04-12 18:20                           ` Rich Felker
2023-04-12 18:46                             ` stsp
2023-04-12 19:52                               ` Zack Weinberg
2023-04-12 19:07                             ` stsp
2023-04-13 10:01                             ` stsp
2023-04-13 12:38                               ` Szabolcs Nagy
2023-04-13 15:59                                 ` stsp
2023-04-13 18:09                                   ` Adhemerval Zanella Netto
2023-04-13 18:59                                     ` stsp [this message]
2023-04-13 19:12                                       ` Adhemerval Zanella Netto
2023-04-13 19:29                                         ` stsp
2023-04-13 20:02                                           ` Adhemerval Zanella Netto
2023-04-13 20:21                                             ` stsp
2023-04-13 20:57                                             ` stsp
2023-04-14  7:07                                             ` stsp
2023-04-14  7:36                                             ` stsp
2023-04-14 11:30                                             ` stsp
2023-04-14 19:04                                             ` proof for dlmem() (Re: [PATCH v9 0/13] implement dlmem() function) stsp
2023-05-01 23:11                                               ` Zack Weinberg
2023-05-02  5:48                                                 ` stsp
2023-05-08 16:00                                                   ` stsp
2023-05-02  6:24                                                 ` stsp
2023-05-08 15:10                                 ` [PATCH v9 0/13] implement dlmem() function stsp
2023-03-31 18:47           ` stsp
2023-03-31 19:00             ` stsp
2023-03-29 13:17 ` Carlos O'Donell
2023-03-29 13:26   ` stsp
2023-03-29 17:03   ` stsp
2023-03-29 18:13     ` Carlos O'Donell
2023-03-29 18:29       ` stsp
2023-03-31 11:04       ` stsp
2023-04-13 21:17         ` Carlos O'Donell
2023-04-13 21:58           ` stsp
2023-04-13 22:08           ` stsp
2023-04-13 22:50           ` stsp
2023-04-14 16:15           ` Autoconf maintenance (extremely tangential to Re: [PATCH v9 0/13] implement dlmem() function) Zack Weinberg
2023-04-14 20:24             ` Carlos O'Donell
2023-04-14 20:40               ` Zack Weinberg
2023-05-08 15:05           ` [PATCH v9 0/13] implement dlmem() function stsp
2023-05-19  7:26           ` stsp

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=59862084-0fe3-7642-d3b3-01bb87eef7db@yandex.ru \
    --to=stsp2@yandex.ru \
    --cc=adhemerval.zanella@linaro.org \
    --cc=carlos@redhat.com \
    --cc=dalias@libc.org \
    --cc=janderson@rice.edu \
    --cc=libc-alpha@sourceware.org \
    --cc=szabolcs.nagy@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).