public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
To: Florian Weimer <fweimer@redhat.com>,
	"Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: libc-alpha@sourceware.org, zatrazz@gmail.com, carlos@redhat.com
Subject: Re: [PATCH v2] linux: Add support for getrandom vDSO
Date: Mon, 19 Aug 2024 16:35:51 -0300	[thread overview]
Message-ID: <087e3370-7ef3-444d-bd02-1edadf9c6ab7@linaro.org> (raw)
In-Reply-To: <87ttg44u41.fsf@oldenburg.str.redhat.com>



On 01/08/24 03:47, Florian Weimer wrote:
> * Jason A. Donenfeld:
> 
>> diff --git a/malloc/malloc.c b/malloc/malloc.c
>> index bcb6e5b83c..9e577ab900 100644
>> --- a/malloc/malloc.c
>> +++ b/malloc/malloc.c
>> @@ -3140,8 +3140,8 @@ static void
>>  tcache_key_initialize (void)
>>  {
>>    /* We need to use the _nostatus version here, see BZ 29624.  */
>> -  if (__getrandom_nocancel_nostatus (&tcache_key, sizeof(tcache_key),
>> -				     GRND_NONBLOCK)
>> +  if (__getrandom_nocancel_nostatus_direct (&tcache_key, sizeof(tcache_key),
>> +					    GRND_NONBLOCK)
>>        != sizeof (tcache_key))
>>      {
>>        tcache_key = random_bits ();
> 
> Sorry, the comment needs updating.

Ack.

> 
>> diff --git a/sysdeps/unix/sysv/linux/getrandom.c b/sysdeps/unix/sysv/linux/getrandom.c
>> index 777d1decf0..ba3092d7ca 100644
>> --- a/sysdeps/unix/sysv/linux/getrandom.c
>> +++ b/sysdeps/unix/sysv/linux/getrandom.c
> 
>> +static void *
>> +vgetrandom_get_state (void)
>> +{
>> +  void *state = NULL;
>> +
>> +  __libc_lock_lock (grnd_alloc.lock);
>> +
>> +  if (grnd_alloc.len == 0)
>> +    {
>> +      struct vgetrandom_opaque_params params;
>> +      size_t block_size, num = __get_nprocs (); /* Just a decent heuristic. */
>> +      void *block;
>> +
>> +      if (GLRO (dl_vdso_getrandom) (NULL, 0, 0, &params, ~0UL))
>> +        goto out;
>> +      grnd_alloc.state_size = params.size_of_opaque_state;
>> +
>> +      block_size = ALIGN_PAGE (num * grnd_alloc.state_size);
>> +      num = (GLRO (dl_pagesize) / grnd_alloc.state_size) *
>> +            (block_size / GLRO (dl_pagesize));
>> +      block = __mmap (0, block_size, params.mmap_prot,
>> +                      params.mmap_flags, -1, 0);
>> +      if (block == NULL)
>> +        goto out;
>> +      __set_vma_name (block, block_size, " glibc: getrandom");
>> +
>> +      if (grnd_alloc.total + num > grnd_alloc.cap)
>> +        {
>> +          void **states;
>> +          size_t states_size = ALIGN_PAGE (sizeof(*grnd_alloc.states) *
>> +                                           (grnd_alloc.total + num));
>> +          if (grnd_alloc.states)
>> +            states = __mremap (grnd_alloc.states,
>> +                               ALIGN_PAGE (sizeof (*grnd_alloc.states) *
>> +                                           grnd_alloc.cap),
>> +                               states_size, MREMAP_MAYMOVE);
> 
> I think if we fork after this point (after the mremap), the new process
> will no longer have the pointer array at grnd_alloc.states.  Currently,
> it is not safe to simply reset grnd_alloc.lock in the subprocess.
> 
> It may be easier to just use a list of pages, so that a fork at the
> wrong time merely loses track of the newly allocated page.

The list of pages would have similar issues where it would need to put
the address somewhere, either in the free-state itself (which adds
extra complexity in the free-state tracking) or by adding a static array
to track it.

Instead I think that since the idea it to keep the getrandom call 
*consistent* after fork and not really avoid possible resource leakage
(simiar to the thread stack cache), reallocate the free list with
mmap, atomically replace the pointer, and munmap will be suffice.

> 
>> +/* Re-add states of stale threads different from our own that have a
>> +   getrandom_buf in use at the time of a fork().  */
>> +static void
>> +reset_other_threads (list_t *list)
>> +{
>> +  list_t *runp;
>> +  list_for_each (runp, list)
>> +    {
>> +      struct pthread *t = list_entry (runp, struct pthread, list);
>> +      if (t == THREAD_SELF || !t->tls_state.getrandom_buf)
>> +        continue;
>> +      grnd_alloc.states[grnd_alloc.len++]
>> +          = RELEASE_PTR (t->tls_state.getrandom_buf);
>> +      t->tls_state.getrandom_buf = NULL;
>> +    }
>> +}
> 
> I thought that the kernel would drop those mappings on fork?  Or do they
> stick around and just get replaced with the zero page?

The kernel does not drop, afaik MAP_DROPPABLE (used by getrandom vDSO
pages) follows usual mmap semantic regarding fork().

> 
> As I said before, the whole thing is rather suspicious.  We should not
> need to walk the thread list after fork (or _Fork).  We know that only
> one thread exists, and where all the getrandom states are.  We can put
> all states except that of the current thread (if any) on the freelist.
> This only needs the freelist itself to be handled in a fork-safe manner.

We do not track the *pages* allocated for the opaque states because they
will be either in the free states list (grnd_alloc.states) or allocated
to a running thread.  So on fork(), it is way simpler to just iterate
over the thread cache than tracking the pages.

I have move the logic to reclaim the getrandom opaque states to
reclaim_stacks, so it is the only place where fork() iterates over the
list.

> 
>> +/* Called when a thread terminates, and adds its random buffer back into the
>> +   allocator pool for use in a future thread.  */
>> +void
>> +__getrandom_vdso_release (void)
>> +{
>> +#ifdef HAVE_GETRANDOM_VSYSCALL
>> +  void *state = __glibc_tls_internal ()->getrandom_buf;
>> +  if (state == NULL)
>> +    return;
>> +
>> +  __libc_lock_lock (grnd_alloc.lock);
>> +  grnd_alloc.states[grnd_alloc.len++] = state;
>> +  __libc_lock_unlock (grnd_alloc.lock);
>> +#endif
>> +}
> 
> At present, this has a race if a signal handler installs a state pointer
> after the null check (see below for start_thread).
> 
>> diff --git a/sysdeps/unix/sysv/linux/getrandom_vdso.h b/sysdeps/unix/sysv/linux/getrandom_vdso.h
>> new file mode 100644
>> index 0000000000..d3130c1038
>> --- /dev/null
>> +++ b/sysdeps/unix/sysv/linux/getrandom_vdso.h
> 
>> +struct vgetrandom_opaque_params
>> +{
>> +  uint32_t size_of_opaque_state;
>> +  uint32_t mmap_prot;
>> +  uint32_t mmap_flags;
>> +  uint32_t reserved[13];
>> +};
> 
> This struct is shared with the kernel, right?  Needs a comment (perhaps
> pointing to kernel documentation).

Ack.

> 
>> diff --git a/sysdeps/unix/sysv/linux/tls-internal-struct.h b/sysdeps/unix/sysv/linux/tls-internal-struct.h
>> new file mode 100644
>> index 0000000000..f8d561c987
>> --- /dev/null
>> +++ b/sysdeps/unix/sysv/linux/tls-internal-struct.h
> 
>> +struct tls_internal_t
>> +{
>> +  char *strsignal_buf;
>> +  char *strerror_l_buf;
>> +  void *getrandom_buf;
>> +};
> 
> The getrandom_buf member should go directly into struct pthread in
> nptl/descr.h because it's Linux-specific.  We want to keep
> tls_internal_t generic.

Ack.

> 
>> diff --git a/sysdeps/unix/sysv/linux/tls-internal.c b/sysdeps/unix/sysv/linux/tls-internal.c
>> index 81ebcdfbf8..886b55daf9 100644
>> --- a/sysdeps/unix/sysv/linux/tls-internal.c
>> +++ b/sysdeps/unix/sysv/linux/tls-internal.c
> 
>> @@ -25,4 +26,5 @@ __glibc_tls_internal_free (void)
>>    struct pthread *self = THREAD_SELF;
>>    free (self->tls_state.strsignal_buf);
>>    free (self->tls_state.strerror_l_buf);
>> +  call_function_static_weak (__getrandom_vdso_release);
>>  }
> 
> The __glibc_tls_internal_free function is not called late enough, and
> user code can still run at this point.  You need to call
> __getrandom_vdso_release after signals are disabled in start_thread in
> nptl/pthread_create.c.

Ack.

> 
> We need to discuss what we should do about getrandom as a cancellation
> point.  If no longer want it to be a cancellation point, that should be
> changed first, in a separate patch, I think.

Indeed, the vDSO currently always issues the syscall if it can not use
the opaque state for any reason (invalid size, RNG is not ready, reentracy
protection, etc) and this breaks the SYSCALL_CANCEL macro.

I don't have a strong opinion here, maybe a better option would to just
avoid calling the syscall on the vDSO (but this would make it differ from
other vDSO symbols).  Jason, what do you think?  Would current getrandom
implementation block?

  reply	other threads:[~2024-08-19 19:35 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-30 13:24 [PATCH v1] " Jason A. Donenfeld
2024-07-30 20:16 ` Florian Weimer
2024-07-31 12:38   ` Adhemerval Zanella Netto
2024-08-01  1:47   ` Jason A. Donenfeld
2024-08-01  2:11     ` [PATCH v2] " Jason A. Donenfeld
2024-08-01  6:47       ` Florian Weimer
2024-08-19 19:35         ` Adhemerval Zanella Netto [this message]
2024-08-20  4:13           ` Jason A. Donenfeld
2024-08-02 17:32 ` [PATCH v1] " Cristian Rodríguez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=087e3370-7ef3-444d-bd02-1edadf9c6ab7@linaro.org \
    --to=adhemerval.zanella@linaro.org \
    --cc=Jason@zx2c4.com \
    --cc=carlos@redhat.com \
    --cc=fweimer@redhat.com \
    --cc=libc-alpha@sourceware.org \
    --cc=zatrazz@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).