Re: [PATCH v2] linux: Add support for getrandom vDSO

public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed

From: Florian Weimer <fweimer@redhat.com>
To: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: libc-alpha@sourceware.org,  zatrazz@gmail.com,  carlos@redhat.com
Subject: Re: [PATCH v2] linux: Add support for getrandom vDSO
Date: Thu, 01 Aug 2024 08:47:42 +0200	[thread overview]
Message-ID: <87ttg44u41.fsf@oldenburg.str.redhat.com> (raw)
In-Reply-To: <20240801021136.1322275-1-Jason@zx2c4.com> (Jason A. Donenfeld's message of "Thu, 1 Aug 2024 04:11:36 +0200")

* Jason A. Donenfeld:

> diff --git a/malloc/malloc.c b/malloc/malloc.c
> index bcb6e5b83c..9e577ab900 100644
> --- a/malloc/malloc.c
> +++ b/malloc/malloc.c
> @@ -3140,8 +3140,8 @@ static void
>  tcache_key_initialize (void)
>  {
>    /* We need to use the _nostatus version here, see BZ 29624.  */
> -  if (__getrandom_nocancel_nostatus (&tcache_key, sizeof(tcache_key),
> -				     GRND_NONBLOCK)
> +  if (__getrandom_nocancel_nostatus_direct (&tcache_key, sizeof(tcache_key),
> +					    GRND_NONBLOCK)
>        != sizeof (tcache_key))
>      {
>        tcache_key = random_bits ();

Sorry, the comment needs updating.

> diff --git a/sysdeps/unix/sysv/linux/getrandom.c b/sysdeps/unix/sysv/linux/getrandom.c
> index 777d1decf0..ba3092d7ca 100644
> --- a/sysdeps/unix/sysv/linux/getrandom.c
> +++ b/sysdeps/unix/sysv/linux/getrandom.c

> +static void *
> +vgetrandom_get_state (void)
> +{
> +  void *state = NULL;
> +
> +  __libc_lock_lock (grnd_alloc.lock);
> +
> +  if (grnd_alloc.len == 0)
> +    {
> +      struct vgetrandom_opaque_params params;
> +      size_t block_size, num = __get_nprocs (); /* Just a decent heuristic. */
> +      void *block;
> +
> +      if (GLRO (dl_vdso_getrandom) (NULL, 0, 0, &params, ~0UL))
> +        goto out;
> +      grnd_alloc.state_size = params.size_of_opaque_state;
> +
> +      block_size = ALIGN_PAGE (num * grnd_alloc.state_size);
> +      num = (GLRO (dl_pagesize) / grnd_alloc.state_size) *
> +            (block_size / GLRO (dl_pagesize));
> +      block = __mmap (0, block_size, params.mmap_prot,
> +                      params.mmap_flags, -1, 0);
> +      if (block == NULL)
> +        goto out;
> +      __set_vma_name (block, block_size, " glibc: getrandom");
> +
> +      if (grnd_alloc.total + num > grnd_alloc.cap)
> +        {
> +          void **states;
> +          size_t states_size = ALIGN_PAGE (sizeof(*grnd_alloc.states) *
> +                                           (grnd_alloc.total + num));
> +          if (grnd_alloc.states)
> +            states = __mremap (grnd_alloc.states,
> +                               ALIGN_PAGE (sizeof (*grnd_alloc.states) *
> +                                           grnd_alloc.cap),
> +                               states_size, MREMAP_MAYMOVE);

I think if we fork after this point (after the mremap), the new process
will no longer have the pointer array at grnd_alloc.states.  Currently,
it is not safe to simply reset grnd_alloc.lock in the subprocess.

It may be easier to just use a list of pages, so that a fork at the
wrong time merely loses track of the newly allocated page.

> +/* Re-add states of stale threads different from our own that have a
> +   getrandom_buf in use at the time of a fork().  */
> +static void
> +reset_other_threads (list_t *list)
> +{
> +  list_t *runp;
> +  list_for_each (runp, list)
> +    {
> +      struct pthread *t = list_entry (runp, struct pthread, list);
> +      if (t == THREAD_SELF || !t->tls_state.getrandom_buf)
> +        continue;
> +      grnd_alloc.states[grnd_alloc.len++]
> +          = RELEASE_PTR (t->tls_state.getrandom_buf);
> +      t->tls_state.getrandom_buf = NULL;
> +    }
> +}

I thought that the kernel would drop those mappings on fork?  Or do they
stick around and just get replaced with the zero page?

As I said before, the whole thing is rather suspicious.  We should not
need to walk the thread list after fork (or _Fork).  We know that only
one thread exists, and where all the getrandom states are.  We can put
all states except that of the current thread (if any) on the freelist.
This only needs the freelist itself to be handled in a fork-safe manner.

> +/* Called when a thread terminates, and adds its random buffer back into the
> +   allocator pool for use in a future thread.  */
> +void
> +__getrandom_vdso_release (void)
> +{
> +#ifdef HAVE_GETRANDOM_VSYSCALL
> +  void *state = __glibc_tls_internal ()->getrandom_buf;
> +  if (state == NULL)
> +    return;
> +
> +  __libc_lock_lock (grnd_alloc.lock);
> +  grnd_alloc.states[grnd_alloc.len++] = state;
> +  __libc_lock_unlock (grnd_alloc.lock);
> +#endif
> +}

At present, this has a race if a signal handler installs a state pointer
after the null check (see below for start_thread).

> diff --git a/sysdeps/unix/sysv/linux/getrandom_vdso.h b/sysdeps/unix/sysv/linux/getrandom_vdso.h
> new file mode 100644
> index 0000000000..d3130c1038
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/getrandom_vdso.h

> +struct vgetrandom_opaque_params
> +{
> +  uint32_t size_of_opaque_state;
> +  uint32_t mmap_prot;
> +  uint32_t mmap_flags;
> +  uint32_t reserved[13];
> +};

This struct is shared with the kernel, right?  Needs a comment (perhaps
pointing to kernel documentation).

> diff --git a/sysdeps/unix/sysv/linux/tls-internal-struct.h b/sysdeps/unix/sysv/linux/tls-internal-struct.h
> new file mode 100644
> index 0000000000..f8d561c987
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/tls-internal-struct.h

> +struct tls_internal_t
> +{
> +  char *strsignal_buf;
> +  char *strerror_l_buf;
> +  void *getrandom_buf;
> +};

The getrandom_buf member should go directly into struct pthread in
nptl/descr.h because it's Linux-specific.  We want to keep
tls_internal_t generic.

> diff --git a/sysdeps/unix/sysv/linux/tls-internal.c b/sysdeps/unix/sysv/linux/tls-internal.c
> index 81ebcdfbf8..886b55daf9 100644
> --- a/sysdeps/unix/sysv/linux/tls-internal.c
> +++ b/sysdeps/unix/sysv/linux/tls-internal.c

> @@ -25,4 +26,5 @@ __glibc_tls_internal_free (void)
>    struct pthread *self = THREAD_SELF;
>    free (self->tls_state.strsignal_buf);
>    free (self->tls_state.strerror_l_buf);
> +  call_function_static_weak (__getrandom_vdso_release);
>  }

The __glibc_tls_internal_free function is not called late enough, and
user code can still run at this point.  You need to call
__getrandom_vdso_release after signals are disabled in start_thread in
nptl/pthread_create.c.

We need to discuss what we should do about getrandom as a cancellation
point.  If no longer want it to be a cancellation point, that should be
changed first, in a separate patch, I think.

Thanks,
Florian

next prev parent reply	other threads:[~2024-08-01  6:50 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-30 13:24 [PATCH v1] " Jason A. Donenfeld
2024-07-30 20:16 ` Florian Weimer
2024-07-31 12:38   ` Adhemerval Zanella Netto
2024-08-01  1:47   ` Jason A. Donenfeld
2024-08-01  2:11     ` [PATCH v2] " Jason A. Donenfeld
2024-08-01  6:47       ` Florian Weimer [this message]
2024-08-19 19:35         ` Adhemerval Zanella Netto
2024-08-20  4:13           ` Jason A. Donenfeld
2024-08-02 17:32 ` [PATCH v1] " Cristian Rodríguez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ttg44u41.fsf@oldenburg.str.redhat.com \
    --to=fweimer@redhat.com \
    --cc=Jason@zx2c4.com \
    --cc=carlos@redhat.com \
    --cc=libc-alpha@sourceware.org \
    --cc=zatrazz@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).