public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: "H.J. Lu" <hjl.tools@gmail.com>
To: Noah Goldstein <goldstein.w.n@gmail.com>
Cc: libc-alpha@sourceware.org, fweimer@redhat.com,
	 adhemerval.zanella@linaro.org
Subject: Re: [PATCH v8 2/2] x86: Update _dl_tlsdesc_dynamic to preserve caller-saved registers
Date: Sat, 24 Feb 2024 10:00:06 -0800	[thread overview]
Message-ID: <CAMe9rOrHOet5_24qGoZp77EMSoDU0FunugtUCi_A6WP7LO8Vgw@mail.gmail.com> (raw)
In-Reply-To: <CAFUsyfJMp3-RcXU6PZMDvR+8bYiR4Qr+epQp1+OkV3Rt7zbYVw@mail.gmail.com>

On Sat, Feb 24, 2024 at 9:39 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Sat, Feb 24, 2024 at 11:31 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Sat, Feb 24, 2024 at 9:09 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > >
> > > On Fri, Feb 16, 2024 at 9:17 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > >
> > > > Compiler generates the following instruction sequence for GNU2 dynamic
> > > > TLS access:
> > > >
> > > >         leaq    tls_var@TLSDESC(%rip), %rax
> > > >         call    *tls_var@TLSCALL(%rax)
> > > >
> > > > or
> > > >
> > > >         leal    tls_var@TLSDESC(%ebx), %eax
> > > >         call    *tls_var@TLSCALL(%eax)
> > > >
> > > > CALL instruction is transparent to compiler which assumes all registers,
> > > > except for EFLAGS and RAX/EAX, are unchanged after CALL.  When
> > > > _dl_tlsdesc_dynamic is called, it calls __tls_get_addr on the slow
> > > > path.  __tls_get_addr is a normal function which doesn't preserve any
> > > > caller-saved registers.  _dl_tlsdesc_dynamic saved and restored integer
> > > > caller-saved registers, but didn't preserve any other caller-saved
> > > > registers.  Add _dl_tlsdesc_dynamic IFUNC functions for FNSAVE, FXSAVE,
> > > > XSAVE and XSAVEC to save and restore all caller-saved registers.  This
> > > > fixes BZ #31372.
> > > >
> > > > Add GLRO(dl_x86_64_runtime_resolve) with GLRO(dl_x86_tlsdesc_dynamic)
> > > > to optimize elf_machine_runtime_setup.
> > > > ---
> > > >  elf/Makefile                                 |  14 ++
> > > >  elf/tst-gnu2-tls2.c                          | 120 ++++++++++++
> > > >  elf/tst-gnu2-tls2.h                          |  36 ++++
> > > >  elf/tst-gnu2-tls2mod0.c                      |  31 +++
> > > >  elf/tst-gnu2-tls2mod1.c                      |  31 +++
> > > >  elf/tst-gnu2-tls2mod2.c                      |  31 +++
> > > >  sysdeps/i386/dl-machine.h                    |   2 +-
> > > >  sysdeps/i386/dl-tlsdesc-dynamic.h            | 190 +++++++++++++++++++
> > > >  sysdeps/i386/dl-tlsdesc.S                    | 115 +++++------
> > > >  sysdeps/x86/Makefile                         |   7 +-
> > > >  sysdeps/x86/cpu-features.c                   |  56 +++++-
> > > >  sysdeps/x86/dl-procinfo.c                    |  16 ++
> > > >  sysdeps/{x86_64 => x86}/features-offsets.sym |   2 +
> > > >  sysdeps/x86/sysdep.h                         |   6 +
> > > >  sysdeps/x86/tst-gnu2-tls2.c                  |  20 ++
> > > >  sysdeps/x86_64/Makefile                      |   2 +-
> > > >  sysdeps/x86_64/dl-machine.h                  |  19 +-
> > > >  sysdeps/x86_64/dl-procinfo.c                 |  16 ++
> > > >  sysdeps/x86_64/dl-tlsdesc-dynamic.h          | 166 ++++++++++++++++
> > > >  sysdeps/x86_64/dl-tlsdesc.S                  | 108 ++++-------
> > > >  sysdeps/x86_64/dl-trampoline-save.h          |  34 ++++
> > > >  sysdeps/x86_64/dl-trampoline-state.h         |  51 +++++
> > > >  sysdeps/x86_64/dl-trampoline.S               |  20 +-
> > > >  sysdeps/x86_64/dl-trampoline.h               |  34 +---
> > > >  24 files changed, 914 insertions(+), 213 deletions(-)
> > > >  create mode 100644 elf/tst-gnu2-tls2.c
> > > >  create mode 100644 elf/tst-gnu2-tls2.h
> > > >  create mode 100644 elf/tst-gnu2-tls2mod0.c
> > > >  create mode 100644 elf/tst-gnu2-tls2mod1.c
> > > >  create mode 100644 elf/tst-gnu2-tls2mod2.c
> > > >  create mode 100644 sysdeps/i386/dl-tlsdesc-dynamic.h
> > > >  rename sysdeps/{x86_64 => x86}/features-offsets.sym (89%)
> > > >  create mode 100644 sysdeps/x86/tst-gnu2-tls2.c
> > > >  create mode 100644 sysdeps/x86_64/dl-tlsdesc-dynamic.h
> > > >  create mode 100644 sysdeps/x86_64/dl-trampoline-save.h
> > > >  create mode 100644 sysdeps/x86_64/dl-trampoline-state.h
> > > >
> > > > diff --git a/elf/Makefile b/elf/Makefile
> > > > index 5d78b659ce..030db4d207 100644
> > > > --- a/elf/Makefile
> > > > +++ b/elf/Makefile
> > > > @@ -424,6 +424,7 @@ tests += \
> > > >    tst-glibc-hwcaps-prepend \
> > > >    tst-global1 \
> > > >    tst-global2 \
> > > > +  tst-gnu2-tls2 \
> > > >    tst-initfinilazyfail \
> > > >    tst-initorder \
> > > >    tst-initorder2 \
> > > > @@ -846,6 +847,9 @@ modules-names += \
> > > >    tst-filterobj-flt \
> > > >    tst-finilazyfailmod \
> > > >    tst-globalmod2 \
> > > > +  tst-gnu2-tls2mod0 \
> > > > +  tst-gnu2-tls2mod1 \
> > > > +  tst-gnu2-tls2mod2 \
> > > >    tst-initlazyfailmod \
> > > >    tst-initorder2a \
> > > >    tst-initorder2b \
> > > > @@ -3044,8 +3048,18 @@ $(objpfx)tst-tlsgap.out: \
> > > >    $(objpfx)tst-tlsgap-mod0.so \
> > > >    $(objpfx)tst-tlsgap-mod1.so \
> > > >    $(objpfx)tst-tlsgap-mod2.so
> > > > +
> > > > +$(objpfx)tst-gnu2-tls2: $(shared-thread-library)
> > > > +$(objpfx)tst-gnu2-tls2.out: \
> > > > +  $(objpfx)tst-gnu2-tls2mod0.so \
> > > > +  $(objpfx)tst-gnu2-tls2mod1.so \
> > > > +  $(objpfx)tst-gnu2-tls2mod2.so
> > > > +
> > > >  ifeq (yes,$(have-mtls-dialect-gnu2))
> > > >  CFLAGS-tst-tlsgap-mod0.c += -mtls-dialect=gnu2
> > > >  CFLAGS-tst-tlsgap-mod1.c += -mtls-dialect=gnu2
> > > >  CFLAGS-tst-tlsgap-mod2.c += -mtls-dialect=gnu2
> > > > +CFLAGS-tst-gnu2-tls2mod0.c += -mtls-dialect=gnu2
> > > > +CFLAGS-tst-gnu2-tls2mod1.c += -mtls-dialect=gnu2
> > > > +CFLAGS-tst-gnu2-tls2mod2.c += -mtls-dialect=gnu2
> > > >  endif
> > > > diff --git a/elf/tst-gnu2-tls2.c b/elf/tst-gnu2-tls2.c
> > > > new file mode 100644
> > > > index 0000000000..8039ba614d
> > > > --- /dev/null
> > > > +++ b/elf/tst-gnu2-tls2.c
> > > > @@ -0,0 +1,120 @@
> > > > +/* Test TLSDESC relocation.
> > > > +   Copyright (C) 2024 Free Software Foundation, Inc.
> > > > +   This file is part of the GNU C Library.
> > > > +
> > > > +   The GNU C Library is free software; you can redistribute it and/or
> > > > +   modify it under the terms of the GNU Lesser General Public
> > > > +   License as published by the Free Software Foundation; either
> > > > +   version 2.1 of the License, or (at your option) any later version.
> > > > +
> > > > +   The GNU C Library is distributed in the hope that it will be useful,
> > > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > > +   Lesser General Public License for more details.
> > > > +
> > > > +   You should have received a copy of the GNU Lesser General Public
> > > > +   License along with the GNU C Library; if not, see
> > > > +   <http://www.gnu.org/licenses/>.  */
> > > > +
> > > > +#include <stdio.h>
> > > > +#include <stdlib.h>
> > > > +#include <string.h>
> > > > +#include <dlfcn.h>
> > > > +#include <pthread.h>
> > > > +#include <support/xdlfcn.h>
> > > > +#include <support/xthread.h>
> > > > +#include <support/check.h>
> > > > +#include <support/test-driver.h>
> > > > +#include "tst-gnu2-tls2.h"
> > > > +
> > > > +#ifndef IS_SUPPORTED
> > > > +# define IS_SUPPORTED() true
> > > > +#endif
> > > > +
> > > > +/* An architecture can define it to clobber caller-saved registers in
> > > > +   malloc below to verify that the implicit TLSDESC call won't change
> > > > +   caller-saved registers.  */
> > > > +#ifndef PREPARE_MALLOC
> > > > +# define PREPARE_MALLOC()
> > > > +#endif
> > > > +
> > > > +extern void * __libc_malloc (size_t);
> > > > +
> > > > +size_t malloc_counter = 0;
> > > > +
> > > > +void *
> > > > +malloc (size_t n)
> > > > +{
> > > > +  PREPARE_MALLOC ();
> > > > +  malloc_counter++;
> > > > +  return __libc_malloc (n);
> > > > +}
> > > > +
> > > > +static void *mod[3];
> > > > +#define MOD(i) "tst-gnu2-tls2mod" #i ".so"
> > > > +static const char *modname[3] = { MOD(0), MOD(1), MOD(2) };
> > > > +#undef MOD
> > > > +
> > > > +static void
> > > > +open_mod (int i)
> > > > +{
> > > > +  mod[i] = xdlopen (modname[i], RTLD_LAZY);
> > > > +  printf ("open %s\n", modname[i]);
> > > > +}
> > > > +
> > > > +static void
> > > > +close_mod (int i)
> > > > +{
> > > > +  xdlclose (mod[i]);
> > > > +  mod[i] = NULL;
> > > > +  printf ("close %s\n", modname[i]);
> > > > +}
> > > > +
> > > > +static void
> > > > +access_mod (int i, const char *sym)
> > > > +{
> > > > +  struct tls var = { -1, -1, -1, -1 };
> > > > +  struct tls *(*f) (struct tls *) = xdlsym (mod[i], sym);
> > > > +  /* Check that our malloc is called.  */
> > > > +  malloc_counter = 0;
> > > > +  struct tls *p = f (&var);
> > > > +  TEST_VERIFY (malloc_counter != 0);
> > > > +  printf ("access %s: %s() = %p\n", modname[i], sym, p);
> > > > +  TEST_VERIFY_EXIT (memcmp (p, &var, sizeof (var)) == 0);
> > > > +  ++(p->a);
> > > > +}
> > > > +
> > > > +static void *
> > > > +start (void *arg)
> > > > +{
> > > > +  /* The DTV generation is at the last dlopen of mod0 and the
> > > > +     entry for mod1 is NULL.  */
> > > > +
> > > > +  open_mod (1); /* Reuse modid of mod1. Uses dynamic TLS.  */
> > > > +
> > > > +  /* Force the slow path in GNU2 TLS descriptor call.  */
> > > > +  access_mod (1, "apply_tls");
> > > > +
> > > > +  return arg;
> > > > +}
> > > > +
> > > > +static int
> > > > +do_test (void)
> > > > +{
> > > > +  if (!IS_SUPPORTED ())
> > > > +    return EXIT_UNSUPPORTED;
> > > > +
> > > > +  open_mod (0);
> > > > +  open_mod (1);
> > > > +  open_mod (2);
> > > > +  close_mod (0);
> > > > +  close_mod (1); /* Create modid gap at mod1.  */
> > > > +  open_mod (0); /* Reuse modid of mod0, bump generation count.  */
> > > > +
> > > > +  /* Create a thread where DTV of mod1 is NULL.  */
> > > > +  pthread_t t = xpthread_create (NULL, start, NULL);
> > > > +  xpthread_join (t);
> > > > +  return 0;
> > > > +}
> > > > +
> > > > +#include <support/test-driver.c>
> > >
> > > The change looks good but this is still failing on arm.
> > >
> > > ```
> > > FAIL: elf/tst-gnu2-tls2
> > > original exit status 1
> > > open tst-gnu2-tls2mod0.so
> > > open tst-gnu2-tls2mod1.so
> > > open tst-gnu2-tls2mod2.so
> > > close tst-gnu2-tls2mod0.so
> > > close tst-gnu2-tls2mod1.so
> > > open tst-gnu2-tls2mod0.so
> > > open tst-gnu2-tls2mod1.so
> > > Didn't expect signal from child: got `Segmentation fault'
> > > ```
> > >
> > > HJ can you add some more logging to its clear exactly where
> > > the fault is? (looks to be malloc or xdlsym).
> >
> > The new test may fail on targets which don't preserve all
> > caller-saved registers in _dl_tlsdesc_dynamic.
> >
> > > Are there any other arch this is failing on? I.e is this an arm
> > > bug or buggy test?
> > >
> >
> > It is a bug on arm:
> >
> > https://github.com/zatrazz/glibc/commits/azanella/tls-descriptor-fixes-arm/
>
> Is anyone working on a fix for arm?

There is a patch for arm in the URL above:

https://github.com/zatrazz/glibc/commit/acdac4ff394e82ddf348c0eb03a8235ac0f67f7c

-- 
H.J.

  reply	other threads:[~2024-02-24 18:00 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-16 15:17 [PATCH v8 0/2] " H.J. Lu
2024-02-16 15:17 ` [PATCH v8 1/2] x86-64: Save APX registers in ld.so trampoline H.J. Lu
2024-02-24 19:01   ` Noah Goldstein
2024-03-08 20:09     ` H.J. Lu
2024-03-09 17:39       ` Noah Goldstein
2024-02-16 15:17 ` [PATCH v8 2/2] x86: Update _dl_tlsdesc_dynamic to preserve caller-saved registers H.J. Lu
2024-02-24 17:09   ` Noah Goldstein
2024-02-24 17:30     ` H.J. Lu
2024-02-24 17:39       ` Noah Goldstein
2024-02-24 18:00         ` H.J. Lu [this message]
2024-02-24 18:45           ` Noah Goldstein
2024-02-24 18:52             ` H.J. Lu
2024-02-24 18:59               ` Noah Goldstein
2024-02-24 19:00   ` Noah Goldstein
2024-02-24 19:10     ` H.J. Lu
2024-02-22 20:24 ` PING: [PATCH v8 0/2] " H.J. Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAMe9rOrHOet5_24qGoZp77EMSoDU0FunugtUCi_A6WP7LO8Vgw@mail.gmail.com \
    --to=hjl.tools@gmail.com \
    --cc=adhemerval.zanella@linaro.org \
    --cc=fweimer@redhat.com \
    --cc=goldstein.w.n@gmail.com \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).