From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oa1-x32.google.com (mail-oa1-x32.google.com [IPv6:2001:4860:4864:20::32]) by sourceware.org (Postfix) with ESMTPS id 7C273385841B for ; Sat, 24 Feb 2024 18:59:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7C273385841B Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 7C273385841B Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:4860:4864:20::32 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1708801160; cv=none; b=N8XBz9XUJ1UKnmw76pQu91lbGjlTYXNeOTSkctjuplHY7n3R7ZfmmryTDifDIMyd3L1dSB84dFis6m2ti8yCjJvJ07HJB4J6E+R7zEA2ut2ShKNJR5R4voCZZbsuap4lAF1Ms4fsg9m9hx7DjAcby2UkiAVZHboLl7ywLZc66TE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1708801160; c=relaxed/simple; bh=vtZZKPvfplKQpztJIwp4maAks8ZPx0vSlxU4sy4+1Ng=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=JBbMuNd3Z0MN1CZ7+8CvUjlEfvJOAjv+OjPXcJlc/kOK4cDHkWXIS4ykuZ7NVu0opcz1LQ1Q8yoC/kG3+SJ9PQEPDxehIx5vfCaHq0N6JV7hOcs3wtJMjB8zs29QmR0R+H4VBsJf4JFMUM+rriQEMKEMgxifY+d8xRS+g4EUg60= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oa1-x32.google.com with SMTP id 586e51a60fabf-21fe1bc5fc4so149751fac.1 for ; Sat, 24 Feb 2024 10:59:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1708801157; x=1709405957; darn=sourceware.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=tWpuI+U3KYoESbQVwbM6DswsBuprxX9m9N+4UBP3LAU=; b=D4KVq9uH+wXcuG4+pmp7hyJSv59ScN+Od1yeBy+36LcVEwksvTqqzryHtrK6NLsrB5 yF4OejrM5TuwwwSQ76sg2QQLHLa4EJE2AAPESBgIhcpEsNLKrKRhy5gordYCQED5RczH CryEvnBPd0NjlRApyjlVbnEAS+q/QhGCK5uGcQFErB5UFKc4RTTO8iGKIqI9C3WurCcD 2UgP7ZGheIzuUwVAjW5iv3vIEyaa+5qf59FdMampHQBuStv6dZ6xkdzK8z5E1ei3FtWs oHTNAoYmUILy+TwYV7hTIYuuPJMLsq6aCsfslEqfMcs8G2fOzAhpZnxtiWK4GK2qsqka n26A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1708801157; x=1709405957; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tWpuI+U3KYoESbQVwbM6DswsBuprxX9m9N+4UBP3LAU=; b=qQtsmJusmINTVll10+jqR+52NaoQeyVKzVU5RQpa1dJnpGspfsEGgYXSWhRtlCSLa8 qnvYTZvPgYXE/bljC7iVYRp4HobfNtGgx7gKa20N4PADzk3xLiQf6Va0GsL5FnbPJgAE WmGbyVZn0W/BUt2EtgyrdxGaVX1o+rNXzpko8gjwHJ3ivy1gumDGDq6sbvVjLqqD/BRo VPEd03RZp7dN3dxsTujAhN4Ky0GHV+mwMWCj4gF68MHzeyLM9oohey49mD/wiZIvc8Sd wSRLt1GASubsXjZ9R8NT2LReNtyQs7A265eBCyk0fcc8wHjAmAE6+DCoQU7zeNt2lcbG 0npg== X-Gm-Message-State: AOJu0YxJIpPVtkyZkPqUHcoC3e6rFygwmulhimJUYG0Xz8tH0wUjB6Bq YxUineeHBIZ7BDJv0ocKOk8kRwtzRTdGljnfrNjfvXHhLQUb8P2FHY83Uumzc1h6+2kfo48zDbz DXOYwyzm/9x0KyLeAX3El3VZxdNc= X-Google-Smtp-Source: AGHT+IFUkkVPnyK/RGn8pqUDRSaROt7CjokHIB/XWRXY5DR+lUbW23zbGyduvRstfK9SP5w0UNf6qSozELKEmUM2dhc= X-Received: by 2002:a05:6870:4724:b0:21e:5bb9:5d7a with SMTP id b36-20020a056870472400b0021e5bb95d7amr1363382oaq.27.1708801156772; Sat, 24 Feb 2024 10:59:16 -0800 (PST) MIME-Version: 1.0 References: <20240216151711.2742988-1-hjl.tools@gmail.com> <20240216151711.2742988-3-hjl.tools@gmail.com> In-Reply-To: From: Noah Goldstein Date: Sat, 24 Feb 2024 12:59:05 -0600 Message-ID: Subject: Re: [PATCH v8 2/2] x86: Update _dl_tlsdesc_dynamic to preserve caller-saved registers To: "H.J. Lu" Cc: libc-alpha@sourceware.org, fweimer@redhat.com, adhemerval.zanella@linaro.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-9.3 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Sat, Feb 24, 2024 at 12:52=E2=80=AFPM H.J. Lu wrot= e: > > On Sat, Feb 24, 2024 at 10:45=E2=80=AFAM Noah Goldstein wrote: > > > > On Sat, Feb 24, 2024 at 12:00=E2=80=AFPM H.J. Lu = wrote: > > > > > > On Sat, Feb 24, 2024 at 9:39=E2=80=AFAM Noah Goldstein wrote: > > > > > > > > On Sat, Feb 24, 2024 at 11:31=E2=80=AFAM H.J. Lu wrote: > > > > > > > > > > On Sat, Feb 24, 2024 at 9:09=E2=80=AFAM Noah Goldstein wrote: > > > > > > > > > > > > On Fri, Feb 16, 2024 at 9:17=E2=80=AFAM H.J. Lu wrote: > > > > > > > > > > > > > > Compiler generates the following instruction sequence for GNU= 2 dynamic > > > > > > > TLS access: > > > > > > > > > > > > > > leaq tls_var@TLSDESC(%rip), %rax > > > > > > > call *tls_var@TLSCALL(%rax) > > > > > > > > > > > > > > or > > > > > > > > > > > > > > leal tls_var@TLSDESC(%ebx), %eax > > > > > > > call *tls_var@TLSCALL(%eax) > > > > > > > > > > > > > > CALL instruction is transparent to compiler which assumes all= registers, > > > > > > > except for EFLAGS and RAX/EAX, are unchanged after CALL. Whe= n > > > > > > > _dl_tlsdesc_dynamic is called, it calls __tls_get_addr on the= slow > > > > > > > path. __tls_get_addr is a normal function which doesn't pres= erve any > > > > > > > caller-saved registers. _dl_tlsdesc_dynamic saved and restor= ed integer > > > > > > > caller-saved registers, but didn't preserve any other caller-= saved > > > > > > > registers. Add _dl_tlsdesc_dynamic IFUNC functions for FNSAV= E, FXSAVE, > > > > > > > XSAVE and XSAVEC to save and restore all caller-saved registe= rs. This > > > > > > > fixes BZ #31372. > > > > > > > > > > > > > > Add GLRO(dl_x86_64_runtime_resolve) with GLRO(dl_x86_tlsdesc_= dynamic) > > > > > > > to optimize elf_machine_runtime_setup. > > > > > > > --- > > > > > > > elf/Makefile | 14 ++ > > > > > > > elf/tst-gnu2-tls2.c | 120 +++++++++= +++ > > > > > > > elf/tst-gnu2-tls2.h | 36 ++++ > > > > > > > elf/tst-gnu2-tls2mod0.c | 31 +++ > > > > > > > elf/tst-gnu2-tls2mod1.c | 31 +++ > > > > > > > elf/tst-gnu2-tls2mod2.c | 31 +++ > > > > > > > sysdeps/i386/dl-machine.h | 2 +- > > > > > > > sysdeps/i386/dl-tlsdesc-dynamic.h | 190 +++++++++= ++++++++++ > > > > > > > sysdeps/i386/dl-tlsdesc.S | 115 +++++----= -- > > > > > > > sysdeps/x86/Makefile | 7 +- > > > > > > > sysdeps/x86/cpu-features.c | 56 +++++- > > > > > > > sysdeps/x86/dl-procinfo.c | 16 ++ > > > > > > > sysdeps/{x86_64 =3D> x86}/features-offsets.sym | 2 + > > > > > > > sysdeps/x86/sysdep.h | 6 + > > > > > > > sysdeps/x86/tst-gnu2-tls2.c | 20 ++ > > > > > > > sysdeps/x86_64/Makefile | 2 +- > > > > > > > sysdeps/x86_64/dl-machine.h | 19 +- > > > > > > > sysdeps/x86_64/dl-procinfo.c | 16 ++ > > > > > > > sysdeps/x86_64/dl-tlsdesc-dynamic.h | 166 +++++++++= +++++++ > > > > > > > sysdeps/x86_64/dl-tlsdesc.S | 108 ++++-----= -- > > > > > > > sysdeps/x86_64/dl-trampoline-save.h | 34 ++++ > > > > > > > sysdeps/x86_64/dl-trampoline-state.h | 51 +++++ > > > > > > > sysdeps/x86_64/dl-trampoline.S | 20 +- > > > > > > > sysdeps/x86_64/dl-trampoline.h | 34 +--- > > > > > > > 24 files changed, 914 insertions(+), 213 deletions(-) > > > > > > > create mode 100644 elf/tst-gnu2-tls2.c > > > > > > > create mode 100644 elf/tst-gnu2-tls2.h > > > > > > > create mode 100644 elf/tst-gnu2-tls2mod0.c > > > > > > > create mode 100644 elf/tst-gnu2-tls2mod1.c > > > > > > > create mode 100644 elf/tst-gnu2-tls2mod2.c > > > > > > > create mode 100644 sysdeps/i386/dl-tlsdesc-dynamic.h > > > > > > > rename sysdeps/{x86_64 =3D> x86}/features-offsets.sym (89%) > > > > > > > create mode 100644 sysdeps/x86/tst-gnu2-tls2.c > > > > > > > create mode 100644 sysdeps/x86_64/dl-tlsdesc-dynamic.h > > > > > > > create mode 100644 sysdeps/x86_64/dl-trampoline-save.h > > > > > > > create mode 100644 sysdeps/x86_64/dl-trampoline-state.h > > > > > > > > > > > > > > diff --git a/elf/Makefile b/elf/Makefile > > > > > > > index 5d78b659ce..030db4d207 100644 > > > > > > > --- a/elf/Makefile > > > > > > > +++ b/elf/Makefile > > > > > > > @@ -424,6 +424,7 @@ tests +=3D \ > > > > > > > tst-glibc-hwcaps-prepend \ > > > > > > > tst-global1 \ > > > > > > > tst-global2 \ > > > > > > > + tst-gnu2-tls2 \ > > > > > > > tst-initfinilazyfail \ > > > > > > > tst-initorder \ > > > > > > > tst-initorder2 \ > > > > > > > @@ -846,6 +847,9 @@ modules-names +=3D \ > > > > > > > tst-filterobj-flt \ > > > > > > > tst-finilazyfailmod \ > > > > > > > tst-globalmod2 \ > > > > > > > + tst-gnu2-tls2mod0 \ > > > > > > > + tst-gnu2-tls2mod1 \ > > > > > > > + tst-gnu2-tls2mod2 \ > > > > > > > tst-initlazyfailmod \ > > > > > > > tst-initorder2a \ > > > > > > > tst-initorder2b \ > > > > > > > @@ -3044,8 +3048,18 @@ $(objpfx)tst-tlsgap.out: \ > > > > > > > $(objpfx)tst-tlsgap-mod0.so \ > > > > > > > $(objpfx)tst-tlsgap-mod1.so \ > > > > > > > $(objpfx)tst-tlsgap-mod2.so > > > > > > > + > > > > > > > +$(objpfx)tst-gnu2-tls2: $(shared-thread-library) > > > > > > > +$(objpfx)tst-gnu2-tls2.out: \ > > > > > > > + $(objpfx)tst-gnu2-tls2mod0.so \ > > > > > > > + $(objpfx)tst-gnu2-tls2mod1.so \ > > > > > > > + $(objpfx)tst-gnu2-tls2mod2.so > > > > > > > + > > > > > > > ifeq (yes,$(have-mtls-dialect-gnu2)) > > > > > > > CFLAGS-tst-tlsgap-mod0.c +=3D -mtls-dialect=3Dgnu2 > > > > > > > CFLAGS-tst-tlsgap-mod1.c +=3D -mtls-dialect=3Dgnu2 > > > > > > > CFLAGS-tst-tlsgap-mod2.c +=3D -mtls-dialect=3Dgnu2 > > > > > > > +CFLAGS-tst-gnu2-tls2mod0.c +=3D -mtls-dialect=3Dgnu2 > > > > > > > +CFLAGS-tst-gnu2-tls2mod1.c +=3D -mtls-dialect=3Dgnu2 > > > > > > > +CFLAGS-tst-gnu2-tls2mod2.c +=3D -mtls-dialect=3Dgnu2 > > > > > > > endif > > > > > > > diff --git a/elf/tst-gnu2-tls2.c b/elf/tst-gnu2-tls2.c > > > > > > > new file mode 100644 > > > > > > > index 0000000000..8039ba614d > > > > > > > --- /dev/null > > > > > > > +++ b/elf/tst-gnu2-tls2.c > > > > > > > @@ -0,0 +1,120 @@ > > > > > > > +/* Test TLSDESC relocation. > > > > > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > > > > > + This file is part of the GNU C Library. > > > > > > > + > > > > > > > + The GNU C Library is free software; you can redistribute = it and/or > > > > > > > + modify it under the terms of the GNU Lesser General Publi= c > > > > > > > + License as published by the Free Software Foundation; eit= her > > > > > > > + version 2.1 of the License, or (at your option) any later= version. > > > > > > > + > > > > > > > + The GNU C Library is distributed in the hope that it will= be useful, > > > > > > > + but WITHOUT ANY WARRANTY; without even the implied warran= ty of > > > > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See= the GNU > > > > > > > + Lesser General Public License for more details. > > > > > > > + > > > > > > > + You should have received a copy of the GNU Lesser General= Public > > > > > > > + License along with the GNU C Library; if not, see > > > > > > > + . */ > > > > > > > + > > > > > > > +#include > > > > > > > +#include > > > > > > > +#include > > > > > > > +#include > > > > > > > +#include > > > > > > > +#include > > > > > > > +#include > > > > > > > +#include > > > > > > > +#include > > > > > > > +#include "tst-gnu2-tls2.h" > > > > > > > + > > > > > > > +#ifndef IS_SUPPORTED > > > > > > > +# define IS_SUPPORTED() true > > > > > > > +#endif > > > > > > > + > > > > > > > +/* An architecture can define it to clobber caller-saved reg= isters in > > > > > > > + malloc below to verify that the implicit TLSDESC call won= 't change > > > > > > > + caller-saved registers. */ > > > > > > > +#ifndef PREPARE_MALLOC > > > > > > > +# define PREPARE_MALLOC() > > > > > > > +#endif > > > > > > > + > > > > > > > +extern void * __libc_malloc (size_t); > > > > > > > + > > > > > > > +size_t malloc_counter =3D 0; > > > > > > > + > > > > > > > +void * > > > > > > > +malloc (size_t n) > > > > > > > +{ > > > > > > > + PREPARE_MALLOC (); > > > > > > > + malloc_counter++; > > > > > > > + return __libc_malloc (n); > > > > > > > +} > > > > > > > + > > > > > > > +static void *mod[3]; > > > > > > > +#define MOD(i) "tst-gnu2-tls2mod" #i ".so" > > > > > > > +static const char *modname[3] =3D { MOD(0), MOD(1), MOD(2) }= ; > > > > > > > +#undef MOD > > > > > > > + > > > > > > > +static void > > > > > > > +open_mod (int i) > > > > > > > +{ > > > > > > > + mod[i] =3D xdlopen (modname[i], RTLD_LAZY); > > > > > > > + printf ("open %s\n", modname[i]); > > > > > > > +} > > > > > > > + > > > > > > > +static void > > > > > > > +close_mod (int i) > > > > > > > +{ > > > > > > > + xdlclose (mod[i]); > > > > > > > + mod[i] =3D NULL; > > > > > > > + printf ("close %s\n", modname[i]); > > > > > > > +} > > > > > > > + > > > > > > > +static void > > > > > > > +access_mod (int i, const char *sym) > > > > > > > +{ > > > > > > > + struct tls var =3D { -1, -1, -1, -1 }; > > > > > > > + struct tls *(*f) (struct tls *) =3D xdlsym (mod[i], sym); > > > > > > > + /* Check that our malloc is called. */ > > > > > > > + malloc_counter =3D 0; > > > > > > > + struct tls *p =3D f (&var); > > > > > > > + TEST_VERIFY (malloc_counter !=3D 0); > > > > > > > + printf ("access %s: %s() =3D %p\n", modname[i], sym, p); > > > > > > > + TEST_VERIFY_EXIT (memcmp (p, &var, sizeof (var)) =3D=3D 0)= ; > > > > > > > + ++(p->a); > > > > > > > +} > > > > > > > + > > > > > > > +static void * > > > > > > > +start (void *arg) > > > > > > > +{ > > > > > > > + /* The DTV generation is at the last dlopen of mod0 and th= e > > > > > > > + entry for mod1 is NULL. */ > > > > > > > + > > > > > > > + open_mod (1); /* Reuse modid of mod1. Uses dynamic TLS. *= / > > > > > > > + > > > > > > > + /* Force the slow path in GNU2 TLS descriptor call. */ > > > > > > > + access_mod (1, "apply_tls"); > > > > > > > + > > > > > > > + return arg; > > > > > > > +} > > > > > > > + > > > > > > > +static int > > > > > > > +do_test (void) > > > > > > > +{ > > > > > > > + if (!IS_SUPPORTED ()) > > > > > > > + return EXIT_UNSUPPORTED; > > > > > > > + > > > > > > > + open_mod (0); > > > > > > > + open_mod (1); > > > > > > > + open_mod (2); > > > > > > > + close_mod (0); > > > > > > > + close_mod (1); /* Create modid gap at mod1. */ > > > > > > > + open_mod (0); /* Reuse modid of mod0, bump generation coun= t. */ > > > > > > > + > > > > > > > + /* Create a thread where DTV of mod1 is NULL. */ > > > > > > > + pthread_t t =3D xpthread_create (NULL, start, NULL); > > > > > > > + xpthread_join (t); > > > > > > > + return 0; > > > > > > > +} > > > > > > > + > > > > > > > +#include > > > > > > > > > > > > The change looks good but this is still failing on arm. > > > > > > > > > > > > ``` > > > > > > FAIL: elf/tst-gnu2-tls2 > > > > > > original exit status 1 > > > > > > open tst-gnu2-tls2mod0.so > > > > > > open tst-gnu2-tls2mod1.so > > > > > > open tst-gnu2-tls2mod2.so > > > > > > close tst-gnu2-tls2mod0.so > > > > > > close tst-gnu2-tls2mod1.so > > > > > > open tst-gnu2-tls2mod0.so > > > > > > open tst-gnu2-tls2mod1.so > > > > > > Didn't expect signal from child: got `Segmentation fault' > > > > > > ``` > > > > > > > > > > > > HJ can you add some more logging to its clear exactly where > > > > > > the fault is? (looks to be malloc or xdlsym). > > > > > > > > > > The new test may fail on targets which don't preserve all > > > > > caller-saved registers in _dl_tlsdesc_dynamic. > > > > > > > > > > > Are there any other arch this is failing on? I.e is this an arm > > > > > > bug or buggy test? > > > > > > > > > > > > > > > > It is a bug on arm: > > > > > > > > > > https://github.com/zatrazz/glibc/commits/azanella/tls-descriptor-= fixes-arm/ > > > > > > > > Is anyone working on a fix for arm? > > > > > > There is a patch for arm in the URL above: > > > > > > https://github.com/zatrazz/glibc/commit/acdac4ff394e82ddf348c0eb03a82= 35ac0f67f7c > > > > > > > Can you re-submit the patch to make the CI happy? > > That is Adhemerval's patch. I prefer letting him decide what > to do. My patch doesn't change any target files. I can make > the new testcase x86 specific. But I don't believe that we should > make CI happy in this particular case. Okay > > -- > H.J.