From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) by sourceware.org (Postfix) with ESMTPS id C717C386103D for ; Thu, 15 Feb 2024 00:23:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C717C386103D Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org C717C386103D Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::62a ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1707956622; cv=none; b=svdIsZux2S+EnHUUVteR+jyHUZRNYKAG7QUHo1qVUrzGSD5HbcgaqltkCqeTUXVMhFlpM3M3gwm8ekYY270CndfkJSvBJcd0jLNf7vnEA+b4AbM2SRG0bWuslCE2iA2crH50MxZyuNXFQyF3VoU+Il5YaNaBT8cI5qYZfK5F++Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1707956622; c=relaxed/simple; bh=cbrcwSyIWvpI6ke2QL7PvGC9n/Fxrr3Fk7K/9fnEk6k=; h=DKIM-Signature:Date:From:To:Subject:Message-ID:MIME-Version; b=ljQAoVawldGIpjM8/qqQgAnzlM6R4lPl0rGxwHgABUmgY1yCaVb98nEkPYgL5cZZeVh5tdV8cLJ28MYP+QpjgHLszEqFEEex4w85XT43oava7I0JmTePqoMykDJGfaJfSjy7aDBrQwqqLkcK820toOnbV9kOXMRFAbHktJbUrXs= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-pl1-x62a.google.com with SMTP id d9443c01a7336-1d93edfa76dso2622995ad.1 for ; Wed, 14 Feb 2024 16:23:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707956617; x=1708561417; darn=sourceware.org; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=8s5CEVVRcPWkwSGATTl8IULy+3c0pBU2W8ca5RmUF2U=; b=BpZq/k+1rdDl/JD+wn/x1ymGrIm2QxU22zWbof9eAAqu/FYRhKlA474jen86NPNofQ LdQgHn8N+ZHSHWlBMb3XuvERhqTmkHhFVU9mbejw8H456kXfga6mh/wo3xzEk9W2hMmI G/G2kiia3SXhcA/QLQH0Ek0QYksR/mRoe2PKgyteBPN9t7x7Dh09tcBHsyjPvJAGG3uR Xvcb8e3br1/DvR0qWvqNJm0shWjtelldsBocxPgF59H7Q5yBuc0j7mIPDQfrEYcF+8uN /txEunu/vipW0nbvikKwDxw++dfjOxrjm89MUhuTcb4zqq4Y8g0wq8PaHE2s4XocZQBJ LAhg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707956617; x=1708561417; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=8s5CEVVRcPWkwSGATTl8IULy+3c0pBU2W8ca5RmUF2U=; b=mZo4t2xKs1WcJOtugxz55bwh4xJ8OXFeUzT3W3X7Le6ZSUAjx6lgwf6xrp3M1KixDg mwB8MiZ7Tj3rjzSc6zR+FCqjZ8y25xrXwnmBNCeJXfo9xSrKv0Wgv7VpbeRmaJsqvS0P bI6isNwwhzVUQQuDqk/n82HPSCRdADBZnxZ8wNSbrSfEGsw9SBaaKGzsN5nZMVdQBmxQ a1tizIpFocnoOFb0uWChG8XoaoL+Zyp/c6/VeQmSh0C1Gaje9/aF1wB7QvQ3CqluCmGk 8Df/bX2bTaIRm0OrrCtWrUDjRVbKyZAlwnLPnab6QuhNuLR0f48mUN2gbDYYyPfQjoYD Zzzw== X-Gm-Message-State: AOJu0Yy4lS1FvqNL9f/SKGHdrbGNSN71Wf4hW9seFxfQBe/q6nMKysCw RdpCrMeQUfqBZ7KVnW0c4L12ArqbRvBzseO5Kjs5Kdxso0rXYRhb5lg+FDSO X-Google-Smtp-Source: AGHT+IE286OHCinSHAoIqo3uBhEPUNLMSgMhqbO0fdl/x2FnHvWAexLzk82HcF0KcwVQPJBTkj8+uw== X-Received: by 2002:a17:902:c946:b0:1db:28b4:4475 with SMTP id i6-20020a170902c94600b001db28b44475mr306046pla.9.1707956616492; Wed, 14 Feb 2024 16:23:36 -0800 (PST) Received: from gnu-cfl-3.localdomain ([172.58.89.72]) by smtp.gmail.com with ESMTPSA id jg11-20020a17090326cb00b001d91d515dffsm31237plb.156.2024.02.14.16.23.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 14 Feb 2024 16:23:35 -0800 (PST) Received: by gnu-cfl-3.localdomain (Postfix, from userid 1000) id A7B20740257; Wed, 14 Feb 2024 16:23:34 -0800 (PST) Date: Wed, 14 Feb 2024 16:23:34 -0800 From: "H.J. Lu" To: Noah Goldstein Cc: libc-alpha@sourceware.org Subject: Re: [PATCH v4 2/2] x86: Update _dl_tlsdesc_dynamic to preserve caller-saved registers Message-ID: References: <20240213041501.2494232-1-hjl.tools@gmail.com> <20240213041501.2494232-3-hjl.tools@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Spam-Status: No, score=-3019.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_SHORT,RCVD_IN_ABUSEAT,RCVD_IN_DNSWL_NONE,RCVD_IN_SBL_CSS,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Wed, Feb 14, 2024 at 11:57:20PM +0000, Noah Goldstein wrote: > On Wed, Feb 14, 2024 at 11:21 PM H.J. Lu wrote: > > > > On Wed, Feb 14, 2024 at 10:44:20PM +0000, Noah Goldstein wrote: > > > On Tue, Feb 13, 2024 at 4:15 AM H.J. Lu wrote: > > > > > > > > Compiler generates the following instruction sequence for GNU2 dynamic > > > > TLS access: > > > > > > > > leaq tls_var@TLSDESC(%rip), %rax > > > > call *tls_var@TLSCALL(%rax) > > > > > > > > or > > > > > > > > leal tls_var@TLSDESC(%ebx), %eax > > > > call *tls_var@TLSCALL(%eax) > > > > > > > > CALL instruction is transparent to compiler which assumes all registers, > > > > except for EFLAGS and RAX/EAX, are unchanged after CALL. When > > > > _dl_tlsdesc_dynamic is called, it calls __tls_get_addr on the slow > > > > path. __tls_get_addr is a normal function which doesn't preserve any > > > > caller-saved registers. _dl_tlsdesc_dynamic saved and restored integer > > > > caller-saved registers, but didn't preserve any other caller-saved > > > > registers. Add _dl_tlsdesc_dynamic IFUNC functions for FNSAVE, FXSAVE, > > > > XSAVE and XSAVEC to save and restore all caller-saved registers. This > > > > fixes BZ #31372. > > > > > > > > Add GLRO(dl_x86_64_runtime_resolve) with GLRO(dl_x86_tlsdesc_dynamic) > > > > to optimize elf_machine_runtime_setup. > > > > --- > > > > elf/Makefile | 19 ++ > > > > elf/malloc-for-test.c | 32 ++++ > > > > elf/malloc-for-test.map | 6 + > > > > elf/tst-gnu2-tls2.c | 97 ++++++++++ > > > > elf/tst-gnu2-tls2.h | 26 +++ > > > > elf/tst-gnu2-tls2mod0.c | 28 +++ > > > > elf/tst-gnu2-tls2mod1.c | 28 +++ > > > > elf/tst-gnu2-tls2mod2.c | 28 +++ > > > > sysdeps/i386/dl-machine.h | 2 +- > > > > sysdeps/i386/dl-tlsdesc-dynamic.h | 187 +++++++++++++++++++ > > > > sysdeps/i386/dl-tlsdesc.S | 115 +++++------- > > > > sysdeps/i386/tst-gnu2-tls2.c | 5 + > > > > sysdeps/x86/Makefile | 7 +- > > > > sysdeps/x86/cpu-features.c | 56 +++++- > > > > sysdeps/x86/dl-procinfo.c | 16 ++ > > > > sysdeps/{x86_64 => x86}/features-offsets.sym | 2 + > > > > sysdeps/x86/malloc-for-test.c | 33 ++++ > > > > sysdeps/x86/sysdep.h | 6 + > > > > sysdeps/x86_64/Makefile | 2 +- > > > > sysdeps/x86_64/dl-machine.h | 19 +- > > > > sysdeps/x86_64/dl-procinfo.c | 16 ++ > > > > sysdeps/x86_64/dl-tlsdesc-dynamic.h | 166 ++++++++++++++++ > > > > sysdeps/x86_64/dl-tlsdesc.S | 108 ++++------- > > > > sysdeps/x86_64/dl-trampoline-save.h | 34 ++++ > > > > sysdeps/x86_64/dl-trampoline-state.h | 51 +++++ > > > > sysdeps/x86_64/dl-trampoline.S | 20 +- > > > > sysdeps/x86_64/dl-trampoline.h | 34 +--- > > > > 27 files changed, 930 insertions(+), 213 deletions(-) > > > > create mode 100644 elf/malloc-for-test.c > > > > create mode 100644 elf/malloc-for-test.map > > > > create mode 100644 elf/tst-gnu2-tls2.c > > > > create mode 100644 elf/tst-gnu2-tls2.h > > > > create mode 100644 elf/tst-gnu2-tls2mod0.c > > > > create mode 100644 elf/tst-gnu2-tls2mod1.c > > > > create mode 100644 elf/tst-gnu2-tls2mod2.c > > > > create mode 100644 sysdeps/i386/dl-tlsdesc-dynamic.h > > > > create mode 100644 sysdeps/i386/tst-gnu2-tls2.c > > > > rename sysdeps/{x86_64 => x86}/features-offsets.sym (89%) > > > > create mode 100644 sysdeps/x86/malloc-for-test.c > > > > create mode 100644 sysdeps/x86_64/dl-tlsdesc-dynamic.h > > > > create mode 100644 sysdeps/x86_64/dl-trampoline-save.h > > > > create mode 100644 sysdeps/x86_64/dl-trampoline-state.h > > > > > > > > diff --git a/elf/Makefile b/elf/Makefile > > > > index 5d78b659ce..e0665d2007 100644 > > > > --- a/elf/Makefile > > > > +++ b/elf/Makefile > > > > @@ -424,6 +424,7 @@ tests += \ > > > > tst-glibc-hwcaps-prepend \ > > > > tst-global1 \ > > > > tst-global2 \ > > > > + tst-gnu2-tls2 \ > > > > tst-initfinilazyfail \ > > > > tst-initorder \ > > > > tst-initorder2 \ > > > > @@ -699,6 +700,7 @@ modules-names += \ > > > > libtracemod5-1 \ > > > > ltglobmod1 \ > > > > ltglobmod2 \ > > > > + malloc-for-test \ > > > > neededobj1 \ > > > > neededobj2 \ > > > > neededobj3 \ > > > > @@ -846,6 +848,9 @@ modules-names += \ > > > > tst-filterobj-flt \ > > > > tst-finilazyfailmod \ > > > > tst-globalmod2 \ > > > > + tst-gnu2-tls2mod0 \ > > > > + tst-gnu2-tls2mod1 \ > > > > + tst-gnu2-tls2mod2 \ > > > > tst-initlazyfailmod \ > > > > tst-initorder2a \ > > > > tst-initorder2b \ > > > > @@ -3044,8 +3049,22 @@ $(objpfx)tst-tlsgap.out: \ > > > > $(objpfx)tst-tlsgap-mod0.so \ > > > > $(objpfx)tst-tlsgap-mod1.so \ > > > > $(objpfx)tst-tlsgap-mod2.so > > > > + > > > > +$(objpfx)tst-gnu2-tls2: \ > > > > + $(shared-thread-library) \ > > > > + $(objpfx)malloc-for-test.so > > > > +$(objpfx)tst-gnu2-tls2.out: \ > > > > + $(objpfx)tst-gnu2-tls2mod0.so \ > > > > + $(objpfx)tst-gnu2-tls2mod1.so \ > > > > + $(objpfx)tst-gnu2-tls2mod2.so > > > > + > > > > +LDFLAGS-malloc-for-test.so += -Wl,--version-script=malloc-for-test.map > > > > + > > > > ifeq (yes,$(have-mtls-dialect-gnu2)) > > > > CFLAGS-tst-tlsgap-mod0.c += -mtls-dialect=gnu2 > > > > CFLAGS-tst-tlsgap-mod1.c += -mtls-dialect=gnu2 > > > > CFLAGS-tst-tlsgap-mod2.c += -mtls-dialect=gnu2 > > > > +CFLAGS-tst-gnu2-tls2mod0.c += -mtls-dialect=gnu2 > > > > +CFLAGS-tst-gnu2-tls2mod1.c += -mtls-dialect=gnu2 > > > > +CFLAGS-tst-gnu2-tls2mod2.c += -mtls-dialect=gnu2 > > > > endif > > > > diff --git a/elf/malloc-for-test.c b/elf/malloc-for-test.c > > > > new file mode 100644 > > > > index 0000000000..1bec69eda7 > > > > --- /dev/null > > > > +++ b/elf/malloc-for-test.c > > > > @@ -0,0 +1,32 @@ > > > > +/* A malloc for intercept test. > > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > > + This file is part of the GNU C Library. > > > > + > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > + modify it under the terms of the GNU Lesser General Public > > > > + License as published by the Free Software Foundation; either > > > > + version 2.1 of the License, or (at your option) any later version. > > > > + > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > > + Lesser General Public License for more details. > > > > + > > > > + You should have received a copy of the GNU Lesser General Public > > > > + License along with the GNU C Library; if not, see > > > > + . */ > > > > + > > > > +#include > > > > + > > > > +extern void * __libc_malloc (size_t); > > > > + > > > > +#ifndef PREPARE_MALLOC > > > > +# define PREPARE_MALLOC() > > > > +#endif > > > > + > > > > +void * > > > > +malloc (size_t n) > > > > +{ > > > > + PREPARE_MALLOC (); > > > > + return __libc_malloc (n); > > > > +} > > > > diff --git a/elf/malloc-for-test.map b/elf/malloc-for-test.map > > > > new file mode 100644 > > > > index 0000000000..8437cf4346 > > > > --- /dev/null > > > > +++ b/elf/malloc-for-test.map > > > > @@ -0,0 +1,6 @@ > > > > +GLIBC_2.0 { > > > > + global: > > > > + malloc; > > > > + local: > > > > + *; > > > > +}; > > > > diff --git a/elf/tst-gnu2-tls2.c b/elf/tst-gnu2-tls2.c > > > > new file mode 100644 > > > > index 0000000000..34427f9a0f > > > > --- /dev/null > > > > +++ b/elf/tst-gnu2-tls2.c > > > > @@ -0,0 +1,97 @@ > > > > +/* Test TLSDESC relocation. > > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > > + This file is part of the GNU C Library. > > > > + > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > + modify it under the terms of the GNU Lesser General Public > > > > + License as published by the Free Software Foundation; either > > > > + version 2.1 of the License, or (at your option) any later version. > > > > + > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > > + Lesser General Public License for more details. > > > > + > > > > + You should have received a copy of the GNU Lesser General Public > > > > + License along with the GNU C Library; if not, see > > > > + . */ > > > > + > > > > +#include > > > > +#include > > > > +#include > > > > +#include > > > > +#include > > > > +#include > > > > +#include > > > > +#include > > > > +#include "tst-gnu2-tls2.h" > > > > + > > > > +#ifndef IS_SUPPORTED > > > > +# define IS_SUPPORTED() true > > > > +#endif > > > > + > > > > +static void *mod[3]; > > > > +#define MOD(i) "tst-gnu2-tls2mod" #i ".so" > > > > +static const char *modname[3] = { MOD(0), MOD(1), MOD(2) }; > > > > +#undef MOD > > > > + > > > > +static void > > > > +open_mod (int i) > > > > +{ > > > > + mod[i] = xdlopen (modname[i], RTLD_LAZY); > > > > + printf ("open %s\n", modname[i]); > > > > +} > > > > + > > > > +static void > > > > +close_mod (int i) > > > > +{ > > > > + xdlclose (mod[i]); > > > > + mod[i] = NULL; > > > > + printf ("close %s\n", modname[i]); > > > > +} > > > > + > > > > +static void > > > > +access_mod (int i, const char *sym) > > > > +{ > > > > + struct tls var = { -1, -1, -1, -1 }; > > > > + struct tls *(*f) (struct tls *) = xdlsym (mod[i], sym); > > > > + struct tls *p = f (&var); > > > > + printf ("access %s: %s() = %p\n", modname[i], sym, p); > > > > + TEST_VERIFY_EXIT (memcmp (p, &var, sizeof (var)) == 0); > > > > + ++(p->a); > > > > +} > > > > + > > > > +static void * > > > > +start (void *arg) > > > > +{ > > > > + /* The DTV generation is at the last dlopen of mod0 and the > > > > + entry for mod1 is NULL. */ > > > > + > > > > + open_mod (1); /* Reuse modid of mod1. Uses dynamic TLS. */ > > > > + > > > > + /* Force the slow path in GNU2 TLS descriptor call. */ > > > > + access_mod (1, "apply_tls"); > > > > + > > > > + return arg; > > > > +} > > > > + > > > > +static int > > > > +do_test (void) > > > > +{ > > > > + if (!IS_SUPPORTED ()) > > > > + return EXIT_UNSUPPORTED; > > > > + > > > > + open_mod (0); > > > > + open_mod (1); > > > > + open_mod (2); > > > > + close_mod (0); > > > > + close_mod (1); /* Create modid gap at mod1. */ > > > > + open_mod (0); /* Reuse modid of mod0, bump generation count. */ > > > > + > > > > + /* Create a thread where DTV of mod1 is NULL. */ > > > > + pthread_t t = xpthread_create (NULL, start, NULL); > > > > + xpthread_join (t); > > > > + return 0; > > > > +} > > > > + > > > > +#include > > > > diff --git a/elf/tst-gnu2-tls2.h b/elf/tst-gnu2-tls2.h > > > > new file mode 100644 > > > > index 0000000000..e33f4dbe27 > > > > --- /dev/null > > > > +++ b/elf/tst-gnu2-tls2.h > > > > @@ -0,0 +1,26 @@ > > > > +/* Test TLSDESC relocation. > > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > > + This file is part of the GNU C Library. > > > > + > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > + modify it under the terms of the GNU Lesser General Public > > > > + License as published by the Free Software Foundation; either > > > > + version 2.1 of the License, or (at your option) any later version. > > > > + > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > > + Lesser General Public License for more details. > > > > + > > > > + You should have received a copy of the GNU Lesser General Public > > > > + License along with the GNU C Library; if not, see > > > > + . */ > > > > + > > > > +#include > > > > + > > > > +struct tls > > > > +{ > > > > + int64_t a, b, c, d; > > > > +}; > > > > + > > > > +extern struct tls *apply_tls (struct tls *); > > > > diff --git a/elf/tst-gnu2-tls2mod0.c b/elf/tst-gnu2-tls2mod0.c > > > > new file mode 100644 > > > > index 0000000000..67dc0d464d > > > > --- /dev/null > > > > +++ b/elf/tst-gnu2-tls2mod0.c > > > > @@ -0,0 +1,28 @@ > > > > +/* DSO used by tst-gnu2-tls2. > > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > > + This file is part of the GNU C Library. > > > > + > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > + modify it under the terms of the GNU Lesser General Public > > > > + License as published by the Free Software Foundation; either > > > > + version 2.1 of the License, or (at your option) any later version. > > > > + > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > > + Lesser General Public License for more details. > > > > + > > > > + You should have received a copy of the GNU Lesser General Public > > > > + License along with the GNU C Library; if not, see > > > > + . */ > > > > + > > > > +#include "tst-gnu2-tls2.h" > > > > + > > > > +__thread struct tls tls_var0 __attribute__ ((visibility ("hidden"))); > > > > + > > > > +struct tls * > > > > +apply_tls (struct tls *p) > > > > +{ > > > > + tls_var0 = *p; > > > > + return &tls_var0; > > > > +} > > > > diff --git a/elf/tst-gnu2-tls2mod1.c b/elf/tst-gnu2-tls2mod1.c > > > > new file mode 100644 > > > > index 0000000000..a4ae6db24f > > > > --- /dev/null > > > > +++ b/elf/tst-gnu2-tls2mod1.c > > > > @@ -0,0 +1,28 @@ > > > > +/* DSO used by tst-gnu2-tls2. > > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > > + This file is part of the GNU C Library. > > > > + > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > + modify it under the terms of the GNU Lesser General Public > > > > + License as published by the Free Software Foundation; either > > > > + version 2.1 of the License, or (at your option) any later version. > > > > + > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > > + Lesser General Public License for more details. > > > > + > > > > + You should have received a copy of the GNU Lesser General Public > > > > + License along with the GNU C Library; if not, see > > > > + . */ > > > > + > > > > +#include "tst-gnu2-tls2.h" > > > > + > > > > +__thread struct tls tls_var1[100] __attribute__ ((visibility ("hidden"))); > > > > + > > > > +struct tls * > > > > +apply_tls (struct tls *p) > > > > +{ > > > > + tls_var1[1] = *p; > > > > + return &tls_var1[1]; > > > > +} > > > > diff --git a/elf/tst-gnu2-tls2mod2.c b/elf/tst-gnu2-tls2mod2.c > > > > new file mode 100644 > > > > index 0000000000..2d13921717 > > > > --- /dev/null > > > > +++ b/elf/tst-gnu2-tls2mod2.c > > > > @@ -0,0 +1,28 @@ > > > > +/* DSO used by tst-gnu2-tls2. > > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > > + This file is part of the GNU C Library. > > > > + > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > + modify it under the terms of the GNU Lesser General Public > > > > + License as published by the Free Software Foundation; either > > > > + version 2.1 of the License, or (at your option) any later version. > > > > + > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > > + Lesser General Public License for more details. > > > > + > > > > + You should have received a copy of the GNU Lesser General Public > > > > + License along with the GNU C Library; if not, see > > > > + . */ > > > > + > > > > +#include "tst-gnu2-tls2.h" > > > > + > > > > +__thread struct tls tls_var2 __attribute__ ((visibility ("hidden"))); > > > > + > > > > +struct tls * > > > > +apply_tls (struct tls *p) > > > > +{ > > > > + tls_var2 = *p; > > > > + return &tls_var2; > > > > +} > > > > diff --git a/sysdeps/i386/dl-machine.h b/sysdeps/i386/dl-machine.h > > > > index fc1ef96587..50d74fe6e9 100644 > > > > --- a/sysdeps/i386/dl-machine.h > > > > +++ b/sysdeps/i386/dl-machine.h > > > > @@ -347,7 +347,7 @@ and creates an unsatisfiable circular dependency.\n", > > > > { > > > > td->arg = _dl_make_tlsdesc_dynamic > > > > (sym_map, sym->st_value + (ElfW(Word))td->arg); > > > > - td->entry = _dl_tlsdesc_dynamic; > > > > + td->entry = GLRO(dl_x86_tlsdesc_dynamic); > > > > } > > > > else > > > > # endif > > > > diff --git a/sysdeps/i386/dl-tlsdesc-dynamic.h b/sysdeps/i386/dl-tlsdesc-dynamic.h > > > > new file mode 100644 > > > > index 0000000000..675e56d32d > > > > --- /dev/null > > > > +++ b/sysdeps/i386/dl-tlsdesc-dynamic.h > > > > @@ -0,0 +1,187 @@ > > > > +/* Thread-local storage handling in the ELF dynamic linker. i386 version. > > > > + Copyright (C) 2004-2024 Free Software Foundation, Inc. > > > > + This file is part of the GNU C Library. > > > > + > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > + modify it under the terms of the GNU Lesser General Public > > > > + License as published by the Free Software Foundation; either > > > > + version 2.1 of the License, or (at your option) any later version. > > > > + > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > > + Lesser General Public License for more details. > > > > + > > > > + You should have received a copy of the GNU Lesser General Public > > > > + License along with the GNU C Library; if not, see > > > > + . */ > > > > + > > > > +#undef REGISTER_SAVE_AREA > > > > + > > > > +#if !defined USE_FNSAVE && (STATE_SAVE_ALIGNMENT % 16) != 0 > > > > +# error STATE_SAVE_ALIGNMENT must be multiple of 16 > > > > +#endif > > > > + > > > > +#if DL_RUNTIME_RESOLVE_REALIGN_STACK > > > > +# ifdef USE_FNSAVE > > > > +# error USE_FNSAVE shouldn't be defined > > > > +# endif > > > > +# ifdef USE_FXSAVE > > > > +/* Use fxsave to save all registers. */ > > > > +# define REGISTER_SAVE_AREA 512 > > > > +# endif > > > > +#else > > > > +# ifdef USE_FNSAVE > > > > +/* Use fnsave to save x87 FPU stack registers. */ > > > > +# define REGISTER_SAVE_AREA 108 > > > > +# else > > > > +# ifndef USE_FXSAVE > > > > +# error USE_FXSAVE must be defined > > > > +# endif > > > > +/* Use fxsave to save all registers. Add 12 bytes to align the stack > > > > + to 16 bytes. */ > > > > +# define REGISTER_SAVE_AREA (512 + 12) > > > > +# endif > > > > +#endif > > > > + > > > > + .hidden _dl_tlsdesc_dynamic > > > > + .global _dl_tlsdesc_dynamic > > > > + .type _dl_tlsdesc_dynamic,@function > > > > + > > > > + /* This function is used for symbols that need dynamic TLS. > > > > + > > > > + %eax points to the TLS descriptor, such that 0(%eax) points to > > > > + _dl_tlsdesc_dynamic itself, and 4(%eax) points to a struct > > > > + tlsdesc_dynamic_arg object. It must return in %eax the offset > > > > + between the thread pointer and the object denoted by the > > > > + argument, without clobbering any registers. > > > > + > > > > + The assembly code that follows is a rendition of the following > > > > + C code, hand-optimized a little bit. > > > > + > > > > +ptrdiff_t > > > > +__attribute__ ((__regparm__ (1))) > > > > +_dl_tlsdesc_dynamic (struct tlsdesc *tdp) > > > > +{ > > > > + struct tlsdesc_dynamic_arg *td = tdp->arg; > > > > + dtv_t *dtv = *(dtv_t **)((char *)__thread_pointer + DTV_OFFSET); > > > > + if (__builtin_expect (td->gen_count <= dtv[0].counter > > > > + && (dtv[td->tlsinfo.ti_module].pointer.val > > > > + != TLS_DTV_UNALLOCATED), > > > > + 1)) > > > > + return dtv[td->tlsinfo.ti_module].pointer.val + td->tlsinfo.ti_offset > > > > + - __thread_pointer; > > > > + > > > > + return ___tls_get_addr (&td->tlsinfo) - __thread_pointer; > > > > +} > > > > +*/ > > > > + cfi_startproc > > > > + .align 16 > > > > +_dl_tlsdesc_dynamic: > > > > + /* Like all TLS resolvers, preserve call-clobbered registers. > > > > + We need two scratch regs anyway. */ > > > > + subl $32, %esp > > > > + cfi_adjust_cfa_offset (32) > > > > + movl %ecx, 20(%esp) > > > > + movl %edx, 24(%esp) > > > > + movl TLSDESC_ARG(%eax), %eax > > > > + movl %gs:DTV_OFFSET, %edx > > > > + movl TLSDESC_GEN_COUNT(%eax), %ecx > > > > + cmpl (%edx), %ecx > > > > + ja 2f > > > > + movl TLSDESC_MODID(%eax), %ecx > > > > + movl (%edx,%ecx,8), %edx > > > > + cmpl $-1, %edx > > > > + je 2f > > > > + movl TLSDESC_MODOFF(%eax), %eax > > > > + addl %edx, %eax > > > > +1: > > > > + movl 20(%esp), %ecx > > > > + subl %gs:0, %eax > > > > + movl 24(%esp), %edx > > > > + addl $32, %esp > > > > + cfi_adjust_cfa_offset (-32) > > > > + ret > > > > + .p2align 4,,7 > > > > +2: > > > > + cfi_adjust_cfa_offset (32) > > > Extraneous AFAICT. > > > > This was in the existing code. The label 2 can only be reached by > > a jump. When the label 2 is reached, this CFA adjustment is to tell > > debugger that CFA isn't changed the CFA directive above. > > > > > > > > > +#if DL_RUNTIME_RESOLVE_REALIGN_STACK > > > > + movl %ebx, -28(%esp) > > > > + movl %esp, %ebx > > > > + cfi_def_cfa_register(%ebx) > > > > + and $-STATE_SAVE_ALIGNMENT, %esp > > > > +#endif > > > > +#ifdef REGISTER_SAVE_AREA > > > > + subl $REGISTER_SAVE_AREA, %esp > > > > +# if !DL_RUNTIME_RESOLVE_REALIGN_STACK > > > > + cfi_adjust_cfa_offset(REGISTER_SAVE_AREA) > > > > +# endif > > > > +#else > > > > + # Allocate stack space of the required size to save the state. > > > > + LOAD_PIC_REG (cx) > > > > + subl RTLD_GLOBAL_RO_DL_X86_CPU_FEATURES_OFFSET+XSAVE_STATE_SIZE_OFFSET+_rtld_local_ro@GOTOFF(%ecx), %esp > > > > +#endif > > > > +#ifdef USE_FNSAVE > > > > + fnsave (%esp) > > > > +#elif defined USE_FXSAVE > > > > + fxsave (%esp) > > > > +#else > > > > + # Save the argument for ___tls_get_addr in EAX. > > > > + movl %eax, %ecx > > > > + movl $TLSDESC_CALL_STATE_SAVE_MASK, %eax > > > > + xorl %edx, %edx > > > > + # Clear the XSAVE Header. > > > > +# ifdef USE_XSAVE > > > > + movl %edx, (512)(%esp) > > > > + movl %edx, (512 + 4 * 1)(%esp) > > > > + movl %edx, (512 + 4 * 2)(%esp) > > > > + movl %edx, (512 + 4 * 3)(%esp) > > > > +# endif > > > > + movl %edx, (512 + 4 * 4)(%esp) > > > > + movl %edx, (512 + 4 * 5)(%esp) > > > > + movl %edx, (512 + 4 * 6)(%esp) > > > > + movl %edx, (512 + 4 * 7)(%esp) > > > > + movl %edx, (512 + 4 * 8)(%esp) > > > > + movl %edx, (512 + 4 * 9)(%esp) > > > > + movl %edx, (512 + 4 * 10)(%esp) > > > > + movl %edx, (512 + 4 * 11)(%esp) > > > > + movl %edx, (512 + 4 * 12)(%esp) > > > > + movl %edx, (512 + 4 * 13)(%esp) > > > > + movl %edx, (512 + 4 * 14)(%esp) > > > > + movl %edx, (512 + 4 * 15)(%esp) > > > > +# ifdef USE_XSAVE > > > > + xsave (%esp) > > > > +# else > > > > + xsavec (%esp) > > > > +# endif > > > > + # Restore the argument for ___tls_get_addr in EAX. > > > > + movl %ecx, %eax > > > > +#endif > > > > + call HIDDEN_JUMPTARGET (___tls_get_addr) > > > > + # Get register content back. > > > > +#ifdef USE_FNSAVE > > > > + frstor (%esp) > > > > +#elif defined USE_FXSAVE > > > > + fxrstor (%esp) > > > > +#else > > > > + /* Save and retore ___tls_get_addr return value stored in EAX. */ > > > > + movl %eax, %ecx > > > > + movl $TLSDESC_CALL_STATE_SAVE_MASK, %eax > > > > + xorl %edx, %edx > > > > + xrstor (%esp) > > > > + movl %ecx, %eax > > > > +#endif > > > > +#if DL_RUNTIME_RESOLVE_REALIGN_STACK > > > > + mov %ebx, %esp > > > > + cfi_def_cfa_register(%esp) > > > > + movl -28(%esp), %ebx > > > > + cfi_restore(%ebx) > > > > +#else > > > > + addl $REGISTER_SAVE_AREA, %esp > > > > + cfi_adjust_cfa_offset(-REGISTER_SAVE_AREA) > > > The use of `REGISTER_SAVE_AREA` above is guarded by an > > > `#ifdef REGISTER_SAVE_AREA` > > > and uses > > > `_rtld_local_ro+RTLD_GLOBAL_RO_DL_X86_CPU_FEATURES_OFFSET+XSAVE_STATE_SIZE_OFFSET(%rip)` > > > otherwise. > > > Would expect same here? > > > > REGISTER_SAVE_AREA is only used by fnsave and fxsave which > > expect the fixed area. > > > > _rtld_local_ro+RTLD_GLOBAL_RO_DL_X86_CPU_FEATURES_OFFSET+XSAVE_STATE_SIZE_OFFSET(%rip) > > is used by xsave and xsavec whose saved area size depends on > > the enabled features. > > > > 2 things are different. > > My point is that we setup the stack above with ifdef i.e > ``` > #ifdef REGISTER_SAVE_AREA > subl $REGISTER_SAVE_AREA, %esp > #else > subl RTLD_GLOBAL_RO_DL_X86_CPU_FEATURES_OFFSET+XSAVE_STATE_SIZE_OFFSET+_rtld_local_ro@GOTOFF(%ecx), > %esp > #endif > ``` > Shouldnt you have the same ifdef for restoring? The actual code is #ifdef REGISTER_SAVE_AREA subl $REGISTER_SAVE_AREA, %esp # if !DL_RUNTIME_RESOLVE_REALIGN_STACK cfi_adjust_cfa_offset(REGISTER_SAVE_AREA) # endif #else # Allocate stack space of the required size to save the state. LOAD_PIC_REG (cx) subl RTLD_GLOBAL_RO_DL_X86_CPU_FEATURES_OFFSET+XSAVE_STATE_SIZE_OFFSET+_rtld_local_ro@GOTOFF(%ecx), %esp #endif I am not sure how your suggestion should work. H.J.