Message-ID: <3491689c-873f-4815-9ec7-1e39d5424f88@linaro.org>
Date: Wed, 29 Nov 2023 10:46:39 -0300
Subject: Re: [RFC PATCH v3] Add rseq extensible ABI support
To: Michael Jeanson, libc-alpha@sourceware.org
Cc: Mathieu Desnoyers, Florian Weimer, Carlos O'Donell
From: Adhemerval Zanella Netto
Organization: Linaro
References: <20231122214020.247888-1-mjeanson@efficios.com> <20231127220922.443855-1-mjeanson@efficios.com>
In-Reply-To: <20231127220922.443855-1-mjeanson@efficios.com>

On 27/11/23 19:09, Michael Jeanson wrote:
> Introduced in Linux v6.3 the rseq extensible ABI [1] will allow adding
> rseq features past the initial 32 bytes of the original ABI.
>
> While the rseq features in the latest kernel still fit within the
> original ABI size, there are currently only 4 bytes left. It would thus
> be a good time to add support for the extensible ABI so that when new
> features are added, they are immediately available to GNU libc users.
>
> This is a first draft of a possible implementation to gather some
> feedback.
>
> We use the ELF auxiliary vectors to query the kernel for the size and
> alignment of the rseq area, if this fails we default to the original
> fixed size and alignment of '32' which the kernel will accept as a
> compatibility mode with the original ABI.
>
> This makes the size of the rseq area variable and thus requires to
> relocate it out of 'struct pthread'. We chose to move it after (in block
> allocation order) the last TLS block since it required a fairly small
> modification to the TLS block allocator and did not interfere with the
> main executable TLS block which must always be first.
>
> [1] https://lore.kernel.org/all/20221122203932.231377-4-mathieu.desnoyers@efficios.com/

Both aarch64 and arm32 show a lot of regressions [1] [2], for instance:

$ file io/tst-statx
io/tst-statx: ELF 32-bit LSB executable, ARM, EABI5 version 1 (GNU/Linux), statically linked, for GNU/Linux 3.2.0, with debug_info, not stripped
$ gdb --args io/tst-statx --direct
[...]
(gdb) r
[...]
Program received signal SIGSEGV, Segmentation fault.
0x0000aaaaaf08a114 in ?? ()
(gdb) bt
#0  0x0000aaaaaf08a114 in ?? ()
#1  0x0000ffffe5e54450 in ?? ()

It seems something is really wrong that makes the gdb backtrace unusable.
[1] https://ci.linaro.org/job/tcwg_glibc_check--master-arm-precommit/1088/artifact/artifacts/artifacts.precommit/notify/mail-body.txt
[2] https://ci.linaro.org/job/tcwg_glibc_check--master-aarch64-precommit/1063/artifact/artifacts/artifacts.precommit/notify/mail-body.txt

>
> Signed-off-by: Michael Jeanson
> Co-Authored-By: Mathieu Desnoyers
> Signed-off-By: Mathieu Desnoyers
> Cc: Florian Weimer
> Cc: Carlos O'Donell
> ---
> Changes since RFC v1:
> - Insert the rseq area after the last TLS block
> - Add proper support for TLS_TCB_AT_TP variant
> Changes since RFC v2:
> - Set __rseq_size even when the registration fails
> - Adjust rseq tests to the new ABI
> - Add support for statically linked executables
> ---
>  csu/libc-tls.c                             | 40 +++++++++++++++-
>  elf/dl-tls.c                               | 56 ++++++++++++++++++++++
>  elf/rtld_static_init.c                     |  6 +++
>  nptl/descr.h                               | 20 ++------
>  nptl/pthread_create.c                      |  2 +-
>  sysdeps/generic/ldsodefs.h                 |  6 +++
>  sysdeps/i386/nptl/tcb-access.h             | 54 +++++++++++++++++++++
>  sysdeps/nptl/dl-tls_init_tp.c              | 19 ++++----
>  sysdeps/nptl/tcb-access.h                  |  5 ++
>  sysdeps/unix/sysv/linux/rseq-internal.h    | 26 ++++++++--
>  sysdeps/unix/sysv/linux/sched_getcpu.c     |  3 +-
>  sysdeps/unix/sysv/linux/tst-rseq-disable.c | 20 ++++----
>  sysdeps/unix/sysv/linux/tst-rseq.c         | 10 ++--
>  sysdeps/unix/sysv/linux/tst-rseq.h         |  3 +-
>  sysdeps/x86_64/nptl/tcb-access.h           | 55 ++++++++++++++++-
>  15 files changed, 277 insertions(+), 48 deletions(-)
>
> diff --git a/csu/libc-tls.c b/csu/libc-tls.c
> index cdf6442c02..58e564db8b 100644
> --- a/csu/libc-tls.c
> +++ b/csu/libc-tls.c
> @@ -23,6 +23,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -62,6 +63,12 @@ size_t _dl_tls_static_surplus;
>     dynamic TLS access (e.g. with TLSDESC).  */
>  size_t _dl_tls_static_optional;
>
> +/* Offset of the rseq area from the thread pointer.  */
> +ptrdiff_t _dl_tls_rseq_offset;
> +
> +/* Size of the rseq area in the static TLS block.  */
> +size_t _dl_tls_rseq_size;
> +
>  /* Generation counter for the dtv.  */
>  size_t _dl_tls_generation;
>
> @@ -135,6 +142,24 @@ __libc_setup_tls (void)
>    /* Calculate the size of the static TLS surplus, with 0 auditors.  */
>    _dl_tls_static_surplus_init (0);
>
> +  /* Get the rseq auxiliary vectors, 0 is returned when not implemented
> +     and we then default to the rseq ABI minimums.  */
> +  size_t rseq_size = MAX (getauxval (AT_RSEQ_FEATURE_SIZE), 32);
> +  size_t rseq_align = MAX (getauxval (AT_RSEQ_ALIGN), 32);

This triggers linknamespace issues; you need to use the internal alias
__getauxval.

> +
> +  /* Make sure the rseq area size is a multiple of the requested
> +     aligment.  */
> +  rseq_size = roundup (rseq_size, rseq_align);
> +
> +  /* Add the rseq area block to the global offset.  */
> +  //offset = roundup (offset, rseq_align) + rseq_size;
> +
> +  /* Increase the max_align if necessary.  */
> +  max_align = MAX (max_align, rseq_align);
> +
> +  /* Record the rseq_area block size.  */
> +  GLRO (dl_tls_rseq_size) = rseq_size;
> +
>    /* We have to set up the TCB block which also (possibly) contains
>       'errno'.  Therefore we avoid 'malloc' which might touch 'errno'.
>       Instead we use 'sbrk' which would only uses 'errno' if it fails.
> @@ -144,13 +169,13 @@ __libc_setup_tls (void)
>    /* Align the TCB offset to the maximum alignment, as
>       _dl_allocate_tls_storage (in elf/dl-tls.c) does using __libc_memalign
>       and dl_tls_static_align.  */
> -  tcb_offset = roundup (memsz + GLRO(dl_tls_static_surplus), max_align);
> +  tcb_offset = roundup (memsz + rseq_size + GLRO(dl_tls_static_surplus), max_align);
>    tlsblock = _dl_early_allocate (tcb_offset + TLS_INIT_TCB_SIZE + max_align);
>    if (tlsblock == NULL)
>      _startup_fatal_tls_error ();
>  #elif TLS_DTV_AT_TP
>    tcb_offset = roundup (TLS_INIT_TCB_SIZE, align ?: 1);
> -  tlsblock = _dl_early_allocate (tcb_offset + memsz + max_align
> +  tlsblock = _dl_early_allocate (tcb_offset + memsz + rseq_size + max_align
>                                   + TLS_PRE_TCB_SIZE
>                                   + GLRO(dl_tls_static_surplus));
>    if (tlsblock == NULL)
> @@ -175,9 +200,17 @@ __libc_setup_tls (void)
>    _dl_static_dtv[2].pointer.val = ((char *) tlsblock + tcb_offset
>                                     - roundup (memsz, align ?: 1));
>    main_map->l_tls_offset = roundup (memsz, align ?: 1);
> +
> +  /* Record the rseq_area offset.  The offset is negative with TLS_TCB_AT_TP
> +     because the TLS blocks are located before the thread pointer.  */
> +  GLRO (dl_tls_rseq_offset) = roundup (main_map->l_tls_offset + rseq_size, rseq_align);
>  #elif TLS_DTV_AT_TP
>    _dl_static_dtv[2].pointer.val = (char *) tlsblock + tcb_offset;
>    main_map->l_tls_offset = tcb_offset;
> +
> +  /* Record the rseq_area offset.  The offset is positive with TLS_DTV_AT_TP
> +     because the TLS blocks are located after the thread pointer.  */
> +  GLRO (dl_tls_rseq_offset) = roundup (tcb_offset + memsz, rseq_align);
>  #else
>  # error "Either TLS_TCB_AT_TP or TLS_DTV_AT_TP must be defined"
>  #endif
> @@ -215,5 +248,8 @@ __libc_setup_tls (void)
>    memsz += tcb_offset;
>  #endif
>
> +  /* Add rseq area to the used size.  */
> +  memsz = roundup (memsz + rseq_size, rseq_align);
> +
>    init_static_tls (memsz, MAX (TCB_ALIGNMENT, max_align));
>  }
> diff --git a/elf/dl-tls.c b/elf/dl-tls.c
> index c192b5a13a..e9390f87fd 100644
> --- a/elf/dl-tls.c
> +++ b/elf/dl-tls.c
> @@ -24,6 +24,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include
>  #include
> @@ -75,6 +76,12 @@
>  /* Default for dl_tls_static_optional.  */
>  #define OPTIONAL_TLS 512
>
> +/* Minimum size of the rseq area.  */
> +#define TLS_DL_RSEQ_MIN_SIZE 32
> +
> +/* Minimum size of the rseq area alignment.  */
> +#define TLS_DL_RSEQ_MIN_ALIGN 32
> +
>  /* Compute the static TLS surplus based on the namespace count and the
>     TLS space that can be used for optimizations.  */
>  static inline int
> @@ -297,6 +304,29 @@ _dl_determine_tlsoffset (void)
>        slotinfo[cnt].map->l_tls_offset = off;
>      }
>
> +  /* Insert the rseq area block after the last TLS block.  */
> +
> +  /* Get the rseq auxiliary vectors, 0 is returned when not implemented
> +     and we then default to the rseq ABI minimums.  */
> +  size_t rseq_size = MAX (getauxval (AT_RSEQ_FEATURE_SIZE), TLS_DL_RSEQ_MIN_SIZE);
> +  size_t rseq_align = MAX (getauxval (AT_RSEQ_ALIGN), TLS_DL_RSEQ_MIN_ALIGN);
> +
> +  /* Make sure the rseq area size is a multiple of the requested
> +     aligment.  */
> +  rseq_size = roundup (rseq_size, rseq_align);
> +
> +  /* Add the rseq area block to the global offset.  */
> +  offset = roundup (offset, rseq_align) + rseq_size;
> +
> +  /* Increase the max_align if necessary.  */
> +  max_align = MAX (max_align, rseq_align);
> +
> +  /* Record the rseq_area block size and offset.  The offset is negative
> +     with TLS_TCB_AT_TP because the TLS blocks are located before the
> +     thread pointer.  */
> +  GLRO (dl_tls_rseq_offset) = -offset;
> +  GLRO (dl_tls_rseq_size) = rseq_size;
> +
>    GL(dl_tls_static_used) = offset;
>    GLRO (dl_tls_static_size) = (roundup (offset + GLRO(dl_tls_static_surplus),
>                                          max_align)
> @@ -342,6 +372,32 @@ _dl_determine_tlsoffset (void)
>        offset = off + slotinfo[cnt].map->l_tls_blocksize - firstbyte;
>      }
>
> +  /* Insert the rseq area block after the last TLS block.  */
> +
> +  /* Get the rseq auxiliary vectors, 0 is returned when not implemented
> +     and we then default to the rseq ABI minimums.  */
> +  size_t rseq_size = MAX (getauxval (AT_RSEQ_FEATURE_SIZE), TLS_DL_RSEQ_MIN_SIZE);
> +  size_t rseq_align = MAX (getauxval (AT_RSEQ_ALIGN), TLS_DL_RSEQ_MIN_ALIGN);
> +
> +  /* Make sure the rseq area size is a multiple of the requested
> +     aligment.  */
> +  rseq_size = roundup (rseq_size, rseq_align);
> +
> +  /* Align the global offset to the beginning of the rseq area.  */
> +  offset = roundup (offset, rseq_align);
> +
> +  /* Record the rseq_area block size and offset.  The offset is positive
> +     with TLS_DTV_AT_TP because the TLS blocks are located after the
> +     thread pointer.  */
> +  GLRO (dl_tls_rseq_size) = rseq_size;
> +  GLRO (dl_tls_rseq_offset) = offset;
> +
> +  /* Add the rseq area block to the global offset.  */
> +  offset += rseq_size;
> +
> +  /* Increase the max_align if necessary.  */
> +  max_align = MAX (max_align, rseq_align);
> +
>    GL(dl_tls_static_used) = offset;
>    GLRO (dl_tls_static_size) = roundup (offset + GLRO(dl_tls_static_surplus),
>                                         TCB_ALIGNMENT);
> diff --git a/elf/rtld_static_init.c b/elf/rtld_static_init.c
> index aec8cc056b..84548023ab 100644
> --- a/elf/rtld_static_init.c
> +++ b/elf/rtld_static_init.c
> @@ -78,6 +78,12 @@ __rtld_static_init (struct link_map *map)
>    extern __typeof (dl->_dl_tls_static_size) _dl_tls_static_size
>      attribute_hidden;
>    dl->_dl_tls_static_size = _dl_tls_static_size;
> +  extern __typeof (dl->_dl_tls_rseq_size) _dl_tls_rseq_size
> +    attribute_hidden;
> +  dl->_dl_tls_rseq_size = _dl_tls_rseq_size;
> +  extern __typeof (dl->_dl_tls_rseq_offset) _dl_tls_rseq_offset
> +    attribute_hidden;
> +  dl->_dl_tls_rseq_offset = _dl_tls_rseq_offset;
>    dl->_dl_find_object = _dl_find_object;
>
>    __rtld_static_init_arch (map, dl);
> diff --git a/nptl/descr.h b/nptl/descr.h
> index 0171576c23..fadba1f619 100644
> --- a/nptl/descr.h
> +++ b/nptl/descr.h
> @@ -404,25 +404,11 @@ struct pthread
>    /* Used on strsignal.  */
>    struct tls_internal_t tls_state;
>
> -  /* rseq area registered with the kernel.  Use a custom definition
> -     here to isolate from kernel struct rseq changes.  The
> -     implementation of sched_getcpu needs acccess to the cpu_id field;
> -     the other fields are unused and not included here.  */
> -  union
> -  {
> -    struct
> -    {
> -      uint32_t cpu_id_start;
> -      uint32_t cpu_id;
> -    };
> -    char pad[32];  /* Original rseq area size.  */
> -  } rseq_area __attribute__ ((aligned (32)));
> -
>    /* Amount of end padding, if any, in this structure.
> -     This definition relies on rseq_area being last.  */
> +     This definition relies on tls_state being last.  */
>  #define PTHREAD_STRUCT_END_PADDING \
> -  (sizeof (struct pthread) - offsetof (struct pthread, rseq_area) \
> -   + sizeof ((struct pthread) {}.rseq_area))
> +  (sizeof (struct pthread) - offsetof (struct pthread, tls_state) \
> +   + sizeof ((struct pthread) {}.tls_state))
>  } __attribute ((aligned (TCB_ALIGNMENT)));
>
>  static inline bool
> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
> index 63cb684f04..4ee0f24741 100644
> --- a/nptl/pthread_create.c
> +++ b/nptl/pthread_create.c
> @@ -691,7 +691,7 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
>
>    /* Inherit rseq registration state.  Without seccomp filters, rseq
>       registration will either always fail or always succeed.  */
> -  if ((int) THREAD_GETMEM_VOLATILE (self, rseq_area.cpu_id) >= 0)
> +  if ((int) RSEQ_GETMEM_VOLATILE (rseq_get_area(), cpu_id) >= 0)
>      pd->flags |= ATTR_FLAG_DO_RSEQ;
>
>    /* Initialize the field for the ID of the thread which is waiting
> diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
> index 9b50ddd09f..f8a3628bc0 100644
> --- a/sysdeps/generic/ldsodefs.h
> +++ b/sysdeps/generic/ldsodefs.h
> @@ -610,6 +610,12 @@ struct rtld_global_ro
>       See comments in elf/dl-tls.c where it is initialized.  */
>    EXTERN size_t _dl_tls_static_surplus;
>
> +  /* Offset of the rseq area from the thread pointer.  */
> +  EXTERN ptrdiff_t _dl_tls_rseq_offset;
> +
> +  /* Size of the rseq area in the static TLS block.  */
> +  EXTERN size_t _dl_tls_rseq_size;
> +
>    /* Name of the shared object to be profiled (if any).  */
>    EXTERN const char *_dl_profile;
>    /* Filename of the output file.  */
> diff --git a/sysdeps/i386/nptl/tcb-access.h b/sysdeps/i386/nptl/tcb-access.h
> index 28f0a5625f..f1bbccab6e 100644
> --- a/sysdeps/i386/nptl/tcb-access.h
> +++ b/sysdeps/i386/nptl/tcb-access.h
> @@ -123,3 +123,57 @@
>                      "i" (offsetof (struct pthread, member)), \
>                      "r" (idx)); \
>      }})
> +
> +
> +/* Read member of the RSEQ area directly.  */
> +#define RSEQ_GETMEM_VOLATILE(descr, member) \
> +  ({ __typeof (descr->member) __value; \
> +     _Static_assert (sizeof (__value) == 1 \
> +                     || sizeof (__value) == 4 \
> +                     || sizeof (__value) == 8, \
> +                     "size of per-thread data"); \
> +     if (sizeof (__value) == 1) \
> +       asm volatile ("movb %%gs:%P2(%3),%b0" \
> +                     : "=q" (__value) \
> +                     : "0" (0), "i" (offsetof (struct rseq_area, member)), \
> +                       "r" (__rseq_offset)); \
> +     else if (sizeof (__value) == 4) \
> +       asm volatile ("movl %%gs:%P1(%2),%0" \
> +                     : "=r" (__value) \
> +                     : "i" (offsetof (struct rseq_area, member)), \
> +                       "r" (__rseq_offset)); \
> +     else /* 8 */ \
> +       { \
> +         asm volatile ("movl %%gs:%P1(%2),%%eax\n\t" \
> +                       "movl %%gs:4+%P1(%2),%%edx" \
> +                       : "=&A" (__value) \
> +                       : "i" (offsetof (struct rseq_area, member)), \
> +                         "r" (__rseq_offset)); \
> +       } \
> +     __value; })
> +
> +/* Set member of the RSEQ area directly.  */
> +#define RSEQ_SETMEM(descr, member, value) \
> +  ({ \
> +     _Static_assert (sizeof (descr->member) == 1 \
> +                     || sizeof (descr->member) == 4 \
> +                     || sizeof (descr->member) == 8, \
> +                     "size of per-thread data"); \
> +     if (sizeof (descr->member) == 1) \
> +       asm volatile ("movb %b0,%%gs:%P1(%2)" : \
> +                     : "iq" (value), \
> +                       "i" (offsetof (struct rseq_area, member)), \
> +                       "r" (__rseq_offset)); \
> +     else if (sizeof (descr->member) == 4) \
> +       asm volatile ("movl %0,%%gs:%P1(%2)" : \
> +                     : "ir" (value), \
> +                       "i" (offsetof (struct rseq_area, member)), \
> +                       "r" (__rseq_offset)); \
> +     else /* 8 */ \
> +       { \
> +         asm volatile ("movl %%eax,%%gs:%P1(%2)\n\t" \
> +                       "movl %%edx,%%gs:4+%P1(%2)" : \
> +                       : "A" ((uint64_t) cast_to_integer (value)), \
> +                         "i" (offsetof (struct rseq_area, member)), \
> +                         "r" (__rseq_offset)); \
> +       }})
> diff --git a/sysdeps/nptl/dl-tls_init_tp.c b/sysdeps/nptl/dl-tls_init_tp.c
> index 2ed98c5a31..bc4549d240 100644
> --- a/sysdeps/nptl/dl-tls_init_tp.c
> +++ b/sysdeps/nptl/dl-tls_init_tp.c
> @@ -103,13 +103,16 @@ __tls_init_tp (void)
>      {
>        bool do_rseq = true;
>        do_rseq = TUNABLE_GET (rseq, int, NULL);
> -      if (rseq_register_current_thread (pd, do_rseq))
> -        {
> -          /* We need a writable view of the variables.  They are in
> -             .data.relro and are not yet write-protected.  */
> -          extern unsigned int size __asm__ ("__rseq_size");
> -          size = sizeof (pd->rseq_area);
> -        }
> +      rseq_register_current_thread (pd, do_rseq);
> +
> +      // FIXME: Even if the registration fails, we need to communicate the size
> +      // of the allocated rseq area to an application that could attempt the
> +      // registration itself.
> +
> +      /* We need a writable view of the variables.  They are in
> +         .data.relro and are not yet write-protected.  */
> +      extern unsigned int size __asm__ ("__rseq_size");
> +      size = GLRO (dl_tls_rseq_size);
>
>  #ifdef RSEQ_SIG
>        /* This should be a compile-time constant, but the current
> @@ -118,7 +121,7 @@ __tls_init_tp (void)
>           if the rseq registration may have happened because RSEQ_SIG is
>           defined.  */
>        extern ptrdiff_t offset __asm__ ("__rseq_offset");
> -      offset = (char *) &pd->rseq_area - (char *) __thread_pointer ();
> +      offset = GLRO (dl_tls_rseq_offset);
>  #endif
>      }
>
> diff --git a/sysdeps/nptl/tcb-access.h b/sysdeps/nptl/tcb-access.h
> index d6dfd41391..7092428d38 100644
> --- a/sysdeps/nptl/tcb-access.h
> +++ b/sysdeps/nptl/tcb-access.h
> @@ -30,3 +30,8 @@
>    descr->member = (value)
>  #define THREAD_SETMEM_NC(descr, member, idx, value) \
>    descr->member[idx] = (value)
> +
> +#define RSEQ_GETMEM_VOLATILE(descr, member) \
> +  THREAD_GETMEM_VOLATILE(descr, member)
> +#define RSEQ_SETMEM(descr, member, value) \
> +  THREAD_SETMEM(descr, member, value)
> diff --git a/sysdeps/unix/sysv/linux/rseq-internal.h b/sysdeps/unix/sysv/linux/rseq-internal.h
> index 294880c04e..a944c71421 100644
> --- a/sysdeps/unix/sysv/linux/rseq-internal.h
> +++ b/sysdeps/unix/sysv/linux/rseq-internal.h
> @@ -24,6 +24,24 @@
>  #include
>  #include
>  #include
> +#include
> +#include
> +
> +/* rseq area registered with the kernel.  Use a custom definition
> +   here to isolate from kernel struct rseq changes.  The
> +   implementation of sched_getcpu needs acccess to the cpu_id field;
> +   the other fields are unused and not included here.  */
> +struct rseq_area
> +{
> +  uint32_t cpu_id_start;
> +  uint32_t cpu_id;
> +};
> +
> +static inline struct rseq_area *
> +rseq_get_area(void)
> +{
> +  return (struct rseq_area *) ((char *) __thread_pointer() + GLRO (dl_tls_rseq_offset));
> +}
>
>  #ifdef RSEQ_SIG
>  static inline bool
> @@ -31,20 +49,20 @@ rseq_register_current_thread (struct pthread *self, bool do_rseq)
>  {
>    if (do_rseq)
>      {
> -      int ret = INTERNAL_SYSCALL_CALL (rseq, &self->rseq_area,
> -                                       sizeof (self->rseq_area),
> +      int ret = INTERNAL_SYSCALL_CALL (rseq, rseq_get_area(),
> +                                       GLRO (dl_tls_rseq_size),
>                                         0, RSEQ_SIG);
>        if (!INTERNAL_SYSCALL_ERROR_P (ret))
>          return true;
>      }
> -  THREAD_SETMEM (self, rseq_area.cpu_id, RSEQ_CPU_ID_REGISTRATION_FAILED);
> +  RSEQ_SETMEM (rseq_get_area(), cpu_id, RSEQ_CPU_ID_REGISTRATION_FAILED);
>    return false;
>  }
>  #else /* RSEQ_SIG */
>  static inline bool
>  rseq_register_current_thread (struct pthread *self, bool do_rseq)
>  {
> -  THREAD_SETMEM (self, rseq_area.cpu_id, RSEQ_CPU_ID_REGISTRATION_FAILED);
> +  RSEQ_SETMEM (rseq_get_area(), cpu_id, RSEQ_CPU_ID_REGISTRATION_FAILED);
>    return false;
>  }
>  #endif /* RSEQ_SIG */
> diff --git a/sysdeps/unix/sysv/linux/sched_getcpu.c b/sysdeps/unix/sysv/linux/sched_getcpu.c
> index 4457d714bc..6109c68625 100644
> --- a/sysdeps/unix/sysv/linux/sched_getcpu.c
> +++ b/sysdeps/unix/sysv/linux/sched_getcpu.c
> @@ -19,6 +19,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  static int
>  vsyscall_sched_getcpu (void)
> @@ -37,7 +38,7 @@ vsyscall_sched_getcpu (void)
>  int
>  sched_getcpu (void)
>  {
> -  int cpu_id = THREAD_GETMEM_VOLATILE (THREAD_SELF, rseq_area.cpu_id);
> +  int cpu_id = RSEQ_GETMEM_VOLATILE (rseq_get_area(), cpu_id);
>    return __glibc_likely (cpu_id >= 0) ? cpu_id : vsyscall_sched_getcpu ();
>  }
>  #else /* RSEQ_SIG */
> diff --git a/sysdeps/unix/sysv/linux/tst-rseq-disable.c b/sysdeps/unix/sysv/linux/tst-rseq-disable.c
> index cc7e94b3fe..a58d35ba02 100644
> --- a/sysdeps/unix/sysv/linux/tst-rseq-disable.c
> +++ b/sysdeps/unix/sysv/linux/tst-rseq-disable.c
> @@ -23,6 +23,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>
>  #ifdef RSEQ_SIG
> @@ -31,22 +32,23 @@
>  static void
>  check_rseq_disabled (void)
>  {
> -  struct pthread *pd = THREAD_SELF;
> +  size_t rseq_size = MAX (getauxval (AT_RSEQ_FEATURE_SIZE), 32);
> +  struct rseq *rseq_area = (struct rseq *) ((char *) __thread_pointer () + __rseq_offset);
>
>    TEST_COMPARE (__rseq_flags, 0);
> -  TEST_VERIFY ((char *) __thread_pointer () + __rseq_offset
> -               == (char *) &pd->rseq_area);
> -  TEST_COMPARE (__rseq_size, 0);
> -  TEST_COMPARE ((int) pd->rseq_area.cpu_id, RSEQ_CPU_ID_REGISTRATION_FAILED);
> +  //FIXME: unsure how to test this
> +  //TEST_VERIFY ((char *) __thread_pointer () + __rseq_offset
> +  //             == (char *) &pd->rseq_area);
> +  TEST_COMPARE (__rseq_size, rseq_size);
> +  TEST_COMPARE ((int) rseq_area->cpu_id, RSEQ_CPU_ID_REGISTRATION_FAILED);
>
> -  int ret = syscall (__NR_rseq, &pd->rseq_area, sizeof (pd->rseq_area),
> -                     0, RSEQ_SIG);
> +  int ret = syscall (__NR_rseq, rseq_area, __rseq_size, 0, RSEQ_SIG);
>    if (ret == 0)
>      {
> -      ret = syscall (__NR_rseq, &pd->rseq_area, sizeof (pd->rseq_area),
> +      ret = syscall (__NR_rseq, rseq_area, __rseq_size,
>                       RSEQ_FLAG_UNREGISTER, RSEQ_SIG);
>        TEST_COMPARE (ret, 0);
> -      pd->rseq_area.cpu_id = RSEQ_CPU_ID_REGISTRATION_FAILED;
> +      rseq_area->cpu_id = RSEQ_CPU_ID_REGISTRATION_FAILED;
>      }
>    else
>      {
> diff --git a/sysdeps/unix/sysv/linux/tst-rseq.c b/sysdeps/unix/sysv/linux/tst-rseq.c
> index 16983503b1..a6af0e783f 100644
> --- a/sysdeps/unix/sysv/linux/tst-rseq.c
> +++ b/sysdeps/unix/sysv/linux/tst-rseq.c
> @@ -31,18 +31,20 @@
>  # include
>  # include
>  # include
> +# include
>  # include "tst-rseq.h"
>
>  static void
>  do_rseq_main_test (void)
>  {
> -  struct pthread *pd = THREAD_SELF;
> +  size_t rseq_size = MAX (getauxval (AT_RSEQ_FEATURE_SIZE), 32);
>
>    TEST_VERIFY_EXIT (rseq_thread_registered ());
>    TEST_COMPARE (__rseq_flags, 0);
> -  TEST_VERIFY ((char *) __thread_pointer () + __rseq_offset
> -               == (char *) &pd->rseq_area);
> -  TEST_COMPARE (__rseq_size, sizeof (pd->rseq_area));
> +  //FIXME: unsure how to test this
> +  //TEST_VERIFY ((char *) __thread_pointer () + __rseq_offset
> +  //             == (char *) &pd->rseq_area);
> +  TEST_COMPARE (__rseq_size, rseq_size);
>  }
>
>  static void
> diff --git a/sysdeps/unix/sysv/linux/tst-rseq.h b/sysdeps/unix/sysv/linux/tst-rseq.h
> index 95d12048df..38a14e27d5 100644
> --- a/sysdeps/unix/sysv/linux/tst-rseq.h
> +++ b/sysdeps/unix/sysv/linux/tst-rseq.h
> @@ -23,11 +23,12 @@
>  #include
>  #include
>  #include
> +#include
>
>  static inline bool
>  rseq_thread_registered (void)
>  {
> -  return THREAD_GETMEM_VOLATILE (THREAD_SELF, rseq_area.cpu_id) >= 0;
> +  return RSEQ_GETMEM_VOLATILE (((struct rseq_area *) ((char *) __thread_pointer () + __rseq_offset)), cpu_id) >= 0;
>  }
>
>  static inline int
> diff --git a/sysdeps/x86_64/nptl/tcb-access.h b/sysdeps/x86_64/nptl/tcb-access.h
> index 110b1be44d..d45413b1e3 100644
> --- a/sysdeps/x86_64/nptl/tcb-access.h
> +++ b/sysdeps/x86_64/nptl/tcb-access.h
> @@ -67,7 +67,6 @@
>         } \
>       __value; })
>
> -
>  /* Loading addresses of objects on x86-64 needs to be treated special
>     when generating PIC code.  */
>  #ifdef __pic__
> @@ -130,3 +129,57 @@
>                      "i" (offsetof (struct pthread, member[0])), \
>                      "r" (idx)); \
>      }})
> +
> +/* Read member of the RSEQ area directly.  */
> +# define RSEQ_GETMEM_VOLATILE(descr, member) \
> +  ({ __typeof (descr->member) __value; \
> +     _Static_assert (sizeof (__value) == 1 \
> +                     || sizeof (__value) == 4 \
> +                     || sizeof (__value) == 8, \
> +                     "size of per-thread data"); \
> +     if (sizeof (__value) == 1) \
> +       asm volatile ("movb %%fs:%P2(%q3),%b0" \
> +                     : "=q" (__value) \
> +                     : "0" (0), "i" (offsetof (struct rseq_area, member)), \
> +                       "r" (__rseq_offset)); \
> +     else if (sizeof (__value) == 4) \
> +       asm volatile ("movl %%fs:%P1(%q2),%0" \
> +                     : "=r" (__value) \
> +                     : "i" (offsetof (struct rseq_area, member)), \
> +                       "r" (__rseq_offset)); \
> +     else /* 8 */ \
> +       { \
> +         asm volatile ("movq %%fs:%P1(%q2),%q0" \
> +                       : "=r" (__value) \
> +                       : "i" (offsetof (struct rseq_area, member)), \
> +                         "r" (__rseq_offset)); \
> +       } \
> +     __value; })
> +
> +/* Set member of the RSEQ area directly.  */
> +# define RSEQ_SETMEM(descr, member, value) \
> +  ({ \
> +     _Static_assert (sizeof (descr->member) == 1 \
> +                     || sizeof (descr->member) == 4 \
> +                     || sizeof (descr->member) == 8, \
> +                     "size of per-thread data"); \
> +     if (sizeof (descr->member) == 1) \
> +       asm volatile ("movb %b0,%%fs:%P1(%q2)" : \
> +                     : "iq" (value), \
> +                       "i" (offsetof (struct rseq_area, member)), \
> +                       "r" (__rseq_offset)); \
> +     else if (sizeof (descr->member) == 4) \
> +       asm volatile ("movl %0,%%fs:%P1(%q2)" : \
> +                     : IMM_MODE (value), \
> +                       "i" (offsetof (struct rseq_area, member)), \
> +                       "r" (__rseq_offset)); \
> +     else /* 8 */ \
> +       { \
> +         /* Since movq takes a signed 32-bit immediate or a register source \
> +            operand, use "er" constraint for 32-bit signed integer constant \
> +            or register.  */ \
> +         asm volatile ("movq %q0,%%fs:%P1(%q2)" : \
> +                       : "er" ((uint64_t) cast_to_integer (value)), \
> +                         "i" (offsetof (struct rseq_area, member)), \
> +                         "r" (__rseq_offset)); \
> +       }})