public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v5 0/6] x86/cet: Update CET kernel interface
@ 2023-12-22 16:58 H.J. Lu
  2023-12-22 16:58 ` [PATCH v5 1/6] x86/cet: Sync with Linux kernel 6.6 shadow stack interface H.J. Lu
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: H.J. Lu @ 2023-12-22 16:58 UTC (permalink / raw)
  To: libc-alpha; +Cc: rick.p.edgecombe, goldstein.w.n

Changes in v5.

1. Rebase.
2. Move allocate-shadow-stack.[ch] to sysdeps/unix/sysv/linux/x86_64.

Changes in v4.

1. Rebase.
2. Remove 3 patches which have been checked into master branch.

Changes in v3:

1. Remove 7 test patches which have been checked into master branch.

Changes in v2:

1. Add add extra 20 stack frames in shadow stack for signal handlers
when allocating shadow stack for ucontexts.
2. Remove the "x86: Check PT_GNU_PROPERTY early" patch which has been
checked into master branch.


Linux kernel 6.6 added SHSTK support for x86-64.  This patch set updates
CET kernel interface to Linux kernel 6.6.  The main difference from the
current glibc assumption is that SHSTK is enabled by glibc, instead of
kernel.  Glibc enables SHSTK after verifying that the application and
all dependency libraries are CET enabled.  SHSTK can only be enabled in a
function which will never return.  Otherwise, shadow stack will underflow
at the function return.

Not all CET enabled applications and libraries have been properly tested
in CET enabled environments.  Some CET enabled applications or libraries
will crash or misbehave when CET is enabled.  Don't set CET active by
default so that all applications and libraries will run normally regardless
of whether CET is active or not.  Shadow stack can be enabled by

$ export GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK

at run-time if shadow stack can be enabled by kernel.

Since only x86-64 is supported, i386 shadow stack codes are unchanged
and CET shouldn't be enabled for i386.

NB: This change can be reverted if it is OK to enable CET by default for
all applications and libraries.

Tested on Intel Tiger Lake under Linux kernel 6.6.7.

H.J. Lu (6):
  x86/cet: Sync with Linux kernel 6.6 shadow stack interface
  elf: Always provide _dl_get_dl_main_map in libc.a
  x86/cet: Enable shadow stack during startup
  x86/cet: Check feature_1 in TCB for active IBT and SHSTK
  x86/cet: Don't set CET active by default
  x86/cet: Run some CET tests with shadow stack

 elf/dl-support.c                              |  2 -
 sysdeps/generic/ldsodefs.h                    |  8 +-
 sysdeps/unix/sysv/linux/x86/bits/mman.h       |  5 ++
 sysdeps/unix/sysv/linux/x86/dl-cet.h          | 39 +++++++++-
 .../unix/sysv/linux/x86/include/asm/prctl.h   | 37 ++++-----
 .../sysv/linux/x86/tst-cet-setcontext-1.c     | 17 ++--
 sysdeps/unix/sysv/linux/x86_64/Makefile       |  2 +-
 .../unix/sysv/linux/x86_64/__start_context.S  | 38 ++-------
 .../sysv/linux/x86_64/allocate-shadow-stack.c | 55 +++++++++++++
 .../allocate-shadow-stack.h}                  | 32 ++------
 sysdeps/unix/sysv/linux/x86_64/dl-cet.h       | 47 +++++++++++
 sysdeps/unix/sysv/linux/x86_64/getcontext.S   | 30 ++------
 sysdeps/unix/sysv/linux/x86_64/makecontext.c  | 28 +++----
 sysdeps/unix/sysv/linux/x86_64/swapcontext.S  | 22 +-----
 sysdeps/x86/Makefile                          | 14 ++++
 sysdeps/x86/bits/platform/x86.h               |  8 ++
 sysdeps/x86/cpu-features-offsets.sym          |  1 +
 sysdeps/x86/cpu-features.c                    | 48 +-----------
 sysdeps/x86/cpu-tunables.c                    | 15 +++-
 sysdeps/x86/dl-cet.c                          | 77 +++++++++----------
 sysdeps/x86/get-cpuid-feature-leaf.c          | 13 +++-
 sysdeps/x86/include/cpu-features.h            |  3 +
 sysdeps/x86/libc-start.h                      | 54 ++++++++++++-
 sysdeps/x86/sys/platform/x86.h                | 17 ++++
 sysdeps/x86/tst-shstk-legacy-1e-static.sh     |  1 +
 sysdeps/x86/tst-shstk-legacy-1e.sh            |  1 +
 sysdeps/x86/tst-shstk-legacy-1g.sh            |  1 +
 sysdeps/x86_64/dl-machine.h                   | 12 ++-
 sysdeps/x86_64/nptl/tls.h                     |  2 +-
 29 files changed, 381 insertions(+), 248 deletions(-)
 create mode 100644 sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c
 rename sysdeps/unix/sysv/linux/{x86/cpu-features.c => x86_64/allocate-shadow-stack.h} (53%)
 create mode 100644 sysdeps/unix/sysv/linux/x86_64/dl-cet.h

-- 
2.43.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v5 1/6] x86/cet: Sync with Linux kernel 6.6 shadow stack interface
  2023-12-22 16:58 [PATCH v5 0/6] x86/cet: Update CET kernel interface H.J. Lu
@ 2023-12-22 16:58 ` H.J. Lu
  2023-12-26 17:37   ` Noah Goldstein
  2023-12-22 16:58 ` [PATCH v5 2/6] elf: Always provide _dl_get_dl_main_map in libc.a H.J. Lu
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: H.J. Lu @ 2023-12-22 16:58 UTC (permalink / raw)
  To: libc-alpha; +Cc: rick.p.edgecombe, goldstein.w.n

Sync with Linux kernel 6.6 shadow stack interface.  Since only x86-64 is
supported, i386 shadow stack codes are unchanged and CET shouldn't be
enabled for i386.

1. When the shadow stack base in TCB is unset, the default shadow stack
is in use.  Use the current shadow stack pointer as the marker for the
default shadow stack. It is used to identify if the current shadow stack
is the same as the target shadow stack when switching ucontexts.  If yes,
INCSSP will be used to unwind shadow stack.  Otherwise, shadow stack
restore token will be used.
2. Allocate shadow stack with the map_shadow_stack syscall.  Since there
is no function to explicitly release ucontext, there is no place to
release shadow stack allocated by map_shadow_stack in ucontext functions.
Such shadow stacks will be leaked.
3. Rename arch_prctl CET commands to ARCH_SHSTK_XXX.
4. Rewrite the CET control functions with the current kernel shadow stack
interface.

Since CET is no longer enabled by kernel, a separate patch will enable
shadow stack during startup.
---
 sysdeps/unix/sysv/linux/x86/bits/mman.h       |  5 ++
 sysdeps/unix/sysv/linux/x86/cpu-features.c    | 13 +++--
 sysdeps/unix/sysv/linux/x86/dl-cet.h          | 16 ++++--
 .../unix/sysv/linux/x86/include/asm/prctl.h   | 37 ++++++-------
 .../sysv/linux/x86/tst-cet-setcontext-1.c     | 17 +++---
 sysdeps/unix/sysv/linux/x86_64/Makefile       |  2 +-
 .../unix/sysv/linux/x86_64/__start_context.S  | 38 +++----------
 .../sysv/linux/x86_64/allocate-shadow-stack.c | 55 +++++++++++++++++++
 .../sysv/linux/x86_64/allocate-shadow-stack.h | 24 ++++++++
 sysdeps/unix/sysv/linux/x86_64/getcontext.S   | 30 ++--------
 sysdeps/unix/sysv/linux/x86_64/makecontext.c  | 28 +++++-----
 sysdeps/unix/sysv/linux/x86_64/swapcontext.S  | 22 ++------
 sysdeps/x86/cpu-features.c                    | 15 +++--
 sysdeps/x86/dl-cet.c                          |  2 +-
 sysdeps/x86_64/nptl/tls.h                     |  2 +-
 15 files changed, 173 insertions(+), 133 deletions(-)
 create mode 100644 sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c
 create mode 100644 sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h

diff --git a/sysdeps/unix/sysv/linux/x86/bits/mman.h b/sysdeps/unix/sysv/linux/x86/bits/mman.h
index 3d356e86a0..232b55a13d 100644
--- a/sysdeps/unix/sysv/linux/x86/bits/mman.h
+++ b/sysdeps/unix/sysv/linux/x86/bits/mman.h
@@ -27,6 +27,11 @@
 #define MAP_32BIT	0x40		/* Only give out 32-bit addresses.  */
 #define MAP_ABOVE4G	0x80		/* Only map above 4GB.  */
 
+#ifdef __USE_MISC
+/* Set up a restore token in the newly allocated shadow stack */
+# define SHADOW_STACK_SET_TOKEN 0x1
+#endif
+
 #include <bits/mman-map-flags-generic.h>
 
 /* Include generic Linux declarations.  */
diff --git a/sysdeps/unix/sysv/linux/x86/cpu-features.c b/sysdeps/unix/sysv/linux/x86/cpu-features.c
index 41e7600668..0e6e2bf855 100644
--- a/sysdeps/unix/sysv/linux/x86/cpu-features.c
+++ b/sysdeps/unix/sysv/linux/x86/cpu-features.c
@@ -23,10 +23,15 @@
 static inline int __attribute__ ((always_inline))
 get_cet_status (void)
 {
-  unsigned long long cet_status[3];
-  if (INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_CET_STATUS, cet_status) == 0)
-    return cet_status[0];
-  return 0;
+  unsigned long long kernel_feature;
+  unsigned int status = 0;
+  if (INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_STATUS,
+			     &kernel_feature) == 0)
+    {
+      if ((kernel_feature & ARCH_SHSTK_SHSTK) != 0)
+	status = GNU_PROPERTY_X86_FEATURE_1_SHSTK;
+    }
+  return status;
 }
 
 # ifndef SHARED
diff --git a/sysdeps/unix/sysv/linux/x86/dl-cet.h b/sysdeps/unix/sysv/linux/x86/dl-cet.h
index c885bf1323..da220ac627 100644
--- a/sysdeps/unix/sysv/linux/x86/dl-cet.h
+++ b/sysdeps/unix/sysv/linux/x86/dl-cet.h
@@ -21,12 +21,20 @@
 static inline int __attribute__ ((always_inline))
 dl_cet_disable_cet (unsigned int cet_feature)
 {
-  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_CET_DISABLE,
-				      cet_feature);
+  if (cet_feature != GNU_PROPERTY_X86_FEATURE_1_SHSTK)
+    return -1;
+  long long int kernel_feature = ARCH_SHSTK_SHSTK;
+  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_DISABLE,
+				      kernel_feature);
 }
 
 static inline int __attribute__ ((always_inline))
-dl_cet_lock_cet (void)
+dl_cet_lock_cet (unsigned int cet_feature)
 {
-  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_CET_LOCK, 0);
+  if (cet_feature != GNU_PROPERTY_X86_FEATURE_1_SHSTK)
+    return -1;
+  /* Lock all SHSTK features.  */
+  long long int kernel_feature = -1;
+  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_LOCK,
+				      kernel_feature);
 }
diff --git a/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h b/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h
index 45ad0b052f..2f511321ad 100644
--- a/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h
+++ b/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h
@@ -4,24 +4,19 @@
 
 #include_next <asm/prctl.h>
 
-#ifndef ARCH_CET_STATUS
-/* CET features:
-   IBT:   GNU_PROPERTY_X86_FEATURE_1_IBT
-   SHSTK: GNU_PROPERTY_X86_FEATURE_1_SHSTK
- */
-/* Return CET features in unsigned long long *addr:
-     features: addr[0].
-     shadow stack base address: addr[1].
-     shadow stack size: addr[2].
- */
-# define ARCH_CET_STATUS	0x3001
-/* Disable CET features in unsigned int features.  */
-# define ARCH_CET_DISABLE	0x3002
-/* Lock all CET features.  */
-# define ARCH_CET_LOCK		0x3003
-/* Allocate a new shadow stack with unsigned long long *addr:
-     IN: requested shadow stack size: *addr.
-     OUT: allocated shadow stack address: *addr.
- */
-# define ARCH_CET_ALLOC_SHSTK	0x3004
-#endif /* ARCH_CET_STATUS */
+#ifndef ARCH_SHSTK_ENABLE
+/* Enable SHSTK features in unsigned long int features.  */
+# define ARCH_SHSTK_ENABLE		0x5001
+/* Disable SHSTK features in unsigned long int features.  */
+# define ARCH_SHSTK_DISABLE		0x5002
+/* Lock SHSTK features in unsigned long int features.  */
+# define ARCH_SHSTK_LOCK		0x5003
+/* Unlock SHSTK features in unsigned long int features.  */
+# define ARCH_SHSTK_UNLOCK		0x5004
+/* Return SHSTK features in unsigned long int features.  */
+# define ARCH_SHSTK_STATUS		0x5005
+
+/* ARCH_SHSTK_ features bits */
+# define ARCH_SHSTK_SHSTK		0x1
+# define ARCH_SHSTK_WRSS		0x2
+#endif
diff --git a/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c b/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c
index 837a9fd0eb..2ea66c803b 100644
--- a/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c
+++ b/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c
@@ -87,15 +87,14 @@ do_test (void)
   ctx[4].uc_link = &ctx[0];
   makecontext (&ctx[4], (void (*) (void)) f1, 0);
 
-  /* NB: When shadow stack is enabled, makecontext calls arch_prctl
-     with ARCH_CET_ALLOC_SHSTK to allocate a new shadow stack which
-     can be unmapped.  The base address and size of the new shadow
-     stack are returned in __ssp[1] and __ssp[2].  makecontext is
-     called for CTX1, CTX3 and CTX4.  But only CTX1 is used.  New
-     shadow stacks are allocated in the order of CTX3, CTX1, CTX4.
-     It is very likely that CTX1's shadow stack is placed between
-     CTX3 and CTX4.  We munmap CTX3's and CTX4's shadow stacks to
-     create gaps above and below CTX1's shadow stack.  We check
+  /* NB: When shadow stack is enabled, makecontext calls map_shadow_stack
+     to allocate a new shadow stack which can be unmapped.  The base
+     address and size of the new shadow stack are returned in __ssp[1]
+     and __ssp[2].  makecontext is called for CTX1, CTX3 and CTX4.  But
+     only CTX1 is used.  New shadow stacks are allocated in the order
+     of CTX3, CTX1, CTX4.  It is very likely that CTX1's shadow stack is
+     placed between CTX3 and CTX4.  We munmap CTX3's and CTX4's shadow
+     stacks to create gaps above and below CTX1's shadow stack.  We check
      that setcontext CTX1 works correctly in this case.  */
   if (_get_ssp () != 0)
     {
diff --git a/sysdeps/unix/sysv/linux/x86_64/Makefile b/sysdeps/unix/sysv/linux/x86_64/Makefile
index 5e19202ebf..06b873949e 100644
--- a/sysdeps/unix/sysv/linux/x86_64/Makefile
+++ b/sysdeps/unix/sysv/linux/x86_64/Makefile
@@ -3,7 +3,7 @@ sysdep_routines += ioperm iopl
 endif
 
 ifeq ($(subdir),stdlib)
-sysdep_routines += __start_context
+sysdep_routines += __start_context allocate-shadow-stack
 endif
 
 ifeq ($(subdir),csu)
diff --git a/sysdeps/unix/sysv/linux/x86_64/__start_context.S b/sysdeps/unix/sysv/linux/x86_64/__start_context.S
index f6436dd6bb..ae04203c90 100644
--- a/sysdeps/unix/sysv/linux/x86_64/__start_context.S
+++ b/sysdeps/unix/sysv/linux/x86_64/__start_context.S
@@ -24,20 +24,14 @@
 /* Use CALL to push __start_context onto the new stack as well as the new
    shadow stack.  RDI points to ucontext:
    Incoming:
-     __ssp[0]: The original caller's shadow stack pointer.
-     __ssp[1]: The size of the new shadow stack.
-     __ssp[2]: The size of the new shadow stack.
-   Outgoing:
      __ssp[0]: The new shadow stack pointer.
      __ssp[1]: The base address of the new shadow stack.
      __ssp[2]: The size of the new shadow stack.
  */
 
 ENTRY(__push___start_context)
-	/* Save the pointer to ucontext.  */
-	movq	%rdi, %r9
 	/* Get the original shadow stack pointer.  */
-	rdsspq	%r8
+	rdsspq	%rcx
 	/* Save the original stack pointer.  */
 	movq	%rsp, %rdx
 	/* Load the top of the new stack into RSI.  */
@@ -45,24 +39,12 @@ ENTRY(__push___start_context)
 	/* Add 8 bytes to RSI since CALL will push the 8-byte return
 	   address onto stack.  */
 	leaq	8(%rsi), %rsp
-	/* Allocate the new shadow stack.  The size of the new shadow
-	   stack is passed in __ssp[1].  */
-	lea	(oSSP + 8)(%rdi), %RSI_LP
-	movl	$ARCH_CET_ALLOC_SHSTK, %edi
-	movl	$__NR_arch_prctl, %eax
-	/* The new shadow stack base is returned in __ssp[1].  */
-	syscall
-	testq	%rax, %rax
-	jne	L(hlt)		/* This should never happen.  */
-
-	/* Get the size of the new shadow stack.  */
-	movq	8(%rsi), %rdi
-
-	/* Get the base address of the new shadow stack.  */
-	movq	(%rsi), %rsi
-
+	/* The size of the new shadow stack is stored in __ssp[2].  */
+	mov	(oSSP + 16)(%rdi), %RSI_LP
+	/* The new shadow stack base is stored in __ssp[1].  */
+	mov	(oSSP + 8)(%rdi), %RAX_LP
 	/* Use the restore stoken to restore the new shadow stack.  */
-	rstorssp -8(%rsi, %rdi)
+	rstorssp -8(%rax, %rsi)
 
 	/* Save the restore token on the original shadow stack.  */
 	saveprevssp
@@ -73,18 +55,12 @@ ENTRY(__push___start_context)
 	jmp	__start_context
 1:
 
-	/* Get the new shadow stack pointer.  */
-	rdsspq	%rdi
-
 	/* Use the restore stoken to restore the original shadow stack.  */
-	rstorssp -8(%r8)
+	rstorssp -8(%rcx)
 
 	/* Save the restore token on the new shadow stack.  */
 	saveprevssp
 
-	/* Store the new shadow stack pointer in __ssp[0].  */
-	movq	%rdi, oSSP(%r9)
-
 	/* Restore the original stack.  */
 	mov	%rdx, %rsp
 	ret
diff --git a/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c
new file mode 100644
index 0000000000..f2e1d03b96
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c
@@ -0,0 +1,55 @@
+/* Helper function to allocate shadow stack.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/mman.h>
+#include <libc-pointer-arith.h>
+#include <allocate-shadow-stack.h>
+
+/* NB: This can be treated as a syscall by caller.  */
+
+long int
+__allocate_shadow_stack (size_t stack_size,
+			 shadow_stack_size_t *child_stack)
+{
+#ifdef __NR_map_shadow_stack
+  size_t shadow_stack_size
+    = stack_size >> STACK_SIZE_TO_SHADOW_STACK_SIZE_SHIFT;
+  /* Align shadow stack to 8 bytes.  */
+  shadow_stack_size = ALIGN_UP (shadow_stack_size, 8);
+  /* Since sigaltstack shares shadow stack with the current context in
+     the thread, add extra 20 stack frames in shadow stack for signal
+     handlers.  */
+  shadow_stack_size += 20 * 8;
+  void *shadow_stack = (void *)INLINE_SYSCALL_CALL
+    (map_shadow_stack, NULL, shadow_stack_size, SHADOW_STACK_SET_TOKEN);
+  /* Report the map_shadow_stack error.  */
+  if (shadow_stack == MAP_FAILED)
+    return -errno;
+
+  /* Save the shadow stack base and size on child stack.  */
+  child_stack[0] = (uintptr_t) shadow_stack;
+  child_stack[1] = shadow_stack_size;
+
+  return 0;
+#else
+  return -ENOSYS;
+#endif
+}
diff --git a/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h
new file mode 100644
index 0000000000..d05aaf16e5
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h
@@ -0,0 +1,24 @@
+/* Helper function to allocate shadow stack.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <ucontext.h>
+
+typedef __typeof (((ucontext_t *) 0)->__ssp[0]) shadow_stack_size_t;
+
+extern long int __allocate_shadow_stack (size_t, shadow_stack_size_t *)
+  attribute_hidden;
diff --git a/sysdeps/unix/sysv/linux/x86_64/getcontext.S b/sysdeps/unix/sysv/linux/x86_64/getcontext.S
index a00e2f6290..71f3802dca 100644
--- a/sysdeps/unix/sysv/linux/x86_64/getcontext.S
+++ b/sysdeps/unix/sysv/linux/x86_64/getcontext.S
@@ -58,35 +58,15 @@ ENTRY(__getcontext)
 	testl	$X86_FEATURE_1_SHSTK, %fs:FEATURE_1_OFFSET
 	jz	L(no_shstk)
 
-	/* Save RDI in RDX which won't be clobbered by syscall.  */
-	movq	%rdi, %rdx
-
 	xorl	%eax, %eax
 	cmpq	%fs:SSP_BASE_OFFSET, %rax
 	jnz	L(shadow_stack_bound_recorded)
 
-	/* Get the base address and size of the default shadow stack
-	   which must be the current shadow stack since nothing has
-	   been recorded yet.  */
-	sub	$24, %RSP_LP
-	mov	%RSP_LP, %RSI_LP
-	movl	$ARCH_CET_STATUS, %edi
-	movl	$__NR_arch_prctl, %eax
-	syscall
-	testq	%rax, %rax
-	jz	L(continue_no_err)
-
-	/* This should never happen.  */
-	hlt
-
-L(continue_no_err):
-	/* Record the base of the current shadow stack.  */
-	movq	8(%rsp), %rax
+	/* When the shadow stack base is unset, the default shadow
+	   stack is in use.  Use the current shadow stack pointer
+	   as the marker for the default shadow stack.  */
+	rdsspq	%rax
 	movq	%rax, %fs:SSP_BASE_OFFSET
-	add	$24, %RSP_LP
-
-	/* Restore RDI.  */
-	movq	%rdx, %rdi
 
 L(shadow_stack_bound_recorded):
 	/* Get the current shadow stack pointer.  */
@@ -94,7 +74,7 @@ L(shadow_stack_bound_recorded):
 	/* NB: Save the caller's shadow stack so that we can jump back
 	   to the caller directly.  */
 	addq	$8, %rax
-	movq	%rax, oSSP(%rdx)
+	movq	%rax, oSSP(%rdi)
 
 	/* Save the current shadow stack base in ucontext.  */
 	movq	%fs:SSP_BASE_OFFSET, %rax
diff --git a/sysdeps/unix/sysv/linux/x86_64/makecontext.c b/sysdeps/unix/sysv/linux/x86_64/makecontext.c
index de9e03eb81..e4f025bd50 100644
--- a/sysdeps/unix/sysv/linux/x86_64/makecontext.c
+++ b/sysdeps/unix/sysv/linux/x86_64/makecontext.c
@@ -24,6 +24,7 @@
 # include <pthread.h>
 # include <libc-pointer-arith.h>
 # include <sys/prctl.h>
+# include <allocate-shadow-stack.h>
 #endif
 
 #include "ucontext_i.h"
@@ -88,23 +89,24 @@ __makecontext (ucontext_t *ucp, void (*func) (void), int argc, ...)
   if ((feature_1 & X86_FEATURE_1_SHSTK) != 0)
     {
       /* Shadow stack is enabled.  We need to allocate a new shadow
-         stack.  */
-      unsigned long ssp_size = (((uintptr_t) sp
-				 - (uintptr_t) ucp->uc_stack.ss_sp)
-				>> STACK_SIZE_TO_SHADOW_STACK_SIZE_SHIFT);
-      /* Align shadow stack to 8 bytes.  */
-      ssp_size = ALIGN_UP (ssp_size, 8);
-
-      ucp->__ssp[1] = ssp_size;
-      ucp->__ssp[2] = ssp_size;
-
-      /* Call __push___start_context to allocate a new shadow stack,
-	 push __start_context onto the new stack as well as the new
-	 shadow stack.  NB: After __push___start_context returns,
+         stack.  NB:
 	   ucp->__ssp[0]: The new shadow stack pointer.
 	   ucp->__ssp[1]: The base address of the new shadow stack.
 	   ucp->__ssp[2]: The size of the new shadow stack.
        */
+      long int ret
+	= __allocate_shadow_stack (((uintptr_t) sp
+				    - (uintptr_t) ucp->uc_stack.ss_sp),
+				   &ucp->__ssp[1]);
+      if (ret != 0)
+	{
+	  /* FIXME: What should we do?  */
+	  abort ();
+	}
+
+      ucp->__ssp[0] = ucp->__ssp[1] + ucp->__ssp[2] - 8;
+      /* Call __push___start_context to push __start_context onto the new
+	 stack as well as the new shadow stack.  */
       __push___start_context (ucp);
     }
   else
diff --git a/sysdeps/unix/sysv/linux/x86_64/swapcontext.S b/sysdeps/unix/sysv/linux/x86_64/swapcontext.S
index 5925752164..2f2fe9875b 100644
--- a/sysdeps/unix/sysv/linux/x86_64/swapcontext.S
+++ b/sysdeps/unix/sysv/linux/x86_64/swapcontext.S
@@ -109,25 +109,11 @@ ENTRY(__swapcontext)
 	cmpq	%fs:SSP_BASE_OFFSET, %rax
 	jnz	L(shadow_stack_bound_recorded)
 
-	/* Get the base address and size of the default shadow stack
-	   which must be the current shadow stack since nothing has
-	   been recorded yet.  */
-	sub	$24, %RSP_LP
-	mov	%RSP_LP, %RSI_LP
-	movl	$ARCH_CET_STATUS, %edi
-	movl	$__NR_arch_prctl, %eax
-	syscall
-	testq	%rax, %rax
-	jz	L(continue_no_err)
-
-	/* This should never happen.  */
-	hlt
-
-L(continue_no_err):
-	/* Record the base of the current shadow stack.  */
-	movq	8(%rsp), %rax
+	/* When the shadow stack base is unset, the default shadow
+	   stack is in use.  Use the current shadow stack pointer
+	   as the marker for the default shadow stack.  */
+	rdsspq	%rax
 	movq	%rax, %fs:SSP_BASE_OFFSET
-	add	$24, %RSP_LP
 
 L(shadow_stack_bound_recorded):
         /* If we unwind the stack, we can't undo stack unwinding.  Just
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index 0bf923d48b..f180f0d9a4 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -1121,8 +1121,9 @@ no_cpuid:
 
 # ifndef SHARED
       /* Check if IBT and SHSTK are enabled by kernel.  */
-      if ((cet_status & GNU_PROPERTY_X86_FEATURE_1_IBT)
-	  || (cet_status & GNU_PROPERTY_X86_FEATURE_1_SHSTK))
+      if ((cet_status
+	   & (GNU_PROPERTY_X86_FEATURE_1_IBT
+	      | GNU_PROPERTY_X86_FEATURE_1_SHSTK)))
 	{
 	  /* Disable IBT and/or SHSTK if they are enabled by kernel, but
 	     disabled by environment variable:
@@ -1131,9 +1132,11 @@ no_cpuid:
 	   */
 	  unsigned int cet_feature = 0;
 	  if (!CPU_FEATURE_USABLE (IBT))
-	    cet_feature |= GNU_PROPERTY_X86_FEATURE_1_IBT;
+	    cet_feature |= (cet_status
+			    & GNU_PROPERTY_X86_FEATURE_1_IBT);
 	  if (!CPU_FEATURE_USABLE (SHSTK))
-	    cet_feature |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
+	    cet_feature |= (cet_status
+			    & GNU_PROPERTY_X86_FEATURE_1_SHSTK);
 
 	  if (cet_feature)
 	    {
@@ -1148,7 +1151,9 @@ no_cpuid:
 	     lock CET if IBT or SHSTK is enabled permissively.  */
 	  if (GL(dl_x86_feature_control).ibt != cet_permissive
 	      && GL(dl_x86_feature_control).shstk != cet_permissive)
-	    dl_cet_lock_cet ();
+	    dl_cet_lock_cet (GL(dl_x86_feature_1)
+			     & (GNU_PROPERTY_X86_FEATURE_1_IBT
+				| GNU_PROPERTY_X86_FEATURE_1_SHSTK));
 	}
 # endif
     }
diff --git a/sysdeps/x86/dl-cet.c b/sysdeps/x86/dl-cet.c
index e486e549be..66a78244d4 100644
--- a/sysdeps/x86/dl-cet.c
+++ b/sysdeps/x86/dl-cet.c
@@ -202,7 +202,7 @@ dl_cet_check_startup (struct link_map *m, struct dl_cet_info *info)
 	feature_1_lock |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
 
       if (feature_1_lock != 0
-	  && dl_cet_lock_cet () != 0)
+	  && dl_cet_lock_cet (feature_1_lock) != 0)
 	_dl_fatal_printf ("%s: can't lock CET\n", info->program);
     }
 
diff --git a/sysdeps/x86_64/nptl/tls.h b/sysdeps/x86_64/nptl/tls.h
index 1403f939f7..4bcc2552a1 100644
--- a/sysdeps/x86_64/nptl/tls.h
+++ b/sysdeps/x86_64/nptl/tls.h
@@ -60,7 +60,7 @@ typedef struct
   void *__private_tm[4];
   /* GCC split stack support.  */
   void *__private_ss;
-  /* The lowest address of shadow stack,  */
+  /* The marker for the current shadow stack.  */
   unsigned long long int ssp_base;
   /* Must be kept even if it is no longer used by glibc since programs,
      like AddressSanitizer, depend on the size of tcbhead_t.  */
-- 
2.43.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v5 2/6] elf: Always provide _dl_get_dl_main_map in libc.a
  2023-12-22 16:58 [PATCH v5 0/6] x86/cet: Update CET kernel interface H.J. Lu
  2023-12-22 16:58 ` [PATCH v5 1/6] x86/cet: Sync with Linux kernel 6.6 shadow stack interface H.J. Lu
@ 2023-12-22 16:58 ` H.J. Lu
  2023-12-29 14:45   ` Adhemerval Zanella Netto
  2023-12-22 16:58 ` [PATCH v5 3/6] x86/cet: Enable shadow stack during startup H.J. Lu
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: H.J. Lu @ 2023-12-22 16:58 UTC (permalink / raw)
  To: libc-alpha; +Cc: rick.p.edgecombe, goldstein.w.n

Always provide _dl_get_dl_main_map in libc.a.  It will be used by x86
to process PT_GNU_PROPERTY segment.
---
 elf/dl-support.c           | 2 --
 sysdeps/generic/ldsodefs.h | 8 ++++----
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/elf/dl-support.c b/elf/dl-support.c
index 837fa1c836..70c5b3599a 100644
--- a/elf/dl-support.c
+++ b/elf/dl-support.c
@@ -344,7 +344,6 @@ _dl_non_dynamic_init (void)
 DL_SYSINFO_IMPLEMENTATION
 #endif
 
-#if ENABLE_STATIC_PIE
 /* Since relocation to hidden _dl_main_map causes relocation overflow on
    aarch64, a function is used to get the address of _dl_main_map.  */
 
@@ -353,7 +352,6 @@ _dl_get_dl_main_map (void)
 {
   return &_dl_main_map;
 }
-#endif
 
 /* This is used by _dl_runtime_profile, not used on static code.  */
 void
diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
index 9b50ddd09f..0e8a008a49 100644
--- a/sysdeps/generic/ldsodefs.h
+++ b/sysdeps/generic/ldsodefs.h
@@ -1172,10 +1172,6 @@ void __libc_setup_tls (void);
 # if ENABLE_STATIC_PIE
 /* Relocate static executable with PIE.  */
 extern void _dl_relocate_static_pie (void) attribute_hidden;
-
-/* Get a pointer to _dl_main_map.  */
-extern struct link_map * _dl_get_dl_main_map (void)
-  __attribute__ ((visibility ("hidden")));
 # else
 #  define _dl_relocate_static_pie()
 # endif
@@ -1217,6 +1213,10 @@ rtld_hidden_proto (_dl_deallocate_tls)
 
 extern void _dl_nothread_init_static_tls (struct link_map *) attribute_hidden;
 
+/* Get a pointer to _dl_main_map.  */
+extern struct link_map * _dl_get_dl_main_map (void)
+  __attribute__ ((visibility ("hidden")));
+
 /* Find origin of the executable.  */
 extern const char *_dl_get_origin (void) attribute_hidden;
 
-- 
2.43.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v5 3/6] x86/cet: Enable shadow stack during startup
  2023-12-22 16:58 [PATCH v5 0/6] x86/cet: Update CET kernel interface H.J. Lu
  2023-12-22 16:58 ` [PATCH v5 1/6] x86/cet: Sync with Linux kernel 6.6 shadow stack interface H.J. Lu
  2023-12-22 16:58 ` [PATCH v5 2/6] elf: Always provide _dl_get_dl_main_map in libc.a H.J. Lu
@ 2023-12-22 16:58 ` H.J. Lu
  2023-12-29 14:55   ` Adhemerval Zanella Netto
  2023-12-22 16:58 ` [PATCH v5 4/6] x86/cet: Check feature_1 in TCB for active IBT and SHSTK H.J. Lu
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: H.J. Lu @ 2023-12-22 16:58 UTC (permalink / raw)
  To: libc-alpha; +Cc: rick.p.edgecombe, goldstein.w.n

Previously, CET was enabled by kernel before passing control to user
space and the startup code must disable CET if applications or shared
libraries aren't CET enabled.  Since the current kernel only supports
shadow stack and won't enable shadow stack before passing control to
user space, we need to enable shadow stack during startup if the
application and all shared library are shadow stack enabled.  There
is no need to disable shadow stack at startup.  Shadow stack can only
be enabled in a function which will never return.  Otherwise, shadow
stack will underflow at the function return.

1. GL(dl_x86_feature_1) is set to the CET features which are supported
by the processor and are not disabled by the tunable.  Only non-zero
features in GL(dl_x86_feature_1) should be enabled.  After enabling
shadow stack with ARCH_SHSTK_ENABLE, ARCH_SHSTK_STATUS is used to check
if shadow stack is really enabled.
2. Use ARCH_SHSTK_ENABLE in RTLD_START in dynamic executable.  It is
safe since RTLD_START never returns.
3. Call arch_prctl (ARCH_SHSTK_ENABLE) from ARCH_SETUP_TLS in static
executable.  Since the start function using ARCH_SETUP_TLS never returns,
it is safe to enable shadow stack in ARCH_SETUP_TLS.
---
 sysdeps/unix/sysv/linux/x86/cpu-features.c | 49 --------------
 sysdeps/unix/sysv/linux/x86/dl-cet.h       | 23 +++++++
 sysdeps/unix/sysv/linux/x86_64/dl-cet.h    | 47 +++++++++++++
 sysdeps/x86/cpu-features-offsets.sym       |  1 +
 sysdeps/x86/cpu-features.c                 | 51 --------------
 sysdeps/x86/dl-cet.c                       | 77 +++++++++++-----------
 sysdeps/x86/get-cpuid-feature-leaf.c       |  2 +-
 sysdeps/x86/include/cpu-features.h         |  3 +
 sysdeps/x86/libc-start.h                   | 54 ++++++++++++++-
 sysdeps/x86_64/dl-machine.h                | 12 +++-
 10 files changed, 175 insertions(+), 144 deletions(-)
 delete mode 100644 sysdeps/unix/sysv/linux/x86/cpu-features.c
 create mode 100644 sysdeps/unix/sysv/linux/x86_64/dl-cet.h

diff --git a/sysdeps/unix/sysv/linux/x86/cpu-features.c b/sysdeps/unix/sysv/linux/x86/cpu-features.c
deleted file mode 100644
index 0e6e2bf855..0000000000
--- a/sysdeps/unix/sysv/linux/x86/cpu-features.c
+++ /dev/null
@@ -1,49 +0,0 @@
-/* Initialize CPU feature data for Linux/x86.
-   This file is part of the GNU C Library.
-   Copyright (C) 2018-2023 Free Software Foundation, Inc.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#if CET_ENABLED
-# include <sys/prctl.h>
-# include <asm/prctl.h>
-
-static inline int __attribute__ ((always_inline))
-get_cet_status (void)
-{
-  unsigned long long kernel_feature;
-  unsigned int status = 0;
-  if (INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_STATUS,
-			     &kernel_feature) == 0)
-    {
-      if ((kernel_feature & ARCH_SHSTK_SHSTK) != 0)
-	status = GNU_PROPERTY_X86_FEATURE_1_SHSTK;
-    }
-  return status;
-}
-
-# ifndef SHARED
-static inline void
-x86_setup_tls (void)
-{
-  __libc_setup_tls ();
-  THREAD_SETMEM (THREAD_SELF, header.feature_1, GL(dl_x86_feature_1));
-}
-
-#  define ARCH_SETUP_TLS() x86_setup_tls ()
-# endif
-#endif
-
-#include <sysdeps/x86/cpu-features.c>
diff --git a/sysdeps/unix/sysv/linux/x86/dl-cet.h b/sysdeps/unix/sysv/linux/x86/dl-cet.h
index da220ac627..634c885d33 100644
--- a/sysdeps/unix/sysv/linux/x86/dl-cet.h
+++ b/sysdeps/unix/sysv/linux/x86/dl-cet.h
@@ -38,3 +38,26 @@ dl_cet_lock_cet (unsigned int cet_feature)
   return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_LOCK,
 				      kernel_feature);
 }
+
+static inline unsigned int __attribute__ ((always_inline))
+dl_cet_get_cet_status (void)
+{
+  unsigned long long kernel_feature;
+  unsigned int status = 0;
+  if (INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_STATUS,
+			     &kernel_feature) == 0)
+    {
+      if ((kernel_feature & ARCH_SHSTK_SHSTK) != 0)
+	status = GNU_PROPERTY_X86_FEATURE_1_SHSTK;
+    }
+  return status;
+}
+
+/* Enable shadow stack with a macro to avoid shadow stack underflow.  */
+#define ENABLE_X86_CET(cet_feature)				\
+  if ((cet_feature & GNU_PROPERTY_X86_FEATURE_1_SHSTK))		\
+    {								\
+      long long int kernel_feature = ARCH_SHSTK_SHSTK;		\
+      INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_ENABLE,	\
+			     kernel_feature);			\
+    }
diff --git a/sysdeps/unix/sysv/linux/x86_64/dl-cet.h b/sysdeps/unix/sysv/linux/x86_64/dl-cet.h
new file mode 100644
index 0000000000..e23e05c6b8
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/x86_64/dl-cet.h
@@ -0,0 +1,47 @@
+/* Linux/x86-64 CET initializers function.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <cpu-features-offsets.h>
+#include_next <dl-cet.h>
+
+#define X86_STRINGIFY_1(x)	#x
+#define X86_STRINGIFY(x)	X86_STRINGIFY_1 (x)
+
+/* Enable shadow stack before calling _dl_init if it is enabled in
+   GL(dl_x86_feature_1).  Call _dl_setup_x86_features to setup shadow
+   stack.  */
+#define RTLD_START_ENABLE_X86_FEATURES \
+"\
+	# Check if shadow stack is enabled in GL(dl_x86_feature_1).\n\
+	movl _rtld_local+" X86_STRINGIFY (RTLD_GLOBAL_DL_X86_FEATURE_1_OFFSET) "(%rip), %edx\n\
+	testl $" X86_STRINGIFY (X86_FEATURE_1_SHSTK) ", %edx\n\
+	jz 1f\n\
+	# Enable shadow stack if enabled in GL(dl_x86_feature_1).\n\
+	movl $" X86_STRINGIFY (ARCH_SHSTK_SHSTK) ", %esi\n\
+	movl $" X86_STRINGIFY (ARCH_SHSTK_ENABLE) ", %edi\n\
+	movl $" X86_STRINGIFY (__NR_arch_prctl) ", %eax\n\
+	syscall\n\
+1:\n\
+	# Pass GL(dl_x86_feature_1) to _dl_cet_setup_features.\n\
+	movl %edx, %edi\n\
+	# Align stack for the _dl_cet_setup_features call.\n\
+	andq $-16, %rsp\n\
+	call _dl_cet_setup_features\n\
+	# Restore %rax and %rsp from %r12 and %r13.\n\
+	movq %r12, %rax\n\
+	movq %r13, %rsp\n\
+"
diff --git a/sysdeps/x86/cpu-features-offsets.sym b/sysdeps/x86/cpu-features-offsets.sym
index 6d03cea8e8..5429f60632 100644
--- a/sysdeps/x86/cpu-features-offsets.sym
+++ b/sysdeps/x86/cpu-features-offsets.sym
@@ -4,3 +4,4 @@
 
 RTLD_GLOBAL_RO_DL_X86_CPU_FEATURES_OFFSET offsetof (struct rtld_global_ro, _dl_x86_cpu_features)
 XSAVE_STATE_SIZE_OFFSET	offsetof (struct cpu_features, xsave_state_size)
+RTLD_GLOBAL_DL_X86_FEATURE_1_OFFSET offsetof (struct rtld_global, _dl_x86_feature_1)
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index f180f0d9a4..097868c1d9 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -1106,57 +1106,6 @@ no_cpuid:
 	       TUNABLE_CALLBACK (set_x86_ibt));
   TUNABLE_GET (x86_shstk, tunable_val_t *,
 	       TUNABLE_CALLBACK (set_x86_shstk));
-
-  /* Check CET status.  */
-  unsigned int cet_status = get_cet_status ();
-
-  if ((cet_status & GNU_PROPERTY_X86_FEATURE_1_IBT) == 0)
-    CPU_FEATURE_UNSET (cpu_features, IBT)
-  if ((cet_status & GNU_PROPERTY_X86_FEATURE_1_SHSTK) == 0)
-    CPU_FEATURE_UNSET (cpu_features, SHSTK)
-
-  if (cet_status)
-    {
-      GL(dl_x86_feature_1) = cet_status;
-
-# ifndef SHARED
-      /* Check if IBT and SHSTK are enabled by kernel.  */
-      if ((cet_status
-	   & (GNU_PROPERTY_X86_FEATURE_1_IBT
-	      | GNU_PROPERTY_X86_FEATURE_1_SHSTK)))
-	{
-	  /* Disable IBT and/or SHSTK if they are enabled by kernel, but
-	     disabled by environment variable:
-
-	     GLIBC_TUNABLES=glibc.cpu.hwcaps=-IBT,-SHSTK
-	   */
-	  unsigned int cet_feature = 0;
-	  if (!CPU_FEATURE_USABLE (IBT))
-	    cet_feature |= (cet_status
-			    & GNU_PROPERTY_X86_FEATURE_1_IBT);
-	  if (!CPU_FEATURE_USABLE (SHSTK))
-	    cet_feature |= (cet_status
-			    & GNU_PROPERTY_X86_FEATURE_1_SHSTK);
-
-	  if (cet_feature)
-	    {
-	      int res = dl_cet_disable_cet (cet_feature);
-
-	      /* Clear the disabled bits in dl_x86_feature_1.  */
-	      if (res == 0)
-		GL(dl_x86_feature_1) &= ~cet_feature;
-	    }
-
-	  /* Lock CET if IBT or SHSTK is enabled in executable.  Don't
-	     lock CET if IBT or SHSTK is enabled permissively.  */
-	  if (GL(dl_x86_feature_control).ibt != cet_permissive
-	      && GL(dl_x86_feature_control).shstk != cet_permissive)
-	    dl_cet_lock_cet (GL(dl_x86_feature_1)
-			     & (GNU_PROPERTY_X86_FEATURE_1_IBT
-				| GNU_PROPERTY_X86_FEATURE_1_SHSTK));
-	}
-# endif
-    }
 #endif
 
 #ifndef SHARED
diff --git a/sysdeps/x86/dl-cet.c b/sysdeps/x86/dl-cet.c
index 66a78244d4..25add215f2 100644
--- a/sysdeps/x86/dl-cet.c
+++ b/sysdeps/x86/dl-cet.c
@@ -173,40 +173,11 @@ dl_cet_check_startup (struct link_map *m, struct dl_cet_info *info)
     = info->enable_feature_1 ^ info->feature_1_enabled;
   if (disable_feature_1 != 0)
     {
-      /* Disable features in the kernel because of legacy objects or
-	 cet_always_off.  */
-      if (dl_cet_disable_cet (disable_feature_1) != 0)
-	_dl_fatal_printf ("%s: can't disable x86 Features\n",
-			  info->program);
-
       /* Clear the disabled bits.  Sync dl_x86_feature_1 and
          info->feature_1_enabled with info->enable_feature_1.  */
       info->feature_1_enabled = info->enable_feature_1;
       GL(dl_x86_feature_1) = info->enable_feature_1;
     }
-
-  if (HAS_CPU_FEATURE (IBT) || HAS_CPU_FEATURE (SHSTK))
-    {
-      /* Lock CET features only if IBT or SHSTK are enabled and are not
-         enabled permissively.  */
-      unsigned int feature_1_lock = 0;
-
-      if (((info->feature_1_enabled & GNU_PROPERTY_X86_FEATURE_1_IBT)
-	   != 0)
-	  && info->enable_ibt_type != cet_permissive)
-	feature_1_lock |= GNU_PROPERTY_X86_FEATURE_1_IBT;
-
-      if (((info->feature_1_enabled & GNU_PROPERTY_X86_FEATURE_1_SHSTK)
-	   != 0)
-	  && info->enable_shstk_type != cet_permissive)
-	feature_1_lock |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
-
-      if (feature_1_lock != 0
-	  && dl_cet_lock_cet (feature_1_lock) != 0)
-	_dl_fatal_printf ("%s: can't lock CET\n", info->program);
-    }
-
-  THREAD_SETMEM (THREAD_SELF, header.feature_1, GL(dl_x86_feature_1));
 }
 #endif
 
@@ -298,6 +269,15 @@ dl_cet_check (struct link_map *m, const char *program)
 {
   struct dl_cet_info info;
 
+  /* CET is enabled only if RTLD_START_ENABLE_X86_FEATURES is defined.  */
+#if defined SHARED && defined RTLD_START_ENABLE_X86_FEATURES
+  /* Set dl_x86_feature_1 to features enabled in the executable.  */
+  if (program != NULL)
+    GL(dl_x86_feature_1) = (m->l_x86_feature_1_and
+			    & (X86_FEATURE_1_IBT
+			       | X86_FEATURE_1_SHSTK));
+#endif
+
   /* Check how IBT and SHSTK should be enabled. */
   info.enable_ibt_type = GL(dl_x86_feature_control).ibt;
   info.enable_shstk_type = GL(dl_x86_feature_control).shstk;
@@ -307,17 +287,9 @@ dl_cet_check (struct link_map *m, const char *program)
   /* No legacy object check if IBT and SHSTK are always on.  */
   if (info.enable_ibt_type == cet_always_on
       && info.enable_shstk_type == cet_always_on)
-    {
-#ifdef SHARED
-      /* Set it only during startup.  */
-      if (program != NULL)
-	THREAD_SETMEM (THREAD_SELF, header.feature_1,
-		       info.feature_1_enabled);
-#endif
-      return;
-    }
+    return;
 
-  /* Check if IBT and SHSTK were enabled by kernel.  */
+  /* Check if IBT and SHSTK were enabled.  */
   if (info.feature_1_enabled == 0)
     return;
 
@@ -351,6 +323,33 @@ _dl_cet_open_check (struct link_map *l)
   dl_cet_check (l, NULL);
 }
 
+/* Set GL(dl_x86_feature_1) to the enabled features and clear the
+   active bits of the disabled features.  */
+
+attribute_hidden
+void
+_dl_cet_setup_features (unsigned int cet_feature)
+{
+  /* NB: cet_feature == GL(dl_x86_feature_1) which is set to features
+     enabled from executable, not necessarily supported by kernel.  */
+  if (cet_feature)
+    {
+      cet_feature = dl_cet_get_cet_status ();
+      if (cet_feature)
+	{
+	  THREAD_SETMEM (THREAD_SELF, header.feature_1, cet_feature);
+
+	  /* Lock CET if IBT or SHSTK is enabled in executable.  Don't
+	     lock CET if IBT or SHSTK is enabled permissively.  */
+	  if (GL(dl_x86_feature_control).ibt != cet_permissive
+	      && (GL(dl_x86_feature_control).shstk != cet_permissive))
+	    dl_cet_lock_cet (cet_feature);
+	}
+      /* Sync GL(dl_x86_feature_1) with kernel.  */
+      GL(dl_x86_feature_1) = cet_feature;
+    }
+}
+
 #ifdef SHARED
 
 # ifndef LINKAGE
diff --git a/sysdeps/x86/get-cpuid-feature-leaf.c b/sysdeps/x86/get-cpuid-feature-leaf.c
index 40a46cc79c..9317a6b494 100644
--- a/sysdeps/x86/get-cpuid-feature-leaf.c
+++ b/sysdeps/x86/get-cpuid-feature-leaf.c
@@ -24,7 +24,7 @@ __x86_get_cpuid_feature_leaf (unsigned int leaf)
   static const struct cpuid_feature feature = {};
   if (leaf < CPUID_INDEX_MAX)
     return ((const struct cpuid_feature *)
-	      &GLRO(dl_x86_cpu_features).features[leaf]);
+	    &GLRO(dl_x86_cpu_features).features[leaf]);
   else
     return &feature;
 }
diff --git a/sysdeps/x86/include/cpu-features.h b/sysdeps/x86/include/cpu-features.h
index 2d7427a6c0..23bd8146a2 100644
--- a/sysdeps/x86/include/cpu-features.h
+++ b/sysdeps/x86/include/cpu-features.h
@@ -990,6 +990,9 @@ extern const struct cpu_features *_dl_x86_get_cpu_features (void)
 # define INIT_ARCH()
 # define _dl_x86_get_cpu_features() (&GLRO(dl_x86_cpu_features))
 extern void _dl_x86_init_cpu_features (void) attribute_hidden;
+
+extern void _dl_cet_setup_features (unsigned int)
+    attribute_hidden;
 #endif
 
 #ifdef __x86_64__
diff --git a/sysdeps/x86/libc-start.h b/sysdeps/x86/libc-start.h
index e93da6ef3d..856230daeb 100644
--- a/sysdeps/x86/libc-start.h
+++ b/sysdeps/x86/libc-start.h
@@ -19,7 +19,57 @@
 #ifndef SHARED
 # define ARCH_SETUP_IREL() apply_irel ()
 # define ARCH_APPLY_IREL()
-# ifndef ARCH_SETUP_TLS
-#  define ARCH_SETUP_TLS() __libc_setup_tls ()
+# ifdef __CET__
+/* Get CET features enabled in the static executable.  */
+
+static inline unsigned int
+get_cet_feature (void)
+{
+  /* Check if CET is supported and not disabled by tunables.  */
+  struct cpu_features *cpu_features
+    = (struct cpu_features *) __get_cpu_features ();
+  unsigned int cet_feature = 0;
+  if (CPU_FEATURE_USABLE_P (cpu_features, IBT))
+    cet_feature |= GNU_PROPERTY_X86_FEATURE_1_IBT;
+  if (CPU_FEATURE_USABLE_P (cpu_features, SHSTK))
+    cet_feature |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
+  if (!cet_feature)
+    return cet_feature;
+
+  struct link_map *main_map = _dl_get_dl_main_map ();
+
+  /* Scan program headers backward to check PT_GNU_PROPERTY early for
+     x86 feature bits on static executable.  */
+  const ElfW(Phdr) *phdr = GL(dl_phdr);
+  const ElfW(Phdr) *ph;
+  for (ph = phdr + GL(dl_phnum); ph != phdr; ph--)
+    if (ph[-1].p_type == PT_GNU_PROPERTY)
+      {
+	_dl_process_pt_gnu_property (main_map, -1, &ph[-1]);
+	/* Enable IBT and SHSTK only if they are enabled on static
+	   executable.  */
+	cet_feature &= (main_map->l_x86_feature_1_and
+			& (GNU_PROPERTY_X86_FEATURE_1_IBT
+			   | GNU_PROPERTY_X86_FEATURE_1_SHSTK));
+	/* Set GL(dl_x86_feature_1) to the enabled CET features.  */
+	GL(dl_x86_feature_1) = cet_feature;
+	break;
+      }
+
+  return cet_feature;
+}
+
+/* The function using this macro to enable shadow stack must not return
+   to avoid shadow stack underflow.  */
+#  define ARCH_SETUP_TLS()						\
+  {									\
+    __libc_setup_tls ();						\
+									\
+    unsigned int cet_feature = get_cet_feature ();			\
+    ENABLE_X86_CET (cet_feature);					\
+    _dl_cet_setup_features (cet_feature);				\
+  }
+# else
+#  define ARCH_SETUP_TLS()	__libc_setup_tls ()
 # endif
 #endif /* !SHARED */
diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
index 581a2f1a9e..faeae723cb 100644
--- a/sysdeps/x86_64/dl-machine.h
+++ b/sysdeps/x86_64/dl-machine.h
@@ -29,6 +29,11 @@
 #include <dl-static-tls.h>
 #include <dl-machine-rel.h>
 #include <isa-level.h>
+#ifdef __CET__
+# include <dl-cet.h>
+#else
+# define RTLD_START_ENABLE_X86_FEATURES
+#endif
 
 /* Return nonzero iff ELF header is compatible with the running host.  */
 static inline int __attribute__ ((unused))
@@ -146,13 +151,16 @@ _start:\n\
 _dl_start_user:\n\
 	# Save the user entry point address in %r12.\n\
 	movq %rax, %r12\n\
+	# Save %rsp value in %r13.\n\
+	movq %rsp, %r13\n\
+"\
+	RTLD_START_ENABLE_X86_FEATURES \
+"\
 	# Read the original argument count.\n\
 	movq (%rsp), %rdx\n\
 	# Call _dl_init (struct link_map *main_map, int argc, char **argv, char **env)\n\
 	# argc -> rsi\n\
 	movq %rdx, %rsi\n\
-	# Save %rsp value in %r13.\n\
-	movq %rsp, %r13\n\
 	# And align stack for the _dl_init call. \n\
 	andq $-16, %rsp\n\
 	# _dl_loaded -> rdi\n\
-- 
2.43.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v5 4/6] x86/cet: Check feature_1 in TCB for active IBT and SHSTK
  2023-12-22 16:58 [PATCH v5 0/6] x86/cet: Update CET kernel interface H.J. Lu
                   ` (2 preceding siblings ...)
  2023-12-22 16:58 ` [PATCH v5 3/6] x86/cet: Enable shadow stack during startup H.J. Lu
@ 2023-12-22 16:58 ` H.J. Lu
  2023-12-29 14:59   ` Adhemerval Zanella Netto
  2023-12-22 16:58 ` [PATCH v5 5/6] x86/cet: Don't set CET active by default H.J. Lu
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: H.J. Lu @ 2023-12-22 16:58 UTC (permalink / raw)
  To: libc-alpha; +Cc: rick.p.edgecombe, goldstein.w.n

Initially, IBT and SHSTK are marked as active when CPU supports them
and CET are enabled in glibc.  They can be disabled early by tunables
before relocation.  Since after relocation, GLRO(dl_x86_cpu_features)
becomes read-only, we can't update GLRO(dl_x86_cpu_features) to mark
IBT and SHSTK as inactive.  Instead, check the feature_1 field in TCB
to decide if IBT and SHST are active.
---
 sysdeps/x86/bits/platform/x86.h      |  8 ++++++++
 sysdeps/x86/get-cpuid-feature-leaf.c | 11 ++++++++++-
 sysdeps/x86/sys/platform/x86.h       | 17 +++++++++++++++++
 3 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/sysdeps/x86/bits/platform/x86.h b/sysdeps/x86/bits/platform/x86.h
index 1e23d53ba2..1575ae53fb 100644
--- a/sysdeps/x86/bits/platform/x86.h
+++ b/sysdeps/x86/bits/platform/x86.h
@@ -337,3 +337,11 @@ enum
   x86_cpu_AVX10_YMM = x86_cpu_index_24_ecx_0_ebx + 17,
   x86_cpu_AVX10_ZMM = x86_cpu_index_24_ecx_0_ebx + 18,
 };
+
+/* Bits in the feature_1 field in TCB.  */
+
+enum
+{
+  x86_feature_1_ibt		= 1U << 0,
+  x86_feature_1_shstk		= 1U << 1
+};
diff --git a/sysdeps/x86/get-cpuid-feature-leaf.c b/sysdeps/x86/get-cpuid-feature-leaf.c
index 9317a6b494..f69936b31e 100644
--- a/sysdeps/x86/get-cpuid-feature-leaf.c
+++ b/sysdeps/x86/get-cpuid-feature-leaf.c
@@ -15,9 +15,18 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-
+#include <assert.h>
+#include <tcb-offsets.h>
 #include <ldsodefs.h>
 
+#ifdef __x86_64__
+# ifdef __LP64__
+_Static_assert (FEATURE_1_OFFSET == 72, "FEATURE_1_OFFSET != 72");
+# else
+_Static_assert (FEATURE_1_OFFSET == 40, "FEATURE_1_OFFSET != 40");
+# endif
+#endif
+
 const struct cpuid_feature *
 __x86_get_cpuid_feature_leaf (unsigned int leaf)
 {
diff --git a/sysdeps/x86/sys/platform/x86.h b/sysdeps/x86/sys/platform/x86.h
index 1ea2c5fc0b..89b1b16f22 100644
--- a/sysdeps/x86/sys/platform/x86.h
+++ b/sysdeps/x86/sys/platform/x86.h
@@ -45,6 +45,23 @@ x86_cpu_present (unsigned int __index)
 static __inline__ _Bool
 x86_cpu_active (unsigned int __index)
 {
+  if (__index == x86_cpu_IBT || __index == x86_cpu_SHSTK)
+    {
+#ifdef __x86_64__
+      unsigned int __feature_1;
+# ifdef __LP64__
+      __asm__ ("mov %%fs:72, %0" : "=r" (__feature_1));
+# else
+      __asm__ ("mov %%fs:40, %0" : "=r" (__feature_1));
+# endif
+      if (__index == x86_cpu_IBT)
+	return __feature_1 & x86_feature_1_ibt;
+      else
+	return __feature_1 & x86_feature_1_shstk;
+#else
+      return false;
+#endif
+    }
   const struct cpuid_feature *__ptr = __x86_get_cpuid_feature_leaf
     (__index / (8 * sizeof (unsigned int) * 4));
   unsigned int __reg
-- 
2.43.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v5 5/6] x86/cet: Don't set CET active by default
  2023-12-22 16:58 [PATCH v5 0/6] x86/cet: Update CET kernel interface H.J. Lu
                   ` (3 preceding siblings ...)
  2023-12-22 16:58 ` [PATCH v5 4/6] x86/cet: Check feature_1 in TCB for active IBT and SHSTK H.J. Lu
@ 2023-12-22 16:58 ` H.J. Lu
  2023-12-22 16:58 ` [PATCH v5 6/6] x86/cet: Run some CET tests with shadow stack H.J. Lu
  2023-12-28 16:00 ` [PATCH v5 0/6] x86/cet: Update CET kernel interface Florian Weimer
  6 siblings, 0 replies; 18+ messages in thread
From: H.J. Lu @ 2023-12-22 16:58 UTC (permalink / raw)
  To: libc-alpha; +Cc: rick.p.edgecombe, goldstein.w.n

Not all CET enabled applications and libraries have been properly tested
in CET enabled environments.  Some CET enabled applications or libraries
will crash or misbehave when CET is enabled.  Don't set CET active by
default so that all applications and libraries will run normally regardless
of whether CET is active or not.  Shadow stack can be enabled by

$ export GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK

at run-time if shadow stack can be enabled by kernel.

NB: This commit can be reverted if it is OK to enable CET by default for
all applications and libraries.
---
 sysdeps/x86/cpu-features.c |  2 +-
 sysdeps/x86/cpu-tunables.c | 15 ++++++++++++++-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index 097868c1d9..80a07ac589 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -110,7 +110,7 @@ update_active (struct cpu_features *cpu_features)
   if (!CPU_FEATURES_CPU_P (cpu_features, RTM_ALWAYS_ABORT))
     CPU_FEATURE_SET_ACTIVE (cpu_features, RTM);
 
-#if CET_ENABLED
+#if CET_ENABLED && 0
   CPU_FEATURE_SET_ACTIVE (cpu_features, IBT);
   CPU_FEATURE_SET_ACTIVE (cpu_features, SHSTK);
 #endif
diff --git a/sysdeps/x86/cpu-tunables.c b/sysdeps/x86/cpu-tunables.c
index 142c6b9240..1742400525 100644
--- a/sysdeps/x86/cpu-tunables.c
+++ b/sysdeps/x86/cpu-tunables.c
@@ -35,6 +35,17 @@
       break;								\
     }
 
+#define CHECK_GLIBC_IFUNC_CPU_BOTH(f, cpu_features, name, len)		\
+  _Static_assert (sizeof (#name) - 1 == len, #name " != " #len);	\
+  if (tunable_str_comma_strcmp_cte (&f, #name))				\
+    {									\
+      if (f.disable)							\
+	CPU_FEATURE_UNSET (cpu_features, name)				\
+      else								\
+	CPU_FEATURE_SET_ACTIVE (cpu_features, name)			\
+      break;								\
+    }
+
 /* Disable a preferred feature NAME.  We don't enable a preferred feature
    which isn't available.  */
 #define CHECK_GLIBC_IFUNC_PREFERRED_OFF(f, cpu_features, name, len)	\
@@ -131,11 +142,13 @@ TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp)
 	    }
 	  break;
 	case 5:
+	  {
+	    CHECK_GLIBC_IFUNC_CPU_BOTH (n, cpu_features, SHSTK, 5);
+	  }
 	  if (n.disable)
 	    {
 	      CHECK_GLIBC_IFUNC_CPU_OFF (n, cpu_features, LZCNT, 5);
 	      CHECK_GLIBC_IFUNC_CPU_OFF (n, cpu_features, MOVBE, 5);
-	      CHECK_GLIBC_IFUNC_CPU_OFF (n, cpu_features, SHSTK, 5);
 	      CHECK_GLIBC_IFUNC_CPU_OFF (n, cpu_features, SSSE3, 5);
 	      CHECK_GLIBC_IFUNC_CPU_OFF (n, cpu_features, XSAVE, 5);
 	    }
-- 
2.43.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v5 6/6] x86/cet: Run some CET tests with shadow stack
  2023-12-22 16:58 [PATCH v5 0/6] x86/cet: Update CET kernel interface H.J. Lu
                   ` (4 preceding siblings ...)
  2023-12-22 16:58 ` [PATCH v5 5/6] x86/cet: Don't set CET active by default H.J. Lu
@ 2023-12-22 16:58 ` H.J. Lu
  2023-12-28 16:00 ` [PATCH v5 0/6] x86/cet: Update CET kernel interface Florian Weimer
  6 siblings, 0 replies; 18+ messages in thread
From: H.J. Lu @ 2023-12-22 16:58 UTC (permalink / raw)
  To: libc-alpha; +Cc: rick.p.edgecombe, goldstein.w.n

When CET is disabled by default, run some CET tests with shadow stack
enabled using

$ export GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
---
 sysdeps/x86/Makefile                      | 14 ++++++++++++++
 sysdeps/x86/tst-shstk-legacy-1e-static.sh |  1 +
 sysdeps/x86/tst-shstk-legacy-1e.sh        |  1 +
 sysdeps/x86/tst-shstk-legacy-1g.sh        |  1 +
 4 files changed, 17 insertions(+)

diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile
index 0aafd2afeb..a49b13c595 100644
--- a/sysdeps/x86/Makefile
+++ b/sysdeps/x86/Makefile
@@ -249,6 +249,13 @@ CFLAGS-tst-cet-legacy-10-static.c += -mshstk
 CFLAGS-tst-cet-legacy-10a.c += -fcf-protection=none
 CFLAGS-tst-cet-legacy-10a-static.c += -fcf-protection=none
 
+tst-cet-legacy-4-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
+tst-cet-legacy-6-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
+tst-cet-legacy-10-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
+tst-cet-legacy-10-static-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
+tst-cet-legacy-10a-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
+tst-cet-legacy-10a-static-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
+
 CFLAGS-tst-shstk-legacy-1a.c += -fcf-protection=none
 CFLAGS-tst-shstk-legacy-1a-static.c += -fcf-protection=none
 CFLAGS-tst-shstk-legacy-1d.c += -fcf-protection=none
@@ -288,14 +295,20 @@ tst-cet-legacy-6b-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=-IBT,-SHSTK
 tst-cet-legacy-9-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=-IBT,-SHSTK
 tst-cet-legacy-9-static-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=-IBT,-SHSTK
 
+tst-shstk-legacy-1a-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
+tst-shstk-legacy-1a-static-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
 $(objpfx)tst-shstk-legacy-1a: $(objpfx)tst-shstk-legacy-1-extra.o
 $(objpfx)tst-shstk-legacy-1a-static: $(objpfx)tst-shstk-legacy-1-extra.o
+tst-shstk-legacy-1b-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
+tst-shstk-legacy-1b-static-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
 $(objpfx)tst-shstk-legacy-1b: $(objpfx)tst-shstk-legacy-1-extra.o
 $(objpfx)tst-shstk-legacy-1b-static: $(objpfx)tst-shstk-legacy-1-extra.o
 tst-shstk-legacy-1c-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=-SHSTK
 tst-shstk-legacy-1c-static-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=-SHSTK
 $(objpfx)tst-shstk-legacy-1c: $(objpfx)tst-shstk-legacy-1-extra.o
 $(objpfx)tst-shstk-legacy-1c-static: $(objpfx)tst-shstk-legacy-1-extra.o
+tst-shstk-legacy-1d-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
+tst-shstk-legacy-1d-static-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
 $(objpfx)tst-shstk-legacy-1d: $(objpfx)tst-shstk-legacy-1-extra.o
 $(objpfx)tst-shstk-legacy-1d-static: $(objpfx)tst-shstk-legacy-1-extra.o
 $(objpfx)tst-shstk-legacy-1e: $(objpfx)tst-shstk-legacy-1-extra.o
@@ -309,6 +322,7 @@ $(objpfx)tst-shstk-legacy-1e-static.out: \
   $(objpfx)tst-shstk-legacy-1e-static
 	$(SHELL) $< $(common-objpfx) 2> $@; \
 	$(evaluate-test)
+tst-shstk-legacy-1f-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK
 $(objpfx)tst-shstk-legacy-1f: $(objpfx)tst-shstk-legacy-mod-1.so
 $(objpfx)tst-shstk-legacy-mod-1.so: \
   $(objpfx)tst-shstk-legacy-mod-1.os \
diff --git a/sysdeps/x86/tst-shstk-legacy-1e-static.sh b/sysdeps/x86/tst-shstk-legacy-1e-static.sh
index e943aec70e..008c50dae3 100755
--- a/sysdeps/x86/tst-shstk-legacy-1e-static.sh
+++ b/sysdeps/x86/tst-shstk-legacy-1e-static.sh
@@ -20,6 +20,7 @@
 
 common_objpfx=$1; shift
 
+GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK \
 ${common_objpfx}elf/tst-shstk-legacy-1e-static
 # The exit status should only be unsupported (77) or segfault (139).
 status=$?
diff --git a/sysdeps/x86/tst-shstk-legacy-1e.sh b/sysdeps/x86/tst-shstk-legacy-1e.sh
index b0467aa899..82f2acbf75 100755
--- a/sysdeps/x86/tst-shstk-legacy-1e.sh
+++ b/sysdeps/x86/tst-shstk-legacy-1e.sh
@@ -21,6 +21,7 @@
 common_objpfx=$1; shift
 test_program_prefix=$1; shift
 
+GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK \
 ${test_program_prefix} \
   ${common_objpfx}elf/tst-shstk-legacy-1e
 # The exit status should only be unsupported (77) or segfault (139).
diff --git a/sysdeps/x86/tst-shstk-legacy-1g.sh b/sysdeps/x86/tst-shstk-legacy-1g.sh
index c112bf6d8d..261eef7cac 100755
--- a/sysdeps/x86/tst-shstk-legacy-1g.sh
+++ b/sysdeps/x86/tst-shstk-legacy-1g.sh
@@ -21,6 +21,7 @@
 common_objpfx=$1; shift
 test_program_prefix=$1; shift
 
+GLIBC_TUNABLES=glibc.cpu.hwcaps=SHSTK \
 ${test_program_prefix} \
   ${common_objpfx}elf/tst-shstk-legacy-1g
 # The exit status should only be unsupported (77) or segfault (139).
-- 
2.43.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/6] x86/cet: Sync with Linux kernel 6.6 shadow stack interface
  2023-12-22 16:58 ` [PATCH v5 1/6] x86/cet: Sync with Linux kernel 6.6 shadow stack interface H.J. Lu
@ 2023-12-26 17:37   ` Noah Goldstein
  2023-12-26 17:56     ` H.J. Lu
  0 siblings, 1 reply; 18+ messages in thread
From: Noah Goldstein @ 2023-12-26 17:37 UTC (permalink / raw)
  To: H.J. Lu; +Cc: libc-alpha, rick.p.edgecombe

On Fri, Dec 22, 2023 at 8:58 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Sync with Linux kernel 6.6 shadow stack interface.  Since only x86-64 is
> supported, i386 shadow stack codes are unchanged and CET shouldn't be
> enabled for i386.
>
> 1. When the shadow stack base in TCB is unset, the default shadow stack
> is in use.  Use the current shadow stack pointer as the marker for the
> default shadow stack. It is used to identify if the current shadow stack
> is the same as the target shadow stack when switching ucontexts.  If yes,
> INCSSP will be used to unwind shadow stack.  Otherwise, shadow stack
> restore token will be used.
> 2. Allocate shadow stack with the map_shadow_stack syscall.  Since there
> is no function to explicitly release ucontext, there is no place to
> release shadow stack allocated by map_shadow_stack in ucontext functions.
> Such shadow stacks will be leaked.
> 3. Rename arch_prctl CET commands to ARCH_SHSTK_XXX.
> 4. Rewrite the CET control functions with the current kernel shadow stack
> interface.
>
> Since CET is no longer enabled by kernel, a separate patch will enable
> shadow stack during startup.
> ---
>  sysdeps/unix/sysv/linux/x86/bits/mman.h       |  5 ++
>  sysdeps/unix/sysv/linux/x86/cpu-features.c    | 13 +++--
>  sysdeps/unix/sysv/linux/x86/dl-cet.h          | 16 ++++--
>  .../unix/sysv/linux/x86/include/asm/prctl.h   | 37 ++++++-------
>  .../sysv/linux/x86/tst-cet-setcontext-1.c     | 17 +++---
>  sysdeps/unix/sysv/linux/x86_64/Makefile       |  2 +-
>  .../unix/sysv/linux/x86_64/__start_context.S  | 38 +++----------
>  .../sysv/linux/x86_64/allocate-shadow-stack.c | 55 +++++++++++++++++++
>  .../sysv/linux/x86_64/allocate-shadow-stack.h | 24 ++++++++
>  sysdeps/unix/sysv/linux/x86_64/getcontext.S   | 30 ++--------
>  sysdeps/unix/sysv/linux/x86_64/makecontext.c  | 28 +++++-----
>  sysdeps/unix/sysv/linux/x86_64/swapcontext.S  | 22 ++------
>  sysdeps/x86/cpu-features.c                    | 15 +++--
>  sysdeps/x86/dl-cet.c                          |  2 +-
>  sysdeps/x86_64/nptl/tls.h                     |  2 +-
>  15 files changed, 173 insertions(+), 133 deletions(-)
>  create mode 100644 sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c
>  create mode 100644 sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h
>
> diff --git a/sysdeps/unix/sysv/linux/x86/bits/mman.h b/sysdeps/unix/sysv/linux/x86/bits/mman.h
> index 3d356e86a0..232b55a13d 100644
> --- a/sysdeps/unix/sysv/linux/x86/bits/mman.h
> +++ b/sysdeps/unix/sysv/linux/x86/bits/mman.h
> @@ -27,6 +27,11 @@
>  #define MAP_32BIT      0x40            /* Only give out 32-bit addresses.  */
>  #define MAP_ABOVE4G    0x80            /* Only map above 4GB.  */
>
> +#ifdef __USE_MISC
> +/* Set up a restore token in the newly allocated shadow stack */
> +# define SHADOW_STACK_SET_TOKEN 0x1
> +#endif
> +
>  #include <bits/mman-map-flags-generic.h>
>
>  /* Include generic Linux declarations.  */
> diff --git a/sysdeps/unix/sysv/linux/x86/cpu-features.c b/sysdeps/unix/sysv/linux/x86/cpu-features.c
> index 41e7600668..0e6e2bf855 100644
> --- a/sysdeps/unix/sysv/linux/x86/cpu-features.c
> +++ b/sysdeps/unix/sysv/linux/x86/cpu-features.c
> @@ -23,10 +23,15 @@
>  static inline int __attribute__ ((always_inline))
>  get_cet_status (void)
>  {
> -  unsigned long long cet_status[3];
> -  if (INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_CET_STATUS, cet_status) == 0)
> -    return cet_status[0];
> -  return 0;
> +  unsigned long long kernel_feature;
> +  unsigned int status = 0;
> +  if (INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_STATUS,
> +                            &kernel_feature) == 0)
> +    {
> +      if ((kernel_feature & ARCH_SHSTK_SHSTK) != 0)
> +       status = GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> +    }
> +  return status;
>  }
>
>  # ifndef SHARED
> diff --git a/sysdeps/unix/sysv/linux/x86/dl-cet.h b/sysdeps/unix/sysv/linux/x86/dl-cet.h
> index c885bf1323..da220ac627 100644
> --- a/sysdeps/unix/sysv/linux/x86/dl-cet.h
> +++ b/sysdeps/unix/sysv/linux/x86/dl-cet.h
> @@ -21,12 +21,20 @@
>  static inline int __attribute__ ((always_inline))
>  dl_cet_disable_cet (unsigned int cet_feature)
>  {
> -  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_CET_DISABLE,
> -                                     cet_feature);
> +  if (cet_feature != GNU_PROPERTY_X86_FEATURE_1_SHSTK)
> +    return -1;
> +  long long int kernel_feature = ARCH_SHSTK_SHSTK;
> +  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_DISABLE,
> +                                     kernel_feature);
>  }
>
>  static inline int __attribute__ ((always_inline))
> -dl_cet_lock_cet (void)
> +dl_cet_lock_cet (unsigned int cet_feature)
>  {
> -  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_CET_LOCK, 0);
> +  if (cet_feature != GNU_PROPERTY_X86_FEATURE_1_SHSTK)
> +    return -1;
> +  /* Lock all SHSTK features.  */
> +  long long int kernel_feature = -1;
> +  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_LOCK,
> +                                     kernel_feature);
>  }
> diff --git a/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h b/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h
> index 45ad0b052f..2f511321ad 100644
> --- a/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h
> +++ b/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h
> @@ -4,24 +4,19 @@
>
>  #include_next <asm/prctl.h>
>
> -#ifndef ARCH_CET_STATUS
> -/* CET features:
> -   IBT:   GNU_PROPERTY_X86_FEATURE_1_IBT
> -   SHSTK: GNU_PROPERTY_X86_FEATURE_1_SHSTK
> - */
> -/* Return CET features in unsigned long long *addr:
> -     features: addr[0].
> -     shadow stack base address: addr[1].
> -     shadow stack size: addr[2].
> - */
> -# define ARCH_CET_STATUS       0x3001
> -/* Disable CET features in unsigned int features.  */
> -# define ARCH_CET_DISABLE      0x3002
> -/* Lock all CET features.  */
> -# define ARCH_CET_LOCK         0x3003
> -/* Allocate a new shadow stack with unsigned long long *addr:
> -     IN: requested shadow stack size: *addr.
> -     OUT: allocated shadow stack address: *addr.
> - */
> -# define ARCH_CET_ALLOC_SHSTK  0x3004
> -#endif /* ARCH_CET_STATUS */
> +#ifndef ARCH_SHSTK_ENABLE
> +/* Enable SHSTK features in unsigned long int features.  */
> +# define ARCH_SHSTK_ENABLE             0x5001
> +/* Disable SHSTK features in unsigned long int features.  */
> +# define ARCH_SHSTK_DISABLE            0x5002
> +/* Lock SHSTK features in unsigned long int features.  */
> +# define ARCH_SHSTK_LOCK               0x5003
> +/* Unlock SHSTK features in unsigned long int features.  */
> +# define ARCH_SHSTK_UNLOCK             0x5004
> +/* Return SHSTK features in unsigned long int features.  */
> +# define ARCH_SHSTK_STATUS             0x5005
> +
> +/* ARCH_SHSTK_ features bits */
> +# define ARCH_SHSTK_SHSTK              0x1
> +# define ARCH_SHSTK_WRSS               0x2
> +#endif
> diff --git a/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c b/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c
> index 837a9fd0eb..2ea66c803b 100644
> --- a/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c
> +++ b/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c
> @@ -87,15 +87,14 @@ do_test (void)
>    ctx[4].uc_link = &ctx[0];
>    makecontext (&ctx[4], (void (*) (void)) f1, 0);
>
> -  /* NB: When shadow stack is enabled, makecontext calls arch_prctl
> -     with ARCH_CET_ALLOC_SHSTK to allocate a new shadow stack which
> -     can be unmapped.  The base address and size of the new shadow
> -     stack are returned in __ssp[1] and __ssp[2].  makecontext is
> -     called for CTX1, CTX3 and CTX4.  But only CTX1 is used.  New
> -     shadow stacks are allocated in the order of CTX3, CTX1, CTX4.
> -     It is very likely that CTX1's shadow stack is placed between
> -     CTX3 and CTX4.  We munmap CTX3's and CTX4's shadow stacks to
> -     create gaps above and below CTX1's shadow stack.  We check
> +  /* NB: When shadow stack is enabled, makecontext calls map_shadow_stack
> +     to allocate a new shadow stack which can be unmapped.  The base
> +     address and size of the new shadow stack are returned in __ssp[1]
> +     and __ssp[2].  makecontext is called for CTX1, CTX3 and CTX4.  But
> +     only CTX1 is used.  New shadow stacks are allocated in the order
> +     of CTX3, CTX1, CTX4.  It is very likely that CTX1's shadow stack is
> +     placed between CTX3 and CTX4.  We munmap CTX3's and CTX4's shadow
> +     stacks to create gaps above and below CTX1's shadow stack.  We check
>       that setcontext CTX1 works correctly in this case.  */
>    if (_get_ssp () != 0)
>      {
> diff --git a/sysdeps/unix/sysv/linux/x86_64/Makefile b/sysdeps/unix/sysv/linux/x86_64/Makefile
> index 5e19202ebf..06b873949e 100644
> --- a/sysdeps/unix/sysv/linux/x86_64/Makefile
> +++ b/sysdeps/unix/sysv/linux/x86_64/Makefile
> @@ -3,7 +3,7 @@ sysdep_routines += ioperm iopl
>  endif
>
>  ifeq ($(subdir),stdlib)
> -sysdep_routines += __start_context
> +sysdep_routines += __start_context allocate-shadow-stack
>  endif
>
>  ifeq ($(subdir),csu)
> diff --git a/sysdeps/unix/sysv/linux/x86_64/__start_context.S b/sysdeps/unix/sysv/linux/x86_64/__start_context.S
> index f6436dd6bb..ae04203c90 100644
> --- a/sysdeps/unix/sysv/linux/x86_64/__start_context.S
> +++ b/sysdeps/unix/sysv/linux/x86_64/__start_context.S
> @@ -24,20 +24,14 @@
>  /* Use CALL to push __start_context onto the new stack as well as the new
>     shadow stack.  RDI points to ucontext:
>     Incoming:
> -     __ssp[0]: The original caller's shadow stack pointer.
> -     __ssp[1]: The size of the new shadow stack.
> -     __ssp[2]: The size of the new shadow stack.
> -   Outgoing:
>       __ssp[0]: The new shadow stack pointer.
>       __ssp[1]: The base address of the new shadow stack.
>       __ssp[2]: The size of the new shadow stack.
>   */
>
>  ENTRY(__push___start_context)
> -       /* Save the pointer to ucontext.  */
> -       movq    %rdi, %r9
>         /* Get the original shadow stack pointer.  */
> -       rdsspq  %r8
> +       rdsspq  %rcx
>         /* Save the original stack pointer.  */
>         movq    %rsp, %rdx
>         /* Load the top of the new stack into RSI.  */
> @@ -45,24 +39,12 @@ ENTRY(__push___start_context)
>         /* Add 8 bytes to RSI since CALL will push the 8-byte return
>            address onto stack.  */
>         leaq    8(%rsi), %rsp
> -       /* Allocate the new shadow stack.  The size of the new shadow
> -          stack is passed in __ssp[1].  */
> -       lea     (oSSP + 8)(%rdi), %RSI_LP
> -       movl    $ARCH_CET_ALLOC_SHSTK, %edi
> -       movl    $__NR_arch_prctl, %eax
> -       /* The new shadow stack base is returned in __ssp[1].  */
> -       syscall
> -       testq   %rax, %rax
> -       jne     L(hlt)          /* This should never happen.  */
> -
> -       /* Get the size of the new shadow stack.  */
> -       movq    8(%rsi), %rdi
> -
> -       /* Get the base address of the new shadow stack.  */
> -       movq    (%rsi), %rsi
> -
> +       /* The size of the new shadow stack is stored in __ssp[2].  */
> +       mov     (oSSP + 16)(%rdi), %RSI_LP
> +       /* The new shadow stack base is stored in __ssp[1].  */
> +       mov     (oSSP + 8)(%rdi), %RAX_LP
>         /* Use the restore stoken to restore the new shadow stack.  */
> -       rstorssp -8(%rsi, %rdi)
> +       rstorssp -8(%rax, %rsi)
>
>         /* Save the restore token on the original shadow stack.  */
>         saveprevssp
> @@ -73,18 +55,12 @@ ENTRY(__push___start_context)
>         jmp     __start_context
>  1:
>
> -       /* Get the new shadow stack pointer.  */
> -       rdsspq  %rdi
> -
>         /* Use the restore stoken to restore the original shadow stack.  */
> -       rstorssp -8(%r8)
> +       rstorssp -8(%rcx)
>
>         /* Save the restore token on the new shadow stack.  */
>         saveprevssp
>
> -       /* Store the new shadow stack pointer in __ssp[0].  */
> -       movq    %rdi, oSSP(%r9)
> -
>         /* Restore the original stack.  */
>         mov     %rdx, %rsp
>         ret
> diff --git a/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c
> new file mode 100644
> index 0000000000..f2e1d03b96
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c
> @@ -0,0 +1,55 @@
> +/* Helper function to allocate shadow stack.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <stdint.h>
> +#include <errno.h>
> +#include <sys/mman.h>
> +#include <libc-pointer-arith.h>
> +#include <allocate-shadow-stack.h>
> +
> +/* NB: This can be treated as a syscall by caller.  */
> +
> +long int
> +__allocate_shadow_stack (size_t stack_size,
> +                        shadow_stack_size_t *child_stack)
> +{
> +#ifdef __NR_map_shadow_stack
> +  size_t shadow_stack_size
> +    = stack_size >> STACK_SIZE_TO_SHADOW_STACK_SIZE_SHIFT;
> +  /* Align shadow stack to 8 bytes.  */
> +  shadow_stack_size = ALIGN_UP (shadow_stack_size, 8);
> +  /* Since sigaltstack shares shadow stack with the current context in
> +     the thread, add extra 20 stack frames in shadow stack for signal
> +     handlers.  */
> +  shadow_stack_size += 20 * 8;
> +  void *shadow_stack = (void *)INLINE_SYSCALL_CALL
> +    (map_shadow_stack, NULL, shadow_stack_size, SHADOW_STACK_SET_TOKEN);
> +  /* Report the map_shadow_stack error.  */
> +  if (shadow_stack == MAP_FAILED)
> +    return -errno;
> +
> +  /* Save the shadow stack base and size on child stack.  */
> +  child_stack[0] = (uintptr_t) shadow_stack;
> +  child_stack[1] = shadow_stack_size;
> +
> +  return 0;
> +#else
> +  return -ENOSYS;
> +#endif
> +}
> diff --git a/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h
> new file mode 100644
> index 0000000000..d05aaf16e5
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h
> @@ -0,0 +1,24 @@
> +/* Helper function to allocate shadow stack.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <ucontext.h>
> +
> +typedef __typeof (((ucontext_t *) 0)->__ssp[0]) shadow_stack_size_t;
> +
> +extern long int __allocate_shadow_stack (size_t, shadow_stack_size_t *)
> +  attribute_hidden;
> diff --git a/sysdeps/unix/sysv/linux/x86_64/getcontext.S b/sysdeps/unix/sysv/linux/x86_64/getcontext.S
> index a00e2f6290..71f3802dca 100644
> --- a/sysdeps/unix/sysv/linux/x86_64/getcontext.S
> +++ b/sysdeps/unix/sysv/linux/x86_64/getcontext.S
> @@ -58,35 +58,15 @@ ENTRY(__getcontext)
>         testl   $X86_FEATURE_1_SHSTK, %fs:FEATURE_1_OFFSET
>         jz      L(no_shstk)
>
> -       /* Save RDI in RDX which won't be clobbered by syscall.  */
> -       movq    %rdi, %rdx
> -
>         xorl    %eax, %eax
>         cmpq    %fs:SSP_BASE_OFFSET, %rax
>         jnz     L(shadow_stack_bound_recorded)
>
> -       /* Get the base address and size of the default shadow stack
> -          which must be the current shadow stack since nothing has
> -          been recorded yet.  */
> -       sub     $24, %RSP_LP
> -       mov     %RSP_LP, %RSI_LP
> -       movl    $ARCH_CET_STATUS, %edi
> -       movl    $__NR_arch_prctl, %eax
> -       syscall
> -       testq   %rax, %rax
> -       jz      L(continue_no_err)
> -
> -       /* This should never happen.  */
> -       hlt
> -
> -L(continue_no_err):
> -       /* Record the base of the current shadow stack.  */
> -       movq    8(%rsp), %rax
> +       /* When the shadow stack base is unset, the default shadow
> +          stack is in use.  Use the current shadow stack pointer
> +          as the marker for the default shadow stack.  */
> +       rdsspq  %rax
>         movq    %rax, %fs:SSP_BASE_OFFSET
> -       add     $24, %RSP_LP
> -
> -       /* Restore RDI.  */
> -       movq    %rdx, %rdi
>
>  L(shadow_stack_bound_recorded):
>         /* Get the current shadow stack pointer.  */
> @@ -94,7 +74,7 @@ L(shadow_stack_bound_recorded):
>         /* NB: Save the caller's shadow stack so that we can jump back
>            to the caller directly.  */
>         addq    $8, %rax
> -       movq    %rax, oSSP(%rdx)
> +       movq    %rax, oSSP(%rdi)
>
>         /* Save the current shadow stack base in ucontext.  */
>         movq    %fs:SSP_BASE_OFFSET, %rax
> diff --git a/sysdeps/unix/sysv/linux/x86_64/makecontext.c b/sysdeps/unix/sysv/linux/x86_64/makecontext.c
> index de9e03eb81..e4f025bd50 100644
> --- a/sysdeps/unix/sysv/linux/x86_64/makecontext.c
> +++ b/sysdeps/unix/sysv/linux/x86_64/makecontext.c
> @@ -24,6 +24,7 @@
>  # include <pthread.h>
>  # include <libc-pointer-arith.h>
>  # include <sys/prctl.h>
> +# include <allocate-shadow-stack.h>
>  #endif
>
>  #include "ucontext_i.h"
> @@ -88,23 +89,24 @@ __makecontext (ucontext_t *ucp, void (*func) (void), int argc, ...)
>    if ((feature_1 & X86_FEATURE_1_SHSTK) != 0)
>      {
>        /* Shadow stack is enabled.  We need to allocate a new shadow
> -         stack.  */
> -      unsigned long ssp_size = (((uintptr_t) sp
> -                                - (uintptr_t) ucp->uc_stack.ss_sp)
> -                               >> STACK_SIZE_TO_SHADOW_STACK_SIZE_SHIFT);
> -      /* Align shadow stack to 8 bytes.  */
> -      ssp_size = ALIGN_UP (ssp_size, 8);
> -
> -      ucp->__ssp[1] = ssp_size;
> -      ucp->__ssp[2] = ssp_size;
> -
> -      /* Call __push___start_context to allocate a new shadow stack,
> -        push __start_context onto the new stack as well as the new
> -        shadow stack.  NB: After __push___start_context returns,
> +         stack.  NB:
>            ucp->__ssp[0]: The new shadow stack pointer.
>            ucp->__ssp[1]: The base address of the new shadow stack.
>            ucp->__ssp[2]: The size of the new shadow stack.
>         */
> +      long int ret
> +       = __allocate_shadow_stack (((uintptr_t) sp
> +                                   - (uintptr_t) ucp->uc_stack.ss_sp),
> +                                  &ucp->__ssp[1]);
> +      if (ret != 0)
> +       {
> +         /* FIXME: What should we do?  */
> +         abort ();
> +       }
> +
> +      ucp->__ssp[0] = ucp->__ssp[1] + ucp->__ssp[2] - 8;
> +      /* Call __push___start_context to push __start_context onto the new
> +        stack as well as the new shadow stack.  */
>        __push___start_context (ucp);
>      }
>    else
> diff --git a/sysdeps/unix/sysv/linux/x86_64/swapcontext.S b/sysdeps/unix/sysv/linux/x86_64/swapcontext.S
> index 5925752164..2f2fe9875b 100644
> --- a/sysdeps/unix/sysv/linux/x86_64/swapcontext.S
> +++ b/sysdeps/unix/sysv/linux/x86_64/swapcontext.S
> @@ -109,25 +109,11 @@ ENTRY(__swapcontext)
>         cmpq    %fs:SSP_BASE_OFFSET, %rax
>         jnz     L(shadow_stack_bound_recorded)
>
> -       /* Get the base address and size of the default shadow stack
> -          which must be the current shadow stack since nothing has
> -          been recorded yet.  */
> -       sub     $24, %RSP_LP
> -       mov     %RSP_LP, %RSI_LP
> -       movl    $ARCH_CET_STATUS, %edi
> -       movl    $__NR_arch_prctl, %eax
> -       syscall
> -       testq   %rax, %rax
> -       jz      L(continue_no_err)
> -
> -       /* This should never happen.  */
> -       hlt
> -
> -L(continue_no_err):
> -       /* Record the base of the current shadow stack.  */
> -       movq    8(%rsp), %rax
> +       /* When the shadow stack base is unset, the default shadow
> +          stack is in use.  Use the current shadow stack pointer
> +          as the marker for the default shadow stack.  */
> +       rdsspq  %rax
>         movq    %rax, %fs:SSP_BASE_OFFSET
> -       add     $24, %RSP_LP
>
>  L(shadow_stack_bound_recorded):
>          /* If we unwind the stack, we can't undo stack unwinding.  Just
> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> index 0bf923d48b..f180f0d9a4 100644
> --- a/sysdeps/x86/cpu-features.c
> +++ b/sysdeps/x86/cpu-features.c
> @@ -1121,8 +1121,9 @@ no_cpuid:
>
>  # ifndef SHARED
>        /* Check if IBT and SHSTK are enabled by kernel.  */
> -      if ((cet_status & GNU_PROPERTY_X86_FEATURE_1_IBT)
> -         || (cet_status & GNU_PROPERTY_X86_FEATURE_1_SHSTK))
> +      if ((cet_status
> +          & (GNU_PROPERTY_X86_FEATURE_1_IBT
> +             | GNU_PROPERTY_X86_FEATURE_1_SHSTK)))

I think the code here and elsewhere would be simplifiable with a
define/enum of
`GNU_PROPERTY_X86_FEATURE_1_SHSTK_OR_IBT =
GNU_PROPERTY_X86_FEATURE_1_SHSTK | GNU_PROPERTY_X86_FEATURE_1_IBT`
>         {
>           /* Disable IBT and/or SHSTK if they are enabled by kernel, but
>              disabled by environment variable:
> @@ -1131,9 +1132,11 @@ no_cpuid:
>            */
>           unsigned int cet_feature = 0;
>           if (!CPU_FEATURE_USABLE (IBT))
> -           cet_feature |= GNU_PROPERTY_X86_FEATURE_1_IBT;
> +           cet_feature |= (cet_status
> +                           & GNU_PROPERTY_X86_FEATURE_1_IBT);
>           if (!CPU_FEATURE_USABLE (SHSTK))
> -           cet_feature |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> +           cet_feature |= (cet_status
> +                           & GNU_PROPERTY_X86_FEATURE_1_SHSTK);
>
>           if (cet_feature)
>             {
> @@ -1148,7 +1151,9 @@ no_cpuid:
>              lock CET if IBT or SHSTK is enabled permissively.  */
>           if (GL(dl_x86_feature_control).ibt != cet_permissive
>               && GL(dl_x86_feature_control).shstk != cet_permissive)
> -           dl_cet_lock_cet ();
> +           dl_cet_lock_cet (GL(dl_x86_feature_1)
> +                            & (GNU_PROPERTY_X86_FEATURE_1_IBT
> +                               | GNU_PROPERTY_X86_FEATURE_1_SHSTK));
>         }
>  # endif
>      }
> diff --git a/sysdeps/x86/dl-cet.c b/sysdeps/x86/dl-cet.c
> index e486e549be..66a78244d4 100644
> --- a/sysdeps/x86/dl-cet.c
> +++ b/sysdeps/x86/dl-cet.c
> @@ -202,7 +202,7 @@ dl_cet_check_startup (struct link_map *m, struct dl_cet_info *info)
>         feature_1_lock |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
>
>        if (feature_1_lock != 0
> -         && dl_cet_lock_cet () != 0)
> +         && dl_cet_lock_cet (feature_1_lock) != 0)
>         _dl_fatal_printf ("%s: can't lock CET\n", info->program);
>      }
>
> diff --git a/sysdeps/x86_64/nptl/tls.h b/sysdeps/x86_64/nptl/tls.h
> index 1403f939f7..4bcc2552a1 100644
> --- a/sysdeps/x86_64/nptl/tls.h
> +++ b/sysdeps/x86_64/nptl/tls.h
> @@ -60,7 +60,7 @@ typedef struct
>    void *__private_tm[4];
>    /* GCC split stack support.  */
>    void *__private_ss;
> -  /* The lowest address of shadow stack,  */
> +  /* The marker for the current shadow stack.  */
>    unsigned long long int ssp_base;
>    /* Must be kept even if it is no longer used by glibc since programs,
>       like AddressSanitizer, depend on the size of tcbhead_t.  */
> --
> 2.43.0
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/6] x86/cet: Sync with Linux kernel 6.6 shadow stack interface
  2023-12-26 17:37   ` Noah Goldstein
@ 2023-12-26 17:56     ` H.J. Lu
  2023-12-27  0:40       ` Noah Goldstein
  0 siblings, 1 reply; 18+ messages in thread
From: H.J. Lu @ 2023-12-26 17:56 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: libc-alpha, rick.p.edgecombe

On Tue, Dec 26, 2023 at 9:38 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Fri, Dec 22, 2023 at 8:58 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > Sync with Linux kernel 6.6 shadow stack interface.  Since only x86-64 is
> > supported, i386 shadow stack codes are unchanged and CET shouldn't be
> > enabled for i386.
> >
> > 1. When the shadow stack base in TCB is unset, the default shadow stack
> > is in use.  Use the current shadow stack pointer as the marker for the
> > default shadow stack. It is used to identify if the current shadow stack
> > is the same as the target shadow stack when switching ucontexts.  If yes,
> > INCSSP will be used to unwind shadow stack.  Otherwise, shadow stack
> > restore token will be used.
> > 2. Allocate shadow stack with the map_shadow_stack syscall.  Since there
> > is no function to explicitly release ucontext, there is no place to
> > release shadow stack allocated by map_shadow_stack in ucontext functions.
> > Such shadow stacks will be leaked.
> > 3. Rename arch_prctl CET commands to ARCH_SHSTK_XXX.
> > 4. Rewrite the CET control functions with the current kernel shadow stack
> > interface.
> >
> > Since CET is no longer enabled by kernel, a separate patch will enable
> > shadow stack during startup.
> > ---
> >  sysdeps/unix/sysv/linux/x86/bits/mman.h       |  5 ++
> >  sysdeps/unix/sysv/linux/x86/cpu-features.c    | 13 +++--
> >  sysdeps/unix/sysv/linux/x86/dl-cet.h          | 16 ++++--
> >  .../unix/sysv/linux/x86/include/asm/prctl.h   | 37 ++++++-------
> >  .../sysv/linux/x86/tst-cet-setcontext-1.c     | 17 +++---
> >  sysdeps/unix/sysv/linux/x86_64/Makefile       |  2 +-
> >  .../unix/sysv/linux/x86_64/__start_context.S  | 38 +++----------
> >  .../sysv/linux/x86_64/allocate-shadow-stack.c | 55 +++++++++++++++++++
> >  .../sysv/linux/x86_64/allocate-shadow-stack.h | 24 ++++++++
> >  sysdeps/unix/sysv/linux/x86_64/getcontext.S   | 30 ++--------
> >  sysdeps/unix/sysv/linux/x86_64/makecontext.c  | 28 +++++-----
> >  sysdeps/unix/sysv/linux/x86_64/swapcontext.S  | 22 ++------
> >  sysdeps/x86/cpu-features.c                    | 15 +++--
> >  sysdeps/x86/dl-cet.c                          |  2 +-
> >  sysdeps/x86_64/nptl/tls.h                     |  2 +-
> >  15 files changed, 173 insertions(+), 133 deletions(-)
> >  create mode 100644 sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c
> >  create mode 100644 sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h
> >
> > diff --git a/sysdeps/unix/sysv/linux/x86/bits/mman.h b/sysdeps/unix/sysv/linux/x86/bits/mman.h
> > index 3d356e86a0..232b55a13d 100644
> > --- a/sysdeps/unix/sysv/linux/x86/bits/mman.h
> > +++ b/sysdeps/unix/sysv/linux/x86/bits/mman.h
> > @@ -27,6 +27,11 @@
> >  #define MAP_32BIT      0x40            /* Only give out 32-bit addresses.  */
> >  #define MAP_ABOVE4G    0x80            /* Only map above 4GB.  */
> >
> > +#ifdef __USE_MISC
> > +/* Set up a restore token in the newly allocated shadow stack */
> > +# define SHADOW_STACK_SET_TOKEN 0x1
> > +#endif
> > +
> >  #include <bits/mman-map-flags-generic.h>
> >
> >  /* Include generic Linux declarations.  */
> > diff --git a/sysdeps/unix/sysv/linux/x86/cpu-features.c b/sysdeps/unix/sysv/linux/x86/cpu-features.c
> > index 41e7600668..0e6e2bf855 100644
> > --- a/sysdeps/unix/sysv/linux/x86/cpu-features.c
> > +++ b/sysdeps/unix/sysv/linux/x86/cpu-features.c
> > @@ -23,10 +23,15 @@
> >  static inline int __attribute__ ((always_inline))
> >  get_cet_status (void)
> >  {
> > -  unsigned long long cet_status[3];
> > -  if (INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_CET_STATUS, cet_status) == 0)
> > -    return cet_status[0];
> > -  return 0;
> > +  unsigned long long kernel_feature;
> > +  unsigned int status = 0;
> > +  if (INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_STATUS,
> > +                            &kernel_feature) == 0)
> > +    {
> > +      if ((kernel_feature & ARCH_SHSTK_SHSTK) != 0)
> > +       status = GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> > +    }
> > +  return status;
> >  }
> >
> >  # ifndef SHARED
> > diff --git a/sysdeps/unix/sysv/linux/x86/dl-cet.h b/sysdeps/unix/sysv/linux/x86/dl-cet.h
> > index c885bf1323..da220ac627 100644
> > --- a/sysdeps/unix/sysv/linux/x86/dl-cet.h
> > +++ b/sysdeps/unix/sysv/linux/x86/dl-cet.h
> > @@ -21,12 +21,20 @@
> >  static inline int __attribute__ ((always_inline))
> >  dl_cet_disable_cet (unsigned int cet_feature)
> >  {
> > -  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_CET_DISABLE,
> > -                                     cet_feature);
> > +  if (cet_feature != GNU_PROPERTY_X86_FEATURE_1_SHSTK)
> > +    return -1;
> > +  long long int kernel_feature = ARCH_SHSTK_SHSTK;
> > +  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_DISABLE,
> > +                                     kernel_feature);
> >  }
> >
> >  static inline int __attribute__ ((always_inline))
> > -dl_cet_lock_cet (void)
> > +dl_cet_lock_cet (unsigned int cet_feature)
> >  {
> > -  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_CET_LOCK, 0);
> > +  if (cet_feature != GNU_PROPERTY_X86_FEATURE_1_SHSTK)
> > +    return -1;
> > +  /* Lock all SHSTK features.  */
> > +  long long int kernel_feature = -1;
> > +  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_LOCK,
> > +                                     kernel_feature);
> >  }
> > diff --git a/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h b/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h
> > index 45ad0b052f..2f511321ad 100644
> > --- a/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h
> > +++ b/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h
> > @@ -4,24 +4,19 @@
> >
> >  #include_next <asm/prctl.h>
> >
> > -#ifndef ARCH_CET_STATUS
> > -/* CET features:
> > -   IBT:   GNU_PROPERTY_X86_FEATURE_1_IBT
> > -   SHSTK: GNU_PROPERTY_X86_FEATURE_1_SHSTK
> > - */
> > -/* Return CET features in unsigned long long *addr:
> > -     features: addr[0].
> > -     shadow stack base address: addr[1].
> > -     shadow stack size: addr[2].
> > - */
> > -# define ARCH_CET_STATUS       0x3001
> > -/* Disable CET features in unsigned int features.  */
> > -# define ARCH_CET_DISABLE      0x3002
> > -/* Lock all CET features.  */
> > -# define ARCH_CET_LOCK         0x3003
> > -/* Allocate a new shadow stack with unsigned long long *addr:
> > -     IN: requested shadow stack size: *addr.
> > -     OUT: allocated shadow stack address: *addr.
> > - */
> > -# define ARCH_CET_ALLOC_SHSTK  0x3004
> > -#endif /* ARCH_CET_STATUS */
> > +#ifndef ARCH_SHSTK_ENABLE
> > +/* Enable SHSTK features in unsigned long int features.  */
> > +# define ARCH_SHSTK_ENABLE             0x5001
> > +/* Disable SHSTK features in unsigned long int features.  */
> > +# define ARCH_SHSTK_DISABLE            0x5002
> > +/* Lock SHSTK features in unsigned long int features.  */
> > +# define ARCH_SHSTK_LOCK               0x5003
> > +/* Unlock SHSTK features in unsigned long int features.  */
> > +# define ARCH_SHSTK_UNLOCK             0x5004
> > +/* Return SHSTK features in unsigned long int features.  */
> > +# define ARCH_SHSTK_STATUS             0x5005
> > +
> > +/* ARCH_SHSTK_ features bits */
> > +# define ARCH_SHSTK_SHSTK              0x1
> > +# define ARCH_SHSTK_WRSS               0x2
> > +#endif
> > diff --git a/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c b/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c
> > index 837a9fd0eb..2ea66c803b 100644
> > --- a/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c
> > +++ b/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c
> > @@ -87,15 +87,14 @@ do_test (void)
> >    ctx[4].uc_link = &ctx[0];
> >    makecontext (&ctx[4], (void (*) (void)) f1, 0);
> >
> > -  /* NB: When shadow stack is enabled, makecontext calls arch_prctl
> > -     with ARCH_CET_ALLOC_SHSTK to allocate a new shadow stack which
> > -     can be unmapped.  The base address and size of the new shadow
> > -     stack are returned in __ssp[1] and __ssp[2].  makecontext is
> > -     called for CTX1, CTX3 and CTX4.  But only CTX1 is used.  New
> > -     shadow stacks are allocated in the order of CTX3, CTX1, CTX4.
> > -     It is very likely that CTX1's shadow stack is placed between
> > -     CTX3 and CTX4.  We munmap CTX3's and CTX4's shadow stacks to
> > -     create gaps above and below CTX1's shadow stack.  We check
> > +  /* NB: When shadow stack is enabled, makecontext calls map_shadow_stack
> > +     to allocate a new shadow stack which can be unmapped.  The base
> > +     address and size of the new shadow stack are returned in __ssp[1]
> > +     and __ssp[2].  makecontext is called for CTX1, CTX3 and CTX4.  But
> > +     only CTX1 is used.  New shadow stacks are allocated in the order
> > +     of CTX3, CTX1, CTX4.  It is very likely that CTX1's shadow stack is
> > +     placed between CTX3 and CTX4.  We munmap CTX3's and CTX4's shadow
> > +     stacks to create gaps above and below CTX1's shadow stack.  We check
> >       that setcontext CTX1 works correctly in this case.  */
> >    if (_get_ssp () != 0)
> >      {
> > diff --git a/sysdeps/unix/sysv/linux/x86_64/Makefile b/sysdeps/unix/sysv/linux/x86_64/Makefile
> > index 5e19202ebf..06b873949e 100644
> > --- a/sysdeps/unix/sysv/linux/x86_64/Makefile
> > +++ b/sysdeps/unix/sysv/linux/x86_64/Makefile
> > @@ -3,7 +3,7 @@ sysdep_routines += ioperm iopl
> >  endif
> >
> >  ifeq ($(subdir),stdlib)
> > -sysdep_routines += __start_context
> > +sysdep_routines += __start_context allocate-shadow-stack
> >  endif
> >
> >  ifeq ($(subdir),csu)
> > diff --git a/sysdeps/unix/sysv/linux/x86_64/__start_context.S b/sysdeps/unix/sysv/linux/x86_64/__start_context.S
> > index f6436dd6bb..ae04203c90 100644
> > --- a/sysdeps/unix/sysv/linux/x86_64/__start_context.S
> > +++ b/sysdeps/unix/sysv/linux/x86_64/__start_context.S
> > @@ -24,20 +24,14 @@
> >  /* Use CALL to push __start_context onto the new stack as well as the new
> >     shadow stack.  RDI points to ucontext:
> >     Incoming:
> > -     __ssp[0]: The original caller's shadow stack pointer.
> > -     __ssp[1]: The size of the new shadow stack.
> > -     __ssp[2]: The size of the new shadow stack.
> > -   Outgoing:
> >       __ssp[0]: The new shadow stack pointer.
> >       __ssp[1]: The base address of the new shadow stack.
> >       __ssp[2]: The size of the new shadow stack.
> >   */
> >
> >  ENTRY(__push___start_context)
> > -       /* Save the pointer to ucontext.  */
> > -       movq    %rdi, %r9
> >         /* Get the original shadow stack pointer.  */
> > -       rdsspq  %r8
> > +       rdsspq  %rcx
> >         /* Save the original stack pointer.  */
> >         movq    %rsp, %rdx
> >         /* Load the top of the new stack into RSI.  */
> > @@ -45,24 +39,12 @@ ENTRY(__push___start_context)
> >         /* Add 8 bytes to RSI since CALL will push the 8-byte return
> >            address onto stack.  */
> >         leaq    8(%rsi), %rsp
> > -       /* Allocate the new shadow stack.  The size of the new shadow
> > -          stack is passed in __ssp[1].  */
> > -       lea     (oSSP + 8)(%rdi), %RSI_LP
> > -       movl    $ARCH_CET_ALLOC_SHSTK, %edi
> > -       movl    $__NR_arch_prctl, %eax
> > -       /* The new shadow stack base is returned in __ssp[1].  */
> > -       syscall
> > -       testq   %rax, %rax
> > -       jne     L(hlt)          /* This should never happen.  */
> > -
> > -       /* Get the size of the new shadow stack.  */
> > -       movq    8(%rsi), %rdi
> > -
> > -       /* Get the base address of the new shadow stack.  */
> > -       movq    (%rsi), %rsi
> > -
> > +       /* The size of the new shadow stack is stored in __ssp[2].  */
> > +       mov     (oSSP + 16)(%rdi), %RSI_LP
> > +       /* The new shadow stack base is stored in __ssp[1].  */
> > +       mov     (oSSP + 8)(%rdi), %RAX_LP
> >         /* Use the restore stoken to restore the new shadow stack.  */
> > -       rstorssp -8(%rsi, %rdi)
> > +       rstorssp -8(%rax, %rsi)
> >
> >         /* Save the restore token on the original shadow stack.  */
> >         saveprevssp
> > @@ -73,18 +55,12 @@ ENTRY(__push___start_context)
> >         jmp     __start_context
> >  1:
> >
> > -       /* Get the new shadow stack pointer.  */
> > -       rdsspq  %rdi
> > -
> >         /* Use the restore stoken to restore the original shadow stack.  */
> > -       rstorssp -8(%r8)
> > +       rstorssp -8(%rcx)
> >
> >         /* Save the restore token on the new shadow stack.  */
> >         saveprevssp
> >
> > -       /* Store the new shadow stack pointer in __ssp[0].  */
> > -       movq    %rdi, oSSP(%r9)
> > -
> >         /* Restore the original stack.  */
> >         mov     %rdx, %rsp
> >         ret
> > diff --git a/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c
> > new file mode 100644
> > index 0000000000..f2e1d03b96
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c
> > @@ -0,0 +1,55 @@
> > +/* Helper function to allocate shadow stack.
> > +   Copyright (C) 2023 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#include <sysdep.h>
> > +#include <stdint.h>
> > +#include <errno.h>
> > +#include <sys/mman.h>
> > +#include <libc-pointer-arith.h>
> > +#include <allocate-shadow-stack.h>
> > +
> > +/* NB: This can be treated as a syscall by caller.  */
> > +
> > +long int
> > +__allocate_shadow_stack (size_t stack_size,
> > +                        shadow_stack_size_t *child_stack)
> > +{
> > +#ifdef __NR_map_shadow_stack
> > +  size_t shadow_stack_size
> > +    = stack_size >> STACK_SIZE_TO_SHADOW_STACK_SIZE_SHIFT;
> > +  /* Align shadow stack to 8 bytes.  */
> > +  shadow_stack_size = ALIGN_UP (shadow_stack_size, 8);
> > +  /* Since sigaltstack shares shadow stack with the current context in
> > +     the thread, add extra 20 stack frames in shadow stack for signal
> > +     handlers.  */
> > +  shadow_stack_size += 20 * 8;
> > +  void *shadow_stack = (void *)INLINE_SYSCALL_CALL
> > +    (map_shadow_stack, NULL, shadow_stack_size, SHADOW_STACK_SET_TOKEN);
> > +  /* Report the map_shadow_stack error.  */
> > +  if (shadow_stack == MAP_FAILED)
> > +    return -errno;
> > +
> > +  /* Save the shadow stack base and size on child stack.  */
> > +  child_stack[0] = (uintptr_t) shadow_stack;
> > +  child_stack[1] = shadow_stack_size;
> > +
> > +  return 0;
> > +#else
> > +  return -ENOSYS;
> > +#endif
> > +}
> > diff --git a/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h
> > new file mode 100644
> > index 0000000000..d05aaf16e5
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h
> > @@ -0,0 +1,24 @@
> > +/* Helper function to allocate shadow stack.
> > +   Copyright (C) 2023 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#include <ucontext.h>
> > +
> > +typedef __typeof (((ucontext_t *) 0)->__ssp[0]) shadow_stack_size_t;
> > +
> > +extern long int __allocate_shadow_stack (size_t, shadow_stack_size_t *)
> > +  attribute_hidden;
> > diff --git a/sysdeps/unix/sysv/linux/x86_64/getcontext.S b/sysdeps/unix/sysv/linux/x86_64/getcontext.S
> > index a00e2f6290..71f3802dca 100644
> > --- a/sysdeps/unix/sysv/linux/x86_64/getcontext.S
> > +++ b/sysdeps/unix/sysv/linux/x86_64/getcontext.S
> > @@ -58,35 +58,15 @@ ENTRY(__getcontext)
> >         testl   $X86_FEATURE_1_SHSTK, %fs:FEATURE_1_OFFSET
> >         jz      L(no_shstk)
> >
> > -       /* Save RDI in RDX which won't be clobbered by syscall.  */
> > -       movq    %rdi, %rdx
> > -
> >         xorl    %eax, %eax
> >         cmpq    %fs:SSP_BASE_OFFSET, %rax
> >         jnz     L(shadow_stack_bound_recorded)
> >
> > -       /* Get the base address and size of the default shadow stack
> > -          which must be the current shadow stack since nothing has
> > -          been recorded yet.  */
> > -       sub     $24, %RSP_LP
> > -       mov     %RSP_LP, %RSI_LP
> > -       movl    $ARCH_CET_STATUS, %edi
> > -       movl    $__NR_arch_prctl, %eax
> > -       syscall
> > -       testq   %rax, %rax
> > -       jz      L(continue_no_err)
> > -
> > -       /* This should never happen.  */
> > -       hlt
> > -
> > -L(continue_no_err):
> > -       /* Record the base of the current shadow stack.  */
> > -       movq    8(%rsp), %rax
> > +       /* When the shadow stack base is unset, the default shadow
> > +          stack is in use.  Use the current shadow stack pointer
> > +          as the marker for the default shadow stack.  */
> > +       rdsspq  %rax
> >         movq    %rax, %fs:SSP_BASE_OFFSET
> > -       add     $24, %RSP_LP
> > -
> > -       /* Restore RDI.  */
> > -       movq    %rdx, %rdi
> >
> >  L(shadow_stack_bound_recorded):
> >         /* Get the current shadow stack pointer.  */
> > @@ -94,7 +74,7 @@ L(shadow_stack_bound_recorded):
> >         /* NB: Save the caller's shadow stack so that we can jump back
> >            to the caller directly.  */
> >         addq    $8, %rax
> > -       movq    %rax, oSSP(%rdx)
> > +       movq    %rax, oSSP(%rdi)
> >
> >         /* Save the current shadow stack base in ucontext.  */
> >         movq    %fs:SSP_BASE_OFFSET, %rax
> > diff --git a/sysdeps/unix/sysv/linux/x86_64/makecontext.c b/sysdeps/unix/sysv/linux/x86_64/makecontext.c
> > index de9e03eb81..e4f025bd50 100644
> > --- a/sysdeps/unix/sysv/linux/x86_64/makecontext.c
> > +++ b/sysdeps/unix/sysv/linux/x86_64/makecontext.c
> > @@ -24,6 +24,7 @@
> >  # include <pthread.h>
> >  # include <libc-pointer-arith.h>
> >  # include <sys/prctl.h>
> > +# include <allocate-shadow-stack.h>
> >  #endif
> >
> >  #include "ucontext_i.h"
> > @@ -88,23 +89,24 @@ __makecontext (ucontext_t *ucp, void (*func) (void), int argc, ...)
> >    if ((feature_1 & X86_FEATURE_1_SHSTK) != 0)
> >      {
> >        /* Shadow stack is enabled.  We need to allocate a new shadow
> > -         stack.  */
> > -      unsigned long ssp_size = (((uintptr_t) sp
> > -                                - (uintptr_t) ucp->uc_stack.ss_sp)
> > -                               >> STACK_SIZE_TO_SHADOW_STACK_SIZE_SHIFT);
> > -      /* Align shadow stack to 8 bytes.  */
> > -      ssp_size = ALIGN_UP (ssp_size, 8);
> > -
> > -      ucp->__ssp[1] = ssp_size;
> > -      ucp->__ssp[2] = ssp_size;
> > -
> > -      /* Call __push___start_context to allocate a new shadow stack,
> > -        push __start_context onto the new stack as well as the new
> > -        shadow stack.  NB: After __push___start_context returns,
> > +         stack.  NB:
> >            ucp->__ssp[0]: The new shadow stack pointer.
> >            ucp->__ssp[1]: The base address of the new shadow stack.
> >            ucp->__ssp[2]: The size of the new shadow stack.
> >         */
> > +      long int ret
> > +       = __allocate_shadow_stack (((uintptr_t) sp
> > +                                   - (uintptr_t) ucp->uc_stack.ss_sp),
> > +                                  &ucp->__ssp[1]);
> > +      if (ret != 0)
> > +       {
> > +         /* FIXME: What should we do?  */
> > +         abort ();
> > +       }
> > +
> > +      ucp->__ssp[0] = ucp->__ssp[1] + ucp->__ssp[2] - 8;
> > +      /* Call __push___start_context to push __start_context onto the new
> > +        stack as well as the new shadow stack.  */
> >        __push___start_context (ucp);
> >      }
> >    else
> > diff --git a/sysdeps/unix/sysv/linux/x86_64/swapcontext.S b/sysdeps/unix/sysv/linux/x86_64/swapcontext.S
> > index 5925752164..2f2fe9875b 100644
> > --- a/sysdeps/unix/sysv/linux/x86_64/swapcontext.S
> > +++ b/sysdeps/unix/sysv/linux/x86_64/swapcontext.S
> > @@ -109,25 +109,11 @@ ENTRY(__swapcontext)
> >         cmpq    %fs:SSP_BASE_OFFSET, %rax
> >         jnz     L(shadow_stack_bound_recorded)
> >
> > -       /* Get the base address and size of the default shadow stack
> > -          which must be the current shadow stack since nothing has
> > -          been recorded yet.  */
> > -       sub     $24, %RSP_LP
> > -       mov     %RSP_LP, %RSI_LP
> > -       movl    $ARCH_CET_STATUS, %edi
> > -       movl    $__NR_arch_prctl, %eax
> > -       syscall
> > -       testq   %rax, %rax
> > -       jz      L(continue_no_err)
> > -
> > -       /* This should never happen.  */
> > -       hlt
> > -
> > -L(continue_no_err):
> > -       /* Record the base of the current shadow stack.  */
> > -       movq    8(%rsp), %rax
> > +       /* When the shadow stack base is unset, the default shadow
> > +          stack is in use.  Use the current shadow stack pointer
> > +          as the marker for the default shadow stack.  */
> > +       rdsspq  %rax
> >         movq    %rax, %fs:SSP_BASE_OFFSET
> > -       add     $24, %RSP_LP
> >
> >  L(shadow_stack_bound_recorded):
> >          /* If we unwind the stack, we can't undo stack unwinding.  Just
> > diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> > index 0bf923d48b..f180f0d9a4 100644
> > --- a/sysdeps/x86/cpu-features.c
> > +++ b/sysdeps/x86/cpu-features.c
> > @@ -1121,8 +1121,9 @@ no_cpuid:
> >
> >  # ifndef SHARED
> >        /* Check if IBT and SHSTK are enabled by kernel.  */
> > -      if ((cet_status & GNU_PROPERTY_X86_FEATURE_1_IBT)
> > -         || (cet_status & GNU_PROPERTY_X86_FEATURE_1_SHSTK))
> > +      if ((cet_status
> > +          & (GNU_PROPERTY_X86_FEATURE_1_IBT
> > +             | GNU_PROPERTY_X86_FEATURE_1_SHSTK)))
>
> I think the code here and elsewhere would be simplifiable with a
> define/enum of
> `GNU_PROPERTY_X86_FEATURE_1_SHSTK_OR_IBT =
> GNU_PROPERTY_X86_FEATURE_1_SHSTK | GNU_PROPERTY_X86_FEATURE_1_IBT`

That may touch more places beyond this patch series.  A separate
patch after this series has been merged?

> >         {
> >           /* Disable IBT and/or SHSTK if they are enabled by kernel, but
> >              disabled by environment variable:
> > @@ -1131,9 +1132,11 @@ no_cpuid:
> >            */
> >           unsigned int cet_feature = 0;
> >           if (!CPU_FEATURE_USABLE (IBT))
> > -           cet_feature |= GNU_PROPERTY_X86_FEATURE_1_IBT;
> > +           cet_feature |= (cet_status
> > +                           & GNU_PROPERTY_X86_FEATURE_1_IBT);
> >           if (!CPU_FEATURE_USABLE (SHSTK))
> > -           cet_feature |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> > +           cet_feature |= (cet_status
> > +                           & GNU_PROPERTY_X86_FEATURE_1_SHSTK);
> >
> >           if (cet_feature)
> >             {
> > @@ -1148,7 +1151,9 @@ no_cpuid:
> >              lock CET if IBT or SHSTK is enabled permissively.  */
> >           if (GL(dl_x86_feature_control).ibt != cet_permissive
> >               && GL(dl_x86_feature_control).shstk != cet_permissive)
> > -           dl_cet_lock_cet ();
> > +           dl_cet_lock_cet (GL(dl_x86_feature_1)
> > +                            & (GNU_PROPERTY_X86_FEATURE_1_IBT
> > +                               | GNU_PROPERTY_X86_FEATURE_1_SHSTK));
> >         }
> >  # endif
> >      }
> > diff --git a/sysdeps/x86/dl-cet.c b/sysdeps/x86/dl-cet.c
> > index e486e549be..66a78244d4 100644
> > --- a/sysdeps/x86/dl-cet.c
> > +++ b/sysdeps/x86/dl-cet.c
> > @@ -202,7 +202,7 @@ dl_cet_check_startup (struct link_map *m, struct dl_cet_info *info)
> >         feature_1_lock |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> >
> >        if (feature_1_lock != 0
> > -         && dl_cet_lock_cet () != 0)
> > +         && dl_cet_lock_cet (feature_1_lock) != 0)
> >         _dl_fatal_printf ("%s: can't lock CET\n", info->program);
> >      }
> >
> > diff --git a/sysdeps/x86_64/nptl/tls.h b/sysdeps/x86_64/nptl/tls.h
> > index 1403f939f7..4bcc2552a1 100644
> > --- a/sysdeps/x86_64/nptl/tls.h
> > +++ b/sysdeps/x86_64/nptl/tls.h
> > @@ -60,7 +60,7 @@ typedef struct
> >    void *__private_tm[4];
> >    /* GCC split stack support.  */
> >    void *__private_ss;
> > -  /* The lowest address of shadow stack,  */
> > +  /* The marker for the current shadow stack.  */
> >    unsigned long long int ssp_base;
> >    /* Must be kept even if it is no longer used by glibc since programs,
> >       like AddressSanitizer, depend on the size of tcbhead_t.  */
> > --
> > 2.43.0
> >



-- 
H.J.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 1/6] x86/cet: Sync with Linux kernel 6.6 shadow stack interface
  2023-12-26 17:56     ` H.J. Lu
@ 2023-12-27  0:40       ` Noah Goldstein
  0 siblings, 0 replies; 18+ messages in thread
From: Noah Goldstein @ 2023-12-27  0:40 UTC (permalink / raw)
  To: H.J. Lu; +Cc: libc-alpha, rick.p.edgecombe

On Tue, Dec 26, 2023 at 9:57 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Tue, Dec 26, 2023 at 9:38 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > On Fri, Dec 22, 2023 at 8:58 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > Sync with Linux kernel 6.6 shadow stack interface.  Since only x86-64 is
> > > supported, i386 shadow stack codes are unchanged and CET shouldn't be
> > > enabled for i386.
> > >
> > > 1. When the shadow stack base in TCB is unset, the default shadow stack
> > > is in use.  Use the current shadow stack pointer as the marker for the
> > > default shadow stack. It is used to identify if the current shadow stack
> > > is the same as the target shadow stack when switching ucontexts.  If yes,
> > > INCSSP will be used to unwind shadow stack.  Otherwise, shadow stack
> > > restore token will be used.
> > > 2. Allocate shadow stack with the map_shadow_stack syscall.  Since there
> > > is no function to explicitly release ucontext, there is no place to
> > > release shadow stack allocated by map_shadow_stack in ucontext functions.
> > > Such shadow stacks will be leaked.
> > > 3. Rename arch_prctl CET commands to ARCH_SHSTK_XXX.
> > > 4. Rewrite the CET control functions with the current kernel shadow stack
> > > interface.
> > >
> > > Since CET is no longer enabled by kernel, a separate patch will enable
> > > shadow stack during startup.
> > > ---
> > >  sysdeps/unix/sysv/linux/x86/bits/mman.h       |  5 ++
> > >  sysdeps/unix/sysv/linux/x86/cpu-features.c    | 13 +++--
> > >  sysdeps/unix/sysv/linux/x86/dl-cet.h          | 16 ++++--
> > >  .../unix/sysv/linux/x86/include/asm/prctl.h   | 37 ++++++-------
> > >  .../sysv/linux/x86/tst-cet-setcontext-1.c     | 17 +++---
> > >  sysdeps/unix/sysv/linux/x86_64/Makefile       |  2 +-
> > >  .../unix/sysv/linux/x86_64/__start_context.S  | 38 +++----------
> > >  .../sysv/linux/x86_64/allocate-shadow-stack.c | 55 +++++++++++++++++++
> > >  .../sysv/linux/x86_64/allocate-shadow-stack.h | 24 ++++++++
> > >  sysdeps/unix/sysv/linux/x86_64/getcontext.S   | 30 ++--------
> > >  sysdeps/unix/sysv/linux/x86_64/makecontext.c  | 28 +++++-----
> > >  sysdeps/unix/sysv/linux/x86_64/swapcontext.S  | 22 ++------
> > >  sysdeps/x86/cpu-features.c                    | 15 +++--
> > >  sysdeps/x86/dl-cet.c                          |  2 +-
> > >  sysdeps/x86_64/nptl/tls.h                     |  2 +-
> > >  15 files changed, 173 insertions(+), 133 deletions(-)
> > >  create mode 100644 sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c
> > >  create mode 100644 sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h
> > >
> > > diff --git a/sysdeps/unix/sysv/linux/x86/bits/mman.h b/sysdeps/unix/sysv/linux/x86/bits/mman.h
> > > index 3d356e86a0..232b55a13d 100644
> > > --- a/sysdeps/unix/sysv/linux/x86/bits/mman.h
> > > +++ b/sysdeps/unix/sysv/linux/x86/bits/mman.h
> > > @@ -27,6 +27,11 @@
> > >  #define MAP_32BIT      0x40            /* Only give out 32-bit addresses.  */
> > >  #define MAP_ABOVE4G    0x80            /* Only map above 4GB.  */
> > >
> > > +#ifdef __USE_MISC
> > > +/* Set up a restore token in the newly allocated shadow stack */
> > > +# define SHADOW_STACK_SET_TOKEN 0x1
> > > +#endif
> > > +
> > >  #include <bits/mman-map-flags-generic.h>
> > >
> > >  /* Include generic Linux declarations.  */
> > > diff --git a/sysdeps/unix/sysv/linux/x86/cpu-features.c b/sysdeps/unix/sysv/linux/x86/cpu-features.c
> > > index 41e7600668..0e6e2bf855 100644
> > > --- a/sysdeps/unix/sysv/linux/x86/cpu-features.c
> > > +++ b/sysdeps/unix/sysv/linux/x86/cpu-features.c
> > > @@ -23,10 +23,15 @@
> > >  static inline int __attribute__ ((always_inline))
> > >  get_cet_status (void)
> > >  {
> > > -  unsigned long long cet_status[3];
> > > -  if (INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_CET_STATUS, cet_status) == 0)
> > > -    return cet_status[0];
> > > -  return 0;
> > > +  unsigned long long kernel_feature;
> > > +  unsigned int status = 0;
> > > +  if (INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_STATUS,
> > > +                            &kernel_feature) == 0)
> > > +    {
> > > +      if ((kernel_feature & ARCH_SHSTK_SHSTK) != 0)
> > > +       status = GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> > > +    }
> > > +  return status;
> > >  }
> > >
> > >  # ifndef SHARED
> > > diff --git a/sysdeps/unix/sysv/linux/x86/dl-cet.h b/sysdeps/unix/sysv/linux/x86/dl-cet.h
> > > index c885bf1323..da220ac627 100644
> > > --- a/sysdeps/unix/sysv/linux/x86/dl-cet.h
> > > +++ b/sysdeps/unix/sysv/linux/x86/dl-cet.h
> > > @@ -21,12 +21,20 @@
> > >  static inline int __attribute__ ((always_inline))
> > >  dl_cet_disable_cet (unsigned int cet_feature)
> > >  {
> > > -  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_CET_DISABLE,
> > > -                                     cet_feature);
> > > +  if (cet_feature != GNU_PROPERTY_X86_FEATURE_1_SHSTK)
> > > +    return -1;
> > > +  long long int kernel_feature = ARCH_SHSTK_SHSTK;
> > > +  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_DISABLE,
> > > +                                     kernel_feature);
> > >  }
> > >
> > >  static inline int __attribute__ ((always_inline))
> > > -dl_cet_lock_cet (void)
> > > +dl_cet_lock_cet (unsigned int cet_feature)
> > >  {
> > > -  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_CET_LOCK, 0);
> > > +  if (cet_feature != GNU_PROPERTY_X86_FEATURE_1_SHSTK)
> > > +    return -1;
> > > +  /* Lock all SHSTK features.  */
> > > +  long long int kernel_feature = -1;
> > > +  return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_LOCK,
> > > +                                     kernel_feature);
> > >  }
> > > diff --git a/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h b/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h
> > > index 45ad0b052f..2f511321ad 100644
> > > --- a/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h
> > > +++ b/sysdeps/unix/sysv/linux/x86/include/asm/prctl.h
> > > @@ -4,24 +4,19 @@
> > >
> > >  #include_next <asm/prctl.h>
> > >
> > > -#ifndef ARCH_CET_STATUS
> > > -/* CET features:
> > > -   IBT:   GNU_PROPERTY_X86_FEATURE_1_IBT
> > > -   SHSTK: GNU_PROPERTY_X86_FEATURE_1_SHSTK
> > > - */
> > > -/* Return CET features in unsigned long long *addr:
> > > -     features: addr[0].
> > > -     shadow stack base address: addr[1].
> > > -     shadow stack size: addr[2].
> > > - */
> > > -# define ARCH_CET_STATUS       0x3001
> > > -/* Disable CET features in unsigned int features.  */
> > > -# define ARCH_CET_DISABLE      0x3002
> > > -/* Lock all CET features.  */
> > > -# define ARCH_CET_LOCK         0x3003
> > > -/* Allocate a new shadow stack with unsigned long long *addr:
> > > -     IN: requested shadow stack size: *addr.
> > > -     OUT: allocated shadow stack address: *addr.
> > > - */
> > > -# define ARCH_CET_ALLOC_SHSTK  0x3004
> > > -#endif /* ARCH_CET_STATUS */
> > > +#ifndef ARCH_SHSTK_ENABLE
> > > +/* Enable SHSTK features in unsigned long int features.  */
> > > +# define ARCH_SHSTK_ENABLE             0x5001
> > > +/* Disable SHSTK features in unsigned long int features.  */
> > > +# define ARCH_SHSTK_DISABLE            0x5002
> > > +/* Lock SHSTK features in unsigned long int features.  */
> > > +# define ARCH_SHSTK_LOCK               0x5003
> > > +/* Unlock SHSTK features in unsigned long int features.  */
> > > +# define ARCH_SHSTK_UNLOCK             0x5004
> > > +/* Return SHSTK features in unsigned long int features.  */
> > > +# define ARCH_SHSTK_STATUS             0x5005
> > > +
> > > +/* ARCH_SHSTK_ features bits */
> > > +# define ARCH_SHSTK_SHSTK              0x1
> > > +# define ARCH_SHSTK_WRSS               0x2
> > > +#endif
> > > diff --git a/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c b/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c
> > > index 837a9fd0eb..2ea66c803b 100644
> > > --- a/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c
> > > +++ b/sysdeps/unix/sysv/linux/x86/tst-cet-setcontext-1.c
> > > @@ -87,15 +87,14 @@ do_test (void)
> > >    ctx[4].uc_link = &ctx[0];
> > >    makecontext (&ctx[4], (void (*) (void)) f1, 0);
> > >
> > > -  /* NB: When shadow stack is enabled, makecontext calls arch_prctl
> > > -     with ARCH_CET_ALLOC_SHSTK to allocate a new shadow stack which
> > > -     can be unmapped.  The base address and size of the new shadow
> > > -     stack are returned in __ssp[1] and __ssp[2].  makecontext is
> > > -     called for CTX1, CTX3 and CTX4.  But only CTX1 is used.  New
> > > -     shadow stacks are allocated in the order of CTX3, CTX1, CTX4.
> > > -     It is very likely that CTX1's shadow stack is placed between
> > > -     CTX3 and CTX4.  We munmap CTX3's and CTX4's shadow stacks to
> > > -     create gaps above and below CTX1's shadow stack.  We check
> > > +  /* NB: When shadow stack is enabled, makecontext calls map_shadow_stack
> > > +     to allocate a new shadow stack which can be unmapped.  The base
> > > +     address and size of the new shadow stack are returned in __ssp[1]
> > > +     and __ssp[2].  makecontext is called for CTX1, CTX3 and CTX4.  But
> > > +     only CTX1 is used.  New shadow stacks are allocated in the order
> > > +     of CTX3, CTX1, CTX4.  It is very likely that CTX1's shadow stack is
> > > +     placed between CTX3 and CTX4.  We munmap CTX3's and CTX4's shadow
> > > +     stacks to create gaps above and below CTX1's shadow stack.  We check
> > >       that setcontext CTX1 works correctly in this case.  */
> > >    if (_get_ssp () != 0)
> > >      {
> > > diff --git a/sysdeps/unix/sysv/linux/x86_64/Makefile b/sysdeps/unix/sysv/linux/x86_64/Makefile
> > > index 5e19202ebf..06b873949e 100644
> > > --- a/sysdeps/unix/sysv/linux/x86_64/Makefile
> > > +++ b/sysdeps/unix/sysv/linux/x86_64/Makefile
> > > @@ -3,7 +3,7 @@ sysdep_routines += ioperm iopl
> > >  endif
> > >
> > >  ifeq ($(subdir),stdlib)
> > > -sysdep_routines += __start_context
> > > +sysdep_routines += __start_context allocate-shadow-stack
> > >  endif
> > >
> > >  ifeq ($(subdir),csu)
> > > diff --git a/sysdeps/unix/sysv/linux/x86_64/__start_context.S b/sysdeps/unix/sysv/linux/x86_64/__start_context.S
> > > index f6436dd6bb..ae04203c90 100644
> > > --- a/sysdeps/unix/sysv/linux/x86_64/__start_context.S
> > > +++ b/sysdeps/unix/sysv/linux/x86_64/__start_context.S
> > > @@ -24,20 +24,14 @@
> > >  /* Use CALL to push __start_context onto the new stack as well as the new
> > >     shadow stack.  RDI points to ucontext:
> > >     Incoming:
> > > -     __ssp[0]: The original caller's shadow stack pointer.
> > > -     __ssp[1]: The size of the new shadow stack.
> > > -     __ssp[2]: The size of the new shadow stack.
> > > -   Outgoing:
> > >       __ssp[0]: The new shadow stack pointer.
> > >       __ssp[1]: The base address of the new shadow stack.
> > >       __ssp[2]: The size of the new shadow stack.
> > >   */
> > >
> > >  ENTRY(__push___start_context)
> > > -       /* Save the pointer to ucontext.  */
> > > -       movq    %rdi, %r9
> > >         /* Get the original shadow stack pointer.  */
> > > -       rdsspq  %r8
> > > +       rdsspq  %rcx
> > >         /* Save the original stack pointer.  */
> > >         movq    %rsp, %rdx
> > >         /* Load the top of the new stack into RSI.  */
> > > @@ -45,24 +39,12 @@ ENTRY(__push___start_context)
> > >         /* Add 8 bytes to RSI since CALL will push the 8-byte return
> > >            address onto stack.  */
> > >         leaq    8(%rsi), %rsp
> > > -       /* Allocate the new shadow stack.  The size of the new shadow
> > > -          stack is passed in __ssp[1].  */
> > > -       lea     (oSSP + 8)(%rdi), %RSI_LP
> > > -       movl    $ARCH_CET_ALLOC_SHSTK, %edi
> > > -       movl    $__NR_arch_prctl, %eax
> > > -       /* The new shadow stack base is returned in __ssp[1].  */
> > > -       syscall
> > > -       testq   %rax, %rax
> > > -       jne     L(hlt)          /* This should never happen.  */
> > > -
> > > -       /* Get the size of the new shadow stack.  */
> > > -       movq    8(%rsi), %rdi
> > > -
> > > -       /* Get the base address of the new shadow stack.  */
> > > -       movq    (%rsi), %rsi
> > > -
> > > +       /* The size of the new shadow stack is stored in __ssp[2].  */
> > > +       mov     (oSSP + 16)(%rdi), %RSI_LP
> > > +       /* The new shadow stack base is stored in __ssp[1].  */
> > > +       mov     (oSSP + 8)(%rdi), %RAX_LP
> > >         /* Use the restore stoken to restore the new shadow stack.  */
> > > -       rstorssp -8(%rsi, %rdi)
> > > +       rstorssp -8(%rax, %rsi)
> > >
> > >         /* Save the restore token on the original shadow stack.  */
> > >         saveprevssp
> > > @@ -73,18 +55,12 @@ ENTRY(__push___start_context)
> > >         jmp     __start_context
> > >  1:
> > >
> > > -       /* Get the new shadow stack pointer.  */
> > > -       rdsspq  %rdi
> > > -
> > >         /* Use the restore stoken to restore the original shadow stack.  */
> > > -       rstorssp -8(%r8)
> > > +       rstorssp -8(%rcx)
> > >
> > >         /* Save the restore token on the new shadow stack.  */
> > >         saveprevssp
> > >
> > > -       /* Store the new shadow stack pointer in __ssp[0].  */
> > > -       movq    %rdi, oSSP(%r9)
> > > -
> > >         /* Restore the original stack.  */
> > >         mov     %rdx, %rsp
> > >         ret
> > > diff --git a/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c
> > > new file mode 100644
> > > index 0000000000..f2e1d03b96
> > > --- /dev/null
> > > +++ b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.c
> > > @@ -0,0 +1,55 @@
> > > +/* Helper function to allocate shadow stack.
> > > +   Copyright (C) 2023 Free Software Foundation, Inc.
> > > +   This file is part of the GNU C Library.
> > > +
> > > +   The GNU C Library is free software; you can redistribute it and/or
> > > +   modify it under the terms of the GNU Lesser General Public
> > > +   License as published by the Free Software Foundation; either
> > > +   version 2.1 of the License, or (at your option) any later version.
> > > +
> > > +   The GNU C Library is distributed in the hope that it will be useful,
> > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > +   Lesser General Public License for more details.
> > > +
> > > +   You should have received a copy of the GNU Lesser General Public
> > > +   License along with the GNU C Library; if not, see
> > > +   <https://www.gnu.org/licenses/>.  */
> > > +
> > > +#include <sysdep.h>
> > > +#include <stdint.h>
> > > +#include <errno.h>
> > > +#include <sys/mman.h>
> > > +#include <libc-pointer-arith.h>
> > > +#include <allocate-shadow-stack.h>
> > > +
> > > +/* NB: This can be treated as a syscall by caller.  */
> > > +
> > > +long int
> > > +__allocate_shadow_stack (size_t stack_size,
> > > +                        shadow_stack_size_t *child_stack)
> > > +{
> > > +#ifdef __NR_map_shadow_stack
> > > +  size_t shadow_stack_size
> > > +    = stack_size >> STACK_SIZE_TO_SHADOW_STACK_SIZE_SHIFT;
> > > +  /* Align shadow stack to 8 bytes.  */
> > > +  shadow_stack_size = ALIGN_UP (shadow_stack_size, 8);
> > > +  /* Since sigaltstack shares shadow stack with the current context in
> > > +     the thread, add extra 20 stack frames in shadow stack for signal
> > > +     handlers.  */
> > > +  shadow_stack_size += 20 * 8;
> > > +  void *shadow_stack = (void *)INLINE_SYSCALL_CALL
> > > +    (map_shadow_stack, NULL, shadow_stack_size, SHADOW_STACK_SET_TOKEN);
> > > +  /* Report the map_shadow_stack error.  */
> > > +  if (shadow_stack == MAP_FAILED)
> > > +    return -errno;
> > > +
> > > +  /* Save the shadow stack base and size on child stack.  */
> > > +  child_stack[0] = (uintptr_t) shadow_stack;
> > > +  child_stack[1] = shadow_stack_size;
> > > +
> > > +  return 0;
> > > +#else
> > > +  return -ENOSYS;
> > > +#endif
> > > +}
> > > diff --git a/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h
> > > new file mode 100644
> > > index 0000000000..d05aaf16e5
> > > --- /dev/null
> > > +++ b/sysdeps/unix/sysv/linux/x86_64/allocate-shadow-stack.h
> > > @@ -0,0 +1,24 @@
> > > +/* Helper function to allocate shadow stack.
> > > +   Copyright (C) 2023 Free Software Foundation, Inc.
> > > +   This file is part of the GNU C Library.
> > > +
> > > +   The GNU C Library is free software; you can redistribute it and/or
> > > +   modify it under the terms of the GNU Lesser General Public
> > > +   License as published by the Free Software Foundation; either
> > > +   version 2.1 of the License, or (at your option) any later version.
> > > +
> > > +   The GNU C Library is distributed in the hope that it will be useful,
> > > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > +   Lesser General Public License for more details.
> > > +
> > > +   You should have received a copy of the GNU Lesser General Public
> > > +   License along with the GNU C Library; if not, see
> > > +   <https://www.gnu.org/licenses/>.  */
> > > +
> > > +#include <ucontext.h>
> > > +
> > > +typedef __typeof (((ucontext_t *) 0)->__ssp[0]) shadow_stack_size_t;
> > > +
> > > +extern long int __allocate_shadow_stack (size_t, shadow_stack_size_t *)
> > > +  attribute_hidden;
> > > diff --git a/sysdeps/unix/sysv/linux/x86_64/getcontext.S b/sysdeps/unix/sysv/linux/x86_64/getcontext.S
> > > index a00e2f6290..71f3802dca 100644
> > > --- a/sysdeps/unix/sysv/linux/x86_64/getcontext.S
> > > +++ b/sysdeps/unix/sysv/linux/x86_64/getcontext.S
> > > @@ -58,35 +58,15 @@ ENTRY(__getcontext)
> > >         testl   $X86_FEATURE_1_SHSTK, %fs:FEATURE_1_OFFSET
> > >         jz      L(no_shstk)
> > >
> > > -       /* Save RDI in RDX which won't be clobbered by syscall.  */
> > > -       movq    %rdi, %rdx
> > > -
> > >         xorl    %eax, %eax
> > >         cmpq    %fs:SSP_BASE_OFFSET, %rax
> > >         jnz     L(shadow_stack_bound_recorded)
> > >
> > > -       /* Get the base address and size of the default shadow stack
> > > -          which must be the current shadow stack since nothing has
> > > -          been recorded yet.  */
> > > -       sub     $24, %RSP_LP
> > > -       mov     %RSP_LP, %RSI_LP
> > > -       movl    $ARCH_CET_STATUS, %edi
> > > -       movl    $__NR_arch_prctl, %eax
> > > -       syscall
> > > -       testq   %rax, %rax
> > > -       jz      L(continue_no_err)
> > > -
> > > -       /* This should never happen.  */
> > > -       hlt
> > > -
> > > -L(continue_no_err):
> > > -       /* Record the base of the current shadow stack.  */
> > > -       movq    8(%rsp), %rax
> > > +       /* When the shadow stack base is unset, the default shadow
> > > +          stack is in use.  Use the current shadow stack pointer
> > > +          as the marker for the default shadow stack.  */
> > > +       rdsspq  %rax
> > >         movq    %rax, %fs:SSP_BASE_OFFSET
> > > -       add     $24, %RSP_LP
> > > -
> > > -       /* Restore RDI.  */
> > > -       movq    %rdx, %rdi
> > >
> > >  L(shadow_stack_bound_recorded):
> > >         /* Get the current shadow stack pointer.  */
> > > @@ -94,7 +74,7 @@ L(shadow_stack_bound_recorded):
> > >         /* NB: Save the caller's shadow stack so that we can jump back
> > >            to the caller directly.  */
> > >         addq    $8, %rax
> > > -       movq    %rax, oSSP(%rdx)
> > > +       movq    %rax, oSSP(%rdi)
> > >
> > >         /* Save the current shadow stack base in ucontext.  */
> > >         movq    %fs:SSP_BASE_OFFSET, %rax
> > > diff --git a/sysdeps/unix/sysv/linux/x86_64/makecontext.c b/sysdeps/unix/sysv/linux/x86_64/makecontext.c
> > > index de9e03eb81..e4f025bd50 100644
> > > --- a/sysdeps/unix/sysv/linux/x86_64/makecontext.c
> > > +++ b/sysdeps/unix/sysv/linux/x86_64/makecontext.c
> > > @@ -24,6 +24,7 @@
> > >  # include <pthread.h>
> > >  # include <libc-pointer-arith.h>
> > >  # include <sys/prctl.h>
> > > +# include <allocate-shadow-stack.h>
> > >  #endif
> > >
> > >  #include "ucontext_i.h"
> > > @@ -88,23 +89,24 @@ __makecontext (ucontext_t *ucp, void (*func) (void), int argc, ...)
> > >    if ((feature_1 & X86_FEATURE_1_SHSTK) != 0)
> > >      {
> > >        /* Shadow stack is enabled.  We need to allocate a new shadow
> > > -         stack.  */
> > > -      unsigned long ssp_size = (((uintptr_t) sp
> > > -                                - (uintptr_t) ucp->uc_stack.ss_sp)
> > > -                               >> STACK_SIZE_TO_SHADOW_STACK_SIZE_SHIFT);
> > > -      /* Align shadow stack to 8 bytes.  */
> > > -      ssp_size = ALIGN_UP (ssp_size, 8);
> > > -
> > > -      ucp->__ssp[1] = ssp_size;
> > > -      ucp->__ssp[2] = ssp_size;
> > > -
> > > -      /* Call __push___start_context to allocate a new shadow stack,
> > > -        push __start_context onto the new stack as well as the new
> > > -        shadow stack.  NB: After __push___start_context returns,
> > > +         stack.  NB:
> > >            ucp->__ssp[0]: The new shadow stack pointer.
> > >            ucp->__ssp[1]: The base address of the new shadow stack.
> > >            ucp->__ssp[2]: The size of the new shadow stack.
> > >         */
> > > +      long int ret
> > > +       = __allocate_shadow_stack (((uintptr_t) sp
> > > +                                   - (uintptr_t) ucp->uc_stack.ss_sp),
> > > +                                  &ucp->__ssp[1]);
> > > +      if (ret != 0)
> > > +       {
> > > +         /* FIXME: What should we do?  */
> > > +         abort ();
> > > +       }
> > > +
> > > +      ucp->__ssp[0] = ucp->__ssp[1] + ucp->__ssp[2] - 8;
> > > +      /* Call __push___start_context to push __start_context onto the new
> > > +        stack as well as the new shadow stack.  */
> > >        __push___start_context (ucp);
> > >      }
> > >    else
> > > diff --git a/sysdeps/unix/sysv/linux/x86_64/swapcontext.S b/sysdeps/unix/sysv/linux/x86_64/swapcontext.S
> > > index 5925752164..2f2fe9875b 100644
> > > --- a/sysdeps/unix/sysv/linux/x86_64/swapcontext.S
> > > +++ b/sysdeps/unix/sysv/linux/x86_64/swapcontext.S
> > > @@ -109,25 +109,11 @@ ENTRY(__swapcontext)
> > >         cmpq    %fs:SSP_BASE_OFFSET, %rax
> > >         jnz     L(shadow_stack_bound_recorded)
> > >
> > > -       /* Get the base address and size of the default shadow stack
> > > -          which must be the current shadow stack since nothing has
> > > -          been recorded yet.  */
> > > -       sub     $24, %RSP_LP
> > > -       mov     %RSP_LP, %RSI_LP
> > > -       movl    $ARCH_CET_STATUS, %edi
> > > -       movl    $__NR_arch_prctl, %eax
> > > -       syscall
> > > -       testq   %rax, %rax
> > > -       jz      L(continue_no_err)
> > > -
> > > -       /* This should never happen.  */
> > > -       hlt
> > > -
> > > -L(continue_no_err):
> > > -       /* Record the base of the current shadow stack.  */
> > > -       movq    8(%rsp), %rax
> > > +       /* When the shadow stack base is unset, the default shadow
> > > +          stack is in use.  Use the current shadow stack pointer
> > > +          as the marker for the default shadow stack.  */
> > > +       rdsspq  %rax
> > >         movq    %rax, %fs:SSP_BASE_OFFSET
> > > -       add     $24, %RSP_LP
> > >
> > >  L(shadow_stack_bound_recorded):
> > >          /* If we unwind the stack, we can't undo stack unwinding.  Just
> > > diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> > > index 0bf923d48b..f180f0d9a4 100644
> > > --- a/sysdeps/x86/cpu-features.c
> > > +++ b/sysdeps/x86/cpu-features.c
> > > @@ -1121,8 +1121,9 @@ no_cpuid:
> > >
> > >  # ifndef SHARED
> > >        /* Check if IBT and SHSTK are enabled by kernel.  */
> > > -      if ((cet_status & GNU_PROPERTY_X86_FEATURE_1_IBT)
> > > -         || (cet_status & GNU_PROPERTY_X86_FEATURE_1_SHSTK))
> > > +      if ((cet_status
> > > +          & (GNU_PROPERTY_X86_FEATURE_1_IBT
> > > +             | GNU_PROPERTY_X86_FEATURE_1_SHSTK)))
> >
> > I think the code here and elsewhere would be simplifiable with a
> > define/enum of
> > `GNU_PROPERTY_X86_FEATURE_1_SHSTK_OR_IBT =
> > GNU_PROPERTY_X86_FEATURE_1_SHSTK | GNU_PROPERTY_X86_FEATURE_1_IBT`
>
> That may touch more places beyond this patch series.  A separate
> patch after this series has been merged?

Fair enough.
>
> > >         {
> > >           /* Disable IBT and/or SHSTK if they are enabled by kernel, but
> > >              disabled by environment variable:
> > > @@ -1131,9 +1132,11 @@ no_cpuid:
> > >            */
> > >           unsigned int cet_feature = 0;
> > >           if (!CPU_FEATURE_USABLE (IBT))
> > > -           cet_feature |= GNU_PROPERTY_X86_FEATURE_1_IBT;
> > > +           cet_feature |= (cet_status
> > > +                           & GNU_PROPERTY_X86_FEATURE_1_IBT);
> > >           if (!CPU_FEATURE_USABLE (SHSTK))
> > > -           cet_feature |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> > > +           cet_feature |= (cet_status
> > > +                           & GNU_PROPERTY_X86_FEATURE_1_SHSTK);
> > >
> > >           if (cet_feature)
> > >             {
> > > @@ -1148,7 +1151,9 @@ no_cpuid:
> > >              lock CET if IBT or SHSTK is enabled permissively.  */
> > >           if (GL(dl_x86_feature_control).ibt != cet_permissive
> > >               && GL(dl_x86_feature_control).shstk != cet_permissive)
> > > -           dl_cet_lock_cet ();
> > > +           dl_cet_lock_cet (GL(dl_x86_feature_1)
> > > +                            & (GNU_PROPERTY_X86_FEATURE_1_IBT
> > > +                               | GNU_PROPERTY_X86_FEATURE_1_SHSTK));
> > >         }
> > >  # endif
> > >      }
> > > diff --git a/sysdeps/x86/dl-cet.c b/sysdeps/x86/dl-cet.c
> > > index e486e549be..66a78244d4 100644
> > > --- a/sysdeps/x86/dl-cet.c
> > > +++ b/sysdeps/x86/dl-cet.c
> > > @@ -202,7 +202,7 @@ dl_cet_check_startup (struct link_map *m, struct dl_cet_info *info)
> > >         feature_1_lock |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> > >
> > >        if (feature_1_lock != 0
> > > -         && dl_cet_lock_cet () != 0)
> > > +         && dl_cet_lock_cet (feature_1_lock) != 0)
> > >         _dl_fatal_printf ("%s: can't lock CET\n", info->program);
> > >      }
> > >
> > > diff --git a/sysdeps/x86_64/nptl/tls.h b/sysdeps/x86_64/nptl/tls.h
> > > index 1403f939f7..4bcc2552a1 100644
> > > --- a/sysdeps/x86_64/nptl/tls.h
> > > +++ b/sysdeps/x86_64/nptl/tls.h
> > > @@ -60,7 +60,7 @@ typedef struct
> > >    void *__private_tm[4];
> > >    /* GCC split stack support.  */
> > >    void *__private_ss;
> > > -  /* The lowest address of shadow stack,  */
> > > +  /* The marker for the current shadow stack.  */
> > >    unsigned long long int ssp_base;
> > >    /* Must be kept even if it is no longer used by glibc since programs,
> > >       like AddressSanitizer, depend on the size of tcbhead_t.  */
> > > --
> > > 2.43.0
> > >
>
>
>
> --
> H.J.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 0/6] x86/cet: Update CET kernel interface
  2023-12-22 16:58 [PATCH v5 0/6] x86/cet: Update CET kernel interface H.J. Lu
                   ` (5 preceding siblings ...)
  2023-12-22 16:58 ` [PATCH v5 6/6] x86/cet: Run some CET tests with shadow stack H.J. Lu
@ 2023-12-28 16:00 ` Florian Weimer
  2023-12-28 21:17   ` H.J. Lu
  6 siblings, 1 reply; 18+ messages in thread
From: Florian Weimer @ 2023-12-28 16:00 UTC (permalink / raw)
  To: H.J. Lu; +Cc: libc-alpha, rick.p.edgecombe, goldstein.w.n

* H. J. Lu:

> H.J. Lu (6):
>   x86/cet: Sync with Linux kernel 6.6 shadow stack interface
>   elf: Always provide _dl_get_dl_main_map in libc.a
>   x86/cet: Enable shadow stack during startup
>   x86/cet: Check feature_1 in TCB for active IBT and SHSTK
>   x86/cet: Don't set CET active by default
>   x86/cet: Run some CET tests with shadow stack

I tested this on:

vendor_id       : AuthenticAMD
cpu family      : 25
model           : 97
model name      : AMD Ryzen 9 7950X 16-Core Processor
stepping        : 2
microcode       : 0xa601206

and the CET tests pass, except elf/tst-cet-legacy-8 and
elf/tst-cet-property-2, which are flagged as UNSUPPORTED because IBT
is not available (as expected).

What's missing is a fault test that verifies that an unmatched RET
instruction results in a SIGSEGV with a code of SEGV_CPERR, but that
can be added later.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 0/6] x86/cet: Update CET kernel interface
  2023-12-28 16:00 ` [PATCH v5 0/6] x86/cet: Update CET kernel interface Florian Weimer
@ 2023-12-28 21:17   ` H.J. Lu
  0 siblings, 0 replies; 18+ messages in thread
From: H.J. Lu @ 2023-12-28 21:17 UTC (permalink / raw)
  To: Florian Weimer; +Cc: libc-alpha, rick.p.edgecombe, goldstein.w.n

On Thu, Dec 28, 2023 at 8:00 AM Florian Weimer <fw@deneb.enyo.de> wrote:
>
> * H. J. Lu:
>
> > H.J. Lu (6):
> >   x86/cet: Sync with Linux kernel 6.6 shadow stack interface
> >   elf: Always provide _dl_get_dl_main_map in libc.a
> >   x86/cet: Enable shadow stack during startup
> >   x86/cet: Check feature_1 in TCB for active IBT and SHSTK
> >   x86/cet: Don't set CET active by default
> >   x86/cet: Run some CET tests with shadow stack
>
> I tested this on:
>
> vendor_id       : AuthenticAMD
> cpu family      : 25
> model           : 97
> model name      : AMD Ryzen 9 7950X 16-Core Processor
> stepping        : 2
> microcode       : 0xa601206
>
> and the CET tests pass, except elf/tst-cet-legacy-8 and
> elf/tst-cet-property-2, which are flagged as UNSUPPORTED because IBT
> is not available (as expected).

Thanks for your feedback.

> What's missing is a fault test that verifies that an unmatched RET
> instruction results in a SIGSEGV with a code of SEGV_CPERR, but that
> can be added later.

We could add more shadow stack tests after the shadow stack is enabled.

I will submit a patch to allow mixing longjmp with user contexts.

I will check in the v5 patch series next week if there are no objections so
that we can start validating shadow stack support in applications and libraries.

-- 
H.J.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 2/6] elf: Always provide _dl_get_dl_main_map in libc.a
  2023-12-22 16:58 ` [PATCH v5 2/6] elf: Always provide _dl_get_dl_main_map in libc.a H.J. Lu
@ 2023-12-29 14:45   ` Adhemerval Zanella Netto
  2023-12-29 15:15     ` H.J. Lu
  0 siblings, 1 reply; 18+ messages in thread
From: Adhemerval Zanella Netto @ 2023-12-29 14:45 UTC (permalink / raw)
  To: libc-alpha, H.J. Lu



On 22/12/23 13:58, H.J. Lu wrote:
> Always provide _dl_get_dl_main_map in libc.a.  It will be used by x86
> to process PT_GNU_PROPERTY segment.
> ---
>  elf/dl-support.c           | 2 --
>  sysdeps/generic/ldsodefs.h | 8 ++++----
>  2 files changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/elf/dl-support.c b/elf/dl-support.c
> index 837fa1c836..70c5b3599a 100644
> --- a/elf/dl-support.c
> +++ b/elf/dl-support.c
> @@ -344,7 +344,6 @@ _dl_non_dynamic_init (void)
>  DL_SYSINFO_IMPLEMENTATION
>  #endif
>  
> -#if ENABLE_STATIC_PIE
>  /* Since relocation to hidden _dl_main_map causes relocation overflow on
>     aarch64, a function is used to get the address of _dl_main_map.  */
>  
> @@ -353,7 +352,6 @@ _dl_get_dl_main_map (void)
>  {
>    return &_dl_main_map;
>  }
> -#endif
>  
>  /* This is used by _dl_runtime_profile, not used on static code.  */
>  void
> diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
> index 9b50ddd09f..0e8a008a49 100644
> --- a/sysdeps/generic/ldsodefs.h
> +++ b/sysdeps/generic/ldsodefs.h
> @@ -1172,10 +1172,6 @@ void __libc_setup_tls (void);
>  # if ENABLE_STATIC_PIE
>  /* Relocate static executable with PIE.  */
>  extern void _dl_relocate_static_pie (void) attribute_hidden;
> -
> -/* Get a pointer to _dl_main_map.  */
> -extern struct link_map * _dl_get_dl_main_map (void)
> -  __attribute__ ((visibility ("hidden")));
>  # else
>  #  define _dl_relocate_static_pie()
>  # endif
> @@ -1217,6 +1213,10 @@ rtld_hidden_proto (_dl_deallocate_tls)
>  
>  extern void _dl_nothread_init_static_tls (struct link_map *) attribute_hidden;
>  
> +/* Get a pointer to _dl_main_map.  */
> +extern struct link_map * _dl_get_dl_main_map (void)
> +  __attribute__ ((visibility ("hidden")));

You can se attribute_hidden here.

> +
>  /* Find origin of the executable.  */
>  extern const char *_dl_get_origin (void) attribute_hidden;
>  

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 3/6] x86/cet: Enable shadow stack during startup
  2023-12-22 16:58 ` [PATCH v5 3/6] x86/cet: Enable shadow stack during startup H.J. Lu
@ 2023-12-29 14:55   ` Adhemerval Zanella Netto
  2023-12-29 15:24     ` H.J. Lu
  0 siblings, 1 reply; 18+ messages in thread
From: Adhemerval Zanella Netto @ 2023-12-29 14:55 UTC (permalink / raw)
  To: H.J. Lu, libc-alpha; +Cc: rick.p.edgecombe, goldstein.w.n



On 22/12/23 13:58, H.J. Lu wrote:
> Previously, CET was enabled by kernel before passing control to user
> space and the startup code must disable CET if applications or shared
> libraries aren't CET enabled.  Since the current kernel only supports
> shadow stack and won't enable shadow stack before passing control to
> user space, we need to enable shadow stack during startup if the
> application and all shared library are shadow stack enabled.  There
> is no need to disable shadow stack at startup.  Shadow stack can only
> be enabled in a function which will never return.  Otherwise, shadow
> stack will underflow at the function return.
> 
> 1. GL(dl_x86_feature_1) is set to the CET features which are supported
> by the processor and are not disabled by the tunable.  Only non-zero
> features in GL(dl_x86_feature_1) should be enabled.  After enabling
> shadow stack with ARCH_SHSTK_ENABLE, ARCH_SHSTK_STATUS is used to check
> if shadow stack is really enabled.
> 2. Use ARCH_SHSTK_ENABLE in RTLD_START in dynamic executable.  It is
> safe since RTLD_START never returns.
> 3. Call arch_prctl (ARCH_SHSTK_ENABLE) from ARCH_SETUP_TLS in static
> executable.  Since the start function using ARCH_SETUP_TLS never returns,
> it is safe to enable shadow stack in ARCH_SETUP_TLS.
> ---
>  sysdeps/unix/sysv/linux/x86/cpu-features.c | 49 --------------
>  sysdeps/unix/sysv/linux/x86/dl-cet.h       | 23 +++++++
>  sysdeps/unix/sysv/linux/x86_64/dl-cet.h    | 47 +++++++++++++
>  sysdeps/x86/cpu-features-offsets.sym       |  1 +
>  sysdeps/x86/cpu-features.c                 | 51 --------------
>  sysdeps/x86/dl-cet.c                       | 77 +++++++++++-----------
>  sysdeps/x86/get-cpuid-feature-leaf.c       |  2 +-
>  sysdeps/x86/include/cpu-features.h         |  3 +
>  sysdeps/x86/libc-start.h                   | 54 ++++++++++++++-
>  sysdeps/x86_64/dl-machine.h                | 12 +++-
>  10 files changed, 175 insertions(+), 144 deletions(-)
>  delete mode 100644 sysdeps/unix/sysv/linux/x86/cpu-features.c
>  create mode 100644 sysdeps/unix/sysv/linux/x86_64/dl-cet.h
> 
> diff --git a/sysdeps/unix/sysv/linux/x86/cpu-features.c b/sysdeps/unix/sysv/linux/x86/cpu-features.c
> deleted file mode 100644
> index 0e6e2bf855..0000000000
> --- a/sysdeps/unix/sysv/linux/x86/cpu-features.c
> +++ /dev/null
> @@ -1,49 +0,0 @@
> -/* Initialize CPU feature data for Linux/x86.
> -   This file is part of the GNU C Library.
> -   Copyright (C) 2018-2023 Free Software Foundation, Inc.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <https://www.gnu.org/licenses/>.  */
> -
> -#if CET_ENABLED
> -# include <sys/prctl.h>
> -# include <asm/prctl.h>
> -
> -static inline int __attribute__ ((always_inline))
> -get_cet_status (void)
> -{
> -  unsigned long long kernel_feature;
> -  unsigned int status = 0;
> -  if (INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_STATUS,
> -			     &kernel_feature) == 0)
> -    {
> -      if ((kernel_feature & ARCH_SHSTK_SHSTK) != 0)
> -	status = GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> -    }
> -  return status;
> -}
> -
> -# ifndef SHARED
> -static inline void
> -x86_setup_tls (void)
> -{
> -  __libc_setup_tls ();
> -  THREAD_SETMEM (THREAD_SELF, header.feature_1, GL(dl_x86_feature_1));
> -}
> -
> -#  define ARCH_SETUP_TLS() x86_setup_tls ()
> -# endif
> -#endif
> -
> -#include <sysdeps/x86/cpu-features.c>
> diff --git a/sysdeps/unix/sysv/linux/x86/dl-cet.h b/sysdeps/unix/sysv/linux/x86/dl-cet.h
> index da220ac627..634c885d33 100644
> --- a/sysdeps/unix/sysv/linux/x86/dl-cet.h
> +++ b/sysdeps/unix/sysv/linux/x86/dl-cet.h
> @@ -38,3 +38,26 @@ dl_cet_lock_cet (unsigned int cet_feature)
>    return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_LOCK,
>  				      kernel_feature);
>  }
> +
> +static inline unsigned int __attribute__ ((always_inline))

You can use use 'static __always_inline unsigned int' here.

> +dl_cet_get_cet_status (void)
> +{
> +  unsigned long long kernel_feature;
> +  unsigned int status = 0;
> +  if (INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_STATUS,
> +			     &kernel_feature) == 0)
> +    {
> +      if ((kernel_feature & ARCH_SHSTK_SHSTK) != 0)
> +	status = GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> +    }
> +  return status;
> +}
> +
> +/* Enable shadow stack with a macro to avoid shadow stack underflow.  */
> +#define ENABLE_X86_CET(cet_feature)				\
> +  if ((cet_feature & GNU_PROPERTY_X86_FEATURE_1_SHSTK))		\
> +    {								\
> +      long long int kernel_feature = ARCH_SHSTK_SHSTK;		\
> +      INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_ENABLE,	\
> +			     kernel_feature);			\
> +    }

The Linux documentation Documentation/arch/x86/shstk.rst states the
argument is a 'unsigned long'. I am not use it would matter though.

> diff --git a/sysdeps/unix/sysv/linux/x86_64/dl-cet.h b/sysdeps/unix/sysv/linux/x86_64/dl-cet.h
> new file mode 100644
> index 0000000000..e23e05c6b8
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/x86_64/dl-cet.h
> @@ -0,0 +1,47 @@
> +/* Linux/x86-64 CET initializers function.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <cpu-features-offsets.h>
> +#include_next <dl-cet.h>
> +
> +#define X86_STRINGIFY_1(x)	#x
> +#define X86_STRINGIFY(x)	X86_STRINGIFY_1 (x)
> +
> +/* Enable shadow stack before calling _dl_init if it is enabled in
> +   GL(dl_x86_feature_1).  Call _dl_setup_x86_features to setup shadow
> +   stack.  */
> +#define RTLD_START_ENABLE_X86_FEATURES \
> +"\
> +	# Check if shadow stack is enabled in GL(dl_x86_feature_1).\n\
> +	movl _rtld_local+" X86_STRINGIFY (RTLD_GLOBAL_DL_X86_FEATURE_1_OFFSET) "(%rip), %edx\n\
> +	testl $" X86_STRINGIFY (X86_FEATURE_1_SHSTK) ", %edx\n\
> +	jz 1f\n\
> +	# Enable shadow stack if enabled in GL(dl_x86_feature_1).\n\
> +	movl $" X86_STRINGIFY (ARCH_SHSTK_SHSTK) ", %esi\n\
> +	movl $" X86_STRINGIFY (ARCH_SHSTK_ENABLE) ", %edi\n\
> +	movl $" X86_STRINGIFY (__NR_arch_prctl) ", %eax\n\
> +	syscall\n\
> +1:\n\

It seems that the syscall might eventually fail if the shadow stack can not be
allocated (alloc_shstk), although it seems really unlikely to happen on loader
itself (maybe in a really constraint environment).  Should we handle this case?

> +	# Pass GL(dl_x86_feature_1) to _dl_cet_setup_features.\n\
> +	movl %edx, %edi\n\
> +	# Align stack for the _dl_cet_setup_features call.\n\
> +	andq $-16, %rsp\n\
> +	call _dl_cet_setup_features\n\
> +	# Restore %rax and %rsp from %r12 and %r13.\n\
> +	movq %r12, %rax\n\
> +	movq %r13, %rsp\n\
> +"
> diff --git a/sysdeps/x86/cpu-features-offsets.sym b/sysdeps/x86/cpu-features-offsets.sym
> index 6d03cea8e8..5429f60632 100644
> --- a/sysdeps/x86/cpu-features-offsets.sym
> +++ b/sysdeps/x86/cpu-features-offsets.sym
> @@ -4,3 +4,4 @@
>  
>  RTLD_GLOBAL_RO_DL_X86_CPU_FEATURES_OFFSET offsetof (struct rtld_global_ro, _dl_x86_cpu_features)
>  XSAVE_STATE_SIZE_OFFSET	offsetof (struct cpu_features, xsave_state_size)
> +RTLD_GLOBAL_DL_X86_FEATURE_1_OFFSET offsetof (struct rtld_global, _dl_x86_feature_1)
> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> index f180f0d9a4..097868c1d9 100644
> --- a/sysdeps/x86/cpu-features.c
> +++ b/sysdeps/x86/cpu-features.c
> @@ -1106,57 +1106,6 @@ no_cpuid:
>  	       TUNABLE_CALLBACK (set_x86_ibt));
>    TUNABLE_GET (x86_shstk, tunable_val_t *,
>  	       TUNABLE_CALLBACK (set_x86_shstk));
> -
> -  /* Check CET status.  */
> -  unsigned int cet_status = get_cet_status ();
> -
> -  if ((cet_status & GNU_PROPERTY_X86_FEATURE_1_IBT) == 0)
> -    CPU_FEATURE_UNSET (cpu_features, IBT)
> -  if ((cet_status & GNU_PROPERTY_X86_FEATURE_1_SHSTK) == 0)
> -    CPU_FEATURE_UNSET (cpu_features, SHSTK)
> -
> -  if (cet_status)
> -    {
> -      GL(dl_x86_feature_1) = cet_status;
> -
> -# ifndef SHARED
> -      /* Check if IBT and SHSTK are enabled by kernel.  */
> -      if ((cet_status
> -	   & (GNU_PROPERTY_X86_FEATURE_1_IBT
> -	      | GNU_PROPERTY_X86_FEATURE_1_SHSTK)))
> -	{
> -	  /* Disable IBT and/or SHSTK if they are enabled by kernel, but
> -	     disabled by environment variable:
> -
> -	     GLIBC_TUNABLES=glibc.cpu.hwcaps=-IBT,-SHSTK
> -	   */
> -	  unsigned int cet_feature = 0;
> -	  if (!CPU_FEATURE_USABLE (IBT))
> -	    cet_feature |= (cet_status
> -			    & GNU_PROPERTY_X86_FEATURE_1_IBT);
> -	  if (!CPU_FEATURE_USABLE (SHSTK))
> -	    cet_feature |= (cet_status
> -			    & GNU_PROPERTY_X86_FEATURE_1_SHSTK);
> -
> -	  if (cet_feature)
> -	    {
> -	      int res = dl_cet_disable_cet (cet_feature);
> -
> -	      /* Clear the disabled bits in dl_x86_feature_1.  */
> -	      if (res == 0)
> -		GL(dl_x86_feature_1) &= ~cet_feature;
> -	    }
> -
> -	  /* Lock CET if IBT or SHSTK is enabled in executable.  Don't
> -	     lock CET if IBT or SHSTK is enabled permissively.  */
> -	  if (GL(dl_x86_feature_control).ibt != cet_permissive
> -	      && GL(dl_x86_feature_control).shstk != cet_permissive)
> -	    dl_cet_lock_cet (GL(dl_x86_feature_1)
> -			     & (GNU_PROPERTY_X86_FEATURE_1_IBT
> -				| GNU_PROPERTY_X86_FEATURE_1_SHSTK));
> -	}
> -# endif
> -    }
>  #endif
>  
>  #ifndef SHARED
> diff --git a/sysdeps/x86/dl-cet.c b/sysdeps/x86/dl-cet.c
> index 66a78244d4..25add215f2 100644
> --- a/sysdeps/x86/dl-cet.c
> +++ b/sysdeps/x86/dl-cet.c
> @@ -173,40 +173,11 @@ dl_cet_check_startup (struct link_map *m, struct dl_cet_info *info)
>      = info->enable_feature_1 ^ info->feature_1_enabled;
>    if (disable_feature_1 != 0)
>      {
> -      /* Disable features in the kernel because of legacy objects or
> -	 cet_always_off.  */
> -      if (dl_cet_disable_cet (disable_feature_1) != 0)
> -	_dl_fatal_printf ("%s: can't disable x86 Features\n",
> -			  info->program);
> -
>        /* Clear the disabled bits.  Sync dl_x86_feature_1 and
>           info->feature_1_enabled with info->enable_feature_1.  */
>        info->feature_1_enabled = info->enable_feature_1;
>        GL(dl_x86_feature_1) = info->enable_feature_1;
>      }
> -
> -  if (HAS_CPU_FEATURE (IBT) || HAS_CPU_FEATURE (SHSTK))
> -    {
> -      /* Lock CET features only if IBT or SHSTK are enabled and are not
> -         enabled permissively.  */
> -      unsigned int feature_1_lock = 0;
> -
> -      if (((info->feature_1_enabled & GNU_PROPERTY_X86_FEATURE_1_IBT)
> -	   != 0)
> -	  && info->enable_ibt_type != cet_permissive)
> -	feature_1_lock |= GNU_PROPERTY_X86_FEATURE_1_IBT;
> -
> -      if (((info->feature_1_enabled & GNU_PROPERTY_X86_FEATURE_1_SHSTK)
> -	   != 0)
> -	  && info->enable_shstk_type != cet_permissive)
> -	feature_1_lock |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> -
> -      if (feature_1_lock != 0
> -	  && dl_cet_lock_cet (feature_1_lock) != 0)
> -	_dl_fatal_printf ("%s: can't lock CET\n", info->program);
> -    }
> -
> -  THREAD_SETMEM (THREAD_SELF, header.feature_1, GL(dl_x86_feature_1));
>  }
>  #endif
>  
> @@ -298,6 +269,15 @@ dl_cet_check (struct link_map *m, const char *program)
>  {
>    struct dl_cet_info info;
>  
> +  /* CET is enabled only if RTLD_START_ENABLE_X86_FEATURES is defined.  */
> +#if defined SHARED && defined RTLD_START_ENABLE_X86_FEATURES
> +  /* Set dl_x86_feature_1 to features enabled in the executable.  */
> +  if (program != NULL)
> +    GL(dl_x86_feature_1) = (m->l_x86_feature_1_and
> +			    & (X86_FEATURE_1_IBT
> +			       | X86_FEATURE_1_SHSTK));
> +#endif
> +
>    /* Check how IBT and SHSTK should be enabled. */
>    info.enable_ibt_type = GL(dl_x86_feature_control).ibt;
>    info.enable_shstk_type = GL(dl_x86_feature_control).shstk;
> @@ -307,17 +287,9 @@ dl_cet_check (struct link_map *m, const char *program)
>    /* No legacy object check if IBT and SHSTK are always on.  */
>    if (info.enable_ibt_type == cet_always_on
>        && info.enable_shstk_type == cet_always_on)
> -    {
> -#ifdef SHARED
> -      /* Set it only during startup.  */
> -      if (program != NULL)
> -	THREAD_SETMEM (THREAD_SELF, header.feature_1,
> -		       info.feature_1_enabled);
> -#endif
> -      return;
> -    }
> +    return;
>  
> -  /* Check if IBT and SHSTK were enabled by kernel.  */
> +  /* Check if IBT and SHSTK were enabled.  */
>    if (info.feature_1_enabled == 0)
>      return;
>  
> @@ -351,6 +323,33 @@ _dl_cet_open_check (struct link_map *l)
>    dl_cet_check (l, NULL);
>  }
>  
> +/* Set GL(dl_x86_feature_1) to the enabled features and clear the
> +   active bits of the disabled features.  */
> +
> +attribute_hidden
> +void

I think the code guideline states attribute should in the same line
as the return type.

> +_dl_cet_setup_features (unsigned int cet_feature)
> +{
> +  /* NB: cet_feature == GL(dl_x86_feature_1) which is set to features
> +     enabled from executable, not necessarily supported by kernel.  */
> +  if (cet_feature)

No implicit check for integer types.

> +    {
> +      cet_feature = dl_cet_get_cet_status ();
> +      if (cet_feature)
> +	{
> +	  THREAD_SETMEM (THREAD_SELF, header.feature_1, cet_feature);
> +
> +	  /* Lock CET if IBT or SHSTK is enabled in executable.  Don't
> +	     lock CET if IBT or SHSTK is enabled permissively.  */
> +	  if (GL(dl_x86_feature_control).ibt != cet_permissive
> +	      && (GL(dl_x86_feature_control).shstk != cet_permissive))
> +	    dl_cet_lock_cet (cet_feature);
> +	}
> +      /* Sync GL(dl_x86_feature_1) with kernel.  */
> +      GL(dl_x86_feature_1) = cet_feature;
> +    }
> +}
> +
>  #ifdef SHARED
>  
>  # ifndef LINKAGE
> diff --git a/sysdeps/x86/get-cpuid-feature-leaf.c b/sysdeps/x86/get-cpuid-feature-leaf.c
> index 40a46cc79c..9317a6b494 100644
> --- a/sysdeps/x86/get-cpuid-feature-leaf.c
> +++ b/sysdeps/x86/get-cpuid-feature-leaf.c
> @@ -24,7 +24,7 @@ __x86_get_cpuid_feature_leaf (unsigned int leaf)
>    static const struct cpuid_feature feature = {};
>    if (leaf < CPUID_INDEX_MAX)
>      return ((const struct cpuid_feature *)
> -	      &GLRO(dl_x86_cpu_features).features[leaf]);
> +	    &GLRO(dl_x86_cpu_features).features[leaf]);
>    else
>      return &feature;
>  }
> diff --git a/sysdeps/x86/include/cpu-features.h b/sysdeps/x86/include/cpu-features.h
> index 2d7427a6c0..23bd8146a2 100644
> --- a/sysdeps/x86/include/cpu-features.h
> +++ b/sysdeps/x86/include/cpu-features.h
> @@ -990,6 +990,9 @@ extern const struct cpu_features *_dl_x86_get_cpu_features (void)
>  # define INIT_ARCH()
>  # define _dl_x86_get_cpu_features() (&GLRO(dl_x86_cpu_features))
>  extern void _dl_x86_init_cpu_features (void) attribute_hidden;
> +
> +extern void _dl_cet_setup_features (unsigned int)
> +    attribute_hidden;
>  #endif
>  
>  #ifdef __x86_64__
> diff --git a/sysdeps/x86/libc-start.h b/sysdeps/x86/libc-start.h
> index e93da6ef3d..856230daeb 100644
> --- a/sysdeps/x86/libc-start.h
> +++ b/sysdeps/x86/libc-start.h
> @@ -19,7 +19,57 @@
>  #ifndef SHARED
>  # define ARCH_SETUP_IREL() apply_irel ()
>  # define ARCH_APPLY_IREL()
> -# ifndef ARCH_SETUP_TLS
> -#  define ARCH_SETUP_TLS() __libc_setup_tls ()
> +# ifdef __CET__
> +/* Get CET features enabled in the static executable.  */
> +
> +static inline unsigned int
> +get_cet_feature (void)
> +{
> +  /* Check if CET is supported and not disabled by tunables.  */
> +  struct cpu_features *cpu_features
> +    = (struct cpu_features *) __get_cpu_features ();

Would be better to add a proper function to return a non-const point
to the cpu features?  Casting like this does seems not a good approach.

> +  unsigned int cet_feature = 0;
> +  if (CPU_FEATURE_USABLE_P (cpu_features, IBT))
> +    cet_feature |= GNU_PROPERTY_X86_FEATURE_1_IBT;
> +  if (CPU_FEATURE_USABLE_P (cpu_features, SHSTK))
> +    cet_feature |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> +  if (!cet_feature)
> +    return cet_feature;
> +
> +  struct link_map *main_map = _dl_get_dl_main_map ();
> +
> +  /* Scan program headers backward to check PT_GNU_PROPERTY early for
> +     x86 feature bits on static executable.  */
> +  const ElfW(Phdr) *phdr = GL(dl_phdr);
> +  const ElfW(Phdr) *ph;
> +  for (ph = phdr + GL(dl_phnum); ph != phdr; ph--)
> +    if (ph[-1].p_type == PT_GNU_PROPERTY)
> +      {
> +	_dl_process_pt_gnu_property (main_map, -1, &ph[-1]);
> +	/* Enable IBT and SHSTK only if they are enabled on static
> +	   executable.  */
> +	cet_feature &= (main_map->l_x86_feature_1_and
> +			& (GNU_PROPERTY_X86_FEATURE_1_IBT
> +			   | GNU_PROPERTY_X86_FEATURE_1_SHSTK));
> +	/* Set GL(dl_x86_feature_1) to the enabled CET features.  */
> +	GL(dl_x86_feature_1) = cet_feature;
> +	break;
> +      }
> +
> +  return cet_feature;
> +}
> +
> +/* The function using this macro to enable shadow stack must not return
> +   to avoid shadow stack underflow.  */
> +#  define ARCH_SETUP_TLS()						\
> +  {									\
> +    __libc_setup_tls ();						\
> +									\
> +    unsigned int cet_feature = get_cet_feature ();			\
> +    ENABLE_X86_CET (cet_feature);					\
> +    _dl_cet_setup_features (cet_feature);				\
> +  }
> +# else
> +#  define ARCH_SETUP_TLS()	__libc_setup_tls ()
>  # endif
>  #endif /* !SHARED */
> diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
> index 581a2f1a9e..faeae723cb 100644
> --- a/sysdeps/x86_64/dl-machine.h
> +++ b/sysdeps/x86_64/dl-machine.h
> @@ -29,6 +29,11 @@
>  #include <dl-static-tls.h>
>  #include <dl-machine-rel.h>
>  #include <isa-level.h>
> +#ifdef __CET__
> +# include <dl-cet.h>
> +#else
> +# define RTLD_START_ENABLE_X86_FEATURES
> +#endif
>  
>  /* Return nonzero iff ELF header is compatible with the running host.  */
>  static inline int __attribute__ ((unused))
> @@ -146,13 +151,16 @@ _start:\n\
>  _dl_start_user:\n\
>  	# Save the user entry point address in %r12.\n\
>  	movq %rax, %r12\n\
> +	# Save %rsp value in %r13.\n\
> +	movq %rsp, %r13\n\
> +"\
> +	RTLD_START_ENABLE_X86_FEATURES \
> +"\
>  	# Read the original argument count.\n\
>  	movq (%rsp), %rdx\n\
>  	# Call _dl_init (struct link_map *main_map, int argc, char **argv, char **env)\n\
>  	# argc -> rsi\n\
>  	movq %rdx, %rsi\n\
> -	# Save %rsp value in %r13.\n\
> -	movq %rsp, %r13\n\
>  	# And align stack for the _dl_init call. \n\
>  	andq $-16, %rsp\n\
>  	# _dl_loaded -> rdi\n\

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 4/6] x86/cet: Check feature_1 in TCB for active IBT and SHSTK
  2023-12-22 16:58 ` [PATCH v5 4/6] x86/cet: Check feature_1 in TCB for active IBT and SHSTK H.J. Lu
@ 2023-12-29 14:59   ` Adhemerval Zanella Netto
  2023-12-29 15:14     ` H.J. Lu
  0 siblings, 1 reply; 18+ messages in thread
From: Adhemerval Zanella Netto @ 2023-12-29 14:59 UTC (permalink / raw)
  To: H.J. Lu, libc-alpha; +Cc: rick.p.edgecombe, goldstein.w.n



On 22/12/23 13:58, H.J. Lu wrote:
> Initially, IBT and SHSTK are marked as active when CPU supports them
> and CET are enabled in glibc.  They can be disabled early by tunables
> before relocation.  Since after relocation, GLRO(dl_x86_cpu_features)
> becomes read-only, we can't update GLRO(dl_x86_cpu_features) to mark
> IBT and SHSTK as inactive.  Instead, check the feature_1 field in TCB
> to decide if IBT and SHST are active.
> ---
>  sysdeps/x86/bits/platform/x86.h      |  8 ++++++++
>  sysdeps/x86/get-cpuid-feature-leaf.c | 11 ++++++++++-
>  sysdeps/x86/sys/platform/x86.h       | 17 +++++++++++++++++
>  3 files changed, 35 insertions(+), 1 deletion(-)
> 
> diff --git a/sysdeps/x86/bits/platform/x86.h b/sysdeps/x86/bits/platform/x86.h
> index 1e23d53ba2..1575ae53fb 100644
> --- a/sysdeps/x86/bits/platform/x86.h
> +++ b/sysdeps/x86/bits/platform/x86.h
> @@ -337,3 +337,11 @@ enum
>    x86_cpu_AVX10_YMM = x86_cpu_index_24_ecx_0_ebx + 17,
>    x86_cpu_AVX10_ZMM = x86_cpu_index_24_ecx_0_ebx + 18,
>  };
> +
> +/* Bits in the feature_1 field in TCB.  */
> +
> +enum
> +{
> +  x86_feature_1_ibt		= 1U << 0,
> +  x86_feature_1_shstk		= 1U << 1
> +};
> diff --git a/sysdeps/x86/get-cpuid-feature-leaf.c b/sysdeps/x86/get-cpuid-feature-leaf.c
> index 9317a6b494..f69936b31e 100644
> --- a/sysdeps/x86/get-cpuid-feature-leaf.c
> +++ b/sysdeps/x86/get-cpuid-feature-leaf.c
> @@ -15,9 +15,18 @@
>     License along with the GNU C Library; if not, see
>     <https://www.gnu.org/licenses/>.  */
>  
> -
> +#include <assert.h>
> +#include <tcb-offsets.h>
>  #include <ldsodefs.h>
>  
> +#ifdef __x86_64__
> +# ifdef __LP64__
> +_Static_assert (FEATURE_1_OFFSET == 72, "FEATURE_1_OFFSET != 72");
> +# else
> +_Static_assert (FEATURE_1_OFFSET == 40, "FEATURE_1_OFFSET != 40");
> +# endif
> +#endif
> +
>  const struct cpuid_feature *
>  __x86_get_cpuid_feature_leaf (unsigned int leaf)
>  {
> diff --git a/sysdeps/x86/sys/platform/x86.h b/sysdeps/x86/sys/platform/x86.h
> index 1ea2c5fc0b..89b1b16f22 100644
> --- a/sysdeps/x86/sys/platform/x86.h
> +++ b/sysdeps/x86/sys/platform/x86.h
> @@ -45,6 +45,23 @@ x86_cpu_present (unsigned int __index)
>  static __inline__ _Bool
>  x86_cpu_active (unsigned int __index)
>  {
> +  if (__index == x86_cpu_IBT || __index == x86_cpu_SHSTK)
> +    {
> +#ifdef __x86_64__
> +      unsigned int __feature_1;
> +# ifdef __LP64__
> +      __asm__ ("mov %%fs:72, %0" : "=r" (__feature_1));
> +# else
> +      __asm__ ("mov %%fs:40, %0" : "=r" (__feature_1));
> +# endif
> +      if (__index == x86_cpu_IBT)
> +	return __feature_1 & x86_feature_1_ibt;
> +      else
> +	return __feature_1 & x86_feature_1_shstk;

So I take that shadow stack is fully supported on x32, right?

> +#else
> +      return false;
> +#endif
> +    }
>    const struct cpuid_feature *__ptr = __x86_get_cpuid_feature_leaf
>      (__index / (8 * sizeof (unsigned int) * 4));
>    unsigned int __reg

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 4/6] x86/cet: Check feature_1 in TCB for active IBT and SHSTK
  2023-12-29 14:59   ` Adhemerval Zanella Netto
@ 2023-12-29 15:14     ` H.J. Lu
  0 siblings, 0 replies; 18+ messages in thread
From: H.J. Lu @ 2023-12-29 15:14 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: libc-alpha, rick.p.edgecombe, goldstein.w.n

On Fri, Dec 29, 2023 at 6:59 AM Adhemerval Zanella Netto
<adhemerval.zanella@linaro.org> wrote:
>
>
>
> On 22/12/23 13:58, H.J. Lu wrote:
> > Initially, IBT and SHSTK are marked as active when CPU supports them
> > and CET are enabled in glibc.  They can be disabled early by tunables
> > before relocation.  Since after relocation, GLRO(dl_x86_cpu_features)
> > becomes read-only, we can't update GLRO(dl_x86_cpu_features) to mark
> > IBT and SHSTK as inactive.  Instead, check the feature_1 field in TCB
> > to decide if IBT and SHST are active.
> > ---
> >  sysdeps/x86/bits/platform/x86.h      |  8 ++++++++
> >  sysdeps/x86/get-cpuid-feature-leaf.c | 11 ++++++++++-
> >  sysdeps/x86/sys/platform/x86.h       | 17 +++++++++++++++++
> >  3 files changed, 35 insertions(+), 1 deletion(-)
> >
> > diff --git a/sysdeps/x86/bits/platform/x86.h b/sysdeps/x86/bits/platform/x86.h
> > index 1e23d53ba2..1575ae53fb 100644
> > --- a/sysdeps/x86/bits/platform/x86.h
> > +++ b/sysdeps/x86/bits/platform/x86.h
> > @@ -337,3 +337,11 @@ enum
> >    x86_cpu_AVX10_YMM = x86_cpu_index_24_ecx_0_ebx + 17,
> >    x86_cpu_AVX10_ZMM = x86_cpu_index_24_ecx_0_ebx + 18,
> >  };
> > +
> > +/* Bits in the feature_1 field in TCB.  */
> > +
> > +enum
> > +{
> > +  x86_feature_1_ibt          = 1U << 0,
> > +  x86_feature_1_shstk                = 1U << 1
> > +};
> > diff --git a/sysdeps/x86/get-cpuid-feature-leaf.c b/sysdeps/x86/get-cpuid-feature-leaf.c
> > index 9317a6b494..f69936b31e 100644
> > --- a/sysdeps/x86/get-cpuid-feature-leaf.c
> > +++ b/sysdeps/x86/get-cpuid-feature-leaf.c
> > @@ -15,9 +15,18 @@
> >     License along with the GNU C Library; if not, see
> >     <https://www.gnu.org/licenses/>.  */
> >
> > -
> > +#include <assert.h>
> > +#include <tcb-offsets.h>
> >  #include <ldsodefs.h>
> >
> > +#ifdef __x86_64__
> > +# ifdef __LP64__
> > +_Static_assert (FEATURE_1_OFFSET == 72, "FEATURE_1_OFFSET != 72");
> > +# else
> > +_Static_assert (FEATURE_1_OFFSET == 40, "FEATURE_1_OFFSET != 40");
> > +# endif
> > +#endif
> > +
> >  const struct cpuid_feature *
> >  __x86_get_cpuid_feature_leaf (unsigned int leaf)
> >  {
> > diff --git a/sysdeps/x86/sys/platform/x86.h b/sysdeps/x86/sys/platform/x86.h
> > index 1ea2c5fc0b..89b1b16f22 100644
> > --- a/sysdeps/x86/sys/platform/x86.h
> > +++ b/sysdeps/x86/sys/platform/x86.h
> > @@ -45,6 +45,23 @@ x86_cpu_present (unsigned int __index)
> >  static __inline__ _Bool
> >  x86_cpu_active (unsigned int __index)
> >  {
> > +  if (__index == x86_cpu_IBT || __index == x86_cpu_SHSTK)
> > +    {
> > +#ifdef __x86_64__
> > +      unsigned int __feature_1;
> > +# ifdef __LP64__
> > +      __asm__ ("mov %%fs:72, %0" : "=r" (__feature_1));
> > +# else
> > +      __asm__ ("mov %%fs:40, %0" : "=r" (__feature_1));
> > +# endif
> > +      if (__index == x86_cpu_IBT)
> > +     return __feature_1 & x86_feature_1_ibt;
> > +      else
> > +     return __feature_1 & x86_feature_1_shstk;
>
> So I take that shadow stack is fully supported on x32, right?

Not yet.  I have additional kernel and glibc patches to enable
shadow stack on x32.  I will submit them after shadow stack
is enabled in glibc.

> > +#else
> > +      return false;
> > +#endif
> > +    }
> >    const struct cpuid_feature *__ptr = __x86_get_cpuid_feature_leaf
> >      (__index / (8 * sizeof (unsigned int) * 4));
> >    unsigned int __reg

Thanks.


-- 
H.J.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 2/6] elf: Always provide _dl_get_dl_main_map in libc.a
  2023-12-29 14:45   ` Adhemerval Zanella Netto
@ 2023-12-29 15:15     ` H.J. Lu
  0 siblings, 0 replies; 18+ messages in thread
From: H.J. Lu @ 2023-12-29 15:15 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: libc-alpha

On Fri, Dec 29, 2023 at 6:45 AM Adhemerval Zanella Netto
<adhemerval.zanella@linaro.org> wrote:
>
>
>
> On 22/12/23 13:58, H.J. Lu wrote:
> > Always provide _dl_get_dl_main_map in libc.a.  It will be used by x86
> > to process PT_GNU_PROPERTY segment.
> > ---
> >  elf/dl-support.c           | 2 --
> >  sysdeps/generic/ldsodefs.h | 8 ++++----
> >  2 files changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/elf/dl-support.c b/elf/dl-support.c
> > index 837fa1c836..70c5b3599a 100644
> > --- a/elf/dl-support.c
> > +++ b/elf/dl-support.c
> > @@ -344,7 +344,6 @@ _dl_non_dynamic_init (void)
> >  DL_SYSINFO_IMPLEMENTATION
> >  #endif
> >
> > -#if ENABLE_STATIC_PIE
> >  /* Since relocation to hidden _dl_main_map causes relocation overflow on
> >     aarch64, a function is used to get the address of _dl_main_map.  */
> >
> > @@ -353,7 +352,6 @@ _dl_get_dl_main_map (void)
> >  {
> >    return &_dl_main_map;
> >  }
> > -#endif
> >
> >  /* This is used by _dl_runtime_profile, not used on static code.  */
> >  void
> > diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
> > index 9b50ddd09f..0e8a008a49 100644
> > --- a/sysdeps/generic/ldsodefs.h
> > +++ b/sysdeps/generic/ldsodefs.h
> > @@ -1172,10 +1172,6 @@ void __libc_setup_tls (void);
> >  # if ENABLE_STATIC_PIE
> >  /* Relocate static executable with PIE.  */
> >  extern void _dl_relocate_static_pie (void) attribute_hidden;
> > -
> > -/* Get a pointer to _dl_main_map.  */
> > -extern struct link_map * _dl_get_dl_main_map (void)
> > -  __attribute__ ((visibility ("hidden")));
> >  # else
> >  #  define _dl_relocate_static_pie()
> >  # endif
> > @@ -1217,6 +1213,10 @@ rtld_hidden_proto (_dl_deallocate_tls)
> >
> >  extern void _dl_nothread_init_static_tls (struct link_map *) attribute_hidden;
> >
> > +/* Get a pointer to _dl_main_map.  */
> > +extern struct link_map * _dl_get_dl_main_map (void)
> > +  __attribute__ ((visibility ("hidden")));
>
> You can se attribute_hidden here.

Fixed.

> > +
> >  /* Find origin of the executable.  */
> >  extern const char *_dl_get_origin (void) attribute_hidden;
> >

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v5 3/6] x86/cet: Enable shadow stack during startup
  2023-12-29 14:55   ` Adhemerval Zanella Netto
@ 2023-12-29 15:24     ` H.J. Lu
  0 siblings, 0 replies; 18+ messages in thread
From: H.J. Lu @ 2023-12-29 15:24 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: libc-alpha, rick.p.edgecombe, goldstein.w.n

On Fri, Dec 29, 2023 at 6:55 AM Adhemerval Zanella Netto
<adhemerval.zanella@linaro.org> wrote:
>
>
>
> On 22/12/23 13:58, H.J. Lu wrote:
> > Previously, CET was enabled by kernel before passing control to user
> > space and the startup code must disable CET if applications or shared
> > libraries aren't CET enabled.  Since the current kernel only supports
> > shadow stack and won't enable shadow stack before passing control to
> > user space, we need to enable shadow stack during startup if the
> > application and all shared library are shadow stack enabled.  There
> > is no need to disable shadow stack at startup.  Shadow stack can only
> > be enabled in a function which will never return.  Otherwise, shadow
> > stack will underflow at the function return.
> >
> > 1. GL(dl_x86_feature_1) is set to the CET features which are supported
> > by the processor and are not disabled by the tunable.  Only non-zero
> > features in GL(dl_x86_feature_1) should be enabled.  After enabling
> > shadow stack with ARCH_SHSTK_ENABLE, ARCH_SHSTK_STATUS is used to check
> > if shadow stack is really enabled.
> > 2. Use ARCH_SHSTK_ENABLE in RTLD_START in dynamic executable.  It is
> > safe since RTLD_START never returns.
> > 3. Call arch_prctl (ARCH_SHSTK_ENABLE) from ARCH_SETUP_TLS in static
> > executable.  Since the start function using ARCH_SETUP_TLS never returns,
> > it is safe to enable shadow stack in ARCH_SETUP_TLS.
> > ---
> >  sysdeps/unix/sysv/linux/x86/cpu-features.c | 49 --------------
> >  sysdeps/unix/sysv/linux/x86/dl-cet.h       | 23 +++++++
> >  sysdeps/unix/sysv/linux/x86_64/dl-cet.h    | 47 +++++++++++++
> >  sysdeps/x86/cpu-features-offsets.sym       |  1 +
> >  sysdeps/x86/cpu-features.c                 | 51 --------------
> >  sysdeps/x86/dl-cet.c                       | 77 +++++++++++-----------
> >  sysdeps/x86/get-cpuid-feature-leaf.c       |  2 +-
> >  sysdeps/x86/include/cpu-features.h         |  3 +
> >  sysdeps/x86/libc-start.h                   | 54 ++++++++++++++-
> >  sysdeps/x86_64/dl-machine.h                | 12 +++-
> >  10 files changed, 175 insertions(+), 144 deletions(-)
> >  delete mode 100644 sysdeps/unix/sysv/linux/x86/cpu-features.c
> >  create mode 100644 sysdeps/unix/sysv/linux/x86_64/dl-cet.h
> >
> > diff --git a/sysdeps/unix/sysv/linux/x86/cpu-features.c b/sysdeps/unix/sysv/linux/x86/cpu-features.c
> > deleted file mode 100644
> > index 0e6e2bf855..0000000000
> > --- a/sysdeps/unix/sysv/linux/x86/cpu-features.c
> > +++ /dev/null
> > @@ -1,49 +0,0 @@
> > -/* Initialize CPU feature data for Linux/x86.
> > -   This file is part of the GNU C Library.
> > -   Copyright (C) 2018-2023 Free Software Foundation, Inc.
> > -
> > -   The GNU C Library is free software; you can redistribute it and/or
> > -   modify it under the terms of the GNU Lesser General Public
> > -   License as published by the Free Software Foundation; either
> > -   version 2.1 of the License, or (at your option) any later version.
> > -
> > -   The GNU C Library is distributed in the hope that it will be useful,
> > -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > -   Lesser General Public License for more details.
> > -
> > -   You should have received a copy of the GNU Lesser General Public
> > -   License along with the GNU C Library; if not, see
> > -   <https://www.gnu.org/licenses/>.  */
> > -
> > -#if CET_ENABLED
> > -# include <sys/prctl.h>
> > -# include <asm/prctl.h>
> > -
> > -static inline int __attribute__ ((always_inline))
> > -get_cet_status (void)
> > -{
> > -  unsigned long long kernel_feature;
> > -  unsigned int status = 0;
> > -  if (INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_STATUS,
> > -                          &kernel_feature) == 0)
> > -    {
> > -      if ((kernel_feature & ARCH_SHSTK_SHSTK) != 0)
> > -     status = GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> > -    }
> > -  return status;
> > -}
> > -
> > -# ifndef SHARED
> > -static inline void
> > -x86_setup_tls (void)
> > -{
> > -  __libc_setup_tls ();
> > -  THREAD_SETMEM (THREAD_SELF, header.feature_1, GL(dl_x86_feature_1));
> > -}
> > -
> > -#  define ARCH_SETUP_TLS() x86_setup_tls ()
> > -# endif
> > -#endif
> > -
> > -#include <sysdeps/x86/cpu-features.c>
> > diff --git a/sysdeps/unix/sysv/linux/x86/dl-cet.h b/sysdeps/unix/sysv/linux/x86/dl-cet.h
> > index da220ac627..634c885d33 100644
> > --- a/sysdeps/unix/sysv/linux/x86/dl-cet.h
> > +++ b/sysdeps/unix/sysv/linux/x86/dl-cet.h
> > @@ -38,3 +38,26 @@ dl_cet_lock_cet (unsigned int cet_feature)
> >    return (int) INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_LOCK,
> >                                     kernel_feature);
> >  }
> > +
> > +static inline unsigned int __attribute__ ((always_inline))
>
> You can use use 'static __always_inline unsigned int' here.

Fixed.

> > +dl_cet_get_cet_status (void)
> > +{
> > +  unsigned long long kernel_feature;
> > +  unsigned int status = 0;
> > +  if (INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_STATUS,
> > +                          &kernel_feature) == 0)
> > +    {
> > +      if ((kernel_feature & ARCH_SHSTK_SHSTK) != 0)
> > +     status = GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> > +    }
> > +  return status;
> > +}
> > +
> > +/* Enable shadow stack with a macro to avoid shadow stack underflow.  */
> > +#define ENABLE_X86_CET(cet_feature)                          \
> > +  if ((cet_feature & GNU_PROPERTY_X86_FEATURE_1_SHSTK))              \
> > +    {                                                                \
> > +      long long int kernel_feature = ARCH_SHSTK_SHSTK;               \
> > +      INTERNAL_SYSCALL_CALL (arch_prctl, ARCH_SHSTK_ENABLE,  \
> > +                          kernel_feature);                   \
> > +    }
>
> The Linux documentation Documentation/arch/x86/shstk.rst states the
> argument is a 'unsigned long'. I am not use it would matter though.

I have additional kernel and glibc patches to enable shadow stack on x32
which needs unsigned long long to match kernel syscall.

>
> > diff --git a/sysdeps/unix/sysv/linux/x86_64/dl-cet.h b/sysdeps/unix/sysv/linux/x86_64/dl-cet.h
> > new file mode 100644
> > index 0000000000..e23e05c6b8
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/x86_64/dl-cet.h
> > @@ -0,0 +1,47 @@
> > +/* Linux/x86-64 CET initializers function.
> > +   Copyright (C) 2023 Free Software Foundation, Inc.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#include <cpu-features-offsets.h>
> > +#include_next <dl-cet.h>
> > +
> > +#define X86_STRINGIFY_1(x)   #x
> > +#define X86_STRINGIFY(x)     X86_STRINGIFY_1 (x)
> > +
> > +/* Enable shadow stack before calling _dl_init if it is enabled in
> > +   GL(dl_x86_feature_1).  Call _dl_setup_x86_features to setup shadow
> > +   stack.  */
> > +#define RTLD_START_ENABLE_X86_FEATURES \
> > +"\
> > +     # Check if shadow stack is enabled in GL(dl_x86_feature_1).\n\
> > +     movl _rtld_local+" X86_STRINGIFY (RTLD_GLOBAL_DL_X86_FEATURE_1_OFFSET) "(%rip), %edx\n\
> > +     testl $" X86_STRINGIFY (X86_FEATURE_1_SHSTK) ", %edx\n\
> > +     jz 1f\n\
> > +     # Enable shadow stack if enabled in GL(dl_x86_feature_1).\n\
> > +     movl $" X86_STRINGIFY (ARCH_SHSTK_SHSTK) ", %esi\n\
> > +     movl $" X86_STRINGIFY (ARCH_SHSTK_ENABLE) ", %edi\n\
> > +     movl $" X86_STRINGIFY (__NR_arch_prctl) ", %eax\n\
> > +     syscall\n\
> > +1:\n\
>
> It seems that the syscall might eventually fail if the shadow stack can not be
> allocated (alloc_shstk), although it seems really unlikely to happen on loader
> itself (maybe in a really constraint environment).  Should we handle this case?

Enable shadow stack can fail on non-shadow stack kernels or legacy processors.
We may need to enable IBT in the future.  Glibc will issue syscalls to
enable CET
features.  _dl_cet_setup_features is called to check if CET features are truly
enabled.

> > +     # Pass GL(dl_x86_feature_1) to _dl_cet_setup_features.\n\
> > +     movl %edx, %edi\n\
> > +     # Align stack for the _dl_cet_setup_features call.\n\
> > +     andq $-16, %rsp\n\
> > +     call _dl_cet_setup_features\n\
> > +     # Restore %rax and %rsp from %r12 and %r13.\n\
> > +     movq %r12, %rax\n\
> > +     movq %r13, %rsp\n\
> > +"
> > diff --git a/sysdeps/x86/cpu-features-offsets.sym b/sysdeps/x86/cpu-features-offsets.sym
> > index 6d03cea8e8..5429f60632 100644
> > --- a/sysdeps/x86/cpu-features-offsets.sym
> > +++ b/sysdeps/x86/cpu-features-offsets.sym
> > @@ -4,3 +4,4 @@
> >
> >  RTLD_GLOBAL_RO_DL_X86_CPU_FEATURES_OFFSET offsetof (struct rtld_global_ro, _dl_x86_cpu_features)
> >  XSAVE_STATE_SIZE_OFFSET      offsetof (struct cpu_features, xsave_state_size)
> > +RTLD_GLOBAL_DL_X86_FEATURE_1_OFFSET offsetof (struct rtld_global, _dl_x86_feature_1)
> > diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> > index f180f0d9a4..097868c1d9 100644
> > --- a/sysdeps/x86/cpu-features.c
> > +++ b/sysdeps/x86/cpu-features.c
> > @@ -1106,57 +1106,6 @@ no_cpuid:
> >              TUNABLE_CALLBACK (set_x86_ibt));
> >    TUNABLE_GET (x86_shstk, tunable_val_t *,
> >              TUNABLE_CALLBACK (set_x86_shstk));
> > -
> > -  /* Check CET status.  */
> > -  unsigned int cet_status = get_cet_status ();
> > -
> > -  if ((cet_status & GNU_PROPERTY_X86_FEATURE_1_IBT) == 0)
> > -    CPU_FEATURE_UNSET (cpu_features, IBT)
> > -  if ((cet_status & GNU_PROPERTY_X86_FEATURE_1_SHSTK) == 0)
> > -    CPU_FEATURE_UNSET (cpu_features, SHSTK)
> > -
> > -  if (cet_status)
> > -    {
> > -      GL(dl_x86_feature_1) = cet_status;
> > -
> > -# ifndef SHARED
> > -      /* Check if IBT and SHSTK are enabled by kernel.  */
> > -      if ((cet_status
> > -        & (GNU_PROPERTY_X86_FEATURE_1_IBT
> > -           | GNU_PROPERTY_X86_FEATURE_1_SHSTK)))
> > -     {
> > -       /* Disable IBT and/or SHSTK if they are enabled by kernel, but
> > -          disabled by environment variable:
> > -
> > -          GLIBC_TUNABLES=glibc.cpu.hwcaps=-IBT,-SHSTK
> > -        */
> > -       unsigned int cet_feature = 0;
> > -       if (!CPU_FEATURE_USABLE (IBT))
> > -         cet_feature |= (cet_status
> > -                         & GNU_PROPERTY_X86_FEATURE_1_IBT);
> > -       if (!CPU_FEATURE_USABLE (SHSTK))
> > -         cet_feature |= (cet_status
> > -                         & GNU_PROPERTY_X86_FEATURE_1_SHSTK);
> > -
> > -       if (cet_feature)
> > -         {
> > -           int res = dl_cet_disable_cet (cet_feature);
> > -
> > -           /* Clear the disabled bits in dl_x86_feature_1.  */
> > -           if (res == 0)
> > -             GL(dl_x86_feature_1) &= ~cet_feature;
> > -         }
> > -
> > -       /* Lock CET if IBT or SHSTK is enabled in executable.  Don't
> > -          lock CET if IBT or SHSTK is enabled permissively.  */
> > -       if (GL(dl_x86_feature_control).ibt != cet_permissive
> > -           && GL(dl_x86_feature_control).shstk != cet_permissive)
> > -         dl_cet_lock_cet (GL(dl_x86_feature_1)
> > -                          & (GNU_PROPERTY_X86_FEATURE_1_IBT
> > -                             | GNU_PROPERTY_X86_FEATURE_1_SHSTK));
> > -     }
> > -# endif
> > -    }
> >  #endif
> >
> >  #ifndef SHARED
> > diff --git a/sysdeps/x86/dl-cet.c b/sysdeps/x86/dl-cet.c
> > index 66a78244d4..25add215f2 100644
> > --- a/sysdeps/x86/dl-cet.c
> > +++ b/sysdeps/x86/dl-cet.c
> > @@ -173,40 +173,11 @@ dl_cet_check_startup (struct link_map *m, struct dl_cet_info *info)
> >      = info->enable_feature_1 ^ info->feature_1_enabled;
> >    if (disable_feature_1 != 0)
> >      {
> > -      /* Disable features in the kernel because of legacy objects or
> > -      cet_always_off.  */
> > -      if (dl_cet_disable_cet (disable_feature_1) != 0)
> > -     _dl_fatal_printf ("%s: can't disable x86 Features\n",
> > -                       info->program);
> > -
> >        /* Clear the disabled bits.  Sync dl_x86_feature_1 and
> >           info->feature_1_enabled with info->enable_feature_1.  */
> >        info->feature_1_enabled = info->enable_feature_1;
> >        GL(dl_x86_feature_1) = info->enable_feature_1;
> >      }
> > -
> > -  if (HAS_CPU_FEATURE (IBT) || HAS_CPU_FEATURE (SHSTK))
> > -    {
> > -      /* Lock CET features only if IBT or SHSTK are enabled and are not
> > -         enabled permissively.  */
> > -      unsigned int feature_1_lock = 0;
> > -
> > -      if (((info->feature_1_enabled & GNU_PROPERTY_X86_FEATURE_1_IBT)
> > -        != 0)
> > -       && info->enable_ibt_type != cet_permissive)
> > -     feature_1_lock |= GNU_PROPERTY_X86_FEATURE_1_IBT;
> > -
> > -      if (((info->feature_1_enabled & GNU_PROPERTY_X86_FEATURE_1_SHSTK)
> > -        != 0)
> > -       && info->enable_shstk_type != cet_permissive)
> > -     feature_1_lock |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> > -
> > -      if (feature_1_lock != 0
> > -       && dl_cet_lock_cet (feature_1_lock) != 0)
> > -     _dl_fatal_printf ("%s: can't lock CET\n", info->program);
> > -    }
> > -
> > -  THREAD_SETMEM (THREAD_SELF, header.feature_1, GL(dl_x86_feature_1));
> >  }
> >  #endif
> >
> > @@ -298,6 +269,15 @@ dl_cet_check (struct link_map *m, const char *program)
> >  {
> >    struct dl_cet_info info;
> >
> > +  /* CET is enabled only if RTLD_START_ENABLE_X86_FEATURES is defined.  */
> > +#if defined SHARED && defined RTLD_START_ENABLE_X86_FEATURES
> > +  /* Set dl_x86_feature_1 to features enabled in the executable.  */
> > +  if (program != NULL)
> > +    GL(dl_x86_feature_1) = (m->l_x86_feature_1_and
> > +                         & (X86_FEATURE_1_IBT
> > +                            | X86_FEATURE_1_SHSTK));
> > +#endif
> > +
> >    /* Check how IBT and SHSTK should be enabled. */
> >    info.enable_ibt_type = GL(dl_x86_feature_control).ibt;
> >    info.enable_shstk_type = GL(dl_x86_feature_control).shstk;
> > @@ -307,17 +287,9 @@ dl_cet_check (struct link_map *m, const char *program)
> >    /* No legacy object check if IBT and SHSTK are always on.  */
> >    if (info.enable_ibt_type == cet_always_on
> >        && info.enable_shstk_type == cet_always_on)
> > -    {
> > -#ifdef SHARED
> > -      /* Set it only during startup.  */
> > -      if (program != NULL)
> > -     THREAD_SETMEM (THREAD_SELF, header.feature_1,
> > -                    info.feature_1_enabled);
> > -#endif
> > -      return;
> > -    }
> > +    return;
> >
> > -  /* Check if IBT and SHSTK were enabled by kernel.  */
> > +  /* Check if IBT and SHSTK were enabled.  */
> >    if (info.feature_1_enabled == 0)
> >      return;
> >
> > @@ -351,6 +323,33 @@ _dl_cet_open_check (struct link_map *l)
> >    dl_cet_check (l, NULL);
> >  }
> >
> > +/* Set GL(dl_x86_feature_1) to the enabled features and clear the
> > +   active bits of the disabled features.  */
> > +
> > +attribute_hidden
> > +void
>
> I think the code guideline states attribute should in the same line
> as the return type.

Fixed.

> > +_dl_cet_setup_features (unsigned int cet_feature)
> > +{
> > +  /* NB: cet_feature == GL(dl_x86_feature_1) which is set to features
> > +     enabled from executable, not necessarily supported by kernel.  */
> > +  if (cet_feature)
>
> No implicit check for integer types.

Fixed.

> > +    {
> > +      cet_feature = dl_cet_get_cet_status ();
> > +      if (cet_feature)
> > +     {
> > +       THREAD_SETMEM (THREAD_SELF, header.feature_1, cet_feature);
> > +
> > +       /* Lock CET if IBT or SHSTK is enabled in executable.  Don't
> > +          lock CET if IBT or SHSTK is enabled permissively.  */
> > +       if (GL(dl_x86_feature_control).ibt != cet_permissive
> > +           && (GL(dl_x86_feature_control).shstk != cet_permissive))
> > +         dl_cet_lock_cet (cet_feature);
> > +     }
> > +      /* Sync GL(dl_x86_feature_1) with kernel.  */
> > +      GL(dl_x86_feature_1) = cet_feature;
> > +    }
> > +}
> > +
> >  #ifdef SHARED
> >
> >  # ifndef LINKAGE
> > diff --git a/sysdeps/x86/get-cpuid-feature-leaf.c b/sysdeps/x86/get-cpuid-feature-leaf.c
> > index 40a46cc79c..9317a6b494 100644
> > --- a/sysdeps/x86/get-cpuid-feature-leaf.c
> > +++ b/sysdeps/x86/get-cpuid-feature-leaf.c
> > @@ -24,7 +24,7 @@ __x86_get_cpuid_feature_leaf (unsigned int leaf)
> >    static const struct cpuid_feature feature = {};
> >    if (leaf < CPUID_INDEX_MAX)
> >      return ((const struct cpuid_feature *)
> > -           &GLRO(dl_x86_cpu_features).features[leaf]);
> > +         &GLRO(dl_x86_cpu_features).features[leaf]);
> >    else
> >      return &feature;
> >  }
> > diff --git a/sysdeps/x86/include/cpu-features.h b/sysdeps/x86/include/cpu-features.h
> > index 2d7427a6c0..23bd8146a2 100644
> > --- a/sysdeps/x86/include/cpu-features.h
> > +++ b/sysdeps/x86/include/cpu-features.h
> > @@ -990,6 +990,9 @@ extern const struct cpu_features *_dl_x86_get_cpu_features (void)
> >  # define INIT_ARCH()
> >  # define _dl_x86_get_cpu_features() (&GLRO(dl_x86_cpu_features))
> >  extern void _dl_x86_init_cpu_features (void) attribute_hidden;
> > +
> > +extern void _dl_cet_setup_features (unsigned int)
> > +    attribute_hidden;
> >  #endif
> >
> >  #ifdef __x86_64__
> > diff --git a/sysdeps/x86/libc-start.h b/sysdeps/x86/libc-start.h
> > index e93da6ef3d..856230daeb 100644
> > --- a/sysdeps/x86/libc-start.h
> > +++ b/sysdeps/x86/libc-start.h
> > @@ -19,7 +19,57 @@
> >  #ifndef SHARED
> >  # define ARCH_SETUP_IREL() apply_irel ()
> >  # define ARCH_APPLY_IREL()
> > -# ifndef ARCH_SETUP_TLS
> > -#  define ARCH_SETUP_TLS() __libc_setup_tls ()
> > +# ifdef __CET__
> > +/* Get CET features enabled in the static executable.  */
> > +
> > +static inline unsigned int
> > +get_cet_feature (void)
> > +{
> > +  /* Check if CET is supported and not disabled by tunables.  */
> > +  struct cpu_features *cpu_features
> > +    = (struct cpu_features *) __get_cpu_features ();
>
> Would be better to add a proper function to return a non-const point
> to the cpu features?  Casting like this does seems not a good approach.

Change it to

const struct cpu_features *cpu_features = __get_cpu_features ();

> > +  unsigned int cet_feature = 0;
> > +  if (CPU_FEATURE_USABLE_P (cpu_features, IBT))
> > +    cet_feature |= GNU_PROPERTY_X86_FEATURE_1_IBT;
> > +  if (CPU_FEATURE_USABLE_P (cpu_features, SHSTK))
> > +    cet_feature |= GNU_PROPERTY_X86_FEATURE_1_SHSTK;
> > +  if (!cet_feature)
> > +    return cet_feature;
> > +
> > +  struct link_map *main_map = _dl_get_dl_main_map ();
> > +
> > +  /* Scan program headers backward to check PT_GNU_PROPERTY early for
> > +     x86 feature bits on static executable.  */
> > +  const ElfW(Phdr) *phdr = GL(dl_phdr);
> > +  const ElfW(Phdr) *ph;
> > +  for (ph = phdr + GL(dl_phnum); ph != phdr; ph--)
> > +    if (ph[-1].p_type == PT_GNU_PROPERTY)
> > +      {
> > +     _dl_process_pt_gnu_property (main_map, -1, &ph[-1]);
> > +     /* Enable IBT and SHSTK only if they are enabled on static
> > +        executable.  */
> > +     cet_feature &= (main_map->l_x86_feature_1_and
> > +                     & (GNU_PROPERTY_X86_FEATURE_1_IBT
> > +                        | GNU_PROPERTY_X86_FEATURE_1_SHSTK));
> > +     /* Set GL(dl_x86_feature_1) to the enabled CET features.  */
> > +     GL(dl_x86_feature_1) = cet_feature;
> > +     break;
> > +      }
> > +
> > +  return cet_feature;
> > +}
> > +
> > +/* The function using this macro to enable shadow stack must not return
> > +   to avoid shadow stack underflow.  */
> > +#  define ARCH_SETUP_TLS()                                           \
> > +  {                                                                  \
> > +    __libc_setup_tls ();                                             \
> > +                                                                     \
> > +    unsigned int cet_feature = get_cet_feature ();                   \
> > +    ENABLE_X86_CET (cet_feature);                                    \
> > +    _dl_cet_setup_features (cet_feature);                            \
> > +  }
> > +# else
> > +#  define ARCH_SETUP_TLS()   __libc_setup_tls ()
> >  # endif
> >  #endif /* !SHARED */
> > diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
> > index 581a2f1a9e..faeae723cb 100644
> > --- a/sysdeps/x86_64/dl-machine.h
> > +++ b/sysdeps/x86_64/dl-machine.h
> > @@ -29,6 +29,11 @@
> >  #include <dl-static-tls.h>
> >  #include <dl-machine-rel.h>
> >  #include <isa-level.h>
> > +#ifdef __CET__
> > +# include <dl-cet.h>
> > +#else
> > +# define RTLD_START_ENABLE_X86_FEATURES
> > +#endif
> >
> >  /* Return nonzero iff ELF header is compatible with the running host.  */
> >  static inline int __attribute__ ((unused))
> > @@ -146,13 +151,16 @@ _start:\n\
> >  _dl_start_user:\n\
> >       # Save the user entry point address in %r12.\n\
> >       movq %rax, %r12\n\
> > +     # Save %rsp value in %r13.\n\
> > +     movq %rsp, %r13\n\
> > +"\
> > +     RTLD_START_ENABLE_X86_FEATURES \
> > +"\
> >       # Read the original argument count.\n\
> >       movq (%rsp), %rdx\n\
> >       # Call _dl_init (struct link_map *main_map, int argc, char **argv, char **env)\n\
> >       # argc -> rsi\n\
> >       movq %rdx, %rsi\n\
> > -     # Save %rsp value in %r13.\n\
> > -     movq %rsp, %r13\n\
> >       # And align stack for the _dl_init call. \n\
> >       andq $-16, %rsp\n\
> >       # _dl_loaded -> rdi\n\

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-12-29 15:24 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-22 16:58 [PATCH v5 0/6] x86/cet: Update CET kernel interface H.J. Lu
2023-12-22 16:58 ` [PATCH v5 1/6] x86/cet: Sync with Linux kernel 6.6 shadow stack interface H.J. Lu
2023-12-26 17:37   ` Noah Goldstein
2023-12-26 17:56     ` H.J. Lu
2023-12-27  0:40       ` Noah Goldstein
2023-12-22 16:58 ` [PATCH v5 2/6] elf: Always provide _dl_get_dl_main_map in libc.a H.J. Lu
2023-12-29 14:45   ` Adhemerval Zanella Netto
2023-12-29 15:15     ` H.J. Lu
2023-12-22 16:58 ` [PATCH v5 3/6] x86/cet: Enable shadow stack during startup H.J. Lu
2023-12-29 14:55   ` Adhemerval Zanella Netto
2023-12-29 15:24     ` H.J. Lu
2023-12-22 16:58 ` [PATCH v5 4/6] x86/cet: Check feature_1 in TCB for active IBT and SHSTK H.J. Lu
2023-12-29 14:59   ` Adhemerval Zanella Netto
2023-12-29 15:14     ` H.J. Lu
2023-12-22 16:58 ` [PATCH v5 5/6] x86/cet: Don't set CET active by default H.J. Lu
2023-12-22 16:58 ` [PATCH v5 6/6] x86/cet: Run some CET tests with shadow stack H.J. Lu
2023-12-28 16:00 ` [PATCH v5 0/6] x86/cet: Update CET kernel interface Florian Weimer
2023-12-28 21:17   ` H.J. Lu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).