* [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack register @ 2018-03-30 17:41 H.J. Lu 2018-04-06 4:46 ` Carlos O'Donell 0 siblings, 1 reply; 13+ messages in thread From: H.J. Lu @ 2018-03-30 17:41 UTC (permalink / raw) To: Florian Weimer Cc: Joseph Myers, Carlos O'Donell, Tsimbalist, Igor V, GNU C Library [-- Attachment #1: Type: text/plain, Size: 1172 bytes --] On Thu, Mar 29, 2018 at 1:20 PM, Florian Weimer <fw@deneb.enyo.de> wrote: > * H. J. Lu: > >> On Thu, Mar 29, 2018 at 1:15 PM, Florian Weimer <fw@deneb.enyo.de> wrote: >>> * H. J. Lu: >>> >>>> You need to make a choice. You either don't introduce a new symbol >>>> version or don't save shadow stack for thread cancellation. You >>>> can't have both. >>> >>> I don't understand. We have room to save the shadow stack pointer in >>> the existing struct. >> >> No, we don't have room in struct pthread_unwind_buf: >> >> Note: There is an unused pointer space in pthread_unwind_buf_data. But >> it isn't suitable for saving and restoring shadow stack register since >> x32 is a 64-bit process with 32-bit software pointer and kernel may >> place x32 shadow stack above 4GB. We need to save and restore 64-bit >> shadow stack register for x32. > > We have for void * fields. They are subsequently overwritten by > __pthread_register_cancel. But __sigsetjmp can write to them first > without causing any harm. We just need a private __longjmp_cancel > that doesn't restore the shadow stack pointer. Here is the patch which does that. Any comments? Thanks. -- H.J. [-- Attachment #2: 0001-x86-Use-pad-in-pthread_unwind_buf-to-preserve-shadow.patch --] [-- Type: text/x-patch, Size: 11254 bytes --] From f222537447f5ec879427f318b7c0396362b7453a Mon Sep 17 00:00:00 2001 From: "H.J. Lu" <hjl.tools@gmail.com> Date: Sat, 24 Feb 2018 17:23:54 -0800 Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack register The pad array in struct pthread_unwind_buf is used by setjmp to save shadow stack register. We assert that size of struct pthread_unwind_buf is no less than offset of shadow stack pointer + shadow stack pointer size. Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as these with thread cancellation, call setjmp, but never return after __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as __libc_longjmp on x86, doesn't need to restore shadow stack register. __libc_longjmp, which is a private interface for thread cancellation implementation in libpthread, is changed to call __longjmp_cancel, instead of __longjmp. __longjmp_cancel is a new internal function in libc, which is similar to __longjmp, but doesn't restore shadow stack register. The compatibility longjmp and siglongjmp in libpthread.so are changed to call __libc_siglongjmp, instead of __libc_longjmp, so that they will restore shadow stack register. Tested with build-many-glibcs.py. * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous handlers after setjmp. * setjmp/longjmp.c (__libc_longjmp): Don't define alias if defined. * sysdeps/unix/sysv/linux/x86/setjmpP.h (_JUMP_BUF_SIGSET_NSIG): Changed to 97. * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel. * sysdeps/x86/__longjmp_cancel.S: New file. * sysdeps/x86/longjmp.c: Likewise. * sysdeps/x86/nptl/pt-longjmp.c: Likewise. --- nptl/pthread_create.c | 9 ++-- setjmp/longjmp.c | 2 + sysdeps/unix/sysv/linux/x86/setjmpP.h | 4 +- sysdeps/x86/Makefile | 4 ++ sysdeps/x86/__longjmp_cancel.S | 20 ++++++++ sysdeps/x86/longjmp.c | 45 ++++++++++++++++ sysdeps/x86/nptl/pt-longjmp.c | 97 +++++++++++++++++++++++++++++++++++ 7 files changed, 176 insertions(+), 5 deletions(-) create mode 100644 sysdeps/x86/__longjmp_cancel.S create mode 100644 sysdeps/x86/longjmp.c create mode 100644 sysdeps/x86/nptl/pt-longjmp.c diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c index caaf07c134..1c5b3780c6 100644 --- a/nptl/pthread_create.c +++ b/nptl/pthread_create.c @@ -427,12 +427,15 @@ START_THREAD_DEFN compilers without that support we do use setjmp. */ struct pthread_unwind_buf unwind_buf; - /* No previous handlers. */ + int not_first_call; + not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); + + /* No previous handlers. NB: This must be done after setjmp since + the same space may be used by setjmp to store extra data which + should never be used by __libc_unwind_longjmp. */ unwind_buf.priv.data.prev = NULL; unwind_buf.priv.data.cleanup = NULL; - int not_first_call; - not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); if (__glibc_likely (! not_first_call)) { /* Store the new cleanup handler info. */ diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c index a2a7065a85..453889e103 100644 --- a/setjmp/longjmp.c +++ b/setjmp/longjmp.c @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val) } #ifndef __libc_siglongjmp +# ifndef __libc_longjmp /* __libc_longjmp is a private interface for cancellation implementation in libpthread. */ strong_alias (__libc_siglongjmp, __libc_longjmp) +# endif weak_alias (__libc_siglongjmp, _longjmp) weak_alias (__libc_siglongjmp, longjmp) weak_alias (__libc_siglongjmp, siglongjmp) diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h index c0ed767a0d..90a6bbcf32 100644 --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h @@ -22,8 +22,8 @@ #include <bits/types/__sigset_t.h> /* The biggest signal number + 1. As of kernel 4.14, x86 _NSIG is 64. - Define it to 513 to leave some rooms for future use. */ -#define _JUMP_BUF_SIGSET_NSIG 513 + Define it to 97 to leave some rooms for future use. */ +#define _JUMP_BUF_SIGSET_NSIG 97 /* Number of longs to hold all signals. */ #define _JUMP_BUF_SIGSET_NWORDS \ ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile index 0d0326c21a..d25d6f0ae4 100644 --- a/sysdeps/x86/Makefile +++ b/sysdeps/x86/Makefile @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features tests += tst-get-cpu-features tests-static += tst-get-cpu-features-static endif + +ifeq ($(subdir),setjmp) +sysdep_routines += __longjmp_cancel +endif diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S new file mode 100644 index 0000000000..b57dbfa376 --- /dev/null +++ b/sysdeps/x86/__longjmp_cancel.S @@ -0,0 +1,20 @@ +/* __longjmp_cancel for x86. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#define __longjmp __longjmp_cancel +#include <__longjmp.S> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c new file mode 100644 index 0000000000..a53f31e1dd --- /dev/null +++ b/sysdeps/x86/longjmp.c @@ -0,0 +1,45 @@ +/* __libc_siglongjmp for x86. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#define __libc_longjmp __redirect___libc_longjmp +#include <setjmp/longjmp.c> +#undef __libc_longjmp + +extern void __longjmp_cancel (__jmp_buf __env, int __val) + __attribute__ ((__noreturn__)) attribute_hidden; + +/* Since __libc_longjmp is a private interface for cancellation + implementation in libpthread, there is no need to restore shadow + stack register. */ + +void +__libc_longjmp (sigjmp_buf env, int val) +{ + /* Perform any cleanups needed by the frames being unwound. */ + _longjmp_unwind (env, val); + + if (env[0].__mask_was_saved) + /* Restore the saved signal mask. */ + (void) __sigprocmask (SIG_SETMASK, + (sigset_t *) &env[0].__saved_mask, + (sigset_t *) NULL); + + /* Call the machine-dependent function to restore machine state + without shadow stack. */ + __longjmp_cancel (env[0].__jmpbuf, val ?: 1); +} diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c new file mode 100644 index 0000000000..7eb8651cfe --- /dev/null +++ b/sysdeps/x86/nptl/pt-longjmp.c @@ -0,0 +1,97 @@ +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI. + X86 version. + Copyright (C) 18 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +/* <nptl/descr.h> has + +struct pthread_unwind_buf +{ + struct + { + __jmp_buf jmp_buf; + int mask_was_saved; + } cancel_jmp_buf[1]; + + union + { + void *pad[4]; + struct + { + struct pthread_unwind_buf *prev; + struct _pthread_cleanup_buffer *cleanup; + int canceltype; + } data; + } priv; +}; + + The pad array in struct pthread_unwind_buf is used by setjmp to save + shadow stack register. Assert that size of struct pthread_unwind_buf + is no less than offset of shadow stack pointer plus shadow stack + pointer size. + + NB: setjmp is called in libpthread to save shadow stack register. But + __libc_unwind_longjmp doesn't restore shadow stack register since they + never return after longjmp. */ + +#include <pthreadP.h> +#include <jmp_buf-ssp.h> + +#ifdef __x86_64__ +# define SHADOW_STACK_POINTER_SIZE 8 +#else +# define SHADOW_STACK_POINTER_SIZE 4 +#endif + +_Static_assert ((sizeof (struct pthread_unwind_buf) + >= (SHADOW_STACK_POINTER_OFFSET + + SHADOW_STACK_POINTER_SIZE)), + "size of struct pthread_unwind_buf < " + "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)"); + +#include <shlib-compat.h> + +/* libpthread once had its own longjmp (and siglongjmp alias), though there + was no apparent reason for it. There is no use in having a separate + symbol in libpthread, but the historical ABI requires it. For static + linking, there is no need to provide anything here--the libc version + will be linked in. For shared library ABI compatibility, there must be + longjmp and siglongjmp symbols in libpthread.so. + + With an IFUNC resolver, it would be possible to avoid the indirection, + but the IFUNC resolver might run before the __libc_longjmp symbol has + been relocated, in which case the IFUNC resolver would not be able to + provide the correct address. */ + +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22) + +static void __attribute__ ((noreturn, used)) +longjmp_compat (jmp_buf env, int val) +{ + /* NB: We call __libc_siglongjmp, instead of __libc_longjmp, since + __libc_longjmp is a private interface for cancellation which + doesn't restore shadow stack register. */ + __libc_siglongjmp (env, val); +} + +strong_alias (longjmp_compat, longjmp_alias) +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0); + +strong_alias (longjmp_alias, siglongjmp_alias) +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0); + +#endif -- 2.14.3 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack register 2018-03-30 17:41 [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack register H.J. Lu @ 2018-04-06 4:46 ` Carlos O'Donell 2018-04-06 12:59 ` H.J. Lu 0 siblings, 1 reply; 13+ messages in thread From: Carlos O'Donell @ 2018-04-06 4:46 UTC (permalink / raw) To: H.J. Lu, Florian Weimer; +Cc: Joseph Myers, Tsimbalist, Igor V, GNU C Library On 03/30/2018 12:41 PM, H.J. Lu wrote: > On Thu, Mar 29, 2018 at 1:20 PM, Florian Weimer <fw@deneb.enyo.de> wrote: >> * H. J. Lu: >> >>> On Thu, Mar 29, 2018 at 1:15 PM, Florian Weimer <fw@deneb.enyo.de> wrote: >>>> * H. J. Lu: >>>> >>>>> You need to make a choice. You either don't introduce a new symbol >>>>> version or don't save shadow stack for thread cancellation. You >>>>> can't have both. >>>> I don't understand. We have room to save the shadow stack pointer in >>>> the existing struct. >>> No, we don't have room in struct pthread_unwind_buf: >>> >>> Note: There is an unused pointer space in pthread_unwind_buf_data. But >>> it isn't suitable for saving and restoring shadow stack register since >>> x32 is a 64-bit process with 32-bit software pointer and kernel may >>> place x32 shadow stack above 4GB. We need to save and restore 64-bit >>> shadow stack register for x32. >> We have for void * fields. They are subsequently overwritten by >> __pthread_register_cancel. But __sigsetjmp can write to them first >> without causing any harm. We just need a private __longjmp_cancel >> that doesn't restore the shadow stack pointer. > Here is the patch which does that. Any comments? OK, so I have reviewed https://github.com/hjl-tools/glibc/ hjl/cet/master, please confirm that this is the correct branch of the required implementation for CET. It really helps to review the rest of the patch set, you should be preparing this as a patch set instead of having it reviewed one-at-a-time. This issue was already raised in the thread with Zack. Architecture: * We avoid a "flag day" with feature_1 TCB flag to switch to a new ABI, which we have discussed is a fragile process which should be avoided if a supportable alternative solution exists. * We avoid versioned symbols, this makes CET backportable, and this has a bigger benefit for long-term stable distributions. * A key design problem has been that cancellation jump buffers within glibc are truncated to optimize on-stack size, and this means that setjmp could write beyond the structure because setjmp now tries to save the shadowstack pointer into space that the cancellation jump buffer did not allocate. For the record this optimization seems premature and I'm sad we did it, this is a lesson we should learn from. * We have all agreed to the following concepts: * The cancellation process, in particular the unwinder, never returns from any of the functions we call, it just keeps calling into the unwinder to jump to the next unwound cancellation function all the way to the thread start routine. Therefore because we never return from one of these functions we never need to restore the shadow stack, and consequently wherever it is stored in the cancellation jump buffer can be overwritten if we need the space (it's a dead store). * The corollary to this is that function calls made from cancellation handlers will continue to advance the shadowstack from the deepest point at which cancellation is initiated from. This means that the depth of the shadowstack doesn't match the depth of the real stack while we are unwinding. I don't know if this will have consequences on analysis tooling or not, or debug tools during unwinding. It's a fairly advanced situation and corner case, and restoring the shadowstack is not useful becuase we don't need it and simplifies the implementation. * The cancellation jump buffer has private data used for chaining the cancel jump buffers together such that the custom unwinder can follow them and call them in sequence. This space constitutes 4 void *'s which is space that setjmp can write to, because we will just overwrite it when we register the cancel handler. * If the new shadowstack-enabled setjmp stores the shadowstack pointer into the space taken by the 4 void*'s then we won't overflow the stack, and we don't need to change the layout of the cancellation jump buffer. The 4 void*'s are sufficient, even for x32 to write a 64-bit shadow stack address. * After fixing the cancellation jump buffers the following work needs to be reviewed: * Add feature_1 in tcb head to track CET status and make it easily available to runtime for checking. * Save and restore shadowstack in setjmp/longjmp. * Add CET support to ld.so et. al. and track runtime status. * Adjust vfork for shadow stack usage. * Add ENDBR or NOTRACK where required in assembly. * CET and makecontext incompatible. - Probably need to discuss which default is appropriate. - Should the user get CET automatically disabled in makecontext() et. al. silently? - Should your current solution, which is to error out during the build, and require flag changes, be the default? This forces the user to review the security for their application. * prctl for CET. * The work to review after this patch appears to be less contentious in terms of the kinds of changes that are required. Most of the changes are internal details of enabling CET and not ABI details, with the exception of the possible pain we might cause with makecontext() being unsupported and what default position to take there. Design: * Overall the implementation looks exactly how I might expect it to look, but some of the math that places the shadowstack pointer appears to need either commenting or fixing because I don't understand it. You need to make it easy for me to see that we have placed the shadowstack pointer into the 4 pad words. Details: * One comment needs filling out a bit more, noted below. > 0001-x86-Use-pad-in-pthread_unwind_buf-to-preserve-shadow.patch > > > From f222537447f5ec879427f318b7c0396362b7453a Mon Sep 17 00:00:00 2001 > From: "H.J. Lu" <hjl.tools@gmail.com> > Date: Sat, 24 Feb 2018 17:23:54 -0800 > Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack > register > > The pad array in struct pthread_unwind_buf is used by setjmp to save > shadow stack register. We assert that size of struct pthread_unwind_buf > is no less than offset of shadow stack pointer + shadow stack pointer > size. > OK. > Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as > these with thread cancellation, call setjmp, but never return after > __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as > __libc_longjmp on x86, doesn't need to restore shadow stack register. OK. > __libc_longjmp, which is a private interface for thread cancellation > implementation in libpthread, is changed to call __longjmp_cancel, > instead of __longjmp. __longjmp_cancel is a new internal function > in libc, which is similar to __longjmp, but doesn't restore shadow > stack register. OK. Good. I like the use of a __longjmp_cancel name to call out what's going on in the API (clear semantics). > > The compatibility longjmp and siglongjmp in libpthread.so are changed > to call __libc_siglongjmp, instead of __libc_longjmp, so that they will > restore shadow stack register. OK. > > Tested with build-many-glibcs.py. > > * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous > handlers after setjmp. > * setjmp/longjmp.c (__libc_longjmp): Don't define alias if > defined. > * sysdeps/unix/sysv/linux/x86/setjmpP.h (_JUMP_BUF_SIGSET_NSIG): > Changed to 97. > * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel. > * sysdeps/x86/__longjmp_cancel.S: New file. > * sysdeps/x86/longjmp.c: Likewise. > * sysdeps/x86/nptl/pt-longjmp.c: Likewise. This looks much better. > --- > nptl/pthread_create.c | 9 ++-- > setjmp/longjmp.c | 2 + > sysdeps/unix/sysv/linux/x86/setjmpP.h | 4 +- > sysdeps/x86/Makefile | 4 ++ > sysdeps/x86/__longjmp_cancel.S | 20 ++++++++ > sysdeps/x86/longjmp.c | 45 ++++++++++++++++ > sysdeps/x86/nptl/pt-longjmp.c | 97 +++++++++++++++++++++++++++++++++++ > 7 files changed, 176 insertions(+), 5 deletions(-) > create mode 100644 sysdeps/x86/__longjmp_cancel.S > create mode 100644 sysdeps/x86/longjmp.c > create mode 100644 sysdeps/x86/nptl/pt-longjmp.c > > diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c > index caaf07c134..1c5b3780c6 100644 > --- a/nptl/pthread_create.c > +++ b/nptl/pthread_create.c > @@ -427,12 +427,15 @@ START_THREAD_DEFN > compilers without that support we do use setjmp. */ > struct pthread_unwind_buf unwind_buf; > > - /* No previous handlers. */ > + int not_first_call; > + not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); > + > + /* No previous handlers. NB: This must be done after setjmp since > + the same space may be used by setjmp to store extra data which > + should never be used by __libc_unwind_longjmp. */ Suggest: ~~~ No previous handlers. NB: This must be done after setjmp since the private space in the unwind jump buffer may overlap space used by setjmp to store extra architecture-specific information which is never be used by the cancellation-specific __libc_unwind_longjmp. The private space is allowed to overlap because the unwinder never has to return through any of the jumped-to call frames, and thus only a minimum amount of saved data need be stored, and for example, need not include the process signal mask information. This is all an optimization to reduce stack usage when pushing cancellation handlers. ~~~ > unwind_buf.priv.data.prev = NULL; > unwind_buf.priv.data.cleanup = NULL; > > - int not_first_call; > - not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); OK. > if (__glibc_likely (! not_first_call)) > { > /* Store the new cleanup handler info. */ > diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c > index a2a7065a85..453889e103 100644 > --- a/setjmp/longjmp.c > +++ b/setjmp/longjmp.c > @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val) > } > > #ifndef __libc_siglongjmp > +# ifndef __libc_longjmp > /* __libc_longjmp is a private interface for cancellation implementation > in libpthread. */ > strong_alias (__libc_siglongjmp, __libc_longjmp) > +# endif OK. > weak_alias (__libc_siglongjmp, _longjmp) > weak_alias (__libc_siglongjmp, longjmp) > weak_alias (__libc_siglongjmp, siglongjmp) > diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h > index c0ed767a0d..90a6bbcf32 100644 > --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h > +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h > @@ -22,8 +22,8 @@ > #include <bits/types/__sigset_t.h> > > /* The biggest signal number + 1. As of kernel 4.14, x86 _NSIG is 64. > - Define it to 513 to leave some rooms for future use. */ > -#define _JUMP_BUF_SIGSET_NSIG 513 > + Define it to 97 to leave some rooms for future use. */ OK. > +#define _JUMP_BUF_SIGSET_NSIG 97 Please provide proof in the way of a comment or rewriting this constant to show that it places the shadow stack pointer on both x86_64 and x32 into the range of the private pad. Also, from commit f33632ccd1dec3217583fcfdd965afb62954203c, where did this math come from? ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) Why the +7? > /* Number of longs to hold all signals. */ > #define _JUMP_BUF_SIGSET_NWORDS \ > ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) > diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile > index 0d0326c21a..d25d6f0ae4 100644 > --- a/sysdeps/x86/Makefile > +++ b/sysdeps/x86/Makefile > @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features > tests += tst-get-cpu-features > tests-static += tst-get-cpu-features-static > endif > + > +ifeq ($(subdir),setjmp) > +sysdep_routines += __longjmp_cancel > +endif OK. > diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S > new file mode 100644 > index 0000000000..b57dbfa376 > --- /dev/null > +++ b/sysdeps/x86/__longjmp_cancel.S > @@ -0,0 +1,20 @@ > +/* __longjmp_cancel for x86. > + Copyright (C) 2018 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <http://www.gnu.org/licenses/>. */ > + > +#define __longjmp __longjmp_cancel > +#include <__longjmp.S> OK. > diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c > new file mode 100644 > index 0000000000..a53f31e1dd > --- /dev/null > +++ b/sysdeps/x86/longjmp.c > @@ -0,0 +1,45 @@ > +/* __libc_siglongjmp for x86. > + Copyright (C) 2018 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <http://www.gnu.org/licenses/>. */ > + > +#define __libc_longjmp __redirect___libc_longjmp > +#include <setjmp/longjmp.c> > +#undef __libc_longjmp > + > +extern void __longjmp_cancel (__jmp_buf __env, int __val) > + __attribute__ ((__noreturn__)) attribute_hidden; > + > +/* Since __libc_longjmp is a private interface for cancellation > + implementation in libpthread, there is no need to restore shadow > + stack register. */ > + > +void > +__libc_longjmp (sigjmp_buf env, int val) > +{ > + /* Perform any cleanups needed by the frames being unwound. */ > + _longjmp_unwind (env, val); OK. > + > + if (env[0].__mask_was_saved) > + /* Restore the saved signal mask. */ > + (void) __sigprocmask (SIG_SETMASK, > + (sigset_t *) &env[0].__saved_mask, > + (sigset_t *) NULL); OK. > + > + /* Call the machine-dependent function to restore machine state > + without shadow stack. */ > + __longjmp_cancel (env[0].__jmpbuf, val ?: 1); OK. > +} > diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c > new file mode 100644 > index 0000000000..7eb8651cfe > --- /dev/null > +++ b/sysdeps/x86/nptl/pt-longjmp.c > @@ -0,0 +1,97 @@ > +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI. > + X86 version. > + Copyright (C) 18 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <http://www.gnu.org/licenses/>. */ > + > +/* <nptl/descr.h> has > + > +struct pthread_unwind_buf > +{ > + struct > + { > + __jmp_buf jmp_buf; > + int mask_was_saved; > + } cancel_jmp_buf[1]; > + > + union > + { > + void *pad[4]; > + struct > + { > + struct pthread_unwind_buf *prev; > + struct _pthread_cleanup_buffer *cleanup; > + int canceltype; > + } data; > + } priv; > +}; > + > + The pad array in struct pthread_unwind_buf is used by setjmp to save This appears to be SHADOW_STACK_POINTER_OFFSET in subsequent patches. However on your hjl/cet/master branch it appears that this offset is not defined to be *just after* the mask_was_saved? > + shadow stack register. Assert that size of struct pthread_unwind_buf > + is no less than offset of shadow stack pointer plus shadow stack > + pointer size. > + > + NB: setjmp is called in libpthread to save shadow stack register. But > + __libc_unwind_longjmp doesn't restore shadow stack register since they > + never return after longjmp. */ > + > +#include <pthreadP.h> > +#include <jmp_buf-ssp.h> > + > +#ifdef __x86_64__ > +# define SHADOW_STACK_POINTER_SIZE 8 > +#else > +# define SHADOW_STACK_POINTER_SIZE 4 > +#endif > + > +_Static_assert ((sizeof (struct pthread_unwind_buf) > + >= (SHADOW_STACK_POINTER_OFFSET > + + SHADOW_STACK_POINTER_SIZE)), > + "size of struct pthread_unwind_buf < " > + "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)"); OK. > + > +#include <shlib-compat.h> > + > +/* libpthread once had its own longjmp (and siglongjmp alias), though there > + was no apparent reason for it. There is no use in having a separate > + symbol in libpthread, but the historical ABI requires it. For static > + linking, there is no need to provide anything here--the libc version > + will be linked in. For shared library ABI compatibility, there must be > + longjmp and siglongjmp symbols in libpthread.so. > + > + With an IFUNC resolver, it would be possible to avoid the indirection, > + but the IFUNC resolver might run before the __libc_longjmp symbol has > + been relocated, in which case the IFUNC resolver would not be able to > + provide the correct address. */ > + > +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22) > + > +static void __attribute__ ((noreturn, used)) > +longjmp_compat (jmp_buf env, int val) > +{ > + /* NB: We call __libc_siglongjmp, instead of __libc_longjmp, since > + __libc_longjmp is a private interface for cancellation which > + doesn't restore shadow stack register. */ > + __libc_siglongjmp (env, val); OK. > +} > + > +strong_alias (longjmp_compat, longjmp_alias) > +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0); > + > +strong_alias (longjmp_alias, siglongjmp_alias) > +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0); > + > +#endif > -- 2.14.3 -- Cheers, Carlos. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack register 2018-04-06 4:46 ` Carlos O'Donell @ 2018-04-06 12:59 ` H.J. Lu 2018-04-06 20:26 ` H.J. Lu ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: H.J. Lu @ 2018-04-06 12:59 UTC (permalink / raw) To: Carlos O'Donell Cc: Florian Weimer, Joseph Myers, Tsimbalist, Igor V, GNU C Library On Thu, Apr 5, 2018 at 9:46 PM, Carlos O'Donell <carlos@redhat.com> wrote: > On 03/30/2018 12:41 PM, H.J. Lu wrote: >> On Thu, Mar 29, 2018 at 1:20 PM, Florian Weimer <fw@deneb.enyo.de> wrote: >>> * H. J. Lu: >>> >>>> On Thu, Mar 29, 2018 at 1:15 PM, Florian Weimer <fw@deneb.enyo.de> wrote: >>>>> * H. J. Lu: >>>>> >>>>>> You need to make a choice. You either don't introduce a new symbol >>>>>> version or don't save shadow stack for thread cancellation. You >>>>>> can't have both. >>>>> I don't understand. We have room to save the shadow stack pointer in >>>>> the existing struct. >>>> No, we don't have room in struct pthread_unwind_buf: >>>> >>>> Note: There is an unused pointer space in pthread_unwind_buf_data. But >>>> it isn't suitable for saving and restoring shadow stack register since >>>> x32 is a 64-bit process with 32-bit software pointer and kernel may >>>> place x32 shadow stack above 4GB. We need to save and restore 64-bit >>>> shadow stack register for x32. >>> We have for void * fields. They are subsequently overwritten by >>> __pthread_register_cancel. But __sigsetjmp can write to them first >>> without causing any harm. We just need a private __longjmp_cancel >>> that doesn't restore the shadow stack pointer. >> Here is the patch which does that. Any comments? > > OK, so I have reviewed https://github.com/hjl-tools/glibc/ hjl/cet/master, > please confirm that this is the correct branch of the required implementation > for CET. It really helps to review the rest of the patch set, you should be > preparing this as a patch set instead of having it reviewed one-at-a-time. > This issue was already raised in the thread with Zack. Thanks for your feedbacks. > Architecture: > > * We avoid a "flag day" with feature_1 TCB flag to switch to a new ABI, > which we have discussed is a fragile process which should be avoided if > a supportable alternative solution exists. > > * We avoid versioned symbols, this makes CET backportable, and this has a > bigger benefit for long-term stable distributions. > > * A key design problem has been that cancellation jump buffers within glibc > are truncated to optimize on-stack size, and this means that setjmp could > write beyond the structure because setjmp now tries to save the shadowstack > pointer into space that the cancellation jump buffer did not allocate. > For the record this optimization seems premature and I'm sad we did it, this > is a lesson we should learn from. > > * We have all agreed to the following concepts: > > * The cancellation process, in particular the unwinder, never returns from > any of the functions we call, it just keeps calling into the unwinder to > jump to the next unwound cancellation function all the way to the thread > start routine. Therefore because we never return from one of these functions > we never need to restore the shadow stack, and consequently wherever it is > stored in the cancellation jump buffer can be overwritten if we need the > space (it's a dead store). > > * The corollary to this is that function calls made from cancellation handlers > will continue to advance the shadowstack from the deepest point at which > cancellation is initiated from. This means that the depth of the shadowstack > doesn't match the depth of the real stack while we are unwinding. I don't > know if this will have consequences on analysis tooling or not, or debug > tools during unwinding. It's a fairly advanced situation and corner case, > and restoring the shadowstack is not useful becuase we don't need it and > simplifies the implementation. > > * The cancellation jump buffer has private data used for chaining the cancel > jump buffers together such that the custom unwinder can follow them and > call them in sequence. This space constitutes 4 void *'s which is space > that setjmp can write to, because we will just overwrite it when we register > the cancel handler. > > * If the new shadowstack-enabled setjmp stores the shadowstack pointer into > the space taken by the 4 void*'s then we won't overflow the stack, and we > don't need to change the layout of the cancellation jump buffer. The 4 void*'s > are sufficient, even for x32 to write a 64-bit shadow stack address. > > * After fixing the cancellation jump buffers the following work needs to be reviewed: > > * Add feature_1 in tcb head to track CET status and make it easily available > to runtime for checking. > > * Save and restore shadowstack in setjmp/longjmp. > > * Add CET support to ld.so et. al. and track runtime status. > > * Adjust vfork for shadow stack usage. > > * Add ENDBR or NOTRACK where required in assembly. > > * CET and makecontext incompatible. > - Probably need to discuss which default is appropriate. > - Should the user get CET automatically disabled in makecontext() et. al. silently? > - Should your current solution, which is to error out during the build, and require > flag changes, be the default? This forces the user to review the security for their > application. I'd like to reserve 4 slots in ucontext for shadow stack: https://github.com/hjl-tools/glibc/commit/9bf6aefa8fb45f8df140d42ce9cf890bb24076e1 It should be binary backward compatible. I will investigate if there is a way to support shadow stack with existing API. Otherwise, we need to add a new API for ucontext functions with shadow stack. > * prctl for CET. We have been experimenting different approaches to get the best implementation. I am expecting that this patch may change as we collect more data. > * The work to review after this patch appears to be less contentious in terms of > the kinds of changes that are required. Most of the changes are internal details > of enabling CET and not ABI details, with the exception of the possible pain we > might cause with makecontext() being unsupported and what default position to take > there. > > Design: > > * Overall the implementation looks exactly how I might expect it to look, but some > of the math that places the shadowstack pointer appears to need either commenting > or fixing because I don't understand it. You need to make it easy for me to see > that we have placed the shadowstack pointer into the 4 pad words. > > Details: > > * One comment needs filling out a bit more, noted below. > >> 0001-x86-Use-pad-in-pthread_unwind_buf-to-preserve-shadow.patch >> >> >> From f222537447f5ec879427f318b7c0396362b7453a Mon Sep 17 00:00:00 2001 >> From: "H.J. Lu" <hjl.tools@gmail.com> >> Date: Sat, 24 Feb 2018 17:23:54 -0800 >> Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack >> register >> >> The pad array in struct pthread_unwind_buf is used by setjmp to save >> shadow stack register. We assert that size of struct pthread_unwind_buf >> is no less than offset of shadow stack pointer + shadow stack pointer >> size. >> > > OK. > >> Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as >> these with thread cancellation, call setjmp, but never return after >> __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as >> __libc_longjmp on x86, doesn't need to restore shadow stack register. > > OK. > >> __libc_longjmp, which is a private interface for thread cancellation >> implementation in libpthread, is changed to call __longjmp_cancel, >> instead of __longjmp. __longjmp_cancel is a new internal function >> in libc, which is similar to __longjmp, but doesn't restore shadow >> stack register. > > OK. Good. I like the use of a __longjmp_cancel name to call out what's > going on in the API (clear semantics). > >> >> The compatibility longjmp and siglongjmp in libpthread.so are changed >> to call __libc_siglongjmp, instead of __libc_longjmp, so that they will >> restore shadow stack register. > > OK. > >> >> Tested with build-many-glibcs.py. >> >> * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous >> handlers after setjmp. >> * setjmp/longjmp.c (__libc_longjmp): Don't define alias if >> defined. >> * sysdeps/unix/sysv/linux/x86/setjmpP.h (_JUMP_BUF_SIGSET_NSIG): >> Changed to 97. >> * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel. >> * sysdeps/x86/__longjmp_cancel.S: New file. >> * sysdeps/x86/longjmp.c: Likewise. >> * sysdeps/x86/nptl/pt-longjmp.c: Likewise. > > This looks much better. > >> --- >> nptl/pthread_create.c | 9 ++-- >> setjmp/longjmp.c | 2 + >> sysdeps/unix/sysv/linux/x86/setjmpP.h | 4 +- >> sysdeps/x86/Makefile | 4 ++ >> sysdeps/x86/__longjmp_cancel.S | 20 ++++++++ >> sysdeps/x86/longjmp.c | 45 ++++++++++++++++ >> sysdeps/x86/nptl/pt-longjmp.c | 97 +++++++++++++++++++++++++++++++++++ >> 7 files changed, 176 insertions(+), 5 deletions(-) >> create mode 100644 sysdeps/x86/__longjmp_cancel.S >> create mode 100644 sysdeps/x86/longjmp.c >> create mode 100644 sysdeps/x86/nptl/pt-longjmp.c >> >> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c >> index caaf07c134..1c5b3780c6 100644 >> --- a/nptl/pthread_create.c >> +++ b/nptl/pthread_create.c >> @@ -427,12 +427,15 @@ START_THREAD_DEFN >> compilers without that support we do use setjmp. */ >> struct pthread_unwind_buf unwind_buf; >> >> - /* No previous handlers. */ >> + int not_first_call; >> + not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); >> + >> + /* No previous handlers. NB: This must be done after setjmp since >> + the same space may be used by setjmp to store extra data which >> + should never be used by __libc_unwind_longjmp. */ > > Suggest: > ~~~ > No previous handlers. NB: This must be done after setjmp since > the private space in the unwind jump buffer may overlap space > used by setjmp to store extra architecture-specific information > which is never be used by the cancellation-specific > __libc_unwind_longjmp. > > The private space is allowed to overlap because the unwinder never > has to return through any of the jumped-to call frames, and thus > only a minimum amount of saved data need be stored, and for example, > need not include the process signal mask information. This is all > an optimization to reduce stack usage when pushing cancellation > handlers. > ~~~ Will fix it. >> unwind_buf.priv.data.prev = NULL; >> unwind_buf.priv.data.cleanup = NULL; >> >> - int not_first_call; >> - not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); > > OK. > >> if (__glibc_likely (! not_first_call)) >> { >> /* Store the new cleanup handler info. */ >> diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c >> index a2a7065a85..453889e103 100644 >> --- a/setjmp/longjmp.c >> +++ b/setjmp/longjmp.c >> @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val) >> } >> >> #ifndef __libc_siglongjmp >> +# ifndef __libc_longjmp >> /* __libc_longjmp is a private interface for cancellation implementation >> in libpthread. */ >> strong_alias (__libc_siglongjmp, __libc_longjmp) >> +# endif > > OK. > >> weak_alias (__libc_siglongjmp, _longjmp) >> weak_alias (__libc_siglongjmp, longjmp) >> weak_alias (__libc_siglongjmp, siglongjmp) >> diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h >> index c0ed767a0d..90a6bbcf32 100644 >> --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h >> +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h >> @@ -22,8 +22,8 @@ >> #include <bits/types/__sigset_t.h> >> >> /* The biggest signal number + 1. As of kernel 4.14, x86 _NSIG is 64. >> - Define it to 513 to leave some rooms for future use. */ >> -#define _JUMP_BUF_SIGSET_NSIG 513 >> + Define it to 97 to leave some rooms for future use. */ > > OK. > >> +#define _JUMP_BUF_SIGSET_NSIG 97 > > Please provide proof in the way of a comment or rewriting this constant > to show that it places the shadow stack pointer on both x86_64 and x32 > into the range of the private pad. sysdeps/x86/nptl/pt-longjmp.c has _Static_assert ((sizeof (struct pthread_unwind_buf) >= (SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)), "size of struct pthread_unwind_buf < " "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)"); If shadow stack pointer is saved in the offset bigger than the size of struct pthread_unwind_buf, assert will trigger at compile-time. > Also, from commit f33632ccd1dec3217583fcfdd965afb62954203c, > where did this math come from? > > ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) > > Why the +7? _JUMP_BUF_SIGSET_NSIG is the biggest signal number + 1. _JUMP_BUF_SIGSET_NSIG - 1 gives the biggest signal number. _JUMP_BUF_SIGSET_NSIG - 1 + 7 rounds up to the number of bytes which are needed to store the biggest signal number. >> /* Number of longs to hold all signals. */ >> #define _JUMP_BUF_SIGSET_NWORDS \ >> ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) >> diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile >> index 0d0326c21a..d25d6f0ae4 100644 >> --- a/sysdeps/x86/Makefile >> +++ b/sysdeps/x86/Makefile >> @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features >> tests += tst-get-cpu-features >> tests-static += tst-get-cpu-features-static >> endif >> + >> +ifeq ($(subdir),setjmp) >> +sysdep_routines += __longjmp_cancel >> +endif > > OK. > >> diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S >> new file mode 100644 >> index 0000000000..b57dbfa376 >> --- /dev/null >> +++ b/sysdeps/x86/__longjmp_cancel.S >> @@ -0,0 +1,20 @@ >> +/* __longjmp_cancel for x86. >> + Copyright (C) 2018 Free Software Foundation, Inc. >> + This file is part of the GNU C Library. >> + >> + The GNU C Library is free software; you can redistribute it and/or >> + modify it under the terms of the GNU Lesser General Public >> + License as published by the Free Software Foundation; either >> + version 2.1 of the License, or (at your option) any later version. >> + >> + The GNU C Library is distributed in the hope that it will be useful, >> + but WITHOUT ANY WARRANTY; without even the implied warranty of >> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >> + Lesser General Public License for more details. >> + >> + You should have received a copy of the GNU Lesser General Public >> + License along with the GNU C Library; if not, see >> + <http://www.gnu.org/licenses/>. */ >> + >> +#define __longjmp __longjmp_cancel >> +#include <__longjmp.S> > > OK. > >> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c >> new file mode 100644 >> index 0000000000..a53f31e1dd >> --- /dev/null >> +++ b/sysdeps/x86/longjmp.c >> @@ -0,0 +1,45 @@ >> +/* __libc_siglongjmp for x86. >> + Copyright (C) 2018 Free Software Foundation, Inc. >> + This file is part of the GNU C Library. >> + >> + The GNU C Library is free software; you can redistribute it and/or >> + modify it under the terms of the GNU Lesser General Public >> + License as published by the Free Software Foundation; either >> + version 2.1 of the License, or (at your option) any later version. >> + >> + The GNU C Library is distributed in the hope that it will be useful, >> + but WITHOUT ANY WARRANTY; without even the implied warranty of >> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >> + Lesser General Public License for more details. >> + >> + You should have received a copy of the GNU Lesser General Public >> + License along with the GNU C Library; if not, see >> + <http://www.gnu.org/licenses/>. */ >> + >> +#define __libc_longjmp __redirect___libc_longjmp >> +#include <setjmp/longjmp.c> >> +#undef __libc_longjmp >> + >> +extern void __longjmp_cancel (__jmp_buf __env, int __val) >> + __attribute__ ((__noreturn__)) attribute_hidden; >> + >> +/* Since __libc_longjmp is a private interface for cancellation >> + implementation in libpthread, there is no need to restore shadow >> + stack register. */ >> + >> +void >> +__libc_longjmp (sigjmp_buf env, int val) >> +{ >> + /* Perform any cleanups needed by the frames being unwound. */ >> + _longjmp_unwind (env, val); > > OK. > >> + >> + if (env[0].__mask_was_saved) >> + /* Restore the saved signal mask. */ >> + (void) __sigprocmask (SIG_SETMASK, >> + (sigset_t *) &env[0].__saved_mask, >> + (sigset_t *) NULL); > > OK. > >> + >> + /* Call the machine-dependent function to restore machine state >> + without shadow stack. */ >> + __longjmp_cancel (env[0].__jmpbuf, val ?: 1); > > OK. > >> +} >> diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c >> new file mode 100644 >> index 0000000000..7eb8651cfe >> --- /dev/null >> +++ b/sysdeps/x86/nptl/pt-longjmp.c >> @@ -0,0 +1,97 @@ >> +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI. >> + X86 version. >> + Copyright (C) 18 Free Software Foundation, Inc. >> + This file is part of the GNU C Library. >> + >> + The GNU C Library is free software; you can redistribute it and/or >> + modify it under the terms of the GNU Lesser General Public >> + License as published by the Free Software Foundation; either >> + version 2.1 of the License, or (at your option) any later version. >> + >> + The GNU C Library is distributed in the hope that it will be useful, >> + but WITHOUT ANY WARRANTY; without even the implied warranty of >> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >> + Lesser General Public License for more details. >> + >> + You should have received a copy of the GNU Lesser General Public >> + License along with the GNU C Library; if not, see >> + <http://www.gnu.org/licenses/>. */ >> + >> +/* <nptl/descr.h> has >> + >> +struct pthread_unwind_buf >> +{ >> + struct >> + { >> + __jmp_buf jmp_buf; >> + int mask_was_saved; >> + } cancel_jmp_buf[1]; >> + >> + union >> + { >> + void *pad[4]; >> + struct >> + { >> + struct pthread_unwind_buf *prev; >> + struct _pthread_cleanup_buffer *cleanup; >> + int canceltype; >> + } data; >> + } priv; >> +}; >> + >> + The pad array in struct pthread_unwind_buf is used by setjmp to save > > This appears to be SHADOW_STACK_POINTER_OFFSET in subsequent patches. > > However on your hjl/cet/master branch it appears that this offset is not > defined to be *just after* the mask_was_saved? sysdeps/unix/sysv/linux/x86/setjmpP.h has typedef struct { unsigned long int __val[_JUMP_BUF_SIGSET_NWORDS]; } __jmp_buf_sigset_t; typedef union { __sigset_t __saved_mask_compat; struct { __jmp_buf_sigset_t __saved_mask; /* Used for shadow stack pointer. */ unsigned long int __shadow_stack_pointer; } __saved; } __jmpbuf_arch_t; __shadow_stack_pointer is placed after __saved_mask, aka mask_was_saved. >> + shadow stack register. Assert that size of struct pthread_unwind_buf >> + is no less than offset of shadow stack pointer plus shadow stack >> + pointer size. >> + >> + NB: setjmp is called in libpthread to save shadow stack register. But >> + __libc_unwind_longjmp doesn't restore shadow stack register since they >> + never return after longjmp. */ >> + >> +#include <pthreadP.h> >> +#include <jmp_buf-ssp.h> >> + >> +#ifdef __x86_64__ >> +# define SHADOW_STACK_POINTER_SIZE 8 >> +#else >> +# define SHADOW_STACK_POINTER_SIZE 4 >> +#endif >> + >> +_Static_assert ((sizeof (struct pthread_unwind_buf) >> + >= (SHADOW_STACK_POINTER_OFFSET >> + + SHADOW_STACK_POINTER_SIZE)), >> + "size of struct pthread_unwind_buf < " >> + "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)"); > > OK. > >> + >> +#include <shlib-compat.h> >> + >> +/* libpthread once had its own longjmp (and siglongjmp alias), though there >> + was no apparent reason for it. There is no use in having a separate >> + symbol in libpthread, but the historical ABI requires it. For static >> + linking, there is no need to provide anything here--the libc version >> + will be linked in. For shared library ABI compatibility, there must be >> + longjmp and siglongjmp symbols in libpthread.so. >> + >> + With an IFUNC resolver, it would be possible to avoid the indirection, >> + but the IFUNC resolver might run before the __libc_longjmp symbol has >> + been relocated, in which case the IFUNC resolver would not be able to >> + provide the correct address. */ >> + >> +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22) >> + >> +static void __attribute__ ((noreturn, used)) >> +longjmp_compat (jmp_buf env, int val) >> +{ >> + /* NB: We call __libc_siglongjmp, instead of __libc_longjmp, since >> + __libc_longjmp is a private interface for cancellation which >> + doesn't restore shadow stack register. */ >> + __libc_siglongjmp (env, val); > > OK. > >> +} >> + >> +strong_alias (longjmp_compat, longjmp_alias) >> +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0); >> + >> +strong_alias (longjmp_alias, siglongjmp_alias) >> +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0); >> + >> +#endif >> -- 2.14.3 > > > -- > Cheers, > Carlos. -- H.J. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack register 2018-04-06 12:59 ` H.J. Lu @ 2018-04-06 20:26 ` H.J. Lu 2018-04-12 21:36 ` Carlos O'Donell 2018-04-12 21:36 ` Carlos O'Donell 2018-04-17 20:03 ` [PATCH] " Joseph Myers 2 siblings, 1 reply; 13+ messages in thread From: H.J. Lu @ 2018-04-06 20:26 UTC (permalink / raw) To: Carlos O'Donell Cc: Florian Weimer, Joseph Myers, Tsimbalist, Igor V, GNU C Library [-- Attachment #1: Type: text/plain, Size: 22194 bytes --] On Fri, Apr 6, 2018 at 5:59 AM, H.J. Lu <hjl.tools@gmail.com> wrote: > On Thu, Apr 5, 2018 at 9:46 PM, Carlos O'Donell <carlos@redhat.com> wrote: >> On 03/30/2018 12:41 PM, H.J. Lu wrote: >>> On Thu, Mar 29, 2018 at 1:20 PM, Florian Weimer <fw@deneb.enyo.de> wrote: >>>> * H. J. Lu: >>>> >>>>> On Thu, Mar 29, 2018 at 1:15 PM, Florian Weimer <fw@deneb.enyo.de> wrote: >>>>>> * H. J. Lu: >>>>>> >>>>>>> You need to make a choice. You either don't introduce a new symbol >>>>>>> version or don't save shadow stack for thread cancellation. You >>>>>>> can't have both. >>>>>> I don't understand. We have room to save the shadow stack pointer in >>>>>> the existing struct. >>>>> No, we don't have room in struct pthread_unwind_buf: >>>>> >>>>> Note: There is an unused pointer space in pthread_unwind_buf_data. But >>>>> it isn't suitable for saving and restoring shadow stack register since >>>>> x32 is a 64-bit process with 32-bit software pointer and kernel may >>>>> place x32 shadow stack above 4GB. We need to save and restore 64-bit >>>>> shadow stack register for x32. >>>> We have for void * fields. They are subsequently overwritten by >>>> __pthread_register_cancel. But __sigsetjmp can write to them first >>>> without causing any harm. We just need a private __longjmp_cancel >>>> that doesn't restore the shadow stack pointer. >>> Here is the patch which does that. Any comments? >> >> OK, so I have reviewed https://github.com/hjl-tools/glibc/ hjl/cet/master, >> please confirm that this is the correct branch of the required implementation >> for CET. It really helps to review the rest of the patch set, you should be >> preparing this as a patch set instead of having it reviewed one-at-a-time. >> This issue was already raised in the thread with Zack. > > Thanks for your feedbacks. > >> Architecture: >> >> * We avoid a "flag day" with feature_1 TCB flag to switch to a new ABI, >> which we have discussed is a fragile process which should be avoided if >> a supportable alternative solution exists. >> >> * We avoid versioned symbols, this makes CET backportable, and this has a >> bigger benefit for long-term stable distributions. >> >> * A key design problem has been that cancellation jump buffers within glibc >> are truncated to optimize on-stack size, and this means that setjmp could >> write beyond the structure because setjmp now tries to save the shadowstack >> pointer into space that the cancellation jump buffer did not allocate. >> For the record this optimization seems premature and I'm sad we did it, this >> is a lesson we should learn from. >> >> * We have all agreed to the following concepts: >> >> * The cancellation process, in particular the unwinder, never returns from >> any of the functions we call, it just keeps calling into the unwinder to >> jump to the next unwound cancellation function all the way to the thread >> start routine. Therefore because we never return from one of these functions >> we never need to restore the shadow stack, and consequently wherever it is >> stored in the cancellation jump buffer can be overwritten if we need the >> space (it's a dead store). >> >> * The corollary to this is that function calls made from cancellation handlers >> will continue to advance the shadowstack from the deepest point at which >> cancellation is initiated from. This means that the depth of the shadowstack >> doesn't match the depth of the real stack while we are unwinding. I don't >> know if this will have consequences on analysis tooling or not, or debug >> tools during unwinding. It's a fairly advanced situation and corner case, >> and restoring the shadowstack is not useful becuase we don't need it and >> simplifies the implementation. >> >> * The cancellation jump buffer has private data used for chaining the cancel >> jump buffers together such that the custom unwinder can follow them and >> call them in sequence. This space constitutes 4 void *'s which is space >> that setjmp can write to, because we will just overwrite it when we register >> the cancel handler. >> >> * If the new shadowstack-enabled setjmp stores the shadowstack pointer into >> the space taken by the 4 void*'s then we won't overflow the stack, and we >> don't need to change the layout of the cancellation jump buffer. The 4 void*'s >> are sufficient, even for x32 to write a 64-bit shadow stack address. >> >> * After fixing the cancellation jump buffers the following work needs to be reviewed: >> >> * Add feature_1 in tcb head to track CET status and make it easily available >> to runtime for checking. >> >> * Save and restore shadowstack in setjmp/longjmp. >> >> * Add CET support to ld.so et. al. and track runtime status. >> >> * Adjust vfork for shadow stack usage. >> >> * Add ENDBR or NOTRACK where required in assembly. >> >> * CET and makecontext incompatible. >> - Probably need to discuss which default is appropriate. >> - Should the user get CET automatically disabled in makecontext() et. al. silently? >> - Should your current solution, which is to error out during the build, and require >> flag changes, be the default? This forces the user to review the security for their >> application. > > I'd like to reserve 4 slots in ucontext for shadow stack: > > https://github.com/hjl-tools/glibc/commit/9bf6aefa8fb45f8df140d42ce9cf890bb24076e1 > > It should be binary backward compatible. I will investigate if there is a way > to support shadow stack with existing API. Otherwise, we need to add a new > API for ucontext functions with shadow stack. > >> * prctl for CET. > > We have been experimenting different approaches to get the best implementation. > I am expecting that this patch may change as we collect more data. > >> * The work to review after this patch appears to be less contentious in terms of >> the kinds of changes that are required. Most of the changes are internal details >> of enabling CET and not ABI details, with the exception of the possible pain we >> might cause with makecontext() being unsupported and what default position to take >> there. >> >> Design: >> >> * Overall the implementation looks exactly how I might expect it to look, but some >> of the math that places the shadowstack pointer appears to need either commenting >> or fixing because I don't understand it. You need to make it easy for me to see >> that we have placed the shadowstack pointer into the 4 pad words. >> >> Details: >> >> * One comment needs filling out a bit more, noted below. >> >>> 0001-x86-Use-pad-in-pthread_unwind_buf-to-preserve-shadow.patch >>> >>> >>> From f222537447f5ec879427f318b7c0396362b7453a Mon Sep 17 00:00:00 2001 >>> From: "H.J. Lu" <hjl.tools@gmail.com> >>> Date: Sat, 24 Feb 2018 17:23:54 -0800 >>> Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack >>> register >>> >>> The pad array in struct pthread_unwind_buf is used by setjmp to save >>> shadow stack register. We assert that size of struct pthread_unwind_buf >>> is no less than offset of shadow stack pointer + shadow stack pointer >>> size. >>> >> >> OK. >> >>> Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as >>> these with thread cancellation, call setjmp, but never return after >>> __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as >>> __libc_longjmp on x86, doesn't need to restore shadow stack register. >> >> OK. >> >>> __libc_longjmp, which is a private interface for thread cancellation >>> implementation in libpthread, is changed to call __longjmp_cancel, >>> instead of __longjmp. __longjmp_cancel is a new internal function >>> in libc, which is similar to __longjmp, but doesn't restore shadow >>> stack register. >> >> OK. Good. I like the use of a __longjmp_cancel name to call out what's >> going on in the API (clear semantics). >> >>> >>> The compatibility longjmp and siglongjmp in libpthread.so are changed >>> to call __libc_siglongjmp, instead of __libc_longjmp, so that they will >>> restore shadow stack register. >> >> OK. >> >>> >>> Tested with build-many-glibcs.py. >>> >>> * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous >>> handlers after setjmp. >>> * setjmp/longjmp.c (__libc_longjmp): Don't define alias if >>> defined. >>> * sysdeps/unix/sysv/linux/x86/setjmpP.h (_JUMP_BUF_SIGSET_NSIG): >>> Changed to 97. >>> * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel. >>> * sysdeps/x86/__longjmp_cancel.S: New file. >>> * sysdeps/x86/longjmp.c: Likewise. >>> * sysdeps/x86/nptl/pt-longjmp.c: Likewise. >> >> This looks much better. >> >>> --- >>> nptl/pthread_create.c | 9 ++-- >>> setjmp/longjmp.c | 2 + >>> sysdeps/unix/sysv/linux/x86/setjmpP.h | 4 +- >>> sysdeps/x86/Makefile | 4 ++ >>> sysdeps/x86/__longjmp_cancel.S | 20 ++++++++ >>> sysdeps/x86/longjmp.c | 45 ++++++++++++++++ >>> sysdeps/x86/nptl/pt-longjmp.c | 97 +++++++++++++++++++++++++++++++++++ >>> 7 files changed, 176 insertions(+), 5 deletions(-) >>> create mode 100644 sysdeps/x86/__longjmp_cancel.S >>> create mode 100644 sysdeps/x86/longjmp.c >>> create mode 100644 sysdeps/x86/nptl/pt-longjmp.c >>> >>> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c >>> index caaf07c134..1c5b3780c6 100644 >>> --- a/nptl/pthread_create.c >>> +++ b/nptl/pthread_create.c >>> @@ -427,12 +427,15 @@ START_THREAD_DEFN >>> compilers without that support we do use setjmp. */ >>> struct pthread_unwind_buf unwind_buf; >>> >>> - /* No previous handlers. */ >>> + int not_first_call; >>> + not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); >>> + >>> + /* No previous handlers. NB: This must be done after setjmp since >>> + the same space may be used by setjmp to store extra data which >>> + should never be used by __libc_unwind_longjmp. */ >> >> Suggest: >> ~~~ >> No previous handlers. NB: This must be done after setjmp since >> the private space in the unwind jump buffer may overlap space >> used by setjmp to store extra architecture-specific information >> which is never be used by the cancellation-specific >> __libc_unwind_longjmp. >> >> The private space is allowed to overlap because the unwinder never >> has to return through any of the jumped-to call frames, and thus >> only a minimum amount of saved data need be stored, and for example, >> need not include the process signal mask information. This is all >> an optimization to reduce stack usage when pushing cancellation >> handlers. >> ~~~ > > Will fix it. > >>> unwind_buf.priv.data.prev = NULL; >>> unwind_buf.priv.data.cleanup = NULL; >>> >>> - int not_first_call; >>> - not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); >> >> OK. >> >>> if (__glibc_likely (! not_first_call)) >>> { >>> /* Store the new cleanup handler info. */ >>> diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c >>> index a2a7065a85..453889e103 100644 >>> --- a/setjmp/longjmp.c >>> +++ b/setjmp/longjmp.c >>> @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val) >>> } >>> >>> #ifndef __libc_siglongjmp >>> +# ifndef __libc_longjmp >>> /* __libc_longjmp is a private interface for cancellation implementation >>> in libpthread. */ >>> strong_alias (__libc_siglongjmp, __libc_longjmp) >>> +# endif >> >> OK. >> >>> weak_alias (__libc_siglongjmp, _longjmp) >>> weak_alias (__libc_siglongjmp, longjmp) >>> weak_alias (__libc_siglongjmp, siglongjmp) >>> diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h >>> index c0ed767a0d..90a6bbcf32 100644 >>> --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h >>> +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h >>> @@ -22,8 +22,8 @@ >>> #include <bits/types/__sigset_t.h> >>> >>> /* The biggest signal number + 1. As of kernel 4.14, x86 _NSIG is 64. >>> - Define it to 513 to leave some rooms for future use. */ >>> -#define _JUMP_BUF_SIGSET_NSIG 513 >>> + Define it to 97 to leave some rooms for future use. */ >> >> OK. >> >>> +#define _JUMP_BUF_SIGSET_NSIG 97 >> >> Please provide proof in the way of a comment or rewriting this constant >> to show that it places the shadow stack pointer on both x86_64 and x32 >> into the range of the private pad. > > sysdeps/x86/nptl/pt-longjmp.c has > > _Static_assert ((sizeof (struct pthread_unwind_buf) >>= (SHADOW_STACK_POINTER_OFFSET > + SHADOW_STACK_POINTER_SIZE)), > "size of struct pthread_unwind_buf < " > "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)"); > > If shadow stack pointer is saved in the offset bigger than the size of > struct pthread_unwind_buf, assert will trigger at compile-time. > >> Also, from commit f33632ccd1dec3217583fcfdd965afb62954203c, >> where did this math come from? >> >> ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) >> >> Why the +7? > > _JUMP_BUF_SIGSET_NSIG is the biggest signal number + 1. > _JUMP_BUF_SIGSET_NSIG - 1 gives the biggest signal number. > _JUMP_BUF_SIGSET_NSIG - 1 + 7 rounds up to the number of bytes > which are needed to store the biggest signal number. > >>> /* Number of longs to hold all signals. */ >>> #define _JUMP_BUF_SIGSET_NWORDS \ >>> ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) >>> diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile >>> index 0d0326c21a..d25d6f0ae4 100644 >>> --- a/sysdeps/x86/Makefile >>> +++ b/sysdeps/x86/Makefile >>> @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features >>> tests += tst-get-cpu-features >>> tests-static += tst-get-cpu-features-static >>> endif >>> + >>> +ifeq ($(subdir),setjmp) >>> +sysdep_routines += __longjmp_cancel >>> +endif >> >> OK. >> >>> diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S >>> new file mode 100644 >>> index 0000000000..b57dbfa376 >>> --- /dev/null >>> +++ b/sysdeps/x86/__longjmp_cancel.S >>> @@ -0,0 +1,20 @@ >>> +/* __longjmp_cancel for x86. >>> + Copyright (C) 2018 Free Software Foundation, Inc. >>> + This file is part of the GNU C Library. >>> + >>> + The GNU C Library is free software; you can redistribute it and/or >>> + modify it under the terms of the GNU Lesser General Public >>> + License as published by the Free Software Foundation; either >>> + version 2.1 of the License, or (at your option) any later version. >>> + >>> + The GNU C Library is distributed in the hope that it will be useful, >>> + but WITHOUT ANY WARRANTY; without even the implied warranty of >>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >>> + Lesser General Public License for more details. >>> + >>> + You should have received a copy of the GNU Lesser General Public >>> + License along with the GNU C Library; if not, see >>> + <http://www.gnu.org/licenses/>. */ >>> + >>> +#define __longjmp __longjmp_cancel >>> +#include <__longjmp.S> >> >> OK. >> >>> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c >>> new file mode 100644 >>> index 0000000000..a53f31e1dd >>> --- /dev/null >>> +++ b/sysdeps/x86/longjmp.c >>> @@ -0,0 +1,45 @@ >>> +/* __libc_siglongjmp for x86. >>> + Copyright (C) 2018 Free Software Foundation, Inc. >>> + This file is part of the GNU C Library. >>> + >>> + The GNU C Library is free software; you can redistribute it and/or >>> + modify it under the terms of the GNU Lesser General Public >>> + License as published by the Free Software Foundation; either >>> + version 2.1 of the License, or (at your option) any later version. >>> + >>> + The GNU C Library is distributed in the hope that it will be useful, >>> + but WITHOUT ANY WARRANTY; without even the implied warranty of >>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >>> + Lesser General Public License for more details. >>> + >>> + You should have received a copy of the GNU Lesser General Public >>> + License along with the GNU C Library; if not, see >>> + <http://www.gnu.org/licenses/>. */ >>> + >>> +#define __libc_longjmp __redirect___libc_longjmp >>> +#include <setjmp/longjmp.c> >>> +#undef __libc_longjmp >>> + >>> +extern void __longjmp_cancel (__jmp_buf __env, int __val) >>> + __attribute__ ((__noreturn__)) attribute_hidden; >>> + >>> +/* Since __libc_longjmp is a private interface for cancellation >>> + implementation in libpthread, there is no need to restore shadow >>> + stack register. */ >>> + >>> +void >>> +__libc_longjmp (sigjmp_buf env, int val) >>> +{ >>> + /* Perform any cleanups needed by the frames being unwound. */ >>> + _longjmp_unwind (env, val); >> >> OK. >> >>> + >>> + if (env[0].__mask_was_saved) >>> + /* Restore the saved signal mask. */ >>> + (void) __sigprocmask (SIG_SETMASK, >>> + (sigset_t *) &env[0].__saved_mask, >>> + (sigset_t *) NULL); >> >> OK. >> >>> + >>> + /* Call the machine-dependent function to restore machine state >>> + without shadow stack. */ >>> + __longjmp_cancel (env[0].__jmpbuf, val ?: 1); >> >> OK. >> >>> +} >>> diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c >>> new file mode 100644 >>> index 0000000000..7eb8651cfe >>> --- /dev/null >>> +++ b/sysdeps/x86/nptl/pt-longjmp.c >>> @@ -0,0 +1,97 @@ >>> +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI. >>> + X86 version. >>> + Copyright (C) 18 Free Software Foundation, Inc. >>> + This file is part of the GNU C Library. >>> + >>> + The GNU C Library is free software; you can redistribute it and/or >>> + modify it under the terms of the GNU Lesser General Public >>> + License as published by the Free Software Foundation; either >>> + version 2.1 of the License, or (at your option) any later version. >>> + >>> + The GNU C Library is distributed in the hope that it will be useful, >>> + but WITHOUT ANY WARRANTY; without even the implied warranty of >>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >>> + Lesser General Public License for more details. >>> + >>> + You should have received a copy of the GNU Lesser General Public >>> + License along with the GNU C Library; if not, see >>> + <http://www.gnu.org/licenses/>. */ >>> + >>> +/* <nptl/descr.h> has >>> + >>> +struct pthread_unwind_buf >>> +{ >>> + struct >>> + { >>> + __jmp_buf jmp_buf; >>> + int mask_was_saved; >>> + } cancel_jmp_buf[1]; >>> + >>> + union >>> + { >>> + void *pad[4]; >>> + struct >>> + { >>> + struct pthread_unwind_buf *prev; >>> + struct _pthread_cleanup_buffer *cleanup; >>> + int canceltype; >>> + } data; >>> + } priv; >>> +}; >>> + >>> + The pad array in struct pthread_unwind_buf is used by setjmp to save >> >> This appears to be SHADOW_STACK_POINTER_OFFSET in subsequent patches. >> >> However on your hjl/cet/master branch it appears that this offset is not >> defined to be *just after* the mask_was_saved? > > sysdeps/unix/sysv/linux/x86/setjmpP.h has > > typedef struct > { > unsigned long int __val[_JUMP_BUF_SIGSET_NWORDS]; > } __jmp_buf_sigset_t; > > typedef union > { > __sigset_t __saved_mask_compat; > struct > { > __jmp_buf_sigset_t __saved_mask; > /* Used for shadow stack pointer. */ > unsigned long int __shadow_stack_pointer; > } __saved; > } __jmpbuf_arch_t; > > __shadow_stack_pointer is placed after __saved_mask, aka mask_was_saved. > >>> + shadow stack register. Assert that size of struct pthread_unwind_buf >>> + is no less than offset of shadow stack pointer plus shadow stack >>> + pointer size. >>> + >>> + NB: setjmp is called in libpthread to save shadow stack register. But >>> + __libc_unwind_longjmp doesn't restore shadow stack register since they >>> + never return after longjmp. */ >>> + >>> +#include <pthreadP.h> >>> +#include <jmp_buf-ssp.h> >>> + >>> +#ifdef __x86_64__ >>> +# define SHADOW_STACK_POINTER_SIZE 8 >>> +#else >>> +# define SHADOW_STACK_POINTER_SIZE 4 >>> +#endif >>> + >>> +_Static_assert ((sizeof (struct pthread_unwind_buf) >>> + >= (SHADOW_STACK_POINTER_OFFSET >>> + + SHADOW_STACK_POINTER_SIZE)), >>> + "size of struct pthread_unwind_buf < " >>> + "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)"); >> >> OK. >> >>> + >>> +#include <shlib-compat.h> >>> + >>> +/* libpthread once had its own longjmp (and siglongjmp alias), though there >>> + was no apparent reason for it. There is no use in having a separate >>> + symbol in libpthread, but the historical ABI requires it. For static >>> + linking, there is no need to provide anything here--the libc version >>> + will be linked in. For shared library ABI compatibility, there must be >>> + longjmp and siglongjmp symbols in libpthread.so. >>> + >>> + With an IFUNC resolver, it would be possible to avoid the indirection, >>> + but the IFUNC resolver might run before the __libc_longjmp symbol has >>> + been relocated, in which case the IFUNC resolver would not be able to >>> + provide the correct address. */ >>> + >>> +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22) >>> + >>> +static void __attribute__ ((noreturn, used)) >>> +longjmp_compat (jmp_buf env, int val) >>> +{ >>> + /* NB: We call __libc_siglongjmp, instead of __libc_longjmp, since >>> + __libc_longjmp is a private interface for cancellation which >>> + doesn't restore shadow stack register. */ >>> + __libc_siglongjmp (env, val); >> >> OK. >> >>> +} >>> + >>> +strong_alias (longjmp_compat, longjmp_alias) >>> +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0); >>> + >>> +strong_alias (longjmp_alias, siglongjmp_alias) >>> +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0); >>> + >>> +#endif >>> -- 2.14.3 >> >> Here is the updated patch. OK for master? I will submit a CET patch set after this patch is merged. Thanks. -- H.J. [-- Attachment #2: 0001-x86-Use-pad-in-pthread_unwind_buf-to-preserve-shadow.patch --] [-- Type: text/x-patch, Size: 11739 bytes --] From a1f8a06212dd3c5ed445fdc4e23424b766d41932 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" <hjl.tools@gmail.com> Date: Sat, 24 Feb 2018 17:23:54 -0800 Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack register The pad array in struct pthread_unwind_buf is used by setjmp to save shadow stack register. We assert that size of struct pthread_unwind_buf is no less than offset of shadow stack pointer + shadow stack pointer size. Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as these with thread cancellation, call setjmp, but never return after __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as __libc_longjmp on x86, doesn't need to restore shadow stack register. __libc_longjmp, which is a private interface for thread cancellation implementation in libpthread, is changed to call __longjmp_cancel, instead of __longjmp. __longjmp_cancel is a new internal function in libc, which is similar to __longjmp, but doesn't restore shadow stack register. The compatibility longjmp and siglongjmp in libpthread.so are changed to call __libc_siglongjmp, instead of __libc_longjmp, so that they will restore shadow stack register. Tested with build-many-glibcs.py. * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous handlers after setjmp. * setjmp/longjmp.c (__libc_longjmp): Don't define alias if defined. * sysdeps/unix/sysv/linux/x86/setjmpP.h (_JUMP_BUF_SIGSET_NSIG): Changed to 97. * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel. * sysdeps/x86/__longjmp_cancel.S: New file. * sysdeps/x86/longjmp.c: Likewise. * sysdeps/x86/nptl/pt-longjmp.c: Likewise. --- nptl/pthread_create.c | 18 +++++-- setjmp/longjmp.c | 2 + sysdeps/unix/sysv/linux/x86/setjmpP.h | 4 +- sysdeps/x86/Makefile | 4 ++ sysdeps/x86/__longjmp_cancel.S | 20 ++++++++ sysdeps/x86/longjmp.c | 45 ++++++++++++++++ sysdeps/x86/nptl/pt-longjmp.c | 97 +++++++++++++++++++++++++++++++++++ 7 files changed, 185 insertions(+), 5 deletions(-) create mode 100644 sysdeps/x86/__longjmp_cancel.S create mode 100644 sysdeps/x86/longjmp.c create mode 100644 sysdeps/x86/nptl/pt-longjmp.c diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c index caaf07c134..8b1f06599d 100644 --- a/nptl/pthread_create.c +++ b/nptl/pthread_create.c @@ -427,12 +427,24 @@ START_THREAD_DEFN compilers without that support we do use setjmp. */ struct pthread_unwind_buf unwind_buf; - /* No previous handlers. */ + int not_first_call; + not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); + + /* No previous handlers. NB: This must be done after setjmp since + the private space in the unwind jump buffer may overlap space + used by setjmp to store extra architecture-specific information + which is never be used by the cancellation-specific + __libc_unwind_longjmp. + + The private space is allowed to overlap because the unwinder never + has to return through any of the jumped-to call frames, and thus + only a minimum amount of saved data need be stored, and for example, + need not include the process signal mask information. This is all + an optimization to reduce stack usage when pushing cancellation + handlers. */ unwind_buf.priv.data.prev = NULL; unwind_buf.priv.data.cleanup = NULL; - int not_first_call; - not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); if (__glibc_likely (! not_first_call)) { /* Store the new cleanup handler info. */ diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c index a2a7065a85..453889e103 100644 --- a/setjmp/longjmp.c +++ b/setjmp/longjmp.c @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val) } #ifndef __libc_siglongjmp +# ifndef __libc_longjmp /* __libc_longjmp is a private interface for cancellation implementation in libpthread. */ strong_alias (__libc_siglongjmp, __libc_longjmp) +# endif weak_alias (__libc_siglongjmp, _longjmp) weak_alias (__libc_siglongjmp, longjmp) weak_alias (__libc_siglongjmp, siglongjmp) diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h index c0ed767a0d..90a6bbcf32 100644 --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h @@ -22,8 +22,8 @@ #include <bits/types/__sigset_t.h> /* The biggest signal number + 1. As of kernel 4.14, x86 _NSIG is 64. - Define it to 513 to leave some rooms for future use. */ -#define _JUMP_BUF_SIGSET_NSIG 513 + Define it to 97 to leave some rooms for future use. */ +#define _JUMP_BUF_SIGSET_NSIG 97 /* Number of longs to hold all signals. */ #define _JUMP_BUF_SIGSET_NWORDS \ ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile index 0d0326c21a..d25d6f0ae4 100644 --- a/sysdeps/x86/Makefile +++ b/sysdeps/x86/Makefile @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features tests += tst-get-cpu-features tests-static += tst-get-cpu-features-static endif + +ifeq ($(subdir),setjmp) +sysdep_routines += __longjmp_cancel +endif diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S new file mode 100644 index 0000000000..b57dbfa376 --- /dev/null +++ b/sysdeps/x86/__longjmp_cancel.S @@ -0,0 +1,20 @@ +/* __longjmp_cancel for x86. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#define __longjmp __longjmp_cancel +#include <__longjmp.S> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c new file mode 100644 index 0000000000..a53f31e1dd --- /dev/null +++ b/sysdeps/x86/longjmp.c @@ -0,0 +1,45 @@ +/* __libc_siglongjmp for x86. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#define __libc_longjmp __redirect___libc_longjmp +#include <setjmp/longjmp.c> +#undef __libc_longjmp + +extern void __longjmp_cancel (__jmp_buf __env, int __val) + __attribute__ ((__noreturn__)) attribute_hidden; + +/* Since __libc_longjmp is a private interface for cancellation + implementation in libpthread, there is no need to restore shadow + stack register. */ + +void +__libc_longjmp (sigjmp_buf env, int val) +{ + /* Perform any cleanups needed by the frames being unwound. */ + _longjmp_unwind (env, val); + + if (env[0].__mask_was_saved) + /* Restore the saved signal mask. */ + (void) __sigprocmask (SIG_SETMASK, + (sigset_t *) &env[0].__saved_mask, + (sigset_t *) NULL); + + /* Call the machine-dependent function to restore machine state + without shadow stack. */ + __longjmp_cancel (env[0].__jmpbuf, val ?: 1); +} diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c new file mode 100644 index 0000000000..7eb8651cfe --- /dev/null +++ b/sysdeps/x86/nptl/pt-longjmp.c @@ -0,0 +1,97 @@ +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI. + X86 version. + Copyright (C) 18 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +/* <nptl/descr.h> has + +struct pthread_unwind_buf +{ + struct + { + __jmp_buf jmp_buf; + int mask_was_saved; + } cancel_jmp_buf[1]; + + union + { + void *pad[4]; + struct + { + struct pthread_unwind_buf *prev; + struct _pthread_cleanup_buffer *cleanup; + int canceltype; + } data; + } priv; +}; + + The pad array in struct pthread_unwind_buf is used by setjmp to save + shadow stack register. Assert that size of struct pthread_unwind_buf + is no less than offset of shadow stack pointer plus shadow stack + pointer size. + + NB: setjmp is called in libpthread to save shadow stack register. But + __libc_unwind_longjmp doesn't restore shadow stack register since they + never return after longjmp. */ + +#include <pthreadP.h> +#include <jmp_buf-ssp.h> + +#ifdef __x86_64__ +# define SHADOW_STACK_POINTER_SIZE 8 +#else +# define SHADOW_STACK_POINTER_SIZE 4 +#endif + +_Static_assert ((sizeof (struct pthread_unwind_buf) + >= (SHADOW_STACK_POINTER_OFFSET + + SHADOW_STACK_POINTER_SIZE)), + "size of struct pthread_unwind_buf < " + "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)"); + +#include <shlib-compat.h> + +/* libpthread once had its own longjmp (and siglongjmp alias), though there + was no apparent reason for it. There is no use in having a separate + symbol in libpthread, but the historical ABI requires it. For static + linking, there is no need to provide anything here--the libc version + will be linked in. For shared library ABI compatibility, there must be + longjmp and siglongjmp symbols in libpthread.so. + + With an IFUNC resolver, it would be possible to avoid the indirection, + but the IFUNC resolver might run before the __libc_longjmp symbol has + been relocated, in which case the IFUNC resolver would not be able to + provide the correct address. */ + +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22) + +static void __attribute__ ((noreturn, used)) +longjmp_compat (jmp_buf env, int val) +{ + /* NB: We call __libc_siglongjmp, instead of __libc_longjmp, since + __libc_longjmp is a private interface for cancellation which + doesn't restore shadow stack register. */ + __libc_siglongjmp (env, val); +} + +strong_alias (longjmp_compat, longjmp_alias) +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0); + +strong_alias (longjmp_alias, siglongjmp_alias) +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0); + +#endif -- 2.14.3 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack register 2018-04-06 20:26 ` H.J. Lu @ 2018-04-12 21:36 ` Carlos O'Donell 0 siblings, 0 replies; 13+ messages in thread From: Carlos O'Donell @ 2018-04-12 21:36 UTC (permalink / raw) To: H.J. Lu; +Cc: Florian Weimer, Joseph Myers, Tsimbalist, Igor V, GNU C Library On 04/06/2018 03:26 PM, H.J. Lu wrote: > Here is the updated patch. OK for master? > > I will submit a CET patch set after this patch is merged. > > Thanks. > I will wait for a conclusion around the incorrect _JUMP_BUF_SIGSET_NWORDS computation. -- Cheers, Carlos. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack register 2018-04-06 12:59 ` H.J. Lu 2018-04-06 20:26 ` H.J. Lu @ 2018-04-12 21:36 ` Carlos O'Donell 2018-04-12 23:50 ` H.J. Lu 2018-04-17 20:03 ` [PATCH] " Joseph Myers 2 siblings, 1 reply; 13+ messages in thread From: Carlos O'Donell @ 2018-04-12 21:36 UTC (permalink / raw) To: H.J. Lu; +Cc: Florian Weimer, Joseph Myers, Tsimbalist, Igor V, GNU C Library On 04/06/2018 07:59 AM, H.J. Lu wrote: > On Thu, Apr 5, 2018 at 9:46 PM, Carlos O'Donell <carlos@redhat.com> wrote: >> On 03/30/2018 12:41 PM, H.J. Lu wrote: >>> On Thu, Mar 29, 2018 at 1:20 PM, Florian Weimer <fw@deneb.enyo.de> wrote: >>>> * H. J. Lu: >>>> >>>>> On Thu, Mar 29, 2018 at 1:15 PM, Florian Weimer <fw@deneb.enyo.de> wrote: >>>>>> * H. J. Lu: >>>>>> >>>>>>> You need to make a choice. You either don't introduce a new symbol >>>>>>> version or don't save shadow stack for thread cancellation. You >>>>>>> can't have both. >>>>>> I don't understand. We have room to save the shadow stack pointer in >>>>>> the existing struct. >>>>> No, we don't have room in struct pthread_unwind_buf: >>>>> >>>>> Note: There is an unused pointer space in pthread_unwind_buf_data. But >>>>> it isn't suitable for saving and restoring shadow stack register since >>>>> x32 is a 64-bit process with 32-bit software pointer and kernel may >>>>> place x32 shadow stack above 4GB. We need to save and restore 64-bit >>>>> shadow stack register for x32. >>>> We have for void * fields. They are subsequently overwritten by >>>> __pthread_register_cancel. But __sigsetjmp can write to them first >>>> without causing any harm. We just need a private __longjmp_cancel >>>> that doesn't restore the shadow stack pointer. >>> Here is the patch which does that. Any comments? >> >> OK, so I have reviewed https://github.com/hjl-tools/glibc/ hjl/cet/master, >> please confirm that this is the correct branch of the required implementation >> for CET. It really helps to review the rest of the patch set, you should be >> preparing this as a patch set instead of having it reviewed one-at-a-time. >> This issue was already raised in the thread with Zack. > > Thanks for your feedbacks. > >> Architecture: >> >> * We avoid a "flag day" with feature_1 TCB flag to switch to a new ABI, >> which we have discussed is a fragile process which should be avoided if >> a supportable alternative solution exists. >> >> * We avoid versioned symbols, this makes CET backportable, and this has a >> bigger benefit for long-term stable distributions. >> >> * A key design problem has been that cancellation jump buffers within glibc >> are truncated to optimize on-stack size, and this means that setjmp could >> write beyond the structure because setjmp now tries to save the shadowstack >> pointer into space that the cancellation jump buffer did not allocate. >> For the record this optimization seems premature and I'm sad we did it, this >> is a lesson we should learn from. >> >> * We have all agreed to the following concepts: >> >> * The cancellation process, in particular the unwinder, never returns from >> any of the functions we call, it just keeps calling into the unwinder to >> jump to the next unwound cancellation function all the way to the thread >> start routine. Therefore because we never return from one of these functions >> we never need to restore the shadow stack, and consequently wherever it is >> stored in the cancellation jump buffer can be overwritten if we need the >> space (it's a dead store). >> >> * The corollary to this is that function calls made from cancellation handlers >> will continue to advance the shadowstack from the deepest point at which >> cancellation is initiated from. This means that the depth of the shadowstack >> doesn't match the depth of the real stack while we are unwinding. I don't >> know if this will have consequences on analysis tooling or not, or debug >> tools during unwinding. It's a fairly advanced situation and corner case, >> and restoring the shadowstack is not useful becuase we don't need it and >> simplifies the implementation. >> >> * The cancellation jump buffer has private data used for chaining the cancel >> jump buffers together such that the custom unwinder can follow them and >> call them in sequence. This space constitutes 4 void *'s which is space >> that setjmp can write to, because we will just overwrite it when we register >> the cancel handler. >> >> * If the new shadowstack-enabled setjmp stores the shadowstack pointer into >> the space taken by the 4 void*'s then we won't overflow the stack, and we >> don't need to change the layout of the cancellation jump buffer. The 4 void*'s >> are sufficient, even for x32 to write a 64-bit shadow stack address. >> >> * After fixing the cancellation jump buffers the following work needs to be reviewed: >> >> * Add feature_1 in tcb head to track CET status and make it easily available >> to runtime for checking. >> >> * Save and restore shadowstack in setjmp/longjmp. >> >> * Add CET support to ld.so et. al. and track runtime status. >> >> * Adjust vfork for shadow stack usage. >> >> * Add ENDBR or NOTRACK where required in assembly. >> >> * CET and makecontext incompatible. >> - Probably need to discuss which default is appropriate. >> - Should the user get CET automatically disabled in makecontext() et. al. silently? >> - Should your current solution, which is to error out during the build, and require >> flag changes, be the default? This forces the user to review the security for their >> application. > > I'd like to reserve 4 slots in ucontext for shadow stack: > > https://github.com/hjl-tools/glibc/commit/9bf6aefa8fb45f8df140d42ce9cf890bb24076e1 > > It should be binary backward compatible. I will investigate if there is a way > to support shadow stack with existing API. Otherwise, we need to add a new > API for ucontext functions with shadow stack. Could you explain in detail how this is binary backwards compatible? Is the assumption that if you extend ucontext_t, that the kernel will just write more to the stack, and users who accesss it via the void* in a signal handler setup via sigaction + SA_SIGINFO, will not read past the size they expect? How is the shadow stack information moved between a getcontext -> setcontext set of API calls? The user ucontext_t in existing binaries is too small to hold the shadow stack? Would we again have a "flag day" where CET enablement must be coordinated with the definition of a new ucontext_t? >> * prctl for CET. > > We have been experimenting different approaches to get the best implementation. > I am expecting that this patch may change as we collect more data. > OK. >> * The work to review after this patch appears to be less contentious in terms of >> the kinds of changes that are required. Most of the changes are internal details >> of enabling CET and not ABI details, with the exception of the possible pain we >> might cause with makecontext() being unsupported and what default position to take >> there. >> >> Design: >> >> * Overall the implementation looks exactly how I might expect it to look, but some >> of the math that places the shadowstack pointer appears to need either commenting >> or fixing because I don't understand it. You need to make it easy for me to see >> that we have placed the shadowstack pointer into the 4 pad words. >> >> Details: >> >> * One comment needs filling out a bit more, noted below. >> >>> 0001-x86-Use-pad-in-pthread_unwind_buf-to-preserve-shadow.patch >>> >>> >>> From f222537447f5ec879427f318b7c0396362b7453a Mon Sep 17 00:00:00 2001 >>> From: "H.J. Lu" <hjl.tools@gmail.com> >>> Date: Sat, 24 Feb 2018 17:23:54 -0800 >>> Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack >>> register >>> >>> The pad array in struct pthread_unwind_buf is used by setjmp to save >>> shadow stack register. We assert that size of struct pthread_unwind_buf >>> is no less than offset of shadow stack pointer + shadow stack pointer >>> size. >>> >> >> OK. >> >>> Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as >>> these with thread cancellation, call setjmp, but never return after >>> __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as >>> __libc_longjmp on x86, doesn't need to restore shadow stack register. >> >> OK. >> >>> __libc_longjmp, which is a private interface for thread cancellation >>> implementation in libpthread, is changed to call __longjmp_cancel, >>> instead of __longjmp. __longjmp_cancel is a new internal function >>> in libc, which is similar to __longjmp, but doesn't restore shadow >>> stack register. >> >> OK. Good. I like the use of a __longjmp_cancel name to call out what's >> going on in the API (clear semantics). >> >>> >>> The compatibility longjmp and siglongjmp in libpthread.so are changed >>> to call __libc_siglongjmp, instead of __libc_longjmp, so that they will >>> restore shadow stack register. >> >> OK. >> >>> >>> Tested with build-many-glibcs.py. >>> >>> * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous >>> handlers after setjmp. >>> * setjmp/longjmp.c (__libc_longjmp): Don't define alias if >>> defined. >>> * sysdeps/unix/sysv/linux/x86/setjmpP.h (_JUMP_BUF_SIGSET_NSIG): >>> Changed to 97. >>> * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel. >>> * sysdeps/x86/__longjmp_cancel.S: New file. >>> * sysdeps/x86/longjmp.c: Likewise. >>> * sysdeps/x86/nptl/pt-longjmp.c: Likewise. >> >> This looks much better. >> >>> --- >>> nptl/pthread_create.c | 9 ++-- >>> setjmp/longjmp.c | 2 + >>> sysdeps/unix/sysv/linux/x86/setjmpP.h | 4 +- >>> sysdeps/x86/Makefile | 4 ++ >>> sysdeps/x86/__longjmp_cancel.S | 20 ++++++++ >>> sysdeps/x86/longjmp.c | 45 ++++++++++++++++ >>> sysdeps/x86/nptl/pt-longjmp.c | 97 +++++++++++++++++++++++++++++++++++ >>> 7 files changed, 176 insertions(+), 5 deletions(-) >>> create mode 100644 sysdeps/x86/__longjmp_cancel.S >>> create mode 100644 sysdeps/x86/longjmp.c >>> create mode 100644 sysdeps/x86/nptl/pt-longjmp.c >>> >>> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c >>> index caaf07c134..1c5b3780c6 100644 >>> --- a/nptl/pthread_create.c >>> +++ b/nptl/pthread_create.c >>> @@ -427,12 +427,15 @@ START_THREAD_DEFN >>> compilers without that support we do use setjmp. */ >>> struct pthread_unwind_buf unwind_buf; >>> >>> - /* No previous handlers. */ >>> + int not_first_call; >>> + not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); >>> + >>> + /* No previous handlers. NB: This must be done after setjmp since >>> + the same space may be used by setjmp to store extra data which >>> + should never be used by __libc_unwind_longjmp. */ >> >> Suggest: >> ~~~ >> No previous handlers. NB: This must be done after setjmp since >> the private space in the unwind jump buffer may overlap space >> used by setjmp to store extra architecture-specific information >> which is never be used by the cancellation-specific >> __libc_unwind_longjmp. >> >> The private space is allowed to overlap because the unwinder never >> has to return through any of the jumped-to call frames, and thus >> only a minimum amount of saved data need be stored, and for example, >> need not include the process signal mask information. This is all >> an optimization to reduce stack usage when pushing cancellation >> handlers. >> ~~~ > > Will fix it. > >>> unwind_buf.priv.data.prev = NULL; >>> unwind_buf.priv.data.cleanup = NULL; >>> >>> - int not_first_call; >>> - not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); >> >> OK. >> >>> if (__glibc_likely (! not_first_call)) >>> { >>> /* Store the new cleanup handler info. */ >>> diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c >>> index a2a7065a85..453889e103 100644 >>> --- a/setjmp/longjmp.c >>> +++ b/setjmp/longjmp.c >>> @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val) >>> } >>> >>> #ifndef __libc_siglongjmp >>> +# ifndef __libc_longjmp >>> /* __libc_longjmp is a private interface for cancellation implementation >>> in libpthread. */ >>> strong_alias (__libc_siglongjmp, __libc_longjmp) >>> +# endif >> >> OK. >> >>> weak_alias (__libc_siglongjmp, _longjmp) >>> weak_alias (__libc_siglongjmp, longjmp) >>> weak_alias (__libc_siglongjmp, siglongjmp) >>> diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h >>> index c0ed767a0d..90a6bbcf32 100644 >>> --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h >>> +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h >>> @@ -22,8 +22,8 @@ >>> #include <bits/types/__sigset_t.h> >>> >>> /* The biggest signal number + 1. As of kernel 4.14, x86 _NSIG is 64. >>> - Define it to 513 to leave some rooms for future use. */ >>> -#define _JUMP_BUF_SIGSET_NSIG 513 >>> + Define it to 97 to leave some rooms for future use. */ >> >> OK. >> >>> +#define _JUMP_BUF_SIGSET_NSIG 97 >> >> Please provide proof in the way of a comment or rewriting this constant >> to show that it places the shadow stack pointer on both x86_64 and x32 >> into the range of the private pad. > > sysdeps/x86/nptl/pt-longjmp.c has > > _Static_assert ((sizeof (struct pthread_unwind_buf) >> = (SHADOW_STACK_POINTER_OFFSET > + SHADOW_STACK_POINTER_SIZE)), > "size of struct pthread_unwind_buf < " > "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)"); > > If shadow stack pointer is saved in the offset bigger than the size of > struct pthread_unwind_buf, assert will trigger at compile-time. > Thanks, I'll review this in the patch you posted. >> Also, from commit f33632ccd1dec3217583fcfdd965afb62954203c, >> where did this math come from? >> >> ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) >> >> Why the +7? > > _JUMP_BUF_SIGSET_NSIG is the biggest signal number + 1. Agreed. > _JUMP_BUF_SIGSET_NSIG - 1 gives the biggest signal number. Agreed. > _JUMP_BUF_SIGSET_NSIG - 1 + 7 rounds up to the number of bytes > which are needed to store the biggest signal number. It does not. Result # of signals # of bits Current # Expected # FAIL 1 64 0 1 FAIL 2 64 0 1 FAIL 3 64 0 1 FAIL 4 64 0 1 FAIL 5 64 0 1 FAIL 6 64 0 1 FAIL 7 64 0 1 FAIL 8 64 0 1 FAIL 9 64 0 1 FAIL 10 64 0 1 FAIL 11 64 0 1 FAIL 12 64 0 1 FAIL 13 64 0 1 FAIL 14 64 0 1 FAIL 15 64 0 1 FAIL 16 64 0 1 FAIL 17 64 0 1 FAIL 18 64 0 1 FAIL 19 64 0 1 FAIL 20 64 0 1 FAIL 21 64 0 1 FAIL 22 64 0 1 FAIL 23 64 0 1 FAIL 24 64 0 1 FAIL 25 64 0 1 FAIL 26 64 0 1 FAIL 27 64 0 1 FAIL 28 64 0 1 FAIL 29 64 0 1 FAIL 30 64 0 1 FAIL 31 64 0 1 FAIL 32 64 0 1 FAIL 33 64 0 1 FAIL 34 64 0 1 FAIL 35 64 0 1 FAIL 36 64 0 1 FAIL 37 64 0 1 FAIL 38 64 0 1 FAIL 39 64 0 1 FAIL 40 64 0 1 FAIL 41 64 0 1 FAIL 42 64 0 1 FAIL 43 64 0 1 FAIL 44 64 0 1 FAIL 45 64 0 1 FAIL 46 64 0 1 FAIL 47 64 0 1 FAIL 48 64 0 1 FAIL 49 64 0 1 FAIL 50 64 0 1 FAIL 51 64 0 1 FAIL 52 64 0 1 FAIL 53 64 0 1 FAIL 54 64 0 1 FAIL 55 64 0 1 FAIL 56 64 0 1 FAIL 57 64 0 1 PASS 58 64 1 1 PASS 59 64 1 1 PASS 60 64 1 1 PASS 61 64 1 1 PASS 62 64 1 1 PASS 63 64 1 1 PASS 64 64 1 1 FAIL 65 128 1 2 FAIL 66 128 1 2 FAIL 67 128 1 2 FAIL 68 128 1 2 FAIL 69 128 1 2 FAIL 70 128 1 2 FAIL 71 128 1 2 FAIL 72 128 1 2 FAIL 73 128 1 2 FAIL 74 128 1 2 FAIL 75 128 1 2 FAIL 76 128 1 2 FAIL 77 128 1 2 FAIL 78 128 1 2 FAIL 79 128 1 2 FAIL 80 128 1 2 FAIL 81 128 1 2 FAIL 82 128 1 2 FAIL 83 128 1 2 FAIL 84 128 1 2 FAIL 85 128 1 2 FAIL 86 128 1 2 FAIL 87 128 1 2 FAIL 88 128 1 2 FAIL 89 128 1 2 FAIL 90 128 1 2 FAIL 91 128 1 2 FAIL 92 128 1 2 FAIL 93 128 1 2 FAIL 94 128 1 2 FAIL 95 128 1 2 FAIL 96 128 1 2 FAIL 97 128 1 2 FAIL 98 128 1 2 FAIL 99 128 1 2 FAIL 100 128 1 2 FAIL 101 128 1 2 FAIL 102 128 1 2 FAIL 103 128 1 2 FAIL 104 128 1 2 FAIL 105 128 1 2 FAIL 106 128 1 2 FAIL 107 128 1 2 FAIL 108 128 1 2 FAIL 109 128 1 2 FAIL 110 128 1 2 FAIL 111 128 1 2 FAIL 112 128 1 2 FAIL 113 128 1 2 FAIL 114 128 1 2 FAIL 115 128 1 2 FAIL 116 128 1 2 FAIL 117 128 1 2 FAIL 118 128 1 2 FAIL 119 128 1 2 FAIL 120 128 1 2 FAIL 121 128 1 2 PASS 122 128 2 2 PASS 123 128 2 2 PASS 124 128 2 2 PASS 125 128 2 2 PASS 126 128 2 2 PASS 127 128 2 2 PASS 128 128 2 2 FAIL 129 192 2 3 FAIL 130 192 2 3 FAIL 131 192 2 3 FAIL 132 192 2 3 FAIL 133 192 2 3 FAIL 134 192 2 3 FAIL 135 192 2 3 FAIL 136 192 2 3 FAIL 137 192 2 3 FAIL 138 192 2 3 FAIL 139 192 2 3 FAIL 140 192 2 3 FAIL 141 192 2 3 FAIL 142 192 2 3 FAIL 143 192 2 3 FAIL 144 192 2 3 FAIL 145 192 2 3 FAIL 146 192 2 3 FAIL 147 192 2 3 FAIL 148 192 2 3 FAIL 149 192 2 3 FAIL 150 192 2 3 FAIL 151 192 2 3 FAIL 152 192 2 3 FAIL 153 192 2 3 FAIL 154 192 2 3 FAIL 155 192 2 3 FAIL 156 192 2 3 FAIL 157 192 2 3 FAIL 158 192 2 3 FAIL 159 192 2 3 FAIL 160 192 2 3 FAIL 161 192 2 3 FAIL 162 192 2 3 FAIL 163 192 2 3 FAIL 164 192 2 3 FAIL 165 192 2 3 FAIL 166 192 2 3 FAIL 167 192 2 3 FAIL 168 192 2 3 FAIL 169 192 2 3 FAIL 170 192 2 3 FAIL 171 192 2 3 FAIL 172 192 2 3 FAIL 173 192 2 3 FAIL 174 192 2 3 FAIL 175 192 2 3 FAIL 176 192 2 3 FAIL 177 192 2 3 FAIL 178 192 2 3 FAIL 179 192 2 3 FAIL 180 192 2 3 FAIL 181 192 2 3 FAIL 182 192 2 3 FAIL 183 192 2 3 FAIL 184 192 2 3 FAIL 185 192 2 3 PASS 186 192 3 3 PASS 187 192 3 3 PASS 188 192 3 3 PASS 189 192 3 3 PASS 190 192 3 3 PASS 191 192 3 3 PASS 192 192 3 3 FAIL 193 256 3 4 FAIL 194 256 3 4 FAIL 195 256 3 4 FAIL 196 256 3 4 FAIL 197 256 3 4 FAIL 198 256 3 4 FAIL 199 256 3 4 FAIL 200 256 3 4 FAIL 201 256 3 4 FAIL 202 256 3 4 FAIL 203 256 3 4 FAIL 204 256 3 4 FAIL 205 256 3 4 FAIL 206 256 3 4 FAIL 207 256 3 4 FAIL 208 256 3 4 FAIL 209 256 3 4 FAIL 210 256 3 4 FAIL 211 256 3 4 FAIL 212 256 3 4 FAIL 213 256 3 4 FAIL 214 256 3 4 FAIL 215 256 3 4 FAIL 216 256 3 4 FAIL 217 256 3 4 FAIL 218 256 3 4 FAIL 219 256 3 4 FAIL 220 256 3 4 FAIL 221 256 3 4 FAIL 222 256 3 4 FAIL 223 256 3 4 FAIL 224 256 3 4 FAIL 225 256 3 4 FAIL 226 256 3 4 FAIL 227 256 3 4 FAIL 228 256 3 4 FAIL 229 256 3 4 FAIL 230 256 3 4 FAIL 231 256 3 4 FAIL 232 256 3 4 FAIL 233 256 3 4 FAIL 234 256 3 4 FAIL 235 256 3 4 FAIL 236 256 3 4 FAIL 237 256 3 4 FAIL 238 256 3 4 FAIL 239 256 3 4 FAIL 240 256 3 4 FAIL 241 256 3 4 FAIL 242 256 3 4 FAIL 243 256 3 4 FAIL 244 256 3 4 FAIL 245 256 3 4 FAIL 246 256 3 4 FAIL 247 256 3 4 FAIL 248 256 3 4 FAIL 249 256 3 4 PASS 250 256 4 4 PASS 251 256 4 4 PASS 252 256 4 4 PASS 253 256 4 4 PASS 254 256 4 4 PASS 255 256 4 4 PASS 256 256 4 4 FAIL 257 320 4 5 FAIL 258 320 4 5 FAIL 259 320 4 5 FAIL 260 320 4 5 FAIL 261 320 4 5 FAIL 262 320 4 5 FAIL 263 320 4 5 FAIL 264 320 4 5 FAIL 265 320 4 5 FAIL 266 320 4 5 FAIL 267 320 4 5 FAIL 268 320 4 5 FAIL 269 320 4 5 FAIL 270 320 4 5 FAIL 271 320 4 5 FAIL 272 320 4 5 FAIL 273 320 4 5 FAIL 274 320 4 5 FAIL 275 320 4 5 FAIL 276 320 4 5 FAIL 277 320 4 5 FAIL 278 320 4 5 FAIL 279 320 4 5 FAIL 280 320 4 5 FAIL 281 320 4 5 FAIL 282 320 4 5 FAIL 283 320 4 5 FAIL 284 320 4 5 FAIL 285 320 4 5 FAIL 286 320 4 5 FAIL 287 320 4 5 FAIL 288 320 4 5 FAIL 289 320 4 5 FAIL 290 320 4 5 FAIL 291 320 4 5 FAIL 292 320 4 5 FAIL 293 320 4 5 FAIL 294 320 4 5 FAIL 295 320 4 5 FAIL 296 320 4 5 FAIL 297 320 4 5 FAIL 298 320 4 5 FAIL 299 320 4 5 FAIL 300 320 4 5 FAIL 301 320 4 5 FAIL 302 320 4 5 FAIL 303 320 4 5 FAIL 304 320 4 5 FAIL 305 320 4 5 FAIL 306 320 4 5 FAIL 307 320 4 5 FAIL 308 320 4 5 FAIL 309 320 4 5 FAIL 310 320 4 5 FAIL 311 320 4 5 FAIL 312 320 4 5 FAIL 313 320 4 5 PASS 314 320 5 5 PASS 315 320 5 5 PASS 316 320 5 5 PASS 317 320 5 5 PASS 318 320 5 5 PASS 319 320 5 5 PASS 320 320 5 5 FAIL 321 384 5 6 FAIL 322 384 5 6 FAIL 323 384 5 6 FAIL 324 384 5 6 FAIL 325 384 5 6 FAIL 326 384 5 6 FAIL 327 384 5 6 FAIL 328 384 5 6 FAIL 329 384 5 6 FAIL 330 384 5 6 FAIL 331 384 5 6 FAIL 332 384 5 6 FAIL 333 384 5 6 FAIL 334 384 5 6 FAIL 335 384 5 6 FAIL 336 384 5 6 FAIL 337 384 5 6 FAIL 338 384 5 6 FAIL 339 384 5 6 FAIL 340 384 5 6 FAIL 341 384 5 6 FAIL 342 384 5 6 FAIL 343 384 5 6 FAIL 344 384 5 6 FAIL 345 384 5 6 FAIL 346 384 5 6 FAIL 347 384 5 6 FAIL 348 384 5 6 FAIL 349 384 5 6 FAIL 350 384 5 6 FAIL 351 384 5 6 FAIL 352 384 5 6 FAIL 353 384 5 6 FAIL 354 384 5 6 FAIL 355 384 5 6 FAIL 356 384 5 6 FAIL 357 384 5 6 FAIL 358 384 5 6 FAIL 359 384 5 6 FAIL 360 384 5 6 FAIL 361 384 5 6 FAIL 362 384 5 6 FAIL 363 384 5 6 FAIL 364 384 5 6 FAIL 365 384 5 6 FAIL 366 384 5 6 FAIL 367 384 5 6 FAIL 368 384 5 6 FAIL 369 384 5 6 FAIL 370 384 5 6 FAIL 371 384 5 6 FAIL 372 384 5 6 FAIL 373 384 5 6 FAIL 374 384 5 6 FAIL 375 384 5 6 FAIL 376 384 5 6 FAIL 377 384 5 6 PASS 378 384 6 6 PASS 379 384 6 6 PASS 380 384 6 6 PASS 381 384 6 6 PASS 382 384 6 6 PASS 383 384 6 6 PASS 384 384 6 6 FAIL 385 448 6 7 FAIL 386 448 6 7 FAIL 387 448 6 7 FAIL 388 448 6 7 FAIL 389 448 6 7 FAIL 390 448 6 7 FAIL 391 448 6 7 FAIL 392 448 6 7 FAIL 393 448 6 7 FAIL 394 448 6 7 FAIL 395 448 6 7 FAIL 396 448 6 7 FAIL 397 448 6 7 FAIL 398 448 6 7 FAIL 399 448 6 7 FAIL 400 448 6 7 FAIL 401 448 6 7 FAIL 402 448 6 7 FAIL 403 448 6 7 FAIL 404 448 6 7 FAIL 405 448 6 7 FAIL 406 448 6 7 FAIL 407 448 6 7 FAIL 408 448 6 7 FAIL 409 448 6 7 FAIL 410 448 6 7 FAIL 411 448 6 7 FAIL 412 448 6 7 FAIL 413 448 6 7 FAIL 414 448 6 7 FAIL 415 448 6 7 FAIL 416 448 6 7 FAIL 417 448 6 7 FAIL 418 448 6 7 FAIL 419 448 6 7 FAIL 420 448 6 7 FAIL 421 448 6 7 FAIL 422 448 6 7 FAIL 423 448 6 7 FAIL 424 448 6 7 FAIL 425 448 6 7 FAIL 426 448 6 7 FAIL 427 448 6 7 FAIL 428 448 6 7 FAIL 429 448 6 7 FAIL 430 448 6 7 FAIL 431 448 6 7 FAIL 432 448 6 7 FAIL 433 448 6 7 FAIL 434 448 6 7 FAIL 435 448 6 7 FAIL 436 448 6 7 FAIL 437 448 6 7 FAIL 438 448 6 7 FAIL 439 448 6 7 FAIL 440 448 6 7 FAIL 441 448 6 7 PASS 442 448 7 7 PASS 443 448 7 7 PASS 444 448 7 7 PASS 445 448 7 7 PASS 446 448 7 7 PASS 447 448 7 7 PASS 448 448 7 7 FAIL 449 512 7 8 FAIL 450 512 7 8 FAIL 451 512 7 8 FAIL 452 512 7 8 FAIL 453 512 7 8 FAIL 454 512 7 8 FAIL 455 512 7 8 FAIL 456 512 7 8 FAIL 457 512 7 8 FAIL 458 512 7 8 FAIL 459 512 7 8 FAIL 460 512 7 8 FAIL 461 512 7 8 FAIL 462 512 7 8 FAIL 463 512 7 8 FAIL 464 512 7 8 FAIL 465 512 7 8 FAIL 466 512 7 8 FAIL 467 512 7 8 FAIL 468 512 7 8 FAIL 469 512 7 8 FAIL 470 512 7 8 FAIL 471 512 7 8 FAIL 472 512 7 8 FAIL 473 512 7 8 FAIL 474 512 7 8 FAIL 475 512 7 8 FAIL 476 512 7 8 FAIL 477 512 7 8 FAIL 478 512 7 8 FAIL 479 512 7 8 FAIL 480 512 7 8 FAIL 481 512 7 8 FAIL 482 512 7 8 FAIL 483 512 7 8 FAIL 484 512 7 8 FAIL 485 512 7 8 FAIL 486 512 7 8 FAIL 487 512 7 8 FAIL 488 512 7 8 FAIL 489 512 7 8 FAIL 490 512 7 8 FAIL 491 512 7 8 FAIL 492 512 7 8 FAIL 493 512 7 8 FAIL 494 512 7 8 FAIL 495 512 7 8 FAIL 496 512 7 8 FAIL 497 512 7 8 FAIL 498 512 7 8 FAIL 499 512 7 8 FAIL 500 512 7 8 FAIL 501 512 7 8 FAIL 502 512 7 8 FAIL 503 512 7 8 FAIL 504 512 7 8 FAIL 505 512 7 8 PASS 506 512 8 8 PASS 507 512 8 8 PASS 508 512 8 8 PASS 509 512 8 8 PASS 510 512 8 8 PASS 511 512 8 8 PASS 512 512 8 8 Luckily for 512 signals (513) the math works out. For 96 signals it does not. (96 - 1 + 7) = 102 102 / 64 = 1 That's only a signal word, that only supports 64 signals, not 96. Why doesn't the static assert trigger? Because you a < the size of the pthread_unwind_buf, and too small actually, writing into other parts of the buffer! I would expect us to use something like this: diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h index c0ed767a0d..6e1e8f901c 100644 --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h @@ -20,13 +20,15 @@ #define _SETJMPP_H 1 #include <bits/types/__sigset_t.h> +#include <libc-pointer-arith.h> -/* The biggest signal number + 1. As of kernel 4.14, x86 _NSIG is 64. - Define it to 513 to leave some rooms for future use. */ -#define _JUMP_BUF_SIGSET_NSIG 513 +/* As of kernel 4.14, x86 _NSIG is 64. + Define it to 512 to leave some rooms for future use. */ +#define _JUMP_BUF_SIGSET_NSIG 512 /* Number of longs to hold all signals. */ #define _JUMP_BUF_SIGSET_NWORDS \ - ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) + (ALIGN_UP(_JUMP_BUF_SIGSET_NSIG, (8 * sizeof (unsigned long int))) \ + / (8 * sizeof (unsigned long int))) typedef struct { --- ... but with the size you need and explain *why* it's 96. You need a comment like this: /* The layout looks like this: - N words for this. - N words for that. - N words for shadow stack. Total 96 signals. */ Align the number of signals up to multiple of signals you can store in a hardware word, and then figure out how many words that takes up. Please review my math. >>> /* Number of longs to hold all signals. */ >>> #define _JUMP_BUF_SIGSET_NWORDS \ >>> ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) >>> diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile >>> index 0d0326c21a..d25d6f0ae4 100644 >>> --- a/sysdeps/x86/Makefile >>> +++ b/sysdeps/x86/Makefile >>> @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features >>> tests += tst-get-cpu-features >>> tests-static += tst-get-cpu-features-static >>> endif >>> + >>> +ifeq ($(subdir),setjmp) >>> +sysdep_routines += __longjmp_cancel >>> +endif >> >> OK. >> >>> diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S >>> new file mode 100644 >>> index 0000000000..b57dbfa376 >>> --- /dev/null >>> +++ b/sysdeps/x86/__longjmp_cancel.S >>> @@ -0,0 +1,20 @@ >>> +/* __longjmp_cancel for x86. >>> + Copyright (C) 2018 Free Software Foundation, Inc. >>> + This file is part of the GNU C Library. >>> + >>> + The GNU C Library is free software; you can redistribute it and/or >>> + modify it under the terms of the GNU Lesser General Public >>> + License as published by the Free Software Foundation; either >>> + version 2.1 of the License, or (at your option) any later version. >>> + >>> + The GNU C Library is distributed in the hope that it will be useful, >>> + but WITHOUT ANY WARRANTY; without even the implied warranty of >>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >>> + Lesser General Public License for more details. >>> + >>> + You should have received a copy of the GNU Lesser General Public >>> + License along with the GNU C Library; if not, see >>> + <http://www.gnu.org/licenses/>. */ >>> + >>> +#define __longjmp __longjmp_cancel >>> +#include <__longjmp.S> >> >> OK. >> >>> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c >>> new file mode 100644 >>> index 0000000000..a53f31e1dd >>> --- /dev/null >>> +++ b/sysdeps/x86/longjmp.c >>> @@ -0,0 +1,45 @@ >>> +/* __libc_siglongjmp for x86. >>> + Copyright (C) 2018 Free Software Foundation, Inc. >>> + This file is part of the GNU C Library. >>> + >>> + The GNU C Library is free software; you can redistribute it and/or >>> + modify it under the terms of the GNU Lesser General Public >>> + License as published by the Free Software Foundation; either >>> + version 2.1 of the License, or (at your option) any later version. >>> + >>> + The GNU C Library is distributed in the hope that it will be useful, >>> + but WITHOUT ANY WARRANTY; without even the implied warranty of >>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >>> + Lesser General Public License for more details. >>> + >>> + You should have received a copy of the GNU Lesser General Public >>> + License along with the GNU C Library; if not, see >>> + <http://www.gnu.org/licenses/>. */ >>> + >>> +#define __libc_longjmp __redirect___libc_longjmp >>> +#include <setjmp/longjmp.c> >>> +#undef __libc_longjmp >>> + >>> +extern void __longjmp_cancel (__jmp_buf __env, int __val) >>> + __attribute__ ((__noreturn__)) attribute_hidden; >>> + >>> +/* Since __libc_longjmp is a private interface for cancellation >>> + implementation in libpthread, there is no need to restore shadow >>> + stack register. */ >>> + >>> +void >>> +__libc_longjmp (sigjmp_buf env, int val) >>> +{ >>> + /* Perform any cleanups needed by the frames being unwound. */ >>> + _longjmp_unwind (env, val); >> >> OK. >> >>> + >>> + if (env[0].__mask_was_saved) >>> + /* Restore the saved signal mask. */ >>> + (void) __sigprocmask (SIG_SETMASK, >>> + (sigset_t *) &env[0].__saved_mask, >>> + (sigset_t *) NULL); >> >> OK. >> >>> + >>> + /* Call the machine-dependent function to restore machine state >>> + without shadow stack. */ >>> + __longjmp_cancel (env[0].__jmpbuf, val ?: 1); >> >> OK. >> >>> +} >>> diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c >>> new file mode 100644 >>> index 0000000000..7eb8651cfe >>> --- /dev/null >>> +++ b/sysdeps/x86/nptl/pt-longjmp.c >>> @@ -0,0 +1,97 @@ >>> +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI. >>> + X86 version. >>> + Copyright (C) 18 Free Software Foundation, Inc. >>> + This file is part of the GNU C Library. >>> + >>> + The GNU C Library is free software; you can redistribute it and/or >>> + modify it under the terms of the GNU Lesser General Public >>> + License as published by the Free Software Foundation; either >>> + version 2.1 of the License, or (at your option) any later version. >>> + >>> + The GNU C Library is distributed in the hope that it will be useful, >>> + but WITHOUT ANY WARRANTY; without even the implied warranty of >>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >>> + Lesser General Public License for more details. >>> + >>> + You should have received a copy of the GNU Lesser General Public >>> + License along with the GNU C Library; if not, see >>> + <http://www.gnu.org/licenses/>. */ >>> + >>> +/* <nptl/descr.h> has >>> + >>> +struct pthread_unwind_buf >>> +{ >>> + struct >>> + { >>> + __jmp_buf jmp_buf; >>> + int mask_was_saved; >>> + } cancel_jmp_buf[1]; >>> + >>> + union >>> + { >>> + void *pad[4]; >>> + struct >>> + { >>> + struct pthread_unwind_buf *prev; >>> + struct _pthread_cleanup_buffer *cleanup; >>> + int canceltype; >>> + } data; >>> + } priv; >>> +}; >>> + >>> + The pad array in struct pthread_unwind_buf is used by setjmp to save >> >> This appears to be SHADOW_STACK_POINTER_OFFSET in subsequent patches. >> >> However on your hjl/cet/master branch it appears that this offset is not >> defined to be *just after* the mask_was_saved? > > sysdeps/unix/sysv/linux/x86/setjmpP.h has > > typedef struct > { > unsigned long int __val[_JUMP_BUF_SIGSET_NWORDS]; > } __jmp_buf_sigset_t; > > typedef union > { > __sigset_t __saved_mask_compat; > struct > { > __jmp_buf_sigset_t __saved_mask; > /* Used for shadow stack pointer. */ > unsigned long int __shadow_stack_pointer; > } __saved; > } __jmpbuf_arch_t; > > __shadow_stack_pointer is placed after __saved_mask, aka mask_was_saved. OK, I'll re-review. >>> + shadow stack register. Assert that size of struct pthread_unwind_buf >>> + is no less than offset of shadow stack pointer plus shadow stack >>> + pointer size. >>> + >>> + NB: setjmp is called in libpthread to save shadow stack register. But >>> + __libc_unwind_longjmp doesn't restore shadow stack register since they >>> + never return after longjmp. */ >>> + >>> +#include <pthreadP.h> >>> +#include <jmp_buf-ssp.h> >>> + >>> +#ifdef __x86_64__ >>> +# define SHADOW_STACK_POINTER_SIZE 8 >>> +#else >>> +# define SHADOW_STACK_POINTER_SIZE 4 >>> +#endif >>> + >>> +_Static_assert ((sizeof (struct pthread_unwind_buf) >>> + >= (SHADOW_STACK_POINTER_OFFSET >>> + + SHADOW_STACK_POINTER_SIZE)), >>> + "size of struct pthread_unwind_buf < " >>> + "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)"); >> >> OK. >> >>> + >>> +#include <shlib-compat.h> >>> + >>> +/* libpthread once had its own longjmp (and siglongjmp alias), though there >>> + was no apparent reason for it. There is no use in having a separate >>> + symbol in libpthread, but the historical ABI requires it. For static >>> + linking, there is no need to provide anything here--the libc version >>> + will be linked in. For shared library ABI compatibility, there must be >>> + longjmp and siglongjmp symbols in libpthread.so. >>> + >>> + With an IFUNC resolver, it would be possible to avoid the indirection, >>> + but the IFUNC resolver might run before the __libc_longjmp symbol has >>> + been relocated, in which case the IFUNC resolver would not be able to >>> + provide the correct address. */ >>> + >>> +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22) >>> + >>> +static void __attribute__ ((noreturn, used)) >>> +longjmp_compat (jmp_buf env, int val) >>> +{ >>> + /* NB: We call __libc_siglongjmp, instead of __libc_longjmp, since >>> + __libc_longjmp is a private interface for cancellation which >>> + doesn't restore shadow stack register. */ >>> + __libc_siglongjmp (env, val); >> >> OK. >> >>> +} >>> + >>> +strong_alias (longjmp_compat, longjmp_alias) >>> +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0); >>> + >>> +strong_alias (longjmp_alias, siglongjmp_alias) >>> +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0); >>> + >>> +#endif >>> -- 2.14.3 >> >> >> -- >> Cheers, >> Carlos. > > > I look forward to a v2 with correct rounding and a new comment. -- Cheers, Carlos. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack register 2018-04-12 21:36 ` Carlos O'Donell @ 2018-04-12 23:50 ` H.J. Lu 2018-04-21 3:28 ` [PATCHv2] " Carlos O'Donell 0 siblings, 1 reply; 13+ messages in thread From: H.J. Lu @ 2018-04-12 23:50 UTC (permalink / raw) To: Carlos O'Donell Cc: Florian Weimer, Joseph Myers, Tsimbalist, Igor V, GNU C Library [-- Attachment #1: Type: text/plain, Size: 24004 bytes --] On Thu, Apr 12, 2018 at 2:36 PM, Carlos O'Donell <carlos@redhat.com> wrote: > On 04/06/2018 07:59 AM, H.J. Lu wrote: >> On Thu, Apr 5, 2018 at 9:46 PM, Carlos O'Donell <carlos@redhat.com> wrote: >>> On 03/30/2018 12:41 PM, H.J. Lu wrote: >>>> On Thu, Mar 29, 2018 at 1:20 PM, Florian Weimer <fw@deneb.enyo.de> wrote: >>>>> * H. J. Lu: >>>>> >>>>>> On Thu, Mar 29, 2018 at 1:15 PM, Florian Weimer <fw@deneb.enyo.de> wrote: >>>>>>> * H. J. Lu: >>>>>>> >>>>>>>> You need to make a choice. You either don't introduce a new symbol >>>>>>>> version or don't save shadow stack for thread cancellation. You >>>>>>>> can't have both. >>>>>>> I don't understand. We have room to save the shadow stack pointer in >>>>>>> the existing struct. >>>>>> No, we don't have room in struct pthread_unwind_buf: >>>>>> >>>>>> Note: There is an unused pointer space in pthread_unwind_buf_data. But >>>>>> it isn't suitable for saving and restoring shadow stack register since >>>>>> x32 is a 64-bit process with 32-bit software pointer and kernel may >>>>>> place x32 shadow stack above 4GB. We need to save and restore 64-bit >>>>>> shadow stack register for x32. >>>>> We have for void * fields. They are subsequently overwritten by >>>>> __pthread_register_cancel. But __sigsetjmp can write to them first >>>>> without causing any harm. We just need a private __longjmp_cancel >>>>> that doesn't restore the shadow stack pointer. >>>> Here is the patch which does that. Any comments? >>> >>> OK, so I have reviewed https://github.com/hjl-tools/glibc/ hjl/cet/master, >>> please confirm that this is the correct branch of the required implementation >>> for CET. It really helps to review the rest of the patch set, you should be >>> preparing this as a patch set instead of having it reviewed one-at-a-time. >>> This issue was already raised in the thread with Zack. >> >> Thanks for your feedbacks. >> >>> Architecture: >>> >>> * We avoid a "flag day" with feature_1 TCB flag to switch to a new ABI, >>> which we have discussed is a fragile process which should be avoided if >>> a supportable alternative solution exists. >>> >>> * We avoid versioned symbols, this makes CET backportable, and this has a >>> bigger benefit for long-term stable distributions. >>> >>> * A key design problem has been that cancellation jump buffers within glibc >>> are truncated to optimize on-stack size, and this means that setjmp could >>> write beyond the structure because setjmp now tries to save the shadowstack >>> pointer into space that the cancellation jump buffer did not allocate. >>> For the record this optimization seems premature and I'm sad we did it, this >>> is a lesson we should learn from. >>> >>> * We have all agreed to the following concepts: >>> >>> * The cancellation process, in particular the unwinder, never returns from >>> any of the functions we call, it just keeps calling into the unwinder to >>> jump to the next unwound cancellation function all the way to the thread >>> start routine. Therefore because we never return from one of these functions >>> we never need to restore the shadow stack, and consequently wherever it is >>> stored in the cancellation jump buffer can be overwritten if we need the >>> space (it's a dead store). >>> >>> * The corollary to this is that function calls made from cancellation handlers >>> will continue to advance the shadowstack from the deepest point at which >>> cancellation is initiated from. This means that the depth of the shadowstack >>> doesn't match the depth of the real stack while we are unwinding. I don't >>> know if this will have consequences on analysis tooling or not, or debug >>> tools during unwinding. It's a fairly advanced situation and corner case, >>> and restoring the shadowstack is not useful becuase we don't need it and >>> simplifies the implementation. >>> >>> * The cancellation jump buffer has private data used for chaining the cancel >>> jump buffers together such that the custom unwinder can follow them and >>> call them in sequence. This space constitutes 4 void *'s which is space >>> that setjmp can write to, because we will just overwrite it when we register >>> the cancel handler. >>> >>> * If the new shadowstack-enabled setjmp stores the shadowstack pointer into >>> the space taken by the 4 void*'s then we won't overflow the stack, and we >>> don't need to change the layout of the cancellation jump buffer. The 4 void*'s >>> are sufficient, even for x32 to write a 64-bit shadow stack address. >>> >>> * After fixing the cancellation jump buffers the following work needs to be reviewed: >>> >>> * Add feature_1 in tcb head to track CET status and make it easily available >>> to runtime for checking. >>> >>> * Save and restore shadowstack in setjmp/longjmp. >>> >>> * Add CET support to ld.so et. al. and track runtime status. >>> >>> * Adjust vfork for shadow stack usage. >>> >>> * Add ENDBR or NOTRACK where required in assembly. >>> >>> * CET and makecontext incompatible. >>> - Probably need to discuss which default is appropriate. >>> - Should the user get CET automatically disabled in makecontext() et. al. silently? >>> - Should your current solution, which is to error out during the build, and require >>> flag changes, be the default? This forces the user to review the security for their >>> application. >> >> I'd like to reserve 4 slots in ucontext for shadow stack: >> >> https://github.com/hjl-tools/glibc/commit/9bf6aefa8fb45f8df140d42ce9cf890bb24076e1 >> >> It should be binary backward compatible. I will investigate if there is a way >> to support shadow stack with existing API. Otherwise, we need to add a new >> API for ucontext functions with shadow stack. > > Could you explain in detail how this is binary backwards compatible? > > Is the assumption that if you extend ucontext_t, that the kernel will just write > more to the stack, and users who accesss it via the void* in a signal handler > setup via sigaction + SA_SIGINFO, will not read past the size they expect? > > How is the shadow stack information moved between a getcontext -> setcontext > set of API calls? The user ucontext_t in existing binaries is too small to hold the > shadow stack? > > Would we again have a "flag day" where CET enablement must be coordinated with > the definition of a new ucontext_t? > >>> * prctl for CET. >> >> We have been experimenting different approaches to get the best implementation. >> I am expecting that this patch may change as we collect more data. >> > > OK. > >>> * The work to review after this patch appears to be less contentious in terms of >>> the kinds of changes that are required. Most of the changes are internal details >>> of enabling CET and not ABI details, with the exception of the possible pain we >>> might cause with makecontext() being unsupported and what default position to take >>> there. >>> >>> Design: >>> >>> * Overall the implementation looks exactly how I might expect it to look, but some >>> of the math that places the shadowstack pointer appears to need either commenting >>> or fixing because I don't understand it. You need to make it easy for me to see >>> that we have placed the shadowstack pointer into the 4 pad words. >>> >>> Details: >>> >>> * One comment needs filling out a bit more, noted below. >>> >>>> 0001-x86-Use-pad-in-pthread_unwind_buf-to-preserve-shadow.patch >>>> >>>> >>>> From f222537447f5ec879427f318b7c0396362b7453a Mon Sep 17 00:00:00 2001 >>>> From: "H.J. Lu" <hjl.tools@gmail.com> >>>> Date: Sat, 24 Feb 2018 17:23:54 -0800 >>>> Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack >>>> register >>>> >>>> The pad array in struct pthread_unwind_buf is used by setjmp to save >>>> shadow stack register. We assert that size of struct pthread_unwind_buf >>>> is no less than offset of shadow stack pointer + shadow stack pointer >>>> size. >>>> >>> >>> OK. >>> >>>> Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as >>>> these with thread cancellation, call setjmp, but never return after >>>> __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as >>>> __libc_longjmp on x86, doesn't need to restore shadow stack register. >>> >>> OK. >>> >>>> __libc_longjmp, which is a private interface for thread cancellation >>>> implementation in libpthread, is changed to call __longjmp_cancel, >>>> instead of __longjmp. __longjmp_cancel is a new internal function >>>> in libc, which is similar to __longjmp, but doesn't restore shadow >>>> stack register. >>> >>> OK. Good. I like the use of a __longjmp_cancel name to call out what's >>> going on in the API (clear semantics). >>> >>>> >>>> The compatibility longjmp and siglongjmp in libpthread.so are changed >>>> to call __libc_siglongjmp, instead of __libc_longjmp, so that they will >>>> restore shadow stack register. >>> >>> OK. >>> >>>> >>>> Tested with build-many-glibcs.py. >>>> >>>> * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous >>>> handlers after setjmp. >>>> * setjmp/longjmp.c (__libc_longjmp): Don't define alias if >>>> defined. >>>> * sysdeps/unix/sysv/linux/x86/setjmpP.h (_JUMP_BUF_SIGSET_NSIG): >>>> Changed to 97. >>>> * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel. >>>> * sysdeps/x86/__longjmp_cancel.S: New file. >>>> * sysdeps/x86/longjmp.c: Likewise. >>>> * sysdeps/x86/nptl/pt-longjmp.c: Likewise. >>> >>> This looks much better. >>> >>>> --- >>>> nptl/pthread_create.c | 9 ++-- >>>> setjmp/longjmp.c | 2 + >>>> sysdeps/unix/sysv/linux/x86/setjmpP.h | 4 +- >>>> sysdeps/x86/Makefile | 4 ++ >>>> sysdeps/x86/__longjmp_cancel.S | 20 ++++++++ >>>> sysdeps/x86/longjmp.c | 45 ++++++++++++++++ >>>> sysdeps/x86/nptl/pt-longjmp.c | 97 +++++++++++++++++++++++++++++++++++ >>>> 7 files changed, 176 insertions(+), 5 deletions(-) >>>> create mode 100644 sysdeps/x86/__longjmp_cancel.S >>>> create mode 100644 sysdeps/x86/longjmp.c >>>> create mode 100644 sysdeps/x86/nptl/pt-longjmp.c >>>> >>>> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c >>>> index caaf07c134..1c5b3780c6 100644 >>>> --- a/nptl/pthread_create.c >>>> +++ b/nptl/pthread_create.c >>>> @@ -427,12 +427,15 @@ START_THREAD_DEFN >>>> compilers without that support we do use setjmp. */ >>>> struct pthread_unwind_buf unwind_buf; >>>> >>>> - /* No previous handlers. */ >>>> + int not_first_call; >>>> + not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); >>>> + >>>> + /* No previous handlers. NB: This must be done after setjmp since >>>> + the same space may be used by setjmp to store extra data which >>>> + should never be used by __libc_unwind_longjmp. */ >>> >>> Suggest: >>> ~~~ >>> No previous handlers. NB: This must be done after setjmp since >>> the private space in the unwind jump buffer may overlap space >>> used by setjmp to store extra architecture-specific information >>> which is never be used by the cancellation-specific >>> __libc_unwind_longjmp. >>> >>> The private space is allowed to overlap because the unwinder never >>> has to return through any of the jumped-to call frames, and thus >>> only a minimum amount of saved data need be stored, and for example, >>> need not include the process signal mask information. This is all >>> an optimization to reduce stack usage when pushing cancellation >>> handlers. >>> ~~~ >> >> Will fix it. >> >>>> unwind_buf.priv.data.prev = NULL; >>>> unwind_buf.priv.data.cleanup = NULL; >>>> >>>> - int not_first_call; >>>> - not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); >>> >>> OK. >>> >>>> if (__glibc_likely (! not_first_call)) >>>> { >>>> /* Store the new cleanup handler info. */ >>>> diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c >>>> index a2a7065a85..453889e103 100644 >>>> --- a/setjmp/longjmp.c >>>> +++ b/setjmp/longjmp.c >>>> @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val) >>>> } >>>> >>>> #ifndef __libc_siglongjmp >>>> +# ifndef __libc_longjmp >>>> /* __libc_longjmp is a private interface for cancellation implementation >>>> in libpthread. */ >>>> strong_alias (__libc_siglongjmp, __libc_longjmp) >>>> +# endif >>> >>> OK. >>> >>>> weak_alias (__libc_siglongjmp, _longjmp) >>>> weak_alias (__libc_siglongjmp, longjmp) >>>> weak_alias (__libc_siglongjmp, siglongjmp) >>>> diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h >>>> index c0ed767a0d..90a6bbcf32 100644 >>>> --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h >>>> +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h >>>> @@ -22,8 +22,8 @@ >>>> #include <bits/types/__sigset_t.h> >>>> >>>> /* The biggest signal number + 1. As of kernel 4.14, x86 _NSIG is 64. >>>> - Define it to 513 to leave some rooms for future use. */ >>>> -#define _JUMP_BUF_SIGSET_NSIG 513 >>>> + Define it to 97 to leave some rooms for future use. */ >>> >>> OK. >>> >>>> +#define _JUMP_BUF_SIGSET_NSIG 97 >>> >>> Please provide proof in the way of a comment or rewriting this constant >>> to show that it places the shadow stack pointer on both x86_64 and x32 >>> into the range of the private pad. >> >> sysdeps/x86/nptl/pt-longjmp.c has >> >> _Static_assert ((sizeof (struct pthread_unwind_buf) >>> = (SHADOW_STACK_POINTER_OFFSET >> + SHADOW_STACK_POINTER_SIZE)), >> "size of struct pthread_unwind_buf < " >> "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)"); >> >> If shadow stack pointer is saved in the offset bigger than the size of >> struct pthread_unwind_buf, assert will trigger at compile-time. >> > > Thanks, I'll review this in the patch you posted. > >>> Also, from commit f33632ccd1dec3217583fcfdd965afb62954203c, >>> where did this math come from? >>> >>> ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) >>> >>> Why the +7? >> >> _JUMP_BUF_SIGSET_NSIG is the biggest signal number + 1. > > Agreed. > >> _JUMP_BUF_SIGSET_NSIG - 1 gives the biggest signal number. > > Agreed. > >> _JUMP_BUF_SIGSET_NSIG - 1 + 7 rounds up to the number of bytes >> which are needed to store the biggest signal number. > > It does not. I changed to /* Number of bits per long. */ #define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int)) /* The biggest signal number. As of kernel 4.14, x86 _NSIG is 64. Define it to 96 to leave some rooms for future use. */ #define _JUMP_BUF_SIGSET_NSIG 96 /* Number of longs to hold all signals. */ #define _JUMP_BUF_SIGSET_NWORDS \ (ALIGN_UP (_JUMP_BUF_SIGSET_NSIG, _JUMP_BUF_SIGSET_BITS_PER_WORD) \ / _JUMP_BUF_SIGSET_BITS_PER_WORD) > Result # of signals # of bits Current # Expected # ... > Luckily for 512 signals (513) the math works out. > > For 96 signals it does not. > > (96 - 1 + 7) = 102 > 102 / 64 = 1 True. > That's only a signal word, that only supports 64 signals, not 96. Lucky for me. Since unsigned long int __shadow_stack_pointer is aligned as unsigned long, there is padding before __shadow_stack_pointer. > Why doesn't the static assert trigger? Because you a < the size of the > pthread_unwind_buf, and too small actually, writing into other parts of > the buffer! Assert _Static_assert ((sizeof (struct pthread_unwind_buf) >= (SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)), "size of struct pthread_unwind_buf < " "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)"); is correct. > I would expect us to use something like this: > > diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h > index c0ed767a0d..6e1e8f901c 100644 > --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h > +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h > @@ -20,13 +20,15 @@ > #define _SETJMPP_H 1 > > #include <bits/types/__sigset_t.h> > +#include <libc-pointer-arith.h> > > -/* The biggest signal number + 1. As of kernel 4.14, x86 _NSIG is 64. > - Define it to 513 to leave some rooms for future use. */ > -#define _JUMP_BUF_SIGSET_NSIG 513 > +/* As of kernel 4.14, x86 _NSIG is 64. > + Define it to 512 to leave some rooms for future use. */ > +#define _JUMP_BUF_SIGSET_NSIG 512 > /* Number of longs to hold all signals. */ > #define _JUMP_BUF_SIGSET_NWORDS \ > - ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) > + (ALIGN_UP(_JUMP_BUF_SIGSET_NSIG, (8 * sizeof (unsigned long int))) \ > + / (8 * sizeof (unsigned long int))) > > typedef struct > { > --- > ... but with the size you need and explain *why* it's 96. > > You need a comment like this: > /* The layout looks like this: > - N words for this. > - N words for that. > - N words for shadow stack. > Total 96 signals. */ > > Align the number of signals up to multiple of signals you can store in > a hardware word, and then figure out how many words that takes up. I put comments in sysdeps/x86/nptl/pt-longjmp.c together with static assert. > Please review my math. > >>>> /* Number of longs to hold all signals. */ >>>> #define _JUMP_BUF_SIGSET_NWORDS \ >>>> ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) >>>> diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile >>>> index 0d0326c21a..d25d6f0ae4 100644 >>>> --- a/sysdeps/x86/Makefile >>>> +++ b/sysdeps/x86/Makefile >>>> @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features >>>> tests += tst-get-cpu-features >>>> tests-static += tst-get-cpu-features-static >>>> endif >>>> + >>>> +ifeq ($(subdir),setjmp) >>>> +sysdep_routines += __longjmp_cancel >>>> +endif >>> >>> OK. >>> >>>> diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S >>>> new file mode 100644 >>>> index 0000000000..b57dbfa376 >>>> --- /dev/null >>>> +++ b/sysdeps/x86/__longjmp_cancel.S >>>> @@ -0,0 +1,20 @@ >>>> +/* __longjmp_cancel for x86. >>>> + Copyright (C) 2018 Free Software Foundation, Inc. >>>> + This file is part of the GNU C Library. >>>> + >>>> + The GNU C Library is free software; you can redistribute it and/or >>>> + modify it under the terms of the GNU Lesser General Public >>>> + License as published by the Free Software Foundation; either >>>> + version 2.1 of the License, or (at your option) any later version. >>>> + >>>> + The GNU C Library is distributed in the hope that it will be useful, >>>> + but WITHOUT ANY WARRANTY; without even the implied warranty of >>>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >>>> + Lesser General Public License for more details. >>>> + >>>> + You should have received a copy of the GNU Lesser General Public >>>> + License along with the GNU C Library; if not, see >>>> + <http://www.gnu.org/licenses/>. */ >>>> + >>>> +#define __longjmp __longjmp_cancel >>>> +#include <__longjmp.S> >>> >>> OK. >>> >>>> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c >>>> new file mode 100644 >>>> index 0000000000..a53f31e1dd >>>> --- /dev/null >>>> +++ b/sysdeps/x86/longjmp.c >>>> @@ -0,0 +1,45 @@ >>>> +/* __libc_siglongjmp for x86. >>>> + Copyright (C) 2018 Free Software Foundation, Inc. >>>> + This file is part of the GNU C Library. >>>> + >>>> + The GNU C Library is free software; you can redistribute it and/or >>>> + modify it under the terms of the GNU Lesser General Public >>>> + License as published by the Free Software Foundation; either >>>> + version 2.1 of the License, or (at your option) any later version. >>>> + >>>> + The GNU C Library is distributed in the hope that it will be useful, >>>> + but WITHOUT ANY WARRANTY; without even the implied warranty of >>>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >>>> + Lesser General Public License for more details. >>>> + >>>> + You should have received a copy of the GNU Lesser General Public >>>> + License along with the GNU C Library; if not, see >>>> + <http://www.gnu.org/licenses/>. */ >>>> + >>>> +#define __libc_longjmp __redirect___libc_longjmp >>>> +#include <setjmp/longjmp.c> >>>> +#undef __libc_longjmp >>>> + >>>> +extern void __longjmp_cancel (__jmp_buf __env, int __val) >>>> + __attribute__ ((__noreturn__)) attribute_hidden; >>>> + >>>> +/* Since __libc_longjmp is a private interface for cancellation >>>> + implementation in libpthread, there is no need to restore shadow >>>> + stack register. */ >>>> + >>>> +void >>>> +__libc_longjmp (sigjmp_buf env, int val) >>>> +{ >>>> + /* Perform any cleanups needed by the frames being unwound. */ >>>> + _longjmp_unwind (env, val); >>> >>> OK. >>> >>>> + >>>> + if (env[0].__mask_was_saved) >>>> + /* Restore the saved signal mask. */ >>>> + (void) __sigprocmask (SIG_SETMASK, >>>> + (sigset_t *) &env[0].__saved_mask, >>>> + (sigset_t *) NULL); >>> >>> OK. >>> >>>> + >>>> + /* Call the machine-dependent function to restore machine state >>>> + without shadow stack. */ >>>> + __longjmp_cancel (env[0].__jmpbuf, val ?: 1); >>> >>> OK. >>> >>>> +} >>>> diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c >>>> new file mode 100644 >>>> index 0000000000..7eb8651cfe >>>> --- /dev/null >>>> +++ b/sysdeps/x86/nptl/pt-longjmp.c >>>> @@ -0,0 +1,97 @@ >>>> +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI. >>>> + X86 version. >>>> + Copyright (C) 18 Free Software Foundation, Inc. >>>> + This file is part of the GNU C Library. >>>> + >>>> + The GNU C Library is free software; you can redistribute it and/or >>>> + modify it under the terms of the GNU Lesser General Public >>>> + License as published by the Free Software Foundation; either >>>> + version 2.1 of the License, or (at your option) any later version. >>>> + >>>> + The GNU C Library is distributed in the hope that it will be useful, >>>> + but WITHOUT ANY WARRANTY; without even the implied warranty of >>>> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >>>> + Lesser General Public License for more details. >>>> + >>>> + You should have received a copy of the GNU Lesser General Public >>>> + License along with the GNU C Library; if not, see >>>> + <http://www.gnu.org/licenses/>. */ >>>> + >>>> +/* <nptl/descr.h> has >>>> + >>>> +struct pthread_unwind_buf >>>> +{ >>>> + struct >>>> + { >>>> + __jmp_buf jmp_buf; >>>> + int mask_was_saved; >>>> + } cancel_jmp_buf[1]; >>>> + >>>> + union >>>> + { >>>> + void *pad[4]; >>>> + struct >>>> + { >>>> + struct pthread_unwind_buf *prev; >>>> + struct _pthread_cleanup_buffer *cleanup; >>>> + int canceltype; >>>> + } data; >>>> + } priv; >>>> +}; >>>> + >>>> + The pad array in struct pthread_unwind_buf is used by setjmp to save >>> >>> This appears to be SHADOW_STACK_POINTER_OFFSET in subsequent patches. >>> >>> However on your hjl/cet/master branch it appears that this offset is not >>> defined to be *just after* the mask_was_saved? >> >> sysdeps/unix/sysv/linux/x86/setjmpP.h has >> >> typedef struct >> { >> unsigned long int __val[_JUMP_BUF_SIGSET_NWORDS]; >> } __jmp_buf_sigset_t; >> >> typedef union >> { >> __sigset_t __saved_mask_compat; >> struct >> { >> __jmp_buf_sigset_t __saved_mask; >> /* Used for shadow stack pointer. */ >> unsigned long int __shadow_stack_pointer; >> } __saved; >> } __jmpbuf_arch_t; >> >> __shadow_stack_pointer is placed after __saved_mask, aka mask_was_saved. > > OK, I'll re-review. > > > I look forward to a v2 with correct rounding and a new comment. > Here is the updated patch. Please see if sysdeps/unix/sysv/linux/x86/setjmpP.h sysdeps/x86/nptl/pt-longjmp.c address your concerns. Thanks. -- H.J. [-- Attachment #2: 0001-x86-Use-pad-in-pthread_unwind_buf-to-preserve-shadow.patch --] [-- Type: text/x-patch, Size: 12270 bytes --] From 1ce2b4f199070a63d9a60bd9b7ca9e33013d4208 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" <hjl.tools@gmail.com> Date: Sat, 24 Feb 2018 17:23:54 -0800 Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack register The pad array in struct pthread_unwind_buf is used by setjmp to save shadow stack register. We assert that size of struct pthread_unwind_buf is no less than offset of shadow stack pointer + shadow stack pointer size. Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as these with thread cancellation, call setjmp, but never return after __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as __libc_longjmp on x86, doesn't need to restore shadow stack register. __libc_longjmp, which is a private interface for thread cancellation implementation in libpthread, is changed to call __longjmp_cancel, instead of __longjmp. __longjmp_cancel is a new internal function in libc, which is similar to __longjmp, but doesn't restore shadow stack register. The compatibility longjmp and siglongjmp in libpthread.so are changed to call __libc_siglongjmp, instead of __libc_longjmp, so that they will restore shadow stack register. Tested with build-many-glibcs.py. * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous handlers after setjmp. * setjmp/longjmp.c (__libc_longjmp): Don't define alias if defined. * sysdeps/unix/sysv/linux/x86/setjmpP.h: Include <libc-pointer-arith.h>. (_JUMP_BUF_SIGSET_BITS_PER_WORD): New. (_JUMP_BUF_SIGSET_NSIG): Changed to 96. (_JUMP_BUF_SIGSET_NWORDS): Changed to use ALIGN_UP and _JUMP_BUF_SIGSET_BITS_PER_WORD. * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel. * sysdeps/x86/__longjmp_cancel.S: New file. * sysdeps/x86/longjmp.c: Likewise. * sysdeps/x86/nptl/pt-longjmp.c: Likewise. --- nptl/pthread_create.c | 18 +++++-- setjmp/longjmp.c | 2 + sysdeps/unix/sysv/linux/x86/setjmpP.h | 12 +++-- sysdeps/x86/Makefile | 4 ++ sysdeps/x86/__longjmp_cancel.S | 20 ++++++++ sysdeps/x86/longjmp.c | 45 ++++++++++++++++ sysdeps/x86/nptl/pt-longjmp.c | 97 +++++++++++++++++++++++++++++++++++ 7 files changed, 191 insertions(+), 7 deletions(-) create mode 100644 sysdeps/x86/__longjmp_cancel.S create mode 100644 sysdeps/x86/longjmp.c create mode 100644 sysdeps/x86/nptl/pt-longjmp.c diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c index caaf07c134..8b1f06599d 100644 --- a/nptl/pthread_create.c +++ b/nptl/pthread_create.c @@ -427,12 +427,24 @@ START_THREAD_DEFN compilers without that support we do use setjmp. */ struct pthread_unwind_buf unwind_buf; - /* No previous handlers. */ + int not_first_call; + not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); + + /* No previous handlers. NB: This must be done after setjmp since + the private space in the unwind jump buffer may overlap space + used by setjmp to store extra architecture-specific information + which is never be used by the cancellation-specific + __libc_unwind_longjmp. + + The private space is allowed to overlap because the unwinder never + has to return through any of the jumped-to call frames, and thus + only a minimum amount of saved data need be stored, and for example, + need not include the process signal mask information. This is all + an optimization to reduce stack usage when pushing cancellation + handlers. */ unwind_buf.priv.data.prev = NULL; unwind_buf.priv.data.cleanup = NULL; - int not_first_call; - not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); if (__glibc_likely (! not_first_call)) { /* Store the new cleanup handler info. */ diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c index a2a7065a85..453889e103 100644 --- a/setjmp/longjmp.c +++ b/setjmp/longjmp.c @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val) } #ifndef __libc_siglongjmp +# ifndef __libc_longjmp /* __libc_longjmp is a private interface for cancellation implementation in libpthread. */ strong_alias (__libc_siglongjmp, __libc_longjmp) +# endif weak_alias (__libc_siglongjmp, _longjmp) weak_alias (__libc_siglongjmp, longjmp) weak_alias (__libc_siglongjmp, siglongjmp) diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h index c0ed767a0d..24f87da204 100644 --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h @@ -20,13 +20,17 @@ #define _SETJMPP_H 1 #include <bits/types/__sigset_t.h> +#include <libc-pointer-arith.h> -/* The biggest signal number + 1. As of kernel 4.14, x86 _NSIG is 64. - Define it to 513 to leave some rooms for future use. */ -#define _JUMP_BUF_SIGSET_NSIG 513 +/* Number of bits per long. */ +#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int)) +/* The biggest signal number. As of kernel 4.14, x86 _NSIG is 64. + Define it to 96 to leave some rooms for future use. */ +#define _JUMP_BUF_SIGSET_NSIG 96 /* Number of longs to hold all signals. */ #define _JUMP_BUF_SIGSET_NWORDS \ - ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) + (ALIGN_UP (_JUMP_BUF_SIGSET_NSIG, _JUMP_BUF_SIGSET_BITS_PER_WORD) \ + / _JUMP_BUF_SIGSET_BITS_PER_WORD) typedef struct { diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile index 0d0326c21a..d25d6f0ae4 100644 --- a/sysdeps/x86/Makefile +++ b/sysdeps/x86/Makefile @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features tests += tst-get-cpu-features tests-static += tst-get-cpu-features-static endif + +ifeq ($(subdir),setjmp) +sysdep_routines += __longjmp_cancel +endif diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S new file mode 100644 index 0000000000..b57dbfa376 --- /dev/null +++ b/sysdeps/x86/__longjmp_cancel.S @@ -0,0 +1,20 @@ +/* __longjmp_cancel for x86. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#define __longjmp __longjmp_cancel +#include <__longjmp.S> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c new file mode 100644 index 0000000000..a53f31e1dd --- /dev/null +++ b/sysdeps/x86/longjmp.c @@ -0,0 +1,45 @@ +/* __libc_siglongjmp for x86. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#define __libc_longjmp __redirect___libc_longjmp +#include <setjmp/longjmp.c> +#undef __libc_longjmp + +extern void __longjmp_cancel (__jmp_buf __env, int __val) + __attribute__ ((__noreturn__)) attribute_hidden; + +/* Since __libc_longjmp is a private interface for cancellation + implementation in libpthread, there is no need to restore shadow + stack register. */ + +void +__libc_longjmp (sigjmp_buf env, int val) +{ + /* Perform any cleanups needed by the frames being unwound. */ + _longjmp_unwind (env, val); + + if (env[0].__mask_was_saved) + /* Restore the saved signal mask. */ + (void) __sigprocmask (SIG_SETMASK, + (sigset_t *) &env[0].__saved_mask, + (sigset_t *) NULL); + + /* Call the machine-dependent function to restore machine state + without shadow stack. */ + __longjmp_cancel (env[0].__jmpbuf, val ?: 1); +} diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c new file mode 100644 index 0000000000..7eb8651cfe --- /dev/null +++ b/sysdeps/x86/nptl/pt-longjmp.c @@ -0,0 +1,97 @@ +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI. + X86 version. + Copyright (C) 18 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +/* <nptl/descr.h> has + +struct pthread_unwind_buf +{ + struct + { + __jmp_buf jmp_buf; + int mask_was_saved; + } cancel_jmp_buf[1]; + + union + { + void *pad[4]; + struct + { + struct pthread_unwind_buf *prev; + struct _pthread_cleanup_buffer *cleanup; + int canceltype; + } data; + } priv; +}; + + The pad array in struct pthread_unwind_buf is used by setjmp to save + shadow stack register. Assert that size of struct pthread_unwind_buf + is no less than offset of shadow stack pointer plus shadow stack + pointer size. + + NB: setjmp is called in libpthread to save shadow stack register. But + __libc_unwind_longjmp doesn't restore shadow stack register since they + never return after longjmp. */ + +#include <pthreadP.h> +#include <jmp_buf-ssp.h> + +#ifdef __x86_64__ +# define SHADOW_STACK_POINTER_SIZE 8 +#else +# define SHADOW_STACK_POINTER_SIZE 4 +#endif + +_Static_assert ((sizeof (struct pthread_unwind_buf) + >= (SHADOW_STACK_POINTER_OFFSET + + SHADOW_STACK_POINTER_SIZE)), + "size of struct pthread_unwind_buf < " + "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)"); + +#include <shlib-compat.h> + +/* libpthread once had its own longjmp (and siglongjmp alias), though there + was no apparent reason for it. There is no use in having a separate + symbol in libpthread, but the historical ABI requires it. For static + linking, there is no need to provide anything here--the libc version + will be linked in. For shared library ABI compatibility, there must be + longjmp and siglongjmp symbols in libpthread.so. + + With an IFUNC resolver, it would be possible to avoid the indirection, + but the IFUNC resolver might run before the __libc_longjmp symbol has + been relocated, in which case the IFUNC resolver would not be able to + provide the correct address. */ + +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22) + +static void __attribute__ ((noreturn, used)) +longjmp_compat (jmp_buf env, int val) +{ + /* NB: We call __libc_siglongjmp, instead of __libc_longjmp, since + __libc_longjmp is a private interface for cancellation which + doesn't restore shadow stack register. */ + __libc_siglongjmp (env, val); +} + +strong_alias (longjmp_compat, longjmp_alias) +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0); + +strong_alias (longjmp_alias, siglongjmp_alias) +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0); + +#endif -- 2.14.3 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCHv2] x86: Use pad in pthread_unwind_buf to preserve shadow stack register 2018-04-12 23:50 ` H.J. Lu @ 2018-04-21 3:28 ` Carlos O'Donell 2018-04-21 18:37 ` [PATCH/v3] " H.J. Lu 0 siblings, 1 reply; 13+ messages in thread From: Carlos O'Donell @ 2018-04-21 3:28 UTC (permalink / raw) To: H.J. Lu; +Cc: Florian Weimer, Joseph Myers, Tsimbalist, Igor V, GNU C Library Lets call this v2. Subject adjusted. Please keep incrementing the version number on your patches to make the review easier by myself and others. On 04/12/2018 06:50 PM, H.J. Lu wrote: > From 1ce2b4f199070a63d9a60bd9b7ca9e33013d4208 Mon Sep 17 00:00:00 2001 > From: "H.J. Lu" <hjl.tools@gmail.com> > Date: Sat, 24 Feb 2018 17:23:54 -0800 > Subject: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack > register > > The pad array in struct pthread_unwind_buf is used by setjmp to save > shadow stack register. We assert that size of struct pthread_unwind_buf > is no less than offset of shadow stack pointer + shadow stack pointer > size. > > Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as > these with thread cancellation, call setjmp, but never return after > __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as > __libc_longjmp on x86, doesn't need to restore shadow stack register. > __libc_longjmp, which is a private interface for thread cancellation > implementation in libpthread, is changed to call __longjmp_cancel, > instead of __longjmp. __longjmp_cancel is a new internal function > in libc, which is similar to __longjmp, but doesn't restore shadow > stack register. > > The compatibility longjmp and siglongjmp in libpthread.so are changed > to call __libc_siglongjmp, instead of __libc_longjmp, so that they will > restore shadow stack register. > > Tested with build-many-glibcs.py. > > * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous > handlers after setjmp. > * setjmp/longjmp.c (__libc_longjmp): Don't define alias if > defined. > * sysdeps/unix/sysv/linux/x86/setjmpP.h: Include > <libc-pointer-arith.h>. > (_JUMP_BUF_SIGSET_BITS_PER_WORD): New. > (_JUMP_BUF_SIGSET_NSIG): Changed to 96. > (_JUMP_BUF_SIGSET_NWORDS): Changed to use ALIGN_UP and > _JUMP_BUF_SIGSET_BITS_PER_WORD. > * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel. > * sysdeps/x86/__longjmp_cancel.S: New file. > * sysdeps/x86/longjmp.c: Likewise. > * sysdeps/x86/nptl/pt-longjmp.c: Likewise. > --- > nptl/pthread_create.c | 18 +++++-- > setjmp/longjmp.c | 2 + > sysdeps/unix/sysv/linux/x86/setjmpP.h | 12 +++-- > sysdeps/x86/Makefile | 4 ++ > sysdeps/x86/__longjmp_cancel.S | 20 ++++++++ > sysdeps/x86/longjmp.c | 45 ++++++++++++++++ > sysdeps/x86/nptl/pt-longjmp.c | 97 +++++++++++++++++++++++++++++++++++ > 7 files changed, 191 insertions(+), 7 deletions(-) > create mode 100644 sysdeps/x86/__longjmp_cancel.S > create mode 100644 sysdeps/x86/longjmp.c > create mode 100644 sysdeps/x86/nptl/pt-longjmp.c > > diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c > index caaf07c134..8b1f06599d 100644 > --- a/nptl/pthread_create.c > +++ b/nptl/pthread_create.c > @@ -427,12 +427,24 @@ START_THREAD_DEFN > compilers without that support we do use setjmp. */ > struct pthread_unwind_buf unwind_buf; > > - /* No previous handlers. */ > + int not_first_call; > + not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); > + > + /* No previous handlers. NB: This must be done after setjmp since > + the private space in the unwind jump buffer may overlap space > + used by setjmp to store extra architecture-specific information > + which is never be used by the cancellation-specific s/be//g > + __libc_unwind_longjmp. > + > + The private space is allowed to overlap because the unwinder never > + has to return through any of the jumped-to call frames, and thus > + only a minimum amount of saved data need be stored, and for example, > + need not include the process signal mask information. This is all > + an optimization to reduce stack usage when pushing cancellation > + handlers. */ > unwind_buf.priv.data.prev = NULL; > unwind_buf.priv.data.cleanup = NULL; > > - int not_first_call; > - not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); OK. > if (__glibc_likely (! not_first_call)) > { > /* Store the new cleanup handler info. */ > diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c > index a2a7065a85..453889e103 100644 > --- a/setjmp/longjmp.c > +++ b/setjmp/longjmp.c > @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val) > } > > #ifndef __libc_siglongjmp > +# ifndef __libc_longjmp > /* __libc_longjmp is a private interface for cancellation implementation > in libpthread. */ > strong_alias (__libc_siglongjmp, __libc_longjmp) > +# endif OK. > weak_alias (__libc_siglongjmp, _longjmp) > weak_alias (__libc_siglongjmp, longjmp) > weak_alias (__libc_siglongjmp, siglongjmp) > diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h > index c0ed767a0d..24f87da204 100644 > --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h > +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h > @@ -20,13 +20,17 @@ > #define _SETJMPP_H 1 > > #include <bits/types/__sigset_t.h> > +#include <libc-pointer-arith.h> > > -/* The biggest signal number + 1. As of kernel 4.14, x86 _NSIG is 64. > - Define it to 513 to leave some rooms for future use. */ > -#define _JUMP_BUF_SIGSET_NSIG 513 > +/* Number of bits per long. */ > +#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int)) > +/* The biggest signal number. As of kernel 4.14, x86 _NSIG is 64. > + Define it to 96 to leave some rooms for future use. */ Why 96? Sure on x86_64 the cancel jump buf has 4 x void*'s and each is 64-bits, so you have 128 signals worth of space and then the shadow stack pointer. Does this work on x32? On x32 you have only 4 void*'s in the private pad. Your presently structured sigset_t looks like this: typedef union { __sigset_t __saved_mask_compat; struct { __jmp_buf_sigset_t __saved_mask; /* Used for shadow stack pointer. */ unsigned long int __shadow_stack_pointer; } __saved; } __jmpbuf_arch_t; Which means you have a sigset_t *before* the __shadow_stack_pointer. On x32 to save a 64-bit shadow stack pointer, you'll only have 2 x 32-bit words left? That's only space for 64 signals? Are you counting on one additional 32-bit word of padding between the __mask_was_saved and the rest of the long arguments? If so, then this still needs spelling out in an a comment why 96 signals works on both i686, x32, and x86_64. Also it should be explained if 96 is the *maximum* we can do with the current layout, or if more is possible. In which case picking 96 is *not* arbitrary. > +#define _JUMP_BUF_SIGSET_NSIG 96 > /* Number of longs to hold all signals. */ > #define _JUMP_BUF_SIGSET_NWORDS \ > - ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) > + (ALIGN_UP (_JUMP_BUF_SIGSET_NSIG, _JUMP_BUF_SIGSET_BITS_PER_WORD) \ > + / _JUMP_BUF_SIGSET_BITS_PER_WORD) > > typedef struct > { > diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile > index 0d0326c21a..d25d6f0ae4 100644 > --- a/sysdeps/x86/Makefile > +++ b/sysdeps/x86/Makefile > @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features > tests += tst-get-cpu-features > tests-static += tst-get-cpu-features-static > endif > + > +ifeq ($(subdir),setjmp) > +sysdep_routines += __longjmp_cancel > +endif > diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S > new file mode 100644 > index 0000000000..b57dbfa376 > --- /dev/null > +++ b/sysdeps/x86/__longjmp_cancel.S > @@ -0,0 +1,20 @@ > +/* __longjmp_cancel for x86. > + Copyright (C) 2018 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <http://www.gnu.org/licenses/>. */ > + > +#define __longjmp __longjmp_cancel > +#include <__longjmp.S> OK. > diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c > new file mode 100644 > index 0000000000..a53f31e1dd > --- /dev/null > +++ b/sysdeps/x86/longjmp.c > @@ -0,0 +1,45 @@ > +/* __libc_siglongjmp for x86. > + Copyright (C) 2018 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <http://www.gnu.org/licenses/>. */ > + > +#define __libc_longjmp __redirect___libc_longjmp > +#include <setjmp/longjmp.c> > +#undef __libc_longjmp > + > +extern void __longjmp_cancel (__jmp_buf __env, int __val) > + __attribute__ ((__noreturn__)) attribute_hidden; > + > +/* Since __libc_longjmp is a private interface for cancellation > + implementation in libpthread, there is no need to restore shadow > + stack register. */ > + > +void > +__libc_longjmp (sigjmp_buf env, int val) > +{ > + /* Perform any cleanups needed by the frames being unwound. */ > + _longjmp_unwind (env, val); > + > + if (env[0].__mask_was_saved) > + /* Restore the saved signal mask. */ > + (void) __sigprocmask (SIG_SETMASK, > + (sigset_t *) &env[0].__saved_mask, > + (sigset_t *) NULL); > + > + /* Call the machine-dependent function to restore machine state > + without shadow stack. */ > + __longjmp_cancel (env[0].__jmpbuf, val ?: 1); > +} OK. > diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c > new file mode 100644 > index 0000000000..7eb8651cfe > --- /dev/null > +++ b/sysdeps/x86/nptl/pt-longjmp.c > @@ -0,0 +1,97 @@ > +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI. > + X86 version. > + Copyright (C) 18 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <http://www.gnu.org/licenses/>. */ > + > +/* <nptl/descr.h> has > + > +struct pthread_unwind_buf > +{ > + struct > + { > + __jmp_buf jmp_buf; > + int mask_was_saved; > + } cancel_jmp_buf[1]; > + > + union > + { > + void *pad[4]; > + struct > + { > + struct pthread_unwind_buf *prev; > + struct _pthread_cleanup_buffer *cleanup; > + int canceltype; > + } data; > + } priv; > +}; > + > + The pad array in struct pthread_unwind_buf is used by setjmp to save > + shadow stack register. Assert that size of struct pthread_unwind_buf > + is no less than offset of shadow stack pointer plus shadow stack > + pointer size. > + > + NB: setjmp is called in libpthread to save shadow stack register. But > + __libc_unwind_longjmp doesn't restore shadow stack register since they > + never return after longjmp. */ Suggest: ~~~ The pad array in struct pthread_unwind_buf is used by setjmp to save the shadow stack register. Assert that the size of struct pthread_unwind_buf is no less than the offset of the shadow stack pointer plus the shadow stack pointer size. NB: We use setjmp in thread cancellation and this saves the shadow stack register, but __libc_unwind_longjmp doesn't restore the shadow stack register since cancellation never returns after longjmp. ~~~ > + > +#include <pthreadP.h> > +#include <jmp_buf-ssp.h> > + > +#ifdef __x86_64__ > +# define SHADOW_STACK_POINTER_SIZE 8 > +#else > +# define SHADOW_STACK_POINTER_SIZE 4 > +#endif > + > +_Static_assert ((sizeof (struct pthread_unwind_buf) > + >= (SHADOW_STACK_POINTER_OFFSET > + + SHADOW_STACK_POINTER_SIZE)), > + "size of struct pthread_unwind_buf < " > + "(SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)"); This assertion is too loose. The assertion you need is that the shadow stack pointer itself falls within the range of the private padding. This would have caught the previous bug with the rounded up size. Please adjust the assertion to be as tight as possible or add new assertions. Completely untested, but just to show what I'm thinking: _Static_assert ((offsetof (struct pthread_unwind_buf, cancel_jump_buf[0].mask_was_saved) + sizeof (int) < SHADOW_STACK_POINTER_OFFSET && (offsetof (struct pthread_unwind_buf, priv.pad[4]) > (SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE), "Shadow stack pointer is not within private storage of pthread_unwind_buf."); > + > +#include <shlib-compat.h> > + > +/* libpthread once had its own longjmp (and siglongjmp alias), though there > + was no apparent reason for it. There is no use in having a separate > + symbol in libpthread, but the historical ABI requires it. For static > + linking, there is no need to provide anything here--the libc version > + will be linked in. For shared library ABI compatibility, there must be > + longjmp and siglongjmp symbols in libpthread.so. > + > + With an IFUNC resolver, it would be possible to avoid the indirection, > + but the IFUNC resolver might run before the __libc_longjmp symbol has > + been relocated, in which case the IFUNC resolver would not be able to > + provide the correct address. */ > + > +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22) > + > +static void __attribute__ ((noreturn, used)) > +longjmp_compat (jmp_buf env, int val) > +{ > + /* NB: We call __libc_siglongjmp, instead of __libc_longjmp, since > + __libc_longjmp is a private interface for cancellation which > + doesn't restore shadow stack register. */ > + __libc_siglongjmp (env, val); > +} > + > +strong_alias (longjmp_compat, longjmp_alias) > +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0); > + > +strong_alias (longjmp_alias, siglongjmp_alias) > +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0); > + > +#endif > -- 2.14.3 Looking forward to a v3. -- Cheers, Carlos. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH/v3] x86: Use pad in pthread_unwind_buf to preserve shadow stack register 2018-04-21 3:28 ` [PATCHv2] " Carlos O'Donell @ 2018-04-21 18:37 ` H.J. Lu 2018-05-02 4:43 ` Carlos O'Donell 0 siblings, 1 reply; 13+ messages in thread From: H.J. Lu @ 2018-04-21 18:37 UTC (permalink / raw) To: Carlos O'Donell Cc: Florian Weimer, Joseph Myers, Tsimbalist, Igor V, GNU C Library On Fri, Apr 20, 2018 at 10:28:24PM -0500, Carlos O'Donell wrote: > Lets call this v2. Subject adjusted. Please keep incrementing the version > number on your patches to make the review easier by myself and others. > Done. > > + > > + /* No previous handlers. NB: This must be done after setjmp since > > + the private space in the unwind jump buffer may overlap space > > + used by setjmp to store extra architecture-specific information > > + which is never be used by the cancellation-specific > > s/be//g Done > > +/* Number of bits per long. */ > > +#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int)) > > +/* The biggest signal number. As of kernel 4.14, x86 _NSIG is 64. > > + Define it to 96 to leave some rooms for future use. */ > > Why 96? > > Sure on x86_64 the cancel jump buf has 4 x void*'s and each is 64-bits, > so you have 128 signals worth of space and then the shadow stack pointer. > > Does this work on x32? > > On x32 you have only 4 void*'s in the private pad. > > Your presently structured sigset_t looks like this: > > typedef union > { > __sigset_t __saved_mask_compat; > struct > { > __jmp_buf_sigset_t __saved_mask; > /* Used for shadow stack pointer. */ > unsigned long int __shadow_stack_pointer; > } __saved; > } __jmpbuf_arch_t; > > Which means you have a sigset_t *before* the __shadow_stack_pointer. > > On x32 to save a 64-bit shadow stack pointer, you'll only have 2 x 32-bit > words left? That's only space for 64 signals? > > Are you counting on one additional 32-bit word of padding between the > __mask_was_saved and the rest of the long arguments? > > If so, then this still needs spelling out in an a comment why 96 signals > works on both i686, x32, and x86_64. > > Also it should be explained if 96 is the *maximum* we can do with the current > layout, or if more is possible. In which case picking 96 is *not* arbitrary. The pad array in struct pthread_unwind_buf is used by setjmp to save shadow stack register. The usable space in __saved_mask for sigset and shadow stack pointer: 1. i386: The 4x4 byte pad array which can be used for 4 byte shadow stack pointer and maximum 12 byte sigset. 2. x32: 4 byte padding + the 4x4 byte pad array which can be used for 8 byte shadow stack pointer and maximum 12 byte sigset. 3. x86-64: The 4x8 byte pad array which can be used for 8 byte shadow stack pointer and maximum 24 byte sigset. Please see comments in sysdeps/unix/sysv/linux/x86/setjmpP.h for details. > Suggest: > ~~~ > The pad array in struct pthread_unwind_buf is used by setjmp to save > the shadow stack register. Assert that the size of struct pthread_unwind_buf > is no less than the offset of the shadow stack pointer plus the shadow stack > pointer size. > > NB: We use setjmp in thread cancellation and this saves the shadow stack > register, but __libc_unwind_longjmp doesn't restore the shadow stack register > since cancellation never returns after longjmp. > ~~~ > Done. > > This assertion is too loose. > > The assertion you need is that the shadow stack pointer itself falls within > the range of the private padding. This would have caught the previous bug > with the rounded up size. > > Please adjust the assertion to be as tight as possible or add new assertions. > > Completely untested, but just to show what I'm thinking: > > _Static_assert ((offsetof (struct pthread_unwind_buf, cancel_jump_buf[0].mask_was_saved) > + sizeof (int) < SHADOW_STACK_POINTER_OFFSET > && (offsetof (struct pthread_unwind_buf, priv.pad[4]) > > (SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE), > "Shadow stack pointer is not within private storage of pthread_unwind_buf."); > I used a slighly different assert. > > Looking forward to a v3. > Here it is. H.J. --- The pad array in struct pthread_unwind_buf is used by setjmp to save shadow stack register. We assert that size of struct pthread_unwind_buf is no less than offset of shadow stack pointer + shadow stack pointer size. Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as these with thread cancellation, call setjmp, but never return after __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as __libc_longjmp on x86, doesn't need to restore shadow stack register. __libc_longjmp, which is a private interface for thread cancellation implementation in libpthread, is changed to call __longjmp_cancel, instead of __longjmp. __longjmp_cancel is a new internal function in libc, which is similar to __longjmp, but doesn't restore shadow stack register. The compatibility longjmp and siglongjmp in libpthread.so are changed to call __libc_siglongjmp, instead of __libc_longjmp, so that they will restore shadow stack register. Tested with build-many-glibcs.py. * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous handlers after setjmp. * setjmp/longjmp.c (__libc_longjmp): Don't define alias if defined. * sysdeps/unix/sysv/linux/x86/setjmpP.h: Include <libc-pointer-arith.h>. (_JUMP_BUF_SIGSET_BITS_PER_WORD): New. (_JUMP_BUF_SIGSET_NSIG): Changed to 96. (_JUMP_BUF_SIGSET_NWORDS): Changed to use ALIGN_UP and _JUMP_BUF_SIGSET_BITS_PER_WORD. * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel. * sysdeps/x86/__longjmp_cancel.S: New file. * sysdeps/x86/longjmp.c: Likewise. * sysdeps/x86/nptl/pt-longjmp.c: Likewise. --- nptl/pthread_create.c | 17 +++++++-- setjmp/longjmp.c | 2 + sysdeps/unix/sysv/linux/x86/setjmpP.h | 71 ++++++++++++++++++++++++++++++++--- sysdeps/x86/Makefile | 4 ++ sysdeps/x86/__longjmp_cancel.S | 20 ++++++++++ sysdeps/x86/longjmp.c | 45 ++++++++++++++++++++++ sysdeps/x86/nptl/pt-longjmp.c | 71 +++++++++++++++++++++++++++++++++++ 7 files changed, 222 insertions(+), 8 deletions(-) create mode 100644 sysdeps/x86/__longjmp_cancel.S create mode 100644 sysdeps/x86/longjmp.c create mode 100644 sysdeps/x86/nptl/pt-longjmp.c diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c index caaf07c134..92c945b12b 100644 --- a/nptl/pthread_create.c +++ b/nptl/pthread_create.c @@ -427,12 +427,23 @@ START_THREAD_DEFN compilers without that support we do use setjmp. */ struct pthread_unwind_buf unwind_buf; - /* No previous handlers. */ + int not_first_call; + not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); + + /* No previous handlers. NB: This must be done after setjmp since the + private space in the unwind jump buffer may overlap space used by + setjmp to store extra architecture-specific information which is + never used by the cancellation-specific __libc_unwind_longjmp. + + The private space is allowed to overlap because the unwinder never + has to return through any of the jumped-to call frames, and thus + only a minimum amount of saved data need be stored, and for example, + need not include the process signal mask information. This is all + an optimization to reduce stack usage when pushing cancellation + handlers. */ unwind_buf.priv.data.prev = NULL; unwind_buf.priv.data.cleanup = NULL; - int not_first_call; - not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); if (__glibc_likely (! not_first_call)) { /* Store the new cleanup handler info. */ diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c index a2a7065a85..453889e103 100644 --- a/setjmp/longjmp.c +++ b/setjmp/longjmp.c @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val) } #ifndef __libc_siglongjmp +# ifndef __libc_longjmp /* __libc_longjmp is a private interface for cancellation implementation in libpthread. */ strong_alias (__libc_siglongjmp, __libc_longjmp) +# endif weak_alias (__libc_siglongjmp, _longjmp) weak_alias (__libc_siglongjmp, longjmp) weak_alias (__libc_siglongjmp, siglongjmp) diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h index c0ed767a0d..6b2608453d 100644 --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h @@ -20,13 +20,72 @@ #define _SETJMPP_H 1 #include <bits/types/__sigset_t.h> +#include <libc-pointer-arith.h> -/* The biggest signal number + 1. As of kernel 4.14, x86 _NSIG is 64. - Define it to 513 to leave some rooms for future use. */ -#define _JUMP_BUF_SIGSET_NSIG 513 +/* <setjmp/setjmp.h> has + +struct __jmp_buf_tag + { + __jmp_buf __jmpbuf; + int __mask_was_saved; + __sigset_t __saved_mask; + }; + + struct __jmp_buf_tag is 32 bits aligned on i386 and is 64 bits + aligned on x32 and x86-64. __saved_mask is aligned to 32 bits + on i386/x32 without padding and is aligned to 64 bits on x86-64 + with 32 bit padding. + + and <nptl/descr.h> has + +struct pthread_unwind_buf +{ + struct + { + __jmp_buf jmp_buf; + int mask_was_saved; + } cancel_jmp_buf[1]; + + union + { + void *pad[4]; + struct + { + struct pthread_unwind_buf *prev; + struct _pthread_cleanup_buffer *cleanup; + int canceltype; + } data; + } priv; +}; + + struct pthread_unwind_buf is 32 bits aligned on i386 and 64 bits + aligned on x32/x86-64. cancel_jmp_buf is aligned to 32 bits on + i386 and is aligned to 64 bits on x32/x86-64. + + The pad array in struct pthread_unwind_buf is used by setjmp to save + shadow stack register. The usable space in __saved_mask for sigset + and shadow stack pointer: + 1. i386: The 4x4 byte pad array which can be used for 4 byte shadow + stack pointer and maximum 12 byte sigset. + 2. x32: 4 byte padding + the 4x4 byte pad array which can be used + for 8 byte shadow stack pointer and maximum 12 byte sigset. + 3. x86-64: The 4x8 byte pad array which can be used for 8 byte + shadow stack pointer and maximum 24 byte sigset. + + NB: We use setjmp in thread cancellation and this saves the shadow + stack register, but __libc_unwind_longjmp doesn't restore the shadow + stack register since cancellation never returns after longjmp. */ + +/* Number of bits per long. */ +#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int)) +/* The biggest signal number. As of kernel 4.14, x86 _NSIG is 64. The + common maximum sigset for i386, x32 and x86-64 is 12 bytes (96 bits). + Define it to 96 to leave some rooms for future use. */ +#define _JUMP_BUF_SIGSET_NSIG 96 /* Number of longs to hold all signals. */ #define _JUMP_BUF_SIGSET_NWORDS \ - ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) + (ALIGN_UP (_JUMP_BUF_SIGSET_NSIG, _JUMP_BUF_SIGSET_BITS_PER_WORD) \ + / _JUMP_BUF_SIGSET_BITS_PER_WORD) typedef struct { @@ -39,7 +98,9 @@ typedef union struct { __jmp_buf_sigset_t __saved_mask; - /* Used for shadow stack pointer. */ + /* Used for shadow stack pointer. NB: Shadow stack pointer + must have the same alignment as __saved_mask. Otherwise + offset of __saved_mask will be changed. */ unsigned long int __shadow_stack_pointer; } __saved; } __jmpbuf_arch_t; diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile index 0d0326c21a..d25d6f0ae4 100644 --- a/sysdeps/x86/Makefile +++ b/sysdeps/x86/Makefile @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features tests += tst-get-cpu-features tests-static += tst-get-cpu-features-static endif + +ifeq ($(subdir),setjmp) +sysdep_routines += __longjmp_cancel +endif diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S new file mode 100644 index 0000000000..b57dbfa376 --- /dev/null +++ b/sysdeps/x86/__longjmp_cancel.S @@ -0,0 +1,20 @@ +/* __longjmp_cancel for x86. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#define __longjmp __longjmp_cancel +#include <__longjmp.S> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c new file mode 100644 index 0000000000..a53f31e1dd --- /dev/null +++ b/sysdeps/x86/longjmp.c @@ -0,0 +1,45 @@ +/* __libc_siglongjmp for x86. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#define __libc_longjmp __redirect___libc_longjmp +#include <setjmp/longjmp.c> +#undef __libc_longjmp + +extern void __longjmp_cancel (__jmp_buf __env, int __val) + __attribute__ ((__noreturn__)) attribute_hidden; + +/* Since __libc_longjmp is a private interface for cancellation + implementation in libpthread, there is no need to restore shadow + stack register. */ + +void +__libc_longjmp (sigjmp_buf env, int val) +{ + /* Perform any cleanups needed by the frames being unwound. */ + _longjmp_unwind (env, val); + + if (env[0].__mask_was_saved) + /* Restore the saved signal mask. */ + (void) __sigprocmask (SIG_SETMASK, + (sigset_t *) &env[0].__saved_mask, + (sigset_t *) NULL); + + /* Call the machine-dependent function to restore machine state + without shadow stack. */ + __longjmp_cancel (env[0].__jmpbuf, val ?: 1); +} diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c new file mode 100644 index 0000000000..6165c7d4a7 --- /dev/null +++ b/sysdeps/x86/nptl/pt-longjmp.c @@ -0,0 +1,71 @@ +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI. + X86 version. + Copyright (C) 18 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include <pthreadP.h> +#include <jmp_buf-ssp.h> + +#ifdef __x86_64__ +# define SHADOW_STACK_POINTER_SIZE 8 +#else +# define SHADOW_STACK_POINTER_SIZE 4 +#endif + +/* Assert that the priv field in struct pthread_unwind_buf has space + to store shadow stack pointer. */ +_Static_assert ((offsetof (struct pthread_unwind_buf, priv) + <= SHADOW_STACK_POINTER_OFFSET) + && ((offsetof (struct pthread_unwind_buf, priv) + + sizeof (((struct pthread_unwind_buf *) 0)->priv)) + >= (SHADOW_STACK_POINTER_OFFSET + + SHADOW_STACK_POINTER_SIZE)), + "Shadow stack pointer is not within private storage " + "of pthread_unwind_buf."); + +#include <shlib-compat.h> + +/* libpthread once had its own longjmp (and siglongjmp alias), though there + was no apparent reason for it. There is no use in having a separate + symbol in libpthread, but the historical ABI requires it. For static + linking, there is no need to provide anything here--the libc version + will be linked in. For shared library ABI compatibility, there must be + longjmp and siglongjmp symbols in libpthread.so. + + With an IFUNC resolver, it would be possible to avoid the indirection, + but the IFUNC resolver might run before the __libc_longjmp symbol has + been relocated, in which case the IFUNC resolver would not be able to + provide the correct address. */ + +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22) + +static void __attribute__ ((noreturn, used)) +longjmp_compat (jmp_buf env, int val) +{ + /* NB: We call __libc_siglongjmp, instead of __libc_longjmp, since + __libc_longjmp is a private interface for cancellation which + doesn't restore shadow stack register. */ + __libc_siglongjmp (env, val); +} + +strong_alias (longjmp_compat, longjmp_alias) +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0); + +strong_alias (longjmp_alias, siglongjmp_alias) +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0); + +#endif -- 2.14.3 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH/v3] x86: Use pad in pthread_unwind_buf to preserve shadow stack register 2018-04-21 18:37 ` [PATCH/v3] " H.J. Lu @ 2018-05-02 4:43 ` Carlos O'Donell 2018-05-02 12:45 ` H.J. Lu 0 siblings, 1 reply; 13+ messages in thread From: Carlos O'Donell @ 2018-05-02 4:43 UTC (permalink / raw) To: H.J. Lu; +Cc: Florian Weimer, Joseph Myers, Tsimbalist, Igor V, GNU C Library On 04/21/2018 02:37 PM, H.J. Lu wrote: > On Fri, Apr 20, 2018 at 10:28:24PM -0500, Carlos O'Donell wrote: >> Lets call this v2. Subject adjusted. Please keep incrementing the version >> number on your patches to make the review easier by myself and others. >> > > Done. > >>> + >>> + /* No previous handlers. NB: This must be done after setjmp since >>> + the private space in the unwind jump buffer may overlap space >>> + used by setjmp to store extra architecture-specific information >>> + which is never be used by the cancellation-specific >> >> s/be//g > > Done > >>> +/* Number of bits per long. */ >>> +#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int)) >>> +/* The biggest signal number. As of kernel 4.14, x86 _NSIG is 64. >>> + Define it to 96 to leave some rooms for future use. */ >> >> Why 96? >> >> Sure on x86_64 the cancel jump buf has 4 x void*'s and each is 64-bits, >> so you have 128 signals worth of space and then the shadow stack pointer. >> >> Does this work on x32? >> >> On x32 you have only 4 void*'s in the private pad. >> >> Your presently structured sigset_t looks like this: >> >> typedef union >> { >> __sigset_t __saved_mask_compat; >> struct >> { >> __jmp_buf_sigset_t __saved_mask; >> /* Used for shadow stack pointer. */ >> unsigned long int __shadow_stack_pointer; >> } __saved; >> } __jmpbuf_arch_t; >> >> Which means you have a sigset_t *before* the __shadow_stack_pointer. >> >> On x32 to save a 64-bit shadow stack pointer, you'll only have 2 x 32-bit >> words left? That's only space for 64 signals? >> >> Are you counting on one additional 32-bit word of padding between the >> __mask_was_saved and the rest of the long arguments? >> >> If so, then this still needs spelling out in an a comment why 96 signals >> works on both i686, x32, and x86_64. >> >> Also it should be explained if 96 is the *maximum* we can do with the current >> layout, or if more is possible. In which case picking 96 is *not* arbitrary. > > The pad array in struct pthread_unwind_buf is used by setjmp to save > shadow stack register. The usable space in __saved_mask for sigset > and shadow stack pointer: > 1. i386: The 4x4 byte pad array which can be used for 4 byte shadow > stack pointer and maximum 12 byte sigset. > 2. x32: 4 byte padding + the 4x4 byte pad array which can be used > for 8 byte shadow stack pointer and maximum 12 byte sigset. > 3. x86-64: The 4x8 byte pad array which can be used for 8 byte > shadow stack pointer and maximum 24 byte sigset. > > Please see comments in sysdeps/unix/sysv/linux/x86/setjmpP.h for > details. > >> Suggest: >> ~~~ >> The pad array in struct pthread_unwind_buf is used by setjmp to save >> the shadow stack register. Assert that the size of struct pthread_unwind_buf >> is no less than the offset of the shadow stack pointer plus the shadow stack >> pointer size. >> >> NB: We use setjmp in thread cancellation and this saves the shadow stack >> register, but __libc_unwind_longjmp doesn't restore the shadow stack register >> since cancellation never returns after longjmp. >> ~~~ >> > > Done. > >> >> This assertion is too loose. >> >> The assertion you need is that the shadow stack pointer itself falls within >> the range of the private padding. This would have caught the previous bug >> with the rounded up size. >> >> Please adjust the assertion to be as tight as possible or add new assertions. >> >> Completely untested, but just to show what I'm thinking: >> >> _Static_assert ((offsetof (struct pthread_unwind_buf, cancel_jump_buf[0].mask_was_saved) >> + sizeof (int) < SHADOW_STACK_POINTER_OFFSET >> && (offsetof (struct pthread_unwind_buf, priv.pad[4]) >> > (SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE), >> "Shadow stack pointer is not within private storage of pthread_unwind_buf."); >> > > I used a slighly different assert. > >> >> Looking forward to a v3. >> > > Here it is. > > H.J. > --- > The pad array in struct pthread_unwind_buf is used by setjmp to save > shadow stack register. We assert that size of struct pthread_unwind_buf > is no less than offset of shadow stack pointer + shadow stack pointer > size. > > Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as > these with thread cancellation, call setjmp, but never return after > __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as > __libc_longjmp on x86, doesn't need to restore shadow stack register. > __libc_longjmp, which is a private interface for thread cancellation > implementation in libpthread, is changed to call __longjmp_cancel, > instead of __longjmp. __longjmp_cancel is a new internal function > in libc, which is similar to __longjmp, but doesn't restore shadow > stack register. > > The compatibility longjmp and siglongjmp in libpthread.so are changed > to call __libc_siglongjmp, instead of __libc_longjmp, so that they will > restore shadow stack register. > > Tested with build-many-glibcs.py. > > * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous > handlers after setjmp. > * setjmp/longjmp.c (__libc_longjmp): Don't define alias if > defined. > * sysdeps/unix/sysv/linux/x86/setjmpP.h: Include > <libc-pointer-arith.h>. > (_JUMP_BUF_SIGSET_BITS_PER_WORD): New. > (_JUMP_BUF_SIGSET_NSIG): Changed to 96. > (_JUMP_BUF_SIGSET_NWORDS): Changed to use ALIGN_UP and > _JUMP_BUF_SIGSET_BITS_PER_WORD. > * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel. > * sysdeps/x86/__longjmp_cancel.S: New file. > * sysdeps/x86/longjmp.c: Likewise. > * sysdeps/x86/nptl/pt-longjmp.c: Likewise. > --- > nptl/pthread_create.c | 17 +++++++-- > setjmp/longjmp.c | 2 + > sysdeps/unix/sysv/linux/x86/setjmpP.h | 71 ++++++++++++++++++++++++++++++++--- > sysdeps/x86/Makefile | 4 ++ > sysdeps/x86/__longjmp_cancel.S | 20 ++++++++++ > sysdeps/x86/longjmp.c | 45 ++++++++++++++++++++++ > sysdeps/x86/nptl/pt-longjmp.c | 71 +++++++++++++++++++++++++++++++++++ > 7 files changed, 222 insertions(+), 8 deletions(-) > create mode 100644 sysdeps/x86/__longjmp_cancel.S > create mode 100644 sysdeps/x86/longjmp.c > create mode 100644 sysdeps/x86/nptl/pt-longjmp.c > > diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c > index caaf07c134..92c945b12b 100644 > --- a/nptl/pthread_create.c > +++ b/nptl/pthread_create.c > @@ -427,12 +427,23 @@ START_THREAD_DEFN > compilers without that support we do use setjmp. */ > struct pthread_unwind_buf unwind_buf; > > - /* No previous handlers. */ > + int not_first_call; > + not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); > + > + /* No previous handlers. NB: This must be done after setjmp since the > + private space in the unwind jump buffer may overlap space used by > + setjmp to store extra architecture-specific information which is > + never used by the cancellation-specific __libc_unwind_longjmp. > + > + The private space is allowed to overlap because the unwinder never > + has to return through any of the jumped-to call frames, and thus > + only a minimum amount of saved data need be stored, and for example, > + need not include the process signal mask information. This is all > + an optimization to reduce stack usage when pushing cancellation > + handlers. */ OK. > unwind_buf.priv.data.prev = NULL; > unwind_buf.priv.data.cleanup = NULL; > > - int not_first_call; > - not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); > if (__glibc_likely (! not_first_call)) > { > /* Store the new cleanup handler info. */ > diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c > index a2a7065a85..453889e103 100644 > --- a/setjmp/longjmp.c > +++ b/setjmp/longjmp.c > @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val) > } > > #ifndef __libc_siglongjmp > +# ifndef __libc_longjmp > /* __libc_longjmp is a private interface for cancellation implementation > in libpthread. */ > strong_alias (__libc_siglongjmp, __libc_longjmp) > +# endif OK. > weak_alias (__libc_siglongjmp, _longjmp) > weak_alias (__libc_siglongjmp, longjmp) > weak_alias (__libc_siglongjmp, siglongjmp) > diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h > index c0ed767a0d..6b2608453d 100644 > --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h > +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h > @@ -20,13 +20,72 @@ > #define _SETJMPP_H 1 > > #include <bits/types/__sigset_t.h> > +#include <libc-pointer-arith.h> > > -/* The biggest signal number + 1. As of kernel 4.14, x86 _NSIG is 64. > - Define it to 513 to leave some rooms for future use. */ > -#define _JUMP_BUF_SIGSET_NSIG 513 > +/* <setjmp/setjmp.h> has > + > +struct __jmp_buf_tag > + { > + __jmp_buf __jmpbuf; > + int __mask_was_saved; > + __sigset_t __saved_mask; > + }; > + > + struct __jmp_buf_tag is 32 bits aligned on i386 and is 64 bits > + aligned on x32 and x86-64. __saved_mask is aligned to 32 bits > + on i386/x32 without padding and is aligned to 64 bits on x86-64 > + with 32 bit padding. > + > + and <nptl/descr.h> has > + > +struct pthread_unwind_buf > +{ > + struct > + { > + __jmp_buf jmp_buf; > + int mask_was_saved; > + } cancel_jmp_buf[1]; > + > + union > + { > + void *pad[4]; > + struct > + { > + struct pthread_unwind_buf *prev; > + struct _pthread_cleanup_buffer *cleanup; > + int canceltype; > + } data; > + } priv; > +}; > + > + struct pthread_unwind_buf is 32 bits aligned on i386 and 64 bits > + aligned on x32/x86-64. cancel_jmp_buf is aligned to 32 bits on > + i386 and is aligned to 64 bits on x32/x86-64. OK. > + > + The pad array in struct pthread_unwind_buf is used by setjmp to save > + shadow stack register. The usable space in __saved_mask for sigset > + and shadow stack pointer: > + 1. i386: The 4x4 byte pad array which can be used for 4 byte shadow > + stack pointer and maximum 12 byte sigset. > + 2. x32: 4 byte padding + the 4x4 byte pad array which can be used > + for 8 byte shadow stack pointer and maximum 12 byte sigset. > + 3. x86-64: The 4x8 byte pad array which can be used for 8 byte > + shadow stack pointer and maximum 24 byte sigset. OK. > + > + NB: We use setjmp in thread cancellation and this saves the shadow > + stack register, but __libc_unwind_longjmp doesn't restore the shadow > + stack register since cancellation never returns after longjmp. */ > + > +/* Number of bits per long. */ > +#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int)) > +/* The biggest signal number. As of kernel 4.14, x86 _NSIG is 64. The > + common maximum sigset for i386, x32 and x86-64 is 12 bytes (96 bits). > + Define it to 96 to leave some rooms for future use. */ > +#define _JUMP_BUF_SIGSET_NSIG 96 > /* Number of longs to hold all signals. */ > #define _JUMP_BUF_SIGSET_NWORDS \ > - ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) > + (ALIGN_UP (_JUMP_BUF_SIGSET_NSIG, _JUMP_BUF_SIGSET_BITS_PER_WORD) \ > + / _JUMP_BUF_SIGSET_BITS_PER_WORD) OK. > > typedef struct > { > @@ -39,7 +98,9 @@ typedef union > struct > { > __jmp_buf_sigset_t __saved_mask; > - /* Used for shadow stack pointer. */ > + /* Used for shadow stack pointer. NB: Shadow stack pointer > + must have the same alignment as __saved_mask. Otherwise > + offset of __saved_mask will be changed. */ s/offset of __saved_mask/offset of __shadow_stack_pointer/g? OK with that correction. It is the __shadow_stack_pointer offset which will change if it doesn't have the same alignment as the __saved_mask. The compiler will insert padding between the fields. Perhaps assert there is no padding? > unsigned long int __shadow_stack_pointer; > } __saved; > } __jmpbuf_arch_t; > diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile > index 0d0326c21a..d25d6f0ae4 100644 > --- a/sysdeps/x86/Makefile > +++ b/sysdeps/x86/Makefile > @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features > tests += tst-get-cpu-features > tests-static += tst-get-cpu-features-static > endif > + > +ifeq ($(subdir),setjmp) > +sysdep_routines += __longjmp_cancel > +endif OK. > diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S > new file mode 100644 > index 0000000000..b57dbfa376 > --- /dev/null > +++ b/sysdeps/x86/__longjmp_cancel.S > @@ -0,0 +1,20 @@ > +/* __longjmp_cancel for x86. > + Copyright (C) 2018 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <http://www.gnu.org/licenses/>. */ > + > +#define __longjmp __longjmp_cancel > +#include <__longjmp.S> OK. > diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c > new file mode 100644 > index 0000000000..a53f31e1dd > --- /dev/null > +++ b/sysdeps/x86/longjmp.c > @@ -0,0 +1,45 @@ > +/* __libc_siglongjmp for x86. > + Copyright (C) 2018 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <http://www.gnu.org/licenses/>. */ > + > +#define __libc_longjmp __redirect___libc_longjmp > +#include <setjmp/longjmp.c> > +#undef __libc_longjmp > + > +extern void __longjmp_cancel (__jmp_buf __env, int __val) > + __attribute__ ((__noreturn__)) attribute_hidden; > + > +/* Since __libc_longjmp is a private interface for cancellation > + implementation in libpthread, there is no need to restore shadow > + stack register. */ > + > +void > +__libc_longjmp (sigjmp_buf env, int val) > +{ > + /* Perform any cleanups needed by the frames being unwound. */ > + _longjmp_unwind (env, val); > + > + if (env[0].__mask_was_saved) > + /* Restore the saved signal mask. */ > + (void) __sigprocmask (SIG_SETMASK, > + (sigset_t *) &env[0].__saved_mask, > + (sigset_t *) NULL); > + > + /* Call the machine-dependent function to restore machine state > + without shadow stack. */ > + __longjmp_cancel (env[0].__jmpbuf, val ?: 1); > +} OK. > diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c > new file mode 100644 > index 0000000000..6165c7d4a7 > --- /dev/null > +++ b/sysdeps/x86/nptl/pt-longjmp.c > @@ -0,0 +1,71 @@ > +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI. > + X86 version. > + Copyright (C) 18 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <http://www.gnu.org/licenses/>. */ > + > +#include <pthreadP.h> > +#include <jmp_buf-ssp.h> > + > +#ifdef __x86_64__ > +# define SHADOW_STACK_POINTER_SIZE 8 > +#else > +# define SHADOW_STACK_POINTER_SIZE 4 > +#endif > + > +/* Assert that the priv field in struct pthread_unwind_buf has space > + to store shadow stack pointer. */ > +_Static_assert ((offsetof (struct pthread_unwind_buf, priv) > + <= SHADOW_STACK_POINTER_OFFSET) > + && ((offsetof (struct pthread_unwind_buf, priv) > + + sizeof (((struct pthread_unwind_buf *) 0)->priv)) > + >= (SHADOW_STACK_POINTER_OFFSET > + + SHADOW_STACK_POINTER_SIZE)), > + "Shadow stack pointer is not within private storage " > + "of pthread_unwind_buf."); OK. Thanks for testing this out. > + > +#include <shlib-compat.h> > + > +/* libpthread once had its own longjmp (and siglongjmp alias), though there > + was no apparent reason for it. There is no use in having a separate > + symbol in libpthread, but the historical ABI requires it. For static > + linking, there is no need to provide anything here--the libc version > + will be linked in. For shared library ABI compatibility, there must be > + longjmp and siglongjmp symbols in libpthread.so. > + > + With an IFUNC resolver, it would be possible to avoid the indirection, > + but the IFUNC resolver might run before the __libc_longjmp symbol has > + been relocated, in which case the IFUNC resolver would not be able to > + provide the correct address. */ > + > +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22) > + > +static void __attribute__ ((noreturn, used)) > +longjmp_compat (jmp_buf env, int val) > +{ > + /* NB: We call __libc_siglongjmp, instead of __libc_longjmp, since > + __libc_longjmp is a private interface for cancellation which > + doesn't restore shadow stack register. */ > + __libc_siglongjmp (env, val); > +} > + > +strong_alias (longjmp_compat, longjmp_alias) > +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0); > + > +strong_alias (longjmp_alias, siglongjmp_alias) > +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0); > + > +#endif > Version 3 looks good to me with the comment correction I mentioned. Thank you for working with me to improve the quality of the code and comments, and for balancing the needs of the implementation against the distribution needs e.g. no added symbol for core CET enablement. Reviewed-by: Carlos O'Donell <carlos@redhat.com> -- Cheers, Carlos. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH/v3] x86: Use pad in pthread_unwind_buf to preserve shadow stack register 2018-05-02 4:43 ` Carlos O'Donell @ 2018-05-02 12:45 ` H.J. Lu 0 siblings, 0 replies; 13+ messages in thread From: H.J. Lu @ 2018-05-02 12:45 UTC (permalink / raw) To: Carlos O'Donell Cc: Florian Weimer, Joseph Myers, Tsimbalist, Igor V, GNU C Library On Tue, May 1, 2018 at 9:43 PM, Carlos O'Donell <carlos@redhat.com> wrote: > On 04/21/2018 02:37 PM, H.J. Lu wrote: >> On Fri, Apr 20, 2018 at 10:28:24PM -0500, Carlos O'Donell wrote: >>> Lets call this v2. Subject adjusted. Please keep incrementing the version >>> number on your patches to make the review easier by myself and others. >>> >> >> Done. >> >>>> + >>>> + /* No previous handlers. NB: This must be done after setjmp since >>>> + the private space in the unwind jump buffer may overlap space >>>> + used by setjmp to store extra architecture-specific information >>>> + which is never be used by the cancellation-specific >>> >>> s/be//g >> >> Done >> >>>> +/* Number of bits per long. */ >>>> +#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int)) >>>> +/* The biggest signal number. As of kernel 4.14, x86 _NSIG is 64. >>>> + Define it to 96 to leave some rooms for future use. */ >>> >>> Why 96? >>> >>> Sure on x86_64 the cancel jump buf has 4 x void*'s and each is 64-bits, >>> so you have 128 signals worth of space and then the shadow stack pointer. >>> >>> Does this work on x32? >>> >>> On x32 you have only 4 void*'s in the private pad. >>> >>> Your presently structured sigset_t looks like this: >>> >>> typedef union >>> { >>> __sigset_t __saved_mask_compat; >>> struct >>> { >>> __jmp_buf_sigset_t __saved_mask; >>> /* Used for shadow stack pointer. */ >>> unsigned long int __shadow_stack_pointer; >>> } __saved; >>> } __jmpbuf_arch_t; >>> >>> Which means you have a sigset_t *before* the __shadow_stack_pointer. >>> >>> On x32 to save a 64-bit shadow stack pointer, you'll only have 2 x 32-bit >>> words left? That's only space for 64 signals? >>> >>> Are you counting on one additional 32-bit word of padding between the >>> __mask_was_saved and the rest of the long arguments? >>> >>> If so, then this still needs spelling out in an a comment why 96 signals >>> works on both i686, x32, and x86_64. >>> >>> Also it should be explained if 96 is the *maximum* we can do with the current >>> layout, or if more is possible. In which case picking 96 is *not* arbitrary. >> >> The pad array in struct pthread_unwind_buf is used by setjmp to save >> shadow stack register. The usable space in __saved_mask for sigset >> and shadow stack pointer: >> 1. i386: The 4x4 byte pad array which can be used for 4 byte shadow >> stack pointer and maximum 12 byte sigset. >> 2. x32: 4 byte padding + the 4x4 byte pad array which can be used >> for 8 byte shadow stack pointer and maximum 12 byte sigset. >> 3. x86-64: The 4x8 byte pad array which can be used for 8 byte >> shadow stack pointer and maximum 24 byte sigset. >> >> Please see comments in sysdeps/unix/sysv/linux/x86/setjmpP.h for >> details. >> >>> Suggest: >>> ~~~ >>> The pad array in struct pthread_unwind_buf is used by setjmp to save >>> the shadow stack register. Assert that the size of struct pthread_unwind_buf >>> is no less than the offset of the shadow stack pointer plus the shadow stack >>> pointer size. >>> >>> NB: We use setjmp in thread cancellation and this saves the shadow stack >>> register, but __libc_unwind_longjmp doesn't restore the shadow stack register >>> since cancellation never returns after longjmp. >>> ~~~ >>> >> >> Done. >> >>> >>> This assertion is too loose. >>> >>> The assertion you need is that the shadow stack pointer itself falls within >>> the range of the private padding. This would have caught the previous bug >>> with the rounded up size. >>> >>> Please adjust the assertion to be as tight as possible or add new assertions. >>> >>> Completely untested, but just to show what I'm thinking: >>> >>> _Static_assert ((offsetof (struct pthread_unwind_buf, cancel_jump_buf[0].mask_was_saved) >>> + sizeof (int) < SHADOW_STACK_POINTER_OFFSET >>> && (offsetof (struct pthread_unwind_buf, priv.pad[4]) >>> > (SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE), >>> "Shadow stack pointer is not within private storage of pthread_unwind_buf."); >>> >> >> I used a slighly different assert. >> >>> >>> Looking forward to a v3. >>> >> >> Here it is. >> >> H.J. >> --- >> The pad array in struct pthread_unwind_buf is used by setjmp to save >> shadow stack register. We assert that size of struct pthread_unwind_buf >> is no less than offset of shadow stack pointer + shadow stack pointer >> size. >> >> Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as >> these with thread cancellation, call setjmp, but never return after >> __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as >> __libc_longjmp on x86, doesn't need to restore shadow stack register. >> __libc_longjmp, which is a private interface for thread cancellation >> implementation in libpthread, is changed to call __longjmp_cancel, >> instead of __longjmp. __longjmp_cancel is a new internal function >> in libc, which is similar to __longjmp, but doesn't restore shadow >> stack register. >> >> The compatibility longjmp and siglongjmp in libpthread.so are changed >> to call __libc_siglongjmp, instead of __libc_longjmp, so that they will >> restore shadow stack register. >> >> Tested with build-many-glibcs.py. >> >> * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous >> handlers after setjmp. >> * setjmp/longjmp.c (__libc_longjmp): Don't define alias if >> defined. >> * sysdeps/unix/sysv/linux/x86/setjmpP.h: Include >> <libc-pointer-arith.h>. >> (_JUMP_BUF_SIGSET_BITS_PER_WORD): New. >> (_JUMP_BUF_SIGSET_NSIG): Changed to 96. >> (_JUMP_BUF_SIGSET_NWORDS): Changed to use ALIGN_UP and >> _JUMP_BUF_SIGSET_BITS_PER_WORD. >> * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel. >> * sysdeps/x86/__longjmp_cancel.S: New file. >> * sysdeps/x86/longjmp.c: Likewise. >> * sysdeps/x86/nptl/pt-longjmp.c: Likewise. >> --- >> nptl/pthread_create.c | 17 +++++++-- >> setjmp/longjmp.c | 2 + >> sysdeps/unix/sysv/linux/x86/setjmpP.h | 71 ++++++++++++++++++++++++++++++++--- >> sysdeps/x86/Makefile | 4 ++ >> sysdeps/x86/__longjmp_cancel.S | 20 ++++++++++ >> sysdeps/x86/longjmp.c | 45 ++++++++++++++++++++++ >> sysdeps/x86/nptl/pt-longjmp.c | 71 +++++++++++++++++++++++++++++++++++ >> 7 files changed, 222 insertions(+), 8 deletions(-) >> create mode 100644 sysdeps/x86/__longjmp_cancel.S >> create mode 100644 sysdeps/x86/longjmp.c >> create mode 100644 sysdeps/x86/nptl/pt-longjmp.c >> >> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c >> index caaf07c134..92c945b12b 100644 >> --- a/nptl/pthread_create.c >> +++ b/nptl/pthread_create.c >> @@ -427,12 +427,23 @@ START_THREAD_DEFN >> compilers without that support we do use setjmp. */ >> struct pthread_unwind_buf unwind_buf; >> >> - /* No previous handlers. */ >> + int not_first_call; >> + not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); >> + >> + /* No previous handlers. NB: This must be done after setjmp since the >> + private space in the unwind jump buffer may overlap space used by >> + setjmp to store extra architecture-specific information which is >> + never used by the cancellation-specific __libc_unwind_longjmp. >> + >> + The private space is allowed to overlap because the unwinder never >> + has to return through any of the jumped-to call frames, and thus >> + only a minimum amount of saved data need be stored, and for example, >> + need not include the process signal mask information. This is all >> + an optimization to reduce stack usage when pushing cancellation >> + handlers. */ > > OK. > >> unwind_buf.priv.data.prev = NULL; >> unwind_buf.priv.data.cleanup = NULL; >> >> - int not_first_call; >> - not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf); >> if (__glibc_likely (! not_first_call)) >> { >> /* Store the new cleanup handler info. */ >> diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c >> index a2a7065a85..453889e103 100644 >> --- a/setjmp/longjmp.c >> +++ b/setjmp/longjmp.c >> @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val) >> } >> >> #ifndef __libc_siglongjmp >> +# ifndef __libc_longjmp >> /* __libc_longjmp is a private interface for cancellation implementation >> in libpthread. */ >> strong_alias (__libc_siglongjmp, __libc_longjmp) >> +# endif > > OK. > >> weak_alias (__libc_siglongjmp, _longjmp) >> weak_alias (__libc_siglongjmp, longjmp) >> weak_alias (__libc_siglongjmp, siglongjmp) >> diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h >> index c0ed767a0d..6b2608453d 100644 >> --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h >> +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h >> @@ -20,13 +20,72 @@ >> #define _SETJMPP_H 1 >> >> #include <bits/types/__sigset_t.h> >> +#include <libc-pointer-arith.h> >> >> -/* The biggest signal number + 1. As of kernel 4.14, x86 _NSIG is 64. >> - Define it to 513 to leave some rooms for future use. */ >> -#define _JUMP_BUF_SIGSET_NSIG 513 >> +/* <setjmp/setjmp.h> has >> + >> +struct __jmp_buf_tag >> + { >> + __jmp_buf __jmpbuf; >> + int __mask_was_saved; >> + __sigset_t __saved_mask; >> + }; >> + >> + struct __jmp_buf_tag is 32 bits aligned on i386 and is 64 bits >> + aligned on x32 and x86-64. __saved_mask is aligned to 32 bits >> + on i386/x32 without padding and is aligned to 64 bits on x86-64 >> + with 32 bit padding. >> + >> + and <nptl/descr.h> has >> + >> +struct pthread_unwind_buf >> +{ >> + struct >> + { >> + __jmp_buf jmp_buf; >> + int mask_was_saved; >> + } cancel_jmp_buf[1]; >> + >> + union >> + { >> + void *pad[4]; >> + struct >> + { >> + struct pthread_unwind_buf *prev; >> + struct _pthread_cleanup_buffer *cleanup; >> + int canceltype; >> + } data; >> + } priv; >> +}; >> + >> + struct pthread_unwind_buf is 32 bits aligned on i386 and 64 bits >> + aligned on x32/x86-64. cancel_jmp_buf is aligned to 32 bits on >> + i386 and is aligned to 64 bits on x32/x86-64. > > OK. > >> + >> + The pad array in struct pthread_unwind_buf is used by setjmp to save >> + shadow stack register. The usable space in __saved_mask for sigset >> + and shadow stack pointer: >> + 1. i386: The 4x4 byte pad array which can be used for 4 byte shadow >> + stack pointer and maximum 12 byte sigset. >> + 2. x32: 4 byte padding + the 4x4 byte pad array which can be used >> + for 8 byte shadow stack pointer and maximum 12 byte sigset. >> + 3. x86-64: The 4x8 byte pad array which can be used for 8 byte >> + shadow stack pointer and maximum 24 byte sigset. > > OK. > >> + >> + NB: We use setjmp in thread cancellation and this saves the shadow >> + stack register, but __libc_unwind_longjmp doesn't restore the shadow >> + stack register since cancellation never returns after longjmp. */ >> + >> +/* Number of bits per long. */ >> +#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int)) >> +/* The biggest signal number. As of kernel 4.14, x86 _NSIG is 64. The >> + common maximum sigset for i386, x32 and x86-64 is 12 bytes (96 bits). >> + Define it to 96 to leave some rooms for future use. */ >> +#define _JUMP_BUF_SIGSET_NSIG 96 >> /* Number of longs to hold all signals. */ >> #define _JUMP_BUF_SIGSET_NWORDS \ >> - ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int))) >> + (ALIGN_UP (_JUMP_BUF_SIGSET_NSIG, _JUMP_BUF_SIGSET_BITS_PER_WORD) \ >> + / _JUMP_BUF_SIGSET_BITS_PER_WORD) > > OK. > >> >> typedef struct >> { >> @@ -39,7 +98,9 @@ typedef union >> struct >> { >> __jmp_buf_sigset_t __saved_mask; >> - /* Used for shadow stack pointer. */ >> + /* Used for shadow stack pointer. NB: Shadow stack pointer >> + must have the same alignment as __saved_mask. Otherwise >> + offset of __saved_mask will be changed. */ > > s/offset of __saved_mask/offset of __shadow_stack_pointer/g? __saved_mask here is in the public header: /* Calling environment, plus possibly a saved signal mask. */ struct __jmp_buf_tag { /* NOTE: The machine-dependent definitions of `__sigsetjmp' assume that a `jmp_buf' begins with a `__jmp_buf' and that `__mask_was_saved' follows it. Do not move these members or add others before it. */ __jmp_buf __jmpbuf; /* Calling environment. */ int __mask_was_saved; /* Saved the signal mask? */ __sigset_t __saved_mask; /* Saved signal mask. */ }; and sysdeps/unix/sysv/linux/x86/setjmpP.h has #undef __sigset_t #define __sigset_t __jmpbuf_arch_t #include <setjmp.h> #undef __saved_mask #define __saved_mask __saved_mask.__saved.__saved_mask The offset of __saved_mask must be unchanged, which is checked by assert in include/setjmp.h: /* Check if internal fields in jmp_buf have the expected offsets. */ TEST_OFFSET (struct __jmp_buf_tag, __mask_was_saved, MASK_WAS_SAVED_OFFSET); TEST_OFFSET (struct __jmp_buf_tag, __saved_mask, SAVED_MASK_OFFSET); > OK with that correction. > > It is the __shadow_stack_pointer offset which will change if it doesn't > have the same alignment as the __saved_mask. The compiler will insert > padding between the fields. > > Perhaps assert there is no padding? I don't think we need it since there is /* Assert that the priv field in struct pthread_unwind_buf has space to store shadow stack pointer. */ _Static_assert ((offsetof (struct pthread_unwind_buf, priv) <= SHADOW_STACK_POINTER_OFFSET) && ((offsetof (struct pthread_unwind_buf, priv) + sizeof (((struct pthread_unwind_buf *) 0)->priv)) >= (SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE)), "Shadow stack pointer is not within private storage " "of pthread_unwind_buf."); to ensure that offset and size of the priv field are sufficient. >> unsigned long int __shadow_stack_pointer; >> } __saved; >> } __jmpbuf_arch_t; >> diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile >> index 0d0326c21a..d25d6f0ae4 100644 >> --- a/sysdeps/x86/Makefile >> +++ b/sysdeps/x86/Makefile >> @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features >> tests += tst-get-cpu-features >> tests-static += tst-get-cpu-features-static >> endif >> + >> +ifeq ($(subdir),setjmp) >> +sysdep_routines += __longjmp_cancel >> +endif > > OK. > >> diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S >> new file mode 100644 >> index 0000000000..b57dbfa376 >> --- /dev/null >> +++ b/sysdeps/x86/__longjmp_cancel.S >> @@ -0,0 +1,20 @@ >> +/* __longjmp_cancel for x86. >> + Copyright (C) 2018 Free Software Foundation, Inc. >> + This file is part of the GNU C Library. >> + >> + The GNU C Library is free software; you can redistribute it and/or >> + modify it under the terms of the GNU Lesser General Public >> + License as published by the Free Software Foundation; either >> + version 2.1 of the License, or (at your option) any later version. >> + >> + The GNU C Library is distributed in the hope that it will be useful, >> + but WITHOUT ANY WARRANTY; without even the implied warranty of >> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >> + Lesser General Public License for more details. >> + >> + You should have received a copy of the GNU Lesser General Public >> + License along with the GNU C Library; if not, see >> + <http://www.gnu.org/licenses/>. */ >> + >> +#define __longjmp __longjmp_cancel >> +#include <__longjmp.S> > > OK. > >> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c >> new file mode 100644 >> index 0000000000..a53f31e1dd >> --- /dev/null >> +++ b/sysdeps/x86/longjmp.c >> @@ -0,0 +1,45 @@ >> +/* __libc_siglongjmp for x86. >> + Copyright (C) 2018 Free Software Foundation, Inc. >> + This file is part of the GNU C Library. >> + >> + The GNU C Library is free software; you can redistribute it and/or >> + modify it under the terms of the GNU Lesser General Public >> + License as published by the Free Software Foundation; either >> + version 2.1 of the License, or (at your option) any later version. >> + >> + The GNU C Library is distributed in the hope that it will be useful, >> + but WITHOUT ANY WARRANTY; without even the implied warranty of >> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >> + Lesser General Public License for more details. >> + >> + You should have received a copy of the GNU Lesser General Public >> + License along with the GNU C Library; if not, see >> + <http://www.gnu.org/licenses/>. */ >> + >> +#define __libc_longjmp __redirect___libc_longjmp >> +#include <setjmp/longjmp.c> >> +#undef __libc_longjmp >> + >> +extern void __longjmp_cancel (__jmp_buf __env, int __val) >> + __attribute__ ((__noreturn__)) attribute_hidden; >> + >> +/* Since __libc_longjmp is a private interface for cancellation >> + implementation in libpthread, there is no need to restore shadow >> + stack register. */ >> + >> +void >> +__libc_longjmp (sigjmp_buf env, int val) >> +{ >> + /* Perform any cleanups needed by the frames being unwound. */ >> + _longjmp_unwind (env, val); >> + >> + if (env[0].__mask_was_saved) >> + /* Restore the saved signal mask. */ >> + (void) __sigprocmask (SIG_SETMASK, >> + (sigset_t *) &env[0].__saved_mask, >> + (sigset_t *) NULL); >> + >> + /* Call the machine-dependent function to restore machine state >> + without shadow stack. */ >> + __longjmp_cancel (env[0].__jmpbuf, val ?: 1); >> +} > > OK. > >> diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c >> new file mode 100644 >> index 0000000000..6165c7d4a7 >> --- /dev/null >> +++ b/sysdeps/x86/nptl/pt-longjmp.c >> @@ -0,0 +1,71 @@ >> +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI. >> + X86 version. >> + Copyright (C) 18 Free Software Foundation, Inc. >> + This file is part of the GNU C Library. >> + >> + The GNU C Library is free software; you can redistribute it and/or >> + modify it under the terms of the GNU Lesser General Public >> + License as published by the Free Software Foundation; either >> + version 2.1 of the License, or (at your option) any later version. >> + >> + The GNU C Library is distributed in the hope that it will be useful, >> + but WITHOUT ANY WARRANTY; without even the implied warranty of >> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >> + Lesser General Public License for more details. >> + >> + You should have received a copy of the GNU Lesser General Public >> + License along with the GNU C Library; if not, see >> + <http://www.gnu.org/licenses/>. */ >> + >> +#include <pthreadP.h> >> +#include <jmp_buf-ssp.h> >> + >> +#ifdef __x86_64__ >> +# define SHADOW_STACK_POINTER_SIZE 8 >> +#else >> +# define SHADOW_STACK_POINTER_SIZE 4 >> +#endif >> + >> +/* Assert that the priv field in struct pthread_unwind_buf has space >> + to store shadow stack pointer. */ >> +_Static_assert ((offsetof (struct pthread_unwind_buf, priv) >> + <= SHADOW_STACK_POINTER_OFFSET) >> + && ((offsetof (struct pthread_unwind_buf, priv) >> + + sizeof (((struct pthread_unwind_buf *) 0)->priv)) >> + >= (SHADOW_STACK_POINTER_OFFSET >> + + SHADOW_STACK_POINTER_SIZE)), >> + "Shadow stack pointer is not within private storage " >> + "of pthread_unwind_buf."); > > OK. Thanks for testing this out. > >> + >> +#include <shlib-compat.h> >> + >> +/* libpthread once had its own longjmp (and siglongjmp alias), though there >> + was no apparent reason for it. There is no use in having a separate >> + symbol in libpthread, but the historical ABI requires it. For static >> + linking, there is no need to provide anything here--the libc version >> + will be linked in. For shared library ABI compatibility, there must be >> + longjmp and siglongjmp symbols in libpthread.so. >> + >> + With an IFUNC resolver, it would be possible to avoid the indirection, >> + but the IFUNC resolver might run before the __libc_longjmp symbol has >> + been relocated, in which case the IFUNC resolver would not be able to >> + provide the correct address. */ >> + >> +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22) >> + >> +static void __attribute__ ((noreturn, used)) >> +longjmp_compat (jmp_buf env, int val) >> +{ >> + /* NB: We call __libc_siglongjmp, instead of __libc_longjmp, since >> + __libc_longjmp is a private interface for cancellation which >> + doesn't restore shadow stack register. */ >> + __libc_siglongjmp (env, val); >> +} >> + >> +strong_alias (longjmp_compat, longjmp_alias) >> +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0); >> + >> +strong_alias (longjmp_alias, siglongjmp_alias) >> +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0); >> + >> +#endif >> > > Version 3 looks good to me with the comment correction I mentioned. > > Thank you for working with me to improve the quality of the code > and comments, and for balancing the needs of the implementation > against the distribution needs e.g. no added symbol for core CET > enablement. Thank you for your valuable feedbacks. I am checking in my patch now. > Reviewed-by: Carlos O'Donell <carlos@redhat.com> > > -- > Cheers, > Carlos. -- H.J. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack register 2018-04-06 12:59 ` H.J. Lu 2018-04-06 20:26 ` H.J. Lu 2018-04-12 21:36 ` Carlos O'Donell @ 2018-04-17 20:03 ` Joseph Myers 2018-04-17 23:20 ` H.J. Lu 2 siblings, 1 reply; 13+ messages in thread From: Joseph Myers @ 2018-04-17 20:03 UTC (permalink / raw) To: H.J. Lu Cc: Carlos O'Donell, Florian Weimer, Tsimbalist, Igor V, GNU C Library On Fri, 6 Apr 2018, H.J. Lu wrote: > https://github.com/hjl-tools/glibc/commit/9bf6aefa8fb45f8df140d42ce9cf890bb24076e1 > > It should be binary backward compatible. I will investigate if there is a way Increasing the size of a public type is always dangerous, because you can end up with one part of a program expecting the new, larger size but another part only allocating the old, smaller size. It might in some cases be compatible to the extent that existing linked programs and shared libraries work with new glibc, if new glibc will never try to write into the unallocated part of such objects allocated by an existing linked program or shared library. However, any such change would need a careful analysis of how the type gets written to, and to what extent external libraries have interfaces that depend on the size of the type, and would need a NEWS entry explaining the change and discussing the compatibility issues with it. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack register 2018-04-17 20:03 ` [PATCH] " Joseph Myers @ 2018-04-17 23:20 ` H.J. Lu 0 siblings, 0 replies; 13+ messages in thread From: H.J. Lu @ 2018-04-17 23:20 UTC (permalink / raw) To: Joseph Myers Cc: Carlos O'Donell, Florian Weimer, Tsimbalist, Igor V, GNU C Library On Tue, Apr 17, 2018 at 1:02 PM, Joseph Myers <joseph@codesourcery.com> wrote: > On Fri, 6 Apr 2018, H.J. Lu wrote: > >> https://github.com/hjl-tools/glibc/commit/9bf6aefa8fb45f8df140d42ce9cf890bb24076e1 >> >> It should be binary backward compatible. I will investigate if there is a way > > Increasing the size of a public type is always dangerous, because you can > end up with one part of a program expecting the new, larger size but > another part only allocating the old, smaller size. That is true. The allocated ucontext size must be no less than the size expected by ucontext consumer. > It might in some cases be compatible to the extent that existing linked > programs and shared libraries work with new glibc, if new glibc will never This is done by checking CET properties. Both linker and dynamic linker clear CET property bits if any module doesn't have CET bits set. Glibc should access new extended fields only if CET bits are set, which means the new ucontext is used in all .o files. That is why I want to extend ucontext before we have found a solution so that if an object file has CET bits set, it must use the new ucontext. > try to write into the unallocated part of such objects allocated by an > existing linked program or shared library. However, any such change would > need a careful analysis of how the type gets written to, and to what > extent external libraries have interfaces that depend on the size of the > type, and would need a NEWS entry explaining the change and discussing the > compatibility issues with it. > Agreed, -- H.J. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2018-05-02 12:45 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-03-30 17:41 [PATCH] x86: Use pad in pthread_unwind_buf to preserve shadow stack register H.J. Lu 2018-04-06 4:46 ` Carlos O'Donell 2018-04-06 12:59 ` H.J. Lu 2018-04-06 20:26 ` H.J. Lu 2018-04-12 21:36 ` Carlos O'Donell 2018-04-12 21:36 ` Carlos O'Donell 2018-04-12 23:50 ` H.J. Lu 2018-04-21 3:28 ` [PATCHv2] " Carlos O'Donell 2018-04-21 18:37 ` [PATCH/v3] " H.J. Lu 2018-05-02 4:43 ` Carlos O'Donell 2018-05-02 12:45 ` H.J. Lu 2018-04-17 20:03 ` [PATCH] " Joseph Myers 2018-04-17 23:20 ` H.J. Lu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).