Message-ID: <921551c8-0cd3-5fcc-30a4-e0709485e0f1@gotplt.org>
Date: Wed, 8 Dec 2021 23:33:44 +0530
Subject: Re: [PATCH 4/8] nptl: Add rseq registration
To: Florian Weimer, libc-alpha@sourceware.org
From: Siddhesh Poyarekar

On 12/7/21 18:31, Florian Weimer via Libc-alpha wrote:
> The rseq area is placed directly into struct pthread.  rseq
> registration failure is not treated as an error, so it is possible
> that threads run with inconsistent registration status.
>
> <sys/rseq.h> is not yet installed as a public header.
>
> Co-Authored-By: Mathieu Desnoyers
> ---
> v2: Use volatile access to cpu_id.  Drop csu/libc-tls.c spurious change.
>
>  nptl/descr.h                                |   4 +
>  nptl/pthread_create.c                       |  13 +
>  sysdeps/nptl/dl-tls_init_tp.c               |   8 +-
>  sysdeps/unix/sysv/linux/Makefile            |   9 +-
>  sysdeps/unix/sysv/linux/aarch64/bits/rseq.h |  43 ++++
>  sysdeps/unix/sysv/linux/arm/bits/rseq.h     |  83 +++++++
>  sysdeps/unix/sysv/linux/bits/rseq.h         |  29 +++
>  sysdeps/unix/sysv/linux/mips/bits/rseq.h    |  62 +++++
>  sysdeps/unix/sysv/linux/powerpc/bits/rseq.h |  37 +++
>  sysdeps/unix/sysv/linux/rseq-internal.h     |  45 ++++
>  sysdeps/unix/sysv/linux/s390/bits/rseq.h    |  37 +++
>  sysdeps/unix/sysv/linux/sys/rseq.h          | 174 +++++++++++++
>  sysdeps/unix/sysv/linux/tst-rseq-nptl.c     | 260 ++++++++++++++++++++
>  sysdeps/unix/sysv/linux/tst-rseq.c          |  64 +++++
>  sysdeps/unix/sysv/linux/tst-rseq.h          |  57 +++++
>  sysdeps/unix/sysv/linux/x86/bits/rseq.h     |  30 +++
>  16 files changed, 952 insertions(+), 3 deletions(-)
>  create mode 100644 sysdeps/unix/sysv/linux/aarch64/bits/rseq.h
>  create mode 100644 sysdeps/unix/sysv/linux/arm/bits/rseq.h
>  create mode 100644 sysdeps/unix/sysv/linux/bits/rseq.h
>  create mode 100644 sysdeps/unix/sysv/linux/mips/bits/rseq.h
>  create mode 100644 sysdeps/unix/sysv/linux/powerpc/bits/rseq.h
>  create mode 100644 sysdeps/unix/sysv/linux/rseq-internal.h
>  create mode 100644 sysdeps/unix/sysv/linux/s390/bits/rseq.h
>  create mode 100644 sysdeps/unix/sysv/linux/sys/rseq.h
>  create mode 100644 sysdeps/unix/sysv/linux/tst-rseq-nptl.c
>  create mode 100644 sysdeps/unix/sysv/linux/tst-rseq.c
>  create mode 100644 sysdeps/unix/sysv/linux/tst-rseq.h
>  create mode 100644 sysdeps/unix/sysv/linux/x86/bits/rseq.h
>
> diff --git a/nptl/descr.h b/nptl/descr.h
> index af2a6ab87a..92db305913 100644
> --- a/nptl/descr.h
> +++ b/nptl/descr.h
> @@ -34,6 +34,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #ifndef TCB_ALIGNMENT
>  # define TCB_ALIGNMENT 32
> @@ -406,6 +407,9 @@ struct pthread
>    /* Used on strsignal.  */
>    struct tls_internal_t tls_state;
>
> +  /* rseq area registered with the kernel.  */
> +  struct rseq rseq_area;
> +
>    /* This member must be last.  */
>    char end_padding[];
>
> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
> index bad9eeb52f..ea0d79341e 100644
> --- a/nptl/pthread_create.c
> +++ b/nptl/pthread_create.c
> @@ -32,6 +32,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include "libioP.h"
>  #include
>  #include
> @@ -366,6 +367,9 @@ start_thread (void *arg)
>    /* Initialize pointers to locale data.  */
>    __ctype_init ();
>
> +  /* Register rseq TLS to the kernel.  */
> +  rseq_register_current_thread (pd);
> +
>  #ifndef __ASSUME_SET_ROBUST_LIST
>    if (__nptl_set_robust_list_avail)
>  #endif
> @@ -571,6 +575,15 @@ out:
>       process is really dead since 'clone' got passed the CLONE_CHILD_CLEARTID
>       flag.  The 'tid' field in the TCB will be set to zero.
>
> +     rseq TLS is still registered at this point.  Rely on implicit
> +     unregistration performed by the kernel on thread teardown.  This is not a
> +     problem because the rseq TLS lives on the stack, and the stack outlives
> +     the thread.  If TCB allocation is ever changed, additional steps may be
> +     required, such as performing explicit rseq unregistration before
> +     reclaiming the rseq TLS area memory.  It is NOT sufficient to block
> +     signals because the kernel may write to the rseq area even without
> +     signals.
> +
>       The exit code is zero since in case all threads exit by calling
>       'pthread_exit' the exit status must be 0 (zero).  */
>    while (1)
> diff --git a/sysdeps/nptl/dl-tls_init_tp.c b/sysdeps/nptl/dl-tls_init_tp.c
> index ca494dd3a5..fedb876fdb 100644
> --- a/sysdeps/nptl/dl-tls_init_tp.c
> +++ b/sysdeps/nptl/dl-tls_init_tp.c
> @@ -21,6 +21,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #ifndef __ASSUME_SET_ROBUST_LIST
>  bool __nptl_set_robust_list_avail;
> @@ -57,11 +58,12 @@ __tls_pre_init_tp (void)
>  void
>  __tls_init_tp (void)
>  {
> +  struct pthread *pd = THREAD_SELF;
> +
>    /* Set up thread stack list management.  */
> -  list_add (&THREAD_SELF->list, &GL (dl_stack_user));
> +  list_add (&pd->list, &GL (dl_stack_user));
>
>    /* Early initialization of the TCB.  */
> -  struct pthread *pd = THREAD_SELF;
>    pd->tid = INTERNAL_SYSCALL_CALL (set_tid_address, &pd->tid);
>    THREAD_SETMEM (pd, specific[0], &pd->specific_1stblock[0]);
>    THREAD_SETMEM (pd, user_stack, true);
> @@ -90,6 +92,8 @@ __tls_init_tp (void)
>        }
>    }
>
> +  rseq_register_current_thread (pd);
> +
>    /* Set initial thread's stack block from 0 up to __libc_stack_end.
>       It will be bigger than it actually is, but for unwind.c/pt-longjmp.c
>       purposes this is good enough.  */
> diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
> index 29c6c78f98..eb0f5fc021 100644
> --- a/sysdeps/unix/sysv/linux/Makefile
> +++ b/sysdeps/unix/sysv/linux/Makefile
> @@ -131,7 +131,10 @@ ifeq ($(have-GLIBC_2.27)$(build-shared),yesyes)
>  tests += tst-ofdlocks-compat
>  endif
>
> -tests-internal += tst-sigcontext-get_pc
> +tests-internal += \
> +  tst-rseq \
> +  tst-sigcontext-get_pc \
> +  # tests-internal
>
>  tests-time64 += \
>    tst-adjtimex-time64 \
> @@ -357,4 +360,8 @@ endif
>
>  ifeq ($(subdir),nptl)
>  tests += tst-align-clone tst-getpid1
> +
> +# tst-rseq-nptl is an internal test because it requires a definition of
> +# __NR_rseq from the internal system call list.
> +tests-internal += tst-rseq-nptl
>  endif
> diff --git a/sysdeps/unix/sysv/linux/aarch64/bits/rseq.h b/sysdeps/unix/sysv/linux/aarch64/bits/rseq.h
> new file mode 100644
> index 0000000000..9ba92725c7
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/aarch64/bits/rseq.h
> @@ -0,0 +1,43 @@
> +/* Restartable Sequences Linux aarch64 architecture header.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _SYS_RSEQ_H
> +# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
> +#endif
> +
> +/* RSEQ_SIG is a signature required before each abort handler code.
> +
> +   It is a 32-bit value that maps to actual architecture code compiled
> +   into applications and libraries.  It needs to be defined for each
> +   architecture.  When choosing this value, it needs to be taken into
> +   account that generating invalid instructions may have ill effects on
> +   tools like objdump, and may also have impact on the CPU speculative
> +   execution efficiency in some cases.
> +
> +   aarch64 -mbig-endian generates mixed endianness code vs data:
> +   little-endian code and big-endian data.  Ensure the RSEQ_SIG signature
> +   matches code endianness.  */
> +
> +#define RSEQ_SIG_CODE 0xd428bc00 /* BRK #0x45E0.  */
> +
> +#ifdef __AARCH64EB__
> +# define RSEQ_SIG_DATA 0x00bc28d4 /* BRK #0x45E0.  */
> +#else
> +# define RSEQ_SIG_DATA RSEQ_SIG_CODE
> +#endif
> +
> +#define RSEQ_SIG RSEQ_SIG_DATA
> diff --git a/sysdeps/unix/sysv/linux/arm/bits/rseq.h b/sysdeps/unix/sysv/linux/arm/bits/rseq.h
> new file mode 100644
> index 0000000000..0542b26f6a
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/arm/bits/rseq.h
> @@ -0,0 +1,83 @@
> +/* Restartable Sequences Linux arm architecture header.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _SYS_RSEQ_H
> +# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
> +#endif
> +
> +/*
> +   RSEQ_SIG is a signature required before each abort handler code.
> +
> +   It is a 32-bit value that maps to actual architecture code compiled
> +   into applications and libraries.  It needs to be defined for each
> +   architecture.  When choosing this value, it needs to be taken into
> +   account that generating invalid instructions may have ill effects on
> +   tools like objdump, and may also have impact on the CPU speculative
> +   execution efficiency in some cases.
> +
> +   - ARM little endian
> +
> +   RSEQ_SIG uses the udf A32 instruction with an uncommon immediate operand
> +   value 0x5de3.  This traps if user-space reaches this instruction by mistake,
> +   and the uncommon operand ensures the kernel does not move the instruction
> +   pointer to attacker-controlled code on rseq abort.
> +
> +   The instruction pattern in the A32 instruction set is:
> +
> +   e7f5def3    udf    #24035    ; 0x5de3
> +
> +   This translates to the following instruction pattern in the T16 instruction
> +   set:
> +
> +   little endian:
> +   def3        udf    #243      ; 0xf3
> +   e7f5        b.n    <7f5>
> +
> +   - ARMv6+ big endian (BE8):
> +
> +   ARMv6+ -mbig-endian generates mixed endianness code vs data: little-endian
> +   code and big-endian data.  The data value of the signature needs to have its
> +   byte order reversed to generate the trap instruction:
> +
> +   Data: 0xf3def5e7
> +
> +   Translates to this A32 instruction pattern:
> +
> +   e7f5def3    udf    #24035    ; 0x5de3
> +
> +   Translates to this T16 instruction pattern:
> +
> +   def3        udf    #243      ; 0xf3
> +   e7f5        b.n    <7f5>
> +
> +   - Prior to ARMv6 big endian (BE32):
> +
> +   Prior to ARMv6, -mbig-endian generates big-endian code and data
> +   (which match), so the endianness of the data representation of the
> +   signature should not be reversed.  However, the choice between BE32
> +   and BE8 is done by the linker, so we cannot know whether code and
> +   data endianness will be mixed before the linker is invoked.  So rather
> +   than try to play tricks with the linker, the rseq signature is simply
> +   data (not a trap instruction) prior to ARMv6 on big endian.  This is
> +   why the signature is expressed as data (.word) rather than as
> +   instruction (.inst) in assembler.  */
> +
> +#ifdef __ARMEB__
> +# define RSEQ_SIG 0xf3def5e7 /* udf #24035 ; 0x5de3 (ARMv6+) */
> +#else
> +# define RSEQ_SIG 0xe7f5def3 /* udf #24035 ; 0x5de3 */
> +#endif
> diff --git a/sysdeps/unix/sysv/linux/bits/rseq.h b/sysdeps/unix/sysv/linux/bits/rseq.h
> new file mode 100644
> index 0000000000..46cf5d1c74
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/bits/rseq.h
> @@ -0,0 +1,29 @@
> +/* Restartable Sequences architecture header.  Stub version.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _SYS_RSEQ_H
> +# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
> +#endif
> +
> +/* RSEQ_SIG is a signature required before each abort handler code.
> +
> +   It is a 32-bit value that maps to actual architecture code compiled
> +   into applications and libraries.  It needs to be defined for each
> +   architecture.  When choosing this value, it needs to be taken into
> +   account that generating invalid instructions may have ill effects on
> +   tools like objdump, and may also have impact on the CPU speculative
> +   execution efficiency in some cases.  */
> diff --git a/sysdeps/unix/sysv/linux/mips/bits/rseq.h b/sysdeps/unix/sysv/linux/mips/bits/rseq.h
> new file mode 100644
> index 0000000000..a9defee568
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/mips/bits/rseq.h
> @@ -0,0 +1,62 @@
> +/* Restartable Sequences Linux mips architecture header.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _SYS_RSEQ_H
> +# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
> +#endif
> +
> +/* RSEQ_SIG is a signature required before each abort handler code.
> +
> +   It is a 32-bit value that maps to actual architecture code compiled
> +   into applications and libraries.  It needs to be defined for each
> +   architecture.  When choosing this value, it needs to be taken into
> +   account that generating invalid instructions may have ill effects on
> +   tools like objdump, and may also have impact on the CPU speculative
> +   execution efficiency in some cases.
> +
> +   RSEQ_SIG uses the break instruction.  The instruction pattern is:
> +
> +   On MIPS:
> +     0350000d        break     0x350
> +
> +   On nanoMIPS:
> +     00100350        break     0x350
> +
> +   On microMIPS:
> +     0000d407        break     0x350
> +
> +   For nanoMIPS32 and microMIPS, the instruction stream is encoded as
> +   16-bit halfwords, so the signature halfwords need to be swapped
> +   accordingly for little-endian.  */
> +
> +#if defined (__nanomips__)
> +# ifdef __MIPSEL__
> +#  define RSEQ_SIG 0x03500010
> +# else
> +#  define RSEQ_SIG 0x00100350
> +# endif
> +#elif defined (__mips_micromips)
> +# ifdef __MIPSEL__
> +#  define RSEQ_SIG 0xd4070000
> +# else
> +#  define RSEQ_SIG 0x0000d407
> +# endif
> +#elif defined (__mips__)
> +# define RSEQ_SIG 0x0350000d
> +#else
> +/* Unknown MIPS architecture.  */
> +#endif
> diff --git a/sysdeps/unix/sysv/linux/powerpc/bits/rseq.h b/sysdeps/unix/sysv/linux/powerpc/bits/rseq.h
> new file mode 100644
> index 0000000000..05b3cf7b8f
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/powerpc/bits/rseq.h
> @@ -0,0 +1,37 @@
> +/* Restartable Sequences Linux powerpc architecture header.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _SYS_RSEQ_H
> +# error "Never use <bits/rseq.h> directly; include <sys/rseq.h> instead."
> +#endif
> +
> +/* RSEQ_SIG is a signature required before each abort handler code.
> +
> +   It is a 32-bit value that maps to actual architecture code compiled
> +   into applications and libraries.  It needs to be defined for each
> +   architecture.  When choosing this value, it needs to be taken into
> +   account that generating invalid instructions may have ill effects on
> +   tools like objdump, and may also have impact on the CPU speculative
> +   execution efficiency in some cases.
> +
> +   RSEQ_SIG uses the following trap instruction:
> +
> +   powerpc-be:    0f e5 00 0b           twui   r5,11
> +   powerpc64-le:  0b 00 e5 0f           twui   r5,11
> +   powerpc64-be:  0f e5 00 0b           twui   r5,11  */
> +
> +#define RSEQ_SIG 0x0fe5000b
> diff --git a/sysdeps/unix/sysv/linux/rseq-internal.h b/sysdeps/unix/sysv/linux/rseq-internal.h
> new file mode 100644
> index 0000000000..909f547825
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/rseq-internal.h
> @@ -0,0 +1,45 @@
> +/* Restartable Sequences internal API.  Linux implementation.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef RSEQ_INTERNAL_H
> +#define RSEQ_INTERNAL_H
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#ifdef RSEQ_SIG
> +static inline void
> +rseq_register_current_thread (struct pthread *self)
> +{
> +  int ret = INTERNAL_SYSCALL_CALL (rseq,
> +                                   &self->rseq_area, sizeof (self->rseq_area),
> +                                   0, RSEQ_SIG);
> +  if (INTERNAL_SYSCALL_ERROR_P (ret))
> +    THREAD_SETMEM (self, rseq_area.cpu_id, RSEQ_CPU_ID_REGISTRATION_FAILED);

Why can't we just leave it as the kernel did when it failed the syscall? 
It looks like we'll only end up shadowing UNINITIALIZED all the time and 
it may cause issues if Linux decides to use -2 for some other purpose in 
the future.

Siddhesh
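
For illustration only, here is a minimal sketch of the two behaviours 
being compared above. It is not glibc code. It assumes that the area 
holds the UNINITIALIZED value (-1) before the call and that a failed 
system call leaves the area untouched; the patch does not guarantee 
either. Every sketch_* name is hypothetical: sketch_rseq_area stands in 
for struct rseq, sketch_rseq_syscall for the rseq system call, and the 
two sketch_register_* helpers for rseq_register_current_thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirror the Linux UAPI constants RSEQ_CPU_ID_UNINITIALIZED and
   RSEQ_CPU_ID_REGISTRATION_FAILED; the SKETCH_ names are hypothetical.  */
enum
{
  SKETCH_CPU_ID_UNINITIALIZED = -1,
  SKETCH_CPU_ID_REGISTRATION_FAILED = -2
};

/* Hypothetical stand-in for the cpu_id field of struct rseq.  */
struct sketch_rseq_area
{
  uint32_t cpu_id;
};

/* Hypothetical stand-in for the rseq system call.  On success it stores
   a CPU number (always 0 here).  On failure it returns -1 and, by the
   assumption of this sketch, does not write to the area.  */
static int
sketch_rseq_syscall (struct sketch_rseq_area *area, int should_fail)
{
  assert (area != NULL);
  if (should_fail)
    return -1;
  area->cpu_id = 0;
  return 0;
}

/* Behaviour in the quoted patch: overwrite cpu_id on failure.  */
static void
sketch_register_overwrite (struct sketch_rseq_area *area, int should_fail)
{
  if (sketch_rseq_syscall (area, should_fail) != 0)
    area->cpu_id = (uint32_t) SKETCH_CPU_ID_REGISTRATION_FAILED;
}

/* Alternative raised in the question: keep whatever value was there.  */
static void
sketch_register_keep (struct sketch_rseq_area *area, int should_fail)
{
  (void) sketch_rseq_syscall (area, should_fail);
}
```

Under those assumptions, a failed sketch_register_overwrite leaves 
cpu_id at -2, while a failed sketch_register_keep leaves it at -1. If 
the area started out zeroed instead, the second variant would leave 0, 
which a reader of cpu_id could not tell apart from CPU 0.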