From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ot1-x334.google.com (mail-ot1-x334.google.com [IPv6:2607:f8b0:4864:20::334]) by sourceware.org (Postfix) with ESMTPS id D82283858C53 for ; Thu, 24 Aug 2023 16:42:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D82283858C53 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org Received: by mail-ot1-x334.google.com with SMTP id 46e09a7af769-6bc9c01e154so582198a34.0 for ; Thu, 24 Aug 2023 09:42:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1692895343; x=1693500143; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:from:to:cc:subject:date:message-id:reply-to; bh=OC4kh7ficpmCg2HxXca+yFs4X7ibKQMovvGUDtSy660=; b=a7aFT6VIo08+bfP7hg/F3kwjjYzSl2s1qKrJio2gm6TJB0iEJebPilGlSpchRSn2D4 6dxm1nnWCVEWoRDATboOoCiBLUQ/sEsASZ3tAxTlmz80CzBnJGBMX8QcXjGO9pNAFxy5 RfdOvvKhVGpsUxgIpVK2o9C+23AgUhU56j33zQ1hUD8RrZNVzrBuwpOm3oT9MIpj+epA qiC/qtOCS6iPh/wngX9/Hth818pxaaJHkxvdQFeXSSaFDEltkMIQh4X1bEtJOHqhWgZo mVC+7PklRj3sn1WsYZ3tkeoi5QYjNR12rYLRX5PoDdjVNY58chmk/rcpIwt3M0hxCNGH h9jw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692895343; x=1693500143; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=OC4kh7ficpmCg2HxXca+yFs4X7ibKQMovvGUDtSy660=; b=ZnHvaJuGSlMG2q3PW8PbDAx/Z1PSwA3j3lL7HVVUOnfbsH4aK0N7eAptDR6YpbgIeV Q3MG6K7NFp+5TcYl5HRGFIY8blRRkFetMSWJ2tyJcXPb3lmgMgwWPw6U19/N03yh6RBS clI6NRIGjuL6DTrADb4g/FWTar5vHIea4cz2NFqI+iRUl3LJqI5J2BPHXuJWDAdMeIXD HcgtCciC97Xl5TQc+uLQ8+khRPyvtr+vsnGcamhzBFMiwaFvW/lu09DJBWjiL1uU0F22 LqxpXfSdvZu0j5uHZzGKelUzgO7GkUfaaO/gAhIQ1plWjnAKbyEGNbyw+jEFR0UwahIm LkUQ== X-Gm-Message-State: AOJu0YyMExJHZcacIl9xNFRVgkMfkqw0akbIlOEyqCSEpQExLNeeQsJT CDCd3N+aPM+EcOgFT1NNPPqLZ/Swiv6Q9FmO4mwNjA== X-Google-Smtp-Source: AGHT+IFAwTx4l1RKQEQx+uJkvobTlhrSMAW3irXgRzweKDSa7ganHyKkptFJ/lxKISRfJCCJ/qKH9Q== X-Received: by 2002:a05:6808:1294:b0:3a6:f2ab:42c with SMTP id a20-20020a056808129400b003a6f2ab042cmr136504oiw.20.1692895343455; Thu, 24 Aug 2023 09:42:23 -0700 (PDT) Received: from mandiga.. ([2804:1b3:a7c2:c275:5154:75ba:e0b6:f70d]) by smtp.gmail.com with ESMTPSA id r25-20020a544899000000b003a37372e6b2sm5115409oic.21.2023.08.24.09.42.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 24 Aug 2023 09:42:22 -0700 (PDT) From: Adhemerval Zanella To: libc-alpha@sourceware.org, Florian Weimer Subject: [PATCH v9 0/6] Add pidfd and cgroupv2 support for process creation Date: Thu, 24 Aug 2023 13:42:13 -0300 Message-Id: <20230824164219.391263-1-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-5.4 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: The glibc 2.36 added wrappers for Linux syscall pidfd_open, pidfd_getfd, and pidfd_send_signal, and exported the P_PIDFD to use along with waitid. The pidfd is a race-free interface, however, the pidfd_open is subject to TOCTOU if the file descriptor is not obtained directly from the clone or clone3 syscall (there is still a small window between the clone return and the pidfd_getfd where the process can be reaped and the process ID reused). A fully race-free interface with posix_spawn interface is being discussed by GNOME [1] [2], and Qt already uses it in its QtProcess implementation [3]. The Qt implementation has some pitfalls: - It calls clone through the syscall symbol, which does not run the pthread_atfork handlers even though it intends to use the clone semantic for fork (by only using CLONE_PIDFD | SIGCHLD). - It also does not reset any internal state, such as internal IO, malloc, loader, etc. locks. - It does not set the TCB tid field nor the robust list, used by the pthread code. - It does not optimize process creation by using CLONE_VM and CLONE_VFORK. Also, the recent Linux kernel (starting with 5.7) provides a way to create a new process in a different cgroups version 2 than the default one (through clone3 CLONE_INTO_CGROUP flag). Providing it through glibc interfaces makes it usable without the risk of potential breakage by issuing clone3 syscall directly (check BZ#26371 discussion). This patch set adds new interfaces that take care of these potential issues. The new posix_spawn / posix_spawnp extensions: #define POSIX_SPAWN_SETCGROUP 0x100 int posix_spawnattr_getcgroup_np (const posix_spawnattr_t restrict *attr, int *cgroup); int posix_spawnattr_setcgroup_np (posix_spawnattr_t *restrict attr, int cgroup); Allow spawning a new process on a different cgroupv2. The pidfd_spawn and pidfd_spawnp is similar to posix_spawn and posix_spawnp, but return a process file descriptor instead of a PID. int pidfd_spawn (int *restrict pidfd, const char *restrict file, const posix_spawn_file_actions_t *restrict facts, const posix_spawnattr_t *restrict attrp, char *const argv[restrict], char *const envp[restrict]); int pidfd_spawnp (int *restrict pidfd, const char *restrict path, const posix_spawn_file_actions_t *restrict facts, const posix_spawnattr_t *restrict attrp, char *const argv[restrict_arr], char *const envp[restrict_arr]); The implementation makes sure that kernel must support the complete pidfd interface, meaning that waitid (P_PIDFD) should be supported. It ensures that a non-racy workaround is required (such as reading procfs fdinfo pid to use along with old wait interfaces). If the kernel does not have the required support the interface returns ENOSYS. A new symbol is used instead of a posix_spawn extension to avoid possible issues with language bindings that might track the argument lifetime. Both symbols reuse the posix_spawn posix_spawn_file_actions_t and posix_spawnattr_t, to either avoid rehashing the posix_spawn API or add a new one. It also means that both interfaces support the same attribute and file actions, and a new flag or file action on posix_spawn is also added automatically for pidfd_spawn. It includes POSIX_SPAWN_SETCGROUP. To have a way to interop between process IDs and process file descriptors, the pidfd_getpid is also provided: pid_t pidfd_getpid (int fd) It reads the procfs fdinfo entry from the file descriptor to get the process ID. [1] https://gitlab.gnome.org/GNOME/glib/-/issues/1866 [2] https://sourceware.org/bugzilla/show_bug.cgi?id=30349 [3] https://codebrowser.dev/qt6/qtbase/src/3rdparty/forkfd/forkfd_linux.c.html --- Changes from v8: - Removed fork_np. Changes from v7: - Redefine __ASSUME_CLONE3 to 0 if the architecture does not support the syscall. - Fixed some failing errors to be reported by spawned processes. - Fixed pre-commit CI for AArch64 failures. - Rename pidfd_fork to fork_np and make the API extensible - Document more possible pidfd_getpid errors. Changes from v6: - Rebased against master, adjusted symbol version and NEWS entry. - Added arm/mips clone3 implementation. Changes from v5: - Added cgroupv2 support for posix_spawn, pidfd_spawn, and pidfd_fork. Changes from v4: - Changed pidfd_fork signature to return a pid_t instead of the PID file descriptor. - Changed pidfd_getpid to return EBADF for negative input, instead of EINVAL. - Added PIDFDFORK_NOSIGCHLD option. - Fixed nested __BEGIN_DECLS on spawn.h Changes from v3: - Remove strtoul usage. - Fixed patchwork tst-pidfd_getpid.c regression. - Fixed manual and NEWS typos. Changes from v2: - Added pidfd_fork and pidfd_getpid manual entries - Change pidfd_fork to act as fork as default, instead as _Fork. - Changed PIDFD_FORK_RUNATFORK flag to PIDFDFORK_ASYNCSAFE. - Added pidfd_getpid test for EREMOTE. Changes from v1: - Extended pidfd_getpid error codes to return EBADF if fdinfo does not have Pid entry or if the value is invalid, EREMOTE is pid is in a separate namespace, and ESRCH if is already terminated. - Extended tst-pidfd_getpid. - Rename PIDFD_FORK_RUNATFORK to PIDFDFORK_RUNATFORK to avoid clashes with possible kernel extensions. Adhemerval Zanella (6): arm: Add the clone3 wrapper mips: Add the clone3 wrapper linux: Define __ASSUME_CLONE3 to 0 for alpha, ia64, nios2, sh, and sparc linux: Add posix_spawnattr_{get,set}cgroup_np (BZ 26731) posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349) linux: Add pidfd_getpid NEWS | 17 ++ bits/spawn_ext.h | 21 ++ include/clone_internal.h | 4 + manual/process.texi | 52 +++- posix/Makefile | 2 + posix/spawn.h | 6 +- posix/spawn_int.h | 3 +- posix/spawnattr_setflags.c | 3 +- posix/tst-posix_spawn-setsid.c | 169 +++++++++---- posix/tst-spawn-chdir.c | 15 +- posix/tst-spawn.c | 24 +- posix/tst-spawn.h | 36 +++ posix/tst-spawn2.c | 17 +- posix/tst-spawn3.c | 99 ++++---- posix/tst-spawn4.c | 7 +- posix/tst-spawn5.c | 14 +- posix/tst-spawn6.c | 13 +- posix/tst-spawn7.c | 13 +- sysdeps/unix/sysv/linux/Makefile | 26 ++ sysdeps/unix/sysv/linux/Versions | 7 + sysdeps/unix/sysv/linux/aarch64/libc.abilist | 5 + .../unix/sysv/linux/alpha/kernel-features.h | 4 + sysdeps/unix/sysv/linux/alpha/libc.abilist | 5 + sysdeps/unix/sysv/linux/arc/libc.abilist | 5 + sysdeps/unix/sysv/linux/arm/be/libc.abilist | 5 + sysdeps/unix/sysv/linux/arm/clone3.S | 80 +++++++ sysdeps/unix/sysv/linux/arm/le/libc.abilist | 5 + sysdeps/unix/sysv/linux/arm/sysdep.h | 1 + sysdeps/unix/sysv/linux/bits/spawn_ext.h | 67 ++++++ sysdeps/unix/sysv/linux/clone-pidfd-support.c | 60 +++++ sysdeps/unix/sysv/linux/csky/libc.abilist | 5 + sysdeps/unix/sysv/linux/hppa/libc.abilist | 5 + sysdeps/unix/sysv/linux/i386/libc.abilist | 5 + .../unix/sysv/linux/ia64/kernel-features.h | 4 + sysdeps/unix/sysv/linux/ia64/libc.abilist | 5 + .../sysv/linux/loongarch/lp64/libc.abilist | 5 + .../sysv/linux/m68k/coldfire/libc.abilist | 5 + .../unix/sysv/linux/m68k/m680x0/libc.abilist | 5 + .../sysv/linux/microblaze/be/libc.abilist | 5 + .../sysv/linux/microblaze/le/libc.abilist | 5 + sysdeps/unix/sysv/linux/mips/clone3.S | 139 +++++++++++ .../sysv/linux/mips/mips32/fpu/libc.abilist | 5 + .../sysv/linux/mips/mips32/nofpu/libc.abilist | 5 + .../sysv/linux/mips/mips64/n32/libc.abilist | 5 + .../sysv/linux/mips/mips64/n64/libc.abilist | 5 + sysdeps/unix/sysv/linux/mips/sysdep.h | 2 + .../unix/sysv/linux/nios2/kernel-features.h | 24 ++ sysdeps/unix/sysv/linux/nios2/libc.abilist | 5 + sysdeps/unix/sysv/linux/or1k/libc.abilist | 5 + sysdeps/unix/sysv/linux/pidfd_getpid.c | 126 ++++++++++ sysdeps/unix/sysv/linux/pidfd_spawn.c | 30 +++ sysdeps/unix/sysv/linux/pidfd_spawnp.c | 30 +++ .../linux/powerpc/powerpc32/fpu/libc.abilist | 5 + .../powerpc/powerpc32/nofpu/libc.abilist | 5 + .../linux/powerpc/powerpc64/be/libc.abilist | 5 + .../linux/powerpc/powerpc64/le/libc.abilist | 5 + sysdeps/unix/sysv/linux/procutils.c | 97 ++++++++ sysdeps/unix/sysv/linux/procutils.h | 43 ++++ .../unix/sysv/linux/riscv/rv32/libc.abilist | 5 + .../unix/sysv/linux/riscv/rv64/libc.abilist | 5 + .../unix/sysv/linux/s390/s390-32/libc.abilist | 5 + .../unix/sysv/linux/s390/s390-64/libc.abilist | 5 + sysdeps/unix/sysv/linux/sh/be/libc.abilist | 5 + sysdeps/unix/sysv/linux/sh/kernel-features.h | 4 + sysdeps/unix/sysv/linux/sh/le/libc.abilist | 5 + .../unix/sysv/linux/sparc/kernel-features.h | 4 + .../sysv/linux/sparc/sparc32/libc.abilist | 5 + .../sysv/linux/sparc/sparc64/libc.abilist | 5 + .../unix/sysv/linux/spawnattr_getcgroup_np.c | 28 +++ .../unix/sysv/linux/spawnattr_setcgroup_np.c | 27 +++ sysdeps/unix/sysv/linux/spawni.c | 44 +++- sysdeps/unix/sysv/linux/sys/pidfd.h | 4 + sysdeps/unix/sysv/linux/tst-pidfd.c | 47 ++++ sysdeps/unix/sysv/linux/tst-pidfd_getpid.c | 123 ++++++++++ .../sysv/linux/tst-posix_spawn-setsid-pidfd.c | 20 ++ sysdeps/unix/sysv/linux/tst-spawn-cgroup.c | 223 ++++++++++++++++++ .../unix/sysv/linux/tst-spawn-chdir-pidfd.c | 20 ++ sysdeps/unix/sysv/linux/tst-spawn-pidfd.c | 20 ++ sysdeps/unix/sysv/linux/tst-spawn-pidfd.h | 63 +++++ sysdeps/unix/sysv/linux/tst-spawn2-pidfd.c | 20 ++ sysdeps/unix/sysv/linux/tst-spawn3-pidfd.c | 20 ++ sysdeps/unix/sysv/linux/tst-spawn4-pidfd.c | 20 ++ sysdeps/unix/sysv/linux/tst-spawn5-pidfd.c | 20 ++ sysdeps/unix/sysv/linux/tst-spawn6-pidfd.c | 20 ++ sysdeps/unix/sysv/linux/tst-spawn7-pidfd.c | 20 ++ .../unix/sysv/linux/x86_64/64/libc.abilist | 5 + .../unix/sysv/linux/x86_64/x32/libc.abilist | 5 + 87 files changed, 2018 insertions(+), 154 deletions(-) create mode 100644 bits/spawn_ext.h create mode 100644 posix/tst-spawn.h create mode 100644 sysdeps/unix/sysv/linux/arm/clone3.S create mode 100644 sysdeps/unix/sysv/linux/bits/spawn_ext.h create mode 100644 sysdeps/unix/sysv/linux/clone-pidfd-support.c create mode 100644 sysdeps/unix/sysv/linux/mips/clone3.S create mode 100644 sysdeps/unix/sysv/linux/nios2/kernel-features.h create mode 100644 sysdeps/unix/sysv/linux/pidfd_getpid.c create mode 100644 sysdeps/unix/sysv/linux/pidfd_spawn.c create mode 100644 sysdeps/unix/sysv/linux/pidfd_spawnp.c create mode 100644 sysdeps/unix/sysv/linux/procutils.c create mode 100644 sysdeps/unix/sysv/linux/procutils.h create mode 100644 sysdeps/unix/sysv/linux/spawnattr_getcgroup_np.c create mode 100644 sysdeps/unix/sysv/linux/spawnattr_setcgroup_np.c create mode 100644 sysdeps/unix/sysv/linux/tst-pidfd_getpid.c create mode 100644 sysdeps/unix/sysv/linux/tst-posix_spawn-setsid-pidfd.c create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-cgroup.c create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-chdir-pidfd.c create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-pidfd.c create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-pidfd.h create mode 100644 sysdeps/unix/sysv/linux/tst-spawn2-pidfd.c create mode 100644 sysdeps/unix/sysv/linux/tst-spawn3-pidfd.c create mode 100644 sysdeps/unix/sysv/linux/tst-spawn4-pidfd.c create mode 100644 sysdeps/unix/sysv/linux/tst-spawn5-pidfd.c create mode 100644 sysdeps/unix/sysv/linux/tst-spawn6-pidfd.c create mode 100644 sysdeps/unix/sysv/linux/tst-spawn7-pidfd.c -- 2.34.1