From: Adhemerval Zanella
To: libc-alpha@sourceware.org, Luca Boccassi, Philip Withnall
Subject: [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation (resend)
Date: Thu, 6 Jul 2023 10:45:03 -0300
Message-Id: <20230706134508.422526-1-adhemerval.zanella@linaro.org>

glibc 2.36 added wrappers for the Linux syscalls pidfd_open, pidfd_getfd,
and pidfd_send_signal, and exported P_PIDFD for use with waitid.  The
pidfd is a race-free interface; however, pidfd_open is subject to TOCTOU
if the file descriptor is not obtained directly from the clone or clone3
syscall (there is still a small window between the clone return and the
pidfd_open call where the process can be reaped and the process ID
reused).

A fully race-free interface built on the posix_spawn interface is being
discussed by GNOME [1] [2], and Qt already uses one in its QProcess
implementation [3].
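For illustration only, the window described above can be seen in the
pattern that is available today: fork, then pidfd_open on the returned
PID, then waitid with P_PIDFD.  The sketch below is not part of the
patchset.  The helper name fork_and_wait_via_pidfd is hypothetical, the
raw syscall is used so that it builds against a glibc older than 2.36,
and the fallback values for SYS_pidfd_open and P_PIDFD are assumptions
taken from the Linux UAPI headers.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
# define SYS_pidfd_open 434   /* Assumed value; true on most architectures.  */
#endif
#ifndef P_PIDFD
# define P_PIDFD 3            /* Assumed value from the Linux UAPI headers.  */
#endif

/* Hypothetical helper: fork a child that exits with EXIT_CODE, obtain a
   pidfd for it with pidfd_open, and wait for it through the pidfd.
   Returns the child's exit status, or -1 with errno set on failure.  */
static int
fork_and_wait_via_pidfd (int exit_code)
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (exit_code);

  /* Race window: PID is only guaranteed to still name the child because
     this process has not waited on it yet and does not ignore SIGCHLD.
     Generic library code cannot assume that about its callers.  */
  int pidfd = (int) syscall (SYS_pidfd_open, pid, 0);
  if (pidfd < 0)
    {
      int saved = errno;
      waitpid (pid, NULL, 0);   /* Reap the child through the old interface.  */
      errno = saved;
      return -1;
    }

  siginfo_t info;
  memset (&info, 0, sizeof info);
  int r = waitid (P_PIDFD, (id_t) pidfd, &info, WEXITED);
  int saved = errno;
  if (r < 0)
    waitpid (pid, NULL, 0);     /* Kernel without P_PIDFD support.  */
  close (pidfd);
  errno = saved;
  if (r < 0)
    return -1;
  return info.si_code == CLD_EXITED ? info.si_status : -1;
}
```

A caller would use it as "int status = fork_and_wait_via_pidfd (7);".
Obtaining the descriptor from clone or clone3 with CLONE_PIDFD removes
the window marked in the comment, since the kernel creates the
descriptor together with the process.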
The Qt implementation has some pitfalls:

  - It calls clone through the syscall symbol, which does not run the
    pthread_atfork handlers even though it really intends to use the
    clone semantics for fork (by only using CLONE_PIDFD | SIGCHLD).

  - It also does not reset any internal state, such as the internal IO,
    malloc, loader, etc. locks.

  - It does not set the TCB tid field nor the robust list, used by
    pthread code.

  - It does not optimize process creation by using CLONE_VM and
    CLONE_VFORK.

Also, recent Linux kernels (starting with 5.7) provide a way to create a
new process in a cgroup version 2 different from the default one
(through the clone3 CLONE_INTO_CGROUP flag).  Providing it through glibc
interfaces makes it usable without the risk of potential breakage from
issuing the clone3 syscall directly (check the BZ#26371 discussion).

This patchset adds new interfaces that take care of these potential
issues.  The new posix_spawn / posix_spawnp extensions:

  #define POSIX_SPAWN_SETCGROUP 0x100

  int posix_spawnattr_getcgroup_np (const posix_spawnattr_t *restrict attr,
                                    int *cgroup);
  int posix_spawnattr_setcgroup_np (posix_spawnattr_t *restrict attr,
                                    int cgroup);

allow spawning a new process in a different cgroupv2.

The pidfd_spawn and pidfd_spawnp functions are similar to posix_spawn
and posix_spawnp, but return a process file descriptor instead of a PID.

  int pidfd_spawn (int *restrict pidfd,
                   const char *restrict file,
                   const posix_spawn_file_actions_t *restrict facts,
                   const posix_spawnattr_t *restrict attrp,
                   char *const argv[restrict],
                   char *const envp[restrict]);

  int pidfd_spawnp (int *restrict pidfd,
                    const char *restrict path,
                    const posix_spawn_file_actions_t *restrict facts,
                    const posix_spawnattr_t *restrict attrp,
                    char *const argv[restrict],
                    char *const envp[restrict]);

The implementation requires the kernel to support the complete pidfd
interface, meaning that waitid (P_PIDFD) must be supported.
This ensures that no racy workaround is required (such as reading the
procfs fdinfo pid to use along with the old wait interfaces).  If the
kernel does not have the required support, the interface returns ENOSYS.

A new symbol is used instead of a posix_spawn extension to avoid
possible issues with language bindings that might track the argument
lifetime.  Both symbols reuse the posix_spawn types
posix_spawn_file_actions_t and posix_spawnattr_t, to avoid either
rehashing the posix_spawn API or adding a new one.  It also means that
both interfaces support the same attributes and file actions, and a new
flag or file action added to posix_spawn automatically becomes available
to pidfd_spawn.  This includes POSIX_SPAWN_SETCGROUP.

Along with the spawn interface, a fork-like one is also provided:

  pid_t pidfd_fork (int *pidfd, int cgroup, unsigned int flags)

If pidfd is NULL, no file descriptor is returned and pidfd_fork acts as
fork.  Otherwise, a new file descriptor is returned and the kernel
already sets O_CLOEXEC as default.  pidfd_fork follows the fork/_Fork
convention of returning a positive or negative value to the parent (with
negative indicating an error) and zero to the child.  If cgroup is 0 or
a positive value, it is interpreted as the cgroup in which to place the
new process (check the CLONE_INTO_CGROUP clone flag).

Similar to fork, pidfd_fork also runs the pthread_atfork handlers.  This
can be changed by using the PIDFDFORK_ASYNCSAFE flag, which makes
pidfd_fork act as _Fork.  It also sends SIGCHLD to the parent when the
process terminates.

To have a way to interoperate between process IDs and process file
descriptors, pidfd_getpid is also provided:

  pid_t pidfd_getpid (int fd)

It reads the procfs fdinfo entry from the file descriptor to get the
process ID.
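As a rough illustration of the fdinfo lookup just described, the sketch
below parses the "Pid:" field of /proc/self/fdinfo/<fd>.  It is not the
code from this series: the names parse_fdinfo_pid and
pidfd_to_pid_sketch are hypothetical, strtol is used for brevity, and
the mapping of a 0 value to EREMOTE and a negative value to ESRCH is an
assumption based on the error codes listed in the changelog.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: extract the value of the "Pid:" field from the
   contents of an fdinfo file.  Returns 0 and stores the value in *OUT
   on success, or -1 if no valid "Pid:" line is found.  */
static int
parse_fdinfo_pid (const char *text, long *out)
{
  assert (text != NULL && out != NULL);
  const char *line = text;
  while (line != NULL && *line != '\0')
    {
      /* Match only at the start of a line, so "NSpid:" is skipped.  */
      if (strncmp (line, "Pid:", 4) == 0)
        {
          char *end;
          errno = 0;
          long v = strtol (line + 4, &end, 10);
          if (errno != 0 || end == line + 4
              || (*end != '\n' && *end != '\0'))
            return -1;
          *out = v;
          return 0;
        }
      line = strchr (line, '\n');
      if (line != NULL)
        line++;
    }
  return -1;
}

/* Hypothetical stand-in for the lookup described above.  Returns the
   process ID, or -1 with errno set.  */
static pid_t
pidfd_to_pid_sketch (int pidfd)
{
  char path[64];
  char buf[1024];
  snprintf (path, sizeof path, "/proc/self/fdinfo/%d", pidfd);
  FILE *fp = fopen (path, "r");
  if (fp == NULL)
    {
      errno = EBADF;            /* Also covers negative descriptors.  */
      return -1;
    }
  size_t n = fread (buf, 1, sizeof buf - 1, fp);
  fclose (fp);
  buf[n] = '\0';

  long v;
  if (parse_fdinfo_pid (buf, &v) != 0)
    {
      errno = EBADF;            /* No usable "Pid:" entry.  */
      return -1;
    }
  if (v == 0)
    {
      errno = EREMOTE;          /* Assumed: process in another PID namespace.  */
      return -1;
    }
  if (v < 0)
    {
      errno = ESRCH;            /* Assumed: process already terminated.  */
      return -1;
    }
  return (pid_t) v;
}
```

The parsing step is kept separate from the file access so that it can
be exercised on a plain string, without a live process descriptor.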
---

Changes from v5:
- Added cgroupv2 support for posix_spawn, pidfd_spawn, and pidfd_fork.

Changes from v4:
- Changed pidfd_fork signature to return a pid_t instead of a PID file
  descriptor.
- Changed pidfd_getpid to return EBADF for negative input, instead of
  EINVAL.
- Added PIDFDFORK_NOSIGCHLD option.
- Fixed nested __BEGIN_DECLS on spawn.h.

Changes from v3:
- Removed strtoul usage.
- Fixed patchwork tst-pidfd_getpid.c regression.
- Fixed manual and NEWS typos.

Changes from v2:
- Added pidfd_fork and pidfd_getpid manual entries.
- Changed pidfd_fork to act as fork by default, instead of as _Fork.
- Changed PIDFD_FORK_RUNATFORK flag to PIDFDFORK_ASYNCSAFE.
- Added pidfd_getpid test for EREMOTE.

Changes from v1:
- Extended pidfd_getpid error codes to return EBADF if fdinfo does not
  have a Pid entry or if the value is invalid, EREMOTE if the pid is in
  a separate namespace, and ESRCH if it is already terminated.
- Extended tst-pidfd_getpid.
- Renamed PIDFD_FORK_RUNATFORK to PIDFDFORK_RUNATFORK to avoid a clash
  with possible kernel extensions.

Adhemerval Zanella (5):
  linux: Add posix_spawnattr_{get,set}cgroup_np (BZ 26731)
  posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349)
  posix: Add pidfd_fork (BZ 26371)
  posix: Add PIDFDFORK_NOSIGCHLD for pidfd_fork
  linux: Add pidfd_getpid

 NEWS                                          |  22 ++
 bits/spawn_ext.h                              |  21 ++
 include/clone_internal.h                      |  21 ++
 manual/process.texi                           |  92 ++++++-
 posix/Makefile                                |   5 +-
 posix/fork-internal.c                         | 127 ++++++++++
 posix/fork-internal.h                         |  36 +++
 posix/fork.c                                  | 107 +--------
 posix/spawn.h                                 |   6 +-
 posix/spawn_int.h                             |   3 +-
 posix/spawnattr_setflags.c                    |   3 +-
 posix/tst-posix_spawn-setsid.c                | 168 +++++++++----
 posix/tst-spawn-chdir.c                       |  15 +-
 posix/tst-spawn.c                             |  24 +-
 posix/tst-spawn.h                             |  36 +++
 posix/tst-spawn2.c                            |  17 +-
 posix/tst-spawn3.c                            | 100 ++++----
 posix/tst-spawn4.c                            |   7 +-
 posix/tst-spawn5.c                            |  14 +-
 posix/tst-spawn6.c                            |  15 +-
 posix/tst-spawn7.c                            |  13 +-
 sysdeps/nptl/_Fork.c                          |   2 +-
 sysdeps/unix/sysv/linux/Makefile              |  29 +++
 sysdeps/unix/sysv/linux/Versions              |   8 +
 sysdeps/unix/sysv/linux/aarch64/libc.abilist  |   6 +
 sysdeps/unix/sysv/linux/alpha/libc.abilist    |   6 +
 sysdeps/unix/sysv/linux/arc/libc.abilist      |   6 +
 sysdeps/unix/sysv/linux/arch-fork.h           |  16 +-
 sysdeps/unix/sysv/linux/arm/be/libc.abilist   |   6 +
 sysdeps/unix/sysv/linux/arm/le/libc.abilist   |   6 +
 sysdeps/unix/sysv/linux/bits/spawn_ext.h      |  60 +++++
 sysdeps/unix/sysv/linux/clone-internal.c      |  60 ++++-
 sysdeps/unix/sysv/linux/clone-pidfd-support.c |  58 +++++
 sysdeps/unix/sysv/linux/csky/libc.abilist     |   6 +
 sysdeps/unix/sysv/linux/hppa/libc.abilist     |   6 +
 sysdeps/unix/sysv/linux/i386/libc.abilist     |   6 +
 sysdeps/unix/sysv/linux/ia64/libc.abilist     |   6 +
 .../sysv/linux/loongarch/lp64/libc.abilist    |   6 +
 .../sysv/linux/m68k/coldfire/libc.abilist     |   6 +
 .../unix/sysv/linux/m68k/m680x0/libc.abilist  |   6 +
 .../sysv/linux/microblaze/be/libc.abilist     |   6 +
 .../sysv/linux/microblaze/le/libc.abilist     |   6 +
 .../sysv/linux/mips/mips32/fpu/libc.abilist   |   6 +
 .../sysv/linux/mips/mips32/nofpu/libc.abilist |   6 +
 .../sysv/linux/mips/mips64/n32/libc.abilist   |   6 +
 .../sysv/linux/mips/mips64/n64/libc.abilist   |   6 +
 sysdeps/unix/sysv/linux/nios2/libc.abilist    |   6 +
 sysdeps/unix/sysv/linux/or1k/libc.abilist     |   6 +
 sysdeps/unix/sysv/linux/pidfd_fork.c          |  82 +++++++
 sysdeps/unix/sysv/linux/pidfd_getpid.c        | 122 ++++++++++
 sysdeps/unix/sysv/linux/pidfd_spawn.c         |  30 +++
 sysdeps/unix/sysv/linux/pidfd_spawnp.c        |  30 +++
 .../linux/powerpc/powerpc32/fpu/libc.abilist  |   6 +
 .../powerpc/powerpc32/nofpu/libc.abilist      |   6 +
 .../linux/powerpc/powerpc64/be/libc.abilist   |   6 +
 .../linux/powerpc/powerpc64/le/libc.abilist   |   6 +
 sysdeps/unix/sysv/linux/procutils.c           | 104 ++++++++
 sysdeps/unix/sysv/linux/procutils.h           |  35 +++
 .../unix/sysv/linux/riscv/rv32/libc.abilist   |   6 +
 .../unix/sysv/linux/riscv/rv64/libc.abilist   |   6 +
 .../unix/sysv/linux/s390/s390-32/libc.abilist |   6 +
 .../unix/sysv/linux/s390/s390-64/libc.abilist |   6 +
 sysdeps/unix/sysv/linux/sh/be/libc.abilist    |   6 +
 sysdeps/unix/sysv/linux/sh/le/libc.abilist    |   6 +
 .../sysv/linux/sparc/sparc32/libc.abilist     |   6 +
 .../sysv/linux/sparc/sparc64/libc.abilist     |   6 +
 .../unix/sysv/linux/spawnattr_getcgroup_np.c  |  28 +++
 .../unix/sysv/linux/spawnattr_setcgroup_np.c  |  27 +++
 sysdeps/unix/sysv/linux/spawni.c              |  38 ++-
 sysdeps/unix/sysv/linux/sys/pidfd.h           |  25 ++
 sysdeps/unix/sysv/linux/tst-pidfd.c           |  47 ++++
 .../unix/sysv/linux/tst-pidfd_fork-cgroup.c   | 162 +++++++++++++
 sysdeps/unix/sysv/linux/tst-pidfd_fork.c      | 227 ++++++++++++++++++
 sysdeps/unix/sysv/linux/tst-pidfd_getpid.c    | 187 +++++++++++++++
 .../sysv/linux/tst-posix_spawn-setsid-pidfd.c |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn-cgroup.c    | 216 +++++++++++++++++
 .../unix/sysv/linux/tst-spawn-chdir-pidfd.c   |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn-pidfd.c     |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn-pidfd.h     |  63 +++++
 sysdeps/unix/sysv/linux/tst-spawn2-pidfd.c    |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn3-pidfd.c    |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn4-pidfd.c    |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn5-pidfd.c    |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn6-pidfd.c    |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn7-pidfd.c    |  20 ++
 .../unix/sysv/linux/x86_64/64/libc.abilist    |   6 +
 .../unix/sysv/linux/x86_64/x32/libc.abilist   |   6 +
 87 files changed, 2624 insertions(+), 268 deletions(-)
 create mode 100644 bits/spawn_ext.h
 create mode 100644 posix/fork-internal.c
 create mode 100644 posix/fork-internal.h
 create mode 100644 posix/tst-spawn.h
 create mode 100644 sysdeps/unix/sysv/linux/bits/spawn_ext.h
 create mode 100644 sysdeps/unix/sysv/linux/clone-pidfd-support.c
 create mode 100644 sysdeps/unix/sysv/linux/pidfd_fork.c
 create mode 100644 sysdeps/unix/sysv/linux/pidfd_getpid.c
 create mode 100644 sysdeps/unix/sysv/linux/pidfd_spawn.c
 create mode 100644 sysdeps/unix/sysv/linux/pidfd_spawnp.c
 create mode 100644 sysdeps/unix/sysv/linux/procutils.c
 create mode 100644 sysdeps/unix/sysv/linux/procutils.h
 create mode 100644 sysdeps/unix/sysv/linux/spawnattr_getcgroup_np.c
 create mode 100644 sysdeps/unix/sysv/linux/spawnattr_setcgroup_np.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-pidfd_fork-cgroup.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-pidfd_fork.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-pidfd_getpid.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-posix_spawn-setsid-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-cgroup.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-chdir-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-pidfd.h
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn2-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn3-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn4-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn5-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn6-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn7-pidfd.c

-- 
2.34.1