From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: libc-alpha@sourceware.org, Florian Weimer <fweimer@redhat.com>
Subject: [PATCH v9 0/6] Add pidfd and cgroupv2 support for process creation
Date: Thu, 24 Aug 2023 13:42:13 -0300 [thread overview]
Message-ID: <20230824164219.391263-1-adhemerval.zanella@linaro.org> (raw)
The glibc 2.36 added wrappers for Linux syscall pidfd_open, pidfd_getfd,
and pidfd_send_signal, and exported the P_PIDFD to use along with
waitid. The pidfd is a race-free interface, however, the pidfd_open is
subject to TOCTOU if the file descriptor is not obtained directly from
the clone or clone3 syscall (there is still a small window between the
clone return and the pidfd_getfd where the process can be reaped and the
process ID reused).
A fully race-free interface with posix_spawn interface is being
discussed by GNOME [1] [2], and Qt already uses it in its QtProcess
implementation [3]. The Qt implementation has some pitfalls:
- It calls clone through the syscall symbol, which does not run the
pthread_atfork handlers even though it intends to use the clone
semantic for fork (by only using CLONE_PIDFD | SIGCHLD).
- It also does not reset any internal state, such as internal IO,
malloc, loader, etc. locks.
- It does not set the TCB tid field nor the robust list, used by the
pthread code.
- It does not optimize process creation by using CLONE_VM and
CLONE_VFORK.
Also, the recent Linux kernel (starting with 5.7) provides a way to
create a new process in a different cgroups version 2 than the default
one (through clone3 CLONE_INTO_CGROUP flag). Providing it through glibc
interfaces makes it usable without the risk of potential breakage by
issuing clone3 syscall directly (check BZ#26371 discussion).
This patch set adds new interfaces that take care of these potential
issues. The new posix_spawn / posix_spawnp extensions:
#define POSIX_SPAWN_SETCGROUP 0x100
int posix_spawnattr_getcgroup_np (const posix_spawnattr_t
restrict *attr, int *cgroup);
int posix_spawnattr_setcgroup_np (posix_spawnattr_t *restrict attr,
int cgroup);
Allow spawning a new process on a different cgroupv2.
The pidfd_spawn and pidfd_spawnp is similar to posix_spawn and
posix_spawnp, but return a process file descriptor instead of a PID.
int pidfd_spawn (int *restrict pidfd,
const char *restrict file,
const posix_spawn_file_actions_t *restrict facts,
const posix_spawnattr_t *restrict attrp,
char *const argv[restrict],
char *const envp[restrict]);
int pidfd_spawnp (int *restrict pidfd,
const char *restrict path,
const posix_spawn_file_actions_t *restrict facts,
const posix_spawnattr_t *restrict attrp,
char *const argv[restrict_arr],
char *const envp[restrict_arr]);
The implementation makes sure that kernel must support the complete
pidfd interface, meaning that waitid (P_PIDFD) should be supported. It
ensures that a non-racy workaround is required (such as reading procfs
fdinfo pid to use along with old wait interfaces). If the kernel does
not have the required support the interface returns ENOSYS.
A new symbol is used instead of a posix_spawn extension to avoid
possible issues with language bindings that might track the argument
lifetime.
Both symbols reuse the posix_spawn posix_spawn_file_actions_t and
posix_spawnattr_t, to either avoid rehashing the posix_spawn API or add
a new one. It also means that both interfaces support the same
attribute and file actions, and a new flag or file action on posix_spawn
is also added automatically for pidfd_spawn. It includes
POSIX_SPAWN_SETCGROUP.
To have a way to interop between process IDs and process file
descriptors, the pidfd_getpid is also provided:
pid_t pidfd_getpid (int fd)
It reads the procfs fdinfo entry from the file descriptor to get the
process ID.
[1] https://gitlab.gnome.org/GNOME/glib/-/issues/1866
[2] https://sourceware.org/bugzilla/show_bug.cgi?id=30349
[3] https://codebrowser.dev/qt6/qtbase/src/3rdparty/forkfd/forkfd_linux.c.html
---
Changes from v8:
- Removed fork_np.
Changes from v7:
- Redefine __ASSUME_CLONE3 to 0 if the architecture does not support the
syscall.
- Fixed some failing errors to be reported by spawned processes.
- Fixed pre-commit CI for AArch64 failures.
- Rename pidfd_fork to fork_np and make the API extensible
- Document more possible pidfd_getpid errors.
Changes from v6:
- Rebased against master, adjusted symbol version and NEWS entry.
- Added arm/mips clone3 implementation.
Changes from v5:
- Added cgroupv2 support for posix_spawn, pidfd_spawn, and pidfd_fork.
Changes from v4:
- Changed pidfd_fork signature to return a pid_t instead of the PID file
descriptor.
- Changed pidfd_getpid to return EBADF for negative input, instead of
EINVAL.
- Added PIDFDFORK_NOSIGCHLD option.
- Fixed nested __BEGIN_DECLS on spawn.h
Changes from v3:
- Remove strtoul usage.
- Fixed patchwork tst-pidfd_getpid.c regression.
- Fixed manual and NEWS typos.
Changes from v2:
- Added pidfd_fork and pidfd_getpid manual entries
- Change pidfd_fork to act as fork as default, instead as _Fork.
- Changed PIDFD_FORK_RUNATFORK flag to PIDFDFORK_ASYNCSAFE.
- Added pidfd_getpid test for EREMOTE.
Changes from v1:
- Extended pidfd_getpid error codes to return EBADF if fdinfo does not
have Pid entry or if the value is invalid, EREMOTE is pid is in a
separate namespace, and ESRCH if is already terminated.
- Extended tst-pidfd_getpid.
- Rename PIDFD_FORK_RUNATFORK to PIDFDFORK_RUNATFORK to avoid clashes
with possible kernel extensions.
Adhemerval Zanella (6):
arm: Add the clone3 wrapper
mips: Add the clone3 wrapper
linux: Define __ASSUME_CLONE3 to 0 for alpha, ia64, nios2, sh, and
sparc
linux: Add posix_spawnattr_{get,set}cgroup_np (BZ 26731)
posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349)
linux: Add pidfd_getpid
NEWS | 17 ++
bits/spawn_ext.h | 21 ++
include/clone_internal.h | 4 +
manual/process.texi | 52 +++-
posix/Makefile | 2 +
posix/spawn.h | 6 +-
posix/spawn_int.h | 3 +-
posix/spawnattr_setflags.c | 3 +-
posix/tst-posix_spawn-setsid.c | 169 +++++++++----
posix/tst-spawn-chdir.c | 15 +-
posix/tst-spawn.c | 24 +-
posix/tst-spawn.h | 36 +++
posix/tst-spawn2.c | 17 +-
posix/tst-spawn3.c | 99 ++++----
posix/tst-spawn4.c | 7 +-
posix/tst-spawn5.c | 14 +-
posix/tst-spawn6.c | 13 +-
posix/tst-spawn7.c | 13 +-
sysdeps/unix/sysv/linux/Makefile | 26 ++
sysdeps/unix/sysv/linux/Versions | 7 +
sysdeps/unix/sysv/linux/aarch64/libc.abilist | 5 +
.../unix/sysv/linux/alpha/kernel-features.h | 4 +
sysdeps/unix/sysv/linux/alpha/libc.abilist | 5 +
sysdeps/unix/sysv/linux/arc/libc.abilist | 5 +
sysdeps/unix/sysv/linux/arm/be/libc.abilist | 5 +
sysdeps/unix/sysv/linux/arm/clone3.S | 80 +++++++
sysdeps/unix/sysv/linux/arm/le/libc.abilist | 5 +
sysdeps/unix/sysv/linux/arm/sysdep.h | 1 +
sysdeps/unix/sysv/linux/bits/spawn_ext.h | 67 ++++++
sysdeps/unix/sysv/linux/clone-pidfd-support.c | 60 +++++
sysdeps/unix/sysv/linux/csky/libc.abilist | 5 +
sysdeps/unix/sysv/linux/hppa/libc.abilist | 5 +
sysdeps/unix/sysv/linux/i386/libc.abilist | 5 +
.../unix/sysv/linux/ia64/kernel-features.h | 4 +
sysdeps/unix/sysv/linux/ia64/libc.abilist | 5 +
.../sysv/linux/loongarch/lp64/libc.abilist | 5 +
.../sysv/linux/m68k/coldfire/libc.abilist | 5 +
.../unix/sysv/linux/m68k/m680x0/libc.abilist | 5 +
.../sysv/linux/microblaze/be/libc.abilist | 5 +
.../sysv/linux/microblaze/le/libc.abilist | 5 +
sysdeps/unix/sysv/linux/mips/clone3.S | 139 +++++++++++
.../sysv/linux/mips/mips32/fpu/libc.abilist | 5 +
.../sysv/linux/mips/mips32/nofpu/libc.abilist | 5 +
.../sysv/linux/mips/mips64/n32/libc.abilist | 5 +
.../sysv/linux/mips/mips64/n64/libc.abilist | 5 +
sysdeps/unix/sysv/linux/mips/sysdep.h | 2 +
.../unix/sysv/linux/nios2/kernel-features.h | 24 ++
sysdeps/unix/sysv/linux/nios2/libc.abilist | 5 +
sysdeps/unix/sysv/linux/or1k/libc.abilist | 5 +
sysdeps/unix/sysv/linux/pidfd_getpid.c | 126 ++++++++++
sysdeps/unix/sysv/linux/pidfd_spawn.c | 30 +++
sysdeps/unix/sysv/linux/pidfd_spawnp.c | 30 +++
.../linux/powerpc/powerpc32/fpu/libc.abilist | 5 +
.../powerpc/powerpc32/nofpu/libc.abilist | 5 +
.../linux/powerpc/powerpc64/be/libc.abilist | 5 +
.../linux/powerpc/powerpc64/le/libc.abilist | 5 +
sysdeps/unix/sysv/linux/procutils.c | 97 ++++++++
sysdeps/unix/sysv/linux/procutils.h | 43 ++++
.../unix/sysv/linux/riscv/rv32/libc.abilist | 5 +
.../unix/sysv/linux/riscv/rv64/libc.abilist | 5 +
.../unix/sysv/linux/s390/s390-32/libc.abilist | 5 +
.../unix/sysv/linux/s390/s390-64/libc.abilist | 5 +
sysdeps/unix/sysv/linux/sh/be/libc.abilist | 5 +
sysdeps/unix/sysv/linux/sh/kernel-features.h | 4 +
sysdeps/unix/sysv/linux/sh/le/libc.abilist | 5 +
.../unix/sysv/linux/sparc/kernel-features.h | 4 +
.../sysv/linux/sparc/sparc32/libc.abilist | 5 +
.../sysv/linux/sparc/sparc64/libc.abilist | 5 +
.../unix/sysv/linux/spawnattr_getcgroup_np.c | 28 +++
.../unix/sysv/linux/spawnattr_setcgroup_np.c | 27 +++
sysdeps/unix/sysv/linux/spawni.c | 44 +++-
sysdeps/unix/sysv/linux/sys/pidfd.h | 4 +
sysdeps/unix/sysv/linux/tst-pidfd.c | 47 ++++
sysdeps/unix/sysv/linux/tst-pidfd_getpid.c | 123 ++++++++++
.../sysv/linux/tst-posix_spawn-setsid-pidfd.c | 20 ++
sysdeps/unix/sysv/linux/tst-spawn-cgroup.c | 223 ++++++++++++++++++
.../unix/sysv/linux/tst-spawn-chdir-pidfd.c | 20 ++
sysdeps/unix/sysv/linux/tst-spawn-pidfd.c | 20 ++
sysdeps/unix/sysv/linux/tst-spawn-pidfd.h | 63 +++++
sysdeps/unix/sysv/linux/tst-spawn2-pidfd.c | 20 ++
sysdeps/unix/sysv/linux/tst-spawn3-pidfd.c | 20 ++
sysdeps/unix/sysv/linux/tst-spawn4-pidfd.c | 20 ++
sysdeps/unix/sysv/linux/tst-spawn5-pidfd.c | 20 ++
sysdeps/unix/sysv/linux/tst-spawn6-pidfd.c | 20 ++
sysdeps/unix/sysv/linux/tst-spawn7-pidfd.c | 20 ++
.../unix/sysv/linux/x86_64/64/libc.abilist | 5 +
.../unix/sysv/linux/x86_64/x32/libc.abilist | 5 +
87 files changed, 2018 insertions(+), 154 deletions(-)
create mode 100644 bits/spawn_ext.h
create mode 100644 posix/tst-spawn.h
create mode 100644 sysdeps/unix/sysv/linux/arm/clone3.S
create mode 100644 sysdeps/unix/sysv/linux/bits/spawn_ext.h
create mode 100644 sysdeps/unix/sysv/linux/clone-pidfd-support.c
create mode 100644 sysdeps/unix/sysv/linux/mips/clone3.S
create mode 100644 sysdeps/unix/sysv/linux/nios2/kernel-features.h
create mode 100644 sysdeps/unix/sysv/linux/pidfd_getpid.c
create mode 100644 sysdeps/unix/sysv/linux/pidfd_spawn.c
create mode 100644 sysdeps/unix/sysv/linux/pidfd_spawnp.c
create mode 100644 sysdeps/unix/sysv/linux/procutils.c
create mode 100644 sysdeps/unix/sysv/linux/procutils.h
create mode 100644 sysdeps/unix/sysv/linux/spawnattr_getcgroup_np.c
create mode 100644 sysdeps/unix/sysv/linux/spawnattr_setcgroup_np.c
create mode 100644 sysdeps/unix/sysv/linux/tst-pidfd_getpid.c
create mode 100644 sysdeps/unix/sysv/linux/tst-posix_spawn-setsid-pidfd.c
create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-cgroup.c
create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-chdir-pidfd.c
create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-pidfd.c
create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-pidfd.h
create mode 100644 sysdeps/unix/sysv/linux/tst-spawn2-pidfd.c
create mode 100644 sysdeps/unix/sysv/linux/tst-spawn3-pidfd.c
create mode 100644 sysdeps/unix/sysv/linux/tst-spawn4-pidfd.c
create mode 100644 sysdeps/unix/sysv/linux/tst-spawn5-pidfd.c
create mode 100644 sysdeps/unix/sysv/linux/tst-spawn6-pidfd.c
create mode 100644 sysdeps/unix/sysv/linux/tst-spawn7-pidfd.c
--
2.34.1
next reply other threads:[~2023-08-24 16:42 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-24 16:42 Adhemerval Zanella [this message]
2023-08-24 16:42 ` [PATCH v9 1/6] arm: Add the clone3 wrapper Adhemerval Zanella
2023-09-04 12:04 ` Florian Weimer
2023-09-05 13:14 ` Adhemerval Zanella Netto
2023-08-24 16:42 ` [PATCH v9 2/6] mips: " Adhemerval Zanella
2023-09-04 12:04 ` Florian Weimer
2023-09-05 13:15 ` Adhemerval Zanella Netto
2023-08-24 16:42 ` [PATCH v9 3/6] linux: Define __ASSUME_CLONE3 to 0 for alpha, ia64, nios2, sh, and sparc Adhemerval Zanella
2023-08-24 16:42 ` [PATCH v9 4/6] linux: Add posix_spawnattr_{get,set}cgroup_np (BZ 26731) Adhemerval Zanella
2023-08-24 16:42 ` [PATCH v9 5/6] posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349) Adhemerval Zanella
2023-09-04 12:53 ` Florian Weimer
2023-08-24 16:42 ` [PATCH v9 6/6] linux: Add pidfd_getpid Adhemerval Zanella
2023-08-28 12:42 ` [PATCH v9 0/6] Add pidfd and cgroupv2 support for process creation Luca Boccassi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230824164219.391263-1-adhemerval.zanella@linaro.org \
--to=adhemerval.zanella@linaro.org \
--cc=fweimer@redhat.com \
--cc=libc-alpha@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).