public inbox for libc-alpha@sourceware.org
* [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation
@ 2023-07-05 20:43 Adhemerval Zanella
  2023-07-05 20:43 ` [PATCH v6 1/5] linux: Add posix_spawnattr_{get,set}cgroup_np (BZ 26731) Adhemerval Zanella
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Adhemerval Zanella @ 2023-07-05 20:43 UTC (permalink / raw)
  To: libc-alpha, Luca Boccassi, Philip Withnall

Add pidfd and cgroupv2 support for process creation

glibc 2.36 added wrappers for the Linux syscalls pidfd_open,
pidfd_getfd, and pidfd_send_signal, and exported P_PIDFD for use
with waitid.  The pidfd itself is a race-free interface; however,
pidfd_open is subject to TOCTOU if the file descriptor is not
obtained directly from the clone or clone3 syscall (there is still
a small window between the clone return and the pidfd_open call
where the process can be reaped and the process ID reused).
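
To make the window concrete, here is a minimal sketch (not part of
this patchset) of the fork + pidfd_open pattern.  The helper name
wait_child_via_pidfd is hypothetical, the raw syscall is used on the
assumption that a libc wrapper may be missing, and the fallback
values for P_PIDFD and SYS_pidfd_open are assumed from the generic
Linux UAPI numbering:

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef P_PIDFD
# define P_PIDFD 3            /* Assumed UAPI value for older headers.  */
#endif
#ifndef SYS_pidfd_open
# define SYS_pidfd_open 434   /* Assumed generic syscall number.  */
#endif

/* Fork a child that exits with CODE, obtain a pidfd for it with
   pidfd_open, and wait through the pidfd.  Returns the exit status,
   or -1 on any error.  */
int
wait_child_via_pidfd (int code)
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (code);

  /* The window: PID is only a number here.  If anything else reaps
     the child before the next call (SIGCHLD ignored, another thread
     calling wait), the number may be reused by an unrelated process.  */
  int pidfd = (int) syscall (SYS_pidfd_open, pid, 0);
  if (pidfd < 0)
    {
      int saved = errno;
      waitpid (pid, NULL, 0);   /* Do not leave a zombie behind.  */
      errno = saved;
      return -1;
    }

  siginfo_t info;
  memset (&info, 0, sizeof info);
  int r = waitid (P_PIDFD, pidfd, &info, WEXITED);
  close (pidfd);
  if (r != 0 || info.si_code != CLD_EXITED)
    return -1;
  return info.si_status;
}
```

In this sketch the window is harmless because nothing else reaps the
child; it is the general pattern that cannot rule the race out.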

A fully race-free interface on top of posix_spawn is being
discussed by GNOME [1] [2], and Qt already uses one in its QtProcess
implementation [3].  The Qt implementation has some pitfalls:

  - It calls clone through the syscall symbol, which does not run the
    pthread_atfork handlers even though it really intends to use the
    clone semantic for fork (by only using CLONE_PIDFD | SIGCHLD).

  - It also does not reset any internal state, such as internal IO,
    malloc, loader, etc. locks.

  - It does not set the TCB tid field nor the robust list, used by
    pthread code.

  - It does not optimize process creation by using CLONE_VM and
    CLONE_VFORK.

Also, recent Linux kernels (starting with 5.7) provide a way to
create a new process in a cgroup version 2 different from the
default one (through the clone3 CLONE_INTO_CGROUP flag).  Providing it
through glibc interfaces makes it usable without the risk of potential
breakage from issuing the clone3 syscall directly (check the BZ#26371
discussion).

This patchset adds new interfaces that take care of these potential
issues.  The new posix_spawn / posix_spawnp extensions:


  #define POSIX_SPAWN_SETCGROUP 0x100

  int posix_spawnattr_getcgroup_np (const posix_spawnattr_t *restrict attr,
                                    int *cgroup);
  int posix_spawnattr_setcgroup_np (posix_spawnattr_t *restrict attr,
                                    int cgroup);

These allow spawning a new process in a different cgroupv2.

pidfd_spawn and pidfd_spawnp are similar to posix_spawn and
posix_spawnp, but return a process file descriptor instead of a PID.

  int pidfd_spawn (int *restrict pidfd,
                   const char *restrict file,
                   const posix_spawn_file_actions_t *restrict facts,
                   const posix_spawnattr_t *restrict attrp,
                   char *const argv[restrict],
                   char *const envp[restrict]);

  int pidfd_spawnp (int *restrict pidfd,
                    const char *restrict path,
                    const posix_spawn_file_actions_t *restrict facts,
                    const posix_spawnattr_t *restrict attrp,
                    char *const argv[restrict],
                    char *const envp[restrict]);
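
A minimal usage sketch of the prototype above, under stated
assumptions: the helper name spawn_and_wait is hypothetical, and the
presence of POSIX_SPAWN_SETCGROUP is used only as a rough
compile-time hint that the headers come from a glibc carrying this
series.  Without it the sketch falls back to plain posix_spawn so it
still builds:

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <signal.h>
#include <spawn.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef P_PIDFD
# define P_PIDFD 3   /* Assumed UAPI value for older headers.  */
#endif

extern char **environ;

/* Spawn ARGV[0] with ARGV and wait for it.  Returns the exit status,
   or -1 on any error.  */
int
spawn_and_wait (char *const argv[])
{
  siginfo_t info;
  memset (&info, 0, sizeof info);
  int r;

#ifdef POSIX_SPAWN_SETCGROUP   /* Assumption: rough feature test.  */
  int pidfd = -1;
  /* Like posix_spawn, the result is an error number, not -1/errno.  */
  r = pidfd_spawn (&pidfd, argv[0], NULL, NULL, argv, environ);
  if (r != 0)
    return -1;
  r = waitid (P_PIDFD, pidfd, &info, WEXITED);
  close (pidfd);
#else
  pid_t pid;
  r = posix_spawn (&pid, argv[0], NULL, NULL, argv, environ);
  if (r != 0)
    return -1;
  r = waitid (P_PID, pid, &info, WEXITED);
#endif

  if (r != 0 || info.si_code != CLD_EXITED)
    return -1;
  return info.si_status;
}
```

The only difference for the caller is the type of the handle: the
pidfd is waited on with waitid (P_PIDFD, ...) and must be closed,
whereas a PID is waited on directly.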

The implementation requires the kernel to support the complete
pidfd interface, meaning that waitid (P_PIDFD) must be supported.  This
ensures that no racy workaround is required (such as reading the procfs
fdinfo pid to use along with the old wait interfaces).  If the kernel
does not have the required support, the interface returns ENOSYS.
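
One way such a support check could look, as a sketch only.  It
assumes that a kernel without P_PIDFD support rejects the id type
with EINVAL, while a supporting kernel fails with EBADF for a
descriptor that is not open.  The name kernel_supports_waitid_pidfd
is hypothetical, and this is not claimed to be the code in
clone-pidfd-support.c:

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <errno.h>
#include <limits.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

#ifndef P_PIDFD
# define P_PIDFD 3   /* Assumed UAPI value for older headers.  */
#endif

/* Return 1 if waitid appears to understand P_PIDFD, 0 otherwise.  */
int
kernel_supports_waitid_pidfd (void)
{
  siginfo_t info;
  /* INT_MAX is not expected to be an open file descriptor, so the
     call cannot wait on a real process.  */
  int r = waitid (P_PIDFD, INT_MAX, &info, WEXITED);
  if (r == 0)
    return 1;
  /* EINVAL is taken to mean "unknown id type"; any other error
     (EBADF is the expected one) means the id type was accepted.  */
  return errno != EINVAL;
}
```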

A new symbol is used instead of a posix_spawn extension to avoid
possible issues with language bindings that might track the argument
lifetime.

Both symbols reuse posix_spawn's posix_spawn_file_actions_t and
posix_spawnattr_t, to avoid either rehashing the posix_spawn API or
adding a new one.  It also means that both interfaces support the same
attributes and file actions, and any new flag or file action added to
posix_spawn is automatically available to pidfd_spawn as well.  This
includes POSIX_SPAWN_SETCGROUP.

Along with the spawn interface, a fork like one is also provided:

  pid_t pidfd_fork (int *pidfd, int cgroup, unsigned int flags)

If PIDFD is NULL, no file descriptor is returned and pidfd_fork
acts as fork.  Otherwise, a new file descriptor is returned, and the
kernel already sets O_CLOEXEC on it by default.  pidfd_fork follows the
fork/_Fork convention of returning a positive or negative value to the
parent (with negative indicating an error) and zero to the child.

If cgroup is 0 or a positive value, it is interpreted as the cgroup
in which to place the new process (check the CLONE_INTO_CGROUP clone
flag).

Similar to fork, pidfd_fork also runs the pthread_atfork handlers.
This can be changed with the PIDFDFORK_ASYNCSAFE flag, which makes
pidfd_fork act as _Fork.  It also sends SIGCHLD to the parent when
the process terminates.
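
A sketch of the calling convention described above, under stated
assumptions: the helper name fork_and_wait is hypothetical, the
declaration is assumed to live in <sys/pidfd.h>, PIDFDFORK_ASYNCSAFE
is assumed to be a preprocessor macro usable as a feature test, and a
negative cgroup argument is assumed to mean "no cgroup change".  When
the proposed interface is not available the sketch falls back to
fork:

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#if defined __has_include
# if __has_include (<sys/pidfd.h>)
#  include <sys/pidfd.h>
# endif
#endif

#ifndef P_PIDFD
# define P_PIDFD 3   /* Assumed UAPI value for older headers.  */
#endif

/* Create a child that exits with CODE and wait for it.  Returns the
   exit status, or -1 on any error.  */
int
fork_and_wait (int code)
{
  siginfo_t info;
  memset (&info, 0, sizeof info);
  int r;

#ifdef PIDFDFORK_ASYNCSAFE   /* Assumption: macro from this series.  */
  int pidfd = -1;
  /* Flags 0: behave like fork, including the atfork handlers.  */
  pid_t pid = pidfd_fork (&pidfd, -1, 0);
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (code);           /* Child: zero is returned, as with fork.  */
  r = waitid (P_PIDFD, pidfd, &info, WEXITED);
  close (pidfd);
#else
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (code);
  r = waitid (P_PID, pid, &info, WEXITED);
#endif

  if (r != 0 || info.si_code != CLD_EXITED)
    return -1;
  return info.si_status;
}
```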

To provide a way to interoperate between process IDs and process file
descriptors, pidfd_getpid is also provided:

   pid_t pidfd_getpid (int fd)

It reads the procfs fdinfo entry from the file descriptor to get
the process ID.
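
The parsing step can be sketched as below.  This is an illustration,
not the code in pidfd_getpid.c: the helper name parse_fdinfo_pid is
hypothetical, it works on text already read from the fdinfo entry,
and it assumes the entry carries a line of the form "Pid:\t<number>".
The error mapping follows the one listed under "Changes from v1":

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Extract the process ID from the text of a pidfd fdinfo entry.
   Returns the PID, or -1 with errno set to EBADF (no usable Pid
   line), EREMOTE (process in another PID namespace, reported as 0),
   or ESRCH (process already terminated, reported as negative).  */
pid_t
parse_fdinfo_pid (const char *text)
{
  const char *line = text;
  while (line != NULL && *line != '\0')
    {
      /* Match only at the start of a line, so "NSpid:" is skipped.  */
      if (strncmp (line, "Pid:", 4) == 0)
        {
          char *end;
          errno = 0;
          long value = strtol (line + 4, &end, 10);
          if (end == line + 4 || errno != 0)
            {
              errno = EBADF;
              return -1;
            }
          if (value == 0)
            {
              errno = EREMOTE;
              return -1;
            }
          if (value < 0)
            {
              errno = ESRCH;
              return -1;
            }
          return (pid_t) value;
        }
      const char *nl = strchr (line, '\n');
      line = nl != NULL ? nl + 1 : NULL;
    }
  errno = EBADF;
  return -1;
}
```

A caller would read /proc/self/fdinfo/<fd> into a buffer and pass it
to this helper.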

---

Changes from v5:
- Added cgroupv2 support for posix_spawn, pidfd_spawn, and pidfd_fork.

Changes from v4:
- Changed pidfd_fork signature to return a pid_t instead of PID file
  descriptor.
- Changed pidfd_getpid to return EBADF for negative input, instead of
  EINVAL.
- Added PIDFDFORK_NOSIGCHLD option.
- Fixed nested __BEGIN_DECLS on spawn.h

Changes from v3:
- Remove strtoul usage.
- Fixed patchwork tst-pidfd_getpid.c regression.
- Fixed manual and NEWS typos.

Changes from v2:
- Added pidfd_fork and pidfd_getpid manual entries
- Changed pidfd_fork to act as fork by default, instead of as _Fork.
- Changed PIDFD_FORK_RUNATFORK flag to PIDFDFORK_ASYNCSAFE.
- Added pidfd_getpid test for EREMOTE.

Changes from v1:
- Extended pidfd_getpid error codes to return EBADF if fdinfo does not
  have a Pid entry or if the value is invalid, EREMOTE if the pid is in
  a separate namespace, and ESRCH if the process has already terminated.
- Extended tst-pidfd_getpid.
- Rename PIDFD_FORK_RUNATFORK to PIDFDFORK_RUNATFORK to avoid clash
  with possible kernel extensions.

Adhemerval Zanella (5):
  linux: Add posix_spawnattr_{get,set}cgroup_np (BZ 26731)
  posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349)
  posix: Add pidfd_fork (BZ 26371)
  posix: Add PIDFDFORK_NOSIGCHLD for pidfd_fork
  linux: Add pidfd_getpid

 NEWS                                          |  22 ++
 bits/spawn_ext.h                              |  21 ++
 include/clone_internal.h                      |  21 ++
 manual/process.texi                           |  92 ++++++-
 posix/Makefile                                |   5 +-
 posix/fork-internal.c                         | 127 ++++++++++
 posix/fork-internal.h                         |  36 +++
 posix/fork.c                                  | 107 +--------
 posix/spawn.h                                 |   6 +-
 posix/spawn_int.h                             |   3 +-
 posix/spawnattr_setflags.c                    |   3 +-
 posix/tst-posix_spawn-setsid.c                | 168 +++++++++----
 posix/tst-spawn-chdir.c                       |  15 +-
 posix/tst-spawn.c                             |  24 +-
 posix/tst-spawn.h                             |  36 +++
 posix/tst-spawn2.c                            |  17 +-
 posix/tst-spawn3.c                            | 100 ++++----
 posix/tst-spawn4.c                            |   7 +-
 posix/tst-spawn5.c                            |  14 +-
 posix/tst-spawn6.c                            |  15 +-
 posix/tst-spawn7.c                            |  13 +-
 sysdeps/nptl/_Fork.c                          |   2 +-
 sysdeps/unix/sysv/linux/Makefile              |  29 +++
 sysdeps/unix/sysv/linux/Versions              |   8 +
 sysdeps/unix/sysv/linux/aarch64/libc.abilist  |   6 +
 sysdeps/unix/sysv/linux/alpha/libc.abilist    |   6 +
 sysdeps/unix/sysv/linux/arc/libc.abilist      |   6 +
 sysdeps/unix/sysv/linux/arch-fork.h           |  16 +-
 sysdeps/unix/sysv/linux/arm/be/libc.abilist   |   6 +
 sysdeps/unix/sysv/linux/arm/le/libc.abilist   |   6 +
 sysdeps/unix/sysv/linux/bits/spawn_ext.h      |  60 +++++
 sysdeps/unix/sysv/linux/clone-internal.c      |  60 ++++-
 sysdeps/unix/sysv/linux/clone-pidfd-support.c |  58 +++++
 sysdeps/unix/sysv/linux/csky/libc.abilist     |   6 +
 sysdeps/unix/sysv/linux/hppa/libc.abilist     |   6 +
 sysdeps/unix/sysv/linux/i386/libc.abilist     |   6 +
 sysdeps/unix/sysv/linux/ia64/libc.abilist     |   6 +
 .../sysv/linux/loongarch/lp64/libc.abilist    |   6 +
 .../sysv/linux/m68k/coldfire/libc.abilist     |   6 +
 .../unix/sysv/linux/m68k/m680x0/libc.abilist  |   6 +
 .../sysv/linux/microblaze/be/libc.abilist     |   6 +
 .../sysv/linux/microblaze/le/libc.abilist     |   6 +
 .../sysv/linux/mips/mips32/fpu/libc.abilist   |   6 +
 .../sysv/linux/mips/mips32/nofpu/libc.abilist |   6 +
 .../sysv/linux/mips/mips64/n32/libc.abilist   |   6 +
 .../sysv/linux/mips/mips64/n64/libc.abilist   |   6 +
 sysdeps/unix/sysv/linux/nios2/libc.abilist    |   6 +
 sysdeps/unix/sysv/linux/or1k/libc.abilist     |   6 +
 sysdeps/unix/sysv/linux/pidfd_fork.c          |  82 +++++++
 sysdeps/unix/sysv/linux/pidfd_getpid.c        | 122 ++++++++++
 sysdeps/unix/sysv/linux/pidfd_spawn.c         |  30 +++
 sysdeps/unix/sysv/linux/pidfd_spawnp.c        |  30 +++
 .../linux/powerpc/powerpc32/fpu/libc.abilist  |   6 +
 .../powerpc/powerpc32/nofpu/libc.abilist      |   6 +
 .../linux/powerpc/powerpc64/be/libc.abilist   |   6 +
 .../linux/powerpc/powerpc64/le/libc.abilist   |   6 +
 sysdeps/unix/sysv/linux/procutils.c           | 104 ++++++++
 sysdeps/unix/sysv/linux/procutils.h           |  35 +++
 .../unix/sysv/linux/riscv/rv32/libc.abilist   |   6 +
 .../unix/sysv/linux/riscv/rv64/libc.abilist   |   6 +
 .../unix/sysv/linux/s390/s390-32/libc.abilist |   6 +
 .../unix/sysv/linux/s390/s390-64/libc.abilist |   6 +
 sysdeps/unix/sysv/linux/sh/be/libc.abilist    |   6 +
 sysdeps/unix/sysv/linux/sh/le/libc.abilist    |   6 +
 .../sysv/linux/sparc/sparc32/libc.abilist     |   6 +
 .../sysv/linux/sparc/sparc64/libc.abilist     |   6 +
 .../unix/sysv/linux/spawnattr_getcgroup_np.c  |  28 +++
 .../unix/sysv/linux/spawnattr_setcgroup_np.c  |  27 +++
 sysdeps/unix/sysv/linux/spawni.c              |  38 ++-
 sysdeps/unix/sysv/linux/sys/pidfd.h           |  25 ++
 sysdeps/unix/sysv/linux/tst-pidfd.c           |  47 ++++
 .../unix/sysv/linux/tst-pidfd_fork-cgroup.c   | 162 +++++++++++++
 sysdeps/unix/sysv/linux/tst-pidfd_fork.c      | 227 ++++++++++++++++++
 sysdeps/unix/sysv/linux/tst-pidfd_getpid.c    | 187 +++++++++++++++
 .../sysv/linux/tst-posix_spawn-setsid-pidfd.c |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn-cgroup.c    | 216 +++++++++++++++++
 .../unix/sysv/linux/tst-spawn-chdir-pidfd.c   |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn-pidfd.c     |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn-pidfd.h     |  63 +++++
 sysdeps/unix/sysv/linux/tst-spawn2-pidfd.c    |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn3-pidfd.c    |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn4-pidfd.c    |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn5-pidfd.c    |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn6-pidfd.c    |  20 ++
 sysdeps/unix/sysv/linux/tst-spawn7-pidfd.c    |  20 ++
 .../unix/sysv/linux/x86_64/64/libc.abilist    |   6 +
 .../unix/sysv/linux/x86_64/x32/libc.abilist   |   6 +
 87 files changed, 2624 insertions(+), 268 deletions(-)
 create mode 100644 bits/spawn_ext.h
 create mode 100644 posix/fork-internal.c
 create mode 100644 posix/fork-internal.h
 create mode 100644 posix/tst-spawn.h
 create mode 100644 sysdeps/unix/sysv/linux/bits/spawn_ext.h
 create mode 100644 sysdeps/unix/sysv/linux/clone-pidfd-support.c
 create mode 100644 sysdeps/unix/sysv/linux/pidfd_fork.c
 create mode 100644 sysdeps/unix/sysv/linux/pidfd_getpid.c
 create mode 100644 sysdeps/unix/sysv/linux/pidfd_spawn.c
 create mode 100644 sysdeps/unix/sysv/linux/pidfd_spawnp.c
 create mode 100644 sysdeps/unix/sysv/linux/procutils.c
 create mode 100644 sysdeps/unix/sysv/linux/procutils.h
 create mode 100644 sysdeps/unix/sysv/linux/spawnattr_getcgroup_np.c
 create mode 100644 sysdeps/unix/sysv/linux/spawnattr_setcgroup_np.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-pidfd_fork-cgroup.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-pidfd_fork.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-pidfd_getpid.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-posix_spawn-setsid-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-cgroup.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-chdir-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-pidfd.h
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn2-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn3-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn4-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn5-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn6-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn7-pidfd.c

-- 
2.34.1



* [PATCH v6 1/5] linux: Add posix_spawnattr_{get,set}cgroup_np (BZ 26731)
  2023-07-05 20:43 [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation Adhemerval Zanella
@ 2023-07-05 20:43 ` Adhemerval Zanella
  2023-07-05 20:43 ` [PATCH v6 2/5] posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349) Adhemerval Zanella
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Adhemerval Zanella @ 2023-07-05 20:43 UTC (permalink / raw)
  To: libc-alpha, Luca Boccassi, Philip Withnall

These functions allow posix_spawn and posix_spawnp to use
CLONE_INTO_CGROUP with clone3, allowing the child process to
be created in a different version 2 cgroup.  They are GNU
extensions that are available only on Linux, and only
for the architectures that implement the clone3 wrapper
(HAVE_CLONE3_WRAPPER).

To create a process in a different cgroupv2, one can use:

  posix_spawnattr_t attr;
  posix_spawnattr_init (&attr);
  posix_spawnattr_setflags (&attr, POSIX_SPAWN_SETCGROUP);
  posix_spawnattr_setcgroup_np (&attr, cgroup);
  posix_spawn (...)

Similar to other posix_spawn flags, POSIX_SPAWN_SETCGROUP controls
whether the cgroup file descriptor will be used or not with
clone3.

There is no fallback if either clone3 does not support the flag
or the architecture does not provide the clone3 wrapper; in
this case posix_spawn returns ENOTSUP.
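
A fuller sketch of the sequence above, including how the cgroup file
descriptor might be obtained.  The helper name spawn_in_cgroup is
hypothetical, and opening the cgroup directory (for instance
somewhere under /sys/fs/cgroup) with O_RDONLY | O_DIRECTORY is an
assumption about how the caller gets the descriptor.  When the
headers do not define POSIX_SPAWN_SETCGROUP the sketch reports
ENOTSUP instead of spawning:

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <spawn.h>
#include <sys/types.h>
#include <unistd.h>

extern char **environ;

/* Spawn ARGV[0] inside the version 2 cgroup whose directory is
   CGROUP_DIR.  Returns 0 and stores the child PID in *PID on
   success, or an error number (posix_spawn convention).  */
int
spawn_in_cgroup (const char *cgroup_dir, char *const argv[], pid_t *pid)
{
  int cgroup = open (cgroup_dir, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
  if (cgroup < 0)
    return errno;

  int r;
#ifdef POSIX_SPAWN_SETCGROUP
  posix_spawnattr_t attr;
  r = posix_spawnattr_init (&attr);
  if (r == 0)
    {
      r = posix_spawnattr_setflags (&attr, POSIX_SPAWN_SETCGROUP);
      if (r == 0)
        r = posix_spawnattr_setcgroup_np (&attr, cgroup);
      if (r == 0)
        r = posix_spawn (pid, argv[0], NULL, &attr, argv, environ);
      posix_spawnattr_destroy (&attr);
    }
#else
  /* Headers without the extension: nothing to fall back to.  */
  (void) argv;
  (void) pid;
  r = ENOTSUP;
#endif

  close (cgroup);
  return r;
}
```

The descriptor is only needed for the duration of the posix_spawn
call, so the sketch closes it before returning.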

Checked on x86_64-linux-gnu.
---
 NEWS                                          |   6 +
 bits/spawn_ext.h                              |  21 ++
 posix/Makefile                                |   1 +
 posix/spawn.h                                 |   6 +-
 posix/spawnattr_setflags.c                    |   3 +-
 sysdeps/unix/sysv/linux/Makefile              |   5 +
 sysdeps/unix/sysv/linux/Versions              |   4 +
 sysdeps/unix/sysv/linux/aarch64/libc.abilist  |   2 +
 sysdeps/unix/sysv/linux/alpha/libc.abilist    |   2 +
 sysdeps/unix/sysv/linux/arc/libc.abilist      |   2 +
 sysdeps/unix/sysv/linux/arm/be/libc.abilist   |   2 +
 sysdeps/unix/sysv/linux/arm/le/libc.abilist   |   2 +
 sysdeps/unix/sysv/linux/bits/spawn_ext.h      |  40 ++++
 sysdeps/unix/sysv/linux/csky/libc.abilist     |   2 +
 sysdeps/unix/sysv/linux/hppa/libc.abilist     |   2 +
 sysdeps/unix/sysv/linux/i386/libc.abilist     |   2 +
 sysdeps/unix/sysv/linux/ia64/libc.abilist     |   2 +
 .../sysv/linux/loongarch/lp64/libc.abilist    |   2 +
 .../sysv/linux/m68k/coldfire/libc.abilist     |   2 +
 .../unix/sysv/linux/m68k/m680x0/libc.abilist  |   2 +
 .../sysv/linux/microblaze/be/libc.abilist     |   2 +
 .../sysv/linux/microblaze/le/libc.abilist     |   2 +
 .../sysv/linux/mips/mips32/fpu/libc.abilist   |   2 +
 .../sysv/linux/mips/mips32/nofpu/libc.abilist |   2 +
 .../sysv/linux/mips/mips64/n32/libc.abilist   |   2 +
 .../sysv/linux/mips/mips64/n64/libc.abilist   |   2 +
 sysdeps/unix/sysv/linux/nios2/libc.abilist    |   2 +
 sysdeps/unix/sysv/linux/or1k/libc.abilist     |   2 +
 .../linux/powerpc/powerpc32/fpu/libc.abilist  |   2 +
 .../powerpc/powerpc32/nofpu/libc.abilist      |   2 +
 .../linux/powerpc/powerpc64/be/libc.abilist   |   2 +
 .../linux/powerpc/powerpc64/le/libc.abilist   |   2 +
 .../unix/sysv/linux/riscv/rv32/libc.abilist   |   2 +
 .../unix/sysv/linux/riscv/rv64/libc.abilist   |   2 +
 .../unix/sysv/linux/s390/s390-32/libc.abilist |   2 +
 .../unix/sysv/linux/s390/s390-64/libc.abilist |   2 +
 sysdeps/unix/sysv/linux/sh/be/libc.abilist    |   2 +
 sysdeps/unix/sysv/linux/sh/le/libc.abilist    |   2 +
 .../sysv/linux/sparc/sparc32/libc.abilist     |   2 +
 .../sysv/linux/sparc/sparc64/libc.abilist     |   2 +
 .../unix/sysv/linux/spawnattr_getcgroup_np.c  |  28 +++
 .../unix/sysv/linux/spawnattr_setcgroup_np.c  |  27 +++
 sysdeps/unix/sysv/linux/spawni.c              |  20 +-
 sysdeps/unix/sysv/linux/tst-spawn-cgroup.c    | 216 ++++++++++++++++++
 .../unix/sysv/linux/x86_64/64/libc.abilist    |   2 +
 .../unix/sysv/linux/x86_64/x32/libc.abilist   |   2 +
 46 files changed, 440 insertions(+), 5 deletions(-)
 create mode 100644 bits/spawn_ext.h
 create mode 100644 sysdeps/unix/sysv/linux/bits/spawn_ext.h
 create mode 100644 sysdeps/unix/sysv/linux/spawnattr_getcgroup_np.c
 create mode 100644 sysdeps/unix/sysv/linux/spawnattr_setcgroup_np.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-cgroup.c

diff --git a/NEWS b/NEWS
index 709ee40e50..39eccdf6ce 100644
--- a/NEWS
+++ b/NEWS
@@ -48,6 +48,12 @@ Major new features:
 * The strlcpy and strlcat functions have been added.  They are derived
   from OpenBSD, and are expected to be added to a future POSIX version.
 
+* On Linux, the functions posix_spawnattr_getcgroup_np and
+  posix_spawnattr_setcgroup_np have been added, along with the
+  POSIX_SPAWN_SETCGROUP flag.  They allow posix_spawn and posix_spawnp to
+  set the cgroupv2 in the new process in a race free manner.  These functions
+  are GNU extensions and require a kernel with clone3 support.
+
 Deprecated and removed features, and other changes affecting compatibility:
 
 * In the Linux kernel for the hppa/parisc architecture some of the
diff --git a/bits/spawn_ext.h b/bits/spawn_ext.h
new file mode 100644
index 0000000000..75b504a768
--- /dev/null
+++ b/bits/spawn_ext.h
@@ -0,0 +1,21 @@
+/* POSIX spawn extensions.   Generic version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _SPAWN_H
+# error "Never include <bits/spawn_ext.h> directly; use <spawn.h> instead."
+#endif
diff --git a/posix/Makefile b/posix/Makefile
index ad43cbdec6..e74a4e00c4 100644
--- a/posix/Makefile
+++ b/posix/Makefile
@@ -37,6 +37,7 @@ headers := \
   bits/pthreadtypes-arch.h \
   bits/pthreadtypes.h \
   bits/sched.h \
+  bits/spawn_ext.h \
   bits/thread-shared-types.h \
   bits/types.h \
   bits/types/idtype_t.h \
diff --git a/posix/spawn.h b/posix/spawn.h
index 04cc525fa5..731862cc5a 100644
--- a/posix/spawn.h
+++ b/posix/spawn.h
@@ -34,7 +34,8 @@ typedef struct
   sigset_t __ss;
   struct sched_param __sp;
   int __policy;
-  int __pad[16];
+  int __cgroup;
+  int __pad[15];
 } posix_spawnattr_t;
 
 
@@ -59,6 +60,7 @@ typedef struct
 #ifdef __USE_GNU
 # define POSIX_SPAWN_USEVFORK		0x40
 # define POSIX_SPAWN_SETSID		0x80
+# define POSIX_SPAWN_SETCGROUP         0x100
 #endif
 
 
@@ -231,4 +233,6 @@ posix_spawn_file_actions_addtcsetpgrp_np (posix_spawn_file_actions_t *,
 
 __END_DECLS
 
+#include <bits/spawn_ext.h>
+
 #endif /* spawn.h */
diff --git a/posix/spawnattr_setflags.c b/posix/spawnattr_setflags.c
index 97153948e4..e7bb217c6a 100644
--- a/posix/spawnattr_setflags.c
+++ b/posix/spawnattr_setflags.c
@@ -26,7 +26,8 @@
 		   | POSIX_SPAWN_SETSCHEDPARAM				      \
 		   | POSIX_SPAWN_SETSCHEDULER				      \
 		   | POSIX_SPAWN_SETSID					      \
-		   | POSIX_SPAWN_USEVFORK)
+		   | POSIX_SPAWN_USEVFORK				      \
+		   | POSIX_SPAWN_SETCGROUP)
 
 /* Store flags in the attribute structure.  */
 int
diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index 23a84cf225..c54cba873c 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -490,11 +490,14 @@ sysdep_routines += \
   getcpu \
   oldglob \
   sched_getcpu \
+  spawnattr_getcgroup_np \
+  spawnattr_setcgroup_np \
   # sysdep_routines
 
 tests += \
   tst-affinity \
   tst-affinity-pid \
+  tst-spawn-cgroup \
   # tests
 
 tests-static += \
@@ -508,6 +511,8 @@ tests += \
 CFLAGS-fork.c = $(libio-mtsafe)
 CFLAGS-getpid.o = -fomit-frame-pointer
 CFLAGS-getpid.os = -fomit-frame-pointer
+
+tst-spawn-cgroup-ARGS = -- $(host-test-program-cmd)
 endif
 
 ifeq ($(subdir),inet)
diff --git a/sysdeps/unix/sysv/linux/Versions b/sysdeps/unix/sysv/linux/Versions
index bc59bce42f..c912370cde 100644
--- a/sysdeps/unix/sysv/linux/Versions
+++ b/sysdeps/unix/sysv/linux/Versions
@@ -321,6 +321,10 @@ libc {
     __ppoll64_chk;
 %endif
   }
+  GLIBC_2.38 {
+    posix_spawnattr_getcgroup_np;
+    posix_spawnattr_setcgroup_np;
+  }
   GLIBC_PRIVATE {
     # functions used in other libraries
     __syscall_rt_sigqueueinfo;
diff --git a/sysdeps/unix/sysv/linux/aarch64/libc.abilist b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
index c49363e70e..cbc8387131 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
@@ -2669,6 +2669,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/alpha/libc.abilist b/sysdeps/unix/sysv/linux/alpha/libc.abilist
index d6b1dcaae6..6d31a565d2 100644
--- a/sysdeps/unix/sysv/linux/alpha/libc.abilist
+++ b/sysdeps/unix/sysv/linux/alpha/libc.abilist
@@ -2778,6 +2778,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/arc/libc.abilist b/sysdeps/unix/sysv/linux/arc/libc.abilist
index dfe0c3f7b6..8c604659c4 100644
--- a/sysdeps/unix/sysv/linux/arc/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arc/libc.abilist
@@ -2430,6 +2430,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/arm/be/libc.abilist b/sysdeps/unix/sysv/linux/arm/be/libc.abilist
index 6c75e5aa76..7936ed59f8 100644
--- a/sysdeps/unix/sysv/linux/arm/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/be/libc.abilist
@@ -550,6 +550,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/arm/le/libc.abilist b/sysdeps/unix/sysv/linux/arm/le/libc.abilist
index 03d6f7ae2d..2893783e3d 100644
--- a/sysdeps/unix/sysv/linux/arm/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/le/libc.abilist
@@ -547,6 +547,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/bits/spawn_ext.h b/sysdeps/unix/sysv/linux/bits/spawn_ext.h
new file mode 100644
index 0000000000..3bc10ab477
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/bits/spawn_ext.h
@@ -0,0 +1,40 @@
+/* POSIX spawn extensions.   Linux version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _SPAWN_H
+# error "Never include <bits/spawn_ext.h> directly; use <spawn.h> instead."
+#endif
+
+__BEGIN_DECLS
+
+#ifdef __USE_MISC
+
+/* Get the cgroupsv2 from the attribute structure.  */
+extern int posix_spawnattr_getcgroup_np (const posix_spawnattr_t *
+					 __restrict __attr,
+					 int *__cgroup)
+     __THROW __nonnull ((1, 2));
+
+/* Store the cgroupsv2 in the attribute structure.  */
+extern int posix_spawnattr_setcgroup_np (posix_spawnattr_t *__restrict __attr,
+					 int __cgroup)
+     __THROW __nonnull ((1));
+
+#endif /* __USE_MISC */
+
+__END_DECLS
diff --git a/sysdeps/unix/sysv/linux/csky/libc.abilist b/sysdeps/unix/sysv/linux/csky/libc.abilist
index d858c108c6..dc1b885b5a 100644
--- a/sysdeps/unix/sysv/linux/csky/libc.abilist
+++ b/sysdeps/unix/sysv/linux/csky/libc.abilist
@@ -2706,6 +2706,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/hppa/libc.abilist b/sysdeps/unix/sysv/linux/hppa/libc.abilist
index 82a14f8ace..c967612203 100644
--- a/sysdeps/unix/sysv/linux/hppa/libc.abilist
+++ b/sysdeps/unix/sysv/linux/hppa/libc.abilist
@@ -2655,6 +2655,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/i386/libc.abilist b/sysdeps/unix/sysv/linux/i386/libc.abilist
index 1950b15d5d..c7b921f392 100644
--- a/sysdeps/unix/sysv/linux/i386/libc.abilist
+++ b/sysdeps/unix/sysv/linux/i386/libc.abilist
@@ -2839,6 +2839,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/ia64/libc.abilist b/sysdeps/unix/sysv/linux/ia64/libc.abilist
index d0b9cb279b..a0e4bd3ab9 100644
--- a/sysdeps/unix/sysv/linux/ia64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/ia64/libc.abilist
@@ -2604,6 +2604,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist b/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
index e760a631dd..bb8b895a37 100644
--- a/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
@@ -2190,6 +2190,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
index 35785a3d5f..0c28b1bd74 100644
--- a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
@@ -551,6 +551,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
index 4ab2426e0a..59badcaf6b 100644
--- a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
@@ -2782,6 +2782,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
index 38faa16232..18da382fc2 100644
--- a/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
@@ -2755,6 +2755,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
index 374d658988..3f54fb325f 100644
--- a/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
@@ -2752,6 +2752,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
index fcc5e88e91..4ce0ca4955 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
@@ -2747,6 +2747,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
index 01eb96cd93..dc6e322b77 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
@@ -2745,6 +2745,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
index a2748b7b74..b4ce7b5c82 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
@@ -2753,6 +2753,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
index 0ae7ba499d..311800f6ca 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
@@ -2655,6 +2655,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/nios2/libc.abilist b/sysdeps/unix/sysv/linux/nios2/libc.abilist
index 947495a0e2..756f3cce1d 100644
--- a/sysdeps/unix/sysv/linux/nios2/libc.abilist
+++ b/sysdeps/unix/sysv/linux/nios2/libc.abilist
@@ -2794,6 +2794,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/or1k/libc.abilist b/sysdeps/unix/sysv/linux/or1k/libc.abilist
index 115f1039e7..9b59c148a1 100644
--- a/sysdeps/unix/sysv/linux/or1k/libc.abilist
+++ b/sysdeps/unix/sysv/linux/or1k/libc.abilist
@@ -2176,6 +2176,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
index 19c4c325b0..022c9d5907 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
@@ -2821,6 +2821,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
index 3e043c4044..5eabe69671 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
@@ -2854,6 +2854,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
index e4f3a766bb..a66243cb1b 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
@@ -2575,6 +2575,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
index dafe1c4a59..8904138f0b 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
@@ -2889,6 +2889,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
index b9740a1afc..c90aeb6bbf 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
@@ -2432,6 +2432,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
index e3b4656aa2..dea9d8e7fc 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
@@ -2632,6 +2632,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
index 84cb7a50ed..475f5a991f 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
@@ -2819,6 +2819,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
index 33df3b1646..228525449e 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
@@ -2612,6 +2612,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/sh/be/libc.abilist b/sysdeps/unix/sysv/linux/sh/be/libc.abilist
index 94cbccd715..f8ffa32087 100644
--- a/sysdeps/unix/sysv/linux/sh/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/be/libc.abilist
@@ -2662,6 +2662,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/sh/le/libc.abilist b/sysdeps/unix/sysv/linux/sh/le/libc.abilist
index 3bb316a787..ab4fecdc3e 100644
--- a/sysdeps/unix/sysv/linux/sh/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/le/libc.abilist
@@ -2659,6 +2659,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
index 6341b491b4..79b5353355 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
@@ -2814,6 +2814,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
index 8ed1ea2926..479637e24d 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
@@ -2627,6 +2627,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/spawnattr_getcgroup_np.c b/sysdeps/unix/sysv/linux/spawnattr_getcgroup_np.c
new file mode 100644
index 0000000000..82fd8f4b71
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/spawnattr_getcgroup_np.c
@@ -0,0 +1,28 @@
+/* Copyright (C) 2000-2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <spawn.h>
+
+/* Get the cgroup file descriptor from the attribute structure.  */
+int
+posix_spawnattr_getcgroup_np (const posix_spawnattr_t *attr,
+			      int *cgroup)
+{
+  *cgroup = attr->__cgroup;
+
+  return 0;
+}
diff --git a/sysdeps/unix/sysv/linux/spawnattr_setcgroup_np.c b/sysdeps/unix/sysv/linux/spawnattr_setcgroup_np.c
new file mode 100644
index 0000000000..74d60bb5ea
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/spawnattr_setcgroup_np.c
@@ -0,0 +1,27 @@
+/* Copyright (C) 2000-2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <spawn.h>
+
+/* Store the cgroup file descriptor in the attribute structure.  */
+int
+posix_spawnattr_setcgroup_np (posix_spawnattr_t *attr, int cgroup)
+{
+  attr->__cgroup = cgroup;
+
+  return 0;
+}
diff --git a/sysdeps/unix/sysv/linux/spawni.c b/sysdeps/unix/sysv/linux/spawni.c
index ec687cb423..da748679c1 100644
--- a/sysdeps/unix/sysv/linux/spawni.c
+++ b/sysdeps/unix/sysv/linux/spawni.c
@@ -380,14 +380,19 @@ __spawnix (pid_t * pid, const char *file,
      need for CLONE_SETTLS.  Although parent and child share the same TLS
      namespace, there will be no concurrent access for TLS variables (errno
      for instance).  */
+  bool set_cgroup = attrp ? (attrp->__flags & POSIX_SPAWN_SETCGROUP) : false;
   struct clone_args clone_args =
     {
       /* Unsupported flags like CLONE_CLEAR_SIGHAND will be cleared up by
 	 __clone_internal_fallback.  */
-      .flags = CLONE_CLEAR_SIGHAND | CLONE_VM | CLONE_VFORK,
+      .flags = (set_cgroup ? CLONE_INTO_CGROUP : 0)
+	       | CLONE_CLEAR_SIGHAND
+	       | CLONE_VM
+	       | CLONE_VFORK,
       .exit_signal = SIGCHLD,
       .stack = (uintptr_t) stack,
       .stack_size = stack_size,
+      .cgroup = (set_cgroup ? attrp->__cgroup : 0)
     };
 #ifdef HAVE_CLONE3_WRAPPER
   args.use_clone3 = true;
@@ -398,8 +403,17 @@ __spawnix (pid_t * pid, const char *file,
 #endif
     {
       args.use_clone3 = false;
-      new_pid = __clone_internal_fallback (&clone_args, __spawni_child,
-					   &args);
+      if (!set_cgroup)
+	new_pid = __clone_internal_fallback (&clone_args, __spawni_child,
+					     &args);
+      else
+	{
+	  /* No fallback for POSIX_SPAWN_SETCGROUP if clone3 is not
+	     supported.  */
+	  new_pid = -1;
+	  if (errno == ENOSYS)
+	    errno = ENOTSUP;
+	}
     }
 
   /* It needs to collect the case where the auxiliary process was created
diff --git a/sysdeps/unix/sysv/linux/tst-spawn-cgroup.c b/sysdeps/unix/sysv/linux/tst-spawn-cgroup.c
new file mode 100644
index 0000000000..6dba30ab29
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-spawn-cgroup.c
@@ -0,0 +1,216 @@
+/* Tests for posix_spawn cgroup extension.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <getopt.h>
+#include <spawn.h>
+#include <stdlib.h>
+#include <string.h>
+#include <support/check.h>
+#include <support/support.h>
+#include <support/xstdio.h>
+#include <support/xunistd.h>
+#include <support/temp_file.h>
+#include <sys/vfs.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+#define CGROUPFS "/sys/fs/cgroup/"
+#ifndef CGROUP2_SUPER_MAGIC
+# define CGROUP2_SUPER_MAGIC 0x63677270
+#endif
+
+#define F_TYPE_EQUAL(a, b) (a == (typeof(a)) b)
+
+#define CGROUP_TEST "test-spawn-cgroup"
+
+/* Nonzero if the program gets called via `exec'.  */
+#define CMDLINE_OPTIONS \
+  { "restart", no_argument, &restart, 1 },
+static int restart;
+
+/* Hold the four initial arguments used to respawn the process, plus the
+   extra '--direct' and '--restart' options, the new cgroup name, and a
+   final NULL.  */
+static char *spargs[8];
+
+static inline char *
+startswith (const char *s, const char *prefix)
+{
+  size_t l = strlen (prefix);
+  if (strncmp (s, prefix, l) == 0)
+    return (char *) s + l;
+  return NULL;
+}
+
+static char *
+get_cgroup (void)
+{
+  FILE *f = fopen ("/proc/self/cgroup", "re");
+  if (f == NULL)
+    FAIL_UNSUPPORTED ("no cgroup defined for the process");
+
+  char *cgroup = NULL;
+
+  char *line = NULL;
+  size_t linesiz = 0;
+  while (xgetline (&line, &linesiz, f) > 0)
+    {
+      char *entry = startswith (line, "0:");
+      if (entry == NULL)
+	continue;
+
+      entry = strchr (entry, ':');
+      if (entry == NULL)
+	continue;
+
+      cgroup = entry + 1;
+      size_t l = strlen (cgroup);
+      if (cgroup[l - 1] == '\n')
+	cgroup[l - 1] = '\0';
+
+      cgroup = xstrdup (entry + 1);
+      break;
+    }
+
+  xfclose (f);
+  free (line);
+
+  return cgroup;
+}
+
+
+/* Called on process re-execution.  */
+_Noreturn static void
+handle_restart (int argc, char *argv[])
+{
+  assert (argc == 1);
+  char *newcgroup = argv[0];
+
+  char *current_cgroup = get_cgroup ();
+  TEST_VERIFY_EXIT (current_cgroup != NULL);
+  TEST_COMPARE_STRING (newcgroup, current_cgroup);
+  exit (EXIT_SUCCESS);
+}
+
+static int
+do_test_cgroup_failure (pid_t *pid, int cgroup)
+{
+  posix_spawnattr_t attr;
+  TEST_COMPARE (posix_spawnattr_init (&attr), 0);
+  TEST_COMPARE (posix_spawnattr_setflags (&attr, POSIX_SPAWN_SETCGROUP), 0);
+  TEST_COMPARE (posix_spawnattr_setcgroup_np (&attr, cgroup), 0);
+
+  int cgetgroup;
+  TEST_COMPARE (posix_spawnattr_getcgroup_np (&attr, &cgetgroup), 0);
+  TEST_COMPARE (cgroup, cgetgroup);
+
+  return posix_spawn (pid, spargs[0], NULL, &attr, spargs, environ);
+}
+
+static int
+create_new_cgroup (char **newcgroup)
+{
+  struct statfs fs;
+  if (statfs (CGROUPFS, &fs) < 0)
+    {
+      if (errno == ENOENT)
+	FAIL_UNSUPPORTED ("no cgroupv2 mount found");
+      FAIL_EXIT1 ("statfs (%s): %m\n", CGROUPFS);
+    }
+
+  if (!F_TYPE_EQUAL (fs.f_type, CGROUP2_SUPER_MAGIC))
+    FAIL_UNSUPPORTED ("%s is not a cgroupv2", CGROUPFS);
+
+  char *cgroup = get_cgroup ();
+  TEST_VERIFY_EXIT (cgroup != NULL);
+  *newcgroup = xasprintf ("%s/%s", cgroup, CGROUP_TEST);
+  char *cgpath = xasprintf ("%s%s/%s", CGROUPFS, cgroup, CGROUP_TEST);
+  free (cgroup);
+
+  if (mkdir (cgpath, 0755) == -1 && errno != EEXIST)
+    {
+      if (errno == EACCES || errno == EPERM)
+	FAIL_UNSUPPORTED ("can not create a new cgroupv2 group");
+      FAIL_EXIT1 ("mkdir (%s): %m", cgpath);
+    }
+  add_temp_file (cgpath);
+
+  return xopen (cgpath, O_DIRECTORY | O_RDONLY | O_CLOEXEC, 0666);
+}
+
+static int
+do_test (int argc, char *argv[])
+{
+  /* We must have either:
+
+     - one or four parameters if called initially:
+       + argv[1]: path for ld.so        optional
+       + argv[2]: "--library-path"      optional
+       + argv[3]: the library path      optional
+       + argv[4]: the application name
+
+     - six parameters left if called through re-execution:
+       + argv[4/1]: the application name
+       + argv[5/2]: the created cgroup
+
+     * When built with --enable-hardcoded-path-in-tests or issued without
+       using the loader directly.  */
+
+  if (restart)
+    handle_restart (argc - 1, &argv[1]);
+
+  TEST_VERIFY_EXIT (argc == 2 || argc == 5);
+
+  char *newcgroup;
+  int cgroup = create_new_cgroup (&newcgroup);
+
+  int i;
+  for (i = 0; i < argc - 1; i++)
+    spargs[i] = argv[i + 1];
+  spargs[i++] = (char *) "--direct";
+  spargs[i++] = (char *) "--restart";
+  spargs[i++] = (char *) newcgroup;
+  spargs[i] = NULL;
+
+  /* Check that an invalid cgroup returns an error.  */
+  {
+    TEST_COMPARE (do_test_cgroup_failure (NULL, -1), EINVAL);
+  }
+
+  {
+    pid_t pid;
+    TEST_COMPARE (do_test_cgroup_failure (&pid, cgroup), 0);
+
+    siginfo_t sinfo;
+    TEST_COMPARE (waitid (P_PID, pid, &sinfo, WEXITED), 0);
+    TEST_COMPARE (sinfo.si_signo, SIGCHLD);
+    TEST_COMPARE (sinfo.si_code, CLD_EXITED);
+    TEST_COMPARE (sinfo.si_status, 0);
+  }
+
+  xclose (cgroup);
+  free (newcgroup);
+
+  return 0;
+}
+
+#define TEST_FUNCTION_ARGV do_test
+#include <support/test-driver.c>
diff --git a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
index 57cfcc2086..ea8539447c 100644
--- a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
@@ -2578,6 +2578,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
index 3f0a9f6d82..f15ac7c33f 100644
--- a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
@@ -2684,6 +2684,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 posix_spawnattr_getcgroup_np F
+GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
 GLIBC_2.38 strlcpy F
 GLIBC_2.38 wcslcat F
-- 
2.34.1



* [PATCH v6 2/5] posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349)
  2023-07-05 20:43 [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation Adhemerval Zanella
  2023-07-05 20:43 ` [PATCH v6 1/5] linux: Add posix_spawnattr_{get,set}cgroup_np (BZ 26731) Adhemerval Zanella
@ 2023-07-05 20:43 ` Adhemerval Zanella
  2023-07-05 20:43 ` [PATCH v6 3/5] posix: Add pidfd_fork (BZ 26371) Adhemerval Zanella
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Adhemerval Zanella @ 2023-07-05 20:43 UTC (permalink / raw)
  To: libc-alpha, Luca Boccassi, Philip Withnall

Returning a pidfd allows a process to keep a race-free handle to a child
process; otherwise the caller needs to either use pidfd_open (which
might still be subject to TOCTOU) or keep using the old racy interface
based on pid_t.
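
For comparison, a minimal sketch of the pidfd_open-based alternative
mentioned above.  It assumes a kernel with pidfd_open (Linux 5.3) and
waitid (P_PIDFD) (Linux 5.4), and kernel headers that define
SYS_pidfd_open.  The helper names spawn_then_open_pidfd and wait_pidfd
are hypothetical and not part of this series; the raw syscall is used
only so the sketch does not depend on the glibc 2.36 wrapper.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <spawn.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef P_PIDFD
# define P_PIDFD 3	/* Linux UAPI value, for headers that lack it.  */
#endif

extern char **environ;

/* Hypothetical helper: spawn PATH with ARGV, then open a pidfd for the
   child after the fact.  Returns 0 and stores the descriptor in *PIDFD,
   or returns an errno value on failure.  */
static int
spawn_then_open_pidfd (const char *path, char *const argv[], int *pidfd)
{
  pid_t pid;
  int r = posix_spawn (&pid, path, NULL, NULL, argv, environ);
  if (r != 0)
    return r;

  /* This is the TOCTOU window: if the child has already exited and been
     reaped (for instance by a SIGCHLD handler calling wait, or because
     SIGCHLD is set to SIG_IGN), PID may no longer name our child.  */
  int fd = syscall (SYS_pidfd_open, pid, 0);
  if (fd < 0)
    return errno;

  *pidfd = fd;
  return 0;
}

/* Hypothetical helper: wait for the child behind PIDFD and return its
   exit status, or -1 if waiting failed or the child did not exit
   normally.  */
static int
wait_pidfd (int pidfd)
{
  siginfo_t si;
  if (waitid (P_PIDFD, pidfd, &si, WEXITED) != 0)
    return -1;
  return si.si_code == CLD_EXITED ? si.si_status : -1;
}
```

A caller would invoke spawn_then_open_pidfd, pass the resulting
descriptor to wait_pidfd, and close it afterwards.  The comment inside
the first helper marks the window that obtaining the descriptor
directly from clone closes.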

The implementation requires the kernel to support the complete pidfd
interface, meaning that waitid (P_PIDFD) must be supported (added in
Linux 5.4).  This ensures that no racy workaround is required (such as
reading the pid from procfs fdinfo to use along with the old wait
interfaces).

These interfaces are similar to posix_spawn and posix_spawnp, with the
only difference being that they return a process file descriptor (int)
instead of a process ID (pid_t).  Their prototypes are:

  int pidfd_spawn (int *restrict pidfd,
                   const char *restrict file,
                   const posix_spawn_file_actions_t *restrict facts,
                   const posix_spawnattr_t *restrict attrp,
                   char *const argv[restrict],
                   char *const envp[restrict]);

  int pidfd_spawnp (int *restrict pidfd,
                    const char *restrict path,
                    const posix_spawn_file_actions_t *restrict facts,
                    const posix_spawnattr_t *restrict attrp,
                    char *const argv[restrict],
                    char *const envp[restrict]);

A new symbol is used instead of a posix_spawn extension to avoid possible
issues with language bindings that might track the lifetime of the
returned argument.  Although pid_t and int are interchangeable on Linux,
POSIX only states that pid_t should be a signed integer.

Both symbols reuse the posix_spawn posix_spawn_file_actions_t and
posix_spawnattr_t types, to avoid duplicating the posix_spawn API or
adding a new one.  It also means that both interfaces support the same
attributes and file actions, and any new flag or file action added to
posix_spawn is automatically available to pidfd_spawn as well.
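
As an illustration of that reuse, a minimal sketch of a caller, assuming
a glibc that ships this series.  HAVE_PIDFD_SPAWN is a hypothetical
macro the builder would define by hand when pidfd_spawn is available (it
is not provided by glibc), and run_with_pidfd is a hypothetical helper
name.  Without the macro the sketch only reports ENOSYS.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <spawn.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run PATH with ARGV and return its exit status.
   Returns -1 with errno set if the process could not be spawned, or -1
   if it could not be waited for or did not exit normally.  */
static int
run_with_pidfd (const char *path, char *const argv[])
{
#ifdef HAVE_PIDFD_SPAWN	/* Hypothetical; define it manually.  */
  int pidfd;
  /* Same calling convention as posix_spawn: the file actions and
     attribute arguments (NULL here) take the usual posix_spawn types,
     and the result is 0 or an error number.  */
  int r = pidfd_spawn (&pidfd, path, NULL, NULL, argv, environ);
  if (r != 0)
    {
      errno = r;
      return -1;
    }

  /* The descriptor names this exact child, so there is no PID to be
     reused between spawning and waiting.  */
  siginfo_t si;
  int w = waitid (P_PIDFD, pidfd, &si, WEXITED);
  close (pidfd);
  if (w != 0)
    return -1;
  return si.si_code == CLD_EXITED ? si.si_status : -1;
#else
  (void) path;
  (void) argv;
  errno = ENOSYS;
  return -1;
#endif
}
```

Any posix_spawn_file_actions_t or posix_spawnattr_t object already built
for posix_spawn could be passed in place of the two NULL arguments.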

Also, using the posix_spawn plumbing allows reusing most of the current
tests, with some changes:

  - waitid is used instead of waitpid, since it is a more generic
    interface.

  - tst-posix_spawn-setsid.c is adapted to take into consideration that
    the caller can check for the session id directly.  The test now
    spawns itself and writes the session id to a file instead.

  - tst-spawn3.c needs to know whether pidfd_spawn is used so it keeps
    an extra file descriptor unused.

Checked on x86_64-linux-gnu on Linux 4.15 (no CLONE_PIDFD or waitid
support), Linux 5.15 (only clone support), and Linux 5.19 (full
support including clone3).
---
 NEWS                                          |   7 +
 include/clone_internal.h                      |   4 +
 manual/process.texi                           |  14 +-
 posix/Makefile                                |   1 +
 posix/spawn_int.h                             |   3 +-
 posix/tst-posix_spawn-setsid.c                | 168 +++++++++++++-----
 posix/tst-spawn-chdir.c                       |  15 +-
 posix/tst-spawn.c                             |  24 +--
 posix/tst-spawn.h                             |  36 ++++
 posix/tst-spawn2.c                            |  17 +-
 posix/tst-spawn3.c                            | 100 ++++++-----
 posix/tst-spawn4.c                            |   7 +-
 posix/tst-spawn5.c                            |  14 +-
 posix/tst-spawn6.c                            |  15 +-
 posix/tst-spawn7.c                            |  13 +-
 sysdeps/unix/sysv/linux/Makefile              |  18 ++
 sysdeps/unix/sysv/linux/Versions              |   2 +
 sysdeps/unix/sysv/linux/aarch64/libc.abilist  |   2 +
 sysdeps/unix/sysv/linux/alpha/libc.abilist    |   2 +
 sysdeps/unix/sysv/linux/arc/libc.abilist      |   2 +
 sysdeps/unix/sysv/linux/arm/be/libc.abilist   |   2 +
 sysdeps/unix/sysv/linux/arm/le/libc.abilist   |   2 +
 sysdeps/unix/sysv/linux/bits/spawn_ext.h      |  20 +++
 sysdeps/unix/sysv/linux/clone-pidfd-support.c |  58 ++++++
 sysdeps/unix/sysv/linux/csky/libc.abilist     |   2 +
 sysdeps/unix/sysv/linux/hppa/libc.abilist     |   2 +
 sysdeps/unix/sysv/linux/i386/libc.abilist     |   2 +
 sysdeps/unix/sysv/linux/ia64/libc.abilist     |   2 +
 .../sysv/linux/loongarch/lp64/libc.abilist    |   2 +
 .../sysv/linux/m68k/coldfire/libc.abilist     |   2 +
 .../unix/sysv/linux/m68k/m680x0/libc.abilist  |   2 +
 .../sysv/linux/microblaze/be/libc.abilist     |   2 +
 .../sysv/linux/microblaze/le/libc.abilist     |   2 +
 .../sysv/linux/mips/mips32/fpu/libc.abilist   |   2 +
 .../sysv/linux/mips/mips32/nofpu/libc.abilist |   2 +
 .../sysv/linux/mips/mips64/n32/libc.abilist   |   2 +
 .../sysv/linux/mips/mips64/n64/libc.abilist   |   2 +
 sysdeps/unix/sysv/linux/nios2/libc.abilist    |   2 +
 sysdeps/unix/sysv/linux/or1k/libc.abilist     |   2 +
 sysdeps/unix/sysv/linux/pidfd_spawn.c         |  30 ++++
 sysdeps/unix/sysv/linux/pidfd_spawnp.c        |  30 ++++
 .../linux/powerpc/powerpc32/fpu/libc.abilist  |   2 +
 .../powerpc/powerpc32/nofpu/libc.abilist      |   2 +
 .../linux/powerpc/powerpc64/be/libc.abilist   |   2 +
 .../linux/powerpc/powerpc64/le/libc.abilist   |   2 +
 .../unix/sysv/linux/riscv/rv32/libc.abilist   |   2 +
 .../unix/sysv/linux/riscv/rv64/libc.abilist   |   2 +
 .../unix/sysv/linux/s390/s390-32/libc.abilist |   2 +
 .../unix/sysv/linux/s390/s390-64/libc.abilist |   2 +
 sysdeps/unix/sysv/linux/sh/be/libc.abilist    |   2 +
 sysdeps/unix/sysv/linux/sh/le/libc.abilist    |   2 +
 .../sysv/linux/sparc/sparc32/libc.abilist     |   2 +
 .../sysv/linux/sparc/sparc64/libc.abilist     |   2 +
 sysdeps/unix/sysv/linux/spawni.c              |  20 ++-
 .../sysv/linux/tst-posix_spawn-setsid-pidfd.c |  20 +++
 .../unix/sysv/linux/tst-spawn-chdir-pidfd.c   |  20 +++
 sysdeps/unix/sysv/linux/tst-spawn-pidfd.c     |  20 +++
 sysdeps/unix/sysv/linux/tst-spawn-pidfd.h     |  63 +++++++
 sysdeps/unix/sysv/linux/tst-spawn2-pidfd.c    |  20 +++
 sysdeps/unix/sysv/linux/tst-spawn3-pidfd.c    |  20 +++
 sysdeps/unix/sysv/linux/tst-spawn4-pidfd.c    |  20 +++
 sysdeps/unix/sysv/linux/tst-spawn5-pidfd.c    |  20 +++
 sysdeps/unix/sysv/linux/tst-spawn6-pidfd.c    |  20 +++
 sysdeps/unix/sysv/linux/tst-spawn7-pidfd.c    |  20 +++
 .../unix/sysv/linux/x86_64/64/libc.abilist    |   2 +
 .../unix/sysv/linux/x86_64/x32/libc.abilist   |   2 +
 66 files changed, 776 insertions(+), 151 deletions(-)
 create mode 100644 posix/tst-spawn.h
 create mode 100644 sysdeps/unix/sysv/linux/clone-pidfd-support.c
 create mode 100644 sysdeps/unix/sysv/linux/pidfd_spawn.c
 create mode 100644 sysdeps/unix/sysv/linux/pidfd_spawnp.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-posix_spawn-setsid-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-chdir-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn-pidfd.h
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn2-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn3-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn4-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn5-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn6-pidfd.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-spawn7-pidfd.c

diff --git a/NEWS b/NEWS
index 39eccdf6ce..65562e75e7 100644
--- a/NEWS
+++ b/NEWS
@@ -54,6 +54,13 @@ Major new features:
   set the cgroupv2 in the new process in a race free manner.  These functions
   are GNU extensions and require a kernel with clone3 support.
 
+* On Linux, the pidfd_spawn and pidfd_spawnp functions have been added.
+  They have a similar prototype and semantics to posix_spawn, but instead
+  of returning a process ID, they return a file descriptor that can be
+  used along with other pidfd functions (like pidfd_send_signal, poll, or
+  waitid).  The pidfd functionality avoids the issue of PID reuse with the
+  traditional posix_spawn interface.
+
 Deprecated and removed features, and other changes affecting compatibility:
 
 * In the Linux kernel for the hppa/parisc architecture some of the
diff --git a/include/clone_internal.h b/include/clone_internal.h
index ad7b170f58..567160ebb5 100644
--- a/include/clone_internal.h
+++ b/include/clone_internal.h
@@ -35,6 +35,10 @@ extern int __clone_internal_fallback (struct clone_args *__cl_args,
 				      void *__arg)
      attribute_hidden;
 
+/* Return whether the kernel supports pid file descriptors, including clone
+   with CLONE_PIDFD and waitid with P_PIDFD.  */
+extern bool __clone_pidfd_supported (void) attribute_hidden;
+
 #ifndef _ISOMAC
 libc_hidden_proto (__clone3)
 libc_hidden_proto (__clone_internal)
diff --git a/manual/process.texi b/manual/process.texi
index c8413a5a58..68361c3f61 100644
--- a/manual/process.texi
+++ b/manual/process.texi
@@ -136,13 +136,13 @@ creating a process and making it run another program.
 @cindex parent process
 @cindex subprocess
 A new processes is created when one of the functions
-@code{posix_spawn}, @code{fork}, @code{_Fork} or @code{vfork} is called.
-(The @code{system} and @code{popen} also create new processes internally.)
-Due to the name of the @code{fork} function, the act of creating a new
-process is sometimes called @dfn{forking} a process.  Each new process
-(the @dfn{child process} or @dfn{subprocess}) is allocated a process
-ID, distinct from the process ID of the parent process.  @xref{Process
-Identification}.
+@code{posix_spawn}, @code{fork}, @code{_Fork}, @code{vfork}, or
+@code{pidfd_spawn} is called.  (The @code{system} and @code{popen} also
+create new processes internally.)  Due to the name of the @code{fork}
+function, the act of creating a new process is sometimes called
+@dfn{forking} a process.  Each new process (the @dfn{child process} or
+@dfn{subprocess}) is allocated a process ID, distinct from the process
+ID of the parent process.  @xref{Process Identification}.
 
 After forking a child process, both the parent and child processes
 continue to execute normally.  If you want your program to wait for a
diff --git a/posix/Makefile b/posix/Makefile
index e74a4e00c4..e3c78d3d65 100644
--- a/posix/Makefile
+++ b/posix/Makefile
@@ -593,6 +593,7 @@ tst-spawn-static-ARGS = $(tst-spawn-ARGS)
 tst-spawn5-ARGS = -- $(host-test-program-cmd)
 tst-spawn6-ARGS = -- $(host-test-program-cmd)
 tst-spawn7-ARGS = -- $(host-test-program-cmd)
+tst-posix_spawn-setsid-ARGS = -- $(host-test-program-cmd)
 tst-dir-ARGS = `pwd` `cd $(common-objdir)/$(subdir); pwd` `cd $(common-objdir); pwd` $(objpfx)tst-dir
 tst-chmod-ARGS = $(objdir)
 tst-vfork3-ARGS = --test-dir=$(objpfx)
diff --git a/posix/spawn_int.h b/posix/spawn_int.h
index aeb066c44f..64ee03e62d 100644
--- a/posix/spawn_int.h
+++ b/posix/spawn_int.h
@@ -76,12 +76,13 @@ struct __spawn_action
 
 #define SPAWN_XFLAGS_USE_PATH	0x1
 #define SPAWN_XFLAGS_TRY_SHELL	0x2
+#define SPAWN_XFLAGS_RET_PIDFD  0x4
 
 extern int __posix_spawn_file_actions_realloc (posix_spawn_file_actions_t *
 					       file_actions)
      attribute_hidden;
 
-extern int __spawni (pid_t *pid, const char *path,
+extern int __spawni (int *pid, const char *path,
 		     const posix_spawn_file_actions_t *file_actions,
 		     const posix_spawnattr_t *attrp, char *const argv[],
 		     char *const envp[], int xflags) attribute_hidden;
diff --git a/posix/tst-posix_spawn-setsid.c b/posix/tst-posix_spawn-setsid.c
index 124d878ce2..751674165c 100644
--- a/posix/tst-posix_spawn-setsid.c
+++ b/posix/tst-posix_spawn-setsid.c
@@ -18,78 +18,158 @@
 
 #include <errno.h>
 #include <fcntl.h>
+#include <getopt.h>
+#include <intprops.h>
+#include <paths.h>
 #include <spawn.h>
 #include <stdbool.h>
 #include <stdio.h>
+#include <stdlib.h>
 #include <sys/resource.h>
+#include <sys/wait.h>
 #include <unistd.h>
 
 #include <support/check.h>
+#include <support/xunistd.h>
+#include <support/temp_file.h>
+#include <tst-spawn.h>
+
+/* Nonzero if the program gets called via `exec'.  */
+static int restart;
+
+/* Hold the four initial arguments used to respawn the process, plus
+   the extra '--direct' and '--restart', and a final NULL.  */
+static char *initial_argv[7];
+static int initial_argv_count;
+
+#define CMDLINE_OPTIONS \
+  { "restart", no_argument, &restart, 1 },
+
+static char *pidfile;
+
+static pid_t
+read_child_sid (void)
+{
+  int pidfd = xopen (pidfile, O_RDONLY, 0);
+
+  char buf[INT_STRLEN_BOUND (pid_t)];
+  ssize_t n = read (pidfd, buf, sizeof (buf));
+  TEST_VERIFY (n < sizeof buf && n >= 0);
+  buf[n] = '\0';
+
+  /* We only expect to read the PID.  */
+  char *endp;
+  long int rpid = strtol (buf, &endp, 10);
+  TEST_VERIFY (endp != buf);
+
+  xclose (pidfd);
+
+  return rpid;
+}
+
+/* Called on process re-execution; writes the session ID to PIDFILE.  */
+_Noreturn static void
+handle_restart (const char *pidfile)
+{
+  int pidfd = xopen (pidfile, O_WRONLY, 0);
+
+  char buf[INT_STRLEN_BOUND (pid_t)];
+  int s = snprintf (buf, sizeof buf, "%d", getsid (0));
+  size_t n = write (pidfd, buf, s);
+  TEST_VERIFY (n == s);
+
+  xclose (pidfd);
+
+  exit (EXIT_SUCCESS);
+}
 
 static void
 do_test_setsid (bool test_setsid)
 {
-  pid_t sid, child_sid;
-  int res;
-
   /* Current session ID.  */
-  sid = getsid(0);
-  if (sid == (pid_t) -1)
-    FAIL_EXIT1 ("getsid (0): %m");
+  pid_t sid = getsid (0);
+  TEST_VERIFY (sid != (pid_t) -1);
 
   posix_spawnattr_t attrp;
-  /* posix_spawnattr_init should not fail (it basically memset the
-     attribute).  */
-  posix_spawnattr_init (&attrp);
+  TEST_COMPARE (posix_spawnattr_init (&attrp), 0);
   if (test_setsid)
-    {
-      res = posix_spawnattr_setflags (&attrp, POSIX_SPAWN_SETSID);
-      if (res != 0)
-	{
-	  errno = res;
-	  FAIL_EXIT1 ("posix_spawnattr_setflags: %m");
-	}
-    }
-
-  /* Program to run.  */
-  char *args[2] = { (char *) "true", NULL };
-  pid_t child;
-
-  res = posix_spawnp (&child, "true", NULL, &attrp, args, environ);
-  /* posix_spawnattr_destroy is noop.  */
-  posix_spawnattr_destroy (&attrp);
-
-  if (res != 0)
-    {
-      errno = res;
-      FAIL_EXIT1 ("posix_spawnp: %m");
-    }
+    TEST_COMPARE (posix_spawnattr_setflags (&attrp, POSIX_SPAWN_SETSID), 0);
+
+  /* 1 or 4 elements from initial_argv:
+       + path to ld.so          optional
+       + --library-path         optional
+       + the library path       optional
+       + application name
+       + --direct
+       + --restart
+       + pidfile  */
+  int argv_size = initial_argv_count + 2;
+  char *args[argv_size];
+  int argc = 0;
+
+  for (char **arg = initial_argv; *arg != NULL; arg++)
+    args[argc++] = *arg;
+  args[argc++] = pidfile;
+  args[argc] = NULL;
+  TEST_VERIFY (argc < argv_size);
+
+  PID_T_TYPE pid;
+  TEST_COMPARE (POSIX_SPAWN (&pid, args[0], NULL, &attrp, args, environ), 0);
+  TEST_COMPARE (posix_spawnattr_destroy (&attrp), 0);
+
+  siginfo_t sinfo;
+  TEST_COMPARE (WAITID (P_PID, pid, &sinfo, WEXITED), 0);
+  TEST_COMPARE (sinfo.si_code, CLD_EXITED);
+  TEST_COMPARE (sinfo.si_status, 0);
+
+  pid_t child_sid = read_child_sid ();
 
   /* Child should have a different session ID than parent.  */
-  child_sid = getsid (child);
-
-  if (child_sid == (pid_t) -1)
-    FAIL_EXIT1 ("getsid (%i): %m", child);
+  TEST_VERIFY (child_sid != (pid_t) -1);
 
   if (test_setsid)
-    {
-      if (child_sid == sid)
-	FAIL_EXIT1 ("child session ID matched parent one");
-    }
+    TEST_VERIFY (child_sid != sid);
   else
-    {
-      if (child_sid != sid)
-	FAIL_EXIT1 ("child session ID did not match parent one");
-    }
+    TEST_VERIFY (child_sid == sid);
 }
 
 static int
-do_test (void)
+do_test (int argc, char *argv[])
 {
+  /* We must have either:
+
+     - one or four parameters if called initially:
+       + argv[1]: path for ld.so        optional
+       + argv[2]: "--library-path"      optional
+       + argv[3]: the library path      optional
+       + argv[4]: the application name
+
+     - one parameter left if called through re-execution:
+       + argv[1]: the pidfile
+
+     The ld.so, "--library-path", and library path parameters are omitted
+     when built with --enable-hardcoded-path-in-tests or issued without
+     using the loader directly.  */
+
+  if (restart)
+    handle_restart (argv[1]);
+
+  TEST_VERIFY_EXIT (argc == 2 || argc == 5);
+
+  int i;
+  for (i = 0; i < argc - 1; i++)
+    initial_argv[i] = argv[i + 1];
+  initial_argv[i++] = (char *) "--direct";
+  initial_argv[i++] = (char *) "--restart";
+  initial_argv_count = i;
+
+  create_temp_file ("tst-posix_spawn-setsid-", &pidfile);
+
   do_test_setsid (false);
   do_test_setsid (true);
 
   return 0;
 }
 
+#define TEST_FUNCTION_ARGV do_test
 #include <support/test-driver.c>
diff --git a/posix/tst-spawn-chdir.c b/posix/tst-spawn-chdir.c
index b335092d7f..c01ca6692d 100644
--- a/posix/tst-spawn-chdir.c
+++ b/posix/tst-spawn-chdir.c
@@ -29,7 +29,9 @@
 #include <support/test-driver.h>
 #include <support/xstdio.h>
 #include <support/xunistd.h>
+#include <sys/wait.h>
 #include <unistd.h>
+#include <tst-spawn.h>
 
 /* Reads the file at PATH, which must consist of exactly one line.
    Removes the line terminator at the end of the file.  */
@@ -169,17 +171,18 @@ do_test (void)
 
           char *const argv[] = { (char *) "pwd", NULL };
           char *const envp[] = { NULL } ;
-          pid_t pid;
+          PID_T_TYPE pid;
           if (do_spawnp)
-            TEST_COMPARE (posix_spawnp (&pid, "pwd", &actions,
+            TEST_COMPARE (POSIX_SPAWNP (&pid, "pwd", &actions,
                                         NULL, argv, envp), 0);
           else
-            TEST_COMPARE (posix_spawn (&pid, "subdir/pwd-symlink", &actions,
+            TEST_COMPARE (POSIX_SPAWN (&pid, "subdir/pwd-symlink", &actions,
                                        NULL, argv, envp), 0);
           TEST_VERIFY (pid > 0);
-          int status;
-          xwaitpid (pid, &status, 0);
-          TEST_COMPARE (status, 0);
+          siginfo_t sinfo;
+          TEST_COMPARE (WAITID (P_ALL, 0, &sinfo, WEXITED), 0);
+          TEST_COMPARE (sinfo.si_code, CLD_EXITED);
+          TEST_COMPARE (sinfo.si_status, 0);
 
           /* Check that the current directory did not change.  */
           {
diff --git a/posix/tst-spawn.c b/posix/tst-spawn.c
index 6782a322fc..c44d90756a 100644
--- a/posix/tst-spawn.c
+++ b/posix/tst-spawn.c
@@ -25,11 +25,13 @@
 #include <stdlib.h>
 #include <string.h>
 #include <sys/param.h>
+#include <sys/wait.h>
 
 #include <support/check.h>
 #include <support/xunistd.h>
 #include <support/temp_file.h>
 #include <support/support.h>
+#include <tst-spawn.h>
 
 
 /* Nonzero if the program gets called via `exec'.  */
@@ -143,9 +145,9 @@ handle_restart (const char *fd1s, const char *fd2s, const char *fd3s,
 static int
 do_test (int argc, char *argv[])
 {
-  pid_t pid;
+  PID_T_TYPE pid;
   int fd4;
-  int status;
+  siginfo_t sinfo;
   posix_spawn_file_actions_t actions;
   char fd1name[18];
   char fd2name[18];
@@ -233,17 +235,16 @@ do_test (int argc, char *argv[])
   spargv[i++] = fd5name;
   spargv[i] = NULL;
 
-  TEST_COMPARE (posix_spawn (&pid, argv[1], &actions, NULL, spargv, environ),
+  TEST_COMPARE (POSIX_SPAWN (&pid, argv[1], &actions, NULL, spargv, environ),
 		0);
 
   /* Wait for the children.  */
-  TEST_COMPARE (xwaitpid (pid, &status, 0), pid);
-  TEST_VERIFY (WIFEXITED (status));
-  TEST_VERIFY (!WIFSIGNALED (status));
-  TEST_COMPARE (WEXITSTATUS (status), 0);
+  TEST_COMPARE (WAITID (P_PID, pid, &sinfo, WEXITED), 0);
+  TEST_COMPARE (sinfo.si_code, CLD_EXITED);
+  TEST_COMPARE (sinfo.si_status, 0);
 
   /* Same test but with a NULL pid argument.  */
-  TEST_COMPARE (posix_spawn (NULL, argv[1], &actions, NULL, spargv, environ),
+  TEST_COMPARE (POSIX_SPAWN (NULL, argv[1], &actions, NULL, spargv, environ),
 		0);
 
   /* Cleanup.  */
@@ -251,10 +252,9 @@ do_test (int argc, char *argv[])
   free (name3_copy);
 
   /* Wait for the children.  */
-  xwaitpid (-1, &status, 0);
-  TEST_VERIFY (WIFEXITED (status));
-  TEST_VERIFY (!WIFSIGNALED (status));
-  TEST_COMPARE (WEXITSTATUS (status), 0);
+  TEST_COMPARE (WAITID (P_ALL, 0, &sinfo, WEXITED), 0);
+  TEST_COMPARE (sinfo.si_code, CLD_EXITED);
+  TEST_COMPARE (sinfo.si_status, 0);
 
   return 0;
 }
diff --git a/posix/tst-spawn.h b/posix/tst-spawn.h
new file mode 100644
index 0000000000..a6f2dc8680
--- /dev/null
+++ b/posix/tst-spawn.h
@@ -0,0 +1,36 @@
+/* Generic definitions for posix_spawn tests.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef PID_T_TYPE
+# define PID_T_TYPE pid_t
+#endif
+
+#ifndef POSIX_SPAWN
+# define POSIX_SPAWN(__child, __path, __actions, __attr, __argv, __envp) \
+  posix_spawn (__child, __path, __actions, __attr, __argv, __envp)
+#endif
+
+#ifndef POSIX_SPAWNP
+# define POSIX_SPAWNP(__child, __path, __actions, __attr, __argv, __envp) \
+  posix_spawnp (__child, __path, __actions, __attr, __argv, __envp)
+#endif
+
+#ifndef WAITID
+# define WAITID(__idtype, __id, __info, __opts) \
+  waitid (__idtype, __id, __info, __opts)
+#endif
diff --git a/posix/tst-spawn2.c b/posix/tst-spawn2.c
index 40dc692488..f5c1f13039 100644
--- a/posix/tst-spawn2.c
+++ b/posix/tst-spawn2.c
@@ -26,6 +26,7 @@
 #include <stdio.h>
 
 #include <support/check.h>
+#include <tst-spawn.h>
 
 int
 do_test (void)
@@ -35,9 +36,9 @@ do_test (void)
 
   const char *program = "/path/to/invalid/binary";
   char * const args[] = { 0 };
-  pid_t pid = -1;
+  PID_T_TYPE pid = -1;
 
-  int ret = posix_spawn (&pid, program, 0, 0, args, environ);
+  int ret = POSIX_SPAWN (&pid, program, 0, 0, args, environ);
   if (ret != ENOENT)
     {
       errno = ret;
@@ -51,14 +52,13 @@ do_test (void)
     FAIL_EXIT1 ("posix_spawn returned pid != -1 (%i)", (int) pid);
 
   /* Check if no child is actually created.  */
-  ret = waitpid (-1, NULL, 0);
-  if (ret != -1 || errno != ECHILD)
-    FAIL_EXIT1 ("waitpid: %m)");
+  TEST_COMPARE (WAITID (P_ALL, 0, NULL, WEXITED), -1);
+  TEST_COMPARE (errno, ECHILD);
 
   /* Same as before, but with posix_spawnp.  */
   char *args2[] = { (char*) program, 0 };
 
-  ret = posix_spawnp (&pid, args2[0], 0, 0, args2, environ);
+  ret = POSIX_SPAWNP (&pid, args2[0], 0, 0, args2, environ);
   if (ret != ENOENT)
     {
       errno = ret;
@@ -68,9 +68,8 @@ do_test (void)
   if (pid != -1)
     FAIL_EXIT1 ("posix_spawnp returned pid != -1 (%i)", (int) pid);
 
-  ret = waitpid (-1, NULL, 0);
-  if (ret != -1 || errno != ECHILD)
-    FAIL_EXIT1 ("waitpid: %m)");
+  TEST_COMPARE (WAITID (P_ALL, 0, NULL, WEXITED), -1);
+  TEST_COMPARE (errno, ECHILD);
 
   return 0;
 }
diff --git a/posix/tst-spawn3.c b/posix/tst-spawn3.c
index e7ce0fb386..bd21ac6c4b 100644
--- a/posix/tst-spawn3.c
+++ b/posix/tst-spawn3.c
@@ -16,6 +16,7 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
+#include <assert.h>
 #include <stdio.h>
 #include <spawn.h>
 #include <error.h>
@@ -27,9 +28,12 @@
 #include <sys/resource.h>
 #include <fcntl.h>
 #include <paths.h>
+#include <intprops.h>
 
 #include <support/check.h>
 #include <support/temp_file.h>
+#include <support/xunistd.h>
+#include <tst-spawn.h>
 
 static int
 do_test (void)
@@ -48,7 +52,6 @@ do_test (void)
 
   struct rlimit rl;
   int max_fd = 24;
-  int ret;
 
   /* Set maximum number of file descriptor to a low value to avoid open
      too many files in environments where RLIMIT_NOFILE is large and to
@@ -66,7 +69,7 @@ do_test (void)
   /* Exhauste the file descriptor limit with temporary files.  */
   int files[max_fd];
   int nfiles = 0;
-  for (;;)
+  for (; nfiles < max_fd; nfiles++)
     {
       int fd = create_temp_file ("tst-spawn3.", NULL);
       if (fd == -1)
@@ -75,75 +78,82 @@ do_test (void)
 	    FAIL_EXIT1 ("create_temp_file: %m");
 	  break;
 	}
-      files[nfiles++] = fd;
+      files[nfiles] = fd;
     }
+  assert (nfiles != 0);
 
   posix_spawn_file_actions_t a;
-  if (posix_spawn_file_actions_init (&a) != 0)
-    FAIL_EXIT1 ("posix_spawn_file_actions_init");
+  TEST_COMPARE (posix_spawn_file_actions_init (&a), 0);
 
   /* Executes a /bin/sh echo $$ 2>&1 > ${objpfx}tst-spawn3.pid .  */
   const char pidfile[] = OBJPFX "tst-spawn3.pid";
-  if (posix_spawn_file_actions_addopen (&a, STDOUT_FILENO, pidfile, O_WRONLY
-					| O_CREAT | O_TRUNC, 0644) != 0)
-    FAIL_EXIT1 ("posix_spawn_file_actions_addopen");
+  TEST_COMPARE (posix_spawn_file_actions_addopen (&a, STDOUT_FILENO, pidfile,
+						  O_WRONLY | O_CREAT | O_TRUNC,
+						  0644),
+		0);
 
-  if (posix_spawn_file_actions_adddup2 (&a, STDOUT_FILENO, STDERR_FILENO) != 0)
-    FAIL_EXIT1 ("posix_spawn_file_actions_adddup2");
+  TEST_COMPARE (posix_spawn_file_actions_adddup2 (&a, STDOUT_FILENO,
+						  STDERR_FILENO),
+		0);
 
   /* Since execve (called by posix_spawn) might require to open files to
      actually execute the shell script, setup to close the temporary file
      descriptors.  */
-  for (int i=0; i<nfiles; i++)
-    {
-      if (posix_spawn_file_actions_addclose (&a, files[i]))
-	FAIL_EXIT1 ("posix_spawn_file_actions_addclose");
-    }
+  int maxnfiles =
+#ifdef TST_SPAWN_PIDFD
+    /* The spare file descriptor will be returned as the pid descriptor,
+       otherwise clone fails with EMFILE.  */
+    nfiles - 1;
+#else
+    nfiles;
+#endif
+
+  for (int i=0; i<maxnfiles; i++)
+    TEST_COMPARE (posix_spawn_file_actions_addclose (&a, files[i]), 0);
 
   char *spawn_argv[] = { (char *) _PATH_BSHELL, (char *) "-c",
 			 (char *) "echo $$", NULL };
-  pid_t pid;
-  if ((ret = posix_spawn (&pid, _PATH_BSHELL, &a, NULL, spawn_argv, NULL))
-       != 0)
-    {
-      errno = ret;
-      FAIL_EXIT1 ("posix_spawn: %m");
-    }
-
-  int status;
-  int err = waitpid (pid, &status, 0);
-  if (err != pid)
-    FAIL_EXIT1 ("waitpid: %m");
+  PID_T_TYPE pid;
+
+  {
+    int r = POSIX_SPAWN (&pid, _PATH_BSHELL, &a, NULL, spawn_argv, NULL);
+    if (r == ENOSYS)
+      FAIL_UNSUPPORTED ("kernel does not support CLONE_PIDFD clone flag");
+#ifdef TST_SPAWN_PIDFD
+    TEST_COMPARE (r, EMFILE);
+
+    /* Free up one file descriptor, so pidfd_spawn can return it.  */
+    xclose (files[nfiles-1]);
+    nfiles--;
+    r = POSIX_SPAWN (&pid, _PATH_BSHELL, &a, NULL, spawn_argv, NULL);
+#endif
+    TEST_COMPARE (r, 0);
+  }
+
+  siginfo_t sinfo;
+  TEST_COMPARE (WAITID (P_PID, pid, &sinfo, WEXITED), 0);
+  TEST_COMPARE (sinfo.si_code, CLD_EXITED);
+  TEST_COMPARE (sinfo.si_status, 0);
 
   /* Close the temporary files descriptor so it can check posix_spawn
      output.  */
   for (int i=0; i<nfiles; i++)
-    {
-      if (close (files[i]))
-	FAIL_EXIT1 ("close: %m");
-    }
+    xclose (files[i]);
 
-  int pidfd = open (pidfile, O_RDONLY);
-  if (pidfd == -1)
-    FAIL_EXIT1 ("open: %m");
+  int pidfd = xopen (pidfile, O_RDONLY, 0);
 
-  char buf[64];
-  ssize_t n;
-  if ((n = read (pidfd, buf, sizeof (buf))) < 0)
-    FAIL_EXIT1 ("read: %m");
+  char buf[INT_STRLEN_BOUND (pid_t)];
+  ssize_t n = read (pidfd, buf, sizeof (buf));
+  TEST_VERIFY (n < sizeof buf && n >= 0);
 
-  unlink (pidfile);
+  xunlink (pidfile);
 
   /* We only expect to read the PID.  */
   char *endp;
   long int rpid = strtol (buf, &endp, 10);
-  if (*endp != '\n')
-    FAIL_EXIT1 ("*endp != \'n\'");
-  if (endp == buf)
-    FAIL_EXIT1 ("read empty line");
+  TEST_VERIFY (*endp == '\n' && endp != buf);
 
-  if (rpid != pid)
-    FAIL_EXIT1 ("found \"%s\", expected pid %ld\n", buf, (long int) pid);
+  TEST_COMPARE (rpid, sinfo.si_pid);
 
   return 0;
 }
diff --git a/posix/tst-spawn4.c b/posix/tst-spawn4.c
index 327f04ea6c..8bf8bd52df 100644
--- a/posix/tst-spawn4.c
+++ b/posix/tst-spawn4.c
@@ -24,6 +24,7 @@
 #include <support/xunistd.h>
 #include <support/check.h>
 #include <support/temp_file.h>
+#include <tst-spawn.h>
 
 static int
 do_test (void)
@@ -38,15 +39,15 @@ do_test (void)
 
   TEST_VERIFY_EXIT (chmod (scriptname, 0x775) == 0);
 
-  pid_t pid;
+  PID_T_TYPE pid;
   int status;
 
   /* Check if scripts without shebang are correctly not executed.  */
-  status = posix_spawn (&pid, scriptname, NULL, NULL, (char *[]) { 0 },
+  status = POSIX_SPAWN (&pid, scriptname, NULL, NULL, (char *[]) { 0 },
                         (char *[]) { 0 });
   TEST_VERIFY_EXIT (status == ENOEXEC);
 
-  status = posix_spawnp (&pid, scriptname, NULL, NULL, (char *[]) { 0 },
+  status = POSIX_SPAWNP (&pid, scriptname, NULL, NULL, (char *[]) { 0 },
                          (char *[]) { 0 });
   TEST_VERIFY_EXIT (status == ENOEXEC);
 
diff --git a/posix/tst-spawn5.c b/posix/tst-spawn5.c
index 6b3d11cf82..7850f3d7dd 100644
--- a/posix/tst-spawn5.c
+++ b/posix/tst-spawn5.c
@@ -33,6 +33,7 @@
 
 #include <arch-fd_to_filename.h>
 #include <array_length.h>
+#include <tst-spawn.h>
 
 /* Nonzero if the program gets called via `exec'.  */
 static int restart;
@@ -161,14 +162,13 @@ spawn_closefrom_test (posix_spawn_file_actions_t *fa, int lowfd, int highfd,
   args[argc] = NULL;
   TEST_VERIFY (argc < argv_size);
 
-  pid_t pid;
-  int status;
+  PID_T_TYPE pid;
+  siginfo_t sinfo;
 
-  TEST_COMPARE (posix_spawn (&pid, args[0], fa, NULL, args, environ), 0);
-  TEST_COMPARE (xwaitpid (pid, &status, 0), pid);
-  TEST_VERIFY (WIFEXITED (status));
-  TEST_VERIFY (!WIFSIGNALED (status));
-  TEST_COMPARE (WEXITSTATUS (status), 0);
+  TEST_COMPARE (POSIX_SPAWN (&pid, args[0], fa, NULL, args, environ), 0);
+  TEST_COMPARE (WAITID (P_PID, pid, &sinfo, WEXITED), 0);
+  TEST_COMPARE (sinfo.si_code, CLD_EXITED);
+  TEST_COMPARE (sinfo.si_status, 0);
 }
 
 static void
diff --git a/posix/tst-spawn6.c b/posix/tst-spawn6.c
index 4e29d78168..ff36351cd6 100644
--- a/posix/tst-spawn6.c
+++ b/posix/tst-spawn6.c
@@ -32,6 +32,7 @@
 #include <sys/ioctl.h>
 #include <stdlib.h>
 #include <termios.h>
+#include <tst-spawn.h>
 
 #ifndef PATH_MAX
 # define PATH_MAX 1024
@@ -108,17 +109,15 @@ run_subprogram (int argc, char *argv[], const posix_spawnattr_t *attr,
   spargv[i] = NULL;
 
   pid_t pid;
-  TEST_COMPARE (posix_spawn (&pid, argv[1], actions, attr, spargv, environ),
+  TEST_COMPARE (POSIX_SPAWN (&pid, argv[1], actions, attr, spargv, environ),
 		exp_err);
   if (exp_err != 0)
     return;
 
-  int status;
-  TEST_COMPARE (xwaitpid (pid, &status, WUNTRACED), pid);
-  TEST_VERIFY (WIFEXITED (status));
-  TEST_VERIFY (!WIFSTOPPED (status));
-  TEST_VERIFY (!WIFSIGNALED (status));
-  TEST_COMPARE (WEXITSTATUS (status), 0);
+  siginfo_t sinfo;
+  TEST_COMPARE (WAITID (P_ALL, 0, &sinfo, WEXITED), 0);
+  TEST_COMPARE (sinfo.si_code, CLD_EXITED);
+  TEST_COMPARE (sinfo.si_status, 0);
 }
 
 static int
@@ -202,7 +201,7 @@ do_test (int argc, char *argv[])
   if (restart)
     return handle_restart (argv[1], argv[2]);
 
-  pid_t pid = xfork ();
+  PID_T_TYPE pid = xfork ();
   if (pid == 0)
     {
       /* Create a pseudo-terminal to avoid interfering with the one using by
diff --git a/posix/tst-spawn7.c b/posix/tst-spawn7.c
index fb06915cb7..cc4498830b 100644
--- a/posix/tst-spawn7.c
+++ b/posix/tst-spawn7.c
@@ -24,7 +24,9 @@
 #include <support/check.h>
 #include <support/xsignal.h>
 #include <support/xunistd.h>
+#include <sys/wait.h>
 #include <unistd.h>
+#include <tst-spawn.h>
 
 /* Nonzero if the program gets called via `exec'.  */
 #define CMDLINE_OPTIONS \
@@ -81,14 +83,13 @@ spawn_signal_test (const char *type, const posix_spawnattr_t *attr)
 {
   spargs[check_type_argc] = (char*) type;
 
-  pid_t pid;
-  int status;
+  PID_T_TYPE pid;
+  siginfo_t sinfo;
 
-  TEST_COMPARE (posix_spawn (&pid, spargs[0], NULL, attr, spargs, environ), 0);
+  TEST_COMPARE (POSIX_SPAWN (&pid, spargs[0], NULL, attr, spargs, environ), 0);
-  TEST_COMPARE (xwaitpid (pid, &status, 0), pid);
-  TEST_VERIFY (WIFEXITED (status));
-  TEST_VERIFY (!WIFSIGNALED (status));
-  TEST_COMPARE (WEXITSTATUS (status), 0);
+  TEST_COMPARE (WAITID (P_ALL, 0, &sinfo, WEXITED), 0);
+  TEST_COMPARE (sinfo.si_code, CLD_EXITED);
+  TEST_COMPARE (sinfo.si_status, 0);
 }
 
 static void
diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index c54cba873c..1bfd114d5d 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -62,6 +62,7 @@ sysdep_routines += \
   clock_adjtime \
   clone \
   clone-internal \
+  clone-pidfd-support \
   clone3 \
   closefrom_fallback \
   convert_scm_timestamps \
@@ -489,6 +490,8 @@ sysdep_headers += \
 sysdep_routines += \
   getcpu \
   oldglob \
+  pidfd_spawn \
+  pidfd_spawnp \
   sched_getcpu \
   spawnattr_getcgroup_np \
   spawnattr_setcgroup_np \
@@ -497,7 +500,16 @@ sysdep_routines += \
 tests += \
   tst-affinity \
   tst-affinity-pid \
+  tst-posix_spawn-setsid-pidfd \
   tst-spawn-cgroup \
+  tst-spawn-chdir-pidfd \
+  tst-spawn-pidfd \
+  tst-spawn2-pidfd \
+  tst-spawn3-pidfd \
+  tst-spawn4-pidfd \
+  tst-spawn5-pidfd \
+  tst-spawn6-pidfd \
+  tst-spawn7-pidfd \
   # tests
 
 tests-static += \
@@ -511,8 +523,14 @@ tests += \
 CFLAGS-fork.c = $(libio-mtsafe)
 CFLAGS-getpid.o = -fomit-frame-pointer
 CFLAGS-getpid.os = -fomit-frame-pointer
+CFLAGS-tst-spawn3-pidfd.c += -DOBJPFX=\"$(objpfx)\"
 
 tst-spawn-cgroup-ARGS = -- $(host-test-program-cmd)
+tst-spawn-pidfd-ARGS = -- $(host-test-program-cmd)
+tst-spawn5-pidfd-ARGS = -- $(host-test-program-cmd)
+tst-spawn6-pidfd-ARGS = -- $(host-test-program-cmd)
+tst-spawn7-pidfd-ARGS = -- $(host-test-program-cmd)
+tst-posix_spawn-setsid-pidfd-ARGS = -- $(host-test-program-cmd)
 endif
 
 ifeq ($(subdir),inet)
diff --git a/sysdeps/unix/sysv/linux/Versions b/sysdeps/unix/sysv/linux/Versions
index c912370cde..95ad896850 100644
--- a/sysdeps/unix/sysv/linux/Versions
+++ b/sysdeps/unix/sysv/linux/Versions
@@ -324,6 +324,8 @@ libc {
   GLIBC_2.38 {
     posix_spawnattr_getcgroup_np;
     posix_spawnattr_setcgroup_np;
+    pidfd_spawn;
+    pidfd_spawnp;
   }
   GLIBC_PRIVATE {
     # functions used in other libraries
diff --git a/sysdeps/unix/sysv/linux/aarch64/libc.abilist b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
index cbc8387131..26483dbc4e 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
@@ -2669,6 +2669,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/alpha/libc.abilist b/sysdeps/unix/sysv/linux/alpha/libc.abilist
index 6d31a565d2..b2cafe896e 100644
--- a/sysdeps/unix/sysv/linux/alpha/libc.abilist
+++ b/sysdeps/unix/sysv/linux/alpha/libc.abilist
@@ -2778,6 +2778,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/arc/libc.abilist b/sysdeps/unix/sysv/linux/arc/libc.abilist
index 8c604659c4..7138d480c1 100644
--- a/sysdeps/unix/sysv/linux/arc/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arc/libc.abilist
@@ -2430,6 +2430,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/arm/be/libc.abilist b/sysdeps/unix/sysv/linux/arm/be/libc.abilist
index 7936ed59f8..4d92c041dd 100644
--- a/sysdeps/unix/sysv/linux/arm/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/be/libc.abilist
@@ -550,6 +550,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/arm/le/libc.abilist b/sysdeps/unix/sysv/linux/arm/le/libc.abilist
index 2893783e3d..8595044924 100644
--- a/sysdeps/unix/sysv/linux/arm/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/le/libc.abilist
@@ -547,6 +547,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/bits/spawn_ext.h b/sysdeps/unix/sysv/linux/bits/spawn_ext.h
index 3bc10ab477..6f9d31bde3 100644
--- a/sysdeps/unix/sysv/linux/bits/spawn_ext.h
+++ b/sysdeps/unix/sysv/linux/bits/spawn_ext.h
@@ -37,4 +37,24 @@ extern int posix_spawnattr_setcgroup_np (posix_spawnattr_t *__restrict __attr,
 
 #endif /* __USE_MISC */
 
+#ifdef __USE_GNU
+
+extern int pidfd_spawn (int *__restrict __pidfd,
+			const char *__restrict __file,
+			const posix_spawn_file_actions_t *__restrict __facts,
+			const posix_spawnattr_t *__restrict __attrp,
+			char *const __argv[__restrict_arr],
+			char *const __envp[__restrict_arr])
+    __nonnull ((2, 5));
+
+extern int pidfd_spawnp (int *__restrict __pidfd,
+			 const char *__restrict __path,
+			 const posix_spawn_file_actions_t *__restrict __facts,
+			 const posix_spawnattr_t *__restrict __attrp,
+			 char *const __argv[__restrict_arr],
+			 char *const __envp[__restrict_arr])
+    __nonnull ((2, 5));
+
+#endif /* __USE_GNU */
+
 __END_DECLS
diff --git a/sysdeps/unix/sysv/linux/clone-pidfd-support.c b/sysdeps/unix/sysv/linux/clone-pidfd-support.c
new file mode 100644
index 0000000000..e56a064849
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/clone-pidfd-support.c
@@ -0,0 +1,58 @@
+/* Check if kernel supports PID file descriptors.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <atomic.h>
+#include <sys/wait.h>
+#include <sysdep.h>
+
+/* PID file descriptor support was added over multiple Linux releases:
+   - Linux 5.2 added CLONE_PIDFD support for the clone syscall.
+   - Linux 5.3 added support for poll, and CLONE_PIDFD support for
+     clone3.
+   - Linux 5.4 added P_PIDFD support to waitid.
+
+   For internal usage on spawn and fork, it only makes sense to return a
+   file descriptor if the caller can actually waitid on it.  */
+bool
+__clone_pidfd_supported (void)
+{
+  static int supported = 0;
+  int state = atomic_load_relaxed (&supported);
+  if (state == 0)
+    {
+      /* Linux defines the maximum allocated file descriptor value as
+	 0x7fffffc0 (from fs/file.c):
+
+         #define __const_min(x, y) ((x) < (y) ? (x) : (y))
+         unsigned int sysctl_nr_open_max =
+	   __const_min(INT_MAX, ~(size_t)0/sizeof(void *)) & -BITS_PER_LONG;
+
+	 So we can detect whether the kernel supports all pidfd interfaces
+	 by using a valid but never allocated file descriptor: if it is not
+	 supported waitid will return EINVAL, otherwise EBADF.
+
+         Also, waitid is a cancellation entrypoint, so issue the syscall
+	 directly.  */
+      int r = INTERNAL_SYSCALL_CALL (waitid, P_PIDFD, INT_MAX, NULL,
+				     WEXITED | WNOHANG, NULL);
+      state = r == -EBADF ? 1 : -1;
+      atomic_store_relaxed (&supported, state);
+    }
+
+  return state == 1;
+}
diff --git a/sysdeps/unix/sysv/linux/csky/libc.abilist b/sysdeps/unix/sysv/linux/csky/libc.abilist
index dc1b885b5a..388db91231 100644
--- a/sysdeps/unix/sysv/linux/csky/libc.abilist
+++ b/sysdeps/unix/sysv/linux/csky/libc.abilist
@@ -2706,6 +2706,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/hppa/libc.abilist b/sysdeps/unix/sysv/linux/hppa/libc.abilist
index c967612203..aa21ca135b 100644
--- a/sysdeps/unix/sysv/linux/hppa/libc.abilist
+++ b/sysdeps/unix/sysv/linux/hppa/libc.abilist
@@ -2655,6 +2655,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/i386/libc.abilist b/sysdeps/unix/sysv/linux/i386/libc.abilist
index c7b921f392..31d34bd2cc 100644
--- a/sysdeps/unix/sysv/linux/i386/libc.abilist
+++ b/sysdeps/unix/sysv/linux/i386/libc.abilist
@@ -2839,6 +2839,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/ia64/libc.abilist b/sysdeps/unix/sysv/linux/ia64/libc.abilist
index a0e4bd3ab9..6b09d6bddb 100644
--- a/sysdeps/unix/sysv/linux/ia64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/ia64/libc.abilist
@@ -2604,6 +2604,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist b/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
index bb8b895a37..65c5050c24 100644
--- a/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
@@ -2190,6 +2190,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
index 0c28b1bd74..104f1d9e7d 100644
--- a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
@@ -551,6 +551,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
index 59badcaf6b..b1d44b697c 100644
--- a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
@@ -2782,6 +2782,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
index 18da382fc2..be7b0c59b9 100644
--- a/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
@@ -2755,6 +2755,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
index 3f54fb325f..44171c5bcc 100644
--- a/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
@@ -2752,6 +2752,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
index 4ce0ca4955..672d142d2e 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
@@ -2747,6 +2747,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
index dc6e322b77..6a494ab102 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
@@ -2745,6 +2745,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
index b4ce7b5c82..38d3ed399d 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
@@ -2753,6 +2753,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
index 311800f6ca..4f6f2040b9 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
@@ -2655,6 +2655,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/nios2/libc.abilist b/sysdeps/unix/sysv/linux/nios2/libc.abilist
index 756f3cce1d..dbdeab7a7a 100644
--- a/sysdeps/unix/sysv/linux/nios2/libc.abilist
+++ b/sysdeps/unix/sysv/linux/nios2/libc.abilist
@@ -2794,6 +2794,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/or1k/libc.abilist b/sysdeps/unix/sysv/linux/or1k/libc.abilist
index 9b59c148a1..8801c867b1 100644
--- a/sysdeps/unix/sysv/linux/or1k/libc.abilist
+++ b/sysdeps/unix/sysv/linux/or1k/libc.abilist
@@ -2176,6 +2176,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/pidfd_spawn.c b/sysdeps/unix/sysv/linux/pidfd_spawn.c
new file mode 100644
index 0000000000..9f4a5780e6
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/pidfd_spawn.c
@@ -0,0 +1,30 @@
+/* pidfd_spawn - Spawn a process and return a pid file descriptor.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <spawn.h>
+#include "spawn_int.h"
+
+int
+pidfd_spawn (int *pidfd, const char *path,
+	     const posix_spawn_file_actions_t *file_actions,
+	     const posix_spawnattr_t *attrp, char *const argv[],
+	     char *const envp[])
+{
+  return __spawni (pidfd, path, file_actions, attrp, argv, envp,
+		   SPAWN_XFLAGS_RET_PIDFD);
+}
diff --git a/sysdeps/unix/sysv/linux/pidfd_spawnp.c b/sysdeps/unix/sysv/linux/pidfd_spawnp.c
new file mode 100644
index 0000000000..c8260fcd01
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/pidfd_spawnp.c
@@ -0,0 +1,30 @@
+/* pidfd_spawnp - Spawn a process and return a pid file descriptor.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <spawn.h>
+#include "spawn_int.h"
+
+int
+pidfd_spawnp (int *pidfd, const char *path,
+	      const posix_spawn_file_actions_t *file_actions,
+	      const posix_spawnattr_t *attrp, char *const argv[],
+	      char *const envp[])
+{
+  return __spawni (pidfd, path, file_actions, attrp, argv, envp,
+		   SPAWN_XFLAGS_USE_PATH | SPAWN_XFLAGS_RET_PIDFD);
+}
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
index 022c9d5907..d4927da36e 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
@@ -2821,6 +2821,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
index 5eabe69671..934ebcc495 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
@@ -2854,6 +2854,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
index a66243cb1b..7dee513a82 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
@@ -2575,6 +2575,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
index 8904138f0b..1f733560b9 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
@@ -2889,6 +2889,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
index c90aeb6bbf..6f0799c25a 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
@@ -2432,6 +2432,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
index dea9d8e7fc..c359dc2b29 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
@@ -2632,6 +2632,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
index 475f5a991f..c49704b77e 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
@@ -2819,6 +2819,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
index 228525449e..389a451762 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
@@ -2612,6 +2612,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/sh/be/libc.abilist b/sysdeps/unix/sysv/linux/sh/be/libc.abilist
index f8ffa32087..01adcabee8 100644
--- a/sysdeps/unix/sysv/linux/sh/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/be/libc.abilist
@@ -2662,6 +2662,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/sh/le/libc.abilist b/sysdeps/unix/sysv/linux/sh/le/libc.abilist
index ab4fecdc3e..83a4359dab 100644
--- a/sysdeps/unix/sysv/linux/sh/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/le/libc.abilist
@@ -2659,6 +2659,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
index 79b5353355..ba2f588fc1 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
@@ -2814,6 +2814,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
index 479637e24d..2e63d242c6 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
@@ -2627,6 +2627,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/spawni.c b/sysdeps/unix/sysv/linux/spawni.c
index da748679c1..9f16956702 100644
--- a/sysdeps/unix/sysv/linux/spawni.c
+++ b/sysdeps/unix/sysv/linux/spawni.c
@@ -68,6 +68,7 @@ struct posix_spawn_args
   int xflags;
   bool use_clone3;
   int err;
+  int pidfd;
 };
 
 /* Older version requires that shell script without shebang definition
@@ -309,7 +310,7 @@ fail:
 /* Spawn a new process executing PATH with the attributes describes in *ATTRP.
    Before running the process perform the actions described in FILE-ACTIONS. */
 static int
-__spawnix (pid_t * pid, const char *file,
+__spawnix (int *pid, const char *file,
 	   const posix_spawn_file_actions_t * file_actions,
 	   const posix_spawnattr_t * attrp, char *const argv[],
 	   char *const envp[], int xflags,
@@ -319,6 +320,15 @@ __spawnix (pid_t * pid, const char *file,
   struct posix_spawn_args args;
   int ec;
 
+  bool use_pidfd = xflags & SPAWN_XFLAGS_RET_PIDFD;
+
+  /* For CLONE_PIDFD, older kernels might not fail with unsupported flags or
+     some versions might not support waitid (P_PIDFD).  So to avoid the need
+     to handle the error on the helper process, check for full pidfd
+     support.  */
+  if (use_pidfd && !__clone_pidfd_supported ())
+    return ENOSYS;
+
   /* To avoid imposing hard limits on posix_spawn{p} the total number of
      arguments is first calculated to allocate a mmap to hold all possible
      values.  */
@@ -368,6 +378,7 @@ __spawnix (pid_t * pid, const char *file,
   args.argv = argv;
   args.argc = argc;
   args.envp = envp;
+  args.pidfd = 0;
   args.xflags = xflags;
 
   internal_signal_block_all (&args.oldmask);
@@ -386,13 +397,16 @@ __spawnix (pid_t * pid, const char *file,
       /* Unsupported flags like CLONE_CLEAR_SIGHAND will be cleared up by
 	 __clone_internal_fallback.  */
       .flags = (set_cgroup ? CLONE_INTO_CGROUP : 0)
+	       | (use_pidfd ? CLONE_PIDFD : 0)
 	       | CLONE_CLEAR_SIGHAND
 	       | CLONE_VM
 	       | CLONE_VFORK,
       .exit_signal = SIGCHLD,
       .stack = (uintptr_t) stack,
       .stack_size = stack_size,
-      .cgroup = (set_cgroup ? attrp->__cgroup : 0)
+      .cgroup = (set_cgroup ? attrp->__cgroup : 0),
+      .pidfd = use_pidfd ? (uintptr_t) &args.pidfd : 0,
+      .parent_tid = use_pidfd ? (uintptr_t) &args.pidfd : 0,
     };
 #ifdef HAVE_CLONE3_WRAPPER
   args.use_clone3 = true;
@@ -443,7 +457,7 @@ __spawnix (pid_t * pid, const char *file,
   __munmap (stack, stack_size);
 
   if ((ec == 0) && (pid != NULL))
-    *pid = new_pid;
+    *pid = use_pidfd ? args.pidfd : new_pid;
 
   internal_signal_restore_set (&args.oldmask);
 
diff --git a/sysdeps/unix/sysv/linux/tst-posix_spawn-setsid-pidfd.c b/sysdeps/unix/sysv/linux/tst-posix_spawn-setsid-pidfd.c
new file mode 100644
index 0000000000..4372833f07
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-posix_spawn-setsid-pidfd.c
@@ -0,0 +1,20 @@
+/* Tests for spawn pidfd extension.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <tst-spawn-pidfd.h>
+#include <posix/tst-posix_spawn-setsid.c>
diff --git a/sysdeps/unix/sysv/linux/tst-spawn-chdir-pidfd.c b/sysdeps/unix/sysv/linux/tst-spawn-chdir-pidfd.c
new file mode 100644
index 0000000000..019527b31b
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-spawn-chdir-pidfd.c
@@ -0,0 +1,20 @@
+/* Tests for spawn pidfd extension.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <tst-spawn-pidfd.h>
+#include <posix/tst-spawn-chdir.c>
diff --git a/sysdeps/unix/sysv/linux/tst-spawn-pidfd.c b/sysdeps/unix/sysv/linux/tst-spawn-pidfd.c
new file mode 100644
index 0000000000..c430995af8
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-spawn-pidfd.c
@@ -0,0 +1,20 @@
+/* Tests for spawn pidfd extension.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <tst-spawn-pidfd.h>
+#include <posix/tst-spawn.c>
diff --git a/sysdeps/unix/sysv/linux/tst-spawn-pidfd.h b/sysdeps/unix/sysv/linux/tst-spawn-pidfd.h
new file mode 100644
index 0000000000..ea51c22447
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-spawn-pidfd.h
@@ -0,0 +1,63 @@
+/* Tests for spawn pidfd extension.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <errno.h>
+#include <spawn.h>
+#include <support/check.h>
+
+#define PID_T_TYPE int
+
+/* Call pidfd_spawn and report the test as unsupported on ENOSYS.  */
+static inline int
+pidfd_spawn_check (int *pidfd, const char *path,
+		   const posix_spawn_file_actions_t *fa,
+		   const posix_spawnattr_t *attr, char *const argv[],
+		   char *const envp[])
+{
+  int r = pidfd_spawn (pidfd, path, fa, attr, argv, envp);
+  if (r == ENOSYS)
+    FAIL_UNSUPPORTED ("kernel does not support CLONE_PIDFD clone flag");
+  return r;
+}
+
+#define POSIX_SPAWN(__pidfd, __path, __actions, __attr, __argv, __envp)	     \
+  pidfd_spawn_check (__pidfd, __path, __actions, __attr, __argv, __envp)
+
+static inline int
+pidfd_spawnp_check (int *pidfd, const char *file,
+		    const posix_spawn_file_actions_t *fa,
+		    const posix_spawnattr_t *attr,
+		    char *const argv[], char *const envp[])
+{
+  int r = pidfd_spawnp (pidfd, file, fa, attr, argv, envp);
+  if (r == ENOSYS)
+    FAIL_UNSUPPORTED ("kernel does not support CLONE_PIDFD clone flag");
+  return r;
+}
+
+#define POSIX_SPAWNP(__child, __path, __actions, __attr, __argv, __envp) \
+  pidfd_spawnp_check (__child, __path, __actions, __attr, __argv, __envp)
+
+#define WAITID(__idtype, __id, __info, __opts)				     \
+  ({									     \
+     __typeof (__idtype) __new_idtype = __idtype == P_PID		     \
+					? P_PIDFD : __idtype;		     \
+     waitid (__new_idtype, __id, __info, __opts);			     \
+  })
+
+#define TST_SPAWN_PIDFD 1
diff --git a/sysdeps/unix/sysv/linux/tst-spawn2-pidfd.c b/sysdeps/unix/sysv/linux/tst-spawn2-pidfd.c
new file mode 100644
index 0000000000..03ba7a3d15
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-spawn2-pidfd.c
@@ -0,0 +1,20 @@
+/* Tests for spawn pidfd extension.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <tst-spawn-pidfd.h>
+#include <posix/tst-spawn2.c>
diff --git a/sysdeps/unix/sysv/linux/tst-spawn3-pidfd.c b/sysdeps/unix/sysv/linux/tst-spawn3-pidfd.c
new file mode 100644
index 0000000000..8ad9a16854
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-spawn3-pidfd.c
@@ -0,0 +1,20 @@
+/* Check posix_spawn add file actions.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <tst-spawn-pidfd.h>
+#include <posix/tst-spawn3.c>
diff --git a/sysdeps/unix/sysv/linux/tst-spawn4-pidfd.c b/sysdeps/unix/sysv/linux/tst-spawn4-pidfd.c
new file mode 100644
index 0000000000..83922da7d1
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-spawn4-pidfd.c
@@ -0,0 +1,20 @@
+/* Tests for spawn pidfd extension.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <tst-spawn-pidfd.h>
+#include <posix/tst-spawn4.c>
diff --git a/sysdeps/unix/sysv/linux/tst-spawn5-pidfd.c b/sysdeps/unix/sysv/linux/tst-spawn5-pidfd.c
new file mode 100644
index 0000000000..149c352bf8
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-spawn5-pidfd.c
@@ -0,0 +1,20 @@
+/* Tests for spawn pidfd extension.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <tst-spawn-pidfd.h>
+#include <posix/tst-spawn5.c>
diff --git a/sysdeps/unix/sysv/linux/tst-spawn6-pidfd.c b/sysdeps/unix/sysv/linux/tst-spawn6-pidfd.c
new file mode 100644
index 0000000000..d3f5859457
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-spawn6-pidfd.c
@@ -0,0 +1,20 @@
+/* Tests for spawn pidfd extension.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <tst-spawn-pidfd.h>
+#include <posix/tst-spawn6.c>
diff --git a/sysdeps/unix/sysv/linux/tst-spawn7-pidfd.c b/sysdeps/unix/sysv/linux/tst-spawn7-pidfd.c
new file mode 100644
index 0000000000..3aec86bec2
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-spawn7-pidfd.c
@@ -0,0 +1,20 @@
+/* Tests for spawn pidfd extension.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <tst-spawn-pidfd.h>
+#include <posix/tst-spawn7.c>
diff --git a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
index ea8539447c..f7d5b23888 100644
--- a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
@@ -2578,6 +2578,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
index f15ac7c33f..ddc1f04eb1 100644
--- a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
@@ -2684,6 +2684,8 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_spawn F
+GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
 GLIBC_2.38 posix_spawnattr_setcgroup_np F
 GLIBC_2.38 strlcat F
-- 
2.34.1


* [PATCH v6 3/5] posix: Add pidfd_fork (BZ 26371)
  2023-07-05 20:43 [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation Adhemerval Zanella
  2023-07-05 20:43 ` [PATCH v6 1/5] linux: Add posix_spawnattr_{get,set}cgroup_np (BZ 26731) Adhemerval Zanella
  2023-07-05 20:43 ` [PATCH v6 2/5] posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349) Adhemerval Zanella
@ 2023-07-05 20:43 ` Adhemerval Zanella
  2023-07-05 20:43 ` [PATCH v6 4/5] posix: Add PIDFDFORK_NOSIGCHLD for pidfd_fork Adhemerval Zanella
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Adhemerval Zanella @ 2023-07-05 20:43 UTC (permalink / raw)
  To: libc-alpha, Luca Boccassi, Philip Withnall

Returning a pidfd allows a process to keep a race-free handle to a
child process.  Otherwise the caller needs to either use pidfd_open
(which might still be subject to TOCTOU) or keep using the old racy
interface.

The implementation requires the kernel to support the complete
pidfd interface, meaning that waitid (P_PIDFD) must be supported.
This ensures that no racy workaround is required (such as reading the
procfs fdinfo pid to use along with the old wait interfaces).  If the
kernel does not have the required support, the interface returns -1 and
sets errno to ENOSYS.

The interface is:

  pid_t pidfd_fork (int *pidfd, int cgroup, unsigned int flags)

If PIDFD is NULL, no file descriptor is returned and pidfd_fork
acts as fork.  Otherwise, a new file descriptor is returned; the
kernel sets O_CLOEXEC on it by default.  pidfd_fork follows the
fork/_Fork convention of returning a positive or negative value to the
parent (with negative indicating an error) and zero to the child.
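
A minimal sketch of the return-value convention described above.  It
uses plain fork, which follows the same convention, because pidfd_fork
is only proposed by this patch and is not assumed to be available.
The helper name run_child is hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with CODE and return the
   exit status observed by the parent, or -1 on error.  */
static int
run_child (int code)
{
  assert (code >= 0 && code <= 255);

  pid_t pid = fork ();
  if (pid < 0)
    return -1;          /* Negative: creation failed, seen by the parent.  */
  if (pid == 0)
    _exit (code);       /* Zero: this is the child.  */

  /* Positive: this is the parent and PID identifies the child.  */
  int status;
  if (waitpid (pid, &status, 0) != pid || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

With pidfd_fork and a non-NULL PIDFD, the parent would instead wait on
the returned descriptor with waitid (P_PIDFD, ...), which is the
race-free path this patch is about.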

If cgroup is 0 or a positive value, it is interpreted as the file
descriptor of a different cgroup in which to place the new process
(check the CLONE_INTO_CGROUP clone flag).
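
As a rough model of how the __clone_fork helper added by this patch
handles the cgroup argument: clone3 is tried first, plain clone is the
fallback, and a cgroup request has no fallback (the helper reports
ENOTSUP).  The sketch below is only a pure-logic illustration; the
names fork_path, fork_decision, and choose_fork_path are hypothetical
and do not appear in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only.  */
enum fork_path
{
  FORK_PATH_CLONE3,   /* clone3 is available: used for every request.  */
  FORK_PATH_CLONE,    /* No clone3 and no cgroup requested: plain clone.  */
  FORK_PATH_FAIL      /* No clone3 but a cgroup was requested.  */
};

struct fork_decision
{
  enum fork_path path;
  int error;          /* errno value on FORK_PATH_FAIL, otherwise 0.  */
};

/* CGROUP is -1 for "no cgroup", otherwise a cgroup v2 directory
   descriptor.  */
static struct fork_decision
choose_fork_path (bool kernel_has_clone3, int cgroup)
{
  assert (cgroup >= -1);

  if (kernel_has_clone3)
    return (struct fork_decision) { FORK_PATH_CLONE3, 0 };
  if (cgroup == -1)
    return (struct fork_decision) { FORK_PATH_CLONE, 0 };
  return (struct fork_decision) { FORK_PATH_FAIL, ENOTSUP };
}
```

The model deliberately leaves out the separate pidfd support check,
which makes pidfd_fork fail with ENOSYS as described above.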

Similar to fork, pidfd_fork also runs the pthread_atfork handlers.
This can be changed with the PIDFDFORK_ASYNCSAFE flag, which makes
pidfd_fork act as _Fork.  It also sends SIGCHLD to the parent when
the process terminates.
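
For the legacy clone path, the exit signal travels in the low byte of
the flags word, which is why the reworked arch_fork ORs SIGCHLD into
the caller-supplied flags.  A small sketch of that composition follows;
the helper name fork_clone_flags is hypothetical, and the fallback
macro definitions are assumptions for older headers (the values are
those of the Linux UAPI).

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <assert.h>
#include <sched.h>
#include <signal.h>

/* Fallbacks for older headers; values as in the Linux UAPI.  */
#ifndef CSIGNAL
# define CSIGNAL 0x000000ff
#endif
#ifndef CLONE_PIDFD
# define CLONE_PIDFD 0x00001000
#endif
#ifndef CLONE_CHILD_CLEARTID
# define CLONE_CHILD_CLEARTID 0x00200000
#endif
#ifndef CLONE_CHILD_SETTID
# define CLONE_CHILD_SETTID 0x01000000
#endif

/* Hypothetical helper mirroring the flag composition in arch_fork:
   EXTRA_FLAGS (for example CLONE_PIDFD) plus the fork defaults.  */
static unsigned long
fork_clone_flags (unsigned long extra_flags)
{
  /* The low byte is reserved for the exit signal.  */
  assert ((extra_flags & CSIGNAL) == 0);
  return extra_flags | CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD;
}
```

With clone3 the exit signal is passed in the separate exit_signal field
of struct clone_args, as __clone_fork does in this patch.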

Checked on x86_64-linux-gnu on Linux 4.15 (no CLONE_PIDFD or waitid
(P_PIDFD) support), Linux 5.15 (only clone support), and Linux 5.19
(full support including clone3).
---
 NEWS                                          |   5 +
 include/clone_internal.h                      |  16 ++
 manual/process.texi                           |  53 ++++-
 posix/Makefile                                |   3 +-
 posix/fork-internal.c                         | 127 ++++++++++++
 posix/fork-internal.h                         |  36 ++++
 posix/fork.c                                  | 107 +---------
 sysdeps/nptl/_Fork.c                          |   2 +-
 sysdeps/unix/sysv/linux/Makefile              |   3 +
 sysdeps/unix/sysv/linux/Versions              |   1 +
 sysdeps/unix/sysv/linux/aarch64/libc.abilist  |   1 +
 sysdeps/unix/sysv/linux/alpha/libc.abilist    |   1 +
 sysdeps/unix/sysv/linux/arc/libc.abilist      |   1 +
 sysdeps/unix/sysv/linux/arch-fork.h           |  16 +-
 sysdeps/unix/sysv/linux/arm/be/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/arm/le/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/clone-internal.c      |  56 +++++-
 sysdeps/unix/sysv/linux/csky/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/hppa/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/i386/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/ia64/libc.abilist     |   1 +
 .../sysv/linux/loongarch/lp64/libc.abilist    |   1 +
 .../sysv/linux/m68k/coldfire/libc.abilist     |   1 +
 .../unix/sysv/linux/m68k/m680x0/libc.abilist  |   1 +
 .../sysv/linux/microblaze/be/libc.abilist     |   1 +
 .../sysv/linux/microblaze/le/libc.abilist     |   1 +
 .../sysv/linux/mips/mips32/fpu/libc.abilist   |   1 +
 .../sysv/linux/mips/mips32/nofpu/libc.abilist |   1 +
 .../sysv/linux/mips/mips64/n32/libc.abilist   |   1 +
 .../sysv/linux/mips/mips64/n64/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/nios2/libc.abilist    |   1 +
 sysdeps/unix/sysv/linux/or1k/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/pidfd_fork.c          |  81 ++++++++
 .../linux/powerpc/powerpc32/fpu/libc.abilist  |   1 +
 .../powerpc/powerpc32/nofpu/libc.abilist      |   1 +
 .../linux/powerpc/powerpc64/be/libc.abilist   |   1 +
 .../linux/powerpc/powerpc64/le/libc.abilist   |   1 +
 .../unix/sysv/linux/riscv/rv32/libc.abilist   |   1 +
 .../unix/sysv/linux/riscv/rv64/libc.abilist   |   1 +
 .../unix/sysv/linux/s390/s390-32/libc.abilist |   1 +
 .../unix/sysv/linux/s390/s390-64/libc.abilist |   1 +
 sysdeps/unix/sysv/linux/sh/be/libc.abilist    |   1 +
 sysdeps/unix/sysv/linux/sh/le/libc.abilist    |   1 +
 .../sysv/linux/sparc/sparc32/libc.abilist     |   1 +
 .../sysv/linux/sparc/sparc64/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/sys/pidfd.h           |  19 ++
 .../unix/sysv/linux/tst-pidfd_fork-cgroup.c   | 162 +++++++++++++++
 sysdeps/unix/sysv/linux/tst-pidfd_fork.c      | 186 ++++++++++++++++++
 .../unix/sysv/linux/x86_64/64/libc.abilist    |   1 +
 .../unix/sysv/linux/x86_64/x32/libc.abilist   |   1 +
 50 files changed, 788 insertions(+), 119 deletions(-)
 create mode 100644 posix/fork-internal.c
 create mode 100644 posix/fork-internal.h
 create mode 100644 sysdeps/unix/sysv/linux/pidfd_fork.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-pidfd_fork-cgroup.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-pidfd_fork.c

diff --git a/NEWS b/NEWS
index 65562e75e7..462d8f511d 100644
--- a/NEWS
+++ b/NEWS
@@ -61,6 +61,11 @@ Major new features:
   The pidfd functionality avoid the issue of PID reuse with traditional
   posix_spawn interface.
 
+* On Linux, the pidfd_fork function has been added.  It has semantics
+  similar to fork or _Fork, cloning the calling process.  In addition to
+  returning the process ID, it can return a file descriptor that can be
+  used along with other pidfd functions.
+
 Deprecated and removed features, and other changes affecting compatibility:
 
 * In the Linux kernel for the hppa/parisc architecture some of the
diff --git a/include/clone_internal.h b/include/clone_internal.h
index 567160ebb5..d9b5509f78 100644
--- a/include/clone_internal.h
+++ b/include/clone_internal.h
@@ -2,6 +2,8 @@
 #define _CLONE_INTERNAL_H
 
 #include <clone3.h>
+#include <stdbool.h>
+#include <stdint.h>
 
 /* The clone3 syscall provides a superset of the functionality of the clone
    interface.  The kernel might extend __CL_ARGS struct in the future, with
@@ -35,6 +37,20 @@ extern int __clone_internal_fallback (struct clone_args *__cl_args,
 				      void *__arg)
      attribute_hidden;
 
+/* Call the clone3/clone syscall with fork semantic (i.e. no stack setting
+   required).  The EXTRA_FLAGS define any additional flag to be used besides
+   CLONE_CHILD_SETTID and CLONE_CHILD_CLEARTID, the PIDFD indicates where
+   the process file descriptor (set with CLONE_PIDFD) should be returned,
+   and the CGROUP specifies the cgroupsv2 (set with CLONE_INTO_CGROUP).
+
+   Similar to __clone3_internal, it uses the sticky check to avoid re-issuing
+   the clone3 syscall if the kernel does not support it.
+
+   It does not provide CLONE_INTO_CGROUP/CGROUP fallback if clone3 is not
+   supported, in this case the function returns -1/ENOTSUP.  */
+extern int __clone_fork (uint64_t __extra_flags, void *__pidfd, int __cgroup)
+     attribute_hidden;
+
 /* Return whether the kernel supports pid file descriptor, including clone
    with CLONE_PIDFD and waitid with P_PIDFD.  */
 extern bool __clone_pidfd_supported (void) attribute_hidden;
diff --git a/manual/process.texi b/manual/process.texi
index 68361c3f61..a656df425b 100644
--- a/manual/process.texi
+++ b/manual/process.texi
@@ -137,12 +137,12 @@ creating a process and making it run another program.
 @cindex subprocess
 A new processes is created when one of the functions
 @code{posix_spawn}, @code{fork}, @code{_Fork}, @code{vfork}, or
-@code{pidfd_spawn} is called.  (The @code{system} and @code{popen} also
-create new processes internally.)  Due to the name of the @code{fork}
-function, the act of creating a new process is sometimes called
-@dfn{forking} a process.  Each new process (the @dfn{child process} or
-@dfn{subprocess}) is allocated a process ID, distinct from the process
-ID of the parent process.  @xref{Process Identification}.
+@code{pidfd_spawn}, or @code{pidfd_fork} is called.  (The @code{system}
+and @code{popen} also create new processes internally.)  Due to the name
+of the @code{fork} function, the act of creating a new process is
+sometimes called @dfn{forking} a process.  Each new process (the
+@dfn{child process} or @dfn{subprocess}) is allocated a process ID,
+distinct from the process ID of the parent process.  @xref{Process Identification}.
 
 After forking a child process, both the parent and child processes
 continue to execute normally.  If you want your program to wait for a
@@ -153,10 +153,10 @@ limited information about why the child terminated---for example, its
 exit status code.
 
 A newly forked child process continues to execute the same program as
-its parent process, at the point where the @code{fork} or @code{_Fork}
-call returns.  You can use the return value from @code{fork} or
-@code{_Fork} to tell whether the program is running in the parent process
-or the child.
+its parent process, at the point where the @code{fork}, @code{_Fork},
+or @code{pidfd_fork} call returns.  You can use the return value from
+@code{fork}, @code{_Fork}, or @code{pidfd_fork} to tell whether the
+program is running in the parent process or the child.
 
 @cindex process image
 Having several processes run the same program is only occasionally
@@ -362,6 +362,39 @@ the proper precautions for using @code{vfork}, your program will still
 work even if the system uses @code{fork} instead.
 @end deftypefun
 
+@deftypefun pid_t pidfd_fork (int *@var{pidfd}, int @var{cgroup}, unsigned int @var{flags})
+@standards{GNU, sys/pidfd.h}
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+The @code{pidfd_fork} function is similar to @code{fork} but can also return
+a file descriptor that refers to the child process.
+
+If the operation is successful, there are both parent and child processes
+and both see @code{pidfd_fork} return, but with different values: it returns
+a value of @code{0} in the child process and returns the child's process ID
+in the parent process.
+
+Also, if the process is correctly created and @var{pidfd} is not @code{NULL},
+the pointed-to object will contain a file descriptor that can be used with other
+pidfd functions (such as @code{pidfd_send_signal}, or @code{waitid} with
+@code{P_PIDFD}).
+
+The @var{cgroup} argument should be either -1 or a file descriptor to a cgroup v2
+directory used on process creation.  There is no fallback implementation, meaning
+that if the kernel does not provide the required support an error is returned.
+
+The @var{flags} argument should be either zero, or the bitwise OR of some of the
+following flags:
+
+@table @code
+@item PIDFDFORK_ASYNCSAFE
+Acts as @code{_Fork}, where it does not invoke any callbacks registered with
+@code{pthread_atfork}, nor does it reset internal state or locks (such as the
+@code{malloc} locks).
+@end table
+@end deftypefun
+
+This function is specific to Linux.
+
 @node Executing a File
 @section Executing a File
 @cindex executing a file
diff --git a/posix/Makefile b/posix/Makefile
index e3c78d3d65..974a369339 100644
--- a/posix/Makefile
+++ b/posix/Makefile
@@ -84,6 +84,7 @@ routines := \
   fexecve \
   fnmatch \
   fork \
+  fork-internal \
   fpathconf \
   gai_strerror \
   get_child_max \
@@ -580,7 +581,7 @@ CFLAGS-execl.os = -fomit-frame-pointer
 CFLAGS-execvp.os = -fomit-frame-pointer
 CFLAGS-execlp.os = -fomit-frame-pointer
 CFLAGS-nanosleep.c += -fexceptions -fasynchronous-unwind-tables
-CFLAGS-fork.c = $(libio-mtsafe) $(config-cflags-wno-ignored-attributes)
+CFLAGS-fork-internal.c = $(libio-mtsafe) $(config-cflags-wno-ignored-attributes)
 
 tstgetopt-ARGS = -a -b -cfoobar --required foobar --optional=bazbug \
 		--none random --col --color --colour
diff --git a/posix/fork-internal.c b/posix/fork-internal.c
new file mode 100644
index 0000000000..a5e47cbe53
--- /dev/null
+++ b/posix/fork-internal.c
@@ -0,0 +1,127 @@
+/* Internal fork definitions.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <fork.h>
+#include <fork-internal.h>
+#include <ldsodefs.h>
+#include <libio/libioP.h>
+#include <malloc/malloc-internal.h>
+#include <register-atfork.h>
+#include <stdio-lock.h>
+#include <unwind-link.h>
+
+static void
+fresetlockfiles (void)
+{
+  _IO_ITER i;
+
+  for (i = _IO_iter_begin(); i != _IO_iter_end(); i = _IO_iter_next(i))
+    if ((_IO_iter_file (i)->_flags & _IO_USER_LOCK) == 0)
+      _IO_lock_init (*((_IO_lock_t *) _IO_iter_file(i)->_lock));
+}
+
+uint64_t
+__fork_pre (bool multiple_threads, struct nss_database_data *nss_database_data)
+{
+  uint64_t lastrun = __run_prefork_handlers (multiple_threads);
+
+  /* If we are not running multiple threads, we do not have to
+     preserve lock state.  If fork runs from a signal handler, only
+     async-signal-safe functions can be used in the child.  These data
+     structures are only used by unsafe functions, so their state does
+     not matter if fork was called from a signal handler.  */
+  if (multiple_threads)
+    {
+      call_function_static_weak (__nss_database_fork_prepare_parent,
+				 nss_database_data);
+
+      _IO_list_lock ();
+
+      /* Acquire malloc locks.  This needs to come last because fork
+	 handlers may use malloc, and the libio list lock has an
+	 indirect malloc dependency as well (via the getdelim
+	 function).  */
+      call_function_static_weak (__malloc_fork_lock_parent);
+    }
+
+  return lastrun;
+}
+
+void
+__fork_post (struct fork_post_state_t *state,
+	     struct nss_database_data *nss_database_data)
+{
+  if (state->pid == 0)
+    {
+      fork_system_setup ();
+
+      /* Reset the lock state in the multi-threaded case.  */
+      if (state->multiple_threads)
+	{
+	  __libc_unwind_link_after_fork ();
+
+	  fork_system_setup_after_fork ();
+
+	  /* Release malloc locks.  */
+	  call_function_static_weak (__malloc_fork_unlock_child);
+
+	  /* Reset the file list.  These are recursive mutexes.  */
+	  fresetlockfiles ();
+
+	  /* Reset locks in the I/O code.  */
+	  _IO_list_resetlock ();
+
+	  call_function_static_weak (__nss_database_fork_subprocess,
+				     nss_database_data);
+	}
+
+      /* Reset the lock the dynamic loader uses to protect its data.  */
+      __rtld_lock_initialize (GL(dl_load_lock));
+
+      /* Reset the lock protecting dynamic TLS related data.  */
+      __rtld_lock_initialize (GL(dl_load_tls_lock));
+
+      reclaim_stacks ();
+
+      /* Run the handlers registered for the child.  */
+      __run_postfork_handlers (atfork_run_child, state->multiple_threads,
+			       state->lastrun);
+    }
+  else
+    {
+      /* If _Fork failed, preserve its errno value.  */
+      int save_errno = errno;
+
+      /* Release acquired locks in the multi-threaded case.  */
+      if (state->multiple_threads)
+	{
+	  /* Release malloc locks, parent process variant.  */
+	  call_function_static_weak (__malloc_fork_unlock_parent);
+
+	  /* We execute this even if the 'fork' call failed.  */
+	  _IO_list_unlock ();
+	}
+
+      /* Run the handlers registered for the parent.  */
+      __run_postfork_handlers (atfork_run_parent, state->multiple_threads,
+			       state->lastrun);
+
+      if (state->pid < 0)
+	__set_errno (save_errno);
+    }
+}
diff --git a/posix/fork-internal.h b/posix/fork-internal.h
new file mode 100644
index 0000000000..5017061e1e
--- /dev/null
+++ b/posix/fork-internal.h
@@ -0,0 +1,36 @@
+/* Internal fork definitions.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _FORK_INTERNAL_H
+#define _FORK_INTERNAL_H
+
+#include <stdint.h>
+#include <nss/nss_database.h>
+
+struct fork_post_state_t
+{
+  bool multiple_threads;
+  pid_t pid;
+  uint64_t lastrun;
+};
+
+uint64_t __fork_pre (bool, struct nss_database_data *) attribute_hidden;
+void __fork_post (struct fork_post_state_t *, struct nss_database_data *)
+  attribute_hidden;
+
+#endif
diff --git a/posix/fork.c b/posix/fork.c
index b4aaa9fa6d..1708473e72 100644
--- a/posix/fork.c
+++ b/posix/fork.c
@@ -16,25 +16,10 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-#include <fork.h>
-#include <libio/libioP.h>
-#include <ldsodefs.h>
-#include <malloc/malloc-internal.h>
-#include <nss/nss_database.h>
-#include <register-atfork.h>
-#include <stdio-lock.h>
+#include <fork-internal.h>
 #include <sys/single_threaded.h>
 #include <unwind-link.h>
-
-static void
-fresetlockfiles (void)
-{
-  _IO_ITER i;
-
-  for (i = _IO_iter_begin(); i != _IO_iter_end(); i = _IO_iter_next(i))
-    if ((_IO_iter_file (i)->_flags & _IO_USER_LOCK) == 0)
-      _IO_lock_init (*((_IO_lock_t *) _IO_iter_file(i)->_lock));
-}
+#include <unistd.h>
 
 pid_t
 __libc_fork (void)
@@ -45,92 +30,18 @@ __libc_fork (void)
      requirement for fork (Austin Group tracker issue #62) this is
      best effort to make is async-signal-safe at least for single-thread
      case.  */
-  bool multiple_threads = !SINGLE_THREAD_P;
-  uint64_t lastrun;
-
-  lastrun = __run_prefork_handlers (multiple_threads);
-
+  struct fork_post_state_t state = {
+      .multiple_threads = !SINGLE_THREAD_P
+  };
   struct nss_database_data nss_database_data;
 
-  /* If we are not running multiple threads, we do not have to
-     preserve lock state.  If fork runs from a signal handler, only
-     async-signal-safe functions can be used in the child.  These data
-     structures are only used by unsafe functions, so their state does
-     not matter if fork was called from a signal handler.  */
-  if (multiple_threads)
-    {
-      call_function_static_weak (__nss_database_fork_prepare_parent,
-				 &nss_database_data);
-
-      _IO_list_lock ();
-
-      /* Acquire malloc locks.  This needs to come last because fork
-	 handlers may use malloc, and the libio list lock has an
-	 indirect malloc dependency as well (via the getdelim
-	 function).  */
-      call_function_static_weak (__malloc_fork_lock_parent);
-    }
-
-  pid_t pid = _Fork ();
-
-  if (pid == 0)
-    {
-      fork_system_setup ();
-
-      /* Reset the lock state in the multi-threaded case.  */
-      if (multiple_threads)
-	{
-	  __libc_unwind_link_after_fork ();
-
-	  fork_system_setup_after_fork ();
-
-	  /* Release malloc locks.  */
-	  call_function_static_weak (__malloc_fork_unlock_child);
-
-	  /* Reset the file list.  These are recursive mutexes.  */
-	  fresetlockfiles ();
-
-	  /* Reset locks in the I/O code.  */
-	  _IO_list_resetlock ();
-
-	  call_function_static_weak (__nss_database_fork_subprocess,
-				     &nss_database_data);
-	}
-
-      /* Reset the lock the dynamic loader uses to protect its data.  */
-      __rtld_lock_initialize (GL(dl_load_lock));
-
-      /* Reset the lock protecting dynamic TLS related data.  */
-      __rtld_lock_initialize (GL(dl_load_tls_lock));
-
-      reclaim_stacks ();
-
-      /* Run the handlers registered for the child.  */
-      __run_postfork_handlers (atfork_run_child, multiple_threads, lastrun);
-    }
-  else
-    {
-      /* If _Fork failed, preserve its errno value.  */
-      int save_errno = errno;
-
-      /* Release acquired locks in the multi-threaded case.  */
-      if (multiple_threads)
-	{
-	  /* Release malloc locks, parent process variant.  */
-	  call_function_static_weak (__malloc_fork_unlock_parent);
-
-	  /* We execute this even if the 'fork' call failed.  */
-	  _IO_list_unlock ();
-	}
+  state.lastrun = __fork_pre (state.multiple_threads, &nss_database_data);
 
-      /* Run the handlers registered for the parent.  */
-      __run_postfork_handlers (atfork_run_parent, multiple_threads, lastrun);
+  state.pid = _Fork ();
 
-      if (pid < 0)
-	__set_errno (save_errno);
-    }
+  __fork_post (&state, &nss_database_data);
 
-  return pid;
+  return state.pid;
 }
 weak_alias (__libc_fork, __fork)
 libc_hidden_def (__fork)
diff --git a/sysdeps/nptl/_Fork.c b/sysdeps/nptl/_Fork.c
index f8322ae557..aa99e05b5b 100644
--- a/sysdeps/nptl/_Fork.c
+++ b/sysdeps/nptl/_Fork.c
@@ -22,7 +22,7 @@
 pid_t
 _Fork (void)
 {
-  pid_t pid = arch_fork (&THREAD_SELF->tid);
+  pid_t pid = arch_fork (0, NULL, &THREAD_SELF->tid);
   if (pid == 0)
     {
       struct pthread *self = THREAD_SELF;
diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index 1bfd114d5d..1fc6785f28 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -490,6 +490,7 @@ sysdep_headers += \
 sysdep_routines += \
   getcpu \
   oldglob \
+  pidfd_fork \
   pidfd_spawn \
   pidfd_spawnp \
   sched_getcpu \
@@ -500,6 +501,8 @@ sysdep_routines += \
 tests += \
   tst-affinity \
   tst-affinity-pid \
+  tst-pidfd_fork \
+  tst-pidfd_fork-cgroup \
   tst-posix_spawn-setsid-pidfd \
   tst-spawn-cgroup \
   tst-spawn-chdir-pidfd \
diff --git a/sysdeps/unix/sysv/linux/Versions b/sysdeps/unix/sysv/linux/Versions
index 95ad896850..e9eecfecc0 100644
--- a/sysdeps/unix/sysv/linux/Versions
+++ b/sysdeps/unix/sysv/linux/Versions
@@ -324,6 +324,7 @@ libc {
   GLIBC_2.38 {
     posix_spawnattr_getcgroup_np;
     posix_spawnattr_setcgroup_np;
+    pidfd_fork;
     pidfd_spawn;
     pidfd_spawnp;
   }
diff --git a/sysdeps/unix/sysv/linux/aarch64/libc.abilist b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
index 26483dbc4e..e81e56f88c 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
@@ -2669,6 +2669,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/alpha/libc.abilist b/sysdeps/unix/sysv/linux/alpha/libc.abilist
index b2cafe896e..0640fd71b9 100644
--- a/sysdeps/unix/sysv/linux/alpha/libc.abilist
+++ b/sysdeps/unix/sysv/linux/alpha/libc.abilist
@@ -2778,6 +2778,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/arc/libc.abilist b/sysdeps/unix/sysv/linux/arc/libc.abilist
index 7138d480c1..4c9dd624ca 100644
--- a/sysdeps/unix/sysv/linux/arc/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arc/libc.abilist
@@ -2430,6 +2430,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/arch-fork.h b/sysdeps/unix/sysv/linux/arch-fork.h
index 0e0eccbf38..9e8a449e2c 100644
--- a/sysdeps/unix/sysv/linux/arch-fork.h
+++ b/sysdeps/unix/sysv/linux/arch-fork.h
@@ -32,24 +32,24 @@
    override it with one of the supported calling convention (check generic
    kernel-features.h for the clone abi variants).  */
 static inline pid_t
-arch_fork (void *ctid)
+arch_fork (int flags, void *ptid, void *ctid)
 {
-  const int flags = CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD;
   long int ret;
+  flags |= CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD;
 #ifdef __ASSUME_CLONE_BACKWARDS
 # ifdef INLINE_CLONE_SYSCALL
-  ret = INLINE_CLONE_SYSCALL (flags, 0, NULL, 0, ctid);
+  ret = INLINE_CLONE_SYSCALL (flags, 0, ptid, 0, ctid);
 # else
-  ret = INLINE_SYSCALL_CALL (clone, flags, 0, NULL, 0, ctid);
+  ret = INLINE_SYSCALL_CALL (clone, flags, 0, ptid, 0, ctid);
 # endif
 #elif defined(__ASSUME_CLONE_BACKWARDS2)
-  ret = INLINE_SYSCALL_CALL (clone, 0, flags, NULL, ctid, 0);
+  ret = INLINE_SYSCALL_CALL (clone, 0, flags, ptid, ctid, 0);
 #elif defined(__ASSUME_CLONE_BACKWARDS3)
-  ret = INLINE_SYSCALL_CALL (clone, flags, 0, 0, NULL, ctid, 0);
+  ret = INLINE_SYSCALL_CALL (clone, flags, 0, 0, ptid, ctid, 0);
 #elif defined(__ASSUME_CLONE2)
-  ret = INLINE_SYSCALL_CALL (clone2, flags, 0, 0, NULL, ctid, 0);
+  ret = INLINE_SYSCALL_CALL (clone2, flags, 0, 0, ptid, ctid, 0);
 #elif defined(__ASSUME_CLONE_DEFAULT)
-  ret = INLINE_SYSCALL_CALL (clone, flags, 0, NULL, ctid, 0);
+  ret = INLINE_SYSCALL_CALL (clone, flags, 0, ptid, ctid, 0);
 #else
 # error "Undefined clone variant"
 #endif
diff --git a/sysdeps/unix/sysv/linux/arm/be/libc.abilist b/sysdeps/unix/sysv/linux/arm/be/libc.abilist
index 4d92c041dd..e45af835ff 100644
--- a/sysdeps/unix/sysv/linux/arm/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/be/libc.abilist
@@ -550,6 +550,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/arm/le/libc.abilist b/sysdeps/unix/sysv/linux/arm/le/libc.abilist
index 8595044924..17abecc580 100644
--- a/sysdeps/unix/sysv/linux/arm/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/le/libc.abilist
@@ -547,6 +547,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/clone-internal.c b/sysdeps/unix/sysv/linux/clone-internal.c
index 790739cfce..6f5d65d98c 100644
--- a/sysdeps/unix/sysv/linux/clone-internal.c
+++ b/sysdeps/unix/sysv/linux/clone-internal.c
@@ -16,6 +16,7 @@
    License along with the GNU C Library.  If not, see
    <https://www.gnu.org/licenses/>.  */
 
+#include <arch-fork.h>
 #include <sysdep.h>
 #include <stddef.h>
 #include <errno.h>
@@ -43,6 +44,11 @@ _Static_assert (offsetofend (struct clone_args, cgroup) == CLONE_ARGS_SIZE_VER2,
 _Static_assert (sizeof (struct clone_args) == CLONE_ARGS_SIZE_VER2,
 		"sizeof (struct clone_args) != CLONE_ARGS_SIZE_VER2");
 
+#if !__ASSUME_CLONE3 && defined __NR_clone3
+/* Set to 0 if kernel does not support clone3 syscall.  */
+static int clone3_supported = 1;
+#endif
+
 int
 __clone_internal_fallback (struct clone_args *cl_args,
 			   int (*func) (void *arg), void *arg)
@@ -84,7 +90,6 @@ __clone3_internal (struct clone_args *cl_args, int (*func) (void *args),
 # if __ASSUME_CLONE3
   return __clone3 (cl_args, sizeof (*cl_args), func, arg);
 # else
-  static int clone3_supported = 1;
   if (atomic_load_relaxed (&clone3_supported) == 1)
     {
       int ret = __clone3 (cl_args, sizeof (*cl_args), func, arg);
@@ -118,3 +123,52 @@ __clone_internal (struct clone_args *cl_args,
 }
 
 libc_hidden_def (__clone_internal)
+
+int
+__clone_fork (uint64_t extra_flags, void *pidfd, int cgroup)
+{
+#ifdef __NR_clone3
+  struct clone_args clone_args =
+    {
+      .flags = extra_flags
+	       | CLONE_CHILD_SETTID
+	       | CLONE_CHILD_CLEARTID,
+      .exit_signal = SIGCHLD,
+      .cgroup = cgroup,
+      .child_tid = (uintptr_t) &THREAD_SELF->tid,
+      .pidfd = (uintptr_t) pidfd,
+      .parent_tid = (uintptr_t) pidfd
+    };
+#endif
+
+#if __ASSUME_CLONE3
+  return INLINE_SYSCALL_CALL (clone3, &clone_args, sizeof (clone_args));
+#else
+  /* Some architectures still do not export clone3.  */
+  pid_t pid;
+# ifdef __NR_clone3
+  if (atomic_load_relaxed (&clone3_supported) == 1)
+    {
+      pid = INLINE_SYSCALL_CALL (clone3, &clone_args, sizeof (clone_args));
+      if (pid != -1 || errno != ENOSYS)
+	return pid;
+
+      atomic_store_relaxed (&clone3_supported, 0);
+    }
+# endif
+
+  bool set_cgroup = cgroup != -1;
+  bool use_pidfd = pidfd != NULL;
+
+  if (!set_cgroup)
+    pid = arch_fork (use_pidfd ? CLONE_PIDFD : 0, pidfd, &THREAD_SELF->tid);
+  else
+    {
+      /* No fallback for POSIX_SPAWN_SETCGROUP if clone3 is not supported.  */
+      pid = -1;
+      if (errno == ENOSYS)
+	errno = ENOTSUP;
+    }
+  return pid;
+#endif
+}
diff --git a/sysdeps/unix/sysv/linux/csky/libc.abilist b/sysdeps/unix/sysv/linux/csky/libc.abilist
index 388db91231..360a60980d 100644
--- a/sysdeps/unix/sysv/linux/csky/libc.abilist
+++ b/sysdeps/unix/sysv/linux/csky/libc.abilist
@@ -2706,6 +2706,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/hppa/libc.abilist b/sysdeps/unix/sysv/linux/hppa/libc.abilist
index aa21ca135b..2aa63e2860 100644
--- a/sysdeps/unix/sysv/linux/hppa/libc.abilist
+++ b/sysdeps/unix/sysv/linux/hppa/libc.abilist
@@ -2655,6 +2655,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/i386/libc.abilist b/sysdeps/unix/sysv/linux/i386/libc.abilist
index 31d34bd2cc..2ee3d027ac 100644
--- a/sysdeps/unix/sysv/linux/i386/libc.abilist
+++ b/sysdeps/unix/sysv/linux/i386/libc.abilist
@@ -2839,6 +2839,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/ia64/libc.abilist b/sysdeps/unix/sysv/linux/ia64/libc.abilist
index 6b09d6bddb..262e0b3f59 100644
--- a/sysdeps/unix/sysv/linux/ia64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/ia64/libc.abilist
@@ -2604,6 +2604,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist b/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
index 65c5050c24..0e5b4da990 100644
--- a/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
@@ -2190,6 +2190,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
index 104f1d9e7d..b33dc8f04a 100644
--- a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
@@ -551,6 +551,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
index b1d44b697c..0b8bfb07d3 100644
--- a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
@@ -2782,6 +2782,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
index be7b0c59b9..d70ae3c2d3 100644
--- a/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
@@ -2755,6 +2755,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
index 44171c5bcc..c9dea106b8 100644
--- a/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
@@ -2752,6 +2752,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
index 672d142d2e..542d9b464e 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
@@ -2747,6 +2747,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
index 6a494ab102..5839437940 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
@@ -2745,6 +2745,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
index 38d3ed399d..3d5a63c979 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
@@ -2753,6 +2753,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
index 4f6f2040b9..3aa747d1b8 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
@@ -2655,6 +2655,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/nios2/libc.abilist b/sysdeps/unix/sysv/linux/nios2/libc.abilist
index dbdeab7a7a..ed7da52383 100644
--- a/sysdeps/unix/sysv/linux/nios2/libc.abilist
+++ b/sysdeps/unix/sysv/linux/nios2/libc.abilist
@@ -2794,6 +2794,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/or1k/libc.abilist b/sysdeps/unix/sysv/linux/or1k/libc.abilist
index 8801c867b1..e75c55efa0 100644
--- a/sysdeps/unix/sysv/linux/or1k/libc.abilist
+++ b/sysdeps/unix/sysv/linux/or1k/libc.abilist
@@ -2176,6 +2176,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/pidfd_fork.c b/sysdeps/unix/sysv/linux/pidfd_fork.c
new file mode 100644
index 0000000000..983f8ade98
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/pidfd_fork.c
@@ -0,0 +1,81 @@
+/* pidfd_fork - Duplicate the calling process and return a process file
+   descriptor.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <clone_internal.h>
+#include <fork-internal.h>
+#include <sys/pidfd.h>
+
+static pid_t
+forkfd (int *pidfd, int cgroup)
+{
+  bool use_pidfd = pidfd != NULL;
+  bool set_cgroup = cgroup != -1;
+
+  uint64_t extra_flags = (use_pidfd ? CLONE_PIDFD : 0)
+			 | (set_cgroup ? CLONE_INTO_CGROUP : 0);
+  pid_t pid = __clone_fork (extra_flags, use_pidfd ? pidfd : NULL,
+			    set_cgroup ? cgroup: 0);
+
+  if (pid == 0)
+    {
+      struct pthread *self = THREAD_SELF;
+
+      /* Initialize the robust mutex list; see the _Fork implementation for
+	 a full description of why this is required.  */
+#if __PTHREAD_MUTEX_HAVE_PREV
+      self->robust_prev = &self->robust_head;
+#endif
+      self->robust_head.list = &self->robust_head;
+      INTERNAL_SYSCALL_CALL (set_robust_list, &self->robust_head,
+			     sizeof (struct robust_list_head));
+    }
+  return pid;
+}
+
+pid_t
+pidfd_fork (int *pidfd, int cgroup, unsigned int flags)
+{
+  if (!__clone_pidfd_supported ())
+    return INLINE_SYSCALL_ERROR_RETURN_VALUE (ENOSYS);
+
+  if (flags & ~(PIDFDFORK_ASYNCSAFE))
+    return INLINE_SYSCALL_ERROR_RETURN_VALUE (EINVAL);
+
+  pid_t pid;
+  if (!(flags & PIDFDFORK_ASYNCSAFE))
+    {
+      bool multiple_threads = !SINGLE_THREAD_P;
+      struct fork_post_state_t state = {
+	  .multiple_threads = !SINGLE_THREAD_P
+      };
+      struct nss_database_data nss_database_data;
+
+      state.lastrun = __fork_pre (multiple_threads, &nss_database_data);
+      state.pid = forkfd (pidfd, cgroup);
+      /* It follows the usual fork semantics, where a positive or negative
+	 value is returned to the parent, and 0 to the child.  */
+      __fork_post (&state, &nss_database_data);
+
+      pid = state.pid;
+    }
+  else
+    pid = forkfd (pidfd, cgroup);
+
+  return pid;
+}
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
index d4927da36e..82eb6e1be0 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
@@ -2821,6 +2821,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
index 934ebcc495..c188bb00f2 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
@@ -2854,6 +2854,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
index 7dee513a82..f68077e425 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
@@ -2575,6 +2575,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
index 1f733560b9..8aa9ce6859 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
@@ -2889,6 +2889,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
index 6f0799c25a..3a1e55073a 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
@@ -2432,6 +2432,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
index c359dc2b29..312b0860b3 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
@@ -2632,6 +2632,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
index c49704b77e..b702ceb160 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
@@ -2819,6 +2819,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
index 389a451762..0d9f2c0ea7 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
@@ -2612,6 +2612,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/sh/be/libc.abilist b/sysdeps/unix/sysv/linux/sh/be/libc.abilist
index 01adcabee8..a99bd972e5 100644
--- a/sysdeps/unix/sysv/linux/sh/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/be/libc.abilist
@@ -2662,6 +2662,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/sh/le/libc.abilist b/sysdeps/unix/sysv/linux/sh/le/libc.abilist
index 83a4359dab..76fdafd7df 100644
--- a/sysdeps/unix/sysv/linux/sh/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/le/libc.abilist
@@ -2659,6 +2659,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
index ba2f588fc1..9201f21b4e 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
@@ -2814,6 +2814,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
index 2e63d242c6..5337df989d 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
@@ -2627,6 +2627,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/sys/pidfd.h b/sysdeps/unix/sysv/linux/sys/pidfd.h
index 342e593288..3e6d009ce7 100644
--- a/sysdeps/unix/sysv/linux/sys/pidfd.h
+++ b/sysdeps/unix/sysv/linux/sys/pidfd.h
@@ -46,4 +46,23 @@ extern int pidfd_getfd (int __pidfd, int __targetfd,
 extern int pidfd_send_signal (int __pidfd, int __sig, siginfo_t *__info,
 			      unsigned int __flags) __THROW;
 
+
+/* Do not run the pthread_atfork handlers on pidfd_fork.  */
+#define PIDFDFORK_ASYNCSAFE (1U << 1)
+
+/* Clone the calling process, creating an exact copy, and return a file
+   descriptor that can be used along with other pidfd functions.
+
+   The __CGROUP can be used to specify a different cgroup2 than the default
+   one.  This is done with the CLONE_INTO_CGROUP clone3 flag, and passing a
+   value of -1 disables it.  If clone3 is not supported the call will fail.
+
+   The __FLAGS can be used to specify whether to run pthread_atfork handlers
+   and reset internal states.  The default is to do both, similar to fork.
+
+   Returns -1 on error, 0 in the new process, and the process ID of the new
+   process in the parent process.  */
+extern pid_t pidfd_fork (int *__pidfd, int __cgroup, unsigned int __flags)
+  __THROW;
+
 #endif /* _PIDFD_H  */
diff --git a/sysdeps/unix/sysv/linux/tst-pidfd_fork-cgroup.c b/sysdeps/unix/sysv/linux/tst-pidfd_fork-cgroup.c
new file mode 100644
index 0000000000..124a70d7f5
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-pidfd_fork-cgroup.c
@@ -0,0 +1,162 @@
+/* pidfd_fork test using cgroupsv2.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <errno.h>
+#include <sched.h>
+#include <stdlib.h>
+#include <string.h>
+#include <support/check.h>
+#include <support/support.h>
+#include <support/xstdio.h>
+#include <support/xunistd.h>
+#include <support/temp_file.h>
+#include <sys/pidfd.h>
+#include <sys/vfs.h>
+#include <sys/wait.h>
+
+#include <dirent.h>
+
+#define CGROUPFS "/sys/fs/cgroup/"
+#ifndef CGROUP2_SUPER_MAGIC
+# define CGROUP2_SUPER_MAGIC 0x63677270
+#endif
+
+#define F_TYPE_EQUAL(a, b) ((a) == (typeof (a)) (b))
+
+static inline char *
+startswith (const char *s, const char *prefix)
+{
+  size_t l = strlen (prefix);
+  if (strncmp (s, prefix, l) == 0)
+    return (char *) s + l;
+  return NULL;
+}
+
+static char *
+get_cgroup (void)
+{
+  FILE *f = xfopen ("/proc/self/cgroup", "re");
+
+  char *cgroup = NULL;
+
+  char *line = NULL;
+  size_t linesiz = 0;
+  while (xgetline (&line, &linesiz, f) > 0)
+    {
+      char *entry = startswith (line, "0:");
+      if (entry == NULL)
+	continue;
+
+      entry = strchr (entry, ':');
+      if (entry == NULL)
+	continue;
+
+      cgroup = entry + 1;
+      size_t l = strlen (cgroup);
+      if (cgroup[l - 1] == '\n')
+	cgroup[l - 1] = '\0';
+
+      cgroup = xstrdup (entry + 1);
+      break;
+    }
+
+  xfclose (f);
+  free (line);
+
+  return cgroup;
+}
+
+static int
+do_test (void)
+{
+  struct statfs fs;
+  if (statfs (CGROUPFS, &fs) < 0)
+    {
+      if (errno == ENOENT)
+	FAIL_UNSUPPORTED ("no cgroupv2 mount found");
+      FAIL_EXIT1 ("statfs (%s): %m\n", CGROUPFS);
+    }
+
+  if (!F_TYPE_EQUAL (fs.f_type, CGROUP2_SUPER_MAGIC))
+    FAIL_UNSUPPORTED ("%s is not a cgroupv2", CGROUPFS);
+
+  char *cgroup = get_cgroup ();
+  TEST_VERIFY_EXIT (cgroup != NULL);
+  char *newcgroup = xasprintf ("%s/%s", cgroup, "test-pidfd_fork-cgroup");
+  char *cgpath = xasprintf ("%s%s/test-pidfd_fork-cgroup", CGROUPFS, cgroup);
+  free (cgroup);
+
+  if (mkdir (cgpath, 0755) == -1 && errno != EEXIST)
+    {
+      if (errno == EACCES || errno == EPERM)
+	FAIL_UNSUPPORTED ("cannot create a new cgroupv2 group");
+      FAIL_EXIT1 ("mkdir (%s): %m", cgpath);
+    }
+  add_temp_file (cgpath);
+
+  int dfd = xopen (cgpath, O_DIRECTORY | O_RDONLY | O_CLOEXEC, 0666);
+
+  /* Check that the cgroup the kernel reports for the child is the one
+     requested at creation, and not the parent's cgroup.  */
+  {
+    pid_t pid = pidfd_fork (NULL, dfd, 0);
+    if (pid == -1 && errno == ENOSYS)
+      FAIL_UNSUPPORTED ("kernel does not support CLONE_PIDFD clone flag");
+    TEST_VERIFY_EXIT (pid != -1);
+    if (pid == 0)
+      {
+	char *child_cgroup = get_cgroup ();
+	TEST_VERIFY_EXIT (child_cgroup != NULL);
+	TEST_COMPARE_STRING (newcgroup, child_cgroup);
+	_exit (EXIT_SUCCESS);
+      }
+
+    siginfo_t sinfo;
+    TEST_COMPARE (waitid (P_PID, pid, &sinfo, WEXITED), 0);
+    TEST_COMPARE (sinfo.si_signo, SIGCHLD);
+    TEST_COMPARE (sinfo.si_code, CLD_EXITED);
+    TEST_COMPARE (sinfo.si_status, 0);
+  }
+
+  /* Same as before, but also request and wait on a process file descriptor.  */
+  {
+    int pidfd;
+    pid_t pid = pidfd_fork (&pidfd, dfd, 0);
+    TEST_VERIFY_EXIT (pid != -1);
+    if (pid == 0)
+      {
+	char *child_cgroup = get_cgroup ();
+	TEST_VERIFY_EXIT (child_cgroup != NULL);
+	TEST_COMPARE_STRING (newcgroup, child_cgroup);
+	_exit (EXIT_SUCCESS);
+      }
+
+    siginfo_t sinfo;
+    TEST_COMPARE (waitid (P_PIDFD, pidfd, &sinfo, WEXITED), 0);
+    TEST_COMPARE (sinfo.si_signo, SIGCHLD);
+    TEST_COMPARE (sinfo.si_code, CLD_EXITED);
+    TEST_COMPARE (sinfo.si_status, 0);
+  }
+
+  free (cgpath);
+  free (newcgroup);
+
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/unix/sysv/linux/tst-pidfd_fork.c b/sysdeps/unix/sysv/linux/tst-pidfd_fork.c
new file mode 100644
index 0000000000..3e09c55d54
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-pidfd_fork.c
@@ -0,0 +1,186 @@
+/* Basic tests for pidfd_fork.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <array_length.h>
+#include <errno.h>
+#include <pthread.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <support/check.h>
+#include <support/temp_file.h>
+#include <support/xunistd.h>
+#include <sys/pidfd.h>
+#include <sys/wait.h>
+
+#define SIG_PID_EXIT_CODE 20
+
+static bool atfork_prepare_var;
+static bool atfork_parent_var;
+static bool atfork_child_var;
+
+static void
+atfork_prepare (void)
+{
+  atfork_prepare_var = true;
+}
+
+static void
+atfork_parent (void)
+{
+  atfork_parent_var = true;
+}
+
+static void
+atfork_child (void)
+{
+  atfork_child_var = true;
+}
+
+static int
+singlethread_test (unsigned int flags, bool wait_with_pid)
+{
+  const char testdata1[] = "abcdefghijklmnopqrtuvwxz";
+  enum { testdatalen1 = array_length (testdata1) };
+  const char testdata2[] = "01234567890";
+  enum { testdatalen2 = array_length (testdata2) };
+
+  pid_t ppid = getpid ();
+
+  int tempfd = create_temp_file ("tst-pidfd_fork", NULL);
+
+  /* Check that the open file is shared between the processes by reading and
+     writing some data in both the parent and the child.  */
+  xwrite (tempfd, testdata1, testdatalen1);
+  off_t off = xlseek (tempfd, 0, SEEK_CUR);
+  TEST_COMPARE (off, testdatalen1);
+
+  int pidfd;
+  pid_t pid = pidfd_fork (&pidfd, -1, flags);
+  TEST_VERIFY_EXIT (pid != -1);
+
+  if (pid == 0)
+    {
+      if (flags & PIDFDFORK_ASYNCSAFE)
+	TEST_VERIFY (!atfork_child_var);
+      else
+	TEST_VERIFY (atfork_child_var);
+
+      TEST_VERIFY_EXIT (getpid () != ppid);
+      TEST_COMPARE (getppid (), ppid);
+
+      TEST_COMPARE (xlseek (tempfd, 0, SEEK_CUR), testdatalen1);
+
+      xlseek (tempfd, 0, SEEK_SET);
+      char buf[testdatalen1];
+      TEST_COMPARE (read (tempfd, buf, sizeof (buf)), testdatalen1);
+      TEST_COMPARE_BLOB (buf, testdatalen1, testdata1, testdatalen1);
+
+      xlseek (tempfd, 0, SEEK_SET);
+      xwrite (tempfd, testdata2, testdatalen2);
+
+      xclose (tempfd);
+
+      _exit (EXIT_SUCCESS);
+    }
+
+  {
+    siginfo_t sinfo;
+    if (wait_with_pid)
+      TEST_COMPARE (waitid (P_PID, pid, &sinfo, WEXITED), 0);
+    else
+      TEST_COMPARE (waitid (P_PIDFD, pidfd, &sinfo, WEXITED), 0);
+    TEST_COMPARE (sinfo.si_signo, SIGCHLD);
+    TEST_COMPARE (sinfo.si_code, CLD_EXITED);
+    TEST_COMPARE (sinfo.si_status, 0);
+  }
+
+  TEST_COMPARE (xlseek (tempfd, 0, SEEK_CUR), testdatalen2);
+
+  xlseek (tempfd, 0, SEEK_SET);
+  char buf[testdatalen2];
+  TEST_COMPARE (read (tempfd, buf, sizeof (buf)), testdatalen2);
+
+  TEST_COMPARE_BLOB (buf, testdatalen2, testdata2, testdatalen2);
+
+  return 0;
+}
+
+static int
+do_test (void)
+{
+  /* Sanity check for pidfd support and check that passing NULL as the
+     argument makes pidfd_fork act as fork.  */
+  {
+    pid_t pid = pidfd_fork (NULL, -1, 0);
+    if (pid == -1 && errno == ENOSYS)
+      FAIL_UNSUPPORTED ("kernel does not support CLONE_PIDFD clone flag");
+    TEST_VERIFY_EXIT (pid != -1);
+    if (pid == 0)
+      _exit (EXIT_SUCCESS);
+
+    siginfo_t sinfo;
+    TEST_COMPARE (waitid (P_PID, pid, &sinfo, WEXITED), 0);
+    TEST_COMPARE (sinfo.si_signo, SIGCHLD);
+    TEST_COMPARE (sinfo.si_code, CLD_EXITED);
+    TEST_COMPARE (sinfo.si_status, 0);
+  }
+
+  pthread_atfork (atfork_prepare, atfork_parent, atfork_child);
+
+  /* With default flags, pidfd_fork acts as fork and runs the pthread_atfork
+     handlers.  */
+  {
+    atfork_prepare_var = atfork_parent_var = atfork_child_var = false;
+    singlethread_test (0, false);
+    TEST_VERIFY (atfork_prepare_var);
+    TEST_VERIFY (atfork_parent_var);
+    TEST_VERIFY (!atfork_child_var);
+  }
+
+  /* Same as before, but wait using the PID instead of the pidfd.  */
+  {
+    atfork_prepare_var = atfork_parent_var = atfork_child_var = false;
+    singlethread_test (0, true);
+    TEST_VERIFY (atfork_prepare_var);
+    TEST_VERIFY (atfork_parent_var);
+    TEST_VERIFY (!atfork_child_var);
+  }
+
+  /* With PIDFDFORK_ASYNCSAFE, pidfd_fork acts as _Fork.  */
+  {
+    atfork_prepare_var = atfork_parent_var = atfork_child_var = false;
+    pthread_atfork (atfork_prepare, atfork_parent, atfork_child);
+    singlethread_test (PIDFDFORK_ASYNCSAFE, false);
+    TEST_VERIFY (!atfork_prepare_var);
+    TEST_VERIFY (!atfork_parent_var);
+    TEST_VERIFY (!atfork_child_var);
+  }
+
+  {
+    atfork_prepare_var = atfork_parent_var = atfork_child_var = false;
+    pthread_atfork (atfork_prepare, atfork_parent, atfork_child);
+    singlethread_test (PIDFDFORK_ASYNCSAFE, true);
+    TEST_VERIFY (!atfork_prepare_var);
+    TEST_VERIFY (!atfork_parent_var);
+    TEST_VERIFY (!atfork_child_var);
+  }
+
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
index f7d5b23888..fa0ffd975f 100644
--- a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
@@ -2578,6 +2578,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
index ddc1f04eb1..cf4d2b2573 100644
--- a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
@@ -2684,6 +2684,7 @@ GLIBC_2.38 __strlcat_chk F
 GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
+GLIBC_2.38 pidfd_fork F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
-- 
2.34.1



* [PATCH v6 4/5] posix: Add PIDFDFORK_NOSIGCHLD for pidfd_fork
  2023-07-05 20:43 [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation Adhemerval Zanella
                   ` (2 preceding siblings ...)
  2023-07-05 20:43 ` [PATCH v6 3/5] posix: Add pidfd_fork (BZ 26371) Adhemerval Zanella
@ 2023-07-05 20:43 ` Adhemerval Zanella
  2023-07-05 20:43 ` [PATCH v6 5/5] linux: Add pidfd_getpid Adhemerval Zanella
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Adhemerval Zanella @ 2023-07-05 20:43 UTC (permalink / raw)
  To: libc-alpha, Luca Boccassi, Philip Withnall

The new PIDFDFORK_NOSIGCHLD flag clones the process without setting
SIGCHLD as the termination signal.  When using this flag, the parent
process must specify the Linux-specific __WALL or __WCLONE option when
waiting for the child with waitpid or waitid.

Checked on x86_64-linux-gnu and i686-linux-gnu.
---
 include/clone_internal.h                 |  3 +-
 manual/process.texi                      |  6 ++++
 sysdeps/nptl/_Fork.c                     |  2 +-
 sysdeps/unix/sysv/linux/arch-fork.h      |  2 +-
 sysdeps/unix/sysv/linux/clone-internal.c | 10 ++++--
 sysdeps/unix/sysv/linux/pidfd_fork.c     | 13 +++----
 sysdeps/unix/sysv/linux/sys/pidfd.h      |  2 ++
 sysdeps/unix/sysv/linux/tst-pidfd_fork.c | 45 ++++++++++++++++++++++--
 8 files changed, 69 insertions(+), 14 deletions(-)

diff --git a/include/clone_internal.h b/include/clone_internal.h
index d9b5509f78..4ec0c9198f 100644
--- a/include/clone_internal.h
+++ b/include/clone_internal.h
@@ -48,7 +48,8 @@ extern int __clone_internal_fallback (struct clone_args *__cl_args,
 
    It does not provide CLONE_INTO_CGROUP/CGROUP fallback if clone3 is not
    supported, in this case the function returns -1/ENOTSUP.  */
-extern int __clone_fork (uint64_t __extra_flags, void *__pidfd, int __cgroup)
+extern int __clone_fork (uint64_t __extra_flags, void *__pidfd, int __cgroup,
+			 bool nosigchld)
      attribute_hidden;
 
 /* Return whether the kernel supports pid file descriptor, including clone
diff --git a/manual/process.texi b/manual/process.texi
index a656df425b..c60701aeb8 100644
--- a/manual/process.texi
+++ b/manual/process.texi
@@ -390,6 +390,12 @@ following flags:
 Acts as @code{_Fork}, where it does not invoke any callbacks registered with
 @code{pthread_atfork}, nor does it reset internal state or locks (such as the
 @code{malloc} locks).
+
+@item PIDFDFORK_NOSIGCHLD
+Do not send a @code{SIGCHLD} termination signal when the child terminates.
+@strong{NB:} When using this flag, the parent process must specify the
+@code{__WALL} or @code{__WCLONE} options when waiting for the child with
+@code{waitpid} or @code{waitid}.
 @end table
 @end deftypefun
 
diff --git a/sysdeps/nptl/_Fork.c b/sysdeps/nptl/_Fork.c
index aa99e05b5b..397f059fb0 100644
--- a/sysdeps/nptl/_Fork.c
+++ b/sysdeps/nptl/_Fork.c
@@ -22,7 +22,7 @@
 pid_t
 _Fork (void)
 {
-  pid_t pid = arch_fork (0, NULL, &THREAD_SELF->tid);
+  pid_t pid = arch_fork (SIGCHLD, NULL, &THREAD_SELF->tid);
   if (pid == 0)
     {
       struct pthread *self = THREAD_SELF;
diff --git a/sysdeps/unix/sysv/linux/arch-fork.h b/sysdeps/unix/sysv/linux/arch-fork.h
index 9e8a449e2c..f978d4c4f4 100644
--- a/sysdeps/unix/sysv/linux/arch-fork.h
+++ b/sysdeps/unix/sysv/linux/arch-fork.h
@@ -35,7 +35,7 @@ static inline pid_t
 arch_fork (int flags, void *ptid, void *ctid)
 {
   long int ret;
-  flags |= CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD;
+  flags |= CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID;
 #ifdef __ASSUME_CLONE_BACKWARDS
 # ifdef INLINE_CLONE_SYSCALL
   ret = INLINE_CLONE_SYSCALL (flags, 0, ptid, 0, ctid);
diff --git a/sysdeps/unix/sysv/linux/clone-internal.c b/sysdeps/unix/sysv/linux/clone-internal.c
index 6f5d65d98c..49916bb15f 100644
--- a/sysdeps/unix/sysv/linux/clone-internal.c
+++ b/sysdeps/unix/sysv/linux/clone-internal.c
@@ -125,7 +125,7 @@ __clone_internal (struct clone_args *cl_args,
 libc_hidden_def (__clone_internal)
 
 int
-__clone_fork (uint64_t extra_flags, void *pidfd, int cgroup)
+__clone_fork (uint64_t extra_flags, void *pidfd, int cgroup, bool nosigchld)
 {
 #ifdef __NR_clone3
   struct clone_args clone_args =
@@ -133,7 +133,7 @@ __clone_fork (uint64_t extra_flags, void *pidfd, int cgroup)
       .flags = extra_flags
 	       | CLONE_CHILD_SETTID
 	       | CLONE_CHILD_CLEARTID,
-      .exit_signal = SIGCHLD,
+      .exit_signal = nosigchld ? 0 : SIGCHLD,
       .cgroup = cgroup,
       .child_tid = (uintptr_t) &THREAD_SELF->tid,
       .pidfd = (uintptr_t) pidfd,
@@ -161,7 +161,11 @@ __clone_fork (uint64_t extra_flags, void *pidfd, int cgroup)
   bool use_pidfd = pidfd != NULL;
 
   if (!set_cgroup)
-    pid = arch_fork (use_pidfd ? CLONE_PIDFD : 0, pidfd, &THREAD_SELF->tid);
+    {
+      int extra_flags = (use_pidfd ? CLONE_PIDFD : 0)
+			| (nosigchld ? 0 : SIGCHLD);
+      pid = arch_fork (extra_flags, pidfd, &THREAD_SELF->tid);
+    }
   else
     {
       /* No fallback for POSIX_SPAWN_SETCGROUP if clone3 is not supported.  */
diff --git a/sysdeps/unix/sysv/linux/pidfd_fork.c b/sysdeps/unix/sysv/linux/pidfd_fork.c
index 983f8ade98..f3b6b74375 100644
--- a/sysdeps/unix/sysv/linux/pidfd_fork.c
+++ b/sysdeps/unix/sysv/linux/pidfd_fork.c
@@ -22,7 +22,7 @@
 #include <sys/pidfd.h>
 
 static pid_t
-forkfd (int *pidfd, int cgroup)
+forkfd (int *pidfd, int cgroup, bool nosigchld)
 {
   bool use_pidfd = pidfd != NULL;
   bool set_cgroup = cgroup != -1;
@@ -30,8 +30,7 @@ forkfd (int *pidfd, int cgroup)
   uint64_t extra_flags = (use_pidfd ? CLONE_PIDFD : 0)
 			 | (set_cgroup ? CLONE_INTO_CGROUP : 0);
   pid_t pid = __clone_fork (extra_flags, use_pidfd ? pidfd : NULL,
-			    set_cgroup ? cgroup: 0);
-
+			    set_cgroup ? cgroup: 0, nosigchld);
   if (pid == 0)
     {
       struct pthread *self = THREAD_SELF;
@@ -54,9 +53,11 @@ pidfd_fork (int *pidfd, int cgroup, unsigned int flags)
   if (!__clone_pidfd_supported ())
     return INLINE_SYSCALL_ERROR_RETURN_VALUE (ENOSYS);
 
-  if (flags & ~(PIDFDFORK_ASYNCSAFE))
+  if (flags & ~(PIDFDFORK_ASYNCSAFE | PIDFDFORK_NOSIGCHLD))
     return INLINE_SYSCALL_ERROR_RETURN_VALUE (EINVAL);
 
+  bool nosigchld = flags & PIDFDFORK_NOSIGCHLD;
+
   pid_t pid;
   if (!(flags & PIDFDFORK_ASYNCSAFE))
     {
@@ -67,7 +68,7 @@ pidfd_fork (int *pidfd, int cgroup, unsigned int flags)
       struct nss_database_data nss_database_data;
 
       state.lastrun = __fork_pre (multiple_threads, &nss_database_data);
-      state.pid = forkfd (pidfd, cgroup);
+      state.pid = forkfd (pidfd, cgroup, nosigchld);
       /* It follow the usual fork semantic, where a positive or negative
 	 value is returned to parent, and 0 for the child.  */
       __fork_post (&state, &nss_database_data);
@@ -75,7 +76,7 @@ pidfd_fork (int *pidfd, int cgroup, unsigned int flags)
       pid = state.pid;
     }
   else
-    pid = forkfd (pidfd, cgroup);
+    pid = forkfd (pidfd, cgroup, nosigchld);
 
   return pid;
 }
diff --git a/sysdeps/unix/sysv/linux/sys/pidfd.h b/sysdeps/unix/sysv/linux/sys/pidfd.h
index 3e6d009ce7..87095212a7 100644
--- a/sysdeps/unix/sysv/linux/sys/pidfd.h
+++ b/sysdeps/unix/sysv/linux/sys/pidfd.h
@@ -49,6 +49,8 @@ extern int pidfd_send_signal (int __pidfd, int __sig, siginfo_t *__info,
 
 /* Do not issue the pthread_atfork on pidfd_fork.  */
 #define PIDFDFORK_ASYNCSAFE (1U << 1)
+/* Do not send a SIGCHLD termination signal.  */
+#define PIDFDFORK_NOSIGCHLD (1U << 2)
 
 /* Clone the calling process, creating an exact copy and return a file
    descriptor that can be used along other pidfd functions.
diff --git a/sysdeps/unix/sysv/linux/tst-pidfd_fork.c b/sysdeps/unix/sysv/linux/tst-pidfd_fork.c
index 3e09c55d54..ee3a72ba5d 100644
--- a/sysdeps/unix/sysv/linux/tst-pidfd_fork.c
+++ b/sysdeps/unix/sysv/linux/tst-pidfd_fork.c
@@ -24,6 +24,7 @@
 #include <support/check.h>
 #include <support/temp_file.h>
 #include <support/xunistd.h>
+#include <support/xsignal.h>
 #include <sys/pidfd.h>
 #include <sys/wait.h>
 
@@ -33,6 +34,14 @@ static bool atfork_prepare_var;
 static bool atfork_parent_var;
 static bool atfork_child_var;
 
+static volatile sig_atomic_t sigchld_called;
+
+static void
+sigchld_handler (int sig)
+{
+  sigchld_called = 1;
+}
+
 static void
 atfork_prepare (void)
 {
@@ -69,6 +78,9 @@ singlethread_test (unsigned int flags, bool wait_with_pid)
   off_t off = xlseek (tempfd, 0, SEEK_CUR);
   TEST_COMPARE (off, testdatalen1);
 
+  bool check_nosigchld = flags & PIDFDFORK_NOSIGCHLD;
+  sigchld_called = 0;
+
   int pidfd;
   pid_t pid = pidfd_fork (&pidfd, -1, flags);
   TEST_VERIFY_EXIT (pid != -1);
@@ -100,13 +112,18 @@ singlethread_test (unsigned int flags, bool wait_with_pid)
 
   {
     siginfo_t sinfo;
+    int options = WEXITED | (check_nosigchld ? __WCLONE : 0);
     if (wait_with_pid)
-      TEST_COMPARE (waitid (P_PID, pid, &sinfo, WEXITED), 0);
+      TEST_COMPARE (waitid (P_PID, pid, &sinfo, options), 0);
     else
-      TEST_COMPARE (waitid (P_PIDFD, pidfd, &sinfo, WEXITED), 0);
+      TEST_COMPARE (waitid (P_PIDFD, pidfd, &sinfo, options), 0);
     TEST_COMPARE (sinfo.si_signo, SIGCHLD);
     TEST_COMPARE (sinfo.si_code, CLD_EXITED);
     TEST_COMPARE (sinfo.si_status, 0);
+
+    /* If PIDFDFORK_NOSIGCHLD is specified no SIGCHLD should be sent by the
+       kernel.  */
+    TEST_COMPARE (sigchld_called, check_nosigchld ? 0 : 1);
   }
 
   TEST_COMPARE (xlseek (tempfd, 0, SEEK_CUR), testdatalen2);
@@ -140,6 +157,14 @@ do_test (void)
     TEST_COMPARE (sinfo.si_status, 0);
   }
 
+  {
+    struct sigaction sa;
+    sa.sa_handler = sigchld_handler;
+    sa.sa_flags = 0;
+    sigemptyset (&sa.sa_mask);
+    xsigaction (SIGCHLD, &sa, NULL);
+  }
+
   pthread_atfork (atfork_prepare, atfork_parent, atfork_child);
 
   /* With default flags, pidfd_fork acts as fork and run the pthread_atfork
@@ -161,6 +186,14 @@ do_test (void)
     TEST_VERIFY (!atfork_child_var);
   }
 
+  {
+    atfork_prepare_var = atfork_parent_var = atfork_child_var = false;
+    singlethread_test (PIDFDFORK_NOSIGCHLD, false);
+    TEST_VERIFY (atfork_prepare_var);
+    TEST_VERIFY (atfork_parent_var);
+    TEST_VERIFY (!atfork_child_var);
+  }
+
   /* With PIDFDFORK_ASYNCSAFE, pidfd_fork acts as _Fork.  */
   {
     atfork_prepare_var = atfork_parent_var = atfork_child_var = false;
@@ -180,6 +213,14 @@ do_test (void)
     TEST_VERIFY (!atfork_child_var);
   }
 
+  {
+    atfork_prepare_var = atfork_parent_var = atfork_child_var = false;
+    singlethread_test (PIDFDFORK_NOSIGCHLD | PIDFDFORK_ASYNCSAFE, true);
+    TEST_VERIFY (!atfork_prepare_var);
+    TEST_VERIFY (!atfork_parent_var);
+    TEST_VERIFY (!atfork_child_var);
+  }
+
   return 0;
 }
 
-- 
2.34.1



* [PATCH v6 5/5] linux: Add pidfd_getpid
  2023-07-05 20:43 [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation Adhemerval Zanella
                   ` (3 preceding siblings ...)
  2023-07-05 20:43 ` [PATCH v6 4/5] posix: Add PIDFDFORK_NOSIGCHLD for pidfd_fork Adhemerval Zanella
@ 2023-07-05 20:43 ` Adhemerval Zanella
  2023-07-05 20:56 ` [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation Luca Boccassi
  2023-07-06 12:35 ` Adhemerval Zanella Netto
  6 siblings, 0 replies; 8+ messages in thread
From: Adhemerval Zanella @ 2023-07-05 20:43 UTC (permalink / raw)
  To: libc-alpha, Luca Boccassi, Philip Withnall

This interface obtains the process ID associated with a process file
descriptor.  It is done by parsing the procfs fdinfo information.  Its
prototype is:

   pid_t pidfd_getpid (int fd)
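
As a rough illustration of the parsing step, the sketch below extracts the
value of a "Pid:" line without strtol, so the result does not depend on the
locale.  It is not the code from the patch: the helper name parse_pid_line is
hypothetical, and it assumes the line has already been split off and null
terminated.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: parse one fdinfo line.  Returns true and stores the
   value in *OUT only if LINE is a well-formed "Pid:" entry.  */
static bool
parse_pid_line (const char *line, int *out)
{
  assert (line != NULL && out != NULL);

  if (strncmp (line, "Pid:", 4) != 0)
    return false;
  line += 4;

  /* Skip blanks between the field name and the value.  */
  while (*line == ' ' || *line == '\t')
    line++;

  bool neg = false;
  if (*line == '-' || *line == '+')
    neg = *line++ == '-';

  if (*line == '\0')
    return false;

  int n = 0;
  for (; *line != '\0'; line++)
    {
      if (*line < '0' || *line > '9')
        return false;
      int digit = *line - '0';
      /* Reject values that do not fit in an int.  */
      if (n > (INT_MAX - digit) / 10)
        return false;
      n = n * 10 + digit;
    }

  /* The only negative value assumed valid here is -1.  */
  if (neg && n != 1)
    return false;

  *out = neg ? -n : n;
  return true;
}
```

Lines for other fields, trailing garbage, and out-of-range values are all
rejected, so a caller can keep scanning until a valid entry shows up.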

It returns the associated pid, or -1 in case of an error with errno set
accordingly.  The possible errno values are those from open, read, and
close (used on procfs parsing), along with:

   - EBADF if the FD is negative, does not have a PID associated, or
     if the fdinfo field contains a value that does not fit in pid_t.

   - EREMOTE if the PID is in a separate namespace.

   - ESRCH if the process is already terminated.
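
The error cases listed above amount to a small classification of the parsed
"Pid:" value.  The sketch below is only an illustration under that reading;
classify_pidfd_value is a hypothetical name, and FOUND stands for whether a
valid "Pid:" line was seen.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical helper mirroring the error cases in the description.
   FOUND tells whether a valid "Pid:" line was parsed and VALUE is its
   content.  Returns the PID, or -1 with errno set.  */
static pid_t
classify_pidfd_value (bool found, int value)
{
  /* Assumed precondition: the parser only lets -1 through as a negative.  */
  assert (!found || value >= -1);

  if (!found)
    {
      /* Not a pidfd, or the field could not be parsed.  */
      errno = EBADF;
      return -1;
    }
  if (value == 0)
    {
      /* The PID is in a separate namespace.  */
      errno = EREMOTE;
      return -1;
    }
  if (value < 0)
    {
      /* -1 indicates that the process is terminated.  */
      errno = ESRCH;
      return -1;
    }
  return value;
}
```

A caller that needs to tell the cases apart checks errno only after seeing
the -1 return value.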

Checked on x86_64-linux-gnu on Linux 4.15 (no CLONE_PIDFD or waitid
support), Linux 5.15 (only clone support), and Linux 5.19 (full
support including clone3).
---
 NEWS                                          |   4 +
 manual/process.texi                           |  31 +++
 sysdeps/unix/sysv/linux/Makefile              |   3 +
 sysdeps/unix/sysv/linux/Versions              |   1 +
 sysdeps/unix/sysv/linux/aarch64/libc.abilist  |   1 +
 sysdeps/unix/sysv/linux/alpha/libc.abilist    |   1 +
 sysdeps/unix/sysv/linux/arc/libc.abilist      |   1 +
 sysdeps/unix/sysv/linux/arm/be/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/arm/le/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/csky/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/hppa/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/i386/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/ia64/libc.abilist     |   1 +
 .../sysv/linux/loongarch/lp64/libc.abilist    |   1 +
 .../sysv/linux/m68k/coldfire/libc.abilist     |   1 +
 .../unix/sysv/linux/m68k/m680x0/libc.abilist  |   1 +
 .../sysv/linux/microblaze/be/libc.abilist     |   1 +
 .../sysv/linux/microblaze/le/libc.abilist     |   1 +
 .../sysv/linux/mips/mips32/fpu/libc.abilist   |   1 +
 .../sysv/linux/mips/mips32/nofpu/libc.abilist |   1 +
 .../sysv/linux/mips/mips64/n32/libc.abilist   |   1 +
 .../sysv/linux/mips/mips64/n64/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/nios2/libc.abilist    |   1 +
 sysdeps/unix/sysv/linux/or1k/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/pidfd_getpid.c        | 122 ++++++++++++
 .../linux/powerpc/powerpc32/fpu/libc.abilist  |   1 +
 .../powerpc/powerpc32/nofpu/libc.abilist      |   1 +
 .../linux/powerpc/powerpc64/be/libc.abilist   |   1 +
 .../linux/powerpc/powerpc64/le/libc.abilist   |   1 +
 sysdeps/unix/sysv/linux/procutils.c           | 104 ++++++++++
 sysdeps/unix/sysv/linux/procutils.h           |  35 ++++
 .../unix/sysv/linux/riscv/rv32/libc.abilist   |   1 +
 .../unix/sysv/linux/riscv/rv64/libc.abilist   |   1 +
 .../unix/sysv/linux/s390/s390-32/libc.abilist |   1 +
 .../unix/sysv/linux/s390/s390-64/libc.abilist |   1 +
 sysdeps/unix/sysv/linux/sh/be/libc.abilist    |   1 +
 sysdeps/unix/sysv/linux/sh/le/libc.abilist    |   1 +
 .../sysv/linux/sparc/sparc32/libc.abilist     |   1 +
 .../sysv/linux/sparc/sparc64/libc.abilist     |   1 +
 sysdeps/unix/sysv/linux/sys/pidfd.h           |   4 +
 sysdeps/unix/sysv/linux/tst-pidfd.c           |  47 +++++
 sysdeps/unix/sysv/linux/tst-pidfd_getpid.c    | 187 ++++++++++++++++++
 .../unix/sysv/linux/x86_64/64/libc.abilist    |   1 +
 .../unix/sysv/linux/x86_64/x32/libc.abilist   |   1 +
 44 files changed, 572 insertions(+)
 create mode 100644 sysdeps/unix/sysv/linux/pidfd_getpid.c
 create mode 100644 sysdeps/unix/sysv/linux/procutils.c
 create mode 100644 sysdeps/unix/sysv/linux/procutils.h
 create mode 100644 sysdeps/unix/sysv/linux/tst-pidfd_getpid.c

diff --git a/NEWS b/NEWS
index 462d8f511d..d8ba4ba62e 100644
--- a/NEWS
+++ b/NEWS
@@ -66,6 +66,10 @@ Major new features:
   of return a process ID, it returns a file descriptor that can be used
   along other pidfd functions.
 
+* On Linux, the pidfd_getpid function has been added.  It retrieves the
+  process ID associated with a process file descriptor created with
+  pidfd_spawn, pidfd_fork, or pidfd_open.
+
 Deprecated and removed features, and other changes affecting compatibility:
 
 * In the Linux kernel for the hppa/parisc architecture some of the
diff --git a/manual/process.texi b/manual/process.texi
index c60701aeb8..a74f316ddc 100644
--- a/manual/process.texi
+++ b/manual/process.texi
@@ -33,6 +33,7 @@ primitive functions to do each step individually instead.
 * Process Creation Concepts::   An overview of the hard way to do it.
 * Process Identification::      How to get the process ID of a process.
 * Creating a Process::          How to fork a child process.
+* Querying a Process::          How to query a child process.
 * Executing a File::            How to make a process execute another program.
 * Process Completion::          How to tell when a child process has completed.
 * Process Completion Status::   How to interpret the status value
@@ -401,6 +402,36 @@ Do not send a @code{SIGCHLD} termination signal when child terminates.
 
 This function is specific to Linux.
 
+@node Querying a Process
+@section Querying a Process
+
+The file descriptor returned by the @code{pidfd_fork} function can be used to
+query extra information about the process.
+
+@deftypefun pid_t pidfd_getpid (int @var{fd})
+@standards{GNU, sys/pidfd.h}
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+
+The @code{pidfd_getpid} function retrieves the process ID associated with a
+process file descriptor created with @code{pidfd_spawn}, @code{pidfd_fork}, or
+@code{pidfd_open}.
+
+If the operation fails, @code{pidfd_getpid} returns @code{-1} and the following
+@code{errno} error conditions are defined:
+
+@table @code
+@item EBADF
+The input file descriptor is invalid, is not a process file descriptor, or an
+error has occurred parsing the kernel data.
+@item EREMOTE
+There is no process ID to denote the process in the current namespace.
+@item ESRCH
+The process to which the file descriptor refers has terminated.
+@end table
+
+This function is specific to Linux.
+@end deftypefun
+
 @node Executing a File
 @section Executing a File
 @cindex executing a file
diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index 1fc6785f28..d01a6c0869 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -210,6 +210,7 @@ tests += \
   tst-ofdlocks \
   tst-personality \
   tst-pidfd \
+  tst-pidfd_getpid \
   tst-pkey \
   tst-ppoll \
   tst-prctl \
@@ -491,8 +492,10 @@ sysdep_routines += \
   getcpu \
   oldglob \
   pidfd_fork \
+  pidfd_getpid \
   pidfd_spawn \
   pidfd_spawnp \
+  procutils \
   sched_getcpu \
   spawnattr_getcgroup_np \
   spawnattr_setcgroup_np \
diff --git a/sysdeps/unix/sysv/linux/Versions b/sysdeps/unix/sysv/linux/Versions
index e9eecfecc0..47a5fddaa4 100644
--- a/sysdeps/unix/sysv/linux/Versions
+++ b/sysdeps/unix/sysv/linux/Versions
@@ -325,6 +325,7 @@ libc {
     posix_spawnattr_getcgroup_np;
     posix_spawnattr_setcgroup_np;
     pidfd_fork;
+    pidfd_getpid;
     pidfd_spawn;
     pidfd_spawnp;
   }
diff --git a/sysdeps/unix/sysv/linux/aarch64/libc.abilist b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
index e81e56f88c..8bfbb0b483 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
@@ -2670,6 +2670,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/alpha/libc.abilist b/sysdeps/unix/sysv/linux/alpha/libc.abilist
index 0640fd71b9..c75b2fb07a 100644
--- a/sysdeps/unix/sysv/linux/alpha/libc.abilist
+++ b/sysdeps/unix/sysv/linux/alpha/libc.abilist
@@ -2779,6 +2779,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/arc/libc.abilist b/sysdeps/unix/sysv/linux/arc/libc.abilist
index 4c9dd624ca..f998cb5bde 100644
--- a/sysdeps/unix/sysv/linux/arc/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arc/libc.abilist
@@ -2431,6 +2431,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/arm/be/libc.abilist b/sysdeps/unix/sysv/linux/arm/be/libc.abilist
index e45af835ff..6ae97d8d73 100644
--- a/sysdeps/unix/sysv/linux/arm/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/be/libc.abilist
@@ -551,6 +551,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/arm/le/libc.abilist b/sysdeps/unix/sysv/linux/arm/le/libc.abilist
index 17abecc580..6c4b0092a6 100644
--- a/sysdeps/unix/sysv/linux/arm/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/le/libc.abilist
@@ -548,6 +548,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/csky/libc.abilist b/sysdeps/unix/sysv/linux/csky/libc.abilist
index 360a60980d..9ad97fbc95 100644
--- a/sysdeps/unix/sysv/linux/csky/libc.abilist
+++ b/sysdeps/unix/sysv/linux/csky/libc.abilist
@@ -2707,6 +2707,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/hppa/libc.abilist b/sysdeps/unix/sysv/linux/hppa/libc.abilist
index 2aa63e2860..6e1e9b445b 100644
--- a/sysdeps/unix/sysv/linux/hppa/libc.abilist
+++ b/sysdeps/unix/sysv/linux/hppa/libc.abilist
@@ -2656,6 +2656,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/i386/libc.abilist b/sysdeps/unix/sysv/linux/i386/libc.abilist
index 2ee3d027ac..0e17c4d7fc 100644
--- a/sysdeps/unix/sysv/linux/i386/libc.abilist
+++ b/sysdeps/unix/sysv/linux/i386/libc.abilist
@@ -2840,6 +2840,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/ia64/libc.abilist b/sysdeps/unix/sysv/linux/ia64/libc.abilist
index 262e0b3f59..1cd7bb2955 100644
--- a/sysdeps/unix/sysv/linux/ia64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/ia64/libc.abilist
@@ -2605,6 +2605,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist b/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
index 0e5b4da990..952d488956 100644
--- a/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
@@ -2191,6 +2191,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
index b33dc8f04a..a59dd88476 100644
--- a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
@@ -552,6 +552,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
index 0b8bfb07d3..f0bbc9bf6b 100644
--- a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
@@ -2783,6 +2783,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
index d70ae3c2d3..25a8de52f8 100644
--- a/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/be/libc.abilist
@@ -2756,6 +2756,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
index c9dea106b8..03179cb364 100644
--- a/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/le/libc.abilist
@@ -2753,6 +2753,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
index 542d9b464e..1730e758a2 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
@@ -2748,6 +2748,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
index 5839437940..8ec65b98be 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
@@ -2746,6 +2746,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
index 3d5a63c979..25bb055593 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
@@ -2754,6 +2754,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
index 3aa747d1b8..5ffda82120 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
@@ -2656,6 +2656,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/nios2/libc.abilist b/sysdeps/unix/sysv/linux/nios2/libc.abilist
index ed7da52383..40e37d4ebc 100644
--- a/sysdeps/unix/sysv/linux/nios2/libc.abilist
+++ b/sysdeps/unix/sysv/linux/nios2/libc.abilist
@@ -2795,6 +2795,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/or1k/libc.abilist b/sysdeps/unix/sysv/linux/or1k/libc.abilist
index e75c55efa0..b2587a7ee2 100644
--- a/sysdeps/unix/sysv/linux/or1k/libc.abilist
+++ b/sysdeps/unix/sysv/linux/or1k/libc.abilist
@@ -2177,6 +2177,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/pidfd_getpid.c b/sysdeps/unix/sysv/linux/pidfd_getpid.c
new file mode 100644
index 0000000000..46848a5983
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/pidfd_getpid.c
@@ -0,0 +1,122 @@
+/* pidfd_getpid - Get the associated pid from the pid file descriptor.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <_itoa.h>
+#include <errno.h>
+#include <intprops.h>
+#include <procutils.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sysdep.h>
+#include <unistd.h>
+
+#define FDINFO_TO_FILENAME_PREFIX "/proc/self/fdinfo/"
+
+#define FDINFO_FILENAME_LEN \
+  (sizeof (FDINFO_TO_FILENAME_PREFIX) + INT_STRLEN_BOUND (int))
+
+struct parse_fdinfo_t
+{
+  bool found;
+  pid_t pid;
+};
+
+/* Parse the PID field in the fdinfo entry, if it exists.  Avoid strtol or
+   similar so that the parsing is not locale dependent.  */
+static int
+parse_fdinfo (const char *l, void *arg)
+{
+  enum { fieldlen = sizeof ("Pid:") - 1 };
+  if (strncmp (l, "Pid:", fieldlen) != 0)
+    return 0;
+
+  l += fieldlen;
+
+  /* Skip leading spaces.  */
+  while (*l == ' ' || (unsigned int)(*l) -'\t' < 5)
+    l++;
+
+  bool neg = false;
+  switch (*l)
+    {
+    case '-': neg = true;
+    case '+': l++;
+    }
+
+  if (*l == '\0')
+    return 0;
+
+  int n = 0;
+  while (*l != '\0')
+    {
+      /* Check if '*l' is a digit.  */
+      if ((unsigned int)(*l) - '0' >= 10)
+        return 0;
+
+      /* Ignore invalid large values.  */
+      if (INT_MULTIPLY_WRAPV (10, n, &n)
+          || INT_ADD_WRAPV (n, *l++ - '0', &n))
+        return 0;
+    }
+
+  /* -1 indicates that the process is terminated.  */
+  if (neg && n != 1)
+    return 0;
+
+  struct parse_fdinfo_t *fdinfo = arg;
+  fdinfo->pid = neg ? -n : n;
+  fdinfo->found = true;
+
+  return 1;
+}
+
+pid_t
+pidfd_getpid (int fd)
+{
+  if (__glibc_unlikely (fd < 0))
+    {
+      __set_errno (EBADF);
+      return -1;
+    }
+
+  char fdinfoname[FDINFO_FILENAME_LEN];
+
+  char *p = mempcpy (fdinfoname, FDINFO_TO_FILENAME_PREFIX,
+		     strlen (FDINFO_TO_FILENAME_PREFIX));
+  *_fitoa_word (fd, p, 10, 0) = '\0';
+
+  struct parse_fdinfo_t fdinfo = { .found = false, .pid = -1 };
+  if (procutils_read_file (fdinfoname, parse_fdinfo, &fdinfo) == -1)
+    /* The fdinfo file could not be opened or read.  */
+    return INLINE_SYSCALL_ERROR_RETURN_VALUE (EBADF);
+
+  /* The FD does not have a 'Pid:' entry associated.  */
+  if (!fdinfo.found)
+    return INLINE_SYSCALL_ERROR_RETURN_VALUE (EBADF);
+
+  /* The pidfd cannot be resolved because it is in a separate pid
+     namespace.  */
+  if (fdinfo.pid == 0)
+    return INLINE_SYSCALL_ERROR_RETURN_VALUE (EREMOTE);
+
+  /* A negative value means the process is terminated.  */
+  if (fdinfo.pid < 0)
+    return INLINE_SYSCALL_ERROR_RETURN_VALUE (ESRCH);
+
+  return fdinfo.pid;
+}
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
index 82eb6e1be0..569af00040 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
@@ -2822,6 +2822,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
index c188bb00f2..956e610cbc 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
@@ -2855,6 +2855,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
index f68077e425..61ffafe996 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist
@@ -2576,6 +2576,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
index 8aa9ce6859..062e4732b4 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist
@@ -2890,6 +2890,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/procutils.c b/sysdeps/unix/sysv/linux/procutils.c
new file mode 100644
index 0000000000..83b327cb9a
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/procutils.c
@@ -0,0 +1,104 @@
+/* Utility functions to read/parse Linux procfs and sysfs.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <not-cancel.h>
+#include <procutils.h>
+#include <string.h>
+
+static int
+next_line (char **r, int fd, char *const buffer, char **cp, char **re,
+           char *const buffer_end)
+{
+  char *res = *cp;
+  char *nl = memchr (*cp, '\n', *re - *cp);
+  if (nl == NULL)
+    {
+      if (*cp != buffer)
+        {
+          if (*re == buffer_end)
+            {
+              memmove (buffer, *cp, *re - *cp);
+              *re = buffer + (*re - *cp);
+              *cp = buffer;
+
+              ssize_t n = __read_nocancel (fd, *re, buffer_end - *re);
+              if (n < 0)
+                return -1;
+
+              *re += n;
+
+              nl = memchr (*cp, '\n', *re - *cp);
+              while (nl == NULL && *re == buffer_end)
+                {
+                  /* Truncate too long lines.  */
+                  *re = buffer + 3 * (buffer_end - buffer) / 4;
+                  n = __read_nocancel (fd, *re, buffer_end - *re);
+                  if (n < 0)
+                    return -1;
+
+                  nl = memchr (*re, '\n', n);
+                  **re = '\0';
+                  *re += n;
+                }
+            }
+          else
+            nl = memchr (*cp, '\n', *re - *cp);
+
+          res = *cp;
+        }
+
+      if (nl == NULL)
+        nl = *re - 1;
+    }
+
+  *nl = '\0';
+  *cp = nl + 1;
+  assert (*cp <= *re);
+
+  if (res == *re)
+    return 0;
+
+  *r = res;
+  return 1;
+}
+
+int
+procutils_read_file (const char *filename, procutils_closure_t closure,
+		     void *arg)
+{
+  enum { buffer_size = 1024 };
+  char buffer[buffer_size];
+  char *buffer_end = buffer + buffer_size;
+  char *cp = buffer_end;
+  char *re = buffer_end;
+
+  int fd = __open64_nocancel (filename, O_RDONLY | O_CLOEXEC);
+  if (fd == -1)
+    return -1;
+
+  char *l;
+  int r;
+  while ((r = next_line (&l, fd, buffer, &cp, &re, buffer_end)) > 0)
+    if (closure (l, arg) != 0)
+      break;
+
+  __close_nocancel_nostatus (fd);
+
+  return r;
+}
diff --git a/sysdeps/unix/sysv/linux/procutils.h b/sysdeps/unix/sysv/linux/procutils.h
new file mode 100644
index 0000000000..64e1080920
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/procutils.h
@@ -0,0 +1,35 @@
+/* Utility functions to read/parse Linux procfs and sysfs.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _PROCUTILS_H
+#define _PROCUTILS_H
+
+typedef int (*procutils_closure_t)(const char *line, void *arg);
+
+/* Open and read the path FILENAME, line by line, and call CLOSURE with
+   argument ARG on each line.  The read is done with a fixed-size stack
+   buffer, with non-cancellable calls, and the line is null terminated.
+
+   The CLOSURE should return 0 if the read should continue, or a nonzero
+   value if the function should stop.
+
+   It returns 0 at end of file, 1 if CLOSURE stopped early, or -1 on error.  */
+int procutils_read_file (const char *filename, procutils_closure_t closure,
+			 void *arg) attribute_hidden;
+
+#endif
diff --git a/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
index 3a1e55073a..95002b44c3 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist
@@ -2433,6 +2433,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
index 312b0860b3..118319ebc0 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
@@ -2633,6 +2633,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
index b702ceb160..2189782b92 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
@@ -2820,6 +2820,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
index 0d9f2c0ea7..0b59b52fba 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
@@ -2613,6 +2613,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/sh/be/libc.abilist b/sysdeps/unix/sysv/linux/sh/be/libc.abilist
index a99bd972e5..91507508ab 100644
--- a/sysdeps/unix/sysv/linux/sh/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/be/libc.abilist
@@ -2663,6 +2663,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/sh/le/libc.abilist b/sysdeps/unix/sysv/linux/sh/le/libc.abilist
index 76fdafd7df..8e42578fe2 100644
--- a/sysdeps/unix/sysv/linux/sh/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/le/libc.abilist
@@ -2660,6 +2660,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
index 9201f21b4e..e93ea5511d 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
@@ -2815,6 +2815,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
index 5337df989d..ad6cc27fbf 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
@@ -2628,6 +2628,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/sys/pidfd.h b/sysdeps/unix/sysv/linux/sys/pidfd.h
index 87095212a7..8cf4df6b81 100644
--- a/sysdeps/unix/sysv/linux/sys/pidfd.h
+++ b/sysdeps/unix/sysv/linux/sys/pidfd.h
@@ -67,4 +67,8 @@ extern int pidfd_send_signal (int __pidfd, int __sig, siginfo_t *__info,
 extern pid_t pidfd_fork (int *__pidfd, int __cgroup, unsigned int __flags)
   __THROW;
 
+/* Query the process ID (PID) from process descriptor __FD.  Return the PID
+   or -1 in case of an error.  */
+extern pid_t pidfd_getpid (int __fd) __THROW;
+
 #endif /* _PIDFD_H  */
diff --git a/sysdeps/unix/sysv/linux/tst-pidfd.c b/sysdeps/unix/sysv/linux/tst-pidfd.c
index 64d8a2ef40..53d223f702 100644
--- a/sysdeps/unix/sysv/linux/tst-pidfd.c
+++ b/sysdeps/unix/sysv/linux/tst-pidfd.c
@@ -18,6 +18,7 @@
 
 #include <errno.h>
 #include <fcntl.h>
+#include <limits.h>
 #include <support/capture_subprocess.h>
 #include <support/check.h>
 #include <support/process_state.h>
@@ -27,6 +28,9 @@
 #include <support/xsocket.h>
 #include <sys/pidfd.h>
 #include <sys/wait.h>
+#include <stdlib.h>
+
+#include <string.h>
 
 #define REMOTE_PATH "/dev/null"
 
@@ -102,6 +106,43 @@ do_test (void)
   ppid = getpid ();
   puid = getuid ();
 
+  /* Sanity check for invalid inputs.  */
+  TEST_COMPARE (pidfd_getpid (-1), -1);
+  TEST_COMPARE (errno, EBADF);
+
+  {
+    pid_t pid = pidfd_getpid (STDOUT_FILENO);
+    TEST_COMPARE (pid, -1);
+    TEST_COMPARE (errno, EBADF);
+  }
+
+  /* Check if pidfd_getpid returns ESRCH for exited subprocess.  */
+  {
+    int pidfd;
+    pid_t pidfork = pidfd_fork (&pidfd, -1, 0);
+    if (pidfork == 0)
+      _exit (EXIT_SUCCESS);
+
+    /* The process might be still running or already in zombie state, in any
+       case the PID is still allocated to the process.  */
+    pid_t pid = pidfd_getpid (pidfd);
+    if (pid > 0)
+      support_process_state_wait (pid, support_process_state_zombie);
+
+    siginfo_t info;
+    TEST_COMPARE (waitid (P_PIDFD, pidfd, &info, WEXITED), 0);
+    TEST_COMPARE (info.si_pid, pidfork);
+    TEST_COMPARE (info.si_status, 0);
+    TEST_COMPARE (info.si_code, CLD_EXITED);
+
+    /* Once the process is reaped the associated PID is not available.  */
+    pid = pidfd_getpid (pidfd);
+    TEST_COMPARE (pid, -1);
+    TEST_COMPARE (errno, ESRCH);
+
+    xclose (pidfd);
+  }
+
   TEST_COMPARE (socketpair (AF_UNIX, SOCK_STREAM, 0, sockets), 0);
 
   pid_t pid = xfork ();
@@ -118,6 +159,12 @@ do_test (void)
   int pidfd = pidfd_open (pid, 0);
   TEST_VERIFY (pidfd != -1);
 
+  TEST_COMPARE (pidfd_getpid (INT_MAX), -1);
+  {
+    pid_t querypid = pidfd_getpid (pidfd);
+    TEST_COMPARE (querypid, pid);
+  }
+
   /* Wait for first sigtimedwait.  */
   support_process_state_wait (pid, support_process_state_sleeping);
   TEST_COMPARE (pidfd_send_signal (pidfd, SIGUSR1, NULL, 0), 0);
diff --git a/sysdeps/unix/sysv/linux/tst-pidfd_getpid.c b/sysdeps/unix/sysv/linux/tst-pidfd_getpid.c
new file mode 100644
index 0000000000..41d03a04ad
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/tst-pidfd_getpid.c
@@ -0,0 +1,187 @@
+/* Specific tests for Linux pidfd_getpid.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <errno.h>
+#include <sched.h>
+#include <stdlib.h>
+#include <support/check.h>
+#include <support/xsocket.h>
+#include <support/xunistd.h>
+#include <support/test-driver.h>
+#include <sys/pidfd.h>
+#include <sys/wait.h>
+#include <sys/mount.h>
+#include <string.h>
+
+#include <stdio.h>
+
+static int sockfd[2];
+
+static void
+send_fd (const int sock, const int fd)
+{
+  union
+    {
+      struct cmsghdr hdr;
+      char buf[CMSG_SPACE (sizeof (int))];
+    } cmsgbuf = {0};
+  struct cmsghdr *cmsg;
+  char ch = 'A';
+  struct iovec vec =
+    {
+      .iov_base = &ch,
+      .iov_len = sizeof ch
+    };
+  struct msghdr msg =
+    {
+      .msg_control = &cmsgbuf.buf,
+      .msg_controllen = sizeof (cmsgbuf.buf),
+      .msg_iov = &vec,
+      .msg_iovlen = 1,
+    };
+
+  cmsg = CMSG_FIRSTHDR (&msg);
+  cmsg->cmsg_len = CMSG_LEN (sizeof (int));
+  cmsg->cmsg_level = SOL_SOCKET;
+  cmsg->cmsg_type = SCM_RIGHTS;
+  memcpy (CMSG_DATA (cmsg), &fd, sizeof (fd));
+
+  ssize_t n;
+  while ((n = sendmsg (sock, &msg, 0)) == -1 && errno == EINTR);
+
+  TEST_VERIFY_EXIT (n == 1);
+}
+
+static int
+recv_fd (const int sock)
+{
+  union
+    {
+      struct cmsghdr hdr;
+      char buf[CMSG_SPACE (sizeof (int))];
+    } cmsgbuf = {0};
+  struct cmsghdr *cmsg;
+  char ch = '\0';
+  struct iovec vec =
+    {
+      .iov_base = &ch,
+      .iov_len = sizeof ch
+    };
+  struct msghdr msg =
+    {
+      .msg_control = &cmsgbuf.buf,
+      .msg_controllen = sizeof (cmsgbuf.buf),
+      .msg_iov = &vec,
+      .msg_iovlen = 1,
+    };
+
+  ssize_t n;
+  while ((n = recvmsg (sock, &msg, 0)) == -1 && errno == EINTR);
+  if (n != 1 || ch != 'A')
+    return -1;
+
+  cmsg = CMSG_FIRSTHDR (&msg);
+  if (cmsg == NULL)
+    return -1;
+  if (cmsg->cmsg_type != SCM_RIGHTS)
+    return -1;
+
+  int fd = -1;
+  memcpy (&fd, CMSG_DATA (cmsg), sizeof (fd));
+  if (fd < 0)
+    return -1;
+  return fd;
+}
+
+static int
+do_test (void)
+{
+  {
+    /* The pidfd_getfd syscall was the last in the set of pidfd related
+       syscalls added to the kernel.  Use pidfd_getfd to decide if this
+       kernel has pidfd support that we can test.  */
+    int r = pidfd_getfd (0, 0, 1);
+    TEST_VERIFY_EXIT (r == -1);
+    if (errno == ENOSYS)
+      FAIL_UNSUPPORTED ("kernel does not support pidfd_getfd, skipping test");
+  }
+
+  TEST_VERIFY_EXIT (socketpair (AF_UNIX, SOCK_STREAM, 0, sockfd) == 0);
+
+  /* Check if pidfd_getpid returns EREMOTE for process not in current
+     namespace.  */
+  {
+    int pidfd;
+    pid_t pid = pidfd_fork (&pidfd, -1, 0);
+    TEST_VERIFY_EXIT (pid >= 0);
+    if (pid == 0)
+      {
+        if (unshare (CLONE_NEWNS | CLONE_NEWUSER | CLONE_NEWPID) < 0)
+	  {
+	    /* Older kernels may not support all the options, or security
+	       policy may block this call.  */
+	    if (errno == EINVAL || errno == EPERM || errno == ENOSPC)
+	      exit (EXIT_UNSUPPORTED);
+	    FAIL_EXIT1 ("unshare user/fs/pid failed: %m");
+	  }
+
+	TEST_VERIFY_EXIT (mount (NULL, "/", NULL, MS_REC | MS_PRIVATE, 0)
+			  == 0);
+
+	pid_t child = xfork ();
+	if (child > 0)
+	  {
+	    int status;
+	    xwaitpid (child, &status, 0);
+	    TEST_VERIFY (WIFEXITED (status));
+	    exit (WEXITSTATUS (status));
+	  }
+
+	/* Now that we're pid 1 (effectively "root") we can mount /proc  */
+	if (mount ("proc", "/proc", "proc", 0, NULL) != 0)
+	  /* This happens if we're trying to create a nested container,
+	     like if the build is running under podman, and we lack
+	     privileges.  */
+	  {
+	    if (errno == EPERM)
+	      _exit (EXIT_UNSUPPORTED);
+	    else
+	      _exit (EXIT_FAILURE);
+	  }
+
+	int ppidfd = recv_fd (sockfd[0]);
+	TEST_COMPARE (pidfd_getpid (ppidfd), -1);
+	TEST_COMPARE (errno, EREMOTE);
+
+	_exit (EXIT_SUCCESS);
+      }
+
+    send_fd (sockfd[1], pidfd);
+
+    siginfo_t info;
+    TEST_COMPARE (waitid (P_PIDFD, pidfd, &info, WEXITED), 0);
+    if (info.si_status == EXIT_UNSUPPORTED)
+      FAIL_UNSUPPORTED ("unable to unshare user/fs/pid");
+    TEST_COMPARE (info.si_status, 0);
+    TEST_COMPARE (info.si_code, CLD_EXITED);
+  }
+
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
index fa0ffd975f..a3b77fe8fb 100644
--- a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
@@ -2579,6 +2579,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
index cf4d2b2573..944d0bad53 100644
--- a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
@@ -2685,6 +2685,7 @@ GLIBC_2.38 __strlcpy_chk F
 GLIBC_2.38 __wcslcat_chk F
 GLIBC_2.38 __wcslcpy_chk F
 GLIBC_2.38 pidfd_fork F
+GLIBC_2.38 pidfd_getpid F
 GLIBC_2.38 pidfd_spawn F
 GLIBC_2.38 pidfd_spawnp F
 GLIBC_2.38 posix_spawnattr_getcgroup_np F
-- 
2.34.1



* Re: [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation
  2023-07-05 20:43 [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation Adhemerval Zanella
                   ` (4 preceding siblings ...)
  2023-07-05 20:43 ` [PATCH v6 5/5] linux: Add pidfd_getpid Adhemerval Zanella
@ 2023-07-05 20:56 ` Luca Boccassi
  2023-07-06 12:35 ` Adhemerval Zanella Netto
  6 siblings, 0 replies; 8+ messages in thread
From: Luca Boccassi @ 2023-07-05 20:56 UTC (permalink / raw)
  To: Adhemerval Zanella, libc-alpha

On Wed, 5 Jul 2023 at 21:43, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
>
> Add pidfd and cgroupv2 support for process creation
>
> glibc 2.36 added wrappers for the Linux syscalls pidfd_open,
> pidfd_getfd, and pidfd_send_signal, and exported P_PIDFD for use
> with waitid.  The pidfd is a race-free interface; however,
> pidfd_open is subject to TOCTOU if the file descriptor is not
> obtained directly from the clone or clone3 syscall (there is still a
> small window between the clone return and the pidfd_open call where
> the process can be reaped and the process ID reused).
>
> A fully race-free interface similar to posix_spawn is being
> discussed by GNOME [1] [2], and Qt already uses one in its QtProcess
> implementation [3].  The Qt implementation has some pitfalls:
>
>   - It calls clone through the syscall symbol, which does not run the
>     pthread_atfork handlers even though it really intends to use the
>     clone semantic for fork (by only using CLONE_PIDFD | SIGCHLD).
>
>   - It also does not reset any internal state, such as internal IO,
>     malloc, loader, etc. locks.
>
>   - It does not set the TCB tid field nor the robust list, used by
>     pthread code.
>
>   - It does not optimize process creation by using CLONE_VM and
>     CLONE_VFORK.
>
> Also, recent Linux kernels (starting with 5.7) provide a way to
> create a new process in a cgroup version 2 other than the
> default one (through the clone3 CLONE_INTO_CGROUP flag).  Providing it
> through glibc interfaces makes it usable without the risk of
> breakage from issuing the clone3 syscall directly (check the BZ#26371
> discussion).
>
> This patchset adds new interfaces that take care of these potential
> issues.  The new posix_spawn / posix_spawnp extensions:
>
>
>   #define POSIX_SPAWN_SETCGROUP 0x100
>
>   int posix_spawnattr_getcgroup_np (const posix_spawnattr_t
>                                     *restrict attr, int *cgroup);
>   int posix_spawnattr_setcgroup_np (posix_spawnattr_t *restrict attr,
>                                     int cgroup);
>
> allow spawning a new process in a different cgroupv2.
>
> pidfd_spawn and pidfd_spawnp are similar to posix_spawn and
> posix_spawnp, but return a process file descriptor instead of a PID.
>
>   int pidfd_spawn (int *restrict pidfd,
>                    const char *restrict path,
>                    const posix_spawn_file_actions_t *restrict facts,
>                    const posix_spawnattr_t *restrict attrp,
>                    char *const argv[restrict],
>                    char *const envp[restrict]);
>
>   int pidfd_spawnp (int *restrict pidfd,
>                     const char *restrict file,
>                     const posix_spawn_file_actions_t *restrict facts,
>                     const posix_spawnattr_t *restrict attrp,
>                     char *const argv[restrict],
>                     char *const envp[restrict]);
>
> The implementation requires that the kernel support the complete
> pidfd interface, meaning that waitid (P_PIDFD) must be supported.  This
> ensures that no racy workaround is required (such as reading the procfs
> fdinfo pid to use along with the old wait interfaces).  If the kernel
> does not have the required support, the interface returns ENOSYS.
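
A minimal sketch of what reaping through waitid (P_PIDFD) looks like,
using only interfaces that predate this series.  The helper name
reap_via_pidfd is hypothetical, pidfd_open is issued through syscall
so the sketch does not depend on the glibc 2.36 wrapper, and the
P_PIDFD fallback value is an assumption taken from the Linux UAPI
headers.  It uses plain fork followed by pidfd_open, which is only
safe here because the child is still unreaped (and assuming SIGCHLD
is not being ignored).

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef P_PIDFD
# define P_PIDFD 3	/* Assumed value, from Linux <linux/wait.h>.  */
#endif

/* Hypothetical helper: fork a child that exits with CODE and reap it
   through a process file descriptor.  Returns the exit status of the
   child, or -1 on any error (including a kernel that lacks pidfd_open
   or waitid (P_PIDFD)).  */
static int
reap_via_pidfd (int code)
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (code);

  /* The child has not been waited for yet, so PID still names it.  */
  int pidfd = (int) syscall (SYS_pidfd_open, pid, 0);
  if (pidfd < 0)
    {
      /* No pidfd support: fall back to the PID so no zombie is left.  */
      waitpid (pid, NULL, 0);
      return -1;
    }

  siginfo_t info = { 0 };
  int r = waitid (P_PIDFD, pidfd, &info, WEXITED);
  close (pidfd);
  if (r != 0)
    {
      waitpid (pid, NULL, 0);
      return -1;
    }

  return info.si_code == CLD_EXITED ? info.si_status : -1;
}
```

On a kernel with full pidfd support, reap_via_pidfd (7) would be
expected to return 7; the -1 path corresponds to the case where the
proposed interfaces would report ENOSYS instead.
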
>
> A new symbol is used instead of a posix_spawn extension to avoid
> possible issues with language bindings that might track the argument
> lifetime.
>
> Both symbols reuse the posix_spawn types posix_spawn_file_actions_t
> and posix_spawnattr_t, to avoid either rehashing the posix_spawn API
> or adding a new one.  It also means that both interfaces support the
> same attributes and file actions, and any new flag or file action
> added to posix_spawn is automatically available to pidfd_spawn.  This
> includes POSIX_SPAWN_SETCGROUP.
>
> Along with the spawn interface, a fork-like one is also provided:
>
>   pid_t pidfd_fork (int *pidfd, int cgroup, unsigned int flags)
>
> If PIDFD is set to NULL, no file descriptor is returned and pidfd_fork
> acts as fork.  Otherwise, a new file descriptor is returned and the
> kernel already sets O_CLOEXEC as default.  The pidfd_fork follows
> fork/_Fork convention on returning a positive or negative value to the
> parent (with negative indicating an error) and zero to the child.
>
> If cgroup is 0 or a positive value, it is interpreted as a different
> cgroup in which to place the new process (check the CLONE_INTO_CGROUP
> clone flag).
>
> Similar to fork, pidfd_fork also runs the pthread_atfork handlers.
> This can be changed with the PIDFDFORK_ASYNCSAFE flag, which makes
> pidfd_fork act as _Fork.  It also sends SIGCHLD to the parent when the
> process terminates.
>
> To have a way to interop between process IDs and process file
> descriptors, the pidfd_getpid is also provided:
>
>    pid_t pidfd_getpid (int fd)
>
> It reads the procfs fdinfo entry from the file descriptor to get
> the process ID.
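
A rough sketch of the fdinfo lookup described above, not the glibc
implementation (which uses a fixed buffer and non-cancellable calls
rather than stdio).  The helper name pid_from_fdinfo is hypothetical.
The error mapping follows the changelog in this cover letter, and the
assumption that the kernel reports "Pid:" as 0 for a process outside
the current PID namespace and -1 for one that has already been reaped
is based on the kernel's documented pidfd fdinfo format.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: return the PID that process descriptor FD
   refers to by scanning /proc/self/fdinfo/FD for the "Pid:" field.
   Returns -1 with errno set on failure.  */
static pid_t
pid_from_fdinfo (int fd)
{
  char path[64];
  snprintf (path, sizeof path, "/proc/self/fdinfo/%d", fd);

  FILE *fp = fopen (path, "re");
  if (fp == NULL)
    {
      errno = EBADF;
      return -1;
    }

  long int pid = 0;
  int found = 0;
  char line[256];
  /* "NSpid:" and the other fields do not match the literal "Pid:".  */
  while (fgets (line, sizeof line, fp) != NULL)
    if (sscanf (line, "Pid: %ld", &pid) == 1)
      {
	found = 1;
	break;
      }
  fclose (fp);

  if (!found)
    {
      /* No Pid entry: FD is not a process descriptor.  */
      errno = EBADF;
      return -1;
    }
  if (pid == 0)
    {
      /* Assumed meaning: process is in another PID namespace.  */
      errno = EREMOTE;
      return -1;
    }
  if (pid < 0)
    {
      /* Assumed meaning: process already terminated and reaped.  */
      errno = ESRCH;
      return -1;
    }
  return (pid_t) pid;
}
```

Called on a descriptor obtained from pidfd_open (getpid (), 0), this
would be expected to return the caller's own PID; on a regular file
descriptor it fails with EBADF because fdinfo has no Pid entry.
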
>
> ---
>
> Changes from v5:
> - Added cgroupv2 support for posix_spawn, pidfd_spawn, and pidfd_fork.
>
> Changes from v4:
> - Changed pidfd_fork signature to return a pid_t instead of PID file
>   descriptor.
> - Changed pidfd_getpid to return EBADF for negative input, instead of
>   EINVAL.
> - Added PIDFDFORK_NOSIGCHLD option.
> - Fixed nested __BEGIN_DECLS on spawn.h
>
> Changes from v3:
> - Remove strtoul usage.
> - Fixed patchwork tst-pidfd_getpid.c regression.
> - Fixed manual and NEWS typos.
>
> Changes from v2:
> - Added pidfd_fork and pidfd_getpid manual entries
> - Changed pidfd_fork to act as fork by default, instead of as _Fork.
> - Changed PIDFD_FORK_RUNATFORK flag to PIDFDFORK_ASYNCSAFE.
> - Added pidfd_getpid test for EREMOTE.
>
> Changes from v1:
> - Extended pidfd_getpid error codes to return EBADF if fdinfo does not
>   have a Pid entry or if the value is invalid, EREMOTE if the pid is in
>   a separate namespace, and ESRCH if the process has already terminated.
> - Extended tst-pidfd_getpid.
> - Rename PIDFD_FORK_RUNATFORK to PIDFDFORK_RUNATFORK to avoid clash
>   with possible kernel extensions.
>
> Adhemerval Zanella (5):
>   linux: Add posix_spawnattr_{get,set}cgroup_np (BZ 26731)
>   posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349)
>   posix: Add pidfd_fork (BZ 26371)
>   posix: Add PIDFDFORK_NOSIGCHLD for pidfd_fork
>   linux: Add pidfd_getpid

I have reviewed the interfaces, not the implementation, but fully
intend to use all of these from systemd once available, so:

Acked-by: Luca Boccassi <bluca@debian.org>


* Re: [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation
  2023-07-05 20:43 [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation Adhemerval Zanella
                   ` (5 preceding siblings ...)
  2023-07-05 20:56 ` [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation Luca Boccassi
@ 2023-07-06 12:35 ` Adhemerval Zanella Netto
  6 siblings, 0 replies; 8+ messages in thread
From: Adhemerval Zanella Netto @ 2023-07-06 12:35 UTC (permalink / raw)
  To: libc-alpha, Luca Boccassi, Philip Withnall



On 05/07/23 17:43, Adhemerval Zanella wrote:
> Add pidfd and cgroupv2 support for process creation

The patchset failed to apply for some reason, I will rebase it
and resend.
