public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
To: Luca Boccassi <luca.boccassi@gmail.com>,
	Florian Weimer <fweimer@redhat.com>
Cc: libc-alpha@sourceware.org, Philip Withnall <bugzilla@tecnocode.co.uk>
Subject: Re: [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation (resend)
Date: Thu, 10 Aug 2023 15:19:12 -0300	[thread overview]
Message-ID: <1dffc148-4392-4c86-fd29-0e33c05b4156@linaro.org> (raw)
In-Reply-To: <CAMw=ZnTTvQGUcA_KP4LRqLrXKuFbMM1V9o9sPLybU5oNL3WAZA@mail.gmail.com>



On 08/08/23 16:35, Luca Boccassi wrote:
> On Thu, 6 Jul 2023 at 14:45, Adhemerval Zanella
> <adhemerval.zanella@linaro.org> wrote:
>>
>> The glibc 2.36 added wrappers for Linux syscall pidfd_open,
>> pidfd_getfd, and pidfd_send_signal, and exported the P_PIDFD to use
>> along with waitid. The pidfd is a race free interface, however
>> the pidfd_open is subject to TOCTOU if the file descriptor
>> is not obtained directly from the clone or clone3 syscall (there is
>> still a small window between the clone return and the pidfd_getfd
>> where the process can be reaped and the process ID reused).
>>
>> A fully race free interface with posix_spawn interface is being
>> discussed by GNOME [1] [2], and Qt already uses on its QtProcess
>> implementation [3].  The Qt implementation has some pitfalls:
>>
>>   - It calls clone through the syscall symbol, which does not run the
>>     pthread_atfork handlers even though it really intends to use the
>>     clone semantic for fork (by only using CLONE_PIDFD | SIGCHLD).
>>
>>   - It also does not reset any internal state, such as internal IO,
>>     malloc, loader, etc. locks.
>>
>>   - It does not set the TCB tid field nor the robust list, used by
>>     pthread code.
>>
>>   - It does not optimize process creation by using CLONE_VM and
>>     CLONE_VFORK.
>>
>> Also, recent Linux kernel (starting with 5.7) provide a way to
>> create a new process in a different cgroups version 2 than the
>> default one (through clone3 CLONE_INTO_CGROUP flag).  Providing it
>> through glibc interfaces make is usable without the risk of potential
>> breakage by issuing clone3 syscall directly (check BZ#26371 discussion).
>>
>> This patchset adds new interfaces that take care of this potential
>> issues.  The new posix_spawn / posix_spawnp extesions:
>>
>>
>>   #define POSIX_SPAWN_SETCGROUP 0x100
>>
>>   int posix_spawnattr_getcgroup_np (const posix_spawnattr_t
>>                                     restrict *attr, int *cgroup);
>>   int posix_spawnattr_setcgroup_np (posix_spawnattr_t *restrict attr,
>>                                     int cgroup);
>>
>> Allow spawn a new process on a different cgroupv2.
>>
>> The pidfd_spawn and pidfd_spawnp is similar to posix_spawn and
>> posix_spawnp,
>> but return a process file descriptor instead of a PID.
>>
>>   int pidfd_spawn (int *restrict pidfd,
>>                    const char *restrict file,
>>                    const posix_spawn_file_actions_t *restrict facts,
>>                    const posix_spawnattr_t *restrict attrp,
>>                    char *const argv[restrict],
>>                    char *const envp[restrict])
>>
>>   int pidfd_spawnp (int *restrict pidfd,
>>                     const char *restrict path,
>>                     const posix_spawn_file_actions_t *restrict facts,
>>                     const posix_spawnattr_t *restrict attrp,
>>                     char *const argv[restrict_arr],
>>                     char *const envp[restrict_arr]);
>>
>> The implementation makes sure that kernel must support the complete
>> pidfd interface, meaning that waitid (P_PIDFD) should be supported.  It
>> ensure that non racy workaround is required (such as reading procfs
>> fdinfo pid to use along with old wait interfaces).  If kernel does not
>> have the required support the interface returns ENOSYS.
>>
>> A new symbol is used instead of a posix_spawn extension to avoid
>> possible issue with language bindings that might track the argument
>> lifetime.
>>
>> Both symbols reuse the posix_spawn posix_spawn_file_actions_t and
>> posix_spawnattr_t, to either avoid rehash posix_spawn API or add a new
>> one.  It also mean that both interfaces support the same attribute and
>> file actions, and a new flag or file actions on posix_spawn is also
>> added automatically for pidfd_spawn. It includes POSIX_SPAWN_SETCGROUP.
>>
>> Along with the spawn interface, a fork like one is also provided:
>>
>>   pid_t pidfd_fork (int *pidfd, int cgroup, unsigned int flags)
>>
>> If PIDFD is set to NULL, no file descriptor is returned and pidfd_fork
>> acts as fork.  Otherwise, a new file descriptor is returned and the
>> kernel already sets O_CLOEXEC as default.  The pidfd_fork follows
>> fork/_Fork convention on returning a positive or negative value to the
>> parent (with negative indicating an error) and zero to the child.
>>
>> If cgroup is 0 or positive value, it is interpreted as a different
>> cgroup to be place the new process (check CLONE_INTO_CGROUP clone
>> flag).
>>
>> The kernel already sets O_CLOEXEC as default and it follows fork/_Fork
>> convention on returning a positive or negative value to the parent
>> (with negative indicating an error) and zero to the child.
>>
>> Similar to fork, pidfd_fork also runs the pthread_atfork handlers
>> It can be change by using PIDFDFORK_ASYNCSAFE flag, which make
>> pidfd_fork acts a _Fork.  It also send SIGCHLD to parent when
>> process terminates.
>>
>> To have a way to interop between process IDs and process file
>> descriptors, the pidfd_getpid is also provided:
>>
>>    pid_t pidfd_getpid (int fd)
>>
>> It reads the procfs fdinfo entry from the file descriptor to get
>> the process ID.
> 
> Hi, any update on this series?

Florian said on last weekly call that he intends to check this in following
weeks.

      reply	other threads:[~2023-08-10 18:19 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-06 13:45 Adhemerval Zanella
2023-07-06 13:45 ` [PATCH v6 1/5] linux: Add posix_spawnattr_{get,set}cgroup_np (BZ 26731) Adhemerval Zanella
2023-07-06 13:45 ` [PATCH v6 2/5] posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349) Adhemerval Zanella
2023-07-06 13:45 ` [PATCH v6 3/5] posix: Add pidfd_fork (BZ 26371) Adhemerval Zanella
2023-07-06 13:45 ` [PATCH v6 4/5] posix: Add PIDFDFORK_NOSIGCHLD for pidfd_fork Adhemerval Zanella
2023-07-06 13:45 ` [PATCH v6 5/5] linux: Add pidfd_getpid Adhemerval Zanella
2023-08-08 19:35 ` [PATCH v6 0/5] Add pidfd and cgroupv2 support for process creation (resend) Luca Boccassi
2023-08-10 18:19   ` Adhemerval Zanella Netto [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1dffc148-4392-4c86-fd29-0e33c05b4156@linaro.org \
    --to=adhemerval.zanella@linaro.org \
    --cc=bugzilla@tecnocode.co.uk \
    --cc=fweimer@redhat.com \
    --cc=libc-alpha@sourceware.org \
    --cc=luca.boccassi@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).