public inbox for libc-help@sourceware.org
* posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
@ 2022-08-14  3:30 Rain
  2022-08-14  3:38 ` Rain
  2022-08-22 16:51 ` Adhemerval Zanella Netto
  0 siblings, 2 replies; 20+ messages in thread
From: Rain @ 2022-08-14  3:30 UTC
  To: libc-help

Hi there --

I've been working on a CLI tool (in Rust) that spawns lots of processes with posix_spawn. Specifically, I've been observing its behavior when Ctrl-Z is pressed in a terminal and the process group receives a SIGTSTP signal. I'm seeing an issue where, if the signal is received early enough during the posix_spawn call, the parent can get stuck in the middle of the clone3() syscall, in uninterruptible sleep.

Here are some backtraces, observed with glibc 2.35 and Linux kernel 5.18.10-76051810-generic on Ubuntu 22.04 (x86_64). I checked glibc master and I'm not seeing any code changes in this area, so I presume this issue still exists.

In this case, during setup, posix_spawnattr_setsigmask is called with an empty signal set. However, based on reading the source code, I don't think that's relevant.

--- parent process ---

(gdb) bt
#0  clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
#1  0x00007f12a0a37a51 in __GI___clone_internal (cl_args=cl_args@entry=0x7f129a5ed9e0, func=func@entry=0x7f12a0a24300 <__spawni_child>, arg=arg@entry=0x7f129a5eda40)
    at ../sysdeps/unix/sysv/linux/clone-internal.c:54
#2  0x00007f12a0a241f3 in __spawnix (pid=0x7f129a5edd20, file=0x7f123405d030 "/home/rain/dev/tokio/target/debug/deps/sync_mutex-22a40a7c6051156b", file_actions=0x7f129a5edd60, 
    attrp=<optimized out>, argv=<optimized out>, envp=0x7f123403f2e0, xflags=1, exec=0x7f12a09fcdd0 <__execvpex>) at ../sysdeps/unix/sysv/linux/spawni.c:388
#3  0x00007f12a0a2490b in __spawni (pid=<optimized out>, file=<optimized out>, acts=<optimized out>, attrp=<optimized out>, argv=<optimized out>, envp=<optimized out>, xflags=1)
    at ../sysdeps/unix/sysv/linux/spawni.c:436
#4  0x00007f12a0a2403f in __posix_spawnp (pid=<optimized out>, file=<optimized out>, file_actions=<optimized out>, attrp=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at ./posix/spawnp.c:30
#5  0x000056199dee0811 in std::sys::unix::process::process_common::Command::posix_spawn () at library/std/src/sys/unix/process/process_unix.rs:544
#6  std::sys::unix::process::process_common::Command::spawn () at library/std/src/sys/unix/process/process_unix.rs:57
#7  0x000056199ded68dc in std::process::Command::spawn () at library/std/src/process.rs:881

--- child process ---

(gdb) bt
#0  __GI___pthread_sigmask (how=how@entry=2, newmask=<optimized out>, oldmask=oldmask@entry=0x0) at ./nptl/pthread_sigmask.c:43
#1  0x00007faaf8edd71d in __GI___sigprocmask (how=how@entry=2, set=<optimized out>, oset=oset@entry=0x0) at ../sysdeps/unix/sysv/linux/sigprocmask.c:25
#2  0x00007faaf8fae4d8 in __spawni_child (arguments=<optimized out>) at ../sysdeps/unix/sysv/linux/spawni.c:287
#3  0x00007faaf8fc1a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

---

Based on these backtraces and reading the source code, here's what I believe is happening:

1. The parent calls __posix_spawnp, which in turn calls __spawni and __spawnix.
2. The parent calls clone3 and enters uninterruptible sleep.
3. The child enters __spawni_child and blocks all incoming signals.
---> 4. At this point the child receives a SIGTSTP signal. <---
5. The child unblocks signals by calling sigprocmask/pthread_sigmask.
6. At this point the SIGTSTP is delivered to the child.
7. However, clone3 hasn't returned in the parent yet, so the parent remains stuck in the clone3 syscall until the child receives a SIGCONT.
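
One way to check the states in the steps above from outside the two processes is to read the state letter from /proc/<pid>/stat on Linux, where 'D' is uninterruptible sleep and 'T' is stopped. The following is only a sketch; the helper name proc_state is hypothetical and not part of glibc.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: return the state letter from /proc/<pid>/stat
   ('R' running, 'S' sleeping, 'D' uninterruptible sleep, 'T' stopped, ...),
   or '?' if it cannot be read. Linux-specific. */
char proc_state(pid_t pid)
{
    char path[64];
    char buf[512];
    FILE *f;
    char *p;

    snprintf(path, sizeof path, "/proc/%ld/stat", (long) pid);
    f = fopen(path, "r");
    if (f == NULL)
        return '?';
    if (fgets(buf, sizeof buf, f) == NULL) {
        fclose(f);
        return '?';
    }
    fclose(f);

    /* The line looks like "pid (comm) state ..."; comm may itself contain
       ')', so search for the last one. */
    p = strrchr(buf, ')');
    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return '?';
    return p[2];
}
```

If the analysis above is right, the parent would be expected to show 'D' and the child 'T' while the hang lasts.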

I'm not sure what a reasonable way to handle this would be on the part of my CLI tool. The tool currently just gets stuck in uninterruptible sleep, resulting in a bad user experience.

Here are solutions I've thought about that don't seem to work (please correct me if I'm wrong!):
1. Setting the signal mask to include SIGTSTP. I do want to be able to send the child SIGTSTP after the clone(), and in my case the child is a third-party process so I can't depend on it to reset the signal mask.
2. Spawning a stub process that execves the real child. It seems like the same issue exists when the main process calls the stub process, if I'm understanding the code correctly, so this won't help.

... though now as I'm writing this email out, maybe one solution is:

* my tool spawns a stub process with SIGTSTP masked.
* the stub process unmasks SIGTSTP (so it could receive the SIGTSTP here, but at least it won't block the parent process), then execves the third-party process.

Is that the solution you would recommend?

Thanks.

--

Rain
(they/she)


Thread overview: 20+ messages
2022-08-14  3:30 posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough Rain
2022-08-14  3:38 ` Rain
2022-08-22 16:51 ` Adhemerval Zanella Netto
2022-08-22 17:00   ` Rain
2022-08-22 17:48     ` Adhemerval Zanella Netto
2022-08-22 18:21       ` Florian Weimer
2022-08-22 18:32         ` Adhemerval Zanella Netto
2022-08-22 22:28           ` Adhemerval Zanella Netto
2022-09-13 10:04           ` Florian Weimer
2022-09-21 15:24             ` Adhemerval Zanella Netto
2022-09-22 12:18               ` Florian Weimer
2022-09-22 16:56                 ` Adhemerval Zanella Netto
2022-09-22 17:38                   ` Florian Weimer
2022-09-22 19:14                     ` Adhemerval Zanella Netto
2022-10-10 13:45                       ` Florian Weimer
2022-10-18 20:04                         ` Adhemerval Zanella Netto
2022-10-20 11:55                           ` Florian Weimer
2022-10-21  1:40                             ` Rain
2022-10-21 14:18                               ` Szabolcs Nagy
2022-08-22 22:30       ` Adhemerval Zanella Netto
