public inbox for libc-help@sourceware.org
From: Rain <glibc@sunshowers.io>
To: libc-help@sourceware.org
Subject: Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
Date: Sat, 13 Aug 2022 20:38:41 -0700
Message-ID: <88c14933-2bc9-4e4e-8fc7-13056787a9cf@www.fastmail.com>
In-Reply-To: <2921668c-773e-465d-9480-0abb6f979bf9@www.fastmail.com>

On Sat, Aug 13, 2022, at 20:30, Rain wrote:
> Hi there --
>
> I've been working on a CLI tool (in Rust) that spawns lots of processes 
> with posix_spawn. Specifically, I've been observing its behavior when 
> Ctrl-Z is pressed in a terminal, and the process group receives a 
> SIGTSTP signal. I'm seeing an issue where, if the signal is received
> early enough during the posix_spawn call, the parent can get stuck in
> the middle of the clone3() syscall, in uninterruptible sleep.
>
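
For anyone trying to observe the same state without gdb, here is a
small C sketch that reads the process state letter from
/proc/<pid>/stat on Linux. The helper name proc_state is made up for
this example. A state of 'D' corresponds to the uninterruptible sleep
described above, and 'T' to a process stopped by a signal such as
SIGTSTP.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: return the state letter of `pid` as reported in
   /proc/<pid>/stat ('R', 'S', 'D', 'T', ...), or '\0' on failure. */
char proc_state(pid_t pid)
{
    char path[64];
    char buf[512];

    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return '\0';

    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* The command name is wrapped in parentheses and may itself contain
       spaces or ')', so look for the last ')' rather than splitting on
       spaces. The state letter follows it after a single space. */
    char *p = strrchr(buf, ')');
    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return '\0';
    return p[2];
}
```

Calling proc_state() on the parent's pid from another process while it
is stuck would be one way to confirm the state; this only inspects the
state and does nothing to resolve it.
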
> Here are some backtraces, observed with glibc 2.35 and Linux kernel 
> 5.18.10-76051810-generic on Ubuntu 22.04 (x86_64). I checked glibc 
> master and I'm not seeing any code changes in this area, so I presume 
> this issue still exists.
>
> In this case, during setup, posix_spawnattr_setsigmask is called with
> an empty signal set. However, based on reading the source code, I don't
> think that's relevant.
>
> --- parent process ---
>
> (gdb) bt
> #0  clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
> #1  0x00007f12a0a37a51 in __GI___clone_internal 
> (cl_args=cl_args@entry=0x7f129a5ed9e0, func=func@entry=0x7f12a0a24300 
> <__spawni_child>, arg=arg@entry=0x7f129a5eda40)
>     at ../sysdeps/unix/sysv/linux/clone-internal.c:54
> #2  0x00007f12a0a241f3 in __spawnix (pid=0x7f129a5edd20, 
> file=0x7f123405d030 
> "/home/rain/dev/tokio/target/debug/deps/sync_mutex-22a40a7c6051156b", 
> file_actions=0x7f129a5edd60, 
>     attrp=<optimized out>, argv=<optimized out>, envp=0x7f123403f2e0, 
> xflags=1, exec=0x7f12a09fcdd0 <__execvpex>) at 
> ../sysdeps/unix/sysv/linux/spawni.c:388
> #3  0x00007f12a0a2490b in __spawni (pid=<optimized out>, 
> file=<optimized out>, acts=<optimized out>, attrp=<optimized out>, 
> argv=<optimized out>, envp=<optimized out>, xflags=1)
>     at ../sysdeps/unix/sysv/linux/spawni.c:436
> #4  0x00007f12a0a2403f in __posix_spawnp (pid=<optimized out>, 
> file=<optimized out>, file_actions=<optimized out>, attrp=<optimized 
> out>, argv=<optimized out>, envp=<optimized out>)
>     at ./posix/spawnp.c:30
> #5  0x000056199dee0811 in 
> std::sys::unix::process::process_common::Command::posix_spawn () at 
> library/std/src/sys/unix/process/process_unix.rs:544
> #6  std::sys::unix::process::process_common::Command::spawn () at 
> library/std/src/sys/unix/process/process_unix.rs:57
> #7  0x000056199ded68dc in std::process::Command::spawn () at 
> library/std/src/process.rs:881
>
> --- child process ---
>
> (gdb) bt
> #0  __GI___pthread_sigmask (how=how@entry=2, newmask=<optimized out>, 
> oldmask=oldmask@entry=0x0) at ./nptl/pthread_sigmask.c:43
> #1  0x00007faaf8edd71d in __GI___sigprocmask (how=how@entry=2, 
> set=<optimized out>, oset=oset@entry=0x0) at 
> ../sysdeps/unix/sysv/linux/sigprocmask.c:25
> #2  0x00007faaf8fae4d8 in __spawni_child (arguments=<optimized out>) at 
> ../sysdeps/unix/sysv/linux/spawni.c:287
> #3  0x00007faaf8fc1a00 in clone3 () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
>
> ---
>
> Based on these backtraces and reading the source code, here's what I 
> believe is happening:
>
> 1. The parent calls __posix_spawnp, which in turn calls __spawni and 
> __spawnix.
> 2. The parent calls clone3 and enters uninterruptible sleep.
> 3. The child enters __spawni_child and blocks all incoming signals.
> ---> 4. At this point the child receives a SIGTSTP signal. <---
> 5. The child unblocks signals by calling sigprocmask/pthread_sigmask.
> 6. At this point the SIGTSTP is delivered to the child.
> 7. However, clone3 hasn't returned in the parent, so the parent remains
> stuck in the clone3 syscall until the child receives a SIGCONT.
>
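
Steps 3 to 6 above rely on a general property of signal masks: a signal
sent while blocked stays pending and is delivered as soon as it is
unblocked. The C sketch below shows just that property in a single
process. It is not the glibc code. The function name is made up, and it
installs a handler for SIGTSTP so that the process records the delivery
instead of stopping, which is what the default action would do in the
real child.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t delivered = 0;

/* Handler used only for this demonstration; the real child has the
   default action, which stops the process. */
static void on_tstp(int signo)
{
    (void)signo;
    delivered = 1;
}

/* Hypothetical demo: returns 1 if SIGTSTP stayed pending while blocked
   and was delivered on unblock, 0 if not, -1 on error. */
int blocked_signal_is_delivered_on_unblock(void)
{
    struct sigaction sa, old_sa;
    sigset_t all, old_mask, unblocked, pending;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tstp;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTSTP, &sa, &old_sa) != 0)
        return -1;

    delivered = 0;
    sigfillset(&all);
    /* Like step 3: block all signals. */
    if (sigprocmask(SIG_SETMASK, &all, &old_mask) != 0)
        return -1;

    /* Like step 4: SIGTSTP arrives while blocked and becomes pending. */
    raise(SIGTSTP);
    sigpending(&pending);
    int was_pending = (sigismember(&pending, SIGTSTP) == 1) && !delivered;

    /* Like steps 5 and 6: unblocking delivers the pending signal
       before sigprocmask returns. */
    unblocked = old_mask;
    sigdelset(&unblocked, SIGTSTP);
    sigprocmask(SIG_SETMASK, &unblocked, NULL);
    int ran = delivered;

    /* Restore the original mask and disposition. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGTSTP, &old_sa, NULL);
    return was_pending && ran;
}
```
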
> I'm not sure what a reasonable way to handle this would be on the part 
> of my CLI tool. The tool currently just gets stuck in uninterruptible 
> sleep, resulting in a bad user experience.
>
> Here are solutions I've thought about that don't seem to work (please
> correct me if I'm wrong!):
> 1. Setting the signal mask to include SIGTSTP. I do want to be able to 
> send the child SIGTSTP after the clone(), and in my case the child is a 
> third-party process so I can't depend on it to reset the signal mask.
> 2. Spawning a stub process that execves the real child. It seems like
> the same issue exists when the main process spawns the stub process, if
> I'm understanding the code correctly, so this won't help.
>
> ... though now as I'm writing this email out, maybe one solution is:
>
> * my tool spawns a stub process with SIGTSTP masked.
> * the stub unmasks SIGTSTP (so it could receive the SIGTSTP here,
> but at least it won't block the parent process), then execves the
> third-party process.
>
> Is that the solution you would recommend?
>
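
For illustration, the stub idea above might look roughly like this in
C. This is only a sketch under assumptions: the helper names
spawn_with_sigtstp_blocked and stub_unblock_and_exec are made up, the
stub is assumed to be a small separate executable whose main() calls
the second helper, and whether this fully avoids the hang is exactly
the open question in this message.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Parent side (hypothetical helper): spawn `path` with SIGTSTP blocked
   in the child's initial signal mask. Returns 0 on success or an error
   number, like posix_spawn itself. */
int spawn_with_sigtstp_blocked(pid_t *pid, const char *path,
                               char *const argv[])
{
    posix_spawnattr_t attr;
    sigset_t mask;

    int rc = posix_spawnattr_init(&attr);
    if (rc != 0)
        return rc;

    sigemptyset(&mask);
    sigaddset(&mask, SIGTSTP);
    rc = posix_spawnattr_setsigmask(&attr, &mask);
    if (rc == 0)
        rc = posix_spawnattr_setflags(&attr, POSIX_SPAWN_SETSIGMASK);
    if (rc == 0)
        rc = posix_spawn(pid, path, NULL, &attr, argv, environ);

    posix_spawnattr_destroy(&attr);
    return rc;
}

/* Stub side (hypothetical helper): unblock SIGTSTP, then replace the
   stub with the real program. A pending SIGTSTP may stop the stub
   here, but by then the parent's spawn call has already returned.
   Returns -1 only if something failed. */
int stub_unblock_and_exec(const char *path, char *const argv[])
{
    sigset_t mask;

    sigemptyset(&mask);
    sigaddset(&mask, SIGTSTP);
    if (sigprocmask(SIG_UNBLOCK, &mask, NULL) != 0)
        return -1;

    execv(path, argv);
    return -1; /* reached only if execv failed */
}
```

In a real tool the parent would spawn the stub binary through the first
helper and pass the third-party command line along in argv.
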
> Thanks.
>
> --
>
> Rain
> (they/she)

I apologize for the lack of wrapped text -- I didn't realize that my
MUA (Fastmail) doesn't wrap plaintext emails. I believe the quoted text
in this response should be correctly wrapped.

-- 
Rain
(they/she)
