public inbox for libc-help@sourceware.org
From: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
To: Rain <glibc@sunshowers.io>, libc-help@sourceware.org
Subject: Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
Date: Mon, 22 Aug 2022 13:51:03 -0300
Message-ID: <7727e4de-a8da-1e6b-4d7c-68a132750996@linaro.org>
In-Reply-To: <2921668c-773e-465d-9480-0abb6f979bf9@www.fastmail.com>



On 14/08/22 00:30, Rain wrote:
> Hi there --
> 
> I've been working on a CLI tool (in Rust) that spawns lots of processes with posix_spawn. Specifically, I've been observing its behavior when Ctrl-Z is pressed in a terminal, and the process group receives a SIGTSTP signal. I'm seeing an issue where if the signal is received early enough during the posix_spawn process, the parent can be stuck in the middle of the clone3() syscall, an uninterruptible sleep status.
> 
> Here are some backtraces, observed with glibc 2.35 and Linux kernel 5.18.10-76051810-generic on Ubuntu 22.04 (x86_64). I checked glibc master and I'm not seeing any code changes in this area, so I presume this issue still exists.
> 
> In this case, during setup, posix_spawnattr_setsigmask is called with an empty signal set. However, based on reading the source code, I don't think that's relevant.
> 
> --- parent process ---
> 
> (gdb) bt
> #0  clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
> #1  0x00007f12a0a37a51 in __GI___clone_internal (cl_args=cl_args@entry=0x7f129a5ed9e0, func=func@entry=0x7f12a0a24300 <__spawni_child>, arg=arg@entry=0x7f129a5eda40)
>     at ../sysdeps/unix/sysv/linux/clone-internal.c:54
> #2  0x00007f12a0a241f3 in __spawnix (pid=0x7f129a5edd20, file=0x7f123405d030 "/home/rain/dev/tokio/target/debug/deps/sync_mutex-22a40a7c6051156b", file_actions=0x7f129a5edd60, 
>     attrp=<optimized out>, argv=<optimized out>, envp=0x7f123403f2e0, xflags=1, exec=0x7f12a09fcdd0 <__execvpex>) at ../sysdeps/unix/sysv/linux/spawni.c:388
> #3  0x00007f12a0a2490b in __spawni (pid=<optimized out>, file=<optimized out>, acts=<optimized out>, attrp=<optimized out>, argv=<optimized out>, envp=<optimized out>, xflags=1)
>     at ../sysdeps/unix/sysv/linux/spawni.c:436
> #4  0x00007f12a0a2403f in __posix_spawnp (pid=<optimized out>, file=<optimized out>, file_actions=<optimized out>, attrp=<optimized out>, argv=<optimized out>, envp=<optimized out>)
>     at ./posix/spawnp.c:30
> #5  0x000056199dee0811 in std::sys::unix::process::process_common::Command::posix_spawn () at library/std/src/sys/unix/process/process_unix.rs:544
> #6  std::sys::unix::process::process_common::Command::spawn () at library/std/src/sys/unix/process/process_unix.rs:57
> #7  0x000056199ded68dc in std::process::Command::spawn () at library/std/src/process.rs:881
> 
> --- child process ---
> 
> (gdb) bt
> #0  __GI___pthread_sigmask (how=how@entry=2, newmask=<optimized out>, oldmask=oldmask@entry=0x0) at ./nptl/pthread_sigmask.c:43
> #1  0x00007faaf8edd71d in __GI___sigprocmask (how=how@entry=2, set=<optimized out>, oset=oset@entry=0x0) at ../sysdeps/unix/sysv/linux/sigprocmask.c:25
> #2  0x00007faaf8fae4d8 in __spawni_child (arguments=<optimized out>) at ../sysdeps/unix/sysv/linux/spawni.c:287
> #3  0x00007faaf8fc1a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
> 
> ---
> 
> Based on these backtraces and reading the source code, here's what I believe is happening:
> 
> 1. The parent calls __posix_spawnp, which in turn calls __spawni and __spawnix.
> 2. The parent calls clone3 and enters uninterruptible sleep.
> 3. The child enters __spawni_child and blocks all incoming signals.

In fact glibc does not block signals here; it resets each handler to
SIG_DFL, either when the handler is not SIG_IGN or when
POSIX_SPAWN_SETSIGDEF is set for that signal.  However, this does not
matter for SIGSTOP, since SIGSTOP cannot be set to SIG_IGN.
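
To make the rule above concrete, here is a minimal sketch of the
per-signal decision.  It is only an illustration: should_reset_to_default
is a hypothetical helper, not a glibc function, and the real code in
spawni.c may be structured differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <spawn.h>
#include <stdbool.h>

/* Hypothetical helper (not part of glibc): decide whether the child
   should reset the disposition of SIG to SIG_DFL before exec.
   FLAGS and SIGDEFAULT stand for the values stored in the spawn
   attributes; CURRENT is the handler inherited from the parent.  */
bool
should_reset_to_default (int sig, short flags, const sigset_t *sigdefault,
                         void (*current) (int))
{
  /* Explicitly requested through POSIX_SPAWN_SETSIGDEF.  */
  if ((flags & POSIX_SPAWN_SETSIGDEF) && sigismember (sigdefault, sig) == 1)
    return true;

  /* An ignored signal stays ignored across exec.  */
  if (current == SIG_IGN)
    return false;

  /* A caught signal cannot keep its handler across exec, so it goes
     back to the default action.  */
  return true;
}
```

A caller would loop over the signal numbers and apply sigaction with
SIG_DFL whenever this returns true.  SIGKILL and SIGSTOP cannot be
changed either way, so sigaction simply fails for them.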

> ---> 4. At this point the child receives a SIGTSTP signal. <---
> 5. The child unblocks signals by calling sigprocmask/pthread_sigmask.
> 6. At this point the SIGTSTP is delivered to the child.

AFAIK SIGSTOP is asynchronous and can be delivered at any time during
process execution.

> 7. However, the clone hasn't exited in the parent and so it remains stuck in the clone3 syscall until the child receives a SIGCONT.
> 
> I'm not sure what a reasonable way to handle this would be on the part of my CLI tool. The tool currently just gets stuck in uninterruptible sleep, resulting in a bad user experience.

Reading through both your Twitter discussion and the bug report against
your tool [1], I think that how to handle SIGSTOP for the helper process
itself, in the tiny window between process creation and the setpgid
call, is outside the posix_spawn specification.

> 
> Here are solutions I've thought about that don't seem to work (please correct me if I'm wrong!)
> 1. Setting the signal mask to include SIGTSTP. I do want to be able to send the child SIGTSTP after the clone(), and in my case the child is a third-party process so I can't depend on it to reset the signal mask.
> 2. Spawning a stub process that execves the real child. It seems like the same issue exists when the main process calls the stub process, if I'm understanding the code correctly, so this won't help.
> 
> ... though now as I'm writing this email out, maybe one solution is:
> 
> * my tool spawns a stub process with SIGTSTP masked.
> * the subprocess unmasks SIGTSTP (so it could receive the SIGTSTP here, but at least it won't block the parent process), then execves the third-party process.
> 
> Is that the solution you would recommend?

I am not sure this would work, since SIGSTOP cannot be caught, blocked, or
ignored.  What I think might work is to spawn a stub process and make
it a new session leader with setsid, so it will not have a controlling
terminal.  The stub process will be responsible for spawning new processes,
so any interaction with the controlling terminal (the Ctrl-Z) won't affect
the posix_spawn helper thread.
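
A rough sketch of the stub idea, under a few assumptions:
run_via_session_stub is a hypothetical name, the stub is created with
plain fork, and it runs a single program rather than acting as a
long-lived spawner.  Error handling is kept to the minimum.

```c
#define _POSIX_C_SOURCE 200809L
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run ARGV[0] from a stub process that lives in
   its own session.  Returns the program's exit status, or -1 if the
   stub could not be created or did not exit normally.  */
int
run_via_session_stub (char *const argv[])
{
  pid_t stub = fork ();
  if (stub < 0)
    return -1;

  if (stub == 0)
    {
      /* Stub: become a session leader, which detaches it from the
         controlling terminal, so terminal-generated signals are no
         longer sent to it.  */
      if (setsid () < 0)
        _exit (127);

      pid_t child;
      if (posix_spawnp (&child, argv[0], NULL, NULL, argv, environ) != 0)
        _exit (127);

      int status;
      if (waitpid (child, &status, 0) < 0)
        _exit (127);
      _exit (WIFEXITED (status) ? WEXITSTATUS (status) : 126);
    }

  /* Parent: wait for the stub and report the status it relays.  */
  int status;
  if (waitpid (stub, &status, 0) < 0)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

In a real tool the stub would more likely stay alive and take spawn
requests over a pipe or socket, so that every posix_spawn call happens
outside the terminal's session.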

You will probably need to open the controlling terminal in raw mode so you
can catch Ctrl-Z and pass it along to the expected process groups.
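
As an illustration of the terminal side, the sketch below edits a
termios structure so that Ctrl-Z is read as an ordinary byte instead of
generating SIGTSTP.  disable_tty_signals is a hypothetical helper, and
clearing ISIG, ICANON and ECHO is only one possible reading of "raw
mode"; cfmakeraw on glibc clears more flags.

```c
#include <termios.h>

/* Hypothetical helper: change T so the terminal driver no longer turns
   Ctrl-C, Ctrl-Z and Ctrl-\ into signals, and so input is delivered
   byte by byte without echo.  The caller fetches T with tcgetattr and
   applies it with tcsetattr.  */
void
disable_tty_signals (struct termios *t)
{
  t->c_lflag &= ~(ISIG | ICANON | ECHO);
  t->c_cc[VMIN] = 1;   /* read returns as soon as one byte is available */
  t->c_cc[VTIME] = 0;  /* no inter-byte timeout */
}
```

With this applied to the tool's terminal, Ctrl-Z shows up in read as the
byte 0x1A, and the tool can then call kill with a negative process group
ID and SIGTSTP for each group it wants to stop.  The original settings
should be saved and restored on exit.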

> 
> Thanks.
 

[1] https://github.com/nextest-rs/nextest/pull/470#issue-1338100182
