public inbox for libc-help@sourceware.org
* posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
@ 2022-08-14  3:30 Rain
  2022-08-14  3:38 ` Rain
  2022-08-22 16:51 ` Adhemerval Zanella Netto
  0 siblings, 2 replies; 20+ messages in thread
From: Rain @ 2022-08-14  3:30 UTC
  To: libc-help

Hi there --

I've been working on a CLI tool (in Rust) that spawns lots of processes with posix_spawn. Specifically, I've been observing its behavior when Ctrl-Z is pressed in a terminal and the process group receives a SIGTSTP signal. I'm seeing an issue where, if the signal is received early enough during the posix_spawn call, the parent can get stuck in the middle of the clone3() syscall, in uninterruptible sleep.

Here are some backtraces, observed with glibc 2.35 and Linux kernel 5.18.10-76051810-generic on Ubuntu 22.04 (x86_64). I checked glibc master and I'm not seeing any code changes in this area, so I presume this issue still exists.

In this case, during setup, posix_spawnattr_setsigmask is called with an empty signal set. However, based on reading the source code, I don't think that's relevant.

--- parent process ---

(gdb) bt
#0  clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
#1  0x00007f12a0a37a51 in __GI___clone_internal (cl_args=cl_args@entry=0x7f129a5ed9e0, func=func@entry=0x7f12a0a24300 <__spawni_child>, arg=arg@entry=0x7f129a5eda40)
    at ../sysdeps/unix/sysv/linux/clone-internal.c:54
#2  0x00007f12a0a241f3 in __spawnix (pid=0x7f129a5edd20, file=0x7f123405d030 "/home/rain/dev/tokio/target/debug/deps/sync_mutex-22a40a7c6051156b", file_actions=0x7f129a5edd60, 
    attrp=<optimized out>, argv=<optimized out>, envp=0x7f123403f2e0, xflags=1, exec=0x7f12a09fcdd0 <__execvpex>) at ../sysdeps/unix/sysv/linux/spawni.c:388
#3  0x00007f12a0a2490b in __spawni (pid=<optimized out>, file=<optimized out>, acts=<optimized out>, attrp=<optimized out>, argv=<optimized out>, envp=<optimized out>, xflags=1)
    at ../sysdeps/unix/sysv/linux/spawni.c:436
#4  0x00007f12a0a2403f in __posix_spawnp (pid=<optimized out>, file=<optimized out>, file_actions=<optimized out>, attrp=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at ./posix/spawnp.c:30
#5  0x000056199dee0811 in std::sys::unix::process::process_common::Command::posix_spawn () at library/std/src/sys/unix/process/process_unix.rs:544
#6  std::sys::unix::process::process_common::Command::spawn () at library/std/src/sys/unix/process/process_unix.rs:57
#7  0x000056199ded68dc in std::process::Command::spawn () at library/std/src/process.rs:881

--- child process ---

(gdb) bt
#0  __GI___pthread_sigmask (how=how@entry=2, newmask=<optimized out>, oldmask=oldmask@entry=0x0) at ./nptl/pthread_sigmask.c:43
#1  0x00007faaf8edd71d in __GI___sigprocmask (how=how@entry=2, set=<optimized out>, oset=oset@entry=0x0) at ../sysdeps/unix/sysv/linux/sigprocmask.c:25
#2  0x00007faaf8fae4d8 in __spawni_child (arguments=<optimized out>) at ../sysdeps/unix/sysv/linux/spawni.c:287
#3  0x00007faaf8fc1a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

---

Based on these backtraces and reading the source code, here's what I believe is happening:

1. The parent calls __posix_spawnp, which in turn calls __spawni and __spawnix.
2. The parent calls clone3 and enters uninterruptible sleep.
3. The child enters __spawni_child and blocks all incoming signals.
---> 4. At this point the child receives a SIGTSTP signal. <---
5. The child unblocks signals by calling sigprocmask/pthread_sigmask.
6. At this point the SIGTSTP is delivered to the child.
7. However, the clone hasn't exited in the parent and so it remains stuck in the clone3 syscall until the child receives a SIGCONT.
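
The block / pending / unblock part of this sequence (steps 3 to 6) can be sketched in isolation. This is only a minimal illustration, not glibc's code: pending_until_unblocked and on_tstp are hypothetical names, and a handler is installed for SIGTSTP so that the process records the delivery instead of actually stopping.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

static volatile sig_atomic_t delivered;

/* Hypothetical handler: record the delivery instead of stopping. */
static void on_tstp(int sig)
{
    (void)sig;
    delivered = 1;
}

/* Returns 1 if SIGTSTP stayed pending while blocked and was delivered
   only once it was unblocked, 0 otherwise, -1 on setup failure. */
int pending_until_unblocked(void)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, pending;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tstp;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTSTP, &sa, &old_sa) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGTSTP);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) != 0)
        return -1;

    /* The signal arrives while it is blocked: it is queued, not delivered. */
    delivered = 0;
    raise(SIGTSTP);
    sigpending(&pending);
    int was_pending = sigismember(&pending, SIGTSTP) == 1 && delivered == 0;

    /* Unblocking delivers the pending signal before sigprocmask returns. */
    sigprocmask(SIG_UNBLOCK, &block, NULL);
    int ran_on_unblock = delivered == 1;

    /* Restore the caller's mask and disposition. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGTSTP, &old_sa, NULL);
    return was_pending && ran_on_unblock;
}
```

In the scenario described above the default disposition is still in place, so the same delivery at unblock time would stop the child rather than run a handler.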

I'm not sure what a reasonable way to handle this would be on the part of my CLI tool. The tool currently just gets stuck in uninterruptible sleep, resulting in a bad user experience.

Here are solutions I've thought about that don't seem to work (please correct me if I'm wrong!)
1. Setting the signal mask to include SIGTSTP. I do want to be able to send the child SIGTSTP after the clone(), and in my case the child is a third-party process so I can't depend on it to reset the signal mask.
2. Spawning a stub process that execves the real child. It seems like the same issue exists when the main process calls the stub process, if I'm understanding the code correctly, so this won't help.

... though now as I'm writing this email out, maybe one solution is:

* my tool spawns a stub process with SIGTSTP masked.
* the subprocess unmasks SIGTSTP (so it could receive the SIGTSTP here, but at least it won't block the parent process), then execves the third-party process.

Is that the solution you would recommend?

Thanks.

--

Rain
(they/she)


* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-08-14  3:30 posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough Rain
@ 2022-08-14  3:38 ` Rain
  2022-08-22 16:51 ` Adhemerval Zanella Netto
  1 sibling, 0 replies; 20+ messages in thread
From: Rain @ 2022-08-14  3:38 UTC
  To: libc-help

On Sat, Aug 13, 2022, at 20:30, Rain wrote:
> Hi there --
>
> I've been working on a CLI tool (in Rust) that spawns lots of processes 
> with posix_spawn. Specifically, I've been observing its behavior when 
> Ctrl-Z is pressed in a terminal, and the process group receives a 
> SIGTSTP signal. I'm seeing an issue where if the signal is received 
> early enough during the posix_spawn process, the parent can be stuck in 
> the middle of the clone3() syscall, an uninterruptible sleep status.
>
> Here are some backtraces, observed with glibc 2.35 and Linux kernel 
> 5.18.10-76051810-generic on Ubuntu 22.04 (x86_64). I checked glibc 
> master and I'm not seeing any code changes in this area, so I presume 
> this issue still exists.
>
> In this case, during setup, posix_spawnattr_setsigmask is called with
> an empty signal set. However, based on reading the source code, I don't
> think that's relevant.
>
> --- parent process ---
>
> (gdb) bt
> #0  clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
> #1  0x00007f12a0a37a51 in __GI___clone_internal 
> (cl_args=cl_args@entry=0x7f129a5ed9e0, func=func@entry=0x7f12a0a24300 
> <__spawni_child>, arg=arg@entry=0x7f129a5eda40)
>     at ../sysdeps/unix/sysv/linux/clone-internal.c:54
> #2  0x00007f12a0a241f3 in __spawnix (pid=0x7f129a5edd20, 
> file=0x7f123405d030 
> "/home/rain/dev/tokio/target/debug/deps/sync_mutex-22a40a7c6051156b", 
> file_actions=0x7f129a5edd60, 
>     attrp=<optimized out>, argv=<optimized out>, envp=0x7f123403f2e0, 
> xflags=1, exec=0x7f12a09fcdd0 <__execvpex>) at 
> ../sysdeps/unix/sysv/linux/spawni.c:388
> #3  0x00007f12a0a2490b in __spawni (pid=<optimized out>, 
> file=<optimized out>, acts=<optimized out>, attrp=<optimized out>, 
> argv=<optimized out>, envp=<optimized out>, xflags=1)
>     at ../sysdeps/unix/sysv/linux/spawni.c:436
> #4  0x00007f12a0a2403f in __posix_spawnp (pid=<optimized out>, 
> file=<optimized out>, file_actions=<optimized out>, attrp=<optimized 
> out>, argv=<optimized out>, envp=<optimized out>)
>     at ./posix/spawnp.c:30
> #5  0x000056199dee0811 in 
> std::sys::unix::process::process_common::Command::posix_spawn () at 
> library/std/src/sys/unix/process/process_unix.rs:544
> #6  std::sys::unix::process::process_common::Command::spawn () at 
> library/std/src/sys/unix/process/process_unix.rs:57
> #7  0x000056199ded68dc in std::process::Command::spawn () at 
> library/std/src/process.rs:881
>
> --- child process ---
>
> (gdb) bt
> #0  __GI___pthread_sigmask (how=how@entry=2, newmask=<optimized out>, 
> oldmask=oldmask@entry=0x0) at ./nptl/pthread_sigmask.c:43
> #1  0x00007faaf8edd71d in __GI___sigprocmask (how=how@entry=2, 
> set=<optimized out>, oset=oset@entry=0x0) at 
> ../sysdeps/unix/sysv/linux/sigprocmask.c:25
> #2  0x00007faaf8fae4d8 in __spawni_child (arguments=<optimized out>) at 
> ../sysdeps/unix/sysv/linux/spawni.c:287
> #3  0x00007faaf8fc1a00 in clone3 () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
>
> ---
>
> Based on these backtraces and reading the source code, here's what I 
> believe is happening:
>
> 1. The parent calls __posix_spawnp, which in turn calls __spawni and 
> __spawnix.
> 2. The parent calls clone3 and enters uninterruptible sleep.
> 3. The child enters __spawni_child and blocks all incoming signals.
> ---> 4. At this point the child receives a SIGTSTP signal. <---
> 5. The child unblocks signals by calling sigprocmask/pthread_sigmask.
> 6. At this point the SIGTSTP is delivered to the child.
> 7. However, the clone hasn't exited in the parent and so it remains 
> stuck in the clone3 syscall until the child receives a SIGCONT.
>
> I'm not sure what a reasonable way to handle this would be on the part 
> of my CLI tool. The tool currently just gets stuck in uninterruptible 
> sleep, resulting in a bad user experience.
>
> Here are solutions I've thought about that don't seem to work (please 
> correct me if I'm wrong!)
> 1. Setting the signal mask to include SIGTSTP. I do want to be able to 
> send the child SIGTSTP after the clone(), and in my case the child is a 
> third-party process so I can't depend on it to reset the signal mask.
> 2. Spawning a stub process that execves the real child. It seems like 
> the same issue exists when the main process calls the stub process, if 
> I'm understanding the code correctly, so this won't help.
>
> ... though now as I'm writing this email out, maybe one solution is:
>
> * my tool spawns a stub process with SIGTSTP masked.
> * the subprocess unmasks SIGTSTP (so it could receive the SIGTSTP here, 
> but at least it won't block the parent process), then execves the 
> third-party process.
>
> Is that the solution you would recommend?
>
> Thanks.
>
> --
>
> Rain
> (they/she)

I apologize for the lack of wrapped text -- I didn't realize that my
MUA (Fastmail) doesn't wrap plaintext emails. I believe the quoted text
in this response should be correctly wrapped.

-- 
Rain
(they/she)


* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-08-14  3:30 posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough Rain
  2022-08-14  3:38 ` Rain
@ 2022-08-22 16:51 ` Adhemerval Zanella Netto
  2022-08-22 17:00   ` Rain
  1 sibling, 1 reply; 20+ messages in thread
From: Adhemerval Zanella Netto @ 2022-08-22 16:51 UTC
  To: Rain, libc-help



On 14/08/22 00:30, Rain wrote:
> Hi there --
> 
> I've been working on a CLI tool (in Rust) that spawns lots of processes with posix_spawn. Specifically, I've been observing its behavior when Ctrl-Z is pressed in a terminal, and the process group receives a SIGTSTP signal. I'm seeing an issue where if the signal is received early enough during the posix_spawn process, the parent can be stuck in the middle of the clone3() syscall, an uninterruptible sleep status.
> 
> Here are some backtraces, observed with glibc 2.35 and Linux kernel 5.18.10-76051810-generic on Ubuntu 22.04 (x86_64). I checked glibc master and I'm not seeing any code changes in this area, so I presume this issue still exists.
> 
> In this case, during setup, posix_spawnattr_setsigmask is called with an empty signal set. However, based on reading the source code, I don't think that's relevant.
> 
> --- parent process ---
> 
> (gdb) bt
> #0  clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
> #1  0x00007f12a0a37a51 in __GI___clone_internal (cl_args=cl_args@entry=0x7f129a5ed9e0, func=func@entry=0x7f12a0a24300 <__spawni_child>, arg=arg@entry=0x7f129a5eda40)
>     at ../sysdeps/unix/sysv/linux/clone-internal.c:54
> #2  0x00007f12a0a241f3 in __spawnix (pid=0x7f129a5edd20, file=0x7f123405d030 "/home/rain/dev/tokio/target/debug/deps/sync_mutex-22a40a7c6051156b", file_actions=0x7f129a5edd60, 
>     attrp=<optimized out>, argv=<optimized out>, envp=0x7f123403f2e0, xflags=1, exec=0x7f12a09fcdd0 <__execvpex>) at ../sysdeps/unix/sysv/linux/spawni.c:388
> #3  0x00007f12a0a2490b in __spawni (pid=<optimized out>, file=<optimized out>, acts=<optimized out>, attrp=<optimized out>, argv=<optimized out>, envp=<optimized out>, xflags=1)
>     at ../sysdeps/unix/sysv/linux/spawni.c:436
> #4  0x00007f12a0a2403f in __posix_spawnp (pid=<optimized out>, file=<optimized out>, file_actions=<optimized out>, attrp=<optimized out>, argv=<optimized out>, envp=<optimized out>)
>     at ./posix/spawnp.c:30
> #5  0x000056199dee0811 in std::sys::unix::process::process_common::Command::posix_spawn () at library/std/src/sys/unix/process/process_unix.rs:544
> #6  std::sys::unix::process::process_common::Command::spawn () at library/std/src/sys/unix/process/process_unix.rs:57
> #7  0x000056199ded68dc in std::process::Command::spawn () at library/std/src/process.rs:881
> 
> --- child process ---
> 
> (gdb) bt
> #0  __GI___pthread_sigmask (how=how@entry=2, newmask=<optimized out>, oldmask=oldmask@entry=0x0) at ./nptl/pthread_sigmask.c:43
> #1  0x00007faaf8edd71d in __GI___sigprocmask (how=how@entry=2, set=<optimized out>, oset=oset@entry=0x0) at ../sysdeps/unix/sysv/linux/sigprocmask.c:25
> #2  0x00007faaf8fae4d8 in __spawni_child (arguments=<optimized out>) at ../sysdeps/unix/sysv/linux/spawni.c:287
> #3  0x00007faaf8fc1a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
> 
> ---
> 
> Based on these backtraces and reading the source code, here's what I believe is happening:
> 
> 1. The parent calls __posix_spawnp, which in turn calls __spawni and __spawnix.
> 2. The parent calls clone3 and enters uninterruptible sleep.
> 3. The child enters __spawni_child and blocks all incoming signals.

In fact glibc does not block, but rather sets each handler to SIG_DFL if it
is not SIG_IGN, or to SIG_DFL if POSIX_SPAWN_SETSIGDEF is set.  However,
it does not matter for SIGSTOP since we cannot set it to SIG_IGN.

> ---> 4. At this point the child receives a SIGTSTP signal. <---
> 5. The child unblocks signals by calling sigprocmask/pthread_sigmask.
> 6. At this point the SIGTSTP is delivered to the child.

AFAIK SIGSTOP is asynchronous and can be delivered at any time during process
execution.

> 7. However, the clone hasn't exited in the parent and so it remains stuck in the clone3 syscall until the child receives a SIGCONT.
> 
> I'm not sure what a reasonable way to handle this would be on the part of my CLI tool. The tool currently just gets stuck in uninterruptible sleep, resulting in a bad user experience.

Having read through both your Twitter discussion and the bug report against
your tool [1], I think how to handle SIGSTOP for the helper process itself,
in the tiny window between process creation and the setpgid, is outside the
posix_spawn specification.

> 
> Here are solutions I've thought about that don't seem to work (please correct me if I'm wrong!)
> 1. Setting the signal mask to include SIGTSTP. I do want to be able to send the child SIGTSTP after the clone(), and in my case the child is a third-party process so I can't depend on it to reset the signal mask.
> 2. Spawning a stub process that execves the real child. It seems like the same issue exists when the main process calls the stub process, if I'm understanding the code correctly, so this won't help.
> 
> ... though now as I'm writing this email out, maybe one solution is:
> 
> * my tool spawns a stub process with SIGTSTP masked.
> * the subprocess unmasks SIGTSTP (so it could receive the SIGTSTP here, but at least it won't block the parent process), then execves the third-party process.
> 
> Is that the solution you would recommend?

I am not sure this would work, since SIGSTOP cannot be caught, blocked, or
ignored.  What I think might work is to spawn a stub process and make
it a new session leader with setsid so it will not have a controlling
terminal.  The stub process will be responsible for spawning new processes,
so any interaction with the controlling terminal (the CTRL+Z) won't affect
the posix_spawn helper thread.

You will probably need to open the controlling terminal in raw mode so you
can catch ctrl-z and pass it along to the expected process groups.
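
The first half of that idea, a stub that becomes a session leader, could look roughly like the sketch below. It is an illustration under assumptions: stub_becomes_session_leader is a hypothetical name, the stub is created with fork for brevity, and the raw-mode terminal handling and the actual spawning are left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a stub that starts a new session.  Returns 1 if the stub reports
   that it became the leader of its own session, 0 if not, -1 on error. */
int stub_becomes_session_leader(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Stub: a freshly forked child is not a process group leader,
           so setsid() is allowed here.  The new session starts without
           a controlling terminal. */
        if (setsid() == (pid_t)-1)
            _exit(1);
        /* A real stub would now spawn the actual children. */
        _exit(getsid(0) == getpid() ? 0 : 2);
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```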

> 
> Thanks.
 

[1] https://github.com/nextest-rs/nextest/pull/470#issue-1338100182


* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-08-22 16:51 ` Adhemerval Zanella Netto
@ 2022-08-22 17:00   ` Rain
  2022-08-22 17:48     ` Adhemerval Zanella Netto
  0 siblings, 1 reply; 20+ messages in thread
From: Rain @ 2022-08-22 17:00 UTC
  To: Adhemerval Zanella Netto, libc-help

On Mon, Aug 22, 2022, at 09:51, Adhemerval Zanella Netto wrote:

<snip>

>> ---
>> 
>> Based on these backtraces and reading the source code, here's what I believe is happening:
>> 
>> 1. The parent calls __posix_spawnp, which in turn calls __spawni and __spawnix.
>> 2. The parent calls clone3 and enters uninterruptible sleep.
>> 3. The child enters __spawni_child and blocks all incoming signals.
>
> In fact glibc does not block, but rather sets each handler to SIG_DFL if it
> is not SIG_IGN, or to SIG_DFL if POSIX_SPAWN_SETSIGDEF is set.  However,
> it does not matter for SIGSTOP since we cannot set it to SIG_IGN.
>
>> ---> 4. At this point the child receives a SIGTSTP signal. <---
>> 5. The child unblocks signals by calling sigprocmask/pthread_sigmask.
>> 6. At this point the SIGTSTP is delivered to the child.
>
> AFAIK SIGSTOP is asynchronous and can be delivered at any time during process
> execution.

Thank you for the response! To be clear, I'm referring to SIGTSTP (Ctrl+Z) [1], not
SIGSTOP. I understand that SIGSTOP cannot be blocked. However, SIGTSTP (which is
a different signal which can be blocked) is what I'm concerned about.
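
The difference can be checked directly. A minimal sketch, relying on the POSIX rule that attempts to block SIGSTOP (or SIGKILL) are silently ignored by sigprocmask; can_block is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Try to block `sig`, then report whether it really ended up in the
   signal mask.  Returns 1 if blocked, 0 if not, -1 on error.  The
   caller's original mask is restored before returning. */
int can_block(int sig)
{
    sigset_t set, old, cur;

    sigemptyset(&set);
    if (sigaddset(&set, sig) != 0)
        return -1;
    if (sigprocmask(SIG_BLOCK, &set, &old) != 0)
        return -1;
    /* A NULL set leaves the mask unchanged and just reports it. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    int blocked = sigismember(&cur, sig);
    sigprocmask(SIG_SETMASK, &old, NULL);
    return blocked;
}
```

On Linux, can_block(SIGTSTP) is expected to return 1 and can_block(SIGSTOP) to return 0.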

>
>> 7. However, the clone hasn't exited in the parent and so it remains stuck in the clone3 syscall until the child receives a SIGCONT.
>> 
>> I'm not sure what a reasonable way to handle this would be on the part of my CLI tool. The tool currently just gets stuck in uninterruptible sleep, resulting in a bad user experience.
>
> Having read through both your Twitter discussion and the bug report against
> your tool [1], I think how to handle SIGSTOP for the helper process itself,
> in the tiny window between process creation and the setpgid, is outside the
> posix_spawn specification.
>
>> 
>> Here are solutions I've thought about that don't seem to work (please correct me if I'm wrong!)
>> 1. Setting the signal mask to include SIGTSTP. I do want to be able to send the child SIGTSTP after the clone(), and in my case the child is a third-party process so I can't depend on it to reset the signal mask.
>> 2. Spawning a stub process that execves the real child. It seems like the same issue exists when the main process calls the stub process, if I'm understanding the code correctly, so this won't help.
>> 
>> ... though now as I'm writing this email out, maybe one solution is:
>> 
>> * my tool spawns a stub process with SIGTSTP masked.
>> * the subprocess unmasks SIGTSTP (so it could receive the SIGTSTP here, but at least it won't block the parent process), then execves the third-party process.
>> 
>> Is that the solution you would recommend?
>
> I am not sure this would work, since SIGSTOP cannot be caught, blocked, or
> ignored.  What I think might work is to spawn a stub process and make
> it a new session leader with setsid so it will not have a controlling
> terminal.  The stub process will be responsible for spawning new processes,
> so any interaction with the controlling terminal (the CTRL+Z) won't affect
> the posix_spawn helper thread.

That is definitely an interesting solution. However, is it necessary given that
Ctrl+Z is actually SIGTSTP, which can be blocked?

Thanks again.

[1] https://www.gnu.org/software/libc/manual/html_node/Job-Control-Signals.html

>
> You will probably need to open the controlling terminal in raw mode so you
> can catch ctrl-z and pass it along to the expected process groups.
>
>> 
>> Thanks.
> 
>
> [1] https://github.com/nextest-rs/nextest/pull/470#issue-1338100182

-- 
Rain
(they/she)


* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-08-22 17:00   ` Rain
@ 2022-08-22 17:48     ` Adhemerval Zanella Netto
  2022-08-22 18:21       ` Florian Weimer
  2022-08-22 22:30       ` Adhemerval Zanella Netto
  0 siblings, 2 replies; 20+ messages in thread
From: Adhemerval Zanella Netto @ 2022-08-22 17:48 UTC
  To: Rain, libc-help



On 22/08/22 14:00, Rain wrote:
> On Mon, Aug 22, 2022, at 09:51, Adhemerval Zanella Netto wrote:
> 
> <snip>
> 
>>> ---
>>>
>>> Based on these backtraces and reading the source code, here's what I believe is happening:
>>>
>>> 1. The parent calls __posix_spawnp, which in turn calls __spawni and __spawnix.
>>> 2. The parent calls clone3 and enters uninterruptible sleep.
>>> 3. The child enters __spawni_child and blocks all incoming signals.
>>
>> In fact glibc does not block, but rather sets each handler to SIG_DFL if it
>> is not SIG_IGN, or to SIG_DFL if POSIX_SPAWN_SETSIGDEF is set.  However,
>> it does not matter for SIGSTOP since we cannot set it to SIG_IGN.
>>
>>> ---> 4. At this point the child receives a SIGTSTP signal. <---
>>> 5. The child unblocks signals by calling sigprocmask/pthread_sigmask.
>>> 6. At this point the SIGTSTP is delivered to the child.
>>
>> AFAIK SIGSTOP is asynchronous and can be delivered at any time during process
>> execution.
> 
> Thank you for the response! To be clear, I'm referring to SIGTSTP (Ctrl+Z) [1], not
> SIGSTOP. I understand that SIGSTOP cannot be blocked. However, SIGTSTP (which is
> a different signal which can be blocked) is what I'm concerned about.

Right, my mistake.  I understand the issue better now, although I am still puzzled
why SIGTSTP is only being triggered on sigprocmask (since the default action is still
to stop the process).

> 
>>
>>> 7. However, the clone hasn't exited in the parent and so it remains stuck in the clone3 syscall until the child receives a SIGCONT.
>>>
>>> I'm not sure what a reasonable way to handle this would be on the part of my CLI tool. The tool currently just gets stuck in uninterruptible sleep, resulting in a bad user experience.
>>
>> Having read through both your Twitter discussion and the bug report against
>> your tool [1], I think how to handle SIGSTOP for the helper process itself,
>> in the tiny window between process creation and the setpgid, is outside the
>> posix_spawn specification.
>>
>>>
>>> Here are solutions I've thought about that don't seem to work (please correct me if I'm wrong!)
>>> 1. Setting the signal mask to include SIGTSTP. I do want to be able to send the child SIGTSTP after the clone(), and in my case the child is a third-party process so I can't depend on it to reset the signal mask.
>>> 2. Spawning a stub process that execves the real child. It seems like the same issue exists when the main process calls the stub process, if I'm understanding the code correctly, so this won't help.
>>>
>>> ... though now as I'm writing this email out, maybe one solution is:
>>>
>>> * my tool spawns a stub process with SIGTSTP masked.
>>> * the subprocess unmasks SIGTSTP (so it could receive the SIGTSTP here, but at least it won't block the parent process), then execves the third-party process.
>>>
>>> Is that the solution you would recommend?
>>
>> I am not sure this would work, since SIGSTOP cannot be caught, blocked, or
>> ignored.  What I think might work is to spawn a stub process and make
>> it a new session leader with setsid so it will not have a controlling
>> terminal.  The stub process will be responsible for spawning new processes,
>> so any interaction with the controlling terminal (the CTRL+Z) won't affect
>> the posix_spawn helper thread.
> 
> That is definitely an interesting solution. However, is it necessary given that
> Ctrl+Z is actually SIGTSTP, which can be blocked?
> 
> Thanks again.

I think one possibility would be to set the default signal actions to SIG_IGN, similar
to what POSIX_SPAWN_SETSIGDEF does for SIG_DFL (Solaris has POSIX_SPAWN_SETSIGIGN_NP
as an extension).  It won't help much if the signal is received in the tiny window
between the helper process start and the sigaction call, so I am afraid this will only
decrease the likelihood of the deadlock, not eliminate it.

> 
> [1] https://www.gnu.org/software/libc/manual/html_node/Job-Control-Signals.html
> 
>>
>> You will probably need to open the controlling terminal in raw mode so you
>> can catch ctrl-z and pass it along to the expected process groups.
>>
>>>
>>> Thanks.
>>
>>
>> [1] https://github.com/nextest-rs/nextest/pull/470#issue-1338100182
> 


* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-08-22 17:48     ` Adhemerval Zanella Netto
@ 2022-08-22 18:21       ` Florian Weimer
  2022-08-22 18:32         ` Adhemerval Zanella Netto
  2022-08-22 22:30       ` Adhemerval Zanella Netto
  1 sibling, 1 reply; 20+ messages in thread
From: Florian Weimer @ 2022-08-22 18:21 UTC
  To: Adhemerval Zanella Netto via Libc-help; +Cc: Rain, Adhemerval Zanella Netto

* Adhemerval Zanella Netto via Libc-help:

> Right, my mistake.  I understand the issue better now, although I am
> still puzzled why SIGTSTP is only being triggered on sigprocmask (since
> the default action is still to stop the process).

I think it's a maskable stop, not an unmaskable one, like SIGSTOP.

This looks like a vfork-specific bug that can't happen with fork.  I don't
see how to fix it in a generic fashion because we can't unblock SIGTSTP
and launch the new process in an atomic fashion.

Thanks,
Florian



* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-08-22 18:21       ` Florian Weimer
@ 2022-08-22 18:32         ` Adhemerval Zanella Netto
  2022-08-22 22:28           ` Adhemerval Zanella Netto
  2022-09-13 10:04           ` Florian Weimer
  0 siblings, 2 replies; 20+ messages in thread
From: Adhemerval Zanella Netto @ 2022-08-22 18:32 UTC
  To: Florian Weimer, Adhemerval Zanella Netto via Libc-help; +Cc: Rain



On 22/08/22 15:21, Florian Weimer wrote:
> * Adhemerval Zanella Netto via Libc-help:
> 
>> Right, my mistake.  I understand the issue better now, although I am
>> still puzzled why SIGTSTP is only being triggered on sigprocmask (since
>> the default action is still to stop the process).
> 
> I think it's a maskable stop, not an unmaskable one, like SIGSTOP.

Yeah, we do block the signal in the parent (internal_signal_block_all).

> 
> This looks like a vfork-specific bug that can't happen with fork.  I don't
> see how to fix it in a generic fashion because we can't unblock SIGTSTP
> and launch the new process in an atomic fashion.

We might ask for a new clone3 field to define the default signal mask on 
process start (and thus omit the final sigprocmask before execve).


* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-08-22 18:32         ` Adhemerval Zanella Netto
@ 2022-08-22 22:28           ` Adhemerval Zanella Netto
  2022-09-13 10:04           ` Florian Weimer
  1 sibling, 0 replies; 20+ messages in thread
From: Adhemerval Zanella Netto @ 2022-08-22 22:28 UTC
  To: Florian Weimer, Adhemerval Zanella Netto via Libc-help; +Cc: Rain



On 22/08/22 15:32, Adhemerval Zanella Netto wrote:
> 
> We might ask for a new clone3 field to define the default signal mask on 
> process start (and thus omit the final sigprocmask before execve).

And it does not make sense...


* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-08-22 17:48     ` Adhemerval Zanella Netto
  2022-08-22 18:21       ` Florian Weimer
@ 2022-08-22 22:30       ` Adhemerval Zanella Netto
  1 sibling, 0 replies; 20+ messages in thread
From: Adhemerval Zanella Netto @ 2022-08-22 22:30 UTC
  To: Rain, libc-help



On 22/08/22 14:48, Adhemerval Zanella Netto wrote:

>>>
>>>> 7. However, the clone hasn't exited in the parent and so it remains stuck in the clone3 syscall until the child receives a SIGCONT.
>>>>
>>>> I'm not sure what a reasonable way to handle this would be on the part of my CLI tool. The tool currently just gets stuck in uninterruptible sleep, resulting in a bad user experience.
>>>
>>> Having read through both your Twitter discussion and the bug report against
>>> your tool [1], I think how to handle SIGSTOP for the helper process itself,
>>> in the tiny window between process creation and the setpgid, is outside the
>>> posix_spawn specification.
>>>
>>>>
>>>> Here are solutions I've thought about that don't seem to work (please correct me if I'm wrong!)
>>>> 1. Setting the signal mask to include SIGTSTP. I do want to be able to send the child SIGTSTP after the clone(), and in my case the child is a third-party process so I can't depend on it to reset the signal mask.
>>>> 2. Spawning a stub process that execves the real child. It seems like the same issue exists when the main process calls the stub process, if I'm understanding the code correctly, so this won't help.
>>>>
>>>> ... though now as I'm writing this email out, maybe one solution is:
>>>>
>>>> * my tool spawns a stub process with SIGTSTP masked.
>>>> * the subprocess unmasks SIGTSTP (so it could receive the SIGTSTP here, but at least it won't block the parent process), then execves the third-party process.
>>>>
>>>> Is that the solution you would recommend?
>>>
>>> I am not sure this would work, since SIGSTOP cannot be caught, blocked, or
>>> ignored.  What I think might work is to spawn a stub process and make
>>> it a new session leader with setsid so it will not have a controlling
>>> terminal.  The stub process will be responsible for spawning new processes,
>>> so any interaction with the controlling terminal (the CTRL+Z) won't affect
>>> the posix_spawn helper thread.
>>
>> That is definitely an interesting solution. However, is it necessary given that
>> Ctrl+Z is actually SIGTSTP, which can be blocked?
>>
>> Thanks again.
> 
> I think one possibility would be to set the default signal actions to SIG_IGN, similar
> to what POSIX_SPAWN_SETSIGDEF does for SIG_DFL (Solaris has POSIX_SPAWN_SETSIGIGN_NP
> as an extension).  It won't help much if the signal is received in the tiny window
> between the helper process start and the sigaction call, so I am afraid this will only
> decrease the likelihood of the deadlock, not eliminate it.

Sorry, I got it backwards: I forgot that the glibc implementation does block
all signals prior to spawning the helper process.  In fact, using SIG_IGN as the default
signal handling would require the spawned process to restore it to SIG_DFL, and
would otherwise prevent any process from being stopped.

Which goes back to your original suggestion:

> * my tool spawns a stub process with SIGTSTP masked.
> * the subprocess unmasks SIGTSTP (so it could receive the SIGTSTP here, but at least it won't block the parent process), then execves the third-party process.

Which I agree is a way to handle it, since we can not atomically unblock SIGTSTP
and execve.

I am not sure this really qualifies as a posix_spawn bug for glibc, since
to really fix it we would either need to go back to using fork or use something
like a helper process.


* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-08-22 18:32         ` Adhemerval Zanella Netto
  2022-08-22 22:28           ` Adhemerval Zanella Netto
@ 2022-09-13 10:04           ` Florian Weimer
  2022-09-21 15:24             ` Adhemerval Zanella Netto
  1 sibling, 1 reply; 20+ messages in thread
From: Florian Weimer @ 2022-09-13 10:04 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: Adhemerval Zanella Netto via Libc-help, Rain

* Adhemerval Zanella Netto:

> On 22/08/22 15:21, Florian Weimer wrote:
>> * Adhemerval Zanella Netto via Libc-help:
>> 
>>> Right, my mistake.  I understood the issue better now, although I am
>>> still puzzled why SIGTSTP is only being triggered on sigprocmask (since the
>>> default action is still to stop the process).
>> 
>> I think it's a maskable stop, not an unmaskable one, like SIGSTOP.
>
> Yeah, we do block the signal on parent (internal_signal_block_all). 
>
>> 
>> This looks a vfork-specific bug that can't happen with fork.  I don't
>> see how to fix it in a generic fashion because we can't unblock SIGTSTP
>> and launch the new process in an atomic fashion.
>
> We might ask for a new clone3 field to define the default signal mask on 
> process start (and thus omit the final sigprocmask before execve).

It might already be possible to fix this using io_uring.  Unfortunately, I
didn't attend the LPC presentation.

Thanks,
Florian



* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-09-13 10:04           ` Florian Weimer
@ 2022-09-21 15:24             ` Adhemerval Zanella Netto
  2022-09-22 12:18               ` Florian Weimer
  0 siblings, 1 reply; 20+ messages in thread
From: Adhemerval Zanella Netto @ 2022-09-21 15:24 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Adhemerval Zanella Netto via Libc-help, Rain



On 13/09/22 07:04, Florian Weimer wrote:
> * Adhemerval Zanella Netto:
> 
>> On 22/08/22 15:21, Florian Weimer wrote:
>>> * Adhemerval Zanella Netto via Libc-help:
>>>
>>>> Right, my mistake.  I understood the issue better now, although I am
>>>> still puzzled why SIGTSTP is only being triggered on sigprocmask (since the
>>>> default action is still to stop the process).
>>>
>>> I think it's a maskable stop, not an unmaskable one, like SIGSTOP.
>>
>> Yeah, we do block the signal on parent (internal_signal_block_all). 
>>
>>>
>>> This looks a vfork-specific bug that can't happen with fork.  I don't
>>> see how to fix it in a generic fashion because we can't unblock SIGTSTP
>>> and launch the new process in an atomic fashion.
>>
>> We might ask for a new clone3 field to define the default signal mask on 
>> process start (and thus omit the final sigprocmask before execve).
> 
> It might already be possible to fix this using io_uring.  Unfortunately, I
> didn't attend the LPC presentation.

Is there anything that prevents us from avoiding CLONE_VFORK? The code already
uses an allocated stack and synchronizes with waitpid.


* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-09-21 15:24             ` Adhemerval Zanella Netto
@ 2022-09-22 12:18               ` Florian Weimer
  2022-09-22 16:56                 ` Adhemerval Zanella Netto
  0 siblings, 1 reply; 20+ messages in thread
From: Florian Weimer @ 2022-09-22 12:18 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: Adhemerval Zanella Netto via Libc-help, Rain

* Adhemerval Zanella Netto:

> On 13/09/22 07:04, Florian Weimer wrote:
>> * Adhemerval Zanella Netto:
>> 
>>> On 22/08/22 15:21, Florian Weimer wrote:
>>>> * Adhemerval Zanella Netto via Libc-help:
>>>>
>>>>> Right, my mistake.  I understood the issue better now, although I am
>>>>> still puzzled why SIGTSTP is only being triggered on sigprocmask (since the
>>>>> default action is still to stop the process).
>>>>
>>>> I think it's a maskable stop, not an unmaskable one, like SIGSTOP.
>>>
>>> Yeah, we do block the signal on parent (internal_signal_block_all). 
>>>
>>>>
>>>> This looks a vfork-specific bug that can't happen with fork.  I don't
>>>> see how to fix it in a generic fashion because we can't unblock SIGTSTP
>>>> and launch the new process in an atomic fashion.
>>>
>>> We might ask for a new clone3 field to define the default signal mask on 
>>> process start (and thus omit the final sigprocmask before execve).
>> 
>> It might already be possible to fix this using io_uring.  Unfortunately, I
>> didn't attend the LPC presentation.
>
> Is there anything that prevents to avoid using CLONE_VFORK? The code already
> uses a allocated stack and do synchronizes with waitpid.

Assuming there is a way to create a thread which gets replaced by execve
only (instead of the whole process), this won't work because we have to
block all signals for the new thread (it must not be visible to
application code, and signal handlers must not run on it), and we can't
unblock those signals prior to execve.  With vfork, we can unblock them
after changing the signal handler disposition to SIG_DFL (preventing the
handler execution), but per-thread signal handlers have been removed
from Linux.  So even if we somehow could prevent the termination signal
from being sent to the whole process (and not just the fake thread), we
still have a gap.

Thanks,
Florian



* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-09-22 12:18               ` Florian Weimer
@ 2022-09-22 16:56                 ` Adhemerval Zanella Netto
  2022-09-22 17:38                   ` Florian Weimer
  0 siblings, 1 reply; 20+ messages in thread
From: Adhemerval Zanella Netto @ 2022-09-22 16:56 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Adhemerval Zanella Netto via Libc-help, Rain



On 22/09/22 09:18, Florian Weimer wrote:
> * Adhemerval Zanella Netto:
> 
>> On 13/09/22 07:04, Florian Weimer wrote:
>>> * Adhemerval Zanella Netto:
>>>
>>>> On 22/08/22 15:21, Florian Weimer wrote:
>>>>> * Adhemerval Zanella Netto via Libc-help:
>>>>>
>>>>>> Right, my mistake.  I understood the issue better now, although I am
>>>>>> still puzzled why SIGTSTP is only being triggered on sigprocmask (since the
>>>>>> default action is still to stop the process).
>>>>>
>>>>> I think it's a maskable stop, not an unmaskable one, like SIGSTOP.
>>>>
>>>> Yeah, we do block the signal on parent (internal_signal_block_all). 
>>>>
>>>>>
>>>>> This looks a vfork-specific bug that can't happen with fork.  I don't
>>>>> see how to fix it in a generic fashion because we can't unblock SIGTSTP
>>>>> and launch the new process in an atomic fashion.
>>>>
>>>> We might ask for a new clone3 field to define the default signal mask on 
>>>> process start (and thus omit the final sigprocmask before execve).
>>>
>>> It might already be possible to fix this using io_uring.  Unfortunately, I
>>> didn't attend the LPC presentation.
>>
>> Is there anything that prevents to avoid using CLONE_VFORK? The code already
>> uses a allocated stack and do synchronizes with waitpid.
> 
> Assuming there is a way to create a thread which gets replaced by execve
> only (instead the whole process), this won't work because we have to
> block all signals for the new thread (it must not be visible to
> application code, and signal handlers must not run on it), and we can't
> unblock those signals prior to execve.  With vfork, we can unblock them
> after changing the signal handler disposition to SIG_DFL (preventing the
> handler execution), but per-thread signal handlers have been removed
> from Linux.  So even if we somehow could prevent the termination signal
> from being sent to the whole process (and not just the fake thread), we
> still have a gap.

But we already block all internal signals with internal_signal_block_all
prior to the clone call, and the clone call does not use CLONE_SIGHAND.
Also, independently of CLONE_SIGHAND, the calling process and the child still
have distinct signal masks.  Recall that for posix_spawn we do not use
CLONE_THREAD, so per-thread signal handlers do not apply here.

After doing some tests, the main problem is in fact how to synchronize
the deallocation of the stack, since without CLONE_VFORK there is no way
to tell, on a successful call, when execve has been called.

But I agree that even without CLONE_VFORK we still have a small window,
between the sigprocmask and the execve, in which the signal might act upon the
child.


* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-09-22 16:56                 ` Adhemerval Zanella Netto
@ 2022-09-22 17:38                   ` Florian Weimer
  2022-09-22 19:14                     ` Adhemerval Zanella Netto
  0 siblings, 1 reply; 20+ messages in thread
From: Florian Weimer @ 2022-09-22 17:38 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: Adhemerval Zanella Netto via Libc-help, Rain

* Adhemerval Zanella Netto:

> On 22/09/22 09:18, Florian Weimer wrote:
>>> Is there anything that prevents to avoid using CLONE_VFORK? The code already
>>> uses a allocated stack and do synchronizes with waitpid.
>> 
>> Assuming there is a way to create a thread which gets replaced by execve
>> only (instead the whole process), this won't work because we have to
>> block all signals for the new thread (it must not be visible to
>> application code, and signal handlers must not run on it), and we can't
>> unblock those signals prior to execve.  With vfork, we can unblock them
>> after changing the signal handler disposition to SIG_DFL (preventing the
>> handler execution), but per-thread signal handlers have been removed
>> from Linux.  So even if we somehow could prevent the termination signal
>> from being sent to the whole process (and not just the fake thread), we
>> still have a gap.
>
> But we already block all internal signals with internal_signal_block_all
> prior clone call and it does not use CLONE_SIGHAND on the clone call. 
> Also, independently of CLONE_SIGHAND, the calling process and child still 
> have distinct signal masks.  Recall for posix_spawn we do not use
> CLONE_THREAD, so per-thread signal handlers does not apply here.

This only works because we restore SIG_DFL before unblocking signals in
the new process.  And that depends on a separate set of signal handlers.

> Doing some tests, the main problem is in fact how to synchronize 
> the deallocation of the stack, since without CLONE_VFORK there is no way
> to advertise on a success call when execve has been called.
>
> But I agree that even without CLONE_VFORK we still have a small window,
> between the sigprocmask and execve, that the signal might act upon the
> child.

And that window shouldn't exist in the current implementation.

Thanks,
Florian



* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-09-22 17:38                   ` Florian Weimer
@ 2022-09-22 19:14                     ` Adhemerval Zanella Netto
  2022-10-10 13:45                       ` Florian Weimer
  0 siblings, 1 reply; 20+ messages in thread
From: Adhemerval Zanella Netto @ 2022-09-22 19:14 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Adhemerval Zanella Netto via Libc-help, Rain



On 22/09/22 14:38, Florian Weimer wrote:
> * Adhemerval Zanella Netto:
> 
>> On 22/09/22 09:18, Florian Weimer wrote:
>>>> Is there anything that prevents to avoid using CLONE_VFORK? The code already
>>>> uses a allocated stack and do synchronizes with waitpid.
>>>
>>> Assuming there is a way to create a thread which gets replaced by execve
>>> only (instead the whole process), this won't work because we have to
>>> block all signals for the new thread (it must not be visible to
>>> application code, and signal handlers must not run on it), and we can't
>>> unblock those signals prior to execve.  With vfork, we can unblock them
>>> after changing the signal handler disposition to SIG_DFL (preventing the
>>> handler execution), but per-thread signal handlers have been removed
>>> from Linux.  So even if we somehow could prevent the termination signal
>>> from being sent to the whole process (and not just the fake thread), we
>>> still have a gap.
>>
>> But we already block all internal signals with internal_signal_block_all
>> prior clone call and it does not use CLONE_SIGHAND on the clone call. 
>> Also, independently of CLONE_SIGHAND, the calling process and child still 
>> have distinct signal masks.  Recall for posix_spawn we do not use
>> CLONE_THREAD, so per-thread signal handlers does not apply here.
> 
> This only works because we restore SIG_DFL before unblocking signals in
> the new process.  And that depends on a separate set of signal handlers.
> 
>> Doing some tests, the main problem is in fact how to synchronize 
>> the deallocation of the stack, since without CLONE_VFORK there is no way
>> to advertise on a success call when execve has been called.
>>
>> But I agree that even without CLONE_VFORK we still have a small window,
>> between the sigprocmask and execve, that the signal might act upon the
>> child.
> 
> And that window shouldn't exist in the current implementation.

But that's the main issue described in the first message, isn't it? The child
unblocks signals by calling sigprocmask, SIGTSTP is delivered to the child,
but since clone hasn't returned due to CLONE_VFORK, the parent remains stuck in
clone until the child receives SIGCONT.

I think to actually fix it we need an execve/execveat where the signal mask
is set atomically, so SIGTSTP is sent to the spawned process instead of
the libc helper one.


* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-09-22 19:14                     ` Adhemerval Zanella Netto
@ 2022-10-10 13:45                       ` Florian Weimer
  2022-10-18 20:04                         ` Adhemerval Zanella Netto
  0 siblings, 1 reply; 20+ messages in thread
From: Florian Weimer @ 2022-10-10 13:45 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: Adhemerval Zanella Netto via Libc-help, Rain

* Adhemerval Zanella Netto:

> On 22/09/22 14:38, Florian Weimer wrote:
>> * Adhemerval Zanella Netto:
>> 
>>> On 22/09/22 09:18, Florian Weimer wrote:
>>>>> Is there anything that prevents to avoid using CLONE_VFORK? The code already
>>>>> uses a allocated stack and do synchronizes with waitpid.
>>>>
>>>> Assuming there is a way to create a thread which gets replaced by execve
>>>> only (instead the whole process), this won't work because we have to
>>>> block all signals for the new thread (it must not be visible to
>>>> application code, and signal handlers must not run on it), and we can't
>>>> unblock those signals prior to execve.  With vfork, we can unblock them
>>>> after changing the signal handler disposition to SIG_DFL (preventing the
>>>> handler execution), but per-thread signal handlers have been removed
>>>> from Linux.  So even if we somehow could prevent the termination signal
>>>> from being sent to the whole process (and not just the fake thread), we
>>>> still have a gap.
>>>
>>> But we already block all internal signals with internal_signal_block_all
>>> prior clone call and it does not use CLONE_SIGHAND on the clone call. 
>>> Also, independently of CLONE_SIGHAND, the calling process and child still 
>>> have distinct signal masks.  Recall for posix_spawn we do not use
>>> CLONE_THREAD, so per-thread signal handlers does not apply here.
>> 
>> This only works because we restore SIG_DFL before unblocking signals in
>> the new process.  And that depends on a separate set of signal handlers.
>> 
>>> Doing some tests, the main problem is in fact how to synchronize 
>>> the deallocation of the stack, since without CLONE_VFORK there is no way
>>> to advertise on a success call when execve has been called.
>>>
>>> But I agree that even without CLONE_VFORK we still have a small window,
>>> between the sigprocmask and execve, that the signal might act upon the
>>> child.
>> 
>> And that window shouldn't exist in the current implementation.
>
> But that's the main issue described in this first message, isn't? The child 
> unblocks signals by calling sigprocmask, SIGTSTP is delivered to the child,
> but since clone hasn't exited due CLONE_VFORK, it remains stuck in clone
> until child receives SIGCONT.

Yes, we do it this way to avoid a different bug, and trade it for
another.

> I think to actually fix it we need a execve/execveat where the signal mask
> is set atomically, so SIGTSTP is sent to the spawned process instead of
> the libc helper one.

Right, I don't see a way around that.

I don't think switching back to fork by default is really an option.
The impact on latency is much worse than with vfork.

Thanks,
Florian



* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-10-10 13:45                       ` Florian Weimer
@ 2022-10-18 20:04                         ` Adhemerval Zanella Netto
  2022-10-20 11:55                           ` Florian Weimer
  0 siblings, 1 reply; 20+ messages in thread
From: Adhemerval Zanella Netto @ 2022-10-18 20:04 UTC (permalink / raw)
  To: Florian Weimer, Christian Brauner
  Cc: Adhemerval Zanella Netto via Libc-help, Rain



On 10/10/22 10:45, Florian Weimer wrote:
> * Adhemerval Zanella Netto:
> 
>> On 22/09/22 14:38, Florian Weimer wrote:
>>> * Adhemerval Zanella Netto:
>>>
>>>> On 22/09/22 09:18, Florian Weimer wrote:
>>>>>> Is there anything that prevents to avoid using CLONE_VFORK? The code already
>>>>>> uses a allocated stack and do synchronizes with waitpid.
>>>>>
>>>>> Assuming there is a way to create a thread which gets replaced by execve
>>>>> only (instead the whole process), this won't work because we have to
>>>>> block all signals for the new thread (it must not be visible to
>>>>> application code, and signal handlers must not run on it), and we can't
>>>>> unblock those signals prior to execve.  With vfork, we can unblock them
>>>>> after changing the signal handler disposition to SIG_DFL (preventing the
>>>>> handler execution), but per-thread signal handlers have been removed
>>>>> from Linux.  So even if we somehow could prevent the termination signal
>>>>> from being sent to the whole process (and not just the fake thread), we
>>>>> still have a gap.
>>>>
>>>> But we already block all internal signals with internal_signal_block_all
>>>> prior clone call and it does not use CLONE_SIGHAND on the clone call. 
>>>> Also, independently of CLONE_SIGHAND, the calling process and child still 
>>>> have distinct signal masks.  Recall for posix_spawn we do not use
>>>> CLONE_THREAD, so per-thread signal handlers does not apply here.
>>>
>>> This only works because we restore SIG_DFL before unblocking signals in
>>> the new process.  And that depends on a separate set of signal handlers.
>>>
>>>> Doing some tests, the main problem is in fact how to synchronize 
>>>> the deallocation of the stack, since without CLONE_VFORK there is no way
>>>> to advertise on a success call when execve has been called.
>>>>
>>>> But I agree that even without CLONE_VFORK we still have a small window,
>>>> between the sigprocmask and execve, that the signal might act upon the
>>>> child.
>>>
>>> And that window shouldn't exist in the current implementation.
>>
>> But that's the main issue described in this first message, isn't? The child 
>> unblocks signals by calling sigprocmask, SIGTSTP is delivered to the child,
>> but since clone hasn't exited due CLONE_VFORK, it remains stuck in clone
>> until child receives SIGCONT.
> 
> Yes, we do it this way to avoid a different bug, and trade it for
> another.
> 
>> I think to actually fix it we need a execve/execveat where the signal mask
>> is set atomically, so SIGTSTP is sent to the spawned process instead of
>> the libc helper one.
> 
> Right, I don't see a way around that.
> 
> I don't think switching back to fork by default is really an option.
> The impact on latency is much worse than with vfork.

I agree, and I have been chatting with Christian about whether we can improve this
with some kernel support.  My idea would be to add a new clone3 argument to define
a signal mask, and another option (either through clone3 itself or with a new
execve variant) to set up the desired signal mask after the execve call.

The first feature is more of an optimization to avoid the sigprocmask (although
I think we will need it anyway to properly reap the child if spawni fails),
while the second feature should fix the issue raised in this thread.


* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-10-18 20:04                         ` Adhemerval Zanella Netto
@ 2022-10-20 11:55                           ` Florian Weimer
  2022-10-21  1:40                             ` Rain
  0 siblings, 1 reply; 20+ messages in thread
From: Florian Weimer @ 2022-10-20 11:55 UTC (permalink / raw)
  To: Adhemerval Zanella Netto
  Cc: Christian Brauner, Adhemerval Zanella Netto via Libc-help, Rain

* Adhemerval Zanella Netto:

>> I don't think switching back to fork by default is really an option.
>> The impact on latency is much worse than with vfork.
>
> I agree and I have been chatting with Christian if we can improve this
> with some kernel support.  My idea would to add a new clone3 argument
> to define a signal mask and another options (either through clone3
> itself or with a new execve variant) to setup the desired signal mask
> after execve call.
>
> The first features is more an optimization to avoid the sigprocmask
> (although I think we will need it anyway to proper reap the child if
> the spawni fails), while the second feature should fix the issue
> raised in this thread.

But I think it would only work for SIGTSTP, not for SIGSTOP, right?
But maybe SIGSTOP is sufficiently unusual that fixing SIGTSTP on its own
is already a welcome improvement.

Thanks,
Florian



* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-10-20 11:55                           ` Florian Weimer
@ 2022-10-21  1:40                             ` Rain
  2022-10-21 14:18                               ` Szabolcs Nagy
  0 siblings, 1 reply; 20+ messages in thread
From: Rain @ 2022-10-21  1:40 UTC (permalink / raw)
  To: Florian Weimer, Adhemerval Zanella Netto
  Cc: Christian Brauner, Adhemerval Zanella Netto via Libc-help

On Thu, Oct 20, 2022, at 04:55, Florian Weimer wrote:
> * Adhemerval Zanella Netto:
>
>>> I don't think switching back to fork by default is really an option.
>>> The impact on latency is much worse than with vfork.
>>
>> I agree and I have been chatting with Christian if we can improve this
>> with some kernel support.  My idea would to add a new clone3 argument
>> to define a signal mask and another options (either through clone3
>> itself or with a new execve variant) to setup the desired signal mask
>> after execve call.
>>
>> The first features is more an optimization to avoid the sigprocmask
>> (although I think we will need it anyway to proper reap the child if
>> the spawni fails), while the second feature should fix the issue
>> raised in this thread.
>
> But I think it would only work for SIGTSTP, not for SIGSTOP, right?
> But maybe SIGSTOP is sufficiently unusual that fixing SIGTSTP on its own
> is already a welcome improvement.

From my perspective, fixing SIGTSTP is enough.

However, I do care about older versions of the Linux kernel and glibc,
as well as other operating systems, so I'll probably have to maintain
the double-process-spawn workaround indefinitely, sadly.

Thanks,
Rain


* Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
  2022-10-21  1:40                             ` Rain
@ 2022-10-21 14:18                               ` Szabolcs Nagy
  0 siblings, 0 replies; 20+ messages in thread
From: Szabolcs Nagy @ 2022-10-21 14:18 UTC (permalink / raw)
  To: Rain
  Cc: Florian Weimer, Adhemerval Zanella Netto,
	Adhemerval Zanella Netto via Libc-help, Christian Brauner

The 10/20/2022 18:40, Rain wrote:
> On Thu, Oct 20, 2022, at 04:55, Florian Weimer wrote:
> > * Adhemerval Zanella Netto:
> >
> >>> I don't think switching back to fork by default is really an option.
> >>> The impact on latency is much worse than with vfork.
> >>
> >> I agree and I have been chatting with Christian if we can improve this
> >> with some kernel support.  My idea would to add a new clone3 argument
> >> to define a signal mask and another options (either through clone3
> >> itself or with a new execve variant) to setup the desired signal mask
> >> after execve call.
> >>
> >> The first features is more an optimization to avoid the sigprocmask
> >> (although I think we will need it anyway to proper reap the child if
> >> the spawni fails), while the second feature should fix the issue
> >> raised in this thread.
> >
> > But I think it would only work for SIGTSTP, not for SIGSTOP, right?
> > But maybe SIGSTOP is sufficiently unusual that fixing SIGTSTP on its own
> > is already a welcome improvement.
> 
> From my perspective, fixing SIGTSTP is enough.
> 
> However, I do care about older versions of the Linux kernel and glibc,
> as well as other operating systems, so I'll probably have to maintain
> the double-process-spawn workaround indefinitely, sadly.

either way i think you should open a ticket on bugzilla about this
(so you and others can follow related discussions)


Thread overview: 20+ messages
2022-08-14  3:30 posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough Rain
2022-08-14  3:38 ` Rain
2022-08-22 16:51 ` Adhemerval Zanella Netto
2022-08-22 17:00   ` Rain
2022-08-22 17:48     ` Adhemerval Zanella Netto
2022-08-22 18:21       ` Florian Weimer
2022-08-22 18:32         ` Adhemerval Zanella Netto
2022-08-22 22:28           ` Adhemerval Zanella Netto
2022-09-13 10:04           ` Florian Weimer
2022-09-21 15:24             ` Adhemerval Zanella Netto
2022-09-22 12:18               ` Florian Weimer
2022-09-22 16:56                 ` Adhemerval Zanella Netto
2022-09-22 17:38                   ` Florian Weimer
2022-09-22 19:14                     ` Adhemerval Zanella Netto
2022-10-10 13:45                       ` Florian Weimer
2022-10-18 20:04                         ` Adhemerval Zanella Netto
2022-10-20 11:55                           ` Florian Weimer
2022-10-21  1:40                             ` Rain
2022-10-21 14:18                               ` Szabolcs Nagy
2022-08-22 22:30       ` Adhemerval Zanella Netto
