public inbox for libc-help@sourceware.org
From: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
To: Florian Weimer <fweimer@redhat.com>,
	Christian Brauner <brauner@kernel.org>
Cc: Adhemerval Zanella Netto via Libc-help <libc-help@sourceware.org>,
	Rain <glibc@sunshowers.io>
Subject: Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
Date: Tue, 18 Oct 2022 17:04:29 -0300
Message-ID: <47b750c6-f05c-2538-114d-3799628ebf56@linaro.org>
In-Reply-To: <87pmezztah.fsf@oldenburg.str.redhat.com>



On 10/10/22 10:45, Florian Weimer wrote:
> * Adhemerval Zanella Netto:
> 
>> On 22/09/22 14:38, Florian Weimer wrote:
>>> * Adhemerval Zanella Netto:
>>>
>>>> On 22/09/22 09:18, Florian Weimer wrote:
>>>>>> Is there anything that prevents us from avoiding CLONE_VFORK? The code already
>>>>>> uses an allocated stack and synchronizes with waitpid.
>>>>>
>>>>> Assuming there is a way to create a thread which gets replaced by execve
>>>>> only (instead of the whole process), this won't work because we have to
>>>>> block all signals for the new thread (it must not be visible to
>>>>> application code, and signal handlers must not run on it), and we can't
>>>>> unblock those signals prior to execve.  With vfork, we can unblock them
>>>>> after changing the signal handler disposition to SIG_DFL (preventing the
>>>>> handler execution), but per-thread signal handlers have been removed
>>>>> from Linux.  So even if we somehow could prevent the termination signal
>>>>> from being sent to the whole process (and not just the fake thread), we
>>>>> still have a gap.
>>>>
>>>> But we already block all internal signals with internal_signal_block_all
>>>> prior to the clone call, and the clone call does not use CLONE_SIGHAND.
>>>> Also, independently of CLONE_SIGHAND, the calling process and child still
>>>> have distinct signal masks.  Recall that for posix_spawn we do not use
>>>> CLONE_THREAD, so per-thread signal handlers do not apply here.
>>>
>>> This only works because we restore SIG_DFL before unblocking signals in
>>> the new process.  And that depends on a separate set of signal handlers.
>>>
>>>> Doing some tests, the main problem is in fact how to synchronize
>>>> the deallocation of the stack, since without CLONE_VFORK there is no way
>>>> to tell, on a successful call, when execve has been called.
>>>>
>>>> But I agree that even without CLONE_VFORK we still have a small window,
>>>> between the sigprocmask and execve, in which the signal might act upon
>>>> the child.
>>>
>>> And that window shouldn't exist in the current implementation.
>>
>> But that's the main issue described in the first message, isn't it? The child
>> unblocks signals by calling sigprocmask, SIGTSTP is delivered to the child,
>> and since clone has not returned in the parent due to CLONE_VFORK, the parent
>> remains stuck in clone until the child receives SIGCONT.
> 
> Yes, we do it this way to avoid a different bug, and trade it for
> another.
> 
>> I think to actually fix it we need an execve/execveat variant where the signal
>> mask is set atomically, so SIGTSTP is sent to the spawned process instead of
>> the libc helper one.
> 
> Right, I don't see a way around that.
> 
> I don't think switching back to fork by default is really an option.
> The impact on latency is much worse than with vfork.

I agree, and I have been chatting with Christian about whether we can improve
this with some kernel support.  My idea would be to add a new clone3 argument
that defines a signal mask, plus another option (either through clone3 itself
or with a new execve variant) to set up the desired signal mask after the
execve call.

The first feature is more of an optimization to avoid the sigprocmask call
(although I think we will need it anyway to properly reap the child if spawni
fails), while the second feature should fix the issue raised in this thread.
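
For readers following along, here is a simplified sketch of the sequence
discussed above, so the window is easier to see.  It is not glibc's spawni
code: the names spawn_sketch, child_fn and struct spawn_args are made up for
illustration, the stack size is an arbitrary assumption, and it uses the plain
libc wrappers where the real implementation is more careful.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical state shared by parent and helper (they share memory
   because of CLONE_VM). */
struct spawn_args {
    const char *path;
    char *const *argv;
    char *const *envp;
    sigset_t oldmask;   /* caller's mask, restored in the helper */
    int err;            /* set by the helper if execve fails */
};

static int child_fn(void *p)
{
    struct spawn_args *a = p;
    struct sigaction dfl = { .sa_handler = SIG_DFL };
    sigemptyset(&dfl.sa_mask);

    /* Without CLONE_SIGHAND the helper has its own copy of the handler
       table, so resetting caught signals here does not affect the parent. */
    for (int sig = 1; sig < NSIG; sig++) {
        struct sigaction cur;
        if (sigaction(sig, NULL, &cur) == 0
            && cur.sa_handler != SIG_DFL && cur.sa_handler != SIG_IGN)
            sigaction(sig, &dfl, NULL);
    }

    /* The window: once the mask is restored, SIGTSTP can stop the helper
       before execve, while the parent is still suspended inside clone
       because of CLONE_VFORK. */
    sigprocmask(SIG_SETMASK, &a->oldmask, NULL);
    execve(a->path, a->argv, a->envp);

    a->err = errno;     /* only reached if execve failed */
    _exit(127);
}

/* Returns 0 and stores the child pid on success, an errno value otherwise. */
int spawn_sketch(pid_t *pid, const char *path,
                 char *const argv[], char *const envp[])
{
    enum { STACK_SIZE = 64 * 1024 };    /* assumption: enough for this sketch */
    struct spawn_args a = { .path = path, .argv = argv, .envp = envp, .err = 0 };
    sigset_t all;

    void *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED)
        return errno;

    /* Block signals so no application handler runs on the helper. */
    sigfillset(&all);
    sigprocmask(SIG_BLOCK, &all, &a.oldmask);

    /* CLONE_VFORK suspends the caller until the helper execs or exits,
       which is what makes freeing the stack afterwards safe. */
    pid_t child = clone(child_fn, (char *)stack + STACK_SIZE,
                        CLONE_VM | CLONE_VFORK | SIGCHLD, &a);
    int err = child < 0 ? errno : a.err;
    if (child > 0 && err != 0)
        waitpid(child, NULL, 0);        /* reap the helper whose execve failed */

    sigprocmask(SIG_SETMASK, &a.oldmask, NULL);
    munmap(stack, STACK_SIZE);
    if (err == 0)
        *pid = child;
    return err;
}
```

With an execve variant that installs the mask atomically, the sigprocmask call
in child_fn would go away and the stop signal would reach the spawned program
rather than the helper.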

