From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <47b750c6-f05c-2538-114d-3799628ebf56@linaro.org>
Date: Tue, 18 Oct 2022 17:04:29 -0300
Subject: Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
To: Florian Weimer, Christian Brauner
Cc: Adhemerval Zanella Netto via Libc-help, Rain
From: Adhemerval Zanella Netto
Organization: Linaro
In-Reply-To: <87pmezztah.fsf@oldenburg.str.redhat.com>
References: <2921668c-773e-465d-9480-0abb6f979bf9@www.fastmail.com>
 <7727e4de-a8da-1e6b-4d7c-68a132750996@linaro.org>
 <64917a2f-788b-4695-b799-63bbb8a4873f@www.fastmail.com>
 <87tu64w33v.fsf@oldenburg.str.redhat.com>
 <7c356365-34db-cc00-bb92-0e55e7a89118@linaro.org>
 <877d27vbdx.fsf@oldenburg.str.redhat.com>
 <5bcba9d3-7bdd-1855-afb7-1f9d63014842@linaro.org>
 <87leqbmwkl.fsf@oldenburg.str.redhat.com>
 <87leqb1f9j.fsf@oldenburg.str.redhat.com>
 <88e5f61f-253d-5e2a-a0bd-39beff55c82c@linaro.org>
 <87pmezztah.fsf@oldenburg.str.redhat.com>

On 10/10/22 10:45, Florian Weimer wrote:
> * Adhemerval Zanella Netto:
>
>> On 22/09/22 14:38, Florian Weimer wrote:
>>> * Adhemerval Zanella Netto:
>>>
>>>> On 22/09/22 09:18, Florian Weimer wrote:
>>>>>> Is there anything that prevents to avoid using CLONE_VFORK? The code already
>>>>>> uses a allocated stack and do synchronizes with waitpid.
>>>>>
>>>>> Assuming there is a way to create a thread which gets replaced by execve
>>>>> only (instead the whole process), this won't work because we have to
>>>>> block all signals for the new thread (it must not be visible to
>>>>> application code, and signal handlers must not run on it), and we can't
>>>>> unblock those signals prior to execve. With vfork, we can unblock them
>>>>> after changing the signal handler disposition to SIG_DFL (preventing the
>>>>> handler execution), but per-thread signal handlers have been removed
>>>>> from Linux. So even if we somehow could prevent the termination signal
>>>>> from beign sent to the whole process (and not just the fake thread), we
>>>>> still have a gap.
>>>>
>>>> But we already block all internal signals with internal_signal_block_all
>>>> prior clone call and it does not use CLONE_SIGHAND on the clone call.
>>>> Also, independently of CLONE_SIGHAND, the calling process and child still
>>>> have distinct signal masks. Recall for posix_spawn we do not use
>>>> CLONE_THREAD, so per-thread signal handlers does not apply here.
>>>
>>> This only works because we restore SIG_DFL before unblocking signals in
>>> the new process. And that depends on a separate set of signal handlers.
>>>
>>>> Doing some tests, the main problem is in fact how to synchronize
>>>> the deallocation of the stack, since without CLONE_VFORK there is no way
>>>> to advertise on a success call when execve has been called.
>>>>
>>>> But I agree that even without CLONE_VFORK we still have a small window,
>>>> between the sigprocmask and execve, that the signal might act upon the
>>>> child.
>>>
>>> And that window shouldn't exist in the current implementation.
>>
>> But that's the main issue described in this first message, isn't? The child
>> unblocks signals by calling sigprocmask, SIGTSTP is delivered to the child,
>> but since clone hasn't exited due CLONE_VFORK, it remains stuck in clone
>> until child receives SIGCONT.
>
> Yes, we do it this way to avoid a different bug, and trade it for
> another.
>
>> I think to actually fix it we need a execve/execveat where the signal mask
>> is set atomically, so SIGTSTP is sent to the spawned process instead of
>> the libc helper one.
>
> Right, I don't see a way around that.
>
> I don't think switching back to fork by default is really an option.
> The impact on latency is much worse than with vfork.

I agree, and I have been chatting with Christian about whether we can
improve this with some kernel support. My idea would be to add a new
clone3 argument to define a signal mask, and another option (either
through clone3 itself or with a new execve variant) to set up the desired
signal mask after the execve call. The first feature is more of an
optimization to avoid the sigprocmask (although I think we will need it
anyway to properly reap the child if the spawni fails), while the second
feature should fix the issue raised in this thread.