From: Adhemerval Zanella Netto
To: Rain, libc-help@sourceware.org
Subject: Re: posix_spawn: parent can get stuck in uninterruptible sleep if child receives SIGTSTP early enough
Date: Mon, 22 Aug 2022 19:30:06 -0300
Message-ID: <2afd2b84-e0de-3495-86e4-c8618c6dd3c1@linaro.org>
Organization: Linaro

On 22/08/22 14:48, Adhemerval Zanella Netto wrote:
>>>
>>>> 7. However, the clone hasn't exited in the parent and so it remains stuck in the clone3 syscall until the child receives a SIGCONT.
>>>>
>>>> I'm not sure what a reasonable way to handle this would be on the part of my CLI tool. The tool currently just gets stuck in uninterruptible sleep, resulting in a bad user experience.
>>>
>>> Reading through both your twitter discussion and the bug report against your
>>> tool [1] I think it is outside the posix_spawn specification how to handle
>>> SIGSTOP for the helper process itself in the tiny window between process
>>> creation and the setpgid.
>>>
>>>>
>>>> Here are solutions I've thought about that don't seem to work (please correct me if I'm wrong!)
>>>> 1. Setting the signal mask to include SIGTSTP. I do want to be able to send the child SIGTSTP after the clone(), and in my case the child is a third-party process so I can't depend on it to reset the signal mask.
>>>> 2. Spawning a stub process that execves the real child. It seems like the same issue exists when the main process calls the stub process, if I'm understanding the code correctly, so this won't help.
>>>>
>>>> ... though now as I'm writing this email out, maybe one solution is:
>>>>
>>>> * my tool spawns a stub process with SIGTSTP masked.
>>>> * the subprocess unmasks SIGTSTP (so it could receive the SIGTSTP here, but at least it won't block the parent process), then execves the third-party process.
>>>>
>>>> Is that the solution you would recommend?
>>>
>>> I am not sure this would work, since SIGSTOP cannot be caught, blocked, or
>>> ignored. What I think might work is to spawn a stub process and make
>>> it a new session leader with setsid so it will not have a controlling
>>> terminal. The stub process will be responsible for spawning new processes,
>>> so any interaction with the controlling terminal (the CTRL+Z) won't affect
>>> the posix_spawn helper thread.
>>
>> That is definitely an interesting solution. However, is it necessary given that
>> Ctrl+Z is actually SIGTSTP, which can be blocked?
>>
>> Thanks again.
>
> I think one possibility would be to set the default signal actions to SIG_IGN, similar
> to what POSIX_SPAWN_SETSIGDEF does for SIG_DFL (Solaris has POSIX_SPAWN_SETSIGIGN_NP
> as an extension).
> It won't help much if the signal is received in the tiny window
> between the helper process start and the sigaction call, so I am afraid this will only
> decrease the possibility of the deadlock, not eliminate it.

Sorry, I got it backwards: I forgot that the glibc implementation blocks all
signals before spawning the helper process. In fact, using SIG_IGN as the default
signal disposition would require the spawned process to restore it to SIG_DFL
itself; otherwise the process could never be stopped by that signal.

Which goes back to your original suggestion:

> * my tool spawns a stub process with SIGTSTP masked.
> * the subprocess unmasks SIGTSTP (so it could receive the SIGTSTP here, but at least it won't block the parent process), then execves the third-party process.

I agree this is a way to handle it, since we cannot atomically unblock SIGTSTP
and call execve.

I am not sure whether this really qualifies as a posix_spawn bug in glibc, since
to really fix it we would need to either go back to using fork or use something
like a helper process.
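
For reference, the two halves of the stub approach could look roughly like the sketch below. This is only an illustration under stated assumptions, not tested against your tool: the function names spawn_with_sigtstp_blocked and unblock_sigtstp_and_exec are hypothetical (they are not glibc API), and it assumes the stub is a small separate executable that receives the real command line as its arguments.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <spawn.h>
#include <unistd.h>

extern char **environ;

/* Parent side (hypothetical helper): spawn `path` with SIGTSTP blocked in
   the new process, so a Ctrl+Z arriving right after process creation stays
   pending instead of stopping the child. Returns 0 or an errno value. */
int spawn_with_sigtstp_blocked(pid_t *pid, const char *path, char *const argv[])
{
    assert(pid != NULL && path != NULL && argv != NULL);

    sigset_t mask;
    posix_spawnattr_t attr;
    int err;

    /* Start from the caller's current signal mask and add SIGTSTP to it. */
    if (sigprocmask(SIG_BLOCK, NULL, &mask) != 0)
        return errno;
    sigaddset(&mask, SIGTSTP);

    err = posix_spawnattr_init(&attr);
    if (err != 0)
        return err;

    /* Ask posix_spawn to install `mask` as the child's signal mask. */
    err = posix_spawnattr_setsigmask(&attr, &mask);
    if (err == 0)
        err = posix_spawnattr_setflags(&attr, POSIX_SPAWN_SETSIGMASK);
    if (err == 0)
        err = posix_spawn(pid, path, NULL, &attr, argv, environ);

    posix_spawnattr_destroy(&attr);
    return err;
}

/* Stub side (hypothetical helper): unblock SIGTSTP, then replace the stub
   with the real program. A pending SIGTSTP is delivered here and stops only
   the stub; the stub is already running after its own execve, so the
   parent's posix_spawn call has returned by now. Returns an errno value,
   and only returns at all if something failed. */
int unblock_sigtstp_and_exec(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    sigset_t unblock;
    sigemptyset(&unblock);
    sigaddset(&unblock, SIGTSTP);
    if (sigprocmask(SIG_UNBLOCK, &unblock, NULL) != 0)
        return errno;

    execvp(argv[0], argv);
    return errno; /* reached only if execvp failed */
}
```

The stub's main would then be little more than a call to unblock_sigtstp_and_exec(argv + 1), and the tool would call spawn_with_sigtstp_blocked with the stub's path and an argument vector whose first entry names the stub and whose remaining entries are the third-party command. The window between the unblock and the execvp still exists, but a stop arriving there only affects the stub, not the parent.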