From: Chet Ramey
Reply-To: chet.ramey@case.edu
To: Carl Edquist
Cc: chet.ramey@case.edu, Zachary Santer, bug-bash, libc-alpha@sourceware.org
Subject: Re: Examples of concurrent coproc usage?
Date: Tue, 16 Apr 2024 11:48:58 -0400
Message-ID: <4625270d-c8f6-42d1-afa0-fafb7a33571e@case.edu>
In-Reply-To: <6bcbd956-7296-7150-765f-63318a425d1b@cs.wisc.edu>
References: <9831afe6-958a-fbd3-9434-05dd0c9b602a@draigBrady.com>
 <317fe0e2-8cf9-d4ac-ed56-e6ebcc2baa55@cs.wisc.edu>
 <8c490a55-598a-adf6-67c2-eb2a6099620a@cs.wisc.edu>
 <88a67f36-2a56-a838-f763-f55b3073bb50@lando.namek.net>
 <2791ad90-a871-474d-89dd-bc6b20cdd1f2@case.edu>
 <86c3765e-e29d-48d5-b468-3f20b59916b2@case.edu>
 <6bcbd956-7296-7150-765f-63318a425d1b@cs.wisc.edu>

On 4/12/24 12:49 PM, Carl Edquist wrote:

> Where with a coproc
>
>     coproc X { potentially short lived command with output; }
>     exec {xr}<&${X[0]} {xw}>&${X[1]}
>
> there is technically the possibility that the coproc can finish and be
> reaped before the exec command gets a chance to run and duplicate the fds.
>
> But, I also get what you said, that your design intent with coprocs was for
> them to be longer-lived, so immediate termination was not a concern.

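One way a script could close the window described above is a handshake: the coproc blocks until the shell has duplicated its descriptors. This is a minimal sketch, assuming bash 4.1 or later for the `{var}` redirection form. The `go` token and the names `xr`, `xw`, and `line` are illustrative choices, not anything prescribed in this thread.

```bash
#!/usr/bin/env bash
# The coproc waits for one line on stdin before producing output, so it
# cannot exit (and be reaped) before the fds below have been duplicated.
coproc X { read -r _go; echo "final output"; }

# Duplicate the coproc's pipe fds into shell-managed variables.
exec {xr}<&"${X[0]}" {xw}>&"${X[1]}"

# Release the coproc, then close our copy of the write end.
echo go >&"$xw"
exec {xw}>&-

# The duplicated read end stays open regardless of when the coproc is reaped.
read -r line <&"$xr"
exec {xr}<&-

echo "$line"
```

The cost is that the coproc's command has to cooperate by reading the token first, so this only applies when the script controls what runs inside the coproc.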
The bigger concern was how to synchronize between the processes, but that's
something that the script writer has to do on their own.

>>> Personally I like the idea of 'closing' a coproc explicitly, but if it's
>>> a bother to add options to the coproc keyword, then I would say just let
>>> the user be responsible for closing the fds.  Once the coproc has
>>> terminated _and_ the coproc's fds are closed, then the coproc can be
>>> deallocated.
>>
>> This is not backwards compatible. coprocs may be a little-used feature,
>> but you're adding a burden on the shell programmer that wasn't there
>> previously.
>
> Ok, so, I'm trying to imagine a case where this would cause any problems or
> extra work for such an existing user.  Maybe you can provide an example
> from your own uses?  (Where it would cause trouble or require adding code
> if the coproc deallocation were deferred until the fds are closed explicitly.)

My concern was always coproc fds leaking into other processes, especially
pipelines. If someone has a coproc now and is `messy' about cleaning it up,
I feel like there's the possibility of deadlock. But I don't know how
extensively they're used, or all the use cases, so I'm not sure how likely
it is. I've learned there are users who do things with shell features I
never imagined. (People wanting to use coprocs without the shell as the
arbiter, for instance. :-) )

> My first thought is that in the general case, the user doesn't really need
> to worry much about closing the fds for a terminated coproc anyway, as they
> will all be closed implicitly when the shell exits (either an interactive
> session or a script).

Yes.

>
> [This is a common model for using coprocs, by the way, where an auxiliary
> coprocess is left open for the lifetime of the shell session and never
> explicitly closed.  When the shell session exits, the fds are closed
> implicitly by the OS, and the coprocess sees EOF and exits on its own.]

That's one common model, yes.
Another is that the shell process explicitly sends a close or shutdown
command to the coproc, so termination is expected.

> If a user expects the coproc variable to go away automatically, that user
> won't be accessing a still-open fd from that variable for anything.

I'm more concerned about a pipe with unread data that would potentially
cause problems. I suppose we just need more testing.

> As for the forgotten-about half-closed pipe fds to the reaped coproc, I
> don't see how they could lead to deadlock, nor do I see how a shell
> programmer expecting the existing behavior would even attempt to access
> them at all, apart from programming error.

Probably not.

>
> The only potential issue I can imagine is if a script (or a user at an
> interactive prompt) would start _so_ many of these longer-lived coprocs
> (more than 500??), one at a time in succession, in a single shell session,
> that all the available fds would be exhausted.  (That is, if the shell is
> not closing them automatically upon coproc termination.)  Is that the
> backwards compatibility concern?

That's more of a "my arm hurts when I do this" situation. If a script
opened 500 fds using exec redirection, resource exhaustion would be their
own responsibility.

> Meanwhile, the bash man page does not specify the shell's behavior for when
> a coproc terminates, so you might say there's room for interpretation and
> the new deferring behavior would not break any promises.

I could always enable it in the devel branch and see what happens with the
folks who use that. It would be three years after any release when distros
would put it into production anyway.

>
> And as it strikes me anyway, the real "burden" on the programmer with the
> existing behavior is having to make a copy of the coproc fds every time
>
>     coproc X { cmd; }
>     exec {xr}<&${X[0]} {xw}>&${X[1]}
>
> and use the copies instead of the originals in order to reliably read the
> final output from the coproc.

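The copy step quoted above could be packaged once in a helper. This is a sketch only: `coproc_dup` is a hypothetical name, not a bash builtin, and it assumes bash 4.3 or later for namerefs. It does not remove the race with a short-lived coproc. If the coproc's array is already gone, the redirection fails and the function returns non-zero.

```bash
#!/usr/bin/env bash
# coproc_dup NAME RVAR WVAR  (hypothetical helper)
# Duplicate the fds of coproc NAME and store the copies in RVAR and WVAR.
coproc_dup() {
    local -n _cp_arr=$1          # nameref to the coproc's fd array
    local _cp_r _cp_w
    exec {_cp_r}<&"${_cp_arr[0]}" {_cp_w}>&"${_cp_arr[1]}" || return
    printf -v "$2" '%d' "$_cp_r"
    printf -v "$3" '%d' "$_cp_w"
}

# Usage: a line-echoing coproc that stays alive while its input is open.
coproc ECHOER { while read -r l; do printf 'echo: %s\n' "$l"; done; }
coproc_dup ECHOER er ew

printf 'hello\n' >&"$ew"
read -r reply <&"$er"
echo "$reply"
```

The copies in `er` and `ew` are ordinary exec-opened descriptors, so the script is responsible for closing them.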
Maybe, though it's easy enough to wrap that in a shell function.

>>> First, just to be clear, the fds to/from the coproc pipes are not
>>> invalid when the coproc terminates (you can still read from them); they
>>> are only invalid after they are closed.
>>
>> That's only sort of true; writing to a pipe for which there is no reader
>> generates SIGPIPE, which is a fatal signal.
>
> Eh, when I talk about an fd being "invalid" here I mean "fd is not a valid
> file descriptor" (to use the language for EBADF from the man page for
> various system calls like read(2), write(2), close(2)).  That's why I say
> the fds only become invalid after they are closed.
>
> And of course the primary use I care about is reading the final output from
> a completed coproc.  (Which is generally after explicitly closing the write
> end.)  The shell's read fd is still open, and can be read - it'll either
> return data, or return EOF, but that's not an error and not invalid.
>
> But since you mention it, writing to a broken pipe is still semantically
> meaningful also.  (I would even say valid.)  In the typical case it's
> expected behavior for a process to get killed when it attempts this and
> shell pipeline programming is designed with this in mind.

You'd be surprised at how often I get requests to put in an internal SIGPIPE
handler to avoid problems/shell termination with builtins writing to closed
pipes.

> So even for write attempts, you introduce uncertain behavior by
> automatically closing the fds, when the normal, predictable, valid thing
> would be to die by SIGPIPE.

Again, you might be surprised at how many people view that as a bug in the
shell.

>> If the coproc terminates, the file descriptor to write to it becomes
>> invalid because it's implicitly closed.
>
> Yes, but the distinction I was making is that they do not become invalid
> when or because the coproc terminates, they become invalid when and because
> the shell closes them.
> (I'm saying that if the shell did not close them
> automatically, they would remain valid.)
>
>
>>> The surprising bit is when they become invalid unexpectedly (from the
>>> point of view of the user) because the shell closes them
>>> automatically, at the somewhat arbitrary timing when the coproc is
>>> reaped.
>>
>> No real difference from procsubs.
>
> I think I disagree?  The difference is that the replacement string for a
> procsub (/dev/fd/N or a fifo path) remains valid for the command in
> question.  (Right?)

Using your definition of valid, I believe so, yes. Avoiding SIGPIPE depends
on how the OS handles opens on /dev/fd/N: an internal dup or a handle to the
same fd. In the latter case, I think the file descriptor obtained when
opening /dev/fd/N would become `invalid' at the same time the process
terminates. I think we're talking about our different interpretations of
`invalid' (EBADF as opposed to EPIPE/SIGPIPE).

> So the command in question can count on that path
> being valid.  And if a procsub is used in an exec redirection, in order to
> extend its use for future commands (and the redirection is guaranteed to
> work, since it is guaranteed to be valid for that exec command), then the
> newly opened pipe fd will not be subject to automatic closing either.

Correct.

>
> As far as I can tell there is no arbitrary timing for when the shell closes
> the fds for procsubs.  As far as I can tell, it closes them when the
> command in question completes, and that's the end of the story. (There's no
> waiting for the timing of the background procsub process to complete.)

Right. There are reasonably well-defined rules for when redirections
associated with commands are disposed, and exec redirections to procsubs
just follow from those. The shell closes file descriptors (and potentially
unlinks the FIFO) when it reaps the process substitution, but it takes some
care not to do that prematurely, and the user isn't using those fds.

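The exec-redirection case agreed on above can be sketched as follows. It assumes a system where bash implements process substitution through /dev/fd or a FIFO. The names `gen`, `item`, and `lines` are illustrative.

```bash
#!/usr/bin/env bash
# Open the procsub's output on a shell-allocated fd. The /dev/fd/N (or FIFO)
# path only has to be valid for this one exec command.
exec {gen}< <(printf '%s\n' one two three)

# The new fd outlives both the exec command and the procsub process,
# so later commands can still read from it.
lines=()
while read -r item <&"$gen"; do
    lines+=("$item")
done

exec {gen}<&-   # closing it is the script's job, like any exec-opened fd
printf '%s\n' "${lines[@]}"
```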
>
>
>>> Second, why is it a problem if the variables keep their (invalid) fds
>>> after closing them, if the user is the one that closed them anyway?
>>>
>>> Isn't this how it works with the auto-assigned fd redirections?
>>
>> Those are different file descriptors.
>>
>>>
>>>       $ exec {d}<.
>>>       $ echo $d
>>>       10
>>>       $ exec {d}<&-
>>>       $ echo $d
>>>       10
>>
>> The shell doesn't try to manage that object in the same way it does a
>> coproc. The user has explicitly indicated they want to manage it.
>
> Ok - your intention makes sense then.  My reasoning was that auto-allocated
> redirection fds ( {x}>file or {x}>&$N ) are a way of asking the shell to
> automatically place fds in a variable for you to manage - and I imagined
> 'coproc X {...}' the same way.

The philosophy is the same as if you picked the file descriptor number
yourself and assigned it to the variable -- the shell just does some of the
bookkeeping for you so you don't have to worry about the file descriptor
resource limit. You still have to manage file descriptor $x the same way you
would if you had picked file descriptor 15 (for example).

>> But there is a window there where a short-lived coprocess could be reaped
>> before you dup the file descriptors. Since the original intent of the
>> feature was that coprocs were a way to communicate with long-lived
>> processes -- something more persistent than a process substitution -- it
>> was not really a concern at the time.
>
> Makes sense.  For me, working with coprocesses is largely a more flexible
> way of setting up interesting pipelines - which is where the shell excels.
>
> Once a 'pipework' is set up (I'm making up this word now to distinguish
> from a simple pipeline), the shell does not have to be in the middle
> shoveling data around - the external commands can do that on their own.

My original intention for the coprocs (and Korn's from whence they came) was
that the shell would be in the middle -- it's another way for the shell to
do IPC.

Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/