* RFE: enable buffering on null-terminated data
       [not found] ` <9831afe6-958a-fbd3-9434-05dd0c9b602a@draigBrady.com>
@ 2024-03-10 15:29   ` Zachary Santer
  2024-03-10 20:36     ` Carl Edquist
  0 siblings, 1 reply; 53+ messages in thread
From: Zachary Santer @ 2024-03-10 15:29 UTC (permalink / raw)
  To: libc-alpha; +Cc: coreutils, p

Was "stdbuf feature request - line buffering but for null-terminated data"

See below.

On Sun, Mar 10, 2024 at 5:38 AM Pádraig Brady <P@draigbrady.com> wrote:
>
> On 09/03/2024 16:30, Zachary Santer wrote:
> > 'stdbuf --output=L' will line-buffer the command's output stream.
> > Pretty useful, but that's looking for newlines. Filenames should be
> > passed between utilities in a null-terminated fashion, because the
> > null byte is the only byte that can't appear within one.
> >
> > If I want to buffer output data on null bytes, the closest I can get
> > is 'stdbuf --output=0', which doesn't buffer at all. This is pretty
> > inefficient.
> >
> > 0 means unbuffered, and Z is already taken for, I guess, zebibytes.
> > --output=N, then?
> >
> > Would this require a change to libc implementations, or is it possible now?
>
> This does seem like useful functionality,
> but it would require support for libc implementations first.
>
> cheers,
> Pádraig

^ permalink raw reply	[flat|nested] 53+ messages in thread
* Re: RFE: enable buffering on null-terminated data 2024-03-10 15:29 ` RFE: enable buffering on null-terminated data Zachary Santer @ 2024-03-10 20:36 ` Carl Edquist 2024-03-11 3:48 ` Zachary Santer 0 siblings, 1 reply; 53+ messages in thread From: Carl Edquist @ 2024-03-10 20:36 UTC (permalink / raw) To: Zachary Santer; +Cc: libc-alpha, coreutils, p [-- Attachment #1: Type: text/plain, Size: 2952 bytes --] Hi Zack, This sounds like a potentially useful feature (it'd probably belong with a corresponding new buffer mode in setbuf(3)) ... > Filenames should be passed between utilities in a null-terminated > fashion, because the null byte is the only byte that can't appear within > one. Out of curiosity, do you have an example command line for your use case? > If I want to buffer output data on null bytes, the closest I can get is > 'stdbuf --output=0', which doesn't buffer at all. This is pretty > inefficient. I'm just thinking that find(1), for instance, will end up calling write(2) exactly once per filename (-print or -print0) if run under stdbuf unbuffered, which is the same as you'd get with a corresponding stdbuf line-buffered mode (newline or null-terminated). It seems that where line buffering improves performance over unbuffered is when there are several calls to (for example) printf(3) in constructing a single line. find(1), and some filters like grep(1), will write a line at a time in unbuffered mode, and thus don't seem to benefit at all from line buffering. On the other hand, cut(1) appears to putchar(3) a byte at a time, which in unbuffered mode will (like you say) be pretty inefficient. So, depending on your use case, a new null-terminated line buffered option may or may not actually improve efficiency over unbuffered mode. You can run your commands under strace like stdbuf --output=X strace -c -ewrite command ... | ... to count the number of actual writes for each buffering mode. 
Carl PS, "find -printf" recognizes a '\c' escape to flush the output, in case that helps. So "find -printf '%p\0\c'" would, for instance, already behave the same as "stdbuf --output=N find -print0" with the new stdbuf output mode you're suggesting. (Though again, this doesn't actually seem to be any more efficient than running "stdbuf --output=0 find -print0") On Sun, 10 Mar 2024, Zachary Santer wrote: > Was "stdbuf feature request - line buffering but for null-terminated data" > > See below. > > On Sun, Mar 10, 2024 at 5:38 AM Pádraig Brady <P@draigbrady.com> wrote: >> >> On 09/03/2024 16:30, Zachary Santer wrote: >>> 'stdbuf --output=L' will line-buffer the command's output stream. >>> Pretty useful, but that's looking for newlines. Filenames should be >>> passed between utilities in a null-terminated fashion, because the >>> null byte is the only byte that can't appear within one. >>> >>> If I want to buffer output data on null bytes, the closest I can get >>> is 'stdbuf --output=0', which doesn't buffer at all. This is pretty >>> inefficient. >>> >>> 0 means unbuffered, and Z is already taken for, I guess, zebibytes. >>> --output=N, then? >>> >>> Would this require a change to libc implementations, or is it possible now? >> >> This does seem like useful functionality, >> but it would require support for libc implementations first. >> >> cheers, >> Pádraig > > ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RFE: enable buffering on null-terminated data
  2024-03-10 20:36   ` Carl Edquist
@ 2024-03-11  3:48     ` Zachary Santer
  2024-03-11 11:54       ` Carl Edquist
  0 siblings, 1 reply; 53+ messages in thread
From: Zachary Santer @ 2024-03-11 3:48 UTC (permalink / raw)
  To: Carl Edquist; +Cc: libc-alpha, coreutils, p

[-- Attachment #1: Type: text/plain, Size: 5283 bytes --]

On Sun, Mar 10, 2024 at 4:36 PM Carl Edquist <edquist@cs.wisc.edu> wrote:
>
> Hi Zack,
>
> This sounds like a potentially useful feature (it'd probably belong with a
> corresponding new buffer mode in setbuf(3)) ...
>
> > Filenames should be passed between utilities in a null-terminated
> > fashion, because the null byte is the only byte that can't appear within
> > one.
>
> Out of curiosity, do you have an example command line for your use case?

My use for 'stdbuf --output=L' is to be able to run a command within a
bash coprocess. (Really, a background process communicating with the
parent process through FIFOs, since Bash prints a warning message if you
try to run more than one coprocess at a time. Shouldn't make a difference
here.)

See coproc-buffering, attached.

Without making the command's output either line-buffered or unbuffered,
what I'm doing there would deadlock. I feed one line in and then expect
to be able to read a transformed line immediately. If that transformed
line is stuck in a buffer that's still waiting to be filled, then
nothing happens.

I swear doing this actually makes sense in my application.

$ ./coproc-buffering 100000
Line-buffered:
real    0m17.795s
user    0m6.234s
sys     0m11.469s
Unbuffered:
real    0m21.656s
user    0m6.609s
sys     0m14.906s

When I initially implemented this thing, I felt lucky that the data I
was passing in were lines ending in newlines, and not null-terminated,
since my script gets to benefit from 'stdbuf --output=L'. Truth be told,
I don't currently have a need for --output=N.
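[Archive note: the one-line-in, one-line-out exchange described above can be sketched with plain FIFOs, as a minimal stand-in for the attached coproc-buffering script. The FIFO paths are invented; the sed expression is the one the attachment uses.]

```shell
#!/usr/bin/env bash
# One line in, one transformed line out, through a background filter.
# With default block buffering the read would hang; stdbuf -oL makes
# the filter flush after each line.
dir=$(mktemp -d)
mkfifo "$dir/in" "$dir/out"
stdbuf --output=L sed -E 's/[[:blank:]]+$//' < "$dir/in" > "$dir/out" &
exec {w}> "$dir/in" {r}< "$dir/out"
printf 'a line with trailing spaces   \n' >&"$w"
IFS= read -r line <&"$r"
printf '|%s|\n' "$line"
exec {w}>&- {r}<&-
wait
rm -rf "$dir"
```

Opening the write end of a FIFO blocks until a reader appears, so the background filter must be started first, exactly as described in the message.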
Of course, sed and all sorts of other Linux command-line tools can produce or handle null-terminated data. > > If I want to buffer output data on null bytes, the closest I can get is > > 'stdbuf --output=0', which doesn't buffer at all. This is pretty > > inefficient. > > I'm just thinking that find(1), for instance, will end up calling write(2) > exactly once per filename (-print or -print0) if run under stdbuf > unbuffered, which is the same as you'd get with a corresponding stdbuf > line-buffered mode (newline or null-terminated). > > It seems that where line buffering improves performance over unbuffered is > when there are several calls to (for example) printf(3) in constructing a > single line. find(1), and some filters like grep(1), will write a line at > a time in unbuffered mode, and thus don't seem to benefit at all from line > buffering. On the other hand, cut(1) appears to putchar(3) a byte at a > time, which in unbuffered mode will (like you say) be pretty inefficient. > > So, depending on your use case, a new null-terminated line buffered option > may or may not actually improve efficiency over unbuffered mode. I hadn't considered that. > You can run your commands under strace like > > stdbuf --output=X strace -c -ewrite command ... | ... > > to count the number of actual writes for each buffering mode. I'm running bash in MSYS2 on a Windows machine, so hopefully that doesn't invalidate any assumptions. Now setting up strace around the things within the coprocess, and only passing in one line, I now have coproc-buffering-strace, attached. Giving the argument 'L', both sed and expand call write() once. Giving the argument 0, sed calls write() twice and expand calls it a bunch of times, seemingly once for each character it outputs. So I guess that's it. 
$ ./coproc-buffering-strace L
| Line with tabs why?|
$ grep -c -F 'write:' sed-trace.txt expand-trace.txt
sed-trace.txt:1
expand-trace.txt:1
$ ./coproc-buffering-strace 0
| Line with tabs why?|
$ grep -c -F 'write:' sed-trace.txt expand-trace.txt
sed-trace.txt:2
expand-trace.txt:30

> Carl
>
> PS, "find -printf" recognizes a '\c' escape to flush the output, in case
> that helps. So "find -printf '%p\0\c'" would, for instance, already
> behave the same as "stdbuf --output=N find -print0" with the new stdbuf
> output mode you're suggesting.
>
> (Though again, this doesn't actually seem to be any more efficient than
> running "stdbuf --output=0 find -print0")
>
> On Sun, 10 Mar 2024, Zachary Santer wrote:
>
> > Was "stdbuf feature request - line buffering but for null-terminated data"
> >
> > See below.
> >
> > On Sun, Mar 10, 2024 at 5:38 AM Pádraig Brady <P@draigbrady.com> wrote:
> >>
> >> On 09/03/2024 16:30, Zachary Santer wrote:
> >>> 'stdbuf --output=L' will line-buffer the command's output stream.
> >>> Pretty useful, but that's looking for newlines. Filenames should be
> >>> passed between utilities in a null-terminated fashion, because the
> >>> null byte is the only byte that can't appear within one.
> >>>
> >>> If I want to buffer output data on null bytes, the closest I can get
> >>> is 'stdbuf --output=0', which doesn't buffer at all. This is pretty
> >>> inefficient.
> >>>
> >>> 0 means unbuffered, and Z is already taken for, I guess, zebibytes.
> >>> --output=N, then?
> >>>
> >>> Would this require a change to libc implementations, or is it possible now?
> >>
> >> This does seem like useful functionality,
> >> but it would require support for libc implementations first.
> >>
> >> cheers,
> >> Pádraig
> >
> >

[-- Attachment #2: coproc-buffering --]
[-- Type: application/octet-stream, Size: 1154 bytes --]

#!/usr/bin/env bash

set -o nounset -o noglob +o braceexpand
shopt -s lastpipe
export LC_ALL='C.UTF-8'

tab_spaces=8
sed_expr='s/[[:blank:]]+$//'
test=$' \tLine with tabs\t why?\t '
repeat="${1}"

coproc line_buffered {
  stdbuf --output=L -- \
    sed --binary --regexp-extended --expression="${sed_expr}" |
    stdbuf --output=L -- \
      expand --tabs="${tab_spaces}"
}

printf '%s' "Line-buffered:"
time {
  for (( i = 0; i < repeat; i++ )); do
    printf '%s\n' "${test}" >&"${line_buffered[1]}"
    IFS='' read -r line <&"${line_buffered[0]}"
    printf '|%s|\n' "${line}" > /dev/null
  done
}
exec {line_buffered[0]}<&- {line_buffered[1]}>&-
wait "${line_buffered_PID}"

coproc unbuffered {
  stdbuf --output=0 -- \
    sed --binary --regexp-extended --expression="${sed_expr}" |
    stdbuf --output=0 -- \
      expand --tabs="${tab_spaces}"
}

printf '%s' "Unbuffered:"
time {
  for (( i = 0; i < repeat; i++ )); do
    printf '%s\n' "${test}" >&"${unbuffered[1]}"
    IFS='' read -r line <&"${unbuffered[0]}"
    printf '|%s|\n' "${line}" > /dev/null
  done
}
exec {unbuffered[0]}<&- {unbuffered[1]}>&-
wait "${unbuffered_PID}"

[-- Attachment #3: coproc-buffering-strace --]
[-- Type: application/octet-stream, Size: 695 bytes --]

#!/usr/bin/env bash

set -o nounset -o noglob +o braceexpand
shopt -s lastpipe
export LC_ALL='C.UTF-8'

tab_spaces=8
sed_expr='s/[[:blank:]]+$//'
test=$' \tLine with tabs\t why?\t '
buffer_setting="${1}"

coproc buffer_test {
  stdbuf --output="${buffer_setting}" -- \
    strace -e -o sed-trace.txt \
      sed --binary --regexp-extended --expression="${sed_expr}" |
    stdbuf --output="${buffer_setting}" -- \
      strace -e -o expand-trace.txt \
        expand --tabs="${tab_spaces}"
}

printf '%s\n' "${test}" >&"${buffer_test[1]}"
IFS='' read -r line <&"${buffer_test[0]}"
printf '|%s|\n' "${line//$'\t'/TAB}"
exec {buffer_test[0]}<&- {buffer_test[1]}>&-
wait "${buffer_test_PID}"

^ permalink raw reply	[flat|nested] 53+ messages in thread
* Re: RFE: enable buffering on null-terminated data 2024-03-11 3:48 ` Zachary Santer @ 2024-03-11 11:54 ` Carl Edquist 2024-03-11 15:12 ` Examples of concurrent coproc usage? Zachary Santer 2024-03-12 3:34 ` RFE: enable buffering on null-terminated data Zachary Santer 0 siblings, 2 replies; 53+ messages in thread From: Carl Edquist @ 2024-03-11 11:54 UTC (permalink / raw) To: Zachary Santer; +Cc: libc-alpha, coreutils, p [-- Attachment #1: Type: text/plain, Size: 7977 bytes --] On Sun, 10 Mar 2024, Zachary Santer wrote: > On Sun, Mar 10, 2024 at 4:36 PM Carl Edquist <edquist@cs.wisc.edu> wrote: >> >> Out of curiosity, do you have an example command line for your use case? > > My use for 'stdbuf --output=L' is to be able to run a command within a > bash coprocess. Oh, cool, now you're talking! ;) > (Really, a background process communicating with the parent process > through FIFOs, since Bash prints a warning message if you try to run > more than one coprocess at a time. Shouldn't make a difference here.) (Kind of a side-note ... bash's limited coprocess handling was a long standing annoyance for me in the past, to the point that I wrote a bash coprocess management library to handle multiple active coprocess and give convenient methods for interaction. Perhaps the trickiest bit about multiple coprocesses open at once (which I suspect is the reason support was never added to bash) is that you don't want the second and subsequent coprocesses to inherit the pipe fds of prior open coprocesses. This can result in deadlock if, for instance, you close your write end to coproc1, but coproc1 continues to wait for input because coproc2 also has a copy of a write end of the pipe to coproc1's input. So you need to be smart about subsequent coprocesses first closing all fds associated with other coprocesses. 
Word to the wise: you might encounter this issue (coproc2 prevents coproc1 from seeing its end-of-input) even though you are rigging this up yourself with FIFOs rather than bash's coproc builtin.) > See coproc-buffering, attached. Thanks! > Without making the command's output either line-buffered or unbuffered, > what I'm doing there would deadlock. I feed one line in and then expect > to be able to read a transformed line immediately. If that transformed > line is stuck in a buffer that's still waiting to be filled, then > nothing happens. > > I swear doing this actually makes sense in my application. Yeah makes sense! I am familiar with the problem you're describing. (In my coprocess management library, I effectively run every coproc with --output=L by default, by eval'ing the output of 'env -i stdbuf -oL env', because most of the time for a coprocess, that's whats wanted/necessary.) ... Although, for your example coprocess use, where the shell both produces the input for the coproc and consumes its output, you might be able to simplify things by making the producer and consumer separate processes. Then you could do a simpler 'producer | filter | consumer' without having to worry about buffering at all. But if the producer and consumer need to be in the same process (eg they share state and are logically interdependent), then yeah that's where you need a coprocess for the filter. ... On the other hand, if the issue is that someone is producing one line at a time _interactively_ (that is, inputting text or commands from a terminal), then you might argue that the performance hit for unbuffered output will be insignificant compared to time spent waiting for terminal input. > $ ./coproc-buffering 100000 > Line-buffered: > real 0m17.795s > user 0m6.234s > sys 0m11.469s > Unbuffered: > real 0m21.656s > user 0m6.609s > sys 0m14.906s Yeah, this makes sense in your particular example. 
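[Archive note: the 'env -i stdbuf -oL env' trick mentioned above can be seen directly on GNU/Linux, where coreutils stdbuf works by exporting variables that a preloaded libstdbuf reads.]

```shell
# Print only the variables stdbuf injects; eval'ing these as exports in
# a shell makes subsequently launched dynamically linked children
# line-buffer their stdout. Output includes _STDBUF_O=L and LD_PRELOAD.
env -i stdbuf -oL env
```

This is GNU-specific behavior (an LD_PRELOAD shim), so it does not affect statically linked programs or programs that adjust their own buffering.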
It looks like expand(1) uses putchar(3), so in unbuffered mode this
translates to one write(2) call for every byte. sed(1) is not quite as
bad - in unbuffered it appears to output the line and the newline
terminator separately, so two write(2) calls for every line.

So in both cases (but especially for expand), line buffering reduces the
number of write(2) calls. (Although given your time output, you might
say the performance hit for unbuffered is not that huge.)

> When I initially implemented this thing, I felt lucky that the data I
> was passing in were lines ending in newlines, and not null-terminated,
> since my script gets to benefit from 'stdbuf --output=L'.

:thumbsup:

> Truth be told, I don't currently have a need for --output=N.

Mmm-hmm :)

> Of course, sed and all sorts of other Linux command-line tools can
> produce or handle null-terminated data.

Definitely. So in the general case, theoretically it seems as useful to
buffer output on nul bytes.

Note that for gnu sed in particular, there is a -u/--unbuffered option,
which will effectively give you line buffered output, including
buffering on nul bytes with -z/--null-data.

... I'll be honest though, I am having trouble imagining a realistic
pipeline that filters filenames with embedded newlines using expand(1) ;)

... But, I want to be a good sport here and contrive an actual use case.

So for fun, say I want to use cut(1) (which performs poorly when
unbuffered) in a coprocess that takes null-terminated file paths on
input and outputs the first directory component (which possibly contains
embedded newlines).

The basic command in the coprocess would be:

	cut -d/ -f1 -z

but with the default block buffering for pipe output, that will hang
(the problem you describe) if you expect to read a record back from it
after each record sent.
The unbuffered approach works, but (as discussed) is pretty inefficient:

	stdbuf --output=0 cut -d/ -f1 -z

But, if we swap nul bytes and newlines before and after cut, then we can
run cut with regular newline line buffering, and get the desired effect:

	stdbuf --output=0 tr '\0\n' '\n\0' |
	stdbuf --output=L cut -d/ -f1     |
	stdbuf --output=0 tr '\0\n' '\n\0'

The embedded newlines in filenames will be passed by tr(1) to cut(1) as
embedded nul bytes, cut will line-buffer its output, and the second tr
will restore the original embedded newlines & null-terminated records.

Note that unbuffered tr(1) will still output its translated input in
blocks (with fwrite(3)) rather than a byte at a time, so tr will
effectively give buffered output with the same size as the input
records. (That is, newline or null-terminated input records will
effectively produce newline or null-terminated output buffering,
respectively.)

I'd venture to guess that most of the standard filters could be made to
pass along null-terminated records as line-buffered records the same
way. Might even package it into a convenience function to set them up:

	swap_znl () { stdbuf -o0 tr '\0\n' '\n\0'; }

	nulterm_via_linebuf () { swap_znl | stdbuf -oL "$@" | swap_znl; }

Then, for example, stand it up with bash's coproc:

	$ coproc DC1 { nulterm_via_linebuf cut -d/ -f1; }
	$ printf 'a\nb/c\nd/efg\0' >&${DC1[1]}
	$ IFS='' read -rd '' -u ${DC1[0]} DIR
	$ echo "[$DIR]"
	[a
	b]

(or however else you manage your coprocesses.)

It's a workaround, and it keeps the kind of buffering you'd get with a
'stdbuf --output=N', but to be fair the extra data shoveling is not
exactly free.

... So ... again in theory I also feel like a null-terminated buffering
mode for stdbuf(1) (and setbuf(3)) is kind of a missing feature. It may
just be that nobody has actually had a real need for it. (Yet?)

> I'm running bash in MSYS2 on a Windows machine, so hopefully that
> doesn't invalidate any assumptions.

Ooh. No idea.
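[Archive note: the swap pipeline can also be exercised straight through, outside a coprocess. This sketch feeds it the same record as the coproc example; the final tr is only there to make the terminating NUL visible.]

```shell
#!/usr/bin/env bash
# One null-terminated record, "a\nb/c\nd/efg" (a path whose first
# component contains an embedded newline), goes through the swap trick:
# NUL<->newline before and after a line-buffered cut.
printf 'a\nb/c\nd/efg\0' |
  stdbuf -o0 tr '\0\n' '\n\0' |
  stdbuf -oL cut -d/ -f1 |
  stdbuf -o0 tr '\0\n' '\n\0' |
  tr '\0' '|'
# prints "a" then "b|": the embedded newline is restored, and the
# record's terminating NUL is shown as '|'
```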
Your strace and sed might have different options than mine. Also, I am not sure if there are different pipe and fd duplication semantics, compared to linux. But, based on the examples & output you're giving, I think we're on the same page for the discussion. > Now setting up strace around the things within the coprocess, and only > passing in one line, I now have coproc-buffering-strace, attached. > Giving the argument 'L', both sed and expand call write() once. Giving > the argument 0, sed calls write() twice and expand calls it a bunch of > times, seemingly once for each character it outputs. So I guess that's > it. :thumbsup: Yeah that matches what I was seeing also. Thanks for humoring the peanut gallery here :D Carl ^ permalink raw reply [flat|nested] 53+ messages in thread
* Examples of concurrent coproc usage? 2024-03-11 11:54 ` Carl Edquist @ 2024-03-11 15:12 ` Zachary Santer 2024-03-14 9:58 ` Carl Edquist 2024-03-12 3:34 ` RFE: enable buffering on null-terminated data Zachary Santer 1 sibling, 1 reply; 53+ messages in thread From: Zachary Santer @ 2024-03-11 15:12 UTC (permalink / raw) To: Carl Edquist, bug-bash; +Cc: libc-alpha Was "RFE: enable buffering on null-terminated data" On Mon, Mar 11, 2024 at 7:54 AM Carl Edquist <edquist@cs.wisc.edu> wrote: > > On Sun, 10 Mar 2024, Zachary Santer wrote: > > > On Sun, Mar 10, 2024 at 4:36 PM Carl Edquist <edquist@cs.wisc.edu> wrote: > >> > >> Out of curiosity, do you have an example command line for your use case? > > > > My use for 'stdbuf --output=L' is to be able to run a command within a > > bash coprocess. > > Oh, cool, now you're talking! ;) > > > > (Really, a background process communicating with the parent process > > through FIFOs, since Bash prints a warning message if you try to run > > more than one coprocess at a time. Shouldn't make a difference here.) > > (Kind of a side-note ... bash's limited coprocess handling was a long > standing annoyance for me in the past, to the point that I wrote a bash > coprocess management library to handle multiple active coprocess and give > convenient methods for interaction. Perhaps the trickiest bit about > multiple coprocesses open at once (which I suspect is the reason support > was never added to bash) is that you don't want the second and subsequent > coprocesses to inherit the pipe fds of prior open coprocesses. This can > result in deadlock if, for instance, you close your write end to coproc1, > but coproc1 continues to wait for input because coproc2 also has a copy of > a write end of the pipe to coproc1's input. So you need to be smart about > subsequent coprocesses first closing all fds associated with other > coprocesses. 
https://lists.gnu.org/archive/html/help-bash/2021-03/msg00296.html
https://lists.gnu.org/archive/html/help-bash/2021-04/msg00136.html

You're on the money, though there is a preprocessor directive you can
build bash with that will allow it to handle multiple concurrent
coprocesses without complaining: MULTIPLE_COPROCS=1. Chet Ramey's
sticking point was that he hadn't seen coprocesses used enough in the
wild to satisfactorily test that his implementation did in fact keep the
coproc file descriptors out of subshells. If you've got examples you can
direct him to, I'd really appreciate it.

> Word to the wise: you might encounter this issue (coproc2 prevents
> coproc1 from seeing its end-of-input) even though you are rigging this
> up yourself with FIFOs rather than bash's coproc builtin.)

In my case, it's mostly a non-issue, because I fork the - now three -
background processes before exec'ing automatic fds redirecting to/from
their FIFOs in the parent process. All the automatic fds get put in an
array, and I do close them all at the beginning of a subsequent process
substitution.

^ permalink raw reply	[flat|nested] 53+ messages in thread
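[Archive note: a minimal sketch of that FIFO arrangement with two background filters, using invented paths and filters rather than Zachary's actual script. It includes the explicit close that keeps the second background process from holding the first one's write end open - the deadlock Carl warns about above.]

```shell
#!/usr/bin/env bash
# Two FIFO "coprocesses". The second one first closes its inherited copy
# of the first pipe's write end, so that closing $w1 in the parent
# actually delivers EOF to wc. Without that close, the read would hang.
dir=$(mktemp -d)
mkfifo "$dir"/{in1,out1,in2,out2}
wc < "$dir/in1" > "$dir/out1" &
exec {w1}> "$dir/in1" {r1}< "$dir/out1"
{ exec {w1}>&-; cat < "$dir/in2" > "$dir/out2"; } &   # drop stray copy
exec {w2}> "$dir/in2" {r2}< "$dir/out2"
exec {w1}>&-             # now wc really sees end-of-input
read -r -u "$r1" counts
echo $counts             # wc of empty input: 0 0 0
exec {w2}>&- {r2}<&-
wait
rm -rf "$dir"
```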
* Re: Examples of concurrent coproc usage? 2024-03-11 15:12 ` Examples of concurrent coproc usage? Zachary Santer @ 2024-03-14 9:58 ` Carl Edquist 2024-03-17 19:40 ` Zachary Santer ` (3 more replies) 0 siblings, 4 replies; 53+ messages in thread From: Carl Edquist @ 2024-03-14 9:58 UTC (permalink / raw) To: Zachary Santer; +Cc: bug-bash, libc-alpha [-- Attachment #1: Type: text/plain, Size: 11125 bytes --] [My apologies up front for the length of this email. The short story is I played around with the multi-coproc support: the fd closing seems to work fine to prevent deadlock, but I found one bug apparently introduced with multi-coproc support, and one other coproc bug that is not new.] On Mon, 11 Mar 2024, Zachary Santer wrote: > Was "RFE: enable buffering on null-terminated data" > > On Mon, Mar 11, 2024 at 7:54 AM Carl Edquist <edquist@cs.wisc.edu> wrote: >> >> (Kind of a side-note ... bash's limited coprocess handling was a long >> standing annoyance for me in the past, to the point that I wrote a bash >> coprocess management library to handle multiple active coprocess and >> give convenient methods for interaction. Perhaps the trickiest bit >> about multiple coprocesses open at once (which I suspect is the reason >> support was never added to bash) is that you don't want the second and >> subsequent coprocesses to inherit the pipe fds of prior open >> coprocesses. This can result in deadlock if, for instance, you close >> your write end to coproc1, but coproc1 continues to wait for input >> because coproc2 also has a copy of a write end of the pipe to coproc1's >> input. So you need to be smart about subsequent coprocesses first >> closing all fds associated with other coprocesses. > > https://lists.gnu.org/archive/html/help-bash/2021-03/msg00296.html > https://lists.gnu.org/archive/html/help-bash/2021-04/msg00136.html Oh hey! Look at that. Thanks for the links to this thread - I gave them a read (along with the old thread from 2011-04). 
I feel a little bad I missed the 2021 discussion.

> You're on the money, though there is a preprocessor directive you can
> build bash with that will allow it to handle multiple concurrent
> coprocesses without complaining: MULTIPLE_COPROCS=1.

Who knew! Thanks for mentioning it. When I saw that "only one active
coprocess at a time" was _still_ listed in the bugs section in bash 5, I
figured multiple coprocess support had just been abandoned.

Chet, that's cool that you implemented it.

I kind of went all-out on my bash coprocess management library though
(mostly back in 2014-2016) ... It's pretty feature-rich and pleasant to
use -- to the point that I don't think there is any going-back to bash's
internal coproc for me, even with multiple coprocess support. I
implemented it with shell functions, so it doesn't rely on compiling
anything or the latest version of bash being present. (I even added
bash3 support for older systems.)

> Chet Ramey's sticking point was that he hadn't seen coprocesses used
> enough in the wild to satisfactorily test that his implementation did in
> fact keep the coproc file descriptors out of subshells.

To be fair coproc is kind of a niche feature. But I think more people
would play with it if it were less awkward to use and if they felt free
to experiment with multiple coprocs.

By the way, I agree with Chet's exact description of the problems here:

https://lists.gnu.org/archive/html/help-bash/2021-03/msg00282.html

The issue is separate from the stdio buffering discussion; the issue
here is with child processes (and I think not foreground subshells, but
specifically background processes, including coprocesses) inheriting the
shell's fds that are open to pipes connected to an active coprocess.
Not getting a sigpipe/write failure results in a coprocess sitting
around longer than it ought to, but it's not obvious (to me) how this
leads to deadlock, since the shell at least has closed its read end of
the pipe to that coprocess, so at least you aren't going to hang trying
to read from it.

On the other hand, a coprocess not seeing EOF will cause deadlock pretty
readily, especially if it processes all its input before producing
output (as with wc, sort, sha1sum). Trying to read from the coprocess
will hang indefinitely if the coprocess is still waiting for input,
which is the case if there is another copy of the write end of its read
pipe open somewhere.

> If you've got examples you can direct him to, I'd really appreciate it.

[My original use cases for multiple coprocesses were (1) for
programmatically interacting with multiple command-line database clients
together, and (2) for talking to multiple interactive command-line game
engines (othello) to play each other. Perl's IPC::Open2 works, too, but
it's easier to experiment on the fly in bash. And in general having the
freedom to play with multiple coprocesses helps mock up more complicated
pipelines, or even webs of interconnected processes.]

But you can create a deadlock without doing anything fancy.

Well, *without multi-coproc support*, here's a simple wc example; first
with a single coproc:

	$ coproc WC { wc; }
	$ exec {WC[1]}>&-
	$ read -u ${WC[0]} X
	$ echo $X
	0 0 0

This works as expected.

But if you try it with a second coproc (again, without multi-coproc
support), the second coproc will inherit copies of the shell's read and
write pipe fds to the first coproc, and the read will hang (as described
above), as the first coproc doesn't see EOF:

	$ coproc WC { wc; }
	$ coproc CAT { cat; }
	$ exec {WC[1]}>&-
	$ read -u ${WC[0]} X    # HANGS

But, this can be observed even before attempting the read that hangs.
You can 'ps' to see the user shell (bash), the coprocs' shells (bash),
and the coprocs' commands (wc & cat).
Then 'ls -l /proc/PID/fd/' to see what they have open:

- The user shell has its copies of the read & write fds open for both
  coprocs (as it should)

- The coproc commands (wc & cat) each have only a single read & write
  pipe open, on fd 0 & 1 (as they should)

- The first coproc's shell (WC) has only a single read & write pipe
  open, on fd 0 & 1 (as it should)

- The second coproc's shell (CAT) has its own read & write pipes open,
  on fd 0 & 1 (good), but it also has a copy of the user shell's read &
  write pipe fds to the first coproc (WC) open (on fd 60 & 63 in this
  case, which it inherited when forking from the user shell)

(And in general, latter coproc shells will have stray copies of the user
shell's r/w ends from all previous coprocs.)

So, you can examine the situation after setting up coprocs, to see if
all the coproc-related processes have just two pipes open (on fd 0 & 1).
If this is the case, I think that suffices to convince me anyway that no
deadlocks related to stray open fds can happen. But if any of them has
other pipes open (inherited from the user shell), that indicates the
problem.

I tried compiling the latest bash with MULTIPLE_COPROCS=1 (version
5.2.21(1)) to test out the multi-coproc support. I tried standing up the
above WC and CAT coprocs, together with some others to check that the
behavior looked ok for pipelines also (which I think was one of Chet's
concerns)

	$ coproc WC { wc; }
	$ coproc CAT { cat; }
	$ coproc CAT3 { cat | cat | cat; }
	$ coproc CAT4 { cat | cat | cat | cat; }
	$ coproc CATX { cat ; }

And as far as the fd situation, everything checks out: the user shell
has fds open to all the coprocs, and the coproc shells & coproc commands
(including all the cat's in the pipelines) have only a single read &
write pipe open on fd 0 & 1. So, the multi-coproc code seems to be
closing the shell's copies correctly.

[The examples are boring, but their point is just to investigate the
stray-fd question.]

HOWEVER!!!
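[Archive note: the /proc fd audit described above can be scripted. This is a hedged, Linux-only sketch with invented stand-ins (sleep as the "coprocess", a throwaway pipe fd): a background child keeps an inherited pipe fd unless it is closed explicitly for that child.]

```shell
#!/usr/bin/env bash
# The parent holds $keep open on a pipe. The first child inherits a
# copy; the second child has it closed via a per-command redirection.
exec {keep}< <(:)                 # a pipe fd the parent holds open
sleep 2 & inherits=$!             # inherits a copy of fd $keep
sleep 2 {keep}<&- & clean=$!      # same, but the stray copy is closed
sleep 0.2                         # give both children time to start
ls /proc/"$inherits"/fd | grep -qx "$keep" &&
  echo "child $inherits still holds fd $keep"
ls /proc/"$clean"/fd | grep -qx "$keep" ||
  echo "child $clean does not"
kill "$inherits" "$clean" 2>/dev/null
```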
Unexpectedly, the new multi-coproc code seems to close the user shell's
end of a coprocess's pipes, once the coprocess has terminated. When
compiled with MULTIPLE_COPROCS=1, this is true even if there is only a
single coproc:

	$ coproc WC { wc; }
	$ exec {WC[1]}>&-
	[1]+  Done                    coproc WC { wc; }

	# WC var gets cleared!!
	# shell's ${WC[0]} is also closed!

	# now, can't do:
	$ read -u ${WC[0]} X
	$ echo $X

I'm attaching a "bad-coproc-log.txt" with more detailed ps & ls output
examining the open fds at each step, to make it clear what's happening.

This is a bug. The shell should not automatically close its read pipe to
a coprocess that has terminated -- it should stay open to read the final
output, and the user should be responsible for closing the read end
explicitly.

This is more obvious for commands that wait until they see EOF before
generating any output (wc, sort, sha1sum). But it's also true for any
command that produces output (filters (sed) or generators (ls)). If the
shell's read end is closed automatically, any final output waiting in
the pipe will be discarded.

It also invites trouble if the shell variable that holds the fds gets
removed unexpectedly when the coprocess terminates. (Suddenly the
variable expands to an empty string.)

It seems to me that the proper time to clear the coproc variable (if at
all) is after the user has explicitly closed both of the fds. *Or* else
add an option to the coproc keyword to explicitly close the coproc -
which will close both fds and clear the variable.

...

Separately, I consider the following coproc behavior to be weird,
fragile, and broken. If you fg a coproc, then stop and bg it, it dies.
Why? Apparently the shell abandons the coproc when it is stopped, closes
the pipe fds for it, and clears the fd variable.

	$ coproc CAT { cat; }
	[1] 10391
	$ fg
	coproc CAT { cat; }    # oops!
	^Z
	[1]+  Stopped                 coproc CAT { cat; }
	$ echo ${CAT[@]}    # what happened to the fds?
$ ls -lgo /proc/$$/fd/
total 0
lrwx------ 1 64 Mar 14 02:26 0 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:26 1 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:25 2 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:26 255 -> /dev/pts/3
$ bg
[1]+ coproc CAT { cat; } &
$
[1]+  Done                    coproc CAT { cat; }
$ # sad user :(

This behavior is not new to the multi-coproc support.  But just the same
it seems broken for the shell to automatically close the fds to
coprocesses.  That should be done explicitly by the user.

>> Word to the wise: you might encounter this issue (coproc2 prevents
>> coproc1 from seeing its end-of-input) even though you are rigging this
>> up yourself with FIFOs rather than bash's coproc builtin.)
>
> In my case, it's mostly a non-issue, because I fork the - now three -
> background processes before exec'ing automatic fds redirecting to/from
> their FIFO's in the parent process.  All the automatic fds get put in an
> array, and I do close them all at the beginning of a subsequent process
> substitution.

That's a nice trick with the shell backgrounding all the coprocesses
before connecting the fifos.  But yeah, to make subsequent coprocesses
you do still have to close the copy of the user shell's fds that the
coprocess shell inherits.  It sounds like you are doing that (nice!),
but in any case it requires some care, and as these stack up it is
really handy to have something manage it all for you.

(Perhaps this is where I ask if you are happy with your solution or if
you would like to try out something wildly more flexible...)

Happy coprocessing!
:)

Carl

[-- Attachment #2: Type: text/plain, Size: 1358 bytes --]

$ coproc WC { wc; }
[1] 10038
$ ps
  PID TTY          TIME CMD
 9926 pts/3    00:00:00 bash
10038 pts/3    00:00:00 bash
10039 pts/3    00:00:00 wc
10040 pts/3    00:00:00 ps
$ ls -lgo /proc/{$$,10038,10039}/fd/
/proc/10038/fd/:
total 0
lr-x------ 1 64 Mar 14 02:29 0 -> pipe:[81214]
l-wx------ 1 64 Mar 14 02:29 1 -> pipe:[81213]
lrwx------ 1 64 Mar 14 02:28 2 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:29 255 -> /dev/pts/3

/proc/10039/fd/:
total 0
lr-x------ 1 64 Mar 14 02:29 0 -> pipe:[81214]
l-wx------ 1 64 Mar 14 02:29 1 -> pipe:[81213]
lrwx------ 1 64 Mar 14 02:28 2 -> /dev/pts/3

/proc/9926/fd/:
total 0
lrwx------ 1 64 Mar 14 02:26 0 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:26 1 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:25 2 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:26 255 -> /dev/pts/3
l-wx------ 1 64 Mar 14 02:26 60 -> pipe:[81214]
lr-x------ 1 64 Mar 14 02:26 63 -> pipe:[81213]
$ echo ${WC[@]}
63 60
$ exec {WC[1]}>&-
[1]+  Done                    coproc WC { wc; }
$ ps
  PID TTY          TIME CMD
 9926 pts/3    00:00:00 bash
10042 pts/3    00:00:00 ps
$ echo ${WC[@]}

$ ls -lgo /proc/$$/fd/
total 0
lrwx------ 1 64 Mar 14 02:26 0 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:26 1 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:25 2 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:26 255 -> /dev/pts/3

^ permalink raw reply	[flat|nested] 53+ messages in thread
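[Editorial aside: the pipe plumbing examined above can be sanity-checked with a minimal round trip that avoids any dependence on reap timing — using cat instead of wc, so each line can be read back while the coprocess is still alive. This is an illustrative sketch, not part of the original thread.]

```shell
#!/bin/bash
# cat echoes each input line immediately, so the shell can read the
# coproc's output back *before* sending EOF -- no reap-timing race.
coproc CAT { cat; }

echo hello >&"${CAT[1]}"        # write one line to the coproc
read -r -u "${CAT[0]}" line     # read it straight back
echo "$line"                    # -> hello

exec {CAT[1]}>&-                # close the write end: cat sees EOF, exits
wait "$CAT_PID" 2>/dev/null || true   # reap if the shell hasn't already
```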
* Re: Examples of concurrent coproc usage? 2024-03-14 9:58 ` Carl Edquist @ 2024-03-17 19:40 ` Zachary Santer 2024-04-01 19:24 ` Chet Ramey ` (2 subsequent siblings) 3 siblings, 0 replies; 53+ messages in thread From: Zachary Santer @ 2024-03-17 19:40 UTC (permalink / raw) To: Carl Edquist; +Cc: bug-bash, libc-alpha On Thu, Mar 14, 2024 at 6:57 AM Carl Edquist <edquist@cs.wisc.edu> wrote: > (And in general, latter coproc shells will have stray copies of the user > shell's r/w ends from all previous coprocs.) I didn't know that without MULTIPLE_COPROCS=1, bash wouldn't even attempt to keep the fds from earlier coprocs out of later coprocs. > Unexpectedly, the new multi-coproc code seems to close the user shell's > end of a coprocess's pipes, once the coprocess has terminated. When > compiled with MULTIPLE_COPROCS=1, this is true even if there is only a > single coproc: > This is a bug. The shell should not automatically close its read pipe to > a coprocess that has terminated -- it should stay open to read the final > output, and the user should be responsible for closing the read end > explicitly. > It also invites trouble if the shell variable that holds the fds gets > removed unexpectedly when the coprocess terminates. (Suddenly the > variable expands to an empty string.) It seems to me that the proper time > to clear the coproc variable (if at all) is after the user has explicitly > closed both of the fds. *Or* else add an option to the coproc keyword to > explicitly close the coproc - which will close both fds and clear the > variable. I agree. This was the discussion in [1], where it sounds like this was the intended behavior. The array that bash originally created to store the coproc fds is removed immediately, but the fds are evidently closed at some later, indeterminate point. So, if you store the coproc fds in a different array than the one bash gave you, you might still be able to read from the read fd for a little while. 
That sounds suspiciously like a race condition, though. The behavior without MULTIPLE_COPROCS=1 might have changed since that discussion. > That's a nice trick with the shell backgrounding all the coprocesses > before connecting the fifos. But yeah, to make subsequent coprocesses you > do still have to close the copy of the user shell's fds that the coprocess > shell inherits. It sounds like you are doing that (nice!), but in any > case it requires some care, and as these stack up it is really handy to > have something manage it all for you. I absolutely learned more about what I was doing from that conversation with Chet three years ago. > (Perhaps this is where I ask if you are happy with your solution or if you > would like to try out something wildly more flexible...) Admittedly, I am very curious to see your bash coprocess management library. I don't know how you could implement coprocesses outside of bash's coproc keyword without using FIFOs somehow. > Happy coprocessing! :) Thanks for your detailed description of all this. [1] https://lists.gnu.org/archive/html/help-bash/2021-04/msg00136.html ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-03-14 9:58 ` Carl Edquist 2024-03-17 19:40 ` Zachary Santer @ 2024-04-01 19:24 ` Chet Ramey 2024-04-01 19:31 ` Chet Ramey 2024-04-03 14:32 ` Chet Ramey 2024-04-17 14:37 ` Chet Ramey 3 siblings, 1 reply; 53+ messages in thread From: Chet Ramey @ 2024-04-01 19:24 UTC (permalink / raw) To: Carl Edquist, Zachary Santer; +Cc: chet.ramey, bug-bash, libc-alpha [-- Attachment #1.1: Type: text/plain, Size: 1431 bytes --] On 3/14/24 5:58 AM, Carl Edquist wrote: > But you can create a deadlock without doing anything fancy. > > > Well, *without multi-coproc support*, here's a simple wc example; first > with a single coproc: > > $ coproc WC { wc; } > $ exec {WC[1]}>&- > $ read -u ${WC[0]} X > $ echo $X > 0 0 0 > > This works as expected. > > But if you try it with a second coproc (again, without multi-coproc > support), the second coproc will inherit copies of the shell's read and > write pipe fds to the first coproc, and the read will hang (as described > above), as the first coproc doesn't see EOF: > > $ coproc WC { wc; } > $ coproc CAT { cat; } > $ exec {WC[1]}>&- > $ read -u ${WC[0]} X > > # HANGS > > > But, this can be observed even before attempting the read that hangs. Let's see if we can tackle these one at a time. This seems like it would be pretty easy to fix if a coproc closed the fds corresponding to an existing coproc in the child after the fork. That wouldn't really change anything regarding how scripts have to manually manage multiple coprocs, but it will prevent the shell from hanging. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage?
  2024-04-01 19:24 ` Chet Ramey
@ 2024-04-01 19:31 ` Chet Ramey
  2024-04-02 16:22 ` Carl Edquist
  0 siblings, 1 reply; 53+ messages in thread
From: Chet Ramey @ 2024-04-01 19:31 UTC (permalink / raw)
To: Carl Edquist, Zachary Santer; +Cc: chet.ramey, bug-bash, libc-alpha

[-- Attachment #1.1: Type: text/plain, Size: 1718 bytes --]

On 4/1/24 3:24 PM, Chet Ramey wrote:
> On 3/14/24 5:58 AM, Carl Edquist wrote:
>
>
>> But you can create a deadlock without doing anything fancy.
>>
>>
>> Well, *without multi-coproc support*, here's a simple wc example; first
>> with a single coproc:
>>
>> $ coproc WC { wc; }
>> $ exec {WC[1]}>&-
>> $ read -u ${WC[0]} X
>> $ echo $X
>> 0 0 0
>>
>> This works as expected.
>>
>> But if you try it with a second coproc (again, without multi-coproc
>> support), the second coproc will inherit copies of the shell's read and
>> write pipe fds to the first coproc, and the read will hang (as described
>> above), as the first coproc doesn't see EOF:
>>
>> $ coproc WC { wc; }
>> $ coproc CAT { cat; }
>> $ exec {WC[1]}>&-
>> $ read -u ${WC[0]} X
>>
>> # HANGS
>>
>>
>> But, this can be observed even before attempting the read that hangs.
>
> Let's see if we can tackle these one at a time. This seems like it would be
> pretty easy to fix if a coproc closed the fds corresponding to an existing
> coproc in the child after the fork. That wouldn't really change anything
> regarding how scripts have to manually manage multiple coprocs, but it
> will prevent the shell from hanging.
>

I sent this before I was ready. This would be equivalent to changing the
commands to use

  coproc CAT { exec {WC[0]}<&- {WC[1]}>&- ; cat; }

but the script writer wouldn't have to manage it.
-- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 53+ messages in thread
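[Editorial aside: Chet's suggested workaround can be exercised end to end on a default (single-coproc) bash build. To stay clear of the separate read-after-reap race discussed later in the thread, this sketch has wc write its result to a temporary file rather than back through a pipe; the temp file and variable names are illustrative, not from the thread.]

```shell
#!/bin/bash
out=$(mktemp)                   # illustrative scratch file

coproc WC { wc -l >"$out"; }    # wc's output goes to a file, not a pipe

# The manual workaround: the second coproc's child closes its inherited
# copies of WC's pipe fds, so closing WC[1] below really delivers EOF.
coproc CAT { exec {WC[0]}<&- {WC[1]}>&- ; cat; }

printf 'a\nb\n' >&"${WC[1]}"
exec {WC[1]}>&-                 # EOF: wc counts 2 lines and exits
wait "$WC_PID" 2>/dev/null || true

nlines=$(<"$out")
echo "$nlines"                  # -> 2

exec {CAT[1]}>&-                # let the second coproc exit too
rm -f "$out"
```

Without the `exec {WC[0]}<&- {WC[1]}>&-` in CAT's body, the `wait` above would hang on a default build, since CAT's inherited copy of WC[1] keeps wc from ever seeing EOF.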
* Re: Examples of concurrent coproc usage? 2024-04-01 19:31 ` Chet Ramey @ 2024-04-02 16:22 ` Carl Edquist 2024-04-03 13:54 ` Chet Ramey 0 siblings, 1 reply; 53+ messages in thread From: Carl Edquist @ 2024-04-02 16:22 UTC (permalink / raw) To: Chet Ramey; +Cc: Zachary Santer, bug-bash, libc-alpha [-- Attachment #1: Type: text/plain, Size: 1949 bytes --] On Mon, 1 Apr 2024, Chet Ramey wrote: > On 4/1/24 3:24 PM, Chet Ramey wrote: >> On 3/14/24 5:58 AM, Carl Edquist wrote: >> >>> Well, *without multi-coproc support*, here's a simple wc example; first >>> with a single coproc: >>> >>> $ coproc WC { wc; } >>> $ exec {WC[1]}>&- >>> $ read -u ${WC[0]} X >>> $ echo $X >>> 0 0 0 >>> >>> This works as expected. >>> >>> But if you try it with a second coproc (again, without multi-coproc >>> support), the second coproc will inherit copies of the shell's read and >>> write pipe fds to the first coproc, and the read will hang (as described >>> above), as the first coproc doesn't see EOF: >>> >>> $ coproc WC { wc; } >>> $ coproc CAT { cat; } >>> $ exec {WC[1]}>&- >>> $ read -u ${WC[0]} X >>> >>> # HANGS >>> >>> >>> But, this can be observed even before attempting the read that hangs. >> >> Let's see if we can tackle these one at a time. This seems like it >> would be pretty easy to fix if a coproc closed the fds corresponding >> to an existing coproc in the child after the fork. That wouldn't >> really change anything regarding how scripts have to manually manage >> multiple coprocs, but it will prevent the shell from hanging. >> > > I sent this before I was ready. This would be equivalent to changing the > commands to use > > coproc CAT { exec {WC[0]}<&- {WC[1]}>&- ; cat; } > > but the script writer wouldn't have to manage it. Agreed. 
And just to note two things (in case it wasn't clear) - (1) the above example that hangs is with the default bash, compiled _without_ multi-coproc support; and (2): > This seems like it would be pretty easy to fix if a coproc closed the > fds corresponding to an existing coproc in the child after the fork the forked coproc has to close its fds to/from _all_ other existing coprocs (as there can be several). Carl ^ permalink raw reply [flat|nested] 53+ messages in thread
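[Editorial aside: Carl's point — that each new coproc child must close its copies of the fds from *all* previous coprocs — can be scripted generically. The array name and helper function below are invented for illustration; nothing like them is built into bash.]

```shell
#!/bin/bash
# Hypothetical bookkeeping: track every coproc fd ourselves, and have
# each new coproc child close all of them before running its command.
COPROC_FDS=()

close_coproc_fds() {
    local fd
    for fd in "${COPROC_FDS[@]}"; do
        (( fd >= 0 )) && exec {fd}>&-   # skip entries already marked -1
    done
    return 0
}

coproc A { cat; }
COPROC_FDS+=("${A[0]}" "${A[1]}")

# B's child drops its inherited copies of A's pipes before running cat.
coproc B { close_coproc_fds; cat; }
COPROC_FDS+=("${B[0]}" "${B[1]}")

echo ping >&"${A[1]}"
read -r -u "${A[0]}" reply   # A is still alive; this cannot race
echo "$reply"                # -> ping

exec {A[1]}>&-   # now A really sees EOF, since B holds no stray copy
exec {B[1]}>&-
```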
* Re: Examples of concurrent coproc usage? 2024-04-02 16:22 ` Carl Edquist @ 2024-04-03 13:54 ` Chet Ramey 0 siblings, 0 replies; 53+ messages in thread From: Chet Ramey @ 2024-04-03 13:54 UTC (permalink / raw) To: Carl Edquist; +Cc: chet.ramey, Zachary Santer, bug-bash, libc-alpha On 4/2/24 12:22 PM, Carl Edquist wrote: >> This seems like it would be pretty easy to fix if a coproc closed the fds >> corresponding to an existing coproc in the child after the fork > > the forked coproc has to close its fds to/from _all_ other existing coprocs > (as there can be several). And there is the issue. Without multi-coproc support, the shell only keeps track of one coproc at a time, so there's only one set of pipe file descriptors to close. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage?
  2024-03-14  9:58 ` Carl Edquist
  2024-03-17 19:40 ` Zachary Santer
  2024-04-01 19:24 ` Chet Ramey
@ 2024-04-03 14:32 ` Chet Ramey
  2024-04-03 17:19 ` Zachary Santer
  2024-04-04 12:52 ` Carl Edquist
  2024-04-17 14:37 ` Chet Ramey
  3 siblings, 2 replies; 53+ messages in thread
From: Chet Ramey @ 2024-04-03 14:32 UTC (permalink / raw)
To: Carl Edquist, Zachary Santer; +Cc: chet.ramey, bug-bash, libc-alpha

On 3/14/24 5:58 AM, Carl Edquist wrote:
> HOWEVER!!!
>
> Unexpectedly, the new multi-coproc code seems to close the user shell's end
> of a coprocess's pipes, once the coprocess has terminated. When compiled
> with MULTIPLE_COPROCS=1, this is true even if there is only a single coproc:
>
> $ coproc WC { wc; }
> $ exec {WC[1]}>&-
> [1]+ Done coproc WC { wc; }
>
> # WC var gets cleared!!
> # shell's ${WC[0]} is also closed!
>
> # now, can't do:
>
> $ read -u ${WC[0]} X
> $ echo $X
>
> I'm attaching a "bad-coproc-log.txt" with more detailed ps & ls output
> examining the open fds at each step, to make it clear what's happening.

It's straightforward: the coproc process terminates, the shell reaps it,
marks it as dead, notifies the user that the process exited, and reaps it
before printing the next prompt. I don't observe any different behavior
between the default and when compiled for multiple coprocs.

It depends on when the process terminates as to whether you get a prompt
back and need to run an additional command before reaping the coproc
(macOS, RHEL), which gives you the opportunity to run the `read' command:

$ coproc WC { wc; }
[1] 48057
$ exec {WC[1]}>&-
$ read -u ${WC[0]} X
[1]+  Done                    coproc WC { wc; }
bash: DEBUG warning: cpl_reap: deleting 48057
$ echo $X
0 0 0

(I put in a trace statement to show exactly when the coproc gets reaped
and deallocated.)

I can't reproduce your results with non-interactive shells, either, with
job control enabled or disabled.

> This is a bug.
> The shell should not automatically close its read pipe to
> a coprocess that has terminated -- it should stay open to read the final
> output, and the user should be responsible for closing the read end
> explicitly.

How long should the shell defer deallocating the coproc after the process
terminates? What should it do to make sure that the variables don't hang
around with invalid file descriptors? Or should the user be responsible
for unsetting the array variable too? (That's never been a requirement,
obviously.)

> It also invites trouble if the shell variable that holds the fds gets
> removed unexpectedly when the coprocess terminates. (Suddenly the variable
> expands to an empty string.) It seems to me that the proper time to clear
> the coproc variable (if at all) is after the user has explicitly closed
> both of the fds.

That requires adding more plumbing than I want to, especially since the
user can always save the file descriptors of interest into another
variable if they want to use them after the coproc terminates.

> *Or* else add an option to the coproc keyword to
> explicitly close the coproc - which will close both fds and clear the
> variable.

Not going to add any more options to reserved words; that does more
violence to the grammar than I want.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet@case.edu    http://tiswww.cwru.edu/~chet/

^ permalink raw reply	[flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage?
  2024-04-03 14:32 ` Chet Ramey
@ 2024-04-03 17:19 ` Zachary Santer
  2024-04-08 15:07 ` Chet Ramey
  2024-04-04 12:52 ` Carl Edquist
  1 sibling, 1 reply; 53+ messages in thread
From: Zachary Santer @ 2024-04-03 17:19 UTC (permalink / raw)
To: chet.ramey; +Cc: Carl Edquist, bug-bash, libc-alpha

On Wed, Apr 3, 2024 at 10:32 AM Chet Ramey <chet.ramey@case.edu> wrote:
>
> How long should the shell defer deallocating the coproc after the process
> terminates? What should it do to make sure that the variables don't hang
> around with invalid file descriptors? Or should the user be responsible for
> unsetting the array variable too? (That's never been a requirement,
> obviously.)

For sake of comparison, and because I don't know the answer, what does
bash do behind the scenes in this situation?

exec {fd}< <( some command )
while IFS='' read -r line <&"${fd}"; do
  # do stuff
done
{fd}<&-

Because the command in the process substitution isn't waiting for input,
(I think) it could've exited at any point before all of its output has
been consumed. Even so, bash appears to handle this seamlessly. As the
programmer, I know ${fd} contains an fd that's no longer valid after
this point, despite it not being unset.

^ permalink raw reply	[flat|nested] 53+ messages in thread
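[Editorial aside: Zachary's loop, fleshed out into a runnable sketch. The generator command is made up, and the final close is written with exec — which, as Chet's reply notes, is needed for the close to take effect in the current shell.]

```shell
#!/bin/bash
exec {fd}< <(printf '%s\n' one two three)   # stand-in for "some command"

count=0
while IFS='' read -r line <&"${fd}"; do
    count=$(( count + 1 ))                  # "do stuff"
done

# The fd stays open (and readable through EOF) even after the process
# substitution's child has exited; the script closes it when done.
exec {fd}<&-
echo "$count"                               # -> 3
```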
* Re: Examples of concurrent coproc usage? 2024-04-03 17:19 ` Zachary Santer @ 2024-04-08 15:07 ` Chet Ramey 2024-04-09 3:44 ` Zachary Santer 0 siblings, 1 reply; 53+ messages in thread From: Chet Ramey @ 2024-04-08 15:07 UTC (permalink / raw) To: Zachary Santer; +Cc: chet.ramey, Carl Edquist, bug-bash, libc-alpha [-- Attachment #1.1: Type: text/plain, Size: 1649 bytes --] On 4/3/24 1:19 PM, Zachary Santer wrote: > On Wed, Apr 3, 2024 at 10:32 AM Chet Ramey <chet.ramey@case.edu> wrote: >> >> How long should the shell defer deallocating the coproc after the process >> terminates? What should it do to make sure that the variables don't hang >> around with invalid file descriptors? Or should the user be responsible for >> unsetting the array variable too? (That's never been a requirement, >> obviously.) > > For sake of comparison, and because I don't know the answer, what does > bash do behind the scenes in this situation? > > exec {fd}< <( some command ) > while IFS='' read -r line <&"${fd}"; do > # do stuff > done > {fd}<&- > > Because the command in the process substitution isn't waiting for > input, (I think) it could've exited at any point before all of its > output has been consumed. Even so, bash appears to handle this > seamlessly. Bash doesn't close the file descriptor in $fd. Since it's used with `exec', it's under the user's control. The script here explicitly opens and closes the file descriptor, so it can read until read returns failure. It doesn't really matter when the process exits or whether the shell closes its ends of the pipe -- the script has made a copy that it can use for its own purposes. (And you need to use exec to close it when you're done.) You can do the same thing with a coproc. The question is whether or not scripts should have to. 
-- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 53+ messages in thread
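[Editorial aside: doing "the same thing with a coproc" — making private copies of the coproc's fds with exec so they remain under the script's control even after the shell deallocates the coproc — might look like the following sketch; the variable names are invented.]

```shell
#!/bin/bash
coproc WC { wc -c; }

# Duplicate both ends into script-managed fds.  The copies survive even
# if the shell later reaps the coproc and closes WC[0]/WC[1] itself.
exec {rd}<&"${WC[0]}" {wr}>&"${WC[1]}"

printf 'hello' >&"${wr}"
exec {wr}>&- {WC[1]}>&-   # close the copy *and* the original: wc sees EOF

# Even if the coproc has already been reaped, its final output is still
# buffered in the pipe and readable through the private copy.
read -r -u "${rd}" nbytes
exec {rd}<&-
echo "$nbytes"            # -> 5
```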
* Re: Examples of concurrent coproc usage? 2024-04-08 15:07 ` Chet Ramey @ 2024-04-09 3:44 ` Zachary Santer 2024-04-13 18:45 ` Chet Ramey 0 siblings, 1 reply; 53+ messages in thread From: Zachary Santer @ 2024-04-09 3:44 UTC (permalink / raw) To: chet.ramey; +Cc: Carl Edquist, bug-bash, libc-alpha On Mon, Apr 8, 2024 at 11:07 AM Chet Ramey <chet.ramey@case.edu> wrote: > > Bash doesn't close the file descriptor in $fd. Since it's used with `exec', > it's under the user's control. > > The script here explicitly opens and closes the file descriptor, so it > can read until read returns failure. It doesn't really matter when the > process exits or whether the shell closes its ends of the pipe -- the > script has made a copy that it can use for its own purposes. > (And you need to use exec to close it when you're done.) Caught that shortly after sending the email. Yeah, I know. > You can do the same thing with a coproc. The question is whether or > not scripts should have to. If there's a way to exec fds to read from and write to the same background process without first using the coproc keyword or using FIFOs I'm all ears. To me, coproc fills that gap. I'd be fine with having to close the coproc fds in subshells myself. Heck, you still have to use exec to close at least the writing coproc fd in the parent process to get the coproc to exit, regardless. The fact that the current implementation allows the coproc fds to get into process substitutions is a little weird to me. A process substitution, in combination with exec, is kind of the one other way to communicate with background processes through fds without using FIFOs. I still have to close the coproc fds there myself, right now. Consider the following situation: I've got different kinds of background processes going on, and I've got fds exec'd from process substitutions, fds from coprocs, and fds exec'd from other things, and I need to keep them all out of the various background processes. 
Now I need different arrays of fds, so I can close all the fds that get into a background process forked with & without trying to close the coproc fds there; while still being able to close all the fds, including the coproc fds, in process substitutions. I'm curious what the reasoning was there. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-09 3:44 ` Zachary Santer @ 2024-04-13 18:45 ` Chet Ramey 2024-04-14 2:09 ` Zachary Santer 0 siblings, 1 reply; 53+ messages in thread From: Chet Ramey @ 2024-04-13 18:45 UTC (permalink / raw) To: Zachary Santer; +Cc: chet.ramey, Carl Edquist, bug-bash, libc-alpha [-- Attachment #1.1: Type: text/plain, Size: 1115 bytes --] On 4/8/24 11:44 PM, Zachary Santer wrote: > The fact that the current implementation allows the coproc fds to get > into process substitutions is a little weird to me. A process > substitution, in combination with exec, is kind of the one other way > to communicate with background processes through fds without using > FIFOs. I still have to close the coproc fds there myself, right now. So are you advocating for the shell to close coproc file descriptors when forking children for command substitutions, process substitutions, and subshells, in addition to additional coprocs? Right now, it closes coproc file descriptors when forking subshells. > > Consider the following situation: I've got different kinds of > background processes going on, and I've got fds exec'd from process > substitutions, fds from coprocs, If you have more than one coproc, you have to manage all this yourself already. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-13 18:45 ` Chet Ramey @ 2024-04-14 2:09 ` Zachary Santer 0 siblings, 0 replies; 53+ messages in thread From: Zachary Santer @ 2024-04-14 2:09 UTC (permalink / raw) To: chet.ramey; +Cc: Martin D Kealey, Carl Edquist, bug-bash, libc-alpha On Sat, Apr 13, 2024 at 2:45 PM Chet Ramey <chet.ramey@case.edu> wrote: > > On 4/8/24 11:44 PM, Zachary Santer wrote: > > > The fact that the current implementation allows the coproc fds to get > > into process substitutions is a little weird to me. A process > > substitution, in combination with exec, is kind of the one other way > > to communicate with background processes through fds without using > > FIFOs. I still have to close the coproc fds there myself, right now. > > So are you advocating for the shell to close coproc file descriptors > when forking children for command substitutions, process substitutions, > and subshells, in addition to additional coprocs? Right now, it closes > coproc file descriptors when forking subshells. Yes. I couldn't come up with a way that letting the coproc fds into command substitutions could cause a problem, in the same sense that letting them into regular ( ) subshells doesn't seem like a problem. That bit is at least good for my arbitrary standard of "consistency," though. At least in my use case, trying to use the coproc file descriptors directly in a pipeline forced the use of a process substitution, because I needed the coproc fds accessible in the second segment of what would've been a three-segment pipeline. (Obviously, I'm using 'shopt -s lastpipe' here.) I ultimately chose to do 'exec {fd}> >( command )' and redirect from one command within the second segment into ${fd} instead of ending the second segment with '> >( command ); wait "${?}"'. In the first case, you have all the same drawbacks as allowing the coproc fds into a subshell forked with &. 
In the second case, it's effectively the same as allowing the coproc fds into the segments of a pipeline that become subshells. I guess that would be a concern if the segment of the pipeline in the parent shell closes the fds to the coproc while the pipeline is still executing. That seems like an odd thing to do, but okay. Now that I've got my own fds that I'm managing myself, I've turned that bit of code into a plain, three-segment pipeline, at least for now. > > Consider the following situation: I've got different kinds of > > background processes going on, and I've got fds exec'd from process > > substitutions, fds from coprocs, > > If you have more than one coproc, you have to manage all this yourself > already. Not if we manage to convince you to turn MULTIPLE_COPROCS=1 on by default. Or if someone builds bash that way for themselves. On Sat, Apr 13, 2024 at 2:51 PM Chet Ramey <chet.ramey@case.edu> wrote: > > On 4/9/24 10:46 AM, Zachary Santer wrote: > > >> If you want two processes to communicate (really three), you might want > >> to build with the multiple coproc support and use the shell as the > >> arbiter. > > > > If you've written a script for other people than just yourself, > > expecting all of them to build their own bash install with a > > non-default preprocessor directive is pretty unreasonable. > > This all started because I wasn't comfortable with the amount of testing > the multiple coprocs code had undergone. If we can get more people to > test these features, there's a better chance of making it the default. > > > The part that I've been missing this whole time is that using exec > > with the fds provided by the coproc keyword is actually a complete > > solution for my use case, if I'm willing to close all the resultant > > fds myself in background processes where I don't want them to go. > > Which I am. > > Good deal. 
> > > Whether the coproc fds should be automatically kept out of most kinds > > of subshells, like it is now; or out of more kinds than currently; is > > kind of beside the point to me now. > > Sure, but it's the potential for deadlock that we're trying to reduce. I hesitate to say to just set MULTIPLE_COPROCS=1 free and wait for people to complain. I'm stunned at my luck in getting Carl Edquist's attention directed at this. Hopefully there are other people who aren't subscribed to this email list who are interested in using this functionality, if it becomes more fully implemented. > > But, having a builtin to ensure > > the same behavior is applied to any arbitrary fd might be useful to > > people, especially if those fds get removed from process substitutions > > as well. > > What does this mean? What kind of builtin? And what `same behavior'? Let's say it's called 'nosub', and takes fd arguments. It would make the shell take responsibility for keeping those fds out of subshells. Perhaps it could take a -u flag, to make it stop keeping the fd arguments out of subshells. That would be a potential way to get bash to quit closing coproc fds in subshells, as the user is declaring that s/he is now responsible for those fds. Still a separate matter from whether those fds get closed automatically in the parent shell at any given point. People who exec fds have to know to take responsibility for everywhere they go, right now. Exec'ing fds like this is bound to be more common than using coprocs, as is, I assume, exec'ing fds to process substitutions. If there are benefits to keeping coproc fds out of subshells like bash attempts to do, the same benefits would apply if bash offers to take the burden of keeping other, user-specified fds out of subshells. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-03 14:32 ` Chet Ramey 2024-04-03 17:19 ` Zachary Santer @ 2024-04-04 12:52 ` Carl Edquist 2024-04-04 23:23 ` Martin D Kealey 2024-04-08 16:21 ` Chet Ramey 1 sibling, 2 replies; 53+ messages in thread From: Carl Edquist @ 2024-04-04 12:52 UTC (permalink / raw) To: Chet Ramey; +Cc: Zachary Santer, bug-bash, libc-alpha Hi Chet, thanks for taking the time to review this :D [My apologies again upfront for another lengthy (comprehensive?) email.] On Wed, 3 Apr 2024, Chet Ramey wrote: > On 4/2/24 12:22 PM, Carl Edquist wrote: > >> the forked coproc has to close its fds to/from _all_ other existing >> coprocs (as there can be several). > > And there is the issue. Without multi-coproc support, the shell only > keeps track of one coproc at a time, so there's only one set of pipe > file descriptors to close. Right, exactly. The example with the default build (showing the essential case that causes deadlock) was to highlight that your multi-coproc support code apparently does indeed correctly track and close all these fds, and thus prevents the deadlock issue. On Wed, 3 Apr 2024, Chet Ramey wrote: > It's straightforward: the coproc process terminates, the shell reaps it, > marks it as dead, notifies the user that the process exited, and reaps > it before printing the next prompt. I don't observe any different > behavior between the default and when compiled for multiple coprocs. > > It depends on when the process terminates as to whether you get a prompt > back and need to run an additional command before reaping the coproc > (macOS, RHEL), which gives you the opportunity to run the `read' > command: Ah, my mistake then - thanks for explaining. I must have been thrown off by the timing, running it with and without an intervening interactive prompt before the read command. When run interactively, an extra 'Enter' (or not) before the read command changes the behavior. 
So in that case, this issue (that the shell closes its read-end of the
pipe from a reaped coproc, potentially before being able to read the
final output) was already there and is not specific to the multi-coproc
code.

But in any case, it seems like this is a race then?  That is, whether
the child process terminates before or after the prompt in question.

> $ coproc WC { wc; }
> [1] 48057
> $ exec {WC[1]}>&-
> $ read -u ${WC[0]} X
> [1]+ Done coproc WC { wc; }
> bash: DEBUG warning: cpl_reap: deleting 48057
> $ echo $X
> 0 0 0
>
> (I put in a trace statement to show exactly when the coproc gets reaped and
> deallocated.)

Thanks!  (for taking the time to play with this)

Though apparently it's still a race here.  If you diagram the shell and
coproc (child) processes, I think you'll see that your DEBUG statement
can also happen _before_ the read command, which would then fail.

You can contrive this by adding a small sleep (eg, 0.1s) at the end of
execute_builtin_or_function (in execute_cmd.c), just before it returns.
Eg:

diff --git a/execute_cmd.c b/execute_cmd.c
index ed1063e..c72f322 100644
--- a/execute_cmd.c
+++ b/execute_cmd.c
@@ -5535,6 +5535,7 @@ execute_builtin_or_function (words, builtin, var, redirects,
   discard_unwind_frame ("saved_fifos");
 #endif
 
+  usleep(100000);
   return (result);
 }

If I do this, I consistently see "read: X: invalid file descriptor
specification" running the above 4-line "coproc WC" example in a script,
demonstrating that there is no guarantee that the read command will
start before the WC coproc is reaped and {WC[0]} is closed, even though
it's the next statement after 'exec {WC[1]}>&-'.

But (as I'll try to show) you can trip up on this race even without
slowing down bash itself artificially.

> I can't reproduce your results with non-interactive shells, either, with
> job control enabled or disabled.
That's fair; let's try it with a script:

$ cat cope.sh
#!/bin/bash
coproc WC { wc; }
jobs
exec {WC[1]}>&-
[[ $1 ]] && sleep "$1"
jobs
read -u ${WC[0]} X
echo $X

Run without sleep, the wc output is seen:

$ ./cope.sh
[1]+ Running coproc WC { wc; } &
[1]+ Running coproc WC { wc; } &
0 0 0

Run with a brief sleep after closing the write end, and it breaks:

$ ./cope.sh .1
[1]+ Running coproc WC { wc; } &
[1]+ Done coproc WC { wc; }
./cope.sh: line 8: read: X: invalid file descriptor specification

And, if I run with "0" for a sleep time, it intermittently behaves like either of the above. Racy!

>> This is a bug. The shell should not automatically close its read pipe
>> to a coprocess that has terminated -- it should stay open to read the
>> final output, and the user should be responsible for closing the read
>> end explicitly.
>
> How long should the shell defer deallocating the coproc after the
> process terminates?

I only offer my opinion here, but it strikes me that it definitely should _not_ be based on an amount of _time_. That's inherently racy. In the above example, there is only a single line to read; but in the general case there may be many 'records' sitting in the pipe waiting to be processed, and processing each record may take an arbitrary amount of time. (Consider a coproc containing a sort command, for example, that produces all its output lines at once after it sees EOF, and then terminates.)

Zack illustrated basically the same point with his example:

> exec {fd}< <( some command )
> while IFS='' read -r line <&"${fd}"; do
>     # do stuff
> done
> {fd}<&-

A process-substitution open to the shell like this is effectively a one-ended coproc (though not in the jobs list), and it behaves reliably here because the user can count on {fd} to remain open even after the child process terminates. So, the user can determine when the coproc fds are no longer needed, whether that's when EOF is hit trying to read from the coproc, or whatever other condition.
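As a runnable illustration of this pattern (the sort pipeline here just stands in for 'some command' — like the sort coproc mentioned above, it emits all of its output only after seeing EOF):

```shell
#!/bin/bash
# The fd opened onto a process substitution stays valid after the child
# exits, so output that is produced only at EOF is never lost; the shell
# does not close {fd} behind our back the way it does for a reaped coproc.
exec {fd}< <(printf '%s\n' banana apple cherry | sort)

lines=()
while IFS='' read -r line <&"$fd"; do
  lines+=("$line")     # collect every line, including the final ones
done
exec {fd}<&-           # we decide when the fd is no longer needed

printf '%s\n' "${lines[@]}"
```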
Personally I like the idea of 'closing' a coproc explicitly, but if it's a bother to add options to the coproc keyword, then I would say just let the user be responsible for closing the fds. Once the coproc has terminated _and_ the coproc's fds are closed, then the coproc can be deallocated.

Apparently there is already some detection in there for when the coproc fds get closed, as the {NAME[@]} fd array members get set to -1 automatically when you do, eg, 'exec {NAME[0]}<&-'. So perhaps this won't be a radical change.

Alternatively (or, additionally), you could interpret 'unset NAME' for a coproc to mean "deallocate the coproc." That is, close the {NAME[@]} fds, unset the NAME variable, and remove any coproc bookkeeping for NAME. (Though if the coproc child process hasn't terminated on its own yet, still it shouldn't be killed, and perhaps it should remain in the jobs list as a background process until it's done.)

...

[And if you _really_ don't want to defer deallocating a coproc after it terminates, I suppose you can go ahead and deallocate it in terms of removing it from the jobs list and dropping any bookkeeping for it - as long as you leave the fds and fd variables intact for the user. It's a little dicey, but in theory it should not lead to deadlock, even if copies of these (now untracked) reaped-coproc pipe fds end up in other coprocs. Why not? Because (1) this coproc is dead and thus won't be trying to read anymore from its (now closed) end, and (2) reads from the shell's end will not block since there are no copies of the coproc's write end open anymore. Still, it'd be cleaner to defer deallocation, to avoid these stray (albeit harmless) copies of fds making their way into new coprocs.]

> What should it do to make sure that the variables don't hang around with
> invalid file descriptors?
First, just to be clear, the fds to/from the coproc pipes are not invalid when the coproc terminates (you can still read from them); they are only invalid after they are closed. The surprising bit is when they become invalid unexpectedly (from the point of view of the user) because the shell closes them automatically, at the somewhat arbitrary timing when the coproc is reaped.

Second, why is it a problem if the variables keep their (invalid) fds after closing them, if the user is the one that closed them anyway? Isn't this how it works with the auto-assigned fd redirections? Eg:

$ exec {d}<.
$ echo $d
10
$ exec {d}<&-
$ echo $d
10

But, as noted, bash apparently already ensures that the variables don't hang around with invalid file descriptors, as once you close them the corresponding variable gets updated to "-1".

> Or should the user be responsible for unsetting the array variable too?
> (That's never been a requirement, obviously.)

On the one hand, bash is already messing with the coproc array variables (setting the values to -1 when the user closes the fds), so it's not really a stretch in my mind for bash to unset the whole variable when the coproc is deallocated. On the other hand, as mentioned above, bash leaves automatically allocated fd variables intact after the user explicitly closes them. So I guess either way seems reasonable.

If the user has explicitly closed both fd ends for a coproc, it should not be a surprise to the user either way - whether the variable gets unset automatically, or whether it remains with (-1 -1). Since you are already unsetting the variable when the coproc is deallocated though, I'd say it's fine to keep doing that -- just don't deallocate the coproc before the user has closed both fds.

>> It also invites trouble if the shell variable that holds the fds gets
>> removed unexpectedly when the coprocess terminates. (Suddenly the
>> variable expands to an empty string.)
>> It seems to me that the proper
>> time to clear the coproc variable (if at all) is after the user has
>> explicitly closed both of the fds.
>
> That requires adding more plumbing than I want to,

Your project your call :D

> especially since the user can always save the file descriptors of
> interest into another variable if they want to use them after the coproc
> terminates.

*Except* that it's inherently a race condition whether the original variables will still be intact to save them. Even if you attempt to save them immediately:

coproc X { exit; }
X_BACKUP=( ${X[@]} )

it's not guaranteed that X_BACKUP=(...) will run before coproc X has been deallocated, and the X variable cleared.

No doubt this hasn't escaped you, but in any case you can see it for yourself if you introduce a small delay in execute_coproc, in the parent, just after the call to make_child:

diff --git a/execute_cmd.c b/execute_cmd.c
index ed1063e..5949e3e 100644
--- a/execute_cmd.c
+++ b/execute_cmd.c
@@ -2440,6 +2440,8 @@ execute_coproc (command, pipe_in, pipe_out, fds_to_close)
       exit (estat);
     }
+  else
+    usleep(100000);

   close (rpipe[1]);
   close (wpipe[0]);

When I try this, X_BACKUP is consistently empty. Though if you add, say, a call to "sleep 0.2" to coproc X, then X_BACKUP consistently gets a copy of X's fds in time.

I am sorry if this sounds contrived, but I hope it demonstrates that closing fds and unsetting the variable for a coproc automatically when it terminates is fundamentally flawed, because it depends on the arbitrary race timing between the two processes.

>> *Or* else add an option to the coproc keyword to explicitly close the
>> coproc - which will close both fds and clear the variable.
>
> Not going to add any more options to reserved words; that does more
> violence to the grammar than I want.

Not sure how you'd feel about using 'unset' on the coproc variable instead. (Though as discussed, I think the coproc terminated + fds manually closed condition is also sufficient.)
............. Anyway, as far as I'm concerned there's nothing urgent about all this, but (along with the multi-coproc support that you implemented), avoiding the current automatic deallocation behavior would seem to go a long way toward making coproc a correct and generally useful feature. Thanks for your time! Carl PS Zack you're welcome :) ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-04 12:52 ` Carl Edquist @ 2024-04-04 23:23 ` Martin D Kealey 2024-04-08 19:50 ` Chet Ramey 2024-04-08 16:21 ` Chet Ramey 1 sibling, 1 reply; 53+ messages in thread From: Martin D Kealey @ 2024-04-04 23:23 UTC (permalink / raw) To: Carl Edquist; +Cc: Chet Ramey, Zachary Santer, bug-bash, libc-alpha [-- Attachment #1: Type: text/plain, Size: 417 bytes --] I'm somewhat uneasy about having coprocs inaccessible to each other. I can foresee reasonable cases where I'd want a coproc to utilize one or more other coprocs. In particular, I can see cases where a coproc is written to by one process, and read from by another. Can we at least have the auto-close behaviour be made optional, so that it can be turned off when we want to do something more sophisticated? -Martin ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-04 23:23 ` Martin D Kealey @ 2024-04-08 19:50 ` Chet Ramey 2024-04-09 14:46 ` Zachary Santer 2024-04-09 15:58 ` Carl Edquist 0 siblings, 2 replies; 53+ messages in thread From: Chet Ramey @ 2024-04-08 19:50 UTC (permalink / raw) To: Martin D Kealey, Carl Edquist Cc: chet.ramey, Zachary Santer, bug-bash, libc-alpha [-- Attachment #1.1: Type: text/plain, Size: 824 bytes --] On 4/4/24 7:23 PM, Martin D Kealey wrote: > I'm somewhat uneasy about having coprocs inaccessible to each other. > I can foresee reasonable cases where I'd want a coproc to utilize one or > more other coprocs. That's not the intended purpose, so I don't think not fixing a bug to accommodate some future hypothetical use case is a good idea. That's why there's a warning message when you try to use more than one coproc -- the shell doesn't keep track of more than one. If you want two processes to communicate (really three), you might want to build with the multiple coproc support and use the shell as the arbiter. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-08 19:50 ` Chet Ramey @ 2024-04-09 14:46 ` Zachary Santer 2024-04-13 18:51 ` Chet Ramey 2024-04-09 15:58 ` Carl Edquist 1 sibling, 1 reply; 53+ messages in thread From: Zachary Santer @ 2024-04-09 14:46 UTC (permalink / raw) To: chet.ramey; +Cc: Martin D Kealey, Carl Edquist, bug-bash, libc-alpha On Mon, Apr 8, 2024 at 3:50 PM Chet Ramey <chet.ramey@case.edu> wrote: > > On 4/4/24 7:23 PM, Martin D Kealey wrote: > > I'm somewhat uneasy about having coprocs inaccessible to each other. > > I can foresee reasonable cases where I'd want a coproc to utilize one or > > more other coprocs. > > That's not the intended purpose, so I don't think not fixing a bug to > accommodate some future hypothetical use case is a good idea. That's > why there's a warning message when you try to use more than one coproc -- > the shell doesn't keep track of more than one. That use case is always going to be hypothetical if the support for it isn't really there, though, isn't it? > If you want two processes to communicate (really three), you might want > to build with the multiple coproc support and use the shell as the > arbiter. If you've written a script for other people than just yourself, expecting all of them to build their own bash install with a non-default preprocessor directive is pretty unreasonable. The part that I've been missing this whole time is that using exec with the fds provided by the coproc keyword is actually a complete solution for my use case, if I'm willing to close all the resultant fds myself in background processes where I don't want them to go. Which I am. 
$ coproc CAT1 { cat; }
[1] 1769
$ exec {CAT1_2[0]}<&"${CAT1[0]}" {CAT1_2[1]}>&"${CAT1[1]}" {CAT1[0]}<&- {CAT1[1]}>&-
$ declare -p CAT1 CAT1_2
declare -a CAT1=([0]="-1" [1]="-1")
declare -a CAT1_2=([0]="10" [1]="11")
$ coproc CAT2 { exec {CAT1_2[0]}<&- {CAT1_2[1]}>&-; cat; }
[2] 1771
$ exec {CAT2_2[0]}<&"${CAT2[0]}" {CAT2_2[1]}>&"${CAT2[1]}" {CAT2[0]}<&- {CAT2[1]}>&-
$ declare -p CAT2 CAT2_2
declare -a CAT2=([0]="-1" [1]="-1")
declare -a CAT2_2=([0]="12" [1]="13")
$ printf 'dog\ncat\nrabbit\ntortoise\n' >&"${CAT1_2[1]}"
$ IFS='' read -r -u "${CAT1_2[0]}" line; printf '%s\n' "${?}:${line}"
0:dog
$ exec {CAT1_2[1]}>&-
$ IFS='' read -r -u "${CAT1_2[0]}" line; printf '%s\n' "${?}:${line}"
0:cat
[1]- Done coproc CAT1 { cat; }
$ IFS='' read -r -u "${CAT1_2[0]}" line; printf '%s\n' "${?}:${line}"
0:rabbit
$ IFS='' read -r -u "${CAT1_2[0]}" line; printf '%s\n' "${?}:${line}"
0:tortoise
$ IFS='' read -r -u "${CAT1_2[0]}" line; printf '%s\n' "${?}:${line}"
1:
$ exec {CAT1_2[0]}<&- {CAT2_2[0]}<&- {CAT2_2[1]}>&-
$
[2]+ Done

No warning message when creating the CAT2 coproc.

I swear, I was so close to getting this figured out three years ago, unless the behavior when a coproc still exists only because other non-coproc fds are pointing to it has changed since whatever version of bash I was testing in at the time. I am completely satisfied with this solution.

The trial and error aspect to figuring this kind of stuff out is really frustrating. Maybe I'll take some time and write a Wooledge Wiki article on this at some point, if there isn't one already.

Whether the coproc fds should be automatically kept out of most kinds of subshells, like it is now; or out of more kinds than currently; is kind of beside the point to me now. But, having a builtin to ensure the same behavior is applied to any arbitrary fd might be useful to people, especially if those fds get removed from process substitutions as well.
If the code for coproc fds gets applied to these fds, then you've got more chances to see that the logic actually works correctly, if nothing else. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-09 14:46 ` Zachary Santer @ 2024-04-13 18:51 ` Chet Ramey 0 siblings, 0 replies; 53+ messages in thread From: Chet Ramey @ 2024-04-13 18:51 UTC (permalink / raw) To: Zachary Santer Cc: chet.ramey, Martin D Kealey, Carl Edquist, bug-bash, libc-alpha [-- Attachment #1.1: Type: text/plain, Size: 1649 bytes --] On 4/9/24 10:46 AM, Zachary Santer wrote: >> If you want two processes to communicate (really three), you might want >> to build with the multiple coproc support and use the shell as the >> arbiter. > > If you've written a script for other people than just yourself, > expecting all of them to build their own bash install with a > non-default preprocessor directive is pretty unreasonable. This all started because I wasn't comfortable with the amount of testing the multiple coprocs code had undergone. If we can get more people to test these features, there's a better chance of making it the default. > The part that I've been missing this whole time is that using exec > with the fds provided by the coproc keyword is actually a complete > solution for my use case, if I'm willing to close all the resultant > fds myself in background processes where I don't want them to go. > Which I am. Good deal. > Whether the coproc fds should be automatically kept out of most kinds > of subshells, like it is now; or out of more kinds than currently; is > kind of beside the point to me now. Sure, but it's the potential for deadlock that we're trying to reduce. > But, having a builtin to ensure > the same behavior is applied to any arbitrary fd might be useful to > people, especially if those fds get removed from process substitutions > as well. What does this mean? What kind of builtin? And what `same behavior'? 
-- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-08 19:50 ` Chet Ramey 2024-04-09 14:46 ` Zachary Santer @ 2024-04-09 15:58 ` Carl Edquist 2024-04-13 20:10 ` Chet Ramey 1 sibling, 1 reply; 53+ messages in thread From: Carl Edquist @ 2024-04-09 15:58 UTC (permalink / raw) To: Chet Ramey, Martin D Kealey; +Cc: Zachary Santer, bug-bash, libc-alpha On 4/4/24 7:23 PM, Martin D Kealey wrote: > I'm somewhat uneasy about having coprocs inaccessible to each other. I > can foresee reasonable cases where I'd want a coproc to utilize one or > more other coprocs. > > In particular, I can see cases where a coproc is written to by one > process, and read from by another. > > Can we at least have the auto-close behaviour be made optional, so that > it can be turned off when we want to do something more sophisticated? With support for multiple coprocs, auto-closing the fds to other coprocs when creating new ones is important in order to avoid deadlocks. But if you're willing to take on management of those coproc fds yourself, you can expose them to new coprocs by making your own copies with exec redirections. But this only "kind of" works, because for some reason bash seems to close all pipe fds for external commands in coprocs, even the ones that the user explicitly copies with exec redirections. (More on that in a bit.) On Mon, 8 Apr 2024, Chet Ramey wrote: > On 4/4/24 7:23 PM, Martin D Kealey wrote: >> I'm somewhat uneasy about having coprocs inaccessible to each other. I >> can foresee reasonable cases where I'd want a coproc to utilize one or >> more other coprocs. > > That's not the intended purpose, Just a bit of levity here - i can picture Doc from back to the future exclaiming, "Marty, it's perfect! You're just not thinking 4th dimensionally!" > so I don't think not fixing a bug to accommodate some future > hypothetical use case is a good idea. 
> That's why there's a warning
> message when you try to use more than one coproc -- the shell doesn't
> keep track of more than one.
>
> If you want two processes to communicate (really three), you might want
> to build with the multiple coproc support and use the shell as the
> arbiter.

For what it's worth, my experience is that coprocesses in bash (rigged up by means other than the coproc keyword) become very fun and interesting when you allow for the possibility of communication between coprocesses. (Most of my use cases for coprocesses fall under this category, actually.)

The most basic commands for tying multiple coprocesses together are tee(1) and paste(1), for writing to or reading from multiple coprocesses at once. You can do this already with process substitutions like

tee >(cmd1) >(cmd2)
paste <(cmd3) <(cmd4)

My claim here is that there are uses for this where these commands are all separate coprocesses; that is, you'd want to read the output from cmd1 and cmd2 separately, and provide input for cmd3 and cmd4 separately. (I'll try to send some examples in a later email.)

Nevertheless it's still crucial to keep the shell's existing coprocess fds out of new coprocesses, otherwise you easily run yourself into deadlock.

Now, if you built bash with multiple coproc support, I would have expected you could still rig this up, by doing the redirection work explicitly yourself. Something like this:

coproc UP { stdbuf -oL tr a-z A-Z; }
coproc DOWN { stdbuf -oL tr A-Z a-z; }

# make user-managed backup copies of coproc fds
exec {up_r}<&${UP[0]} {up_w}>&${UP[1]}
exec {down_r}<&${DOWN[0]} {down_w}>&${DOWN[1]}

coproc THREEWAY { tee /dev/fd/$up_w /dev/fd/$down_w; }

But the above doesn't actually work, as it seems that the coproc shell (THREEWAY) closes specifically all the pipe fds (beyond 0,1,2), even the user-managed ones explicitly copied with exec.
As a result, you get back errors like this:

tee: /dev/fd/11: No such file or directory
tee: /dev/fd/13: No such file or directory

That's the case even if you do something more explicit like:

coproc UP_AND_OUT { tee /dev/fd/99 99>&$up_w; }

the '99>&$up_w' redirection succeeds, showing that the coproc does have access to its backup fd $up_w (*), but apparently the shell closes fd 99 (as well as $up_w) before exec'ing the tee command.

Note the coproc shell only does this with pipes; it leaves other user-managed fds like files or directories alone. I have no idea why that's the case, and I wonder whether it's intentional or an oversight.

But anyway, I imagine that if one wants to use multi-coproc support (which requires automatically closing the shell's coproc fds for new coprocs), and wants to set up multiple coprocs to communicate amongst themselves, then the way to go would be explicit redirections. (But again, this requires fixing this peculiar behavior where the coproc shell closes even the user-managed copies of pipe fds before exec'ing external commands.)

(*) to prove that the coproc shell does have access to $up_w, we can make a shell-only replacement for tee(1):

(actually works)
fdtee () {
    local line fd
    while read -r line; do
        for fd; do printf '%s\n' "$line" >&$fd; done
    done
}

coproc UP { stdbuf -oL tr a-z A-Z; }
coproc DOWN { stdbuf -oL tr A-Z a-z; }

# make user-managed backup copies of coproc fds
exec {up_r}<&${UP[0]} {up_w}>&${UP[1]}
exec {down_r}<&${DOWN[0]} {down_w}>&${DOWN[1]}

stdout=1
coproc THREEWAY { fdtee $stdout $up_w $down_w; }

# save these too, for safe keeping
exec {tee_r}<&${THREEWAY[0]} {tee_w}>&${THREEWAY[1]}

Then: (actually works)

$ echo 'Greetings!' >&$tee_w
$ read -u $tee_r plain
$ read -u $up_r upped
$ read -u $down_r downed
$ echo "[$plain] [$upped] [$downed]"
[Greetings!] [GREETINGS!] [greetings!]

This is a pretty trivial example just to demonstrate the concept.
But once you have the freedom to play with it, you find more interesting, useful applications. Of course, for the above technique to be generally useful, external commands need access to these user-managed fds (copied with exec). (I have no idea why the coproc shell closes them.) The shell is crippled when limited to builtins. (I'll try to tidy up some working examples with my coprocess management library this week, for the curious.) Juicy thread hey? I can hardly keep up! :) Carl ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-09 15:58 ` Carl Edquist @ 2024-04-13 20:10 ` Chet Ramey 2024-04-14 18:43 ` Zachary Santer 2024-04-15 17:01 ` Carl Edquist 0 siblings, 2 replies; 53+ messages in thread From: Chet Ramey @ 2024-04-13 20:10 UTC (permalink / raw) To: Carl Edquist, Martin D Kealey Cc: chet.ramey, Zachary Santer, bug-bash, libc-alpha [-- Attachment #1.1: Type: text/plain, Size: 3386 bytes --] On 4/9/24 11:58 AM, Carl Edquist wrote: > On 4/4/24 7:23 PM, Martin D Kealey wrote: > >> I'm somewhat uneasy about having coprocs inaccessible to each other. I >> can foresee reasonable cases where I'd want a coproc to utilize one or >> more other coprocs. >> >> In particular, I can see cases where a coproc is written to by one >> process, and read from by another. >> >> Can we at least have the auto-close behaviour be made optional, so that >> it can be turned off when we want to do something more sophisticated? > > With support for multiple coprocs, auto-closing the fds to other coprocs > when creating new ones is important in order to avoid deadlocks. > > But if you're willing to take on management of those coproc fds yourself, > you can expose them to new coprocs by making your own copies with exec > redirections. > > But this only "kind of" works, because for some reason bash seems to close > all pipe fds for external commands in coprocs, even the ones that the user > explicitly copies with exec redirections. > > (More on that in a bit.) > > > On Mon, 8 Apr 2024, Chet Ramey wrote: > >> On 4/4/24 7:23 PM, Martin D Kealey wrote: >>> I'm somewhat uneasy about having coprocs inaccessible to each other. I >>> can foresee reasonable cases where I'd want a coproc to utilize one or >>> more other coprocs. >> >> That's not the intended purpose, The original intent was to allow the shell to drive a long-running process that ran more-or-less in parallel with it. Look at examples/scripts/bcalc for an example of that kind of use. 
> > For what it's worth, my experience is that coprocesses in bash (rigged up > by means other than the coproc keyword) become very fun and interesting > when you allow for the possibility of communication between coprocesses. > (Most of my use cases for coprocesses fall under this category, actually.) Sure, as long as you're willing to take on file descriptor management yourself. I just don't want to make it a new requirement, since it's never been one before. > Now, if you built bash with multiple coproc support, I would have expected > you could still rig this up, by doing the redirection work explicitly > yourself. Something like this: > > coproc UP { stdbuf -oL tr a-z A-Z; } > coproc DOWN { stdbuf -oL tr A-Z a-z; } > > # make user-managed backup copies of coproc fds > exec {up_r}<&${UP[0]} {up_w}>&${UP[1]} > exec {down_r}<&${DOWN[0]} {down_w}>&${DOWN[1]} > > coproc THREEWAY { tee /dev/fd/$up_w /dev/fd/$down_w; } > > > But the above doesn't actually work, as it seems that the coproc shell > (THREEWAY) closes specifically all the pipe fds (beyond 0,1,2), even the > user-managed ones explicitly copied with exec. File descriptors the user saves with exec redirections beyond [0-2] are set to close-on-exec. POSIX makes that behavior unspecified, but bash has always done it. Shells don't offer any standard way to modify the state of that flag, but there is the `fdflags' loadable builtin you can experiment with to change close-on-exec. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-13 20:10 ` Chet Ramey @ 2024-04-14 18:43 ` Zachary Santer 2024-04-15 18:55 ` Chet Ramey 0 siblings, 1 reply; 53+ messages in thread From: Zachary Santer @ 2024-04-14 18:43 UTC (permalink / raw) To: chet.ramey; +Cc: Carl Edquist, Martin D Kealey, bug-bash, libc-alpha

On Sat, Apr 13, 2024 at 4:10 PM Chet Ramey <chet.ramey@case.edu> wrote:
>
> The original intent was to allow the shell to drive a long-running process
> that ran more-or-less in parallel with it. Look at examples/scripts/bcalc
> for an example of that kind of use.

$ ./bcalc
equation: -12
./bcalc: line 94: history: -1: invalid option
history: usage: history [-c] [-d offset] [n] or history -anrw [filename] or history -ps arg [arg...]
-12
equation: exit

diff --git a/examples/scripts/bcalc b/examples/scripts/bcalc
index bc7e2b40..826eca4f 100644
--- a/examples/scripts/bcalc
+++ b/examples/scripts/bcalc
@@ -91,7 +91,7 @@ do
 	esac

 	# save to the history list
-	history -s "$EQN"
+	history -s -- "$EQN"

 	# run it through bc
 	calc "$EQN"

^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-14 18:43 ` Zachary Santer @ 2024-04-15 18:55 ` Chet Ramey 0 siblings, 0 replies; 53+ messages in thread From: Chet Ramey @ 2024-04-15 18:55 UTC (permalink / raw) To: Zachary Santer Cc: chet.ramey, Carl Edquist, Martin D Kealey, bug-bash, libc-alpha [-- Attachment #1.1: Type: text/plain, Size: 617 bytes --] On 4/14/24 2:43 PM, Zachary Santer wrote: > On Sat, Apr 13, 2024 at 4:10 PM Chet Ramey <chet.ramey@case.edu> wrote: >> >> The original intent was to allow the shell to drive a long-running process >> that ran more-or-less in parallel with it. Look at examples/scripts/bcalc >> for an example of that kind of use. > > $ ./bcalc > equation: -12 > ./bcalc: line 94: history: -1: invalid option Good catch, thanks. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-13 20:10 ` Chet Ramey 2024-04-14 18:43 ` Zachary Santer @ 2024-04-15 17:01 ` Carl Edquist 2024-04-17 14:20 ` Chet Ramey 1 sibling, 1 reply; 53+ messages in thread From: Carl Edquist @ 2024-04-15 17:01 UTC (permalink / raw) To: Chet Ramey; +Cc: Martin D Kealey, Zachary Santer, bug-bash, libc-alpha [-- Attachment #1: Type: text/plain, Size: 5088 bytes --] On Sat, 13 Apr 2024, Chet Ramey wrote: > The original intent was to allow the shell to drive a long-running > process that ran more-or-less in parallel with it. Look at > examples/scripts/bcalc for an example of that kind of use. Thanks for mentioning this example. As you understand, this model use case does not require closing the coproc fds when finished, because they will be closed implicitly when the shell exits. (As bcalc itself admits.) And if the coproc is left open for the lifetime of the shell, the alternate behavior of deferring the coproc deallocation (until both coproc fds are closed) would not require anything extra from the user. The bcalc example does close both coproc fds though - both at the end, and whenever it resets. And so in this example (which as you say, was the original intent), the user is already explicitly closing both coproc fds explicitly; so the alternate deferring behavior would not require anything extra from the user here either. ... Yet another point brought to light by the bcalc example relates to the coproc pid variable. The reset() function first closes the coproc pipe fds, then sleeps for a second to give the BC coproc some time to finish. An alternative might be to 'wait' for the coproc to finish (likely faster than sleeping for a second). But you have to make and use your $coproc_pid copy rather than $BC_PID directly, because 'wait $BC_PID' may happen before or after the coproc is reaped and BC_PID is unset. (As the bcalc author seems to understand.) 
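A minimal sketch of that copy-then-wait pattern (the arithmetic coproc is an illustrative stand-in for a bc-style calculator):

```shell
#!/bin/bash
# Copy NAME_PID right away: the shell may unset it as soon as the
# coproc is reaped, so only a saved copy is safe to pass to `wait'.
coproc CALC { while read -r expr; do echo "$(( expr ))"; done; }
calc_pid=$CALC_PID                 # copy immediately, before any race

printf '2**10\n' >&"${CALC[1]}"    # send one request
read -r answer <&"${CALC[0]}"      # read its reply

exec {CALC[1]}>&-                  # EOF makes the coproc's loop exit
wait "$calc_pid"                   # uses our copy, not $CALC_PID
echo "$answer"
```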
So in general the coproc *_PID variable only seems usable for making a copy when starting the coproc. The info page has the following: > The process ID of the shell spawned to execute the coprocess is > available as the value of the variable 'NAME_PID'. The 'wait' builtin > command may be used to wait for the coprocess to terminate. But it seems to me that the copy is necessary, and it is never reliable to run 'wait $NAME_PID'. Because any time the shell is in a position to wait for the coproc to finish, by that time it's going to be a race whether or not NAME_PID is still set. So this is another example for me of why it would be handy if coproc deallocation were deferred until explicit user action (closing both coproc fds, or unsetting the coproc variable). That way ${NAME[@]} and $NAME_PID could reliably be used directly without having to make copies. Anyway, just food for thought if down the line you make a shell option for coproc deallocation behavior. >> Now, if you built bash with multiple coproc support, I would have >> expected you could still rig this up, by doing the redirection work >> explicitly yourself. Something like this: >> >> coproc UP { stdbuf -oL tr a-z A-Z; } >> coproc DOWN { stdbuf -oL tr A-Z a-z; } >> >> # make user-managed backup copies of coproc fds >> exec {up_r}<&${UP[0]} {up_w}>&${UP[1]} >> exec {down_r}<&${DOWN[0]} {down_w}>&${DOWN[1]} >> >> coproc THREEWAY { tee /dev/fd/$up_w /dev/fd/$down_w; } >> >> >> But the above doesn't actually work, as it seems that the coproc shell >> (THREEWAY) closes specifically all the pipe fds (beyond 0,1,2), even >> the user-managed ones explicitly copied with exec. > > File descriptors the user saves with exec redirections beyond [0-2] are > set to close-on-exec. POSIX makes that behavior unspecified, but bash > has always done it. Ah, ok, thanks. I believe I found where this gets set in do_redirection_internal() in redir.c. (Whew, a big function.) 
As far as I can tell the close-on-exec state is "duplicated" rather than set unconditionally. That is, the new fd in a redirection is only set close-on-exec if the source is. (Good, because in general I rely on redirections to be made available to external commands.) But apparently coproc marks its pipe fds close-on-exec, so there's no way to expose manual copies of these fds to external commands. So, that explains the behavior I was seeing ... It's just a bit too bad for anyone that actually wants to do more elaborate coproc interconnections with manual redirections, as they're limited to shell builtins. ... I might pose a question to ponder about this though: With the multi-coproc support code, is it still necessary to set the coproc pipe fds close-on-exec? (If, perhaps, they're already getting explicitly closed in the right places.) Because if the new coproc fds are _not_ set close-on-exec, in general that would allow the user to do manual redirections for external commands (eg tee(1) or paste(1)) to communicate with multiple coproc fds together. > Shells don't offer any standard way to modify the state of that flag, > but there is the `fdflags' loadable builtin you can experiment with to > change close-on-exec. Thanks for the tip. It's nice to know there is a workaround to leave copies of the coproc fds open across exec; though for now I will probably continue setting up pipes in the shell by methods other than the coproc keyword. Cheers, Carl ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-15 17:01 ` Carl Edquist @ 2024-04-17 14:20 ` Chet Ramey 2024-04-20 22:04 ` Carl Edquist 0 siblings, 1 reply; 53+ messages in thread From: Chet Ramey @ 2024-04-17 14:20 UTC (permalink / raw) To: Carl Edquist Cc: chet.ramey, Martin D Kealey, Zachary Santer, bug-bash, libc-alpha On 4/15/24 1:01 PM, Carl Edquist wrote: > On Sat, 13 Apr 2024, Chet Ramey wrote: > >> The original intent was to allow the shell to drive a long-running >> process that ran more-or-less in parallel with it. Look at >> examples/scripts/bcalc for an example of that kind of use. > > Thanks for mentioning this example. As you understand, this model use case > does not require closing the coproc fds when finished, because they will be > closed implicitly when the shell exits. (As bcalc itself admits.) > > And if the coproc is left open for the lifetime of the shell, the alternate > behavior of deferring the coproc deallocation (until both coproc fds are > closed) would not require anything extra from the user. > > The bcalc example does close both coproc fds though - both at the end, and > whenever it resets. And so in this example (which as you say, was the > original intent), the user is already explicitly closing both coproc fds > explicitly; so the alternate deferring behavior would not require anything > extra from the user here either. > > ... > > Yet another point brought to light by the bcalc example relates to the > coproc pid variable. The reset() function first closes the coproc pipe > fds, then sleeps for a second to give the BC coproc some time to finish. > > An alternative might be to 'wait' for the coproc to finish (likely faster > than sleeping for a second). If the coproc has some problem and doesn't exit immediately, `wait' without options will hang. That's why I opted for the sleep/kill-as-insurance combo. 
(And before you ask why I didn't use `wait -n', I wrote bcalc in 30 minutes after someone asked me a question about doing floating point math with awk in a shell script, and it worked.) -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-17 14:20 ` Chet Ramey @ 2024-04-20 22:04 ` Carl Edquist 2024-04-22 16:06 ` Chet Ramey 0 siblings, 1 reply; 53+ messages in thread From: Carl Edquist @ 2024-04-20 22:04 UTC (permalink / raw) To: Chet Ramey; +Cc: Zachary Santer, bug-bash, libc-alpha On Wed, 17 Apr 2024, Chet Ramey wrote: > On 4/15/24 1:01 PM, Carl Edquist wrote: >> >> Yet another point brought to light by the bcalc example relates to the >> coproc pid variable. The reset() function first closes the coproc >> pipe fds, then sleeps for a second to give the BC coproc some time to >> finish. >> >> An alternative might be to 'wait' for the coproc to finish (likely >> faster than sleeping for a second). > > If the coproc has some problem and doesn't exit immediately, `wait' > without options will hang. That's why I opted for the > sleep/kill-as-insurance combo. Yes that much was clear from the script itself. I didn't mean any of that as a critique of the bcalc script. I just meant it brought to light the point that the coproc pid variable is another thing in the current deallocate-on-terminate behavior, that needs to be copied before it can be used reliably. (With the 'kill' or 'wait' builtins.) Though I do suspect that the most common case with coprocs is that closing the shell's read and write fds to the coproc is enough to cause the coproc to finish promptly - as neither read attempts on its stdin nor write attempts on its stdout can block anymore. I think this is _definitely_ true for the BC coproc in the bcalc example. But it's kind of a distraction to get hung up on that detail, because in the general case there may very well be other scenarios where it would be appropriate to, um, _nudge_ the coproc a bit with the kill command. > (And before you ask why I didn't use `wait -n', I wrote bcalc in 30 > minutes after someone asked me a question about doing floating point > math with awk in a shell script, and it worked.) It's fine! 
It's just an example, after all :) Carl ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-20 22:04 ` Carl Edquist @ 2024-04-22 16:06 ` Chet Ramey 2024-04-27 16:56 ` Carl Edquist 0 siblings, 1 reply; 53+ messages in thread From: Chet Ramey @ 2024-04-22 16:06 UTC (permalink / raw) To: Carl Edquist; +Cc: chet.ramey, Zachary Santer, bug-bash, libc-alpha [-- Attachment #1.1: Type: text/plain, Size: 818 bytes --] On 4/20/24 6:04 PM, Carl Edquist wrote: > I think this is _definitely_ true for the BC coproc in the bcalc example. > But it's kind of a distraction to get hung up on that detail, because in > the general case there may very well be other scenarios where it would be > appropriate to, um, _nudge_ the coproc a bit with the kill command. You might be surprised. The OP was sending thousands of calculations to (I think) GNU bc, which had some resource consumption issue that resulted in it eventually hanging, unresponsive. The kill was the solution there. I imagine there are similar scenarios with other tools. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-22 16:06 ` Chet Ramey @ 2024-04-27 16:56 ` Carl Edquist 2024-04-28 17:50 ` Chet Ramey 0 siblings, 1 reply; 53+ messages in thread From: Carl Edquist @ 2024-04-27 16:56 UTC (permalink / raw) To: Chet Ramey; +Cc: Zachary Santer, bug-bash, libc-alpha On Mon, 22 Apr 2024, Chet Ramey wrote: > You might be surprised. The OP was sending thousands of calculations to > (I think) GNU bc, which had some resource consumption issue that > resulted in it eventually hanging, unresponsive. The kill was the > solution there. I imagine there are similar scenarios with other tools. Ok, you got me! I take it back. I hadn't considered bc operations being cpu/memory intensive. But that possibility makes sense - given that it's arbitrary precision I guess you can ask for a number to the billionth power and never see the end of it :) Carl ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-27 16:56 ` Carl Edquist @ 2024-04-28 17:50 ` Chet Ramey 0 siblings, 0 replies; 53+ messages in thread From: Chet Ramey @ 2024-04-28 17:50 UTC (permalink / raw) To: Carl Edquist; +Cc: chet.ramey, Zachary Santer, bug-bash, libc-alpha [-- Attachment #1.1: Type: text/plain, Size: 1025 bytes --] On 4/27/24 12:56 PM, Carl Edquist wrote: > > On Mon, 22 Apr 2024, Chet Ramey wrote: > >> You might be surprised. The OP was sending thousands of calculations to >> (I think) GNU bc, which had some resource consumption issue that resulted >> in it eventually hanging, unresponsive. The kill was the solution there. >> I imagine there are similar scenarios with other tools. > > Ok, you got me! I take it back. > > I hadn't considered bc operations being cpu/memory intensive. But that > possibility makes sense - given that it's arbitrary precision I guess you > can ask for a number to the billionth power and never see the end of it :) I'm not sure it was that so much as the long-running nature of the coproc. A resource leak might never be noticeable except in this (admittedly uncommon) scenario. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-04 12:52 ` Carl Edquist 2024-04-04 23:23 ` Martin D Kealey @ 2024-04-08 16:21 ` Chet Ramey 2024-04-12 16:49 ` Carl Edquist 1 sibling, 1 reply; 53+ messages in thread From: Chet Ramey @ 2024-04-08 16:21 UTC (permalink / raw) To: Carl Edquist; +Cc: chet.ramey, Zachary Santer, bug-bash, libc-alpha [-- Attachment #1.1: Type: text/plain, Size: 7022 bytes --] On 4/4/24 8:52 AM, Carl Edquist wrote: > Zack illustrated basically the same point with his example: > >> exec {fd}< <( some command ) >> while IFS='' read -r line <&"${fd}"; do >> # do stuff >> done >> {fd}<&- > > A process-substitution open to the shell like this is effectively a > one-ended coproc (though not in the jobs list), and it behaves reliably > here because the user can count on {fd} to remain open even after the child > process terminates. That exposes the fundamental difference. The procsub is essentially the same kind of object as a coproc, but it exposes the pipe endpoint(s) as filenames. The shell maintains open file descriptors to the child process whose input or output it exposes as a FIFO or a file in /dev/fd, since you have to have a reader and a writer. The shell closes the file descriptor and, if necessary, removes the FIFO when the command for which that was one of the word expansions (or a redirection) completes. coprocs are designed to be longer-lived, and not associated with a particular command or redirection. But the important piece is that $fd is not the file descriptor the shell keeps open to the procsub -- it's a new file descriptor, dup'd from the original by the redirection. Since it was used with `exec', it persists until the script explicitly closes it. It doesn't matter when the shell reaps the procsub and closes the file descriptor(s) -- the copy in $fd remains until the script explicitly closes it. You might get read returning failure at some point, but the shell won't close $fd for you. 
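(A minimal sketch of that point - the procsub child here exits almost immediately, but $fd, being a dup made by the exec redirection, remains readable until the script closes it:)

```shell
# The child writing the pipe terminates right away; $fd persists regardless.
exec {fd}< <( printf '%s\n' one two )
sleep 0.2              # plenty of time for the procsub child to terminate
read -r a <&"$fd"
read -r b <&"$fd"
echo "$a $b"           # one two
exec {fd}<&-           # the script, not the shell's reaping, closes it
```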
Since procsubs expand to filenames, even opening them is sufficient to give you a new file descriptor (with the usual caveats about how different OSs handle the /dev/fd device). You can do this yourself with coprocs right now, with no changes to the shell. > So, the user can determine when the coproc fds are no longer needed, > whether that's when EOF is hit trying to read from the coproc, or whatever > other condition. Duplicating the file descriptor will do that for you. > Personally I like the idea of 'closing' a coproc explicitly, but if it's a > bother to add options to the coproc keyword, then I would say just let the > user be responsible for closing the fds. Once the coproc has terminated > _and_ the coproc's fds are closed, then the coproc can be deallocated. This is not backwards compatible. coprocs may be a little-used feature, but you're adding a burden on the shell programmer that wasn't there previously. > Apparently there is already some detection in there for when the coproc fds > get closed, as the {NAME[@]} fd array members get set to -1 automatically > when when you do, eg, 'exec {NAME[0]}<&-'. So perhaps this won't be a > radical change. Yes, there is some limited checking in the redirection code, since the shell is supposed to manage the coproc file descriptors for the user. > > Alternatively (or, additionally), you could interpret 'unset NAME' for a > coproc to mean "deallocate the coproc." That is, close the {NAME[@]} fds, > unset the NAME variable, and remove any coproc bookkeeping for NAME. Hmmm. That's not unreasonable. >> What should it do to make sure that the variables don't hang around with >> invalid file descriptors? > > First, just to be clear, the fds to/from the coproc pipes are not invalid > when the coproc terminates (you can still read from them); they are only > invalid after they are closed. That's only sort of true; writing to a pipe for which there is no reader generates SIGPIPE, which is a fatal signal. 
If the coproc terminates, the file descriptor to write to it becomes invalid because it's implicitly closed. If you restrict yourself to reading from coprocs, or doing one initial write and then only reading from there on, you can avoid this, but it's not the general case. > The surprising bit is when they become invalid unexpectedly (from the point > of view of the user) because the shell closes them automatically, at the > somewhat arbitrary timing when the coproc is reaped. No real difference from procsubs. > Second, why is it a problem if the variables keep their (invalid) fds after > closing them, if the user is the one that closed them anyway? > > Isn't this how it works with the auto-assigned fd redirections? Those are different file descriptors. > > $ exec {d}<. > $ echo $d > 10 > $ exec {d}<&- > $ echo $d > 10 The shell doesn't try to manage that object in the same way it does a coproc. The user has explicitly indicated they want to manage it. > But, as noted, bash apparently already ensures that the variables don't > hang around with invalid file descriptors, as once you close them the > corresponding variable gets updated to "-1". Yes, the shell trying to be helpful. It's a managed object. > If the user has explicitly closed both fd ends for a coproc, it should not > be a surprise to the user either way - whether the variable gets unset > automatically, or whether it remains with (-1 -1). > > Since you are already unsetting the variable when the coproc is deallocated > though, I'd say it's fine to keep doing that -- just don't deallocate the > coproc before the user has closed both fds. It's just not backwards compatible. I might add an option to enable that kind of management, but probably not for bash-5.3. > *Except* that it's inherently a race condition whether the original > variables will still be intact to save them. 
> > Even if you attempt to save them immediately: > > coproc X { exit; } > X_BACKUP=( ${X[@]} ) > > it's not guaranteed that X_BACKUP=(...) will run before coproc X has been > deallocated, and the X variable cleared. That's not what I mean about saving the file descriptors. But there is a window there where a short-lived coprocess could be reaped before you dup the file descriptors. Since the original intent of the feature was that coprocs were a way to communicate with long-lived processes -- something more persistent than a process substitution -- it was not really a concern at the time. >>> *Or* else add an option to the coproc keyword to explicitly close the >>> coproc - which will close both fds and clear the variable. >> >> Not going to add any more options to reserved words; that does more >> violence to the grammar than I want. > > Not sure how you'd feel about using 'unset' on the coproc variable > instead. (Though as discussed, I think the coproc terminated + fds > manually closed condition is also sufficient.) That does sound promising. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-08 16:21 ` Chet Ramey @ 2024-04-12 16:49 ` Carl Edquist 2024-04-16 15:48 ` Chet Ramey 0 siblings, 1 reply; 53+ messages in thread From: Carl Edquist @ 2024-04-12 16:49 UTC (permalink / raw) To: Chet Ramey; +Cc: Zachary Santer, bug-bash, libc-alpha [-- Attachment #1: Type: text/plain, Size: 12139 bytes --] On Mon, 8 Apr 2024, Chet Ramey wrote: > On 4/4/24 8:52 AM, Carl Edquist wrote: > >> Zack illustrated basically the same point with his example: >> >>> exec {fd}< <( some command ) >>> while IFS='' read -r line <&"${fd}"; do >>> # do stuff >>> done >>> {fd}<&- >> >> A process-substitution open to the shell like this is effectively a >> one-ended coproc (though not in the jobs list), and it behaves reliably >> here because the user can count on {fd} to remain open even after the >> child process terminates. > > That exposes the fundamental difference. The procsub is essentially the > same kind of object as a coproc, but it exposes the pipe endpoint(s) as > filenames. The shell maintains open file descriptors to the child > process whose input or output it exposes as a FIFO or a file in /dev/fd, > since you have to have a reader and a writer. The shell closes the file > descriptor and, if necessary, removes the FIFO when the command for > which that was one of the word expansions (or a redirection) completes. > coprocs are designed to be longer-lived, and not associated with a > particular command or redirection. > > But the important piece is that $fd is not the file descriptor the shell > keeps open to the procsub -- it's a new file descriptor, dup'd from the > original by the redirection. Since it was used with `exec', it persists > until the script explicitly closes it. It doesn't matter when the shell > reaps the procsub and closes the file descriptor(s) -- the copy in $fd > remains until the script explicitly closes it. You might get read > returning failure at some point, but the shell won't close $fd for you. 
> > Since procsubs expand to filenames, even opening them is sufficient to > give you a new file descriptor (with the usual caveats about how > different OSs handle the /dev/fd device). > > You can do this yourself with coprocs right now, with no changes to the > shell. > > >> So, the user can determine when the coproc fds are no longer needed, >> whether that's when EOF is hit trying to read from the coproc, or >> whatever other condition. > > Duplicating the file descriptor will do that for you. Thanks for the explanation, that all makes sense. One technical difference in my mind is that doing this with a procsub is reliably safe: exec {fd}< <( some command ) since the expanded pathname (/dev/fd/N or the fifo alternative) will stay around for the duration of the exec command, so there is no concern about whether or not the dup redirection will succeed. Where with a coproc coproc X { potentially short lived command with output; } exec {xr}<&${X[0]} {xw}>&${X[1]} there is technically the possibility that the coproc can finish and be reaped before the exec command gets a chance to run and duplicate the fds. But, I also get what you said, that your design intent with coprocs was for them to be longer-lived, so immediate termination was not a concern. >> Personally I like the idea of 'closing' a coproc explicitly, but if >> it's a bother to add options to the coproc keyword, then I would say >> just let the user be responsible for closing the fds. Once the coproc >> has terminated _and_ the coproc's fds are closed, then the coproc can >> be deallocated. > > This is not backwards compatible. coprocs may be a little-used feature, > but you're adding a burden on the shell programmer that wasn't there > previously. Ok, so, I'm trying to imagine a case where this would cause any problems or extra work for such an existing user. Maybe you can provide an example from your own uses? 
(Where it would cause trouble or require adding code if the coproc deallocation were deferred until the fds are closed explicitly.) My first thought is that in the general case, the user doesn't really need to worry much about closing the fds for a terminated coproc anyway, as they will all be closed implicitly when the shell exits (either an interactive session or a script). [This is a common model for using coprocs, by the way, where an auxiliary coprocess is left open for the lifetime of the shell session and never explicitly closed. When the shell session exits, the fds are closed implicitly by the OS, and the coprocess sees EOF and exits on its own.] If a user expects the coproc variable to go away automatically, that user won't be accessing a still-open fd from that variable for anything. As for the forgotten-about half-closed pipe fds to the reaped coproc, I don't see how they could lead to deadlock, nor do I see how a shell programmer expecting the existing behavior would even attempt to access them at all, apart from programming error. The only potential issue I can imagine is if a script (or a user at an interactive prompt) would start _so_ many of these longer-lived coprocs (more than 500??), one at a time in succession, in a single shell session, that all the available fds would be exhausted. (That is, if the shell is not closing them automatically upon coproc termination.) Is that the backwards compatibility concern? Because otherwise it seems like stray fds for terminated coprocs would be benign. ... Meanwhile, the bash man page does not specify the shell's behavior for when a coproc terminates, so you might say there's room for interpretation and the new deferring behavior would not break any promises. 
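(A concrete sketch of that copy-the-fds idiom, with sort(1) standing in for an arbitrary coproc - the user-managed copies remain readable even after bash reaps the coproc and closes its own fds:)

```shell
coproc SRT { sort; }
exec {r}<&"${SRT[0]}" {w}>&"${SRT[1]}"   # user-managed copies
printf '%s\n' banana apple >&"$w"
exec {w}>&- {SRT[1]}>&-    # close both write ends: sort sees EOF
mapfile -t out <&"$r"      # read the final output via the copy; this works
                           # even after bash reaps SRT and closes ${SRT[@]}
exec {r}<&-
printf '%s\n' "${out[@]}"
```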
And as it strikes me anyway, the real "burden" on the programmer with the existing behavior is having to make a copy of the coproc fds every time coproc X { cmd; } exec {xr}<&${X[0]} {xw}>&${X[1]} and use the copies instead of the originals in order to reliably read the final output from the coproc. ... Though I can hear Obi-Wan Kenobi gently saying to Luke, "You must do what you feel is right, of course." >>> What should it do to make sure that the variables don't hang around >>> with invalid file descriptors? >> >> First, just to be clear, the fds to/from the coproc pipes are not >> invalid when the coproc terminates (you can still read from them); they >> are only invalid after they are closed. > > That's only sort of true; writing to a pipe for which there is no reader > generates SIGPIPE, which is a fatal signal. Eh, when I talk about an fd being "invalid" here I mean "fd is not a valid file descriptor" (to use the language for EBADF from the man page for various system calls like read(2), write(2), close(2)). That's why I say the fds only become invalid after they are closed. And of course the primary use I care about is reading the final output from a completed coproc. (Which is generally after explicitly closing the write end.) The shell's read fd is still open, and can be read - it'll either return data, or return EOF, but that's not an error and not invalid. But since you mention it, writing to a broken pipe is still semantically meaningful also. (I would even say valid.) In the typical case it's expected behavior for a process to get killed when it attempts this and shell pipeline programming is designed with this in mind. 
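(For instance, the usual pipeline case - once the reader exits, the writer is killed by SIGPIPE and reports status 141, i.e. 128 + 13:)

```shell
# head exits after two lines; yes is then killed by SIGPIPE on its next write.
yes | head -n 2 > /dev/null
status=( "${PIPESTATUS[@]}" )
echo "yes: ${status[0]}, head: ${status[1]}"   # yes: 141, head: 0
```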
But when you try to write to a terminated coproc when you have the shell automatically closing its write end, you get an unpredictable situation: - If the write happens after the coproc terminates but before the shell reaps it (and closes the fds), then you will generate a SIGPIPE, which by default gracefully kills the shell (as is normal for programs in a pipeline). - On the other hand, if the write happens after the shell reaps it and closes the fds, you will get a bad (invalid) file descriptor error message, without killing the shell. So even for write attempts, you introduce uncertain behavior by automatically closing the fds, when the normal, predictable, valid thing would be to die by SIGPIPE. (That's my take anyway.) > If the coproc terminates, the file descriptor to write to it becomes > invalid because it's implicitly closed. Yes, but the distinction I was making is that they do not become invalid when or because the coproc terminates, they become invalid when and because the shell closes them. (I'm saying that if the shell did not close them automatically, they would remain valid.) >> The surprising bit is when they become invalid unexpectedly (from the >> point of view of the user) because the shell closes them >> automatically, at the somewhat arbitrary timing when the coproc is >> reaped. > > No real difference from procsubs. I think I disagree? The difference is that the replacement string for a procsub (/dev/fd/N or a fifo path) remains valid for the command in question. (Right?) So the command in question can count on that path being valid. And if a procsub is used in an exec redirection, in order to extend its use for future commands (and the redirection is guaranteed to work, since it is guaranteed to be valid for that exec command), then the newly opened pipe fd will not be subject to automatic closing either. As far as I can tell there is no arbitrary timing for when the shell closes the fds for procsubs. 
As far as I can tell, it closes them when the command in question completes, and that's the end of the story. (There's no waiting for the timing of the background procsub process to complete.) >> Second, why is it a problem if the variables keep their (invalid) fds >> after closing them, if the user is the one that closed them anyway? >> >> Isn't this how it works with the auto-assigned fd redirections? > > Those are different file descriptors. > >> >> $ exec {d}<. >> $ echo $d >> 10 >> $ exec {d}<&- >> $ echo $d >> 10 > > The shell doesn't try to manage that object in the same way it does a > coproc. The user has explicitly indicated they want to manage it. Ok - your intention makes sense then. My reasoning was that auto-allocated redirection fds ( {x}>file or {x}>&$N ) are a way of asking the shell to automatically place fds in a variable for you to manage - and I imagined 'coproc X {...}' the same way. >> If the user has explicitly closed both fd ends for a coproc, it should >> not be a surprise to the user either way - whether the variable gets >> unset automatically, or whether it remains with (-1 -1). >> >> Since you are already unsetting the variable when the coproc is >> deallocated though, I'd say it's fine to keep doing that -- just don't >> deallocate the coproc before the user has closed both fds. > > It's just not backwards compatible. I might add an option to enable > that kind of management, but probably not for bash-5.3. Ah, nice idea. No hurry on my end - but yeah if you imagine the alternate behavior is somehow going to cause problems for existing uses (eg, the fd exhaustion mentioned earlier) then yeah a shell option for the deallocation behavior would at least be a way for users to get reliable behavior without the burden of duping the fds manually every time. > But there is a window there where a short-lived coprocess could be > reaped before you dup the file descriptors. 
Since the original intent of > the feature was that coprocs were a way to communicate with long-lived > processes -- something more persistent than a process substitution -- it > was not really a concern at the time. Makes sense. For me, working with coprocesses is largely a more flexible way of setting up interesting pipelines - which is where the shell excels. Once a 'pipework' is set up (I'm making up this word now to distinguish from a simple pipeline), the shell does not have to be in the middle shoveling data around - the external commands can do that on their own. So in my mind, thinking about the "lifetime" of a coproc is often not so different from thinking about the lifetime of a regular pipeline, once you set up the plumbing for your commands. The timing of individual parts of a pipeline finishing shouldn't really matter, as long as the pipes serve their purpose to deliver output from one part to the next. Thanks for your time, and happy Friday :) Carl ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-12 16:49 ` Carl Edquist @ 2024-04-16 15:48 ` Chet Ramey 2024-04-20 23:11 ` Carl Edquist 0 siblings, 1 reply; 53+ messages in thread From: Chet Ramey @ 2024-04-16 15:48 UTC (permalink / raw) To: Carl Edquist; +Cc: chet.ramey, Zachary Santer, bug-bash, libc-alpha On 4/12/24 12:49 PM, Carl Edquist wrote: > Where with a coproc > > coproc X { potentially short lived command with output; } > exec {xr}<&${X[0]} {xw}>&${X[1]} > > there is technically the possibility that the coproc can finish and be > reaped before the exec command gets a chance to run and duplicate the fds. > > But, I also get what you said, that your design intent with coprocs was for > them to be longer-lived, so immediate termination was not a concern. The bigger concern was how to synchronize between the processes, but that's something that the script writer has to do on their own. >>> Personally I like the idea of 'closing' a coproc explicitly, but if it's >>> a bother to add options to the coproc keyword, then I would say just let >>> the user be responsible for closing the fds. Once the coproc has >>> terminated _and_ the coproc's fds are closed, then the coproc can be >>> deallocated. >> >> This is not backwards compatible. coprocs may be a little-used feature, >> but you're adding a burden on the shell programmer that wasn't there >> previously. > > Ok, so, I'm trying to imagine a case where this would cause any problems or > extra work for such an existing user. Maybe you can provide an example > from your own uses? (Where it would cause trouble or require adding code > if the coproc deallocation were deferred until the fds are closed explicitly.) My concern was always coproc fds leaking into other processes, especially pipelines. If someone has a coproc now and is `messy' about cleaning it up, I feel like there's the possibility of deadlock. But I don't know how extensively they're used, or all the use cases, so I'm not sure how likely it is. 
I've learned there are users who do things with shell features I never imagined. (People wanting to use coprocs without the shell as the arbiter, for instance. :-) ) > My first thought is that in the general case, the user doesn't really need > to worry much about closing the fds for a terminated coproc anyway, as they > will all be closed implicitly when the shell exits (either an interactive > session or a script). Yes. > > [This is a common model for using coprocs, by the way, where an auxiliary > coprocess is left open for the lifetime of the shell session and never > explicitly closed. When the shell session exits, the fds are closed > implicitly by the OS, and the coprocess sees EOF and exits on its own.] That's one common model, yes. Another is that the shell process explicitly sends a close or shutdown command to the coproc, so termination is expected. > If a user expects the coproc variable to go away automatically, that user > won't be accessing a still-open fd from that variable for anything. I'm more concerned about a pipe with unread data that would potentially cause problems. I suppose we just need more testing. > As for the forgotten-about half-closed pipe fds to the reaped coproc, I > don't see how they could lead to deadlock, nor do I see how a shell > programmer expecting the existing behavior would even attempt to access > them at all, apart from programming error. Probably not. > > The only potential issue I can imagine is if a script (or a user at an > interactive prompt) would start _so_ many of these longer-lived coprocs > (more than 500??), one at a time in succession, in a single shell session, > that all the available fds would be exhausted. (That is, if the shell is > not closing them automatically upon coproc termination.) Is that the > backwards compatibility concern? That's more of a "my arm hurts when I do this" situation. If a script opened 500 fds using exec redirection, resource exhaustion would be their own responsibility. 
> Meanwhile, the bash man page does not specify the shell's behavior for when > a coproc terminates, so you might say there's room for interpretation and > the new deferring behavior would not break any promises. I could always enable it in the devel branch and see what happens with the folks who use that. It would be three years after any release when distros would put it into production anyway. > > And as it strikes me anyway, the real "burden" on the programmer with the > existing behavior is having to make a copy of the coproc fds every time > > coproc X { cmd; } > exec {xr}<&${X[0]} {xw}>&${X[1]} > > and use the copies instead of the originals in order to reliably read the > final output from the coproc. Maybe, though it's easy enough to wrap that in a shell function. >>> First, just to be clear, the fds to/from the coproc pipes are not >>> invalid when the coproc terminates (you can still read from them); they >>> are only invalid after they are closed. >> >> That's only sort of true; writing to a pipe for which there is no reader >> generates SIGPIPE, which is a fatal signal. > > Eh, when I talk about an fd being "invalid" here I mean "fd is not a valid > file descriptor" (to use the language for EBADF from the man page for > various system calls like read(2), write(2), close(2)). That's why I say > the fds only become invalid after they are closed. > > And of course the primary use I care about is reading the final output from > a completed coproc. (Which is generally after explicitly closing the write > end.) The shell's read fd is still open, and can be read - it'll either > return data, or return EOF, but that's not an error and not invalid. > > But since you mention it, writing to a broken pipe is still semantically > meaningful also. (I would even say valid.) In the typical case it's > expected behavior for a process to get killed when it attempts this and > shell pipeline programming is designed with this in mind. 
You'd be surprised at how often I get requests to put in an internal SIGPIPE handler to avoid problems/shell termination with builtins writing to closed pipes. > So even for write attempts, you introduce uncertain behavior by > automatically closing the fds, when the normal, predictable, valid thing > would be to die by SIGPIPE. Again, you might be surprised at how many people view that as a bug in the shell. >> If the coproc terminates, the file descriptor to write to it becomes >> invalid because it's implicitly closed. > > Yes, but the distinction I was making is that they do not become invalid > when or because the coproc terminates, they become invalid when and because > the shell closes them. (I'm saying that if the shell did not close them > automatically, they would remain valid.) > > >>> The surprising bit is when they become invalid unexpectedly (from the >>> point of view of the user) because the shell closes them >>> automatically, at the somewhat arbitrary timing when the coproc is >>> reaped. >> >> No real difference from procsubs. > > I think I disagree? The difference is that the replacement string for a > procsub (/dev/fd/N or a fifo path) remains valid for the command in > question. (Right?) Using your definition of valid, I believe so, yes. Avoiding SIGPIPE depends on how the OS handles opens on /dev/fd/N: an internal dup or a handle to the same fd. In the latter case, I think the file descriptor obtained when opening /dev/fd/N would become `invalid' at the same time the process terminates. I think we're talking about our different interpretations of `invalid' (EBADF as opposed to EPIPE/SIGPIPE). > So the command in question can count on that path > being valid. And if a procsub is used in an exec redirection, in order to > extend its use for future commands (and the redirection is guaranteed to > work, since it is guaranteed to be valid for that exec command), then the > newly opened pipe fd will not be subject to automatic closing either. 
Correct. > > As far as I can tell there is no arbitrary timing for when the shell closes > the fds for procsubs. As far as I can tell, it closes them when the > command in question completes, and that's the end of the story. (There's no > waiting for the timing of the background procsub process to complete.) Right. There are reasonably well-defined rules for when redirections associated with commands are disposed, and exec redirections to procsubs just follow from those. The shell closes file descriptors (and potentially unlinks the FIFO) when it reaps the process substitution, but it takes some care not to do that prematurely, and the user isn't using those fds. > > >>> Second, why is it a problem if the variables keep their (invalid) fds >>> after closing them, if the user is the one that closed them anyway? >>> >>> Isn't this how it works with the auto-assigned fd redirections? >> >> Those are different file descriptors. >> >>> >>> $ exec {d}<. >>> $ echo $d >>> 10 >>> $ exec {d}<&- >>> $ echo $d >>> 10 >> >> The shell doesn't try to manage that object in the same way it does a >> coproc. The user has explicitly indicated they want to manage it. > > Ok - your intention makes sense then. My reasoning was that auto-allocated > redirection fds ( {x}>file or {x}>&$N ) are a way of asking the shell to > automatically place fds in a variable for you to manage - and I imagined > 'coproc X {...}' the same way. The philosophy is the same as if you picked the file descriptor number yourself and assigned it to the variable -- the shell just does some of the bookkeeping for you so you don't have to worry about the file descriptor resource limit. You still have to manage file descriptor $x the same way you would if you had picked file descriptor 15 (for example). >> But there is a window there where a short-lived coprocess could be reaped >> before you dup the file descriptors. 
Since the original intent of the >> feature was that coprocs were a way to communicate with long-lived >> processes -- something more persistent than a process substitution -- it >> was not really a concern at the time. > > Makes sense. For me, working with coprocesses is largely a more flexible > way of setting up interesting pipelines - which is where the shell excels. > > Once a 'pipework' is set up (I'm making up this word now to distinguish > from a simple pipeline), the shell does not have to be in the middle > shoveling data around - the external commands can do that on their own. My original intention for the coprocs (and Korn's from whence they came) was that the shell would be in the middle -- it's another way for the shell to do IPC. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-16 15:48 ` Chet Ramey @ 2024-04-20 23:11 ` Carl Edquist 2024-04-22 16:12 ` Chet Ramey 0 siblings, 1 reply; 53+ messages in thread
From: Carl Edquist @ 2024-04-20 23:11 UTC (permalink / raw)
To: Chet Ramey; +Cc: Zachary Santer, bug-bash, libc-alpha

On Tue, 16 Apr 2024, Chet Ramey wrote:

> The bigger concern was how to synchronize between the processes, but
> that's something that the script writer has to do on their own.

Right. It can be tricky and depends entirely on what the user's up to.

> My concern was always coproc fds leaking into other processes,
> especially pipelines. If someone has a coproc now and is `messy' about
> cleaning it up, I feel like there's the possibility of deadlock.

I get where you're coming from with the concern. I would welcome being
shown otherwise, but as far as I can tell, deadlock is a ghost of a
concern once the coproc is dead. Maybe it helps to step through it ...

- First, where does deadlock start? (In the context of pipes) I think
  the answer is: When there is a read or write attempted on a pipe that
  blocks (indefinitely).

- What causes a read or a write on a pipe to block? A pipe read blocks
  when a corresponding write-end is open, but there is no data available
  to read. A pipe write blocks when a corresponding read-end is open,
  but the pipe is full.

- Are the coproc's corresponding ends of the shell's pipe fds open?
  Well, not if the coproc is really dead.

- Will a read or write ever be attempted? If the shell's stray coproc
  fds are left open, sure they will leak into pipelines too - but since
  they're forgotten, in theory no command will actually attempt to use
  them.

- What if a command attempts to use these stray fds anyway, by mistake?
  If the coproc is really dead, then its side of the pipe fds will have
  been closed.
Thus read/write attempts on the fds on the shell's side (either from the
shell itself, or from commands / pipelines that the fds leaked into)
WILL NOT BLOCK, and thus will not result in deadlock. (A read attempt
will hit EOF, a write attempt will get SIGPIPE/EPIPE.)

HOPEFULLY that is enough to put any reasonable fears of deadlock to bed
- at least in terms of the shell's leaked fds leading to deadlock.

- But what if the _coproc_ leaked its pipe fds before it died?

At this point I think perhaps we get into what you called a "my arm
hurts when I do this" situation. It kind of breaks the whole coproc
model: if the stdin/stdout of a coproc are still open by one of the
coproc's children, then I might say the coproc is not really dead.

But anyway I want to be a good sport, for completeness. An existing use
case that would lead to trouble would perhaps have to look something
like this:

The shell sends a quit command to a coproc, without closing the shell's
coproc fds. The coproc has a child, then exits. The coproc (parent) is
dead. The coproc's child has inherited the coproc's pipe fds.

The script author _expects_ that the coproc parent will exit, and
expects that this will trigger the old behavior, that the shell will
automatically close its fds to the coproc parent. Thus the author
_expects_ that the coproc exiting will, indirectly but automatically,
cause any blocked reads/writes on stdin/stdout in the coproc's child to
stop blocking. Thus the author _expects_ the coproc's child to promptly
complete, even though its output _will not be consumable_ (because the
author _expects_ that its stdout will be attached to a broken pipe).
But [here's where the potential problem starts] with the new deferring behavior, the shell's coproc fds are not automatically closed, and thus the coproc's _child_ does not stop blocking, and thus the author's short-lived expectations for this coproc's useless child are dashed to the ground, while that child is left standing idle until the cows come home. (That is, until the shell exits.) It really seems like a contrived and senseless scenario, doesn't it? (Even to me!) [And an even more far-fetched scenario: a coproc transmits copies of its pipe fds to another process over a unix socket ancillary message (SCM_RIGHTS), instead of to a child by inheritance. The rest of the story is the same, and equally senseless.] > But I don't know how extensively they're used, or all the use cases, so > I'm not sure how likely it is. I've learned there are users who do > things with shell features I never imagined. (People wanting to use > coprocs without the shell as the arbiter, for instance. :-) ) Hehe... Well, yeah, once you gift-wrap yourself a friendly, reliable interface and have the freedom to play with it to your heart's content - you find some fun things to do with coprocesses. (Much like regular shell pipelines.) I get your meaning though - without knowing all the potential uses, it's hard to say with absolute certainty that no user will be negatively affected by a new improvement or bug fix. >> [This is a common model for using coprocs, by the way, where an >> auxiliary coprocess is left open for the lifetime of the shell session >> and never explicitly closed. When the shell session exits, the fds are >> closed implicitly by the OS, and the coprocess sees EOF and exits on >> its own.] > > That's one common model, yes. Another is that the shell process > explicitly sends a close or shutdown command to the coproc, so > termination is expected. 
Right, but here also (after sending a quit command) the conclusion is the same as my point just below - that if the user is expecting the coproc to terminate, and expecting the current behavior that as a result the coproc variable will go away automatically, then that variable is as good as forgotten to the user. >> If a user expects the coproc variable to go away automatically, that >> user won't be accessing a still-open fd from that variable for >> anything. > > I'm more concerned about a pipe with unread data that would potentially > cause problems. I suppose we just need more testing. If I understand you right, you are talking about a scenario like this: - a coproc writes to its output pipe - the coproc terminates - the shell leaves its fd for the read end of this pipe open - there is unread data left sitting in this pipe - [theoretical concern here] Is that right? I can't imagine this possibly leading to deadlock. Either (1) the user has forgotten about this pipe, and never attempts to read from it, or (2) the user attempts to read from this pipe, returning some or all of the data, and possibly hitting EOF, but in any case DOES NOT BLOCK. (I'm sorry if this is basically restating what I've already said earlier.) > That's more of a "my arm hurts when I do this" situation. If a script > opened 500 fds using exec redirection, resource exhaustion would be > their own responsibility. Ha, good! [I had a small fear that fd exhaustion might have been your actual concern.] >> Meanwhile, the bash man page does not specify the shell's behavior for >> when a coproc terminates, so you might say there's room for >> interpretation and the new deferring behavior would not break any >> promises. > > I could always enable it in the devel branch and see what happens with > the folks who use that. It would be three years after any release when > distros would put it into production anyway. 
Oh, fun :)

>> But since you mention it, writing to a broken pipe is still
>> semantically meaningful also. (I would even say valid.) In the
>> typical case it's expected behavior for a process to get killed when it
>> attempts this and shell pipeline programming is designed with this in
>> mind.
>
> You'd be surprised at how often I get requests to put in an internal
> SIGPIPE handler to avoid problems/shell termination with builtins
> writing to closed pipes.

Ah, well, I get it though. It _is_ a bit jarring to see your shell get
blown away with something like this -

    $ exec 9> >(typo)
    $ ...
    $ echo >&9   # Boom!

So it does not surprise me that you have some users puzzling over it.
But FWIW I do think it is the most consistent & correct behavior. Plus,
of course, the user can install their own shell handler code for that
case, or downgrade the effect to a non-fatal error with

    $ trap '' SIGPIPE

>> So even for write attempts, you introduce uncertain behavior by
>> automatically closing the fds, when the normal, predictable, valid
>> thing would be to die by SIGPIPE.
>
> Again, you might be surprised at how many people view that as a bug in
> the shell.

I'm not terribly surprised, since at first (before reasoning about it)
the behavior is admittedly alarming. ("What happened to my
terminal?!?!") But I'd argue the alternative is worse, because then it's
an unpredictable race between SIGPIPE (which they're complaining about)
and EBADF.

> I think we're talking about our different interpretations of `invalid'
> (EBADF as opposed to EPIPE/SIGPIPE).

Right - just explaining; I think by now we are on the same page.

> My original intention for the coprocs (and Korn's from whence they came)
> was that the shell would be in the middle -- it's another way for the
> shell to do IPC.

And coprocesses are great for this, too! It's just that external
commands in a sense are extensions of the shell. The arms and legs, you
might say, for doing the heavy lifting.
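The difference the trap makes can be sketched like this (illustrative only; a child bash is used for each case so the SIGPIPE death doesn't take out the current shell, and the sleep merely gives the throwaway process substitution time to exit):

```shell
#!/usr/bin/env bash
# Sketch: a write to a broken pipe kills a non-interactive shell via
# SIGPIPE by default; with SIGPIPE ignored it becomes a non-fatal EPIPE
# write error and the shell carries on.
no_trap=$(bash -c 'exec 9> >(exit); sleep 0.3; echo hi >&9; echo survived' 2>/dev/null)
with_trap=$(bash -c 'trap "" PIPE; exec 9> >(exit); sleep 0.3; echo hi >&9; echo survived' 2>/dev/null)
echo "without trap: [$no_trap]"   # the shell died before reaching 'echo survived'
echo "with trap:    [$with_trap]" # prints: with trap:    [survived]
```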
Carl ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-20 23:11 ` Carl Edquist @ 2024-04-22 16:12 ` Chet Ramey 0 siblings, 0 replies; 53+ messages in thread From: Chet Ramey @ 2024-04-22 16:12 UTC (permalink / raw) To: Carl Edquist; +Cc: chet.ramey, Zachary Santer, bug-bash, libc-alpha [-- Attachment #1.1: Type: text/plain, Size: 521 bytes --] On 4/20/24 7:11 PM, Carl Edquist wrote: >> I could always enable it in the devel branch and see what happens with >> the folks who use that. It would be three years after any release when >> distros would put it into production anyway. > > Oh, fun :) I'll enable it in the next devel branch push, so within a week. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-03-14 9:58 ` Carl Edquist ` (2 preceding siblings ...) 2024-04-03 14:32 ` Chet Ramey @ 2024-04-17 14:37 ` Chet Ramey 2024-04-20 22:04 ` Carl Edquist 3 siblings, 1 reply; 53+ messages in thread From: Chet Ramey @ 2024-04-17 14:37 UTC (permalink / raw) To: Carl Edquist, Zachary Santer; +Cc: chet.ramey, bug-bash, libc-alpha On 3/14/24 5:58 AM, Carl Edquist wrote: > Separately, I consider the following coproc behavior to be weird, fragile, > and broken. Yes, I agree that coprocs should survive being suspended. The most recent devel branch push has code to prevent the coproc being reaped if it's stopped and not terminated. Chet -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Examples of concurrent coproc usage? 2024-04-17 14:37 ` Chet Ramey @ 2024-04-20 22:04 ` Carl Edquist 0 siblings, 0 replies; 53+ messages in thread From: Carl Edquist @ 2024-04-20 22:04 UTC (permalink / raw) To: Chet Ramey; +Cc: Zachary Santer, bug-bash, libc-alpha On Wed, 17 Apr 2024, Chet Ramey wrote: > Yes, I agree that coprocs should survive being suspended. The most > recent devel branch push has code to prevent the coproc being reaped if > it's stopped and not terminated. Oh, nice! :) Carl ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RFE: enable buffering on null-terminated data 2024-03-11 11:54 ` Carl Edquist 2024-03-11 15:12 ` Examples of concurrent coproc usage? Zachary Santer @ 2024-03-12 3:34 ` Zachary Santer 2024-03-14 14:15 ` Carl Edquist 1 sibling, 1 reply; 53+ messages in thread
From: Zachary Santer @ 2024-03-12 3:34 UTC (permalink / raw)
To: Carl Edquist; +Cc: libc-alpha, coreutils, p

[-- Attachment #1: Type: text/plain, Size: 3163 bytes --]

On Mon, Mar 11, 2024 at 7:54 AM Carl Edquist <edquist@cs.wisc.edu> wrote:
>
> (In my coprocess management library, I effectively run every coproc with
> --output=L by default, by eval'ing the output of 'env -i stdbuf -oL env',
> because most of the time for a coprocess, that's what's wanted/necessary.)

Surrounded by 'set -a' and 'set +a', I guess? Now that's interesting.

I just added that to a script I have that prints lines output by another
command that it runs, generally a build script, to the command line, but
updating the same line over and over again. I want to see if it updates
more continuously like that.

> ... Although, for your example coprocess use, where the shell both
> produces the input for the coproc and consumes its output, you might be
> able to simplify things by making the producer and consumer separate
> processes. Then you could do a simpler 'producer | filter | consumer'
> without having to worry about buffering at all. But if the producer and
> consumer need to be in the same process (eg they share state and are
> logically interdependent), then yeah that's where you need a coprocess for
> the filter.

Yeah, there's really no way to break what I'm doing into a standard
pipeline.

> (Although given your time output, you might say the performance hit for
> unbuffered is not that huge.)

We see a somewhat bigger difference, at least proportionally, if we get
bash more or less out of the way. See command-buffering, attached.
Standard:

real    0m0.202s
user    0m0.280s
sys     0m0.076s

Line-buffered:

real    0m0.497s
user    0m0.374s
sys     0m0.545s

Unbuffered:

real    0m0.648s
user    0m0.544s
sys     0m0.702s

In coproc-buffering, unbuffered output was 21.7% slower than
line-buffered output, whereas here it's 30.4% slower. Of course, using
line-buffered or unbuffered output in this situation makes no sense.
Where it might be useful in a pipeline is when an earlier command in a
pipeline might only print things occasionally, and you want those things
transformed and printed to the command line immediately.

> So ... again in theory I also feel like a null-terminated buffering mode
> for stdbuf(1) (and setbuf(3)) is kind of a missing feature.

My assumption is that line-buffering through setbuf(3) was implemented
for printing to the command line, so its availability to stdbuf(1) is
just a useful side effect. In the BUGS section in the man page for
stdbuf(1), we see:

    On GLIBC platforms, specifying a buffer size, i.e., using fully
    buffered mode will result in undefined operation.

If I'm not mistaken, then buffer modes other than 0 and L don't actually
work. Maybe I should count my blessings here. I don't know what's going
on in the background that would explain glibc not supporting any of
that, or stdbuf(1) implementing features that aren't supported on the
vast majority of systems where it will be installed.

> It may just
> be that nobody has actually had a real need for it. (Yet?)

I imagine if anybody has, they just set --output=0 and moved on. Bash
scripts aren't the fastest thing in the world, anyway.
[-- Attachment #2: command-buffering --]
[-- Type: application/octet-stream, Size: 887 bytes --]

#!/usr/bin/env bash

set -o nounset -o noglob +o braceexpand
shopt -s lastpipe

export LC_ALL='C.UTF-8'

tab_spaces=8
sed_expr='s/[[:blank:]]+$//'
test=$' \tLine with tabs\t why?\t '
repeat="${1}"

for (( i = 0; i < repeat; i++ )); do
  printf '%s\n' "${test}"
done > tab-input.txt

printf '%s' "Standard:"
time {
  sed --binary --regexp-extended --expression="${sed_expr}" < tab-input.txt |
    expand --tabs="${tab_spaces}" > /dev/null
}

printf '%s' "Line-buffered:"
time {
  stdbuf --output=L -- \
    sed --binary --regexp-extended --expression="${sed_expr}" < tab-input.txt |
    stdbuf --output=L -- \
      expand --tabs="${tab_spaces}" > /dev/null
}

printf '%s' "Unbuffered:"
time {
  stdbuf --output=0 -- \
    sed --binary --regexp-extended --expression="${sed_expr}" < tab-input.txt |
    stdbuf --output=0 -- \
      expand --tabs="${tab_spaces}" > /dev/null
}

^ permalink raw reply	[flat|nested] 53+ messages in thread
* Re: RFE: enable buffering on null-terminated data 2024-03-12 3:34 ` RFE: enable buffering on null-terminated data Zachary Santer @ 2024-03-14 14:15 ` Carl Edquist 2024-03-18 0:12 ` Zachary Santer 0 siblings, 1 reply; 53+ messages in thread
From: Carl Edquist @ 2024-03-14 14:15 UTC (permalink / raw)
To: Zachary Santer; +Cc: libc-alpha, coreutils, p

[-- Attachment #1: Type: text/plain, Size: 7508 bytes --]

On Mon, 11 Mar 2024, Zachary Santer wrote:

> On Mon, Mar 11, 2024 at 7:54 AM Carl Edquist <edquist@cs.wisc.edu>
> wrote:
>>
>> (In my coprocess management library, I effectively run every coproc
>> with --output=L by default, by eval'ing the output of 'env -i stdbuf
>> -oL env', because most of the time for a coprocess, that's what's
>> wanted/necessary.)
>
> Surrounded by 'set -a' and 'set +a', I guess? Now that's interesting.

Ah, no - I use the 'VAR=VAL command line' syntax so that it's specific
to the command (it's not left exported to the shell). Effectively the
coprocess commands are run with

    LD_PRELOAD=... _STDBUF_O=L command line

This allows running shell functions for the command line, which will all
get the desired stdbuf behavior. Because you can't pass a shell function
(within the context of the current shell) as the command to stdbuf.

As far as I can tell, the stdbuf tool sets LD_PRELOAD (to point to
libstdbuf.so) and your custom buffering options in _STDBUF_{I,O,E}, in
the environment for the program it runs.

The double-env thing there is just a way to cleanly get exactly the env
vars that stdbuf sets. The values don't change, but since they are an
implementation detail of stdbuf, it's a bit more portable to grab the
values this way rather than hard code them. This is done only once per
shell session to extract the values, and save them to a private
variable, and then they are used for the command line as shown above.

Of course, if "command line" starts with "stdbuf --output=0" or
whatever, that will override the new line-buffered default.
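For illustration, the extraction step described above might look like this (a sketch only; the variable name is made up, and it assumes a coreutils stdbuf on the PATH):

```shell
#!/usr/bin/env bash
# Sketch of the double-env trick: capture exactly the env vars that
# stdbuf(1) exports (LD_PRELOAD and _STDBUF_O), once per session,
# without hard-coding their values. 'stdbuf_vars' is an invented name.
mapfile -t stdbuf_vars < <(env -i stdbuf -oL env)
printf '%s\n' "${stdbuf_vars[@]}"
```

The captured VAR=VAL strings can then be eval'ed into prefix assignments on a command line, which - unlike invoking stdbuf directly - also takes effect for shell functions run as the command.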
You can definitely export it to your shell though, either with 'set -a'
like you said, or with the export command. After that everything you run
should get line-buffered stdio by default.

> I just added that to a script I have that prints lines output by another
> command that it runs, generally a build script, to the command line, but
> updating the same line over and over again. I want to see if it updates
> more continuously like that.

So, a lot of times build scripts run a bunch of individual commands.
Each of those commands has an implied flush when it terminates, so you
will get the output from each of them promptly (as each command
completes), even without using stdbuf.

Where things get sloppy is if you add some stuff in a pipeline after
your build script, which results in things getting block-buffered along
the way:

    $ ./build.sh | sed s/what/ever/ | tee build.log

And there you will definitely see a difference.

    sloppy () {
        for x in {1..10}; do sleep .2; echo $x; done | sed s/^/:::/ | cat
    }

    {
    echo before:
    sloppy
    echo
    export $(env -i stdbuf -oL env)
    echo after:
    sloppy
    }

> Yeah, there's really no way to break what I'm doing into a standard
> pipeline.

I admit I'm curious what you're up to :)

> Of course, using line-buffered or unbuffered output in this situation
> makes no sense. Where it might be useful in a pipeline is when an
> earlier command in a pipeline might only print things occasionally, and
> you want those things transformed and printed to the command line
> immediately.

Right ... And in that case, losing the performance benefit of a larger
block buffer is a smaller price to pay.

> My assumption is that line-buffering through setbuf(3) was implemented
> for printing to the command line, so its availability to stdbuf(1) is
> just a useful side effect.

Right, stdbuf(1) leverages setbuf(3). setbuf(3) tweaks the buffering
behavior of stdio streams (stdin, stdout, stderr, and anything else you
open with, eg, fopen(3)).
It's not really limited to terminal applications, but yeah it makes it easier to ensure that your calls to printf(3) actually get output after each line (whether that's to a file or a pipe or a tty), without having to call an explicit fflush(3) of stdout every time. stdbuf(1) sets LD_PRELOAD to libstdbuf.so for your program, causing it to call setbuf(3) at program startup based on the values of _STDBUF_* in the environment (which stdbuf(1) also sets). (That's my read of it anyway.) > In the BUGS section in the man page for stdbuf(1), we see: On GLIBC > platforms, specifying a buffer size, i.e., using fully buffered mode > will result in undefined operation. Eheh xD Oh, I imagine "undefined operation" means something more like "unspecified" here. stdbuf(1) uses setbuf(3), so the behavior you'll get should be whatever the setbuf(3) from the libc on your system does. I think all this means is that the C/POSIX standards are a bit loose about what is required of setbuf(3) when a buffer size is specified, and there is room in the standard for it to be interpreted as only a hint. > If I'm not mistaken, then buffer modes other than 0 and L don't actually > work. Maybe I should count my blessings here. I don't know what's going > on in the background that would explain glibc not supporting any of > that, or stdbuf(1) implementing features that aren't supported on the > vast majority of systems where it will be installed. Hey try it right? 
Works for me (on glibc-2.23)

    $ for s in 8k 16k 32k 1M; do
      echo ::: $s :::
      { stdbuf -o$s strace -ewrite tr 1 2
      } < /dev/zero 2>&1 > /dev/null | head -3
      echo
      done
    ::: 8k :::
    write(1, "\0\0\0\0\0\0\0\0"..., 8192) = 8192
    write(1, "\0\0\0\0\0\0\0\0"..., 8192) = 8192
    write(1, "\0\0\0\0\0\0\0\0"..., 8192) = 8192

    ::: 16k :::
    write(1, "\0\0\0\0\0\0\0\0"..., 16384) = 16384
    write(1, "\0\0\0\0\0\0\0\0"..., 16384) = 16384
    write(1, "\0\0\0\0\0\0\0\0"..., 16384) = 16384

    ::: 32k :::
    write(1, "\0\0\0\0\0\0\0\0"..., 32768) = 32768
    write(1, "\0\0\0\0\0\0\0\0"..., 32768) = 32768
    write(1, "\0\0\0\0\0\0\0\0"..., 32768) = 32768

    ::: 1M :::
    write(1, "\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
    write(1, "\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
    write(1, "\0\0\0\0\0\0\0\0"..., 1048576) = 1048576

>> It may just be that nobody has actually had a real need for it.
>> (Yet?)
>
> I imagine if anybody has, they just set --output=0 and moved on. Bash
> scripts aren't the fastest thing in the world, anyway.

Ouch. Ouch. Ouuuuch. :)

While that's true if you're talking about bash itself doing the actual
computation and data processing, the main work of the shell is making it
easy to set up pipelines for other (very fast) programs to pass their
data around. The stdbuf tool is not meant for the shell! It's meant for
those very fast programs that the shell stands up.

Using stdbuf to tweak a very fast program, causing it to output more
often at newlines over pipes rather than at block boundaries, does slow
down those programs somewhat. But as we've discussed, this is necessary
for certain pipelines that have two-way communication (including
coprocesses), or in general any time you want the output immediately.

What may not be obvious is that the shell does not need to get involved
with writing input for a coprocess or reading its output - the shell can
start other (very fast) programs with input/output redirected to/from
the coprocess pipes to do that processing.
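A sketch of that arrangement (hypothetical; tr(1) stands in for the coprocess filter, and a background job and a command substitution stand in for arbitrary producer/consumer programs - the shell only wires up the fds and then stays out of the data path):

```shell
#!/usr/bin/env bash
# Sketch: the shell wires programs to the coproc's pipes; it does not
# shovel the data itself. tr(1) is an illustrative stand-in filter.
coproc FILTER { tr '[:lower:]' '[:upper:]'; }
exec {r}<&"${FILTER[0]}" {w}>&"${FILTER[1]}"   # durable copies of the fds
eval "exec ${FILTER[1]}>&-"    # close the shell's original write end

printf 'foo\nbar\n' >&"$w" &   # a producer process writes into the coproc
exec {w}>&-                    # the shell drops its own write copy too,
                               # so the coproc sees EOF once the producer exits
result=$(cat <&"$r")           # a consumer process reads until EOF
echo "$result"                 # prints FOO and BAR on separate lines
```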
My point though earlier was that a null-terminated record buffering mode, as useful as it sounds on the surface (for null-terminated paths), may actually be something _nobody_ has ever actually needed for an actual (not contrived) workflow. But then again I say "Yet?" - because, never say never. Happy line-buffering :) Carl ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RFE: enable buffering on null-terminated data 2024-03-14 14:15 ` Carl Edquist @ 2024-03-18 0:12 ` Zachary Santer 2024-03-19 5:24 ` Kaz Kylheku 0 siblings, 1 reply; 53+ messages in thread From: Zachary Santer @ 2024-03-18 0:12 UTC (permalink / raw) To: Carl Edquist; +Cc: libc-alpha, coreutils, p On Thu, Mar 14, 2024 at 11:14 AM Carl Edquist <edquist@cs.wisc.edu> wrote: > Where things get sloppy is if you add some stuff in a pipeline after your > build script, which results in things getting block-buffered along the > way: > > $ ./build.sh | sed s/what/ever/ | tee build.log > > And there you will definitely see a difference. Sadly, the man page for stdbuf specifically calls out tee as being unaffected by stdbuf, because it adjusts the buffering of its standard streams itself. The script I mentioned pipes everything through tee, and I don't think I'm willing to refactor it not to. Ah well. > Oh, I imagine "undefined operation" means something more like > "unspecified" here. stdbuf(1) uses setbuf(3), so the behavior you'll get > should be whatever the setbuf(3) from the libc on your system does. > > I think all this means is that the C/POSIX standards are a bit loose about > what is required of setbuf(3) when a buffer size is specified, and there > is room in the standard for it to be interpreted as only a hint. > Works for me (on glibc-2.23) Thanks for setting me straight here. > What may not be obvious is that the shell does not need to get involved > with writing input for a coprocess or reading its output - the shell can > start other (very fast) programs with input/output redirected to/from the > coprocess pipes to do that processing. Gosh, I'd like to see an example of that, too. > My point though earlier was that a null-terminated record buffering mode, > as useful as it sounds on the surface (for null-terminated paths), may > actually be something _nobody_ has ever actually needed for an actual (not > contrived) workflow. 
It seemed to me years ago like something people could need, and I only thought to email the lists about it last weekend. Maybe there are all sorts of people out there who have been using 'stdbuf --output=0' on null-terminated data for years and never thought to raise the issue. I know that's not a very strong argument, though. ^ permalink raw reply [flat|nested] 53+ messages in thread
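[As a minimal sketch of the coprocess idea Carl describes above - the shell wiring other (fast) programs to the coprocess pipes instead of reading and writing the data itself - something like the following works in bash. The 'SED' coprocess, 'sed -u', and 'head' here are illustrative choices, not anything from the thread:]

```bash
#!/bin/bash
# Sketch: a sed coprocess; the shell only wires up the pipes, while
# separate programs do the writing and reading.
# 'sed -u' (GNU sed) flushes per line, so a reader can consume output
# while the coprocess is still running.
coproc SED { sed -u 's/foo/bar/'; }

# A (fast) writer feeds the coprocess directly:
printf 'foo one\nfoo two\n' >&"${SED[1]}"

# A (fast) reader drains two lines directly from the coprocess:
head -n 2 <&"${SED[0]}"
# head prints the transformed lines: "bar one", "bar two"

# Done: close our ends so sed sees EOF and exits.  (The eval form keeps
# this working on bash versions where 'exec {SED[1]}>&-' does not
# accept an array subscript.)
eval "exec ${SED[1]}>&- ${SED[0]}<&-"
```

[The point being that the shell never copies a byte of the stream; the writer and reader above could be any programs, started with their stdin/stdout redirected to the coprocess pipes.]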
* Re: RFE: enable buffering on null-terminated data 2024-03-18 0:12 ` Zachary Santer @ 2024-03-19 5:24 ` Kaz Kylheku 2024-03-19 12:50 ` Zachary Santer 0 siblings, 1 reply; 53+ messages in thread From: Kaz Kylheku @ 2024-03-19 5:24 UTC (permalink / raw) To: Zachary Santer; +Cc: Carl Edquist, libc-alpha, coreutils, p On 2024-03-17 17:12, Zachary Santer wrote: > On Thu, Mar 14, 2024 at 11:14 AM Carl Edquist <edquist@cs.wisc.edu> wrote: > >> Where things get sloppy is if you add some stuff in a pipeline after your >> build script, which results in things getting block-buffered along the >> way: >> >> $ ./build.sh | sed s/what/ever/ | tee build.log >> >> And there you will definitely see a difference. > > Sadly, the man page for stdbuf specifically calls out tee as being > unaffected by stdbuf, because it adjusts the buffering of its standard > streams itself. The script I mentioned pipes everything through tee, > and I don't think I'm willing to refactor it not to. Ah well. But what tee does is set up _IONBF on its output streams, including stdout. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: RFE: enable buffering on null-terminated data 2024-03-19 5:24 ` Kaz Kylheku @ 2024-03-19 12:50 ` Zachary Santer 2024-03-20 8:55 ` Carl Edquist 0 siblings, 1 reply; 53+ messages in thread From: Zachary Santer @ 2024-03-19 12:50 UTC (permalink / raw) To: Kaz Kylheku; +Cc: Carl Edquist, libc-alpha, coreutils, p On Tue, Mar 19, 2024 at 1:24 AM Kaz Kylheku <kaz@kylheku.com> wrote: > > But what tee does is set up _IONBF on its output streams, > including stdout. So it doesn't buffer at all. Awesome. Nevermind. ^ permalink raw reply [flat|nested] 53+ messages in thread
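[Kaz's point is easy to confirm empirically; a quick check, assuming GNU coreutils tee and timeout: kill tee while its input is still open, and the line it copied has already made it through the pipe, whereas a block-buffering filter in tee's place would have lost it.]

```sh
# tee is unbuffered on output: the line below reaches the pipe at once,
# so it survives even though tee is killed before its input closes.
out=$({ echo hello; sleep 1; } | timeout 0.5 tee /dev/null)
echo "$out"   # -> hello
```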
* Re: RFE: enable buffering on null-terminated data 2024-03-19 12:50 ` Zachary Santer @ 2024-03-20 8:55 ` Carl Edquist 2024-04-19 0:16 ` Modify buffering of standard streams via environment variables (not LD_PRELOAD)? Zachary Santer 0 siblings, 1 reply; 53+ messages in thread From: Carl Edquist @ 2024-03-20 8:55 UTC (permalink / raw) To: Zachary Santer; +Cc: Kaz Kylheku, libc-alpha, coreutils, p [-- Attachment #1: Type: text/plain, Size: 1499 bytes --] On Tue, 19 Mar 2024, Zachary Santer wrote: > On Tue, Mar 19, 2024 at 1:24 AM Kaz Kylheku <kaz@kylheku.com> wrote: >> >> But what tee does is set up _IONBF on its output streams, >> including stdout. > > So it doesn't buffer at all. Awesome. Nevermind. Yay! :D And since tee uses fwrite to copy whatever input is available, that will mean 'records' are output on the same boundaries as the input (whether that be newlines, nuls, or just block boundaries). So putting tee in the middle of a pipeline shouldn't itself interfere with whatever else you're up to. (AND it's still relatively efficient, compared to some tools like cut that putchar a byte at a time.) My note about pipelines like this though: $ ./build.sh | sed s/what/ever/ | tee build.log is that with the default stdio buffering, while all the commands in build.sh will be implicitly self-flushing, the sed in the middle will end up batching its output into blocks, so tee will also repeat them in blocks. However, if stdbuf's magic env vars are exported in your shell (either by doing a trick like 'export $(env -i stdbuf -oL env)', or else more simply by first starting a new shell with 'stdbuf -oL bash'), then every command in your pipelines will start with the new default line-buffered stdout. That way your line-items from build.sh should get passed all the way through the pipeline as they are produced. (But, proof's in the pudding, so whatever works for you :D ) Happy putting all the way! Carl ^ permalink raw reply [flat|nested] 53+ messages in thread
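[For reference, the "magic env vars" Carl mentions can be inspected directly; this assumes GNU coreutils stdbuf, and the libstdbuf.so path varies by distribution:]

```sh
# stdbuf works by exporting a mode variable plus an LD_PRELOAD of
# libstdbuf.so, which applies the requested mode in each child process:
env -i stdbuf -oL env
# prints something like:
#   _STDBUF_O=L
#   LD_PRELOAD=/usr/libexec/coreutils/libstdbuf.so

# Hence the tricks above for making line-buffered stdout the default:
#   export $(env -i stdbuf -oL env)    # in the current shell, or
#   stdbuf -oL bash                    # start a fresh shell under stdbuf
```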
* Modify buffering of standard streams via environment variables (not LD_PRELOAD)? 2024-03-20 8:55 ` Carl Edquist @ 2024-04-19 0:16 ` Zachary Santer 2024-04-19 9:32 ` Pádraig Brady 2024-04-20 16:00 ` Carl Edquist 0 siblings, 2 replies; 53+ messages in thread From: Zachary Santer @ 2024-04-19 0:16 UTC (permalink / raw) To: Carl Edquist; +Cc: Kaz Kylheku, libc-alpha, coreutils, Pádraig Brady Was "RFE: enable buffering on null-terminated data" On Wed, Mar 20, 2024 at 4:54 AM Carl Edquist <edquist@cs.wisc.edu> wrote: > > However, if stdbuf's magic env vars are exported in your shell (either by > doing a trick like 'export $(env -i stdbuf -oL env)', or else more simply > by first starting a new shell with 'stdbuf -oL bash'), then every command > in your pipelines will start with the new default line-buffered stdout. > That way your line-items from build.sh should get passed all the way > through the pipeline as they are produced. Finally had a chance to try to build with 'stdbuf --output=L --error=L --' in front of the build script, and it caused some crazy problems. I was building Ada, though, so pretty good chance that part of the build chain doesn't link against libc at all. I got a bunch of ERROR: ld.so: object '/usr/libexec/coreutils/libstdbuf.so' from LD_PRELOAD cannot be preloaded: ignored. And then it somehow caused compiler errors relating to the size of what would be pointer types. Cleared out all the build products and tried again without stdbuf and everything was fine. From the original thread just within the coreutils email list, "stdbuf feature request - line buffering but for null-terminated data": On Tue, Mar 12, 2024 at 12:42 PM Kaz Kylheku <kaz@kylheku.com> wrote: > > I would say that if it is implemented, the programs which require > it should all make provisions to set it up themselves. > > stdbuf is a hack/workaround for programs that ignore the > issue of buffering. 
Specifically, programs which send information > to one of the three standard streams, such that the information > is required in a timely way. Those streams become fully buffered > when not connected to a terminal. I think I've partially come around to this point of view. However, instead of expecting all sorts of individual programs to implement their own buffering mode command-line options, could this be handled with environment variables, but without LD_PRELOAD? I don't know if libc itself can check for those environment variables and adjust each program's buffering on its own, but if so, that would be a much simpler solution. You could compare this to the various locale environment variables, though I think a lot of commands whose behavior differs from locale to locale do have to implement their own handling of that internally, at least to some extent. This seems like somewhat less of a hack, and if no part of a program looks for those environment variables, it isn't going to find itself getting broken by the dynamic linker. It's just not going to change its buffering. Additionally, things that don't link against libc could still honor these environment variables, if the developers behind them care to put in the effort. Zack ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)? 2024-04-19 0:16 ` Modify buffering of standard streams via environment variables (not LD_PRELOAD)? Zachary Santer @ 2024-04-19 9:32 ` Pádraig Brady 2024-04-19 11:36 ` Zachary Santer 2024-04-20 16:00 ` Carl Edquist 1 sibling, 1 reply; 53+ messages in thread From: Pádraig Brady @ 2024-04-19 9:32 UTC (permalink / raw) To: Zachary Santer, Carl Edquist; +Cc: Kaz Kylheku, libc-alpha, coreutils On 19/04/2024 01:16, Zachary Santer wrote: > Was "RFE: enable buffering on null-terminated data" > > On Wed, Mar 20, 2024 at 4:54 AM Carl Edquist <edquist@cs.wisc.edu> wrote: >> >> However, if stdbuf's magic env vars are exported in your shell (either by >> doing a trick like 'export $(env -i stdbuf -oL env)', or else more simply >> by first starting a new shell with 'stdbuf -oL bash'), then every command >> in your pipelines will start with the new default line-buffered stdout. >> That way your line-items from build.sh should get passed all the way >> through the pipeline as they are produced. > > Finally had a chance to try to build with 'stdbuf --output=L --error=L > --' in front of the build script, and it caused some crazy problems. I > was building Ada, though, so pretty good chance that part of the build > chain doesn't link against libc at all. > > I got a bunch of > ERROR: ld.so: object '/usr/libexec/coreutils/libstdbuf.so' from > LD_PRELOAD cannot be preloaded: ignored. > > And then it somehow caused compiler errors relating to the size of > what would be pointer types. Cleared out all the build products and > tried again without stdbuf and everything was fine. 
> From the original thread just within the coreutils email list, "stdbuf > feature request - line buffering but for null-terminated data": > On Tue, Mar 12, 2024 at 12:42 PM Kaz Kylheku <kaz@kylheku.com> wrote: >> >> I would say that if it is implemented, the programs which require >> it should all make provisions to set it up themselves. >> >> stdbuf is a hack/workaround for programs that ignore the >> issue of buffering. Specifically, programs which send information >> to one of the three standard streams, such that the information >> is required in a timely way. Those streams become fully buffered >> when not connected to a terminal. > > I think I've partially come around to this point of view. However, > instead of expecting all sorts of individual programs to implement > their own buffering mode command-line options, could this be handled > with environment variables, but without LD_PRELOAD? I don't know if > libc itself can check for those environment variables and adjust each > program's buffering on its own, but if so, that would be a much > simpler solution. > > You could compare this to the various locale environment variables, > though I think a lot of commands whose behavior differs from locale to > locale do have to implement their own handling of that internally, at > least to some extent. > > This seems like somewhat less of a hack, and if no part of a program > looks for those environment variables, it isn't going to find itself > getting broken by the dynamic linker. It's just not going to change > its buffering. > > Additionally, things that don't link against libc could still honor > these environment variables, if the developers behind them care to put > in the effort. env variables are what I proposed 18 years ago now: https://sourceware.org/bugzilla/show_bug.cgi?id=2457 cheers, Pádraig ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)? 2024-04-19 9:32 ` Pádraig Brady @ 2024-04-19 11:36 ` Zachary Santer 2024-04-19 12:26 ` Pádraig Brady 0 siblings, 1 reply; 53+ messages in thread From: Zachary Santer @ 2024-04-19 11:36 UTC (permalink / raw) To: Pádraig Brady; +Cc: Carl Edquist, Kaz Kylheku, libc-alpha, coreutils On Fri, Apr 19, 2024 at 5:32 AM Pádraig Brady <P@draigbrady.com> wrote: > > env variables are what I proposed 18 years ago now: > https://sourceware.org/bugzilla/show_bug.cgi?id=2457 And the "resistance to that" from the Red Hat people 24 years ago is listed on a website that doesn't exist anymore. If I'm to argue with a guy from 18 years ago... Ulrich Drepper wrote: > Hell, no. Programs expect a certain buffer mode and perhaps would work > unexpectedly if this changes. By setting a mode to unbuffered, for instance, > you can easily DoS a system. I can think about enough other reasons why this is > a terrible idea. Programs explicitly must request a buffering scheme so that it > matches the way the program uses the stream. If buffering were set according to the env vars before the program configures buffers on its end, if it chooses to, then the env vars have no effect. This is how the stdbuf util works, right now. Would programs that expect a certain buffer mode not set that mode explicitly themselves? Are you allowing untrusted users to set env vars for important daemons or something? How is this a valid concern? This is specific to the standard streams, 0-2. Buffering of stdout and stderr is already configured dynamically by libc. If it's going to a terminal, it's line-buffered. If it's not, it's fully buffered. ^ permalink raw reply [flat|nested] 53+ messages in thread
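[The dynamic default described above - and what stdbuf changes - is observable from the shell. A sketch assuming GNU coreutils (timeout, stdbuf) and GNU grep: with stdout on a pipe, grep is fully buffered, so killing it mid-stream loses the pending match; forced to line buffering, the match gets through at the newline.]

```sh
# grep's match sits in a full stdio buffer when stdout is a pipe and is
# lost when grep is killed; line-buffered, it is flushed at the newline.
buffered=$({ echo match; sleep 1; } | timeout 0.5 grep match)
linebuf=$({ echo match; sleep 1; } | timeout 0.5 stdbuf -oL grep match)
echo "buffered=[$buffered] linebuf=[$linebuf]"
# -> buffered=[] linebuf=[match]
```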
* Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)? 2024-04-19 11:36 ` Zachary Santer @ 2024-04-19 12:26 ` Pádraig Brady 2024-04-19 16:11 ` Zachary Santer 0 siblings, 1 reply; 53+ messages in thread From: Pádraig Brady @ 2024-04-19 12:26 UTC (permalink / raw) To: Zachary Santer; +Cc: Carl Edquist, Kaz Kylheku, libc-alpha, coreutils On 19/04/2024 12:36, Zachary Santer wrote: > On Fri, Apr 19, 2024 at 5:32 AM Pádraig Brady <P@draigbrady.com> wrote: >> >> env variables are what I proposed 18 years ago now: >> https://sourceware.org/bugzilla/show_bug.cgi?id=2457 > > And the "resistance to that" from the Red Hat people 24 years ago is > listed on a website that doesn't exist anymore. > > If I'm to argue with a guy from 18 years ago... > > Ulrich Drepper wrote: >> Hell, no. Programs expect a certain buffer mode and perhaps would work >> unexpectedly if this changes. By setting a mode to unbuffered, for instance, >> you can easily DoS a system. I can think about enough other reasons why this is >> a terrible idea. Programs explicitly must request a buffering scheme so that it >> matches the way the program uses the stream. > > If buffering were set according to the env vars before the program > configures buffers on its end, if it chooses to, then the env vars > have no effect. This is how the stdbuf util works, right now. Would > programs that expect a certain buffer mode not set that mode > explicitly themselves? Are you allowing untrusted users to set env > vars for important daemons or something? How is this a valid concern? > > This is specific to the standard streams, 0-2. Buffering of stdout and > stderr is already configured dynamically by libc. If it's going to a > terminal, it's line-buffered. If it's not, it's fully buffered. Playing devil's advocate, I guess programs may be depending on the automatic buffering modes set. 
I guess the thinking is that it was too easy to perturb the system with env vars, though you can already do that with LD_PRELOAD. Perhaps at this stage we should consider stdbuf ubiquitous enough to suffice, noting that it's also supported on FreeBSD. I'm surprised that the LD_PRELOAD setting is breaking your ada build, and it would be interesting to determine the reason for that. cheers, Pádraig ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)? 2024-04-19 12:26 ` Pádraig Brady @ 2024-04-19 16:11 ` Zachary Santer 0 siblings, 0 replies; 53+ messages in thread From: Zachary Santer @ 2024-04-19 16:11 UTC (permalink / raw) To: Pádraig Brady; +Cc: Carl Edquist, Kaz Kylheku, libc-alpha, coreutils On Fri, Apr 19, 2024 at 8:26 AM Pádraig Brady <P@draigbrady.com> wrote: > > Perhaps at this stage we should consider stdbuf ubiquitous enough to suffice, > noting that it's also supported on FreeBSD. Alternatively, if glibc were modified to act on these hypothetical environment variables, it would be trivial to have stdbuf simply set those, to ensure backwards compatibility. > I'm surprised that the LD_PRELOAD setting is breaking your ada build, > and it would be interesting to determine the reason for that. If I had that kind of time... ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)? 2024-04-19 0:16 ` Modify buffering of standard streams via environment variables (not LD_PRELOAD)? Zachary Santer 2024-04-19 9:32 ` Pádraig Brady @ 2024-04-20 16:00 ` Carl Edquist 2024-04-20 20:00 ` Zachary Santer 1 sibling, 1 reply; 53+ messages in thread From: Carl Edquist @ 2024-04-20 16:00 UTC (permalink / raw) To: Zachary Santer; +Cc: libc-alpha, coreutils, Pádraig Brady [-- Attachment #1: Type: text/plain, Size: 1425 bytes --] On Thu, 18 Apr 2024, Zachary Santer wrote: > On Wed, Mar 20, 2024 at 4:54 AM Carl Edquist <edquist@cs.wisc.edu> wrote: >> >> However, if stdbuf's magic env vars are exported in your shell (either >> by doing a trick like 'export $(env -i stdbuf -oL env)', or else more >> simply by first starting a new shell with 'stdbuf -oL bash'), then >> every command in your pipelines will start with the new default >> line-buffered stdout. That way your line-items from build.sh should get >> passed all the way through the pipeline as they are produced. > > Finally had a chance to try to build with 'stdbuf --output=L --error=L > --' in front of the build script, and it caused some crazy problems. For what it's worth, when I was trying that out on msys2 (since that's what you said you were using), I also ran into some very weird errors when just trying to export LD_PRELOAD and _STDBUF_O to what stdbuf -oL sets. It was weird because I didn't see issues when just running a command (including bash) directly under stdbuf. I didn't get to the bottom of it though and I don't have access to a windows laptop any more to experiment. Also I might ask, why are you setting "--error=L"? Not that this is the problem you're seeing, but in any case stderr is unbuffered by default, and you might mess up the output a bit by line buffering it, if it's expecting to output partial lines for progress or whatever. Carl ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)? 2024-04-20 16:00 ` Carl Edquist @ 2024-04-20 20:00 ` Zachary Santer 2024-04-20 21:45 ` Carl Edquist 0 siblings, 1 reply; 53+ messages in thread From: Zachary Santer @ 2024-04-20 20:00 UTC (permalink / raw) To: Carl Edquist; +Cc: libc-alpha, coreutils, Pádraig Brady On Sat, Apr 20, 2024 at 11:58 AM Carl Edquist <edquist@cs.wisc.edu> wrote: > > On Thu, 18 Apr 2024, Zachary Santer wrote: > > > > Finally had a chance to try to build with 'stdbuf --output=L --error=L > > --' in front of the build script, and it caused some crazy problems. > > For what it's worth, when I was trying that out msys2 (since that's what > you said you were using), I also ran into some very weird errors when just > trying to export LD_PRELOAD and _STDBUF_O to what stdbuf -oL sets. It was > weird because I didn't see issues when just running a command (including > bash) directly under stdbuf. I didn't get to the bottom of it though and > I don't have access to a windows laptop any more to experiment. This was actually in RHEL 7.

stdbuf --output=L --error=L -- "${@}" 2>&1 |
  tee log-file |
  while IFS='' read -r line; do
    # do stuff
  done

And then obviously the arguments to this script give the command I want it to run.

> Also I might ask, why are you setting "--error=L" ? > > Not that this is the problem you're seeing, but in any case stderr is > unbuffered by default, and you might mess up the output a bit by line > buffering it, if it's expecting to output partial lines for progress or > whatever. I don't know how buffering works when stdout and stderr get redirected to the same pipe. You'd think, whatever it is, it would have to be smart enough to keep them interleaved in the same order they were printed to in. That in mind, I would assume they both get placed into the same block buffer by default. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: Modify buffering of standard streams via environment variables (not LD_PRELOAD)? 2024-04-20 20:00 ` Zachary Santer @ 2024-04-20 21:45 ` Carl Edquist 0 siblings, 0 replies; 53+ messages in thread From: Carl Edquist @ 2024-04-20 21:45 UTC (permalink / raw) To: Zachary Santer; +Cc: libc-alpha, coreutils, Pádraig Brady On Sat, 20 Apr 2024, Zachary Santer wrote: > This was actually in RHEL 7. Oh. In that case it might be worth looking into ... > I don't know how buffering works when stdout and stderr get redirected > to the same pipe. You'd think, whatever it is, it would have to be smart > enough to keep them interleaved in the same order they were printed to > in. That in mind, I would assume they both get placed into the same > block buffer by default. My take is always to try it and find out. Though in this case I think the default (without using stdbuf) is that the program's stderr is output to the pipe immediately (ie, unbuffered) on each library call (fprintf(3), fputs(3), putc(3), fwrite(3)), while stdout is written to the pipe at block boundaries - even though fd 1 and 2 refer to the same pipe. If you force line buffering for stdout and stderr, that is likely what you want, and it will interleave _lines_ in the order that they were printed. However, stdout and stderr are still separate streams even if they refer to the same output file/pipe/device, so partial lines are not interleaved in the order that they were printed. For example:

#include <stdio.h>

int main() {
    putc('a', stderr); putc('1', stdout);
    putc('b', stderr); putc('2', stdout);
    putc('c', stderr); putc('3', stdout);
    putc('\n', stderr); putc('\n', stdout);
    return 0;
}

will output "abc\n123\n" instead of "a1b2c3\n\n", even if you run it as

$ ./abc123 2>&1 | cat

or

$ stdbuf -oL -eL ./abc123 2>&1 | cat

... Not that that's relevant for what you're doing :) Carl ^ permalink raw reply [flat|nested] 53+ messages in thread