On 7/3/2020 7:09 AM, Ken Brown via Cygwin wrote:
> On 7/2/2020 1:50 PM, Morten Kjærulff via Cygwin wrote:
>> I think we got a new release around the beginning of June, right?
>> You said that there were still issues (I can confirm).
>> If it can help, here is the output I see today of above scripts:
>>
>> $ ./tp.sh
> [...]
>>        0 [fifo_reader] diff 1806 C:\cygwin\bin\diff.exe: *** fatal
>> error - Can't update my handlers, Win32 error 87
>
> Thanks for the report and the simple test case.  Obviously I still have more
> work to do on this.

Hi Morten,

I've attempted to fix the bugs (see
https://cygwin.com/pipermail/cygwin-patches/2020q3/010380.html).  With these
patches installed, I no longer see a fatal error or hanging diff processes.  But
your script still doesn't work as you expect.  On a typical run of the parallel
part, 6 or 7 of the 10 diff processes see the FIFO t.pip as empty.

Here's a sample run under strace, so that I could see what was going on.  7 of
the 10 diff processes saw t.pip as empty on this run.

$ strace -o tpip.sh.strace sh -c ./tpip.sh
      PID    PPID    PGID     WINPID   TTY         UID    STIME COMMAND
     1307    1306    1307      10932  pty1      197609 17:50:12 /usr/bin/bash
    18426       1   18426       9360  cons0     197609 06:47:31 /usr/bin/sh
    18429   18426   18426      13900  cons0     197609 06:47:32 /usr/bin/ps
     1306       1    1306       3768  ?         197609 17:50:11 /usr/bin/mintty
    18424    1307   18424      21840  pty1      197609 06:47:31 /usr/bin/strace
result1 start
10 0 0 0 0 0 0 0 0 0 0
result1 end
0a1,2
> line1
> line2
0a1,2
> line1
> line2
0a1,2
> line1
> line2
0a1,2
> line1
> line2
0a1,2
> line1
> line2
0a1,2
0a1,2
> line1
> line1
> line2
> line2
result2 start
10 0 1 1 0 1 1 1 1 1 0
result2 end
      PID    PPID    PGID     WINPID   TTY         UID    STIME COMMAND
    18480   18430   18426      15580  cons0     197609 06:47:33 /usr/bin/cp
     1307    1306    1307      10932  pty1      197609 17:50:12 /usr/bin/bash
    18484   18426   18426      21264  cons0     197609 06:47:44 /usr/bin/ps
    18430   18426   18426      23472  cons0     197609 06:47:32 /usr/bin/sh
    18426       1   18426       9360  cons0     197609 06:47:31 /usr/bin/sh
     1306       1    1306       3768  ?         197609 17:50:11 /usr/bin/mintty
    18424    1307   18424      21840  pty1      197609 06:47:31 /usr/bin/strace

I'm attaching your script for ease of reference, and I'm attaching an excerpt
from the strace output, to which I've added a few comments.  The excerpt shows
all open, close, read, and write system calls involving t.pip.

Here's a summary of what you can see from those system calls in the parallel
part of the script.  In what follows, I've called the diff processes diff-1,
diff-2,..., diff-10, and similarly for the cp processes (although there are only
four of them).

1. cp-1 tries to open t.pip for writing and blocks.  It unblocks when diff-1
   opens t.pip for reading, and both processes run to completion as expected.

2. diff-2, diff-3, diff-4, and diff-5 try to open t.pip for reading, and they
   block until cp-2 opens it for writing.  Then cp-2 writes 12 bytes to t.pip
   and closes it, and the four diff processes all try to read.  diff-4 gets
   there first and reads the 12 bytes; it reads once more and sees EOF because
   there is no data available in the pipe and there are no writers open[1], so
   it considers those 12 bytes to constitute the file t.pip.  It later exits
   with success.  diff-2, diff-3, and diff-5 all complete their reads before
   cp-3 opens t.pip.  They see EOF for the same reason as above, so t.pip
   appears empty and they exit with failure.

3. diff-6, diff-7, diff-8, diff-9, and diff-10 try to open t.pip for reading,
   and they block until cp-3 opens it for writing.  Then cp-3 writes 12 bytes to
   t.pip and closes it, and the five diff processes all try to read.  diff-10
   gets there first and reads 12 bytes followed by EOF; it later exits with
   success.  diff-6, diff-7, diff-8, and diff-9 all complete their reads before
   cp-4 opens t.pip.  They see EOF, so t.pip appears empty and they exit with
   failure.

4. cp-4 tries to open t.pip and blocks because there are no more diff processes.
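
The EOF behavior in steps 2 and 3 can be reproduced in isolation, without diff
or cp.  What follows is only a minimal single-process Python sketch, not
something taken from the script or the strace excerpt.  The function name
demo_fifo_eof is made up, and the payload "line1\nline2\n" is an assumption
chosen because it matches the 12 bytes and the diff output above.  The readers
are opened with O_NONBLOCK purely so that the demo does not block inside one
process; the real diff processes use blocking opens.

```python
import os
import tempfile


def demo_fifo_eof():
    """Two readers share one FIFO; a single writer writes once and closes.

    Returns (first, eof1, eof2): what reader 1 gets on its first read,
    what reader 1 gets on its second read, and what reader 2 gets.
    """
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "t.pip")
    os.mkfifo(path)
    try:
        # O_NONBLOCK is used only so these opens return immediately in a
        # single process; it stands in for "the readers are already open".
        r1 = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        r2 = os.open(path, os.O_RDONLY | os.O_NONBLOCK)

        # The writer plays the role of one cp process: open, write, close.
        w = os.open(path, os.O_WRONLY)
        os.write(w, b"line1\nline2\n")  # assumed 12-byte payload
        os.close(w)

        # Reader 1 "gets there first": it drains the data, then sees EOF
        # because the pipe is empty and no process has it open for writing.
        first = os.read(r1, 4096)
        eof1 = os.read(r1, 4096)

        # Reader 2 arrives late: empty pipe, no writers, so EOF at once.
        eof2 = os.read(r2, 4096)

        os.close(r1)
        os.close(r2)
        return first, eof1, eof2
    finally:
        os.remove(path)
        os.rmdir(tmpdir)


if __name__ == "__main__":
    first, eof1, eof2 = demo_fifo_eof()
    print("reader 1 first read: ", first)
    print("reader 1 second read:", eof1)
    print("reader 2 first read: ", eof2)
```

On a POSIX system I would expect reader 1 to get the 12 bytes followed by an
empty read, and reader 2 to get an empty read right away, which is the pattern
the losing diff processes see.
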

I've run your script on Linux a few times, and it usually[2] behaves as you
expect, with all diff processes succeeding.  For reasons I don't understand, the
diff and cp processes apparently alternate most of the time on Linux, rather
than having 4 or 5 diff processes lumped together between the cp processes as on
Cygwin.

If someone can figure out the reason for the difference, and if it turns out to
be related to the FIFO code, I could try to modify the code to make Cygwin
behave more like Linux.

Ken

[1] From https://pubs.opengroup.org/onlinepubs/9699919799/functions/read.html:

    When attempting to read from an empty pipe or FIFO:

    * If no process has the pipe open for writing, read() shall return 0 to
      indicate end-of-file.

[2] But I did have one Linux run in which one of the ten diff processes saw an
    empty t.pip and failed as on Cygwin.